Age | Commit message | Author | Files | Lines
2021-09-03mm: gup: fix potential pgmap refcnt leak in __gup_device_huge()Miaohe Lin1-5/+7
When try_grab_page() fails, put_dev_pagemap() is not called, so the pgmap refcnt leaks in this case. Also remove the check of pgmap against NULL, as put_dev_pagemap() already performs it. [[email protected]: simplify, cleanup] [[email protected]: fix return value] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Miaohe Lin <[email protected]> Fixes: 3faa52c03f44 ("mm/gup: track FOLL_PIN pages") Reviewed-by: John Hubbard <[email protected]> Reviewed-by: Claudio Imbrenda <[email protected]> Cc: Jan Kara <[email protected]> Cc: Kirill A. Shutemov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
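The leak described above follows a common pattern: an early return on an error path skips the release of a reference taken earlier. A minimal userspace sketch of the corrected shape is shown here. The names `struct pagemap`, `pagemap_get()`, `pagemap_put()` and `walk_device_pages()` are hypothetical stand-ins, not the kernel functions, and the `fail_at` parameter only simulates a failing try_grab_page().

```c
#include <stddef.h>

/* Hypothetical stand-in for struct dev_pagemap: only a reference count. */
struct pagemap {
	int refs;
};

/* Take one reference on the pagemap. */
static struct pagemap *pagemap_get(struct pagemap *pm)
{
	pm->refs++;
	return pm;
}

/* Drop one reference; NULL is tolerated, so callers need no NULL check. */
static void pagemap_put(struct pagemap *pm)
{
	if (pm)
		pm->refs--;
}

/*
 * Walk nr pages of one device. Grabbing the page at index fail_at fails
 * (pass -1 for "never fails"). Returns 1 on success and 0 on failure.
 * Both outcomes leave the loop through the single pagemap_put() below.
 */
static int walk_device_pages(struct pagemap *dev, int nr, int fail_at)
{
	struct pagemap *pgmap = NULL;
	int ret = 1;

	for (int i = 0; i < nr; i++) {
		if (!pgmap)
			pgmap = pagemap_get(dev);
		if (i == fail_at) {
			/* Simulated grab failure: break, do not return. */
			ret = 0;
			break;
		}
	}

	pagemap_put(pgmap);
	return ret;
}
```

Routing the failure through `break` rather than `return 0` is what keeps the reference count balanced, and the NULL-tolerant put is what makes a separate NULL check at the call site redundant.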
2021-09-03mm: gup: remove useless BUG_ON in __get_user_pages()Miaohe Lin1-1/+0
Indeed, this BUG_ON couldn't catch anything useful. We are sure ret == 0 here because we would already bail out if ret != 0 and ret is untouched till here. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Miaohe Lin <[email protected]> Reviewed-by: John Hubbard <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Reviewed-by: Claudio Imbrenda <[email protected]> Cc: Jan Kara <[email protected]> Cc: Kirill A. Shutemov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-03mm: gup: remove unneed local variable orig_refsMiaohe Lin1-3/+1
Remove the unneeded local variable orig_refs since refs is unchanged now. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Miaohe Lin <[email protected]> Reviewed-by: John Hubbard <[email protected]> Reviewed-by: Claudio Imbrenda <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Cc: Jan Kara <[email protected]> Cc: Kirill A. Shutemov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-03mm: gup: remove set but unused local variable majorMiaohe Lin1-2/+1
Patch series "Cleanups and fixup for gup". This series contains cleanups to remove unneeded variable, useless BUG_ON and use helper to improve readability. Also we fix a potential pgmap refcnt leak. More details can be found in the respective changelogs. This patch (of 5): Since commit a2beb5f1efed ("mm: clean up the last pieces of page fault accountings"), the local variable major is unused. Remove it. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Miaohe Lin <[email protected]> Reviewed-by: John Hubbard <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Reviewed-by: Claudio Imbrenda <[email protected]> Cc: Jan Kara <[email protected]> Cc: Kirill A. Shutemov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-03include/linux/buffer_head.h: fix boolreturn.cocci warningsJing Yangyang1-1/+1
./include/linux/buffer_head.h:412:64-65:WARNING:return of 0/1 in function 'has_bh_in_lru' with return type bool Return statements in functions returning bool should use true/false instead of 1/0. Generated by: scripts/coccinelle/misc/boolreturn.cocci Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Jing Yangyang <[email protected]> Reported-by: Zeal Robot <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-03writeback: memcg: simplify cgroup_writeback_by_idShakeel Butt4-24/+26
Currently cgroup_writeback_by_id calls mem_cgroup_wb_stats() to get dirty pages for a memcg. However mem_cgroup_wb_stats() does a lot more than just get the number of dirty pages. Just directly get the number of dirty pages instead of calling mem_cgroup_wb_stats(). Also cgroup_writeback_by_id() is only called for best-effort dirty flushing, so remove the unused 'nr' parameter and no need to explicitly flush memcg stats. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Shakeel Butt <[email protected]> Reviewed-by: Jan Kara <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Johannes Weiner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-03fs: inode: count invalidated shadow pages in pginodestealJohannes Weiner2-11/+11
pginodesteal is supposed to capture the impact that inode reclaim has on the page cache state. Currently, it doesn't consider shadow pages that get dropped this way, even though this can have a significant impact on paging behavior, memory pressure calculations etc. To improve visibility into these effects, make sure shadow pages get counted when they get dropped through inode reclaim. This changes the return value semantics of invalidate_mapping_pages() slightly, but the only two users are the inode shrinker itself and a usb driver that logs it for debugging purposes. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Johannes Weiner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-03fs: drop_caches: fix skipping over shadow cache inodesJohannes Weiner1-1/+2
When drop_caches truncates the page cache in an inode it also includes any shadow entries for evicted pages. However, there is a preliminary check on whether the inode has pages: if it has *only* shadow entries, it will skip running truncation on the inode and leave it behind. Fix the check to mapping_empty(), such that it runs truncation on any inode that has cache entries at all. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Johannes Weiner <[email protected]> Reported-by: Roman Gushchin <[email protected]> Acked-by: Roman Gushchin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
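The difference between the old and the new check can be illustrated with a toy model. This is only a sketch: `struct toy_mapping` and its two counters are invented for illustration (the kernel's mapping_empty() inspects the mapping's page tree rather than separate counters), and the two `skip_*` helpers are hypothetical.

```c
#include <stdbool.h>

/* Toy model of an inode's page cache: real pages plus shadow entries. */
struct toy_mapping {
	unsigned long nrpages;    /* pages currently cached */
	unsigned long nrshadows;  /* shadow entries left by evicted pages */
};

/* True only if the mapping holds no entries of either kind. */
static bool toy_mapping_empty(const struct toy_mapping *m)
{
	return m->nrpages == 0 && m->nrshadows == 0;
}

/* Old behaviour: an inode with only shadow entries is skipped. */
static bool skip_inode_old(const struct toy_mapping *m)
{
	return m->nrpages == 0;
}

/* New behaviour: skip only inodes with no cache entries at all. */
static bool skip_inode_new(const struct toy_mapping *m)
{
	return toy_mapping_empty(m);
}
```

For a mapping with zero pages and some shadow entries, the old predicate skips the inode while the new one lets truncation run.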
2021-09-03mm: remove irqsave/restore locking from contexts with irqs enabledJohannes Weiner3-19/+13
The page cache deletion paths all have interrupts enabled, so there is no need to use the irqsave/irqrestore locking variants. They used to have irqs disabled by the memcg lock added in commit c4843a7593a9 ("memcg: add per cgroup dirty page accounting"), but that has since been replaced by memcg taking the page lock instead, commit 0a31bc97c80c ("mm: memcontrol: rewrite uncharge API"). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Johannes Weiner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-03writeback: use READ_ONCE for unlocked reads of writeback statsJan Kara1-12/+13
We do some unlocked reads of writeback statistics like avg_write_bandwidth, dirty_ratelimit, or bw_time_stamp. Generally we are fine with getting somewhat out-of-date values but actually getting different values in various parts of the functions because the compiler decided to reload value from original memory location could confuse calculations. Use READ_ONCE for these unlocked accesses and WRITE_ONCE for the updates to be on the safe side. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Jan Kara <[email protected]> Cc: Michael Stapelberg <[email protected]> Cc: Wu Fengguang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
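The hazard described above can be sketched in userspace. The macros below are simplified approximations of the kernel's READ_ONCE/WRITE_ONCE built on a volatile access (they rely on the GNU `__typeof__` extension), and `struct toy_wb`, `toy_wb_update()` and `toy_wb_seconds_to_write()` are hypothetical names, not writeback code.

```c
/* Simplified userspace approximations of the kernel macros (GNU C). */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/* Hypothetical statistics block that is read without holding a lock. */
struct toy_wb {
	unsigned long avg_write_bandwidth;  /* pages per second */
};

/* Writer side: publish a new estimate with a single store. */
static void toy_wb_update(struct toy_wb *wb, unsigned long bw)
{
	WRITE_ONCE(wb->avg_write_bandwidth, bw);
}

/*
 * Reader side: take one snapshot and use it for both the zero check and
 * the division. Without the snapshot the compiler may reload the field,
 * so the value tested and the value divided by could differ.
 */
static unsigned long toy_wb_seconds_to_write(struct toy_wb *wb,
					     unsigned long pages)
{
	unsigned long bw = READ_ONCE(wb->avg_write_bandwidth);

	if (bw == 0)
		return 0;
	return pages / bw;
}
```

The snapshot may be slightly stale, which the commit message accepts; what it rules out is two different values being used within one calculation.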
2021-09-03writeback: rename domain_update_bandwidth()Jan Kara1-4/+4
Rename domain_update_bandwidth() to domain_update_dirty_limit(). The original name is a misnomer. The function has nothing to do with a bandwidth, it updates dirty limits. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Jan Kara <[email protected]> Cc: Michael Stapelberg <[email protected]> Cc: Wu Fengguang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-03writeback: fix bandwidth estimate for spiky workloadJan Kara4-14/+37
Michael Stapelberg has reported that for workload with short big spikes of writes (GCC linker seem to trigger this frequently) the write throughput is heavily underestimated and tends to steadily sink until it reaches zero. This has rather bad impact on writeback throttling (causing stalls). The problem is that writeback throughput estimate gets updated at most once per 200 ms. One update happens early after we submit pages for writeback (at that point writeout of only small fraction of pages is completed and thus observed throughput is tiny). Next update happens only during the next write spike (updates happen only from inode writeback and dirty throttling code) and if that is more than 1s after previous spike, we decide system was idle and just ignore whatever was written until this moment. Fix the problem by making sure writeback throughput estimate is also updated shortly after writeback completes to get reasonable estimate of throughput for spiky workloads. [[email protected]: avoid division by 0 in wb_update_dirty_ratelimit()] Link: https://lore.kernel.org/lkml/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Jan Kara <[email protected]> Reported-by: Michael Stapelberg <[email protected]> Tested-by: Michael Stapelberg <[email protected]> Cc: Wu Fengguang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-03writeback: reliably update bandwidth estimationJan Kara4-16/+46
Currently we trigger writeback bandwidth estimation from balance_dirty_pages() and from wb_writeback(). However neither of these need to trigger when the system is relatively idle and writeback is triggered e.g. from fsync(2). Make sure writeback estimates happen reliably by triggering them from do_writepages(). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Jan Kara <[email protected]> Cc: Michael Stapelberg <[email protected]> Cc: Wu Fengguang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-03writeback: track number of inodes under writebackJan Kara4-2/+27
Patch series "writeback: Fix bandwidth estimates", v4. Fix estimate of writeback throughput when device is not fully busy doing writeback. Michael Stapelberg has reported that such workload (e.g. generated by linking) tends to push estimated throughput down to 0 and as a result writeback on the device is practically stalled. The first three patches fix the reported issue, the remaining two patches are unrelated cleanups of problems I've noticed when reading the code. This patch (of 4): Track number of inodes under writeback for each bdi_writeback structure. We will use this to decide whether wb does any IO and so we can estimate its writeback throughput. In principle we could use number of pages under writeback (WB_WRITEBACK counter) for this however normal percpu counter reads are too inaccurate for our purposes and summing the counter is too expensive. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Jan Kara <[email protected]> Cc: Wu Fengguang <[email protected]> Cc: Michael Stapelberg <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-03mm: add kernel_misc_reclaimable in show_free_areasliuhailong1-0/+2
Print NR_KERNEL_MISC_RECLAIMABLE stat from show_free_areas() so users can check whether the shrinker is working correctly and to show the current memory usage. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: liuhailong <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-03mm: report a more useful address for reclaim acquisitionMatthew Wilcox (Oracle)3-14/+14
A recent lockdep report included these lines: [ 96.177910] 3 locks held by containerd/770: [ 96.177934] #0: ffff88810815ea28 (&mm->mmap_lock#2){++++}-{3:3}, at: do_user_addr_fault+0x115/0x770 [ 96.177999] #1: ffffffff82915020 (rcu_read_lock){....}-{1:2}, at: get_swap_device+0x33/0x140 [ 96.178057] #2: ffffffff82955ba0 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x5/0x30 While it was not useful to that bug report to know where the reclaim lock had been acquired, it might be useful under other circumstances. Allow the caller of __fs_reclaim_acquire to specify the instruction pointer to use. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Cc: Omar Sandoval <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Boqun Feng <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-03mm/debug_vm_pgtable: fix corrupted page flagGavin Shan1-4/+51
In page table entry modifying tests, set_xxx_at() is used to populate the page table entries. On ARM64, the PG_arch_1 (PG_dcache_clean) flag is set on the target page if execution permission is given. The logic has existed since commit 4f04d8f00545 ("arm64: MMU definitions"). The page flag is kept when the page is freed to buddy's free area list. However, it will trigger page checking failure when it's pulled from the buddy's free area list, as the following warning messages indicate. BUG: Bad page state in process memhog pfn:08000 page:0000000015c0a628 refcount:0 mapcount:0 \ mapping:0000000000000000 index:0x1 pfn:0x8000 flags: 0x7ffff8000000800(arch_1|node=0|zone=0|lastcpupid=0xfffff) raw: 07ffff8000000800 dead000000000100 dead000000000122 0000000000000000 raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: PAGE_FLAGS_CHECK_AT_PREP flag(s) set This fixes the issue by clearing PG_arch_1 through flush_dcache_page() after set_xxx_at() is called. For architectures other than ARM64, the unexpected overhead of cache flushing is acceptable. Link: https://lkml.kernel.org/r/[email protected] Fixes: a5c3b9ffb0f4 ("mm/debug_vm_pgtable: add tests validating advanced arch page table helpers") Signed-off-by: Gavin Shan <[email protected]> Reviewed-by: Anshuman Khandual <[email protected]> Tested-by: Christophe Leroy <[email protected]> [powerpc 8xx] Tested-by: Gerald Schaefer <[email protected]> [s390] Cc: Aneesh Kumar K.V <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Chunyu Hu <[email protected]> Cc: Qian Cai <[email protected]> Cc: Vineet Gupta <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-03mm/debug_vm_pgtable: remove unused codeGavin Shan1-54/+0
The variables used by old implementation isn't needed as we switched to "struct pgtable_debug_args". Lets remove them and related code in debug_vm_pgtable(). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Gavin Shan <[email protected]> Reviewed-by: Anshuman Khandual <[email protected]> Tested-by: Christophe Leroy <[email protected]> [powerpc 8xx] Tested-by: Gerald Schaefer <[email protected]> [s390] Cc: Aneesh Kumar K.V <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Chunyu Hu <[email protected]> Cc: Qian Cai <[email protected]> Cc: Vineet Gupta <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-03mm/debug_vm_pgtable: use struct pgtable_debug_args in PGD and P4D modifying testsGavin Shan1-48/+38
This uses struct pgtable_debug_args in PGD/P4D modifying tests. No allocated huge page is used in these tests. Besides, the unused variables @saved_p4dp and @saved_pudp are dropped. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Gavin Shan <[email protected]> Reviewed-by: Anshuman Khandual <[email protected]> Tested-by: Christophe Leroy <[email protected]> [powerpc 8xx] Tested-by: Gerald Schaefer <[email protected]> [s390] Cc: Aneesh Kumar K.V <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Chunyu Hu <[email protected]> Cc: Qian Cai <[email protected]> Cc: Vineet Gupta <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-03mm/debug_vm_pgtable: use struct pgtable_debug_args in PUD modifying testsGavin Shan1-78/+48
This uses struct pgtable_debug_args in PUD modifying tests. The allocated huge page is used when set_pud_at() is used. The corresponding tests are skipped if the huge page doesn't exist. Besides, the following unused variables in debug_vm_pgtable() are dropped: @prot, @paddr, @pud_aligned. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Gavin Shan <[email protected]> Reviewed-by: Anshuman Khandual <[email protected]> Tested-by: Christophe Leroy <[email protected]> [powerpc 8xx] Tested-by: Gerald Schaefer <[email protected]> [s390] Cc: Aneesh Kumar K.V <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Chunyu Hu <[email protected]> Cc: Qian Cai <[email protected]> Cc: Vineet Gupta <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-03mm/debug_vm_pgtable: use struct pgtable_debug_args in PMD modifying testsGavin Shan1-52/+46
This uses struct pgtable_debug_args in PMD modifying tests. The allocated huge page is used when set_pmd_at() is used. The corresponding tests are skipped if the huge page doesn't exist. Besides, the unused variable @pmd_aligned in debug_vm_pgtable() is dropped. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Gavin Shan <[email protected]> Reviewed-by: Anshuman Khandual <[email protected]> Tested-by: Christophe Leroy <[email protected]> [powerpc 8xx] Tested-by: Gerald Schaefer <[email protected]> [s390] Cc: Aneesh Kumar K.V <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Chunyu Hu <[email protected]> Cc: Qian Cai <[email protected]> Cc: Vineet Gupta <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-03mm/debug_vm_pgtable: use struct pgtable_debug_args in PTE modifying testsGavin Shan1-35/+32
This uses struct pgtable_debug_args in PTE modifying tests. The allocated page is used as set_pte_at() is used there. The tests are skipped if the allocated page doesn't exist. It's notable that args->ptep need to be mapped before the tests. The reason why we don't map args->ptep at the beginning is PTE entry is only mapped and accessible in atomic context when CONFIG_HIGHPTE is enabled. So we avoid to do that so that atomic context is only enabled if needed. Besides, the unused variable @pte_aligned and @ptep in debug_vm_pgtable() are dropped. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Gavin Shan <[email protected]> Reviewed-by: Anshuman Khandual <[email protected]> Tested-by: Christophe Leroy <[email protected]> [powerpc 8xx] Tested-by: Gerald Schaefer <[email protected]> [s390] Cc: Aneesh Kumar K.V <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Chunyu Hu <[email protected]> Cc: Qian Cai <[email protected]> Cc: Vineet Gupta <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-03mm/debug_vm_pgtable: use struct pgtable_debug_args in migration and thp testsGavin Shan1-19/+17
This uses struct pgtable_debug_args in the migration and thp test functions. It's notable that the pre-allocated page is used in swap_migration_tests() as set_pte_at() is used there. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Gavin Shan <[email protected]> Reviewed-by: Anshuman Khandual <[email protected]> Tested-by: Christophe Leroy <[email protected]> [powerpc 8xx] Tested-by: Gerald Schaefer <[email protected]> [s390] Cc: Aneesh Kumar K.V <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Chunyu Hu <[email protected]> Cc: Qian Cai <[email protected]> Cc: Vineet Gupta <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-03mm/debug_vm_pgtable: use struct pgtable_debug_args in soft_dirty and swap testsGavin Shan1-25/+23
This uses struct pgtable_debug_args in the soft_dirty and swap test functions. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Gavin Shan <[email protected]> Reviewed-by: Anshuman Khandual <[email protected]> Tested-by: Christophe Leroy <[email protected]> [powerpc 8xx] Tested-by: Gerald Schaefer <[email protected]> [s390] Cc: Aneesh Kumar K.V <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Chunyu Hu <[email protected]> Cc: Qian Cai <[email protected]> Cc: Vineet Gupta <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-03mm/debug_vm_pgtable: use struct pgtable_debug_args in protnone and devmap testsGavin Shan1-32/+26
This uses struct pgtable_debug_args in protnone and devmap test functions. After that, the unused variable @protnone in debug_vm_pgtable() is dropped. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Gavin Shan <[email protected]> Reviewed-by: Anshuman Khandual <[email protected]> Tested-by: Christophe Leroy <[email protected]> [powerpc 8xx] Tested-by: Gerald Schaefer <[email protected]> [s390] Cc: Aneesh Kumar K.V <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Chunyu Hu <[email protected]> Cc: Qian Cai <[email protected]> Cc: Vineet Gupta <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-03mm/debug_vm_pgtable: use struct pgtable_debug_args in leaf and savewrite testsGavin Shan1-16/+16
This uses struct pgtable_debug_args in the leaf and savewrite test functions. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Gavin Shan <[email protected]> Reviewed-by: Anshuman Khandual <[email protected]> Tested-by: Christophe Leroy <[email protected]> [powerpc 8xx] Tested-by: Gerald Schaefer <[email protected]> [s390] Cc: Aneesh Kumar K.V <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Chunyu Hu <[email protected]> Cc: Qian Cai <[email protected]> Cc: Vineet Gupta <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-03mm/debug_vm_pgtable: use struct pgtable_debug_args in basic testsGavin Shan1-26/+24
This uses struct pgtable_debug_args in the basic test functions. The unused variables @pgd_aligned and @p4d_aligned in debug_vm_pgtable() are dropped. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Gavin Shan <[email protected]> Reviewed-by: Anshuman Khandual <[email protected]> Tested-by: Christophe Leroy <[email protected]> [powerpc 8xx] Tested-by: Gerald Schaefer <[email protected]> [s390] Cc: Aneesh Kumar K.V <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Chunyu Hu <[email protected]> Cc: Qian Cai <[email protected]> Cc: Vineet Gupta <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-03mm/debug_vm_pgtable: introduce struct pgtable_debug_argsGavin Shan1-1/+269
Patch series "mm/debug_vm_pgtable: Enhancements", v6. There are a couple of issues with current implementations and this series tries to resolve the issues: (a) All needed information is scattered in variables, passed to various test functions. The code is organized in a pretty relaxed fashion. (b) The page isn't allocated from buddy during page table entry modifying tests. The page can be invalid, conflicting with the implementations of set_xxx_at() on ARM64. The target page is accessed so that the iCache can be flushed when execution permission is given on ARM64. Besides, the target page can be unmapped and accessing it causes a kernel crash. "struct pgtable_debug_args" is introduced to address issue (a). For issue (b), the used page is allocated from buddy in page table entry modifying tests. The corresponding tests will be skipped if we fail to allocate the (huge) page. For other test cases, the original page around the kernel symbol (@start_kernel) is still used. The patches are organized as below. PATCH[2-10] could be combined to one patch, but it will make the review harder: PATCH[1] introduces "struct pgtable_debug_args" as place holder of all needed information. With it, the old and new implementation can coexist. PATCH[2-10] uses "struct pgtable_debug_args" in various test functions. PATCH[11] removes the unused code for old implementation. PATCH[12] fixes the issue of corrupted page flag for ARM64. This patch (of 6): In debug_vm_pgtable(), there are many local variables introduced to track the needed information and they are passed to the functions for various test cases. It would be better to introduce a struct as place holder for this information. With it, what the test functions need is the struct. In this way, the code is simplified and easier to maintain. Besides, set_xxx_at() could access the data on the corresponding pages in the page table modifying tests. So the accessed pages in the tests should have been allocated from buddy.
Otherwise, we're accessing pages that aren't owned by us. This causes issues like page flag corruption or kernel crash on accessing unmapped page when CONFIG_DEBUG_PAGEALLOC is enabled. This introduces "struct pgtable_debug_args". The struct is initialized and destroyed, but the information in the struct isn't used yet. It will be used in subsequent patches. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Gavin Shan <[email protected]> Reviewed-by: Anshuman Khandual <[email protected]> Tested-by: Christophe Leroy <[email protected]> [powerpc 8xx] Tested-by: Gerald Schaefer <[email protected]> [s390] Cc: Anshuman Khandual <[email protected]> Cc: Aneesh Kumar K.V <[email protected]> Cc: Qian Cai <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Cc: Vineet Gupta <[email protected]> Cc: Chunyu Hu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-03arch/csky/kernel/probes/kprobes.c: fix bugon.cocci warningskernel test robot1-2/+1
Use BUG_ON instead of an if condition followed by BUG. Generated by: scripts/coccinelle/misc/bugon.cocci Link: https://lkml.kernel.org/r/alpine.DEB.2.22.394.2107061049150.7197@hadrien Fixes: 7d37cb2c912d ("lib: fix kconfig dependency on ARCH_WANT_FRAME_POINTERS") Signed-off-by: kernel test robot <[email protected]> Signed-off-by: Julia Lawall <[email protected]> Reported-by: kernel test robot <[email protected]> Cc: Julian Braha <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-03ocfs2: ocfs2_downconvert_lock failure results in deadlockGang He1-0/+12
Usually, the ocfs2_downconvert_lock() function downconverts the dlm lock to the expected level to satisfy dlm bast requests from the other nodes. But there is a rare situation. When dlm lock conversion is being canceled, ocfs2_downconvert_lock() will return -EBUSY. Note that the ocfs2_cancel_convert() function is asynchronous in the fsdlm implementation. If we do not requeue this lockres entry, the ocfs2 downconvert thread no longer handles this dlm lock bast request. Then the other nodes will not get the dlm lock again, and the current node's process will be blocked when it tries to acquire this dlm lock again. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Gang He <[email protected]> Reviewed-by: Joseph Qi <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Changwei Ge <[email protected]> Cc: Gang He <[email protected]> Cc: Jun Piao <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-03ocfs2: quota_local: fix possible uninitialized-variable access in ocfs2_local_read_info()Tuo Li2-1/+2
A memory block is allocated through kmalloc(), and its return value is assigned to the pointer oinfo. However, oinfo->dqi_gqinode is not initialized but it is accessed in: iput(oinfo->dqi_gqinode); To fix this possible uninitialized-variable access, assign NULL to oinfo->dqi_gqinode, and add ocfs2_qinfo_lock_res_init() behind the assignment in ocfs2_local_read_info(). Remove ocfs2_qinfo_lock_res_init() in ocfs2_global_read_info(). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Tuo Li <[email protected]> Reported-by: TOTE Robot <[email protected]> Reviewed-by: Joseph Qi <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Changwei Ge <[email protected]> Cc: Gang He <[email protected]> Cc: Jun Piao <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
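The bug class here is a field of a freshly allocated, non-zeroed structure being read on a cleanup path before anything was stored in it. A minimal userspace sketch of the fix pattern follows, using malloc() in place of kmalloc(). All names (`struct toy_inode`, `struct toy_quota_info`, `toy_iput()` and the alloc/free helpers) are hypothetical and do not reflect ocfs2's real structures.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an inode: only a reference count. */
struct toy_inode {
	int refs;
};

/* Hypothetical stand-in for the per-mount quota info structure. */
struct toy_quota_info {
	struct toy_inode *gqinode;  /* released on every cleanup path */
	int flags;
};

/* Drop an inode reference; a NULL pointer is a no-op, as with iput(). */
static void toy_iput(struct toy_inode *inode)
{
	if (inode)
		inode->refs--;
}

/*
 * malloc(), like kmalloc(), returns uninitialized memory. Set the pointer
 * to NULL right after allocation so that a cleanup path reached before
 * the inode is looked up sees a well-defined value.
 */
static struct toy_quota_info *toy_quota_info_alloc(void)
{
	struct toy_quota_info *oinfo = malloc(sizeof(*oinfo));

	if (!oinfo)
		return NULL;
	oinfo->gqinode = NULL;
	oinfo->flags = 0;
	return oinfo;
}

/* Cleanup that is safe whether or not gqinode was ever assigned. */
static void toy_quota_info_free(struct toy_quota_info *oinfo)
{
	if (!oinfo)
		return;
	toy_iput(oinfo->gqinode);
	free(oinfo);
}
```

Initializing at the allocation site, rather than in a later helper, keeps every early-exit path between the two points safe.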
2021-09-03ocfs2: remove an unnecessary conditionDan Carpenter1-1/+1
The case where "tmp_oh" is NULL is handled at the start of the function. At this point we know it's non-NULL so this will always return 1. Link: https://lkml.kernel.org/r/YOcItgIXtisi3MaO@mwanda Signed-off-by: Dan Carpenter <[email protected]> Reviewed-by: Joseph Qi <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Changwei Ge <[email protected]> Cc: Gang He <[email protected]> Cc: Jun Piao <[email protected]> Cc: Larry Chen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-03ia64: make num_rsvd_regions staticGeert Uytterhoeven2-2/+1
Commit f62800992e5917f2 ("ia64: switch to NO_BOOTMEM") removed the last user of num_rsvd_regions outside arch/ia64/kernel/setup.c. Link: https://lkml.kernel.org/r/a377b5437e3e9da93d02f996fe06a2b956cb0990.1629884459.git.geert+renesas@glider.be Signed-off-by: Geert Uytterhoeven <[email protected]> Cc: Frank Rowand <[email protected]> Cc: Jay Lan <[email protected]> Cc: Magnus Damm <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Rob Herring <[email protected]> Cc: Simon Horman <[email protected]> Cc: Tony Luck <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-03ia64: make reserve_elfcorehdr() staticGeert Uytterhoeven2-27/+25
There never was a reason for reserve_elfcorehdr() to be global. Make the function static, and move it before its sole caller. Link: https://lkml.kernel.org/r/fe236cd73b64abc4abd03dd808cb015c907f4c8c.1629884459.git.geert+renesas@glider.be Fixes: cee87af2a5f75713 ("[IA64] kexec: Use EFI_LOADER_DATA for ELF core header") Signed-off-by: Geert Uytterhoeven <[email protected]> Cc: Frank Rowand <[email protected]> Cc: Jay Lan <[email protected]> Cc: Magnus Damm <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Rob Herring <[email protected]> Cc: Simon Horman <[email protected]> Cc: Tony Luck <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-03ia64: fix #endif comment for reserve_elfcorehdr()Geert Uytterhoeven1-1/+1
Patch series "ia64: Miscellaneous fixes and cleanups". This patch series contains some miscellaneous fixes and cleanups for ia64. The second patch fixes a naming conflict triggered by a patch for the FDT code. This patch (of 3): The definition of reserve_elfcorehdr() depends on CONFIG_CRASH_DUMP, not CONFIG_PROC_VMCORE. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/77b4c0648f200cab7e1c2c5171c06763e09362aa.1629884459.git.geert+renesas@glider.be Fixes: d9a9855d0b06ca6d ("always reserve elfcore header memory in crash kernel") Fixes: 17c1f07ed70afa4f ("[IA64] Reserve elfcorehdr memory in CONFIG_CRASH_DUMP") Signed-off-by: Geert Uytterhoeven <[email protected]> Cc: Simon Horman <[email protected]> Cc: Tony Luck <[email protected]> Cc: Jay Lan <[email protected]> Cc: Magnus Damm <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Rob Herring <[email protected]> Cc: Frank Rowand <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-03ia64: fix typo in a commentJason Wang1-1/+1
s/when when/when/ Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Jason Wang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-03Merge branch 'linux-next' of ↵Konrad Rzeszutek Wilk1-8/+16
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/ibft into HEAD * 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/ibft: iscsi_ibft: Fix isa_bus_to_virt not working under ARM
2021-09-03Merge branch 'fixes' into nextMichael Ellerman18-81/+125
Merge our fixes branch into next. That lets us resolve a conflict in arch/powerpc/sysdev/xive/common.c. Between cbc06f051c52 ("powerpc/xive: Do not skip CPU-less nodes when creating the IPIs"), which moved request_irq() out of xive_init_ipis(), and 17df41fec5b8 ("powerpc: use IRQF_NO_DEBUG for IPIs") which added IRQF_NO_DEBUG to that request_irq() call, which has now moved.
2021-09-03parisc: Fix unaligned-access crash in bootloaderHelge Deller1-1/+1
Kernel v5.14 has various changes to optimize unaligned memory accesses, e.g. commit 0652035a5794 ("asm-generic: unaligned: remove byteshift helpers"). Those changes triggered an unaligned-access exception and thus crashed the bootloader on parisc, because the unaligned "output_len" variable was suddenly read word-wise, whereas it had been read byte-wise in the past. Fix this issue by declaring the external output_len variable as char, which forces the compiler to generate byte accesses. Signed-off-by: Helge Deller <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: John David Anglin <[email protected]> Bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162 Fixes: 8c031ba63f8f ("parisc: Unbreak bootloader due to gcc-7 optimizations") Fixes: 0652035a5794 ("asm-generic: unaligned: remove byteshift helpers") Cc: <[email protected]> # v5.14+
2021-09-03kbuild: redo fake deps at include/ksym/*.hMasahiro Yamada2-5/+4
Commit 0e0345b77ac4 ("kbuild: redo fake deps at include/config/*.h") simplified the Kconfig/fixdep interaction a lot. For CONFIG_FOO_BAR_BAZ, Kconfig now touches include/config/FOO_BAR_BAZ instead of the previous include/config/foo/bar/baz.h . This commit simplifies the TRIM_UNUSED_KSYMS feature in a similar way: - delete .h suffix - delete tolower() - put everything in 1 directory For EXPORT_SYMBOL(FOO_BAR_BAZ), scripts/adjust_autoksyms.sh now touches include/ksym/FOO_BAR_BAZ instead of include/ksym/foo/bar/baz.h . This is more precise, avoiding possibly unnecessary rebuilds. EXPORT_SYMBOL(FOO_BAR_BAZ) EXPORT_SYMBOL(_FOO_BAR_BAZ) EXPORT_SYMBOL(__FOO_BAR_BAZ) were previously mapped to the same header, include/ksym/foo/bar/baz.h but now are handled separately. Signed-off-by: Masahiro Yamada <[email protected]>
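The difference between the two naming schemes can be sketched in Python. This is only an illustration modeled on the examples in the commit message above; the function names old_ksym_path() and new_ksym_path() are hypothetical, and the real logic lives in scripts/adjust_autoksyms.sh, whose exact implementation is not reproduced here.

```python
import re


def old_ksym_path(symbol: str) -> str:
    """Approximate the previous mapping (assumption based on the examples):
    drop leading underscores, lowercase, turn underscores into directories,
    and append a .h suffix."""
    stem = symbol.lstrip("_").lower()
    return "include/ksym/" + re.sub(r"_+", "/", stem) + ".h"


def new_ksym_path(symbol: str) -> str:
    """New mapping: one flat directory, symbol name kept verbatim."""
    return "include/ksym/" + symbol


if __name__ == "__main__":
    for sym in ("FOO_BAR_BAZ", "_FOO_BAR_BAZ", "__FOO_BAR_BAZ"):
        print(sym, old_ksym_path(sym), new_ksym_path(sym))
```

Under the old scheme all three symbols collapse onto one file, so touching it rebuilds every user of any of them; under the new scheme each symbol has its own file.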
2021-09-03kbuild: clean up objtool_args slightlyMasahiro Yamada1-6/+5
The code: $(if $(or $(CONFIG_GCOV_KERNEL),$(CONFIG_LTO_CLANG)), ...) ... can be simplified to: $(if $(CONFIG_GCOV_KERNEL)$(CONFIG_LTO_CLANG), ...) Also, remove meaningless commas at the end of $(if ...). Signed-off-by: Masahiro Yamada <[email protected]>
2021-09-03modpost: get the *.mod file path more simplyMasahiro Yamada3-16/+11
get_src_version() strips 'o' or 'lto.o' from the end of the object file path (so, postfixlen is 1 or 5), then adds 'mod'. If you look at the code closely, mod->name already holds the base path with the extension stripped. Most of the code changes made by commit 7ac204b545f2 ("modpost: lto: strip .lto from module names") were actually unnecessary. sumversion.c does not need strends(), so it can become local to modpost.c again. Signed-off-by: Masahiro Yamada <[email protected]>
2021-09-03checkkconfigsymbols.py: Fix the '--ignore' optionAriel Marcovitch1-1/+1
It seems like the implementation of the --ignore option is broken. In check_symbols_helper, when going through the list of files, a file is added to the list of source files to check if it matches the ignore pattern. Instead, as stated in the comment below this condition, the file should be added if it doesn't match the pattern. This means that when providing an ignore pattern, the only files that will be checked will be the ones we want to ignore, in addition to the Kconfig files that don't match the pattern (the check in parse_kconfig_files is done right). Signed-off-by: Ariel Marcovitch <[email protected]> Signed-off-by: Masahiro Yamada <[email protected]>
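The inverted condition described above can be illustrated with a small, self-contained Python sketch. The helper names select_sources_buggy() and select_sources_fixed() are hypothetical and do not appear in checkkconfigsymbols.py; the use of re.match() is likewise an assumption made for the illustration.

```python
import re


def select_sources_buggy(files, ignore=None):
    """Inverted check: with an ignore pattern, only matching files are kept."""
    selected = []
    for path in files:
        if ignore and not re.match(ignore, path):
            continue  # wrongly skips the files that should be checked
        selected.append(path)
    return selected


def select_sources_fixed(files, ignore=None):
    """Intended behavior: skip files that match the ignore pattern."""
    selected = []
    for path in files:
        if ignore and re.match(ignore, path):
            continue  # skip ignored files
        selected.append(path)
    return selected


if __name__ == "__main__":
    files = ["drivers/a.c", "tools/b.c"]
    print(select_sources_buggy(files, r"tools/"))
    print(select_sources_fixed(files, r"tools/"))
```

With no ignore pattern both variants return every file, which is presumably why the bug only shows up when --ignore is actually passed.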
2021-09-03kbuild: merge vmlinux_link() between ARCH=um and other architecturesMasahiro Yamada1-33/+23
For ARCH=um, ${CC} is used as the linker driver. Hence, the linker options are prefixed with -Wl, . Merge the similar code. I replaced the -T option with the long option --script= so that it works well with/without ${wl}. Signed-off-by: Masahiro Yamada <[email protected]> Reviewed-by: Kees Cook <[email protected]>
2021-09-03kbuild: do not remove 'linux' link in scripts/link-vmlinux.shMasahiro Yamada1-1/+0
arch/um/Makefile passes the -f option to the ln command: linux: vmlinux @echo ' LINK $@' $(Q)ln -f $< $@ So, the hard link is always re-created, and the old one is removed anyway. Signed-off-by: Masahiro Yamada <[email protected]> Reviewed-by: Kees Cook <[email protected]>
2021-09-03kbuild: merge vmlinux_link() between the ordinary link and Clang LTOMasahiro Yamada1-16/+14
When Clang LTO is enabled, vmlinux_link() reuses vmlinux.o instead of re-linking ${KBUILD_VMLINUX_OBJS} and ${KBUILD_VMLINUX_LIBS}. That is the only difference here, so merge the similar code. Signed-off-by: Masahiro Yamada <[email protected]> Reviewed-by: Kees Cook <[email protected]>
2021-09-03kbuild: remove stale *.symversionsMasahiro Yamada1-0/+2
cmd_update_lto_symversions merges all the existing *.symversions, but some of them might be stale. If the last EXPORT_SYMBOL is removed from a C file, the *.symversions file is not deleted or updated. It contains stale CRCs, but still they will be used for linking the vmlinux or modules. It is not a big deal when the EXPORT_SYMBOL is really removed. However, when the EXPORT_SYMBOL is moved to another file, the same __crc_<symbol> will appear twice in the merged *.symversions, possibly with different CRCs if the function argument is changed at the same time. It would confuse module versioning. If no EXPORT_SYMBOL is found, let's remove *.symversions explicitly. Signed-off-by: Masahiro Yamada <[email protected]> Reviewed-by: Kees Cook <[email protected]>
2021-09-03kbuild: remove unused quiet_cmd_update_lto_symversionsMasahiro Yamada1-1/+0
This is not used anywhere because the short log is displayed when it is used through a $(call cmd,...) invocation. Signed-off-by: Masahiro Yamada <[email protected]> Reviewed-by: Kees Cook <[email protected]>
2021-09-03gen_compile_commands: extract compiler command from a series of commandsMasahiro Yamada1-1/+1
The current gen_compile_commands.py assumes that objects are always built by a single command. It makes sense to support cases where objects are built by a series of commands: cmd_<object> := <command1> ; <command2> One use-case is that <command1> is a compiler command, and <command2> an objtool command. It allows *.cmd files to contain an objtool command so that any change in it triggers object rebuilds. If ; appears after the C source file, take the first command. Signed-off-by: Masahiro Yamada <[email protected]> Reviewed-by: Kees Cook <[email protected]>
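As a rough sketch of the rule "if ; appears after the C source file, take the first command", the following Python snippet parses a cmd_<object> line. The regular expression and the function name extract_compile_command() are illustrative assumptions, not necessarily what gen_compile_commands.py uses.

```python
import re

# Assumed pattern: the compiler command ends at the C source file, which is
# followed either by the end of the line or by ';' starting the next command.
LINE_PATTERN = re.compile(r"^cmd_[^ ]*\.o := (.* )([^ ]*\.c) *(;|$)")


def extract_compile_command(line):
    """Return (compiler_command, source_file), or None if the line does not
    look like a C compile command."""
    match = LINE_PATTERN.match(line)
    if match is None:
        return None
    prefix, source = match.group(1), match.group(2)
    return prefix + source, source


if __name__ == "__main__":
    line = "cmd_foo.o := gcc -c -o foo.o foo.c ; objtool check foo.o"
    print(extract_compile_command(line))
```

A line with a single command and no trailing ; is handled by the same pattern, so existing *.cmd files keep working.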
2021-09-03x86: remove cc-option-yn test for -mtune=Nick Desaulniers1-6/+0
As noted in the comment, -mtune= has been supported since GCC 3.4. The minimum required version of GCC to build the kernel (as specified in Documentation/process/changes.rst) is GCC 4.9. tune is not immediately expanded. Instead it defines a macro that will test via cc-option later values for -mtune=. But we can skip the test whether to use -mtune= vs. -mcpu=. Signed-off-by: Nick Desaulniers <[email protected]> Reviewed-by: Nathan Chancellor <[email protected]> Reviewed-by: Miguel Ojeda <[email protected]> Signed-off-by: Masahiro Yamada <[email protected]>