aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2019-03-05memcg: localize memcg_kmem_enabled() checkShakeel Butt5-20/+44
Move the memcg_kmem_enabled() checks into memcg kmem charge/uncharge functions, so, the users don't have to explicitly check that condition. This is purely code cleanup patch without any functional change. Only the order of checks in memcg_charge_slab() can potentially be changed but the functionally it will be same. This should not matter as memcg_charge_slab() is not in the hot path. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Shakeel Butt <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Vladimir Davydov <[email protected]> Cc: Roman Gushchin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-03-05mm, slub: make the comment of put_cpu_partial() completeWei Yang1-2/+2
There are two cases when put_cpu_partial() is invoked. * __slab_free * get_partial_node This patch just makes it cover these two cases. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Wei Yang <[email protected]> Acked-by: Christoph Lameter <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: David Rientjes <[email protected]> Cc: Joonsoo Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-03-05mm: reuse only-pte-mapped KSM page in do_wp_page()Kirill Tkhai3-4/+49
Add an optimization for KSM pages almost in the same way that we have for ordinary anonymous pages. If there is a write fault in a page, which is mapped to an only pte, and it is not related to swap cache; the page may be reused without copying its content. [ Note that we do not consider PageSwapCache() pages at least for now, since we don't want to complicate __get_ksm_page(), which has nice optimization based on this (for the migration case). Currenly it is spinning on PageSwapCache() pages, waiting for when they have unfreezed counters (i.e., for the migration finish). But we don't want to make it also spinning on swap cache pages, which we try to reuse, since there is not a very high probability to reuse them. So, for now we do not consider PageSwapCache() pages at all. ] So in reuse_ksm_page() we check for 1) PageSwapCache() and 2) page_stable_node(), to skip a page, which KSM is currently trying to link to stable tree. Then we do page_ref_freeze() to prohibit KSM to merge one more page into the page, we are reusing. After that, nobody can refer to the reusing page: KSM skips !PageSwapCache() pages with zero refcount; and the protection against of all other participants is the same as for reused ordinary anon pages pte lock, page lock and mmap_sem. [[email protected]: replace BUG_ON()s with WARN_ON()s] Link: http://lkml.kernel.org/r/154471491016.31352.1168978849911555609.stgit@localhost.localdomain Signed-off-by: Kirill Tkhai <[email protected]> Reviewed-by: Yang Shi <[email protected]> Cc: "Kirill A. Shutemov" <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: Christian Koenig <[email protected]> Cc: Claudio Imbrenda <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Huang Ying <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Kirill Tkhai <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-03-05tools/: replace open encodings for NUMA_NO_NODEStephen Rothwell2-3/+20
This replaces all open encodings in tools with NUMA_NO_NODE. Also linux/numa.h is now needed for the perf build. [[email protected]: fix for replace open encodings for NUMA_NO_NODE] Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Stephen Rothwell <[email protected]> Signed-off-by: Anshuman Khandual <[email protected]> Signed-off-by: Stephen Rothwell <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Doug Ledford <[email protected]> [drivers/infiniband] Cc: Hans Verkuil <[email protected]> Cc: Jeff Kirsher <[email protected]> [ixgbe] Cc: Jens Axboe <[email protected]> [mtip32xx] Cc: Joseph Qi <[email protected]> Cc: Michael Ellerman <[email protected]> [powerpc] Cc: Vinod Koul <[email protected]> [dmaengine.c] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-03-05mm: replace all open encodings for NUMA_NO_NODEAnshuman Khandual39-74/+104
Patch series "Replace all open encodings for NUMA_NO_NODE", v3. All these places for replacement were found by running the following grep patterns on the entire kernel code. Please let me know if this might have missed some instances. This might also have replaced some false positives. I will appreciate suggestions, inputs and review. 1. git grep "nid == -1" 2. git grep "node == -1" 3. git grep "nid = -1" 4. git grep "node = -1" This patch (of 2): At present there are multiple places where invalid node number is encoded as -1. Even though implicitly understood it is always better to have macros in there. Replace these open encodings for an invalid node number with the global macro NUMA_NO_NODE. This helps remove NUMA related assumptions like 'invalid node' from various places redirecting them to a common definition. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Anshuman Khandual <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Acked-by: Jeff Kirsher <[email protected]> [ixgbe] Acked-by: Jens Axboe <[email protected]> [mtip32xx] Acked-by: Vinod Koul <[email protected]> [dmaengine.c] Acked-by: Michael Ellerman <[email protected]> [powerpc] Acked-by: Doug Ledford <[email protected]> [drivers/infiniband] Cc: Joseph Qi <[email protected]> Cc: Hans Verkuil <[email protected]> Cc: Stephen Rothwell <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-03-05mm/vmalloc.c: don't dereference possible NULL pointer in __vunmap()Liviu Dudau1-1/+1
find_vmap_area() can return a NULL pointer and we're going to dereference it without checking it first. Use the existing find_vm_area() function which does exactly what we want and checks for the NULL pointer. Link: http://lkml.kernel.org/r/[email protected] Fixes: f3c01d2f3ade ("mm: vmalloc: avoid racy handling of debugobjects in vunmap") Signed-off-by: Liviu Dudau <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Chintan Pandya <[email protected]> Cc: Andrey Ryabinin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-03-05PM/Hibernate: exclude all PageOffline() pagesDavid Hildenbrand1-2/+7
The content of pages that are marked PG_offline is not of interest (e.g. inflated by a balloon driver), let's skip these pages. In saveable_highmem_page(), move the PageReserved() check to a new check along with the PageOffline() check to separate it from the swsusp checks. [[email protected]: v2] Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Acked-by: Pavel Machek <[email protected]> Acked-by: Rafael J. Wysocki <[email protected]> Cc: Pavel Machek <[email protected]> Cc: Len Brown <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Michal Hocko <[email protected]> Cc: "Michael S. Tsirkin" <[email protected]> Cc: Alexander Duyck <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Baoquan He <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Christian Hansen <[email protected]> Cc: Dave Young <[email protected]> Cc: David Rientjes <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Haiyang Zhang <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Juergen Gross <[email protected]> Cc: Julien Freche <[email protected]> Cc: Kairui Song <[email protected]> Cc: Kazuhito Hagio <[email protected]> Cc: "Kirill A. Shutemov" <[email protected]> Cc: Konstantin Khlebnikov <[email protected]> Cc: "K. Y. Srinivasan" <[email protected]> Cc: Lianbo Jiang <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Miles Chen <[email protected]> Cc: Nadav Amit <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: Omar Sandoval <[email protected]> Cc: Pankaj gupta <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Stephen Hemminger <[email protected]> Cc: Stephen Rothwell <[email protected]> Cc: Vitaly Kuznetsov <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Xavier Deguillard <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-03-05PM/Hibernate: use pfn_to_online_page()David Hildenbrand1-4/+4
Let's use pfn_to_online_page() instead of pfn_to_page() when checking for saveable pages to not save/restore offline memory sections. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Suggested-by: Michal Hocko <[email protected]> Acked-by: Michal Hocko <[email protected]> Acked-by: Pavel Machek <[email protected]> Acked-by: Rafael J. Wysocki <[email protected]> Cc: Len Brown <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: "Michael S. Tsirkin" <[email protected]> Cc: Alexander Duyck <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Baoquan He <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Christian Hansen <[email protected]> Cc: Dave Young <[email protected]> Cc: David Rientjes <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Haiyang Zhang <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Juergen Gross <[email protected]> Cc: Julien Freche <[email protected]> Cc: Kairui Song <[email protected]> Cc: Kazuhito Hagio <[email protected]> Cc: "Kirill A. Shutemov" <[email protected]> Cc: Konstantin Khlebnikov <[email protected]> Cc: "K. Y. Srinivasan" <[email protected]> Cc: Lianbo Jiang <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Miles Chen <[email protected]> Cc: Nadav Amit <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: Omar Sandoval <[email protected]> Cc: Pankaj gupta <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Stephen Hemminger <[email protected]> Cc: Stephen Rothwell <[email protected]> Cc: Vitaly Kuznetsov <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Xavier Deguillard <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-03-05vmw_balloon: mark inflated pages PG_offlineDavid Hildenbrand1-0/+32
Mark inflated and never onlined pages PG_offline, to tell the world that the content is stale and should not be dumped. [[email protected]: use vmballoon_page_in_frames more widely] Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Acked-by: Nadav Amit <[email protected]> Cc: Xavier Deguillard <[email protected]> Cc: Nadav Amit <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Julien Freche <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Michal Hocko <[email protected]> Cc: "Michael S. Tsirkin" <[email protected]> Cc: Alexander Duyck <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Baoquan He <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Christian Hansen <[email protected]> Cc: Dave Young <[email protected]> Cc: David Rientjes <[email protected]> Cc: Haiyang Zhang <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Juergen Gross <[email protected]> Cc: Kairui Song <[email protected]> Cc: Kazuhito Hagio <[email protected]> Cc: "Kirill A. Shutemov" <[email protected]> Cc: Konstantin Khlebnikov <[email protected]> Cc: "K. Y. Srinivasan" <[email protected]> Cc: Len Brown <[email protected]> Cc: Lianbo Jiang <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Miles Chen <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: Omar Sandoval <[email protected]> Cc: Pankaj gupta <[email protected]> Cc: Pavel Machek <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Rafael J. Wysocki <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Stephen Hemminger <[email protected]> Cc: Stephen Rothwell <[email protected]> Cc: Vitaly Kuznetsov <[email protected]> Cc: Vlastimil Babka <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-03-05hv_balloon: mark inflated pages PG_offlineDavid Hildenbrand1-2/+12
Mark inflated and never onlined pages PG_offline, to tell the world that the content is stale and should not be dumped. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Acked-by: Pankaj gupta <[email protected]> Cc: "K. Y. Srinivasan" <[email protected]> Cc: Haiyang Zhang <[email protected]> Cc: Stephen Hemminger <[email protected]> Cc: Kairui Song <[email protected]> Cc: Vitaly Kuznetsov <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Michal Hocko <[email protected]> Cc: "Michael S. Tsirkin" <[email protected]> Cc: Alexander Duyck <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Baoquan He <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Christian Hansen <[email protected]> Cc: Dave Young <[email protected]> Cc: David Rientjes <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Juergen Gross <[email protected]> Cc: Julien Freche <[email protected]> Cc: Kazuhito Hagio <[email protected]> Cc: "Kirill A. Shutemov" <[email protected]> Cc: Konstantin Khlebnikov <[email protected]> Cc: Len Brown <[email protected]> Cc: Lianbo Jiang <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Miles Chen <[email protected]> Cc: Nadav Amit <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: Omar Sandoval <[email protected]> Cc: Pavel Machek <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Rafael J. Wysocki <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Stephen Rothwell <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Xavier Deguillard <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-03-05xen/balloon: mark inflated pages PG_offlineDavid Hildenbrand1-0/+3
Mark inflated and never onlined pages PG_offline, to tell the world that the content is stale and should not be dumped. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Reviewed-by: Juergen Gross <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Michal Hocko <[email protected]> Cc: "Michael S. Tsirkin" <[email protected]> Cc: Alexander Duyck <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Baoquan He <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Christian Hansen <[email protected]> Cc: Dave Young <[email protected]> Cc: David Rientjes <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Haiyang Zhang <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Julien Freche <[email protected]> Cc: Kairui Song <[email protected]> Cc: Kazuhito Hagio <[email protected]> Cc: "Kirill A. Shutemov" <[email protected]> Cc: Konstantin Khlebnikov <[email protected]> Cc: "K. Y. Srinivasan" <[email protected]> Cc: Len Brown <[email protected]> Cc: Lianbo Jiang <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Miles Chen <[email protected]> Cc: Nadav Amit <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: Omar Sandoval <[email protected]> Cc: Pankaj gupta <[email protected]> Cc: Pavel Machek <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Rafael J. Wysocki <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Stephen Hemminger <[email protected]> Cc: Stephen Rothwell <[email protected]> Cc: Vitaly Kuznetsov <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Xavier Deguillard <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-03-05kexec: export PG_offline to VMCOREINFODavid Hildenbrand1-0/+2
Right now, pages inflated as part of a balloon driver will be dumped by dump tools like makedumpfile. While XEN is able to check in the crash kernel whether a certain pfn is actuall backed by memory in the hypervisor (see xen_oldmem_pfn_is_ram) and optimize this case, dumps of other balloon inflated memory will essentially result in zero pages getting allocated by the hypervisor and the dump getting filled with this data. The allocation and reading of zero pages can directly be avoided if a dumping tool could know which pages only contain stale information not to be dumped. We now have PG_offline which can be (and already is by virtio-balloon) used for marking pages as logically offline. Follow up patches will make use of this flag also in other balloon implementations. Let's export PG_offline via PAGE_OFFLINE_MAPCOUNT_VALUE, so makedumpfile can directly skip pages that are logically offline and the content therefore stale. Please note that this is also helpful for a problem we were seeing under Hyper-V: Dumping logically offline memory (pages kept fake offline while onlining a section via online_page_callback) would under some condicions result in a kernel panic when dumping them. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]> Acked-by: Dave Young <[email protected]> Cc: "Kirill A. Shutemov" <[email protected]> Cc: Baoquan He <[email protected]> Cc: Omar Sandoval <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Lianbo Jiang <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Kazuhito Hagio <[email protected]> Cc: Alexander Duyck <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Christian Hansen <[email protected]> Cc: David Rientjes <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Haiyang Zhang <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Juergen Gross <[email protected]> Cc: Julien Freche <[email protected]> Cc: Kairui Song <[email protected]> Cc: Konstantin Khlebnikov <[email protected]> Cc: "K. Y. Srinivasan" <[email protected]> Cc: Len Brown <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Miles Chen <[email protected]> Cc: Nadav Amit <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: Pankaj gupta <[email protected]> Cc: Pavel Machek <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Rafael J. Wysocki <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Stephen Hemminger <[email protected]> Cc: Stephen Rothwell <[email protected]> Cc: Vitaly Kuznetsov <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Xavier Deguillard <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-03-05mm: convert PG_balloon to PG_offlineDavid Hildenbrand6-15/+21
PG_balloon was introduced to implement page migration/compaction for pages inflated in virtio-balloon. Nowadays, it is only a marker that a page is part of virtio-balloon and therefore logically offline. We also want to make use of this flag in other balloon drivers - for inflated pages or when onlining a section but keeping some pages offline (e.g. used right now by XEN and Hyper-V via set_online_page_callback()). We are going to expose this flag to dump tools like makedumpfile. But instead of exposing PG_balloon, let's generalize the concept of marking pages as logically offline, so it can be reused for other purposes later on. Rename PG_balloon to PG_offline. This is an indicator that the page is logically offline, the content stale and that it should not be touched (e.g. a hypervisor would have to allocate backing storage in order for the guest to dump an unused page). We can then e.g. exclude such pages from dumps. We replace and reuse KPF_BALLOON (23), as this shouldn't really harm (and for now the semantics stay the same). In following patches, we will make use of this bit also in other balloon drivers. While at it, document PGTABLE. [[email protected]: fix comment text, per David] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Acked-by: Konstantin Khlebnikov <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]> Acked-by: Pankaj gupta <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Christian Hansen <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: "Kirill A. Shutemov" <[email protected]> Cc: Stephen Rothwell <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Alexander Duyck <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: Miles Chen <[email protected]> Cc: David Rientjes <[email protected]> Cc: Kazuhito Hagio <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Baoquan He <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Dave Young <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Haiyang Zhang <[email protected]> Cc: Juergen Gross <[email protected]> Cc: Julien Freche <[email protected]> Cc: Kairui Song <[email protected]> Cc: "K. Y. Srinivasan" <[email protected]> Cc: Len Brown <[email protected]> Cc: Lianbo Jiang <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Nadav Amit <[email protected]> Cc: Omar Sandoval <[email protected]> Cc: Pavel Machek <[email protected]> Cc: Rafael J. Wysocki <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Stephen Hemminger <[email protected]> Cc: Vitaly Kuznetsov <[email protected]> Cc: Xavier Deguillard <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-03-05mm: balloon: update comment about isolation/migration/compactionDavid Hildenbrand1-17/+9
Patch series "mm/kdump: allow to exclude pages that are logically offline" Right now, pages inflated as part of a balloon driver will be dumped by dump tools like makedumpfile. While XEN is able to check in the crash kernel whether a certain pfn is actuall backed by memory in the hypervisor (see xen_oldmem_pfn_is_ram) and optimize this case, dumps of virtio-balloon, hv-balloon and VMWare balloon inflated memory will essentially result in zero pages getting allocated by the hypervisor and the dump getting filled with this data. The allocation and reading of zero pages can directly be avoided if a dumping tool could know which pages only contain stale information not to be dumped. Also for XEN, calling into the kernel and asking the hypervisor if a pfn is backed can be avoided if the duming tool would skip such pages right from the beginning. Dumping tools have no idea whether a given page is part of a balloon driver and shall not be dumped. Esp. PG_reserved cannot be used for that purpose as all memory allocated during early boot is also PG_reserved, see discussion at [1]. So some other way of indication is required and a new page flag is frowned upon. We have PG_balloon (MAPCOUNT value), which is essentially unused now. I suggest renaming it to something more generic (PG_offline) to mark pages as logically offline. This flag can than e.g. also be used by virtio-mem in the future to mark subsections as offline. Or by other code that wants to put pages logically offline (e.g. later maybe poisoned pages that shall no longer be used). This series converts PG_balloon to PG_offline, allows dumping tools to query the value to detect such pages and marks pages in the hv-balloon and XEN balloon properly as PG_offline. Note that virtio-balloon already set pages to PG_balloon (and now PG_offline). Please note that this is also helpful for a problem we were seeing under Hyper-V: Dumping logically offline memory (pages kept fake offline while onlining a section via online_page_callback) would under some condicions result in a kernel panic when dumping them. As I don't have access to neither XEN nor Hyper-V nor VMWare installations, this was only tested with the virtio-balloon and pages were properly skipped when dumping. I'll also attach the makedumpfile patch to this series. [1] https://lkml.org/lkml/2018/7/20/566 This patch (of 8): Commit b1123ea6d3b3 ("mm: balloon: use general non-lru movable page feature") reworked balloon handling to make use of the general non-lru movable page feature. The big comment block in balloon_compaction.h contains quite some outdated information. Let's fix this. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Alexander Duyck <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Baoquan He <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Christian Hansen <[email protected]> Cc: Dave Young <[email protected]> Cc: David Rientjes <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Haiyang Zhang <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Juergen Gross <[email protected]> Cc: Julien Freche <[email protected]> Cc: Kairui Song <[email protected]> Cc: Kazuhito Hagio <[email protected]> Cc: "Kirill A. Shutemov" <[email protected]> Cc: Konstantin Khlebnikov <[email protected]> Cc: "K. Y. Srinivasan" <[email protected]> Cc: Len Brown <[email protected]> Cc: Lianbo Jiang <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Miles Chen <[email protected]> Cc: Nadav Amit <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: Omar Sandoval <[email protected]> Cc: Pankaj gupta <[email protected]> Cc: Pavel Machek <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Rafael J. Wysocki <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Stephen Hemminger <[email protected]> Cc: Stephen Rothwell <[email protected]> Cc: Vitaly Kuznetsov <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Xavier Deguillard <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-03-05mm/page_alloc.c: memory hotplug: free pages as higher orderArun KS6-25/+45
When freeing pages are done with higher order, time spent on coalescing pages by buddy allocator can be reduced. With section size of 256MB, hot add latency of a single section shows improvement from 50-60 ms to less than 1 ms, hence improving the hot add latency by 60 times. Modify external providers of online callback to align with the change. [[email protected]: v11] Link: http://lkml.kernel.org/r/[email protected] [[email protected]: remove unused local, per Arun] [[email protected]: avoid return of void-returning __free_pages_core(), per Oscar] [[email protected]: fix it for mm-convert-totalram_pages-and-totalhigh_pages-variables-to-atomic.patch] [[email protected]: v8] Link: http://lkml.kernel.org/r/[email protected] [[email protected]: v9] Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arun KS <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Acked-by: Michal Hocko <[email protected]> Reviewed-by: Oscar Salvador <[email protected]> Reviewed-by: Alexander Duyck <[email protected]> Cc: K. Y. Srinivasan <[email protected]> Cc: Haiyang Zhang <[email protected]> Cc: Stephen Hemminger <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Juergen Gross <[email protected]> Cc: Dan Williams <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Joonsoo Kim <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Mathieu Malaterre <[email protected]> Cc: "Kirill A. Shutemov" <[email protected]> Cc: Souptick Joarder <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Aaron Lu <[email protected]> Cc: Srivatsa Vaddagiri <[email protected]> Cc: Vinayak Menon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-03-05mm/slub.c: remove an unused addr argumentQian Cai1-3/+2
"addr" function argument is not used in alloc_consistency_checks() at all, so remove it. Link: http://lkml.kernel.org/r/[email protected] Fixes: becfda68abca ("slub: convert SLAB_DEBUG_FREE to SLAB_CONSISTENCY_CHECKS") Signed-off-by: Qian Cai <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Acked-by: David Rientjes <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Joonsoo Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-03-05include/linux/slub_def.h: comment fixesTobin C. Harding1-6/+6
Capitialize comment string, use C89 comment style, correct grammar/punctuation in comments. Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Tobin C. Harding <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Reviewed-by: William Kucharski <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: David Rientjes <[email protected]> Cc: Joonsoo Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-03-05mm/slab.c: kmemleak no scan alien cachesQian Cai1-8/+9
Kmemleak throws endless warnings during boot due to in __alloc_alien_cache(), alc = kmalloc_node(memsize, gfp, node); init_arraycache(&alc->ac, entries, batch); kmemleak_no_scan(ac); Kmemleak does not track the array cache (alc->ac) but the alien cache (alc) instead, so let it track the latter by lifting kmemleak_no_scan() out of init_arraycache(). There is another place that calls init_arraycache(), but alloc_kmem_cache_cpus() uses the percpu allocation where will never be considered as a leak. kmemleak: Found object by alias at 0xffff8007b9aa7e38 CPU: 190 PID: 1 Comm: swapper/0 Not tainted 5.0.0-rc2+ #2 Call trace: dump_backtrace+0x0/0x168 show_stack+0x24/0x30 dump_stack+0x88/0xb0 lookup_object+0x84/0xac find_and_get_object+0x84/0xe4 kmemleak_no_scan+0x74/0xf4 setup_kmem_cache_node+0x2b4/0x35c __do_tune_cpucache+0x250/0x2d4 do_tune_cpucache+0x4c/0xe4 enable_cpucache+0xc8/0x110 setup_cpu_cache+0x40/0x1b8 __kmem_cache_create+0x240/0x358 create_cache+0xc0/0x198 kmem_cache_create_usercopy+0x158/0x20c kmem_cache_create+0x50/0x64 fsnotify_init+0x58/0x6c do_one_initcall+0x194/0x388 kernel_init_freeable+0x668/0x688 kernel_init+0x18/0x124 ret_from_fork+0x10/0x18 kmemleak: Object 0xffff8007b9aa7e00 (size 256): kmemleak: comm "swapper/0", pid 1, jiffies 4294697137 kmemleak: min_count = 1 kmemleak: count = 0 kmemleak: flags = 0x1 kmemleak: checksum = 0 kmemleak: backtrace: kmemleak_alloc+0x84/0xb8 kmem_cache_alloc_node_trace+0x31c/0x3a0 __kmalloc_node+0x58/0x78 setup_kmem_cache_node+0x26c/0x35c __do_tune_cpucache+0x250/0x2d4 do_tune_cpucache+0x4c/0xe4 enable_cpucache+0xc8/0x110 setup_cpu_cache+0x40/0x1b8 __kmem_cache_create+0x240/0x358 create_cache+0xc0/0x198 kmem_cache_create_usercopy+0x158/0x20c kmem_cache_create+0x50/0x64 fsnotify_init+0x58/0x6c do_one_initcall+0x194/0x388 kernel_init_freeable+0x668/0x688 kernel_init+0x18/0x124 kmemleak: Not scanning unknown object at 0xffff8007b9aa7e38 CPU: 190 PID: 1 Comm: swapper/0 Not tainted 5.0.0-rc2+ #2 Call trace: dump_backtrace+0x0/0x168 show_stack+0x24/0x30 dump_stack+0x88/0xb0 kmemleak_no_scan+0x90/0xf4 setup_kmem_cache_node+0x2b4/0x35c __do_tune_cpucache+0x250/0x2d4 do_tune_cpucache+0x4c/0xe4 enable_cpucache+0xc8/0x110 setup_cpu_cache+0x40/0x1b8 __kmem_cache_create+0x240/0x358 create_cache+0xc0/0x198 kmem_cache_create_usercopy+0x158/0x20c kmem_cache_create+0x50/0x64 fsnotify_init+0x58/0x6c do_one_initcall+0x194/0x388 kernel_init_freeable+0x668/0x688 kernel_init+0x18/0x124 ret_from_fork+0x10/0x18 Link: http://lkml.kernel.org/r/[email protected] Fixes: 1fe00d50a9e8 ("slab: factor out initialization of array cache") Signed-off-by: Qian Cai <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: David Rientjes <[email protected]> Cc: Joonsoo Kim <[email protected]> Cc: Catalin Marinas <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-03-05mm/slub.c: freelist is ensured to be NULL when new_slab() failsPeng Wang1-2/+1
new_slab_objects() will return immediately if freelist is not NULL. if (freelist) return freelist; One more assignment operation could be avoided. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Peng Wang <[email protected]> Reviewed-by: Pekka Enberg <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Acked-by: David Rientjes <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Joonsoo Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-03-05fs/file.c: initialize init_files.resize_waitShuriyc Chu1-0/+1
(Taken from https://bugzilla.kernel.org/show_bug.cgi?id=200647) 'get_unused_fd_flags' in kthread cause kernel crash. It works fine on 4.1, but causes crash after get 64 fds. It also cause crash on ubuntu1404/1604/1804, centos7.5, and the crash messages are almost the same. The crash message on centos7.5 shows below: start fd 61 start fd 62 start fd 63 BUG: unable to handle kernel NULL pointer dereference at (null) IP: __wake_up_common+0x2e/0x90 PGD 0 Oops: 0000 [#1] SMP Modules linked in: test(OE) xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter devlink sunrpc kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd sg ppdev pcspkr virtio_balloon parport_pc parport i2c_piix4 joydev ip_tables xfs libcrc32c sr_mod cdrom sd_mod crc_t10dif crct10dif_generic ata_generic pata_acpi virtio_scsi virtio_console virtio_net cirrus drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm crct10dif_pclmul crct10dif_common crc32c_intel drm ata_piix serio_raw libata virtio_pci virtio_ring i2c_core virtio floppy dm_mirror dm_region_hash dm_log dm_mod CPU: 2 PID: 1820 Comm: test_fd Kdump: loaded Tainted: G OE ------------ 3.10.0-862.3.3.el7.x86_64 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014 task: ffff8e92b9431fa0 ti: ffff8e94247a0000 task.ti: ffff8e94247a0000 RIP: 0010:__wake_up_common+0x2e/0x90 RSP: 0018:ffff8e94247a2d18 EFLAGS: 00010086 RAX: 0000000000000000 RBX: ffffffff9d09daa0 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000003 RDI: ffffffff9d09daa0 RBP: ffff8e94247a2d50 R08: 0000000000000000 R09: ffff8e92b95dfda8 R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff9d09daa8 R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000003 FS: 0000000000000000(0000) GS:ffff8e9434e80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000017c686000 CR4: 00000000000207e0 Call Trace: __wake_up+0x39/0x50 expand_files+0x131/0x250 __alloc_fd+0x47/0x170 get_unused_fd_flags+0x30/0x40 test_fd+0x12a/0x1c0 [test] kthread+0xd1/0xe0 ret_from_fork_nospec_begin+0x21/0x21 Code: 66 90 55 48 89 e5 41 57 41 89 f7 41 56 41 89 ce 41 55 41 54 49 89 fc 49 83 c4 08 53 48 83 ec 10 48 8b 47 08 89 55 cc 4c 89 45 d0 <48> 8b 08 49 39 c4 48 8d 78 e8 4c 8d 69 e8 75 08 eb 3b 4c 89 ef RIP __wake_up_common+0x2e/0x90 RSP <ffff8e94247a2d18> CR2: 0000000000000000 This issue exists since CentOS 7.5 3.10.0-862 and CentOS 7.4 (3.10.0-693.21.1 ) is ok. Root cause: the item 'resize_wait' is not initialized before being used. Reported-by: Richard Zhang <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Al Viro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-03-05fs/inode.c: inode_set_flags(): replace opencoded set_mask_bits()Vineet Gupta1-7/+1
It seems that commits 5f16f3225b0624 and 00a1a053ebe5, both with same commitlog ("ext4: atomically set inode->i_flags in ext4_set_inode_flags()") introduced the set_mask_bits API, but somehow missed not using it in ext4 in the end. Also, set_mask_bits() is used in fs quite a bit and we can possibly come up with a generic llsc based implementation (w/o the cmpxchg loop) Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Vineet Gupta <[email protected]> Reviewed-by: Anthony Yznaga <[email protected]> Cc: Alexander Viro <[email protected]> Cc: Theodore Ts'o <[email protected]> Cc: Peter Zijlstra (Intel) <[email protected]> Cc: Chris Wilson <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Miklos Szeredi <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-03-05ocfs2: Use zero-sized array and struct_size() in kzalloc()Gustavo A. R. Silva1-6/+2
Update the code to use a zero-sized array instead of a pointer in structure ocfs2_slot_info and use struct_size() in kzalloc(). Notice that one of the more common cases of allocation size calculations is finding the size of a structure that has a zero-sized array at the end, along with memory for some number of elements for that array. For example: struct foo { int stuff; void *entry[]; }; instance = kzalloc(sizeof(struct foo) + sizeof(void *) * count, GFP_KERNEL); Instead of leaving these open-coded and prone to type mistakes, we can now use the new struct_size() helper: instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL); This code was detected with the help of Coccinelle. Link: http://lkml.kernel.org/r/20190108191903.GA22056@embeddedor Signed-off-by: Gustavo A. R. Silva <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Joseph Qi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-03-05ocfs2: fix the application IO timeout when fstrim is runningGang He5-63/+106
The user reported this problem, the upper application IO was timeout when fstrim was running on this ocfs2 partition. the application monitoring resource agent considered that this application did not work, then this node was fenced by the cluster brain (e.g. pacemaker). The root cause is that fstrim thread always holds main_bm meta-file related locks until all the cluster groups are trimmed. This patch will make fstrim thread release main_bm meta-file related locks when each cluster group is trimmed, this will let the current application IO has a chance to claim the clusters from main_bm meta-file. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Gang He <[email protected]> Reviewed-by: Changwei Ge <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Joseph Qi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-03-05ocfs2: fix a panic problem caused by o2cb_ctlJia Guo1-6/+8
In the process of creating a node, it will cause NULL pointer dereference in kernel if o2cb_ctl failed in the interval (mkdir, o2cb_set_node_attribute(node_num)] in function o2cb_add_node. The node num is initialized to 0 in function o2nm_node_group_make_item, o2nm_node_group_drop_item will mistake the node number 0 for a valid node number when we delete the node before the node number is set correctly. If the local node number of the current host happens to be 0, cluster->cl_local_node will be set to O2NM_INVALID_NODE_NUM while o2hb_thread still running. The panic stack is generated as follows: o2hb_thread \-o2hb_do_disk_heartbeat \-o2hb_check_own_slot |-slot = &reg->hr_slots[o2nm_this_node()]; //o2nm_this_node() return O2NM_INVALID_NODE_NUM We need to check whether the node number is set when we delete the node. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jia Guo <[email protected]> Reviewed-by: Joseph Qi <[email protected]> Acked-by: Jun Piao <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Changwei Ge <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-03-05sh: remove nargs from __SYSCALLFiroz Khan2-3/+3
The __SYSCALL macro's arguments are system call number, system call entry name and number of arguments for the system call. Argument- nargs in __SYSCALL(nr, entry, nargs) is neither calculated nor used anywhere. So it would be better to keep the implementation as __SYSCALL(nr, entry). This unifies the implementation with some other architectures too. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Firoz Khan <[email protected]> Cc: Yoshinori Sato <[email protected]> Cc: Rich Felker <[email protected]> Cc: Simon Horman <[email protected]> Cc: Kuninori Morimoto <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Philippe Ombredanne <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Kate Stewart <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-03-05scripts/decode_stacktrace.sh: handle RIP address with segmentKonstantin Khlebnikov1-1/+8
decode line: RIP: 0010:khugepaged+0x2a2/0x2280 into RIP: 0010:khugepaged (mm/khugepaged.c:1885) Link: http://lkml.kernel.org/r/154660071227.52726.15645307951282727605.stgit@buzz Signed-off-by: Konstantin Khlebnikov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-03-05kasan: fix coccinelle warnings in kasan_p*_tableAndrey Konovalov1-3/+3
kasan_p4d_table(), kasan_pmd_table() and kasan_pud_table() are declared as returning bool, but return 0 instead of false, which produces a coccinelle warning. Fix it. Link: http://lkml.kernel.org/r/1fa6fadf644859e8a6a8ecce258444b49be8c7ee.1551716733.git.andreyknvl@google.com Fixes: 0207df4fa1a8 ("kernel/memremap, kasan: make ZONE_DEVICE with work with KASAN") Signed-off-by: Andrey Konovalov <[email protected]> Reported-by: kbuild test robot <[email protected]> Acked-by: Andrey Ryabinin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-03-05kasan: fix kasan_check_read/write definitionsArnd Bergmann2-1/+3
Building little-endian allmodconfig kernels on arm64 started failing with the generated atomic.h implementation, since we now try to call kasan helpers from the EFI stub: aarch64-linux-gnu-ld: drivers/firmware/efi/libstub/arm-stub.stub.o: in function `atomic_set': include/generated/atomic-instrumented.h:44: undefined reference to `__efistub_kasan_check_write' I suspect that we get similar problems in other files that explicitly disable KASAN for some reason but call atomic_t based helper functions. We can fix this by checking the predefined __SANITIZE_ADDRESS__ macro that the compiler sets instead of checking CONFIG_KASAN, but this in turn requires a small hack in mm/kasan/common.c so we do see the extern declaration there instead of the inline function. Link: http://lkml.kernel.org/r/[email protected] Fixes: b1864b828644 ("locking/atomics: build atomic headers as required") Signed-off-by: Arnd Bergmann <[email protected]> Reported-by: Anders Roxell <[email protected]> Acked-by: Andrey Ryabinin <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Will Deacon <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Andrey Konovalov <[email protected]> Cc: Stephen Rothwell <[email protected]>, Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-03-05page_poison: play nicely with KASANQian Cai2-1/+5
KASAN does not play well with the page poisoning (CONFIG_PAGE_POISONING). It triggers false positives in the allocation path: BUG: KASAN: use-after-free in memchr_inv+0x2ea/0x330 Read of size 8 at addr ffff88881f800000 by task swapper/0 CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc1+ #54 Call Trace: dump_stack+0xe0/0x19a print_address_description.cold.2+0x9/0x28b kasan_report.cold.3+0x7a/0xb5 __asan_report_load8_noabort+0x19/0x20 memchr_inv+0x2ea/0x330 kernel_poison_pages+0x103/0x3d5 get_page_from_freelist+0x15e7/0x4d90 because KASAN has not yet unpoisoned the shadow page for allocation before it checks memchr_inv() but only found a stale poison pattern. Also, false positives in free path, BUG: KASAN: slab-out-of-bounds in kernel_poison_pages+0x29e/0x3d5 Write of size 4096 at addr ffff8888112cc000 by task swapper/0/1 CPU: 5 PID: 1 Comm: swapper/0 Not tainted 5.0.0-rc1+ #55 Call Trace: dump_stack+0xe0/0x19a print_address_description.cold.2+0x9/0x28b kasan_report.cold.3+0x7a/0xb5 check_memory_region+0x22d/0x250 memset+0x28/0x40 kernel_poison_pages+0x29e/0x3d5 __free_pages_ok+0x75f/0x13e0 due to KASAN adds poisoned redzones around slab objects, but the page poisoning needs to poison the whole page. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Qian Cai <[email protected]> Acked-by: Andrey Ryabinin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-03-05kasan: remove use after scope bugs detection.Andrey Ryabinin9-73/+0
Use after scope bugs detector seems to be almost entirely useless for the linux kernel. It exists over two years, but I've seen only one valid bug so far [1]. And the bug was fixed before it has been reported. There were some other use-after-scope reports, but they were false-positives due to different reasons like incompatibility with structleak plugin. This feature significantly increases stack usage, especially with GCC < 9 version, and causes a 32K stack overflow. It probably adds performance penalty too. Given all that, let's remove use-after-scope detector entirely. While preparing this patch I've noticed that we mistakenly enable use-after-scope detection for clang compiler regardless of CONFIG_KASAN_EXTRA setting. This is also fixed now. [1] http://lkml.kernel.org/r/<[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Andrey Ryabinin <[email protected]> Acked-by: Will Deacon <[email protected]> [arm64] Cc: Qian Cai <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Catalin Marinas <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-03-05mm: hwpoison: fix thp split handing in soft_offline_in_use_page()zhongjiang1-8/+6
When soft_offline_in_use_page() runs on a thp tail page after pmd is split, we trigger the following VM_BUG_ON_PAGE(): Memory failure: 0x3755ff: non anonymous thp __get_any_page: 0x3755ff: unknown zero refcount page type 2fffff80000000 Soft offlining pfn 0x34d805 at process virtual address 0x20fff000 page:ffffea000d360140 count:0 mapcount:0 mapping:0000000000000000 index:0x1 flags: 0x2fffff80000000() raw: 002fffff80000000 ffffea000d360108 ffffea000d360188 0000000000000000 raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: VM_BUG_ON_PAGE(page_ref_count(page) == 0) ------------[ cut here ]------------ kernel BUG at ./include/linux/mm.h:519! soft_offline_in_use_page() passed refcount and page lock from tail page to head page, which is not needed because we can pass any subpage to split_huge_page(). Naoya had fixed a similar issue in c3901e722b29 ("mm: hwpoison: fix thp split handling in memory_failure()"). But he missed fixing soft offline. Link: http://lkml.kernel.org/r/[email protected] Fixes: 61f5d698cc97 ("mm: re-enable THP") Signed-off-by: zhongjiang <[email protected]> Acked-by: Naoya Horiguchi <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Kirill A. Shutemov <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: <[email protected]> [4.5+] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-03-05smb3: request more credits on normal (non-large read/write) opsSteve French1-2/+2
We can end up building up credits too slowly to do large operations (reads and writes for example) that require many credits. By comparison most other SMB3 clients request many more (sometimes thousands) of credits on all operations. Increase the number of credits we request on typical (non-large e.g read/write) operations to 10 from 2 so we can build a pool of credits faster. Signed-off-by: Steve French <[email protected]> Reviewed-by: Ronnie Sahlberg <[email protected]>
2019-03-05CIFS: Mask off signals when sending SMB packetsPavel Shilovsky1-3/+38
We don't want to break SMB sessions if we receive signals when sending packets through the network. Fix it by masking off signals inside __smb_send_rqst() to avoid partial packet sends due to interrupts. Return -EINTR if a signal is pending and only a part of the packet was sent. Return a success status code if the whole packet was sent regardless of signal being pending or not. This keeps a mid entry for the request in the pending queue and allows the demultiplex thread to handle a response from the server properly. Signed-off-by: Pavel Shilovsky <[email protected]> Signed-off-by: Steve French <[email protected]>
2019-03-05CIFS: Return -EAGAIN instead of -ENOTSOCKPavel Shilovsky1-1/+2
When we attempt to send a packet while the demultiplex thread is in the middle of cifs_reconnect() we may end up returning -ENOTSOCK to upper layers. The intent here is to retry the request once the TCP connection is up, so change it to return -EAGAIN instead. The latter error code is retryable and the upper layers will retry the request if needed. Signed-off-by: Pavel Shilovsky <[email protected]> Signed-off-by: Steve French <[email protected]>
2019-03-05CIFS: Only send SMB2_NEGOTIATE command on new TCP connectionsPavel Shilovsky1-0/+8
Do not allow commands other than SMB2_NEGOTIATE to be sent over recently established TCP connections. Return -EAGAIN to let upper layers handle it properly. Signed-off-by: Pavel Shilovsky <[email protected]> Signed-off-by: Steve French <[email protected]>
2019-03-05CIFS: Fix read after write for files with read cachingPavel Shilovsky1-5/+7
When we have a READ lease for a file and have just issued a write operation to the server we need to purge the cache and set oplock/lease level to NONE to avoid reading stale data. Currently we do that only if a write operation succedeed thus not covering cases when a request was sent to the server but a negative error code was returned later for some other reasons (e.g. -EIOCBQUEUED or -EINTR). Fix this by turning off caching regardless of the error code being returned. The patches fixes generic tests 075 and 112 from the xfs-tests. Cc: <[email protected]> Signed-off-by: Pavel Shilovsky <[email protected]> Signed-off-by: Steve French <[email protected]> Reviewed-by: Ronnie Sahlberg <[email protected]>
2019-03-05smb3: for kerberos mounts display the credential uid usedSteve French1-1/+1
For kerberos mounts, the cruid is helpful to display in /proc/mounts in order to tell which uid's krb5 cache we got the ticket for and to tell in the multiuser krb5 case which local users (uids) we have Kerberos authentic sessions for. Signed-off-by: Steve French <[email protected]> Reviewed-by: Ronnie Sahlberg <[email protected]>
2019-03-05cifs: use correct format charactersLouis Taylor2-3/+3
When compiling with -Wformat, clang emits the following warnings: fs/cifs/smb1ops.c:312:20: warning: format specifies type 'unsigned short' but the argument has type 'unsigned int' [-Wformat] tgt_total_cnt, total_in_tgt); ^~~~~~~~~~~~ fs/cifs/cifs_dfs_ref.c:289:4: warning: format specifies type 'short' but the argument has type 'int' [-Wformat] ref->flags, ref->server_type); ^~~~~~~~~~ fs/cifs/cifs_dfs_ref.c:289:16: warning: format specifies type 'short' but the argument has type 'int' [-Wformat] ref->flags, ref->server_type); ^~~~~~~~~~~~~~~~ fs/cifs/cifs_dfs_ref.c:291:4: warning: format specifies type 'short' but the argument has type 'int' [-Wformat] ref->ref_flag, ref->path_consumed); ^~~~~~~~~~~~~ fs/cifs/cifs_dfs_ref.c:291:19: warning: format specifies type 'short' but the argument has type 'int' [-Wformat] ref->ref_flag, ref->path_consumed); ^~~~~~~~~~~~~~~~~~ The types of these arguments are unconditionally defined, so this patch updates the format character to the correct ones for ints and unsigned ints. Link: https://github.com/ClangBuiltLinux/linux/issues/378 Signed-off-by: Louis Taylor <[email protected]> Signed-off-by: Steve French <[email protected]> Reviewed-by: Nick Desaulniers <[email protected]>
2019-03-05smb3: add dynamic trace point for query_info_enter/doneSteve French2-0/+48
Adds dynamic trace points for the query_info_enter and query_info_done (no error) case. We only had one existing trace point related to this which was on query_info errors. Note that these two new tracepoints are for the non-compounded query_info paths. Sample output (from: trace-cmd record -e smb3_query_info*) ls-24140 [001] .... 27811.866068: smb3_query_info_enter: xid=7 sid=0xd2d00587 tid=0xb5441939 fid=0xcf082bac class=18 type=0x1 ls-24140 [001] .... 27811.867656: smb3_query_info_done: xid=7 sid=0xd2d00587 tid=0xb5441939 fid=0xcf082bac class=18 type=0x1 getcifsacl-24149 [005] .... 27854.759873: smb3_query_info_enter: xid=15 sid=0xd2d00587 tid=0xb5441939 fid=0x99896e72 class=0 type=0x3 getcifsacl-24149 [005] .... 27854.761730: smb3_query_info_done: xid=15 sid=0xd2d00587 tid=0xb5441939 fid=0x99896e72 class=0 type=0x3 Signed-off-by: Steve French <[email protected]> Reviewed-by: Pavel Shilovsky <[email protected]>
2019-03-05smb3: add dynamic trace point for smb3_cmd_enterSteve French2-0/+4
Add tracepoint before sending an SMB3 command on the wire (ie add an smb3_cmd_enter tracepoint). This allows us to look in much more detail at response times (between request and response). Signed-off-by: Steve French <[email protected]> Reviewed-by: Pavel Shilovsky <[email protected]>
2019-03-05smb3: improve dynamic tracing of open and posix mkdirSteve French2-1/+45
Add dynamic trace point for open_enter (and posix mkdir enter) Signed-off-by: Steve French <[email protected]> Reviewed-by: Pavel Shilovsky <[email protected]>
2019-03-05smb3: add missing read completion trace pointSteve French1-1/+4
When ENODATA returned we weren't logging the read completion (not an error, but can be indicated by logging length 0) which makes looking at read traces confusing for smb3. Signed-off-by: Steve French <[email protected]> Reviewed-by: Ronnie Sahlberg <[email protected]>
2019-03-05smb3: Add tracepoints for read, write and query_dir enterSteve French2-0/+18
Allows tracing begin (not just completion) of read, write and query_dir which may be helpful in finding slow requests and other timing information Signed-off-by: Steve French <[email protected]> Reviewed-by: Pavel Shilovsky <[email protected]> Reviewed-by: Ronnie Sahlberg <[email protected]>
2019-03-05smb3: add tracepoints for query dirSteve French2-2/+14
Adds two tracepoints - one for query_dir done (no err) and one for query_dir_err Sanple output: To start the trace in one window: trace-cmd record -e smb3_query_dir* Then in another window after doing an ls /mnt View the trace output by: trace-cmd show Sample output: TASK-PID CPU# |||| TIMESTAMP FUNCTION | | | |||| | | ls-24869 [007] .... 90695.452009: smb3_query_dir_done: xid=7 sid=0x5027d24d tid=0xb95cf25a fid=0xc41a8c3e offset=0x0 len=0x16 ls-24869 [000] .... 90695.452764: smb3_query_dir_done: xid=8 sid=0x5027d24d tid=0xb95cf25a fid=0xc41a8c3e offset=0x0 len=0x0 ls-24874 [003] .... 90701.506342: smb3_query_dir_done: xid=11 sid=0x5027d24d tid=0xb95cf25a fid=0x33ad3601 offset=0x0 len=0x8 ls-24874 [003] .... 90701.506917: smb3_query_dir_done: xid=12 sid=0x5027d24d tid=0xb95cf25a fid=0x33ad3601 offset=0x0 len=0x0 Reviewed-by: Pavel Shilovsky <[email protected]> Reviewed-by: Ronnie Sahlberg <[email protected]> Signed-off-by: Steve French <[email protected]>
2019-03-05smb3: Update POSIX negotiate context with POSIX ctxt GUIDSteve French2-2/+19
POSIX negotiate context now includes the GUID specifying which POSIX open context we support. Signed-off-by: Steve French <[email protected]> Reviewed-by: Jeremy Allison <[email protected]>
2019-03-05cifs: update internal module version numberSteve French1-1/+1
To 2.18 Signed-off-by: Steve French <[email protected]>
2019-03-05CIFS: Try to acquire credits at once for compound requestsPavel Shilovsky1-5/+34
Currently we get one credit per compound part of the request individually. This may lead to being stuck on waiting for credits if multiple compounded operations happen in parallel. Try acquire credits for all compound parts at once. Return immediately if not enough credits and too few requests are in flight currently thus narrowing the possibility of infinite waiting for credits. The more advance fix is to return right away if not enough credits for the compound request and do not look at the number of requests in flight. The caller should handle such situations by falling back to sequential execution of SMB commands instead of compounding. Signed-off-by: Pavel Shilovsky <[email protected]> Signed-off-by: Steve French <[email protected]>
2019-03-05CIFS: Return error code when getting file handle for writebackPavel Shilovsky3-32/+70
Now we just return NULL cifsFileInfo pointer in cases we didn't find or couldn't reopen a file. This hides errors from cifs_reopen_file() especially retryable errors which should be handled appropriately. Create new cifs_get_writable_file() routine that returns error codes from cifs_reopen_file() and use it in the writeback codepath. Signed-off-by: Pavel Shilovsky <[email protected]> Signed-off-by: Steve French <[email protected]>
2019-03-05CIFS: Move open file handling to writepagesPavel Shilovsky1-15/+14
Currently we check for an open file existence in wdata_send_pages() which doesn't provide an easy way to handle error codes that will be returned from find_writable_filehandle() once it is changed. Move the check to writepages. Signed-off-by: Pavel Shilovsky <[email protected]> Signed-off-by: Steve French <[email protected]>
2019-03-05CIFS: Move unlocking pages from wdata_send_pages()Pavel Shilovsky1-7/+5
Currently wdata_send_pages() unlocks pages after sending. This complicates further refactoring and doesn't align with the function name. Move unlocking to writepages. Signed-off-by: Pavel Shilovsky <[email protected]> Signed-off-by: Steve French <[email protected]>