aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2018-11-30mm/khugepaged: collapse_shmem() without freezing new_pageHugh Dickins1-12/+7
khugepaged's collapse_shmem() does almost all of its work, to assemble the huge new_page from 512 scattered old pages, with the new_page's refcount frozen to 0 (and refcounts of all old pages so far also frozen to 0). Including shmem_getpage() to read in any which were out on swap, memory reclaim if necessary to allocate their intermediate pages, and copying over all the data from old to new. Imagine the frozen refcount as a spinlock held, but without any lock debugging to highlight the abuse: it's not good, and under serious load heads into lockups - speculative getters of the page are not expecting to spin while khugepaged is rescheduled. One can get a little further under load by hacking around elsewhere; but fortunately, freezing the new_page turns out to have been entirely unnecessary, with no hacks needed elsewhere. The huge new_page lock is already held throughout, and guards all its subpages as they are brought one by one into the page cache tree; and anything reading the data in that page, without the lock, before it has been marked PageUptodate, would already be in the wrong. So simply eliminate the freezing of the new_page. Each of the old pages remains frozen with refcount 0 after it has been replaced by a new_page subpage in the page cache tree, until they are all unfrozen on success or failure: just as before. They could be unfrozen sooner, but cause no problem once no longer visible to find_get_entry(), filemap_map_pages() and other speculative lookups. Link: http://lkml.kernel.org/r/[email protected] Fixes: f3f0e1d2150b2 ("khugepaged: add support of collapse for tmpfs/shmem pages") Signed-off-by: Hugh Dickins <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Cc: Jerome Glisse <[email protected]> Cc: Konstantin Khlebnikov <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: <[email protected]> [4.8+] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-11-30mm/khugepaged: minor reorderings in collapse_shmem()Hugh Dickins1-40/+32
Several cleanups in collapse_shmem(): most of which probably do not really matter, beyond doing things in a more familiar and reassuring order. Simplify the failure gotos in the main loop, and on success update stats while interrupts still disabled from the last iteration. Link: http://lkml.kernel.org/r/[email protected] Fixes: f3f0e1d2150b2 ("khugepaged: add support of collapse for tmpfs/shmem pages") Signed-off-by: Hugh Dickins <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Cc: Jerome Glisse <[email protected]> Cc: Konstantin Khlebnikov <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: <[email protected]> [4.8+] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-11-30mm/khugepaged: collapse_shmem() remember to clear holesHugh Dickins1-0/+10
Huge tmpfs testing reminds us that there is no __GFP_ZERO in the gfp flags khugepaged uses to allocate a huge page - in all common cases it would just be a waste of effort - so collapse_shmem() must remember to clear out any holes that it instantiates. The obvious place to do so, where they are put into the page cache tree, is not a good choice: because interrupts are disabled there. Leave it until further down, once success is assured, where the other pages are copied (before setting PageUptodate). Link: http://lkml.kernel.org/r/[email protected] Fixes: f3f0e1d2150b2 ("khugepaged: add support of collapse for tmpfs/shmem pages") Signed-off-by: Hugh Dickins <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Cc: Jerome Glisse <[email protected]> Cc: Konstantin Khlebnikov <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: <[email protected]> [4.8+] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-11-30mm/khugepaged: fix crashes due to misaccounted holesHugh Dickins2-2/+9
Huge tmpfs testing on a shortish file mapped into a pmd-rounded extent hit shmem_evict_inode()'s WARN_ON(inode->i_blocks) followed by clear_inode()'s BUG_ON(inode->i_data.nrpages) when the file was later closed and unlinked. khugepaged's collapse_shmem() was forgetting to update mapping->nrpages on the rollback path, after it had added but then needs to undo some holes. There is indeed an irritating asymmetry between shmem_charge(), whose callers want it to increment nrpages after successfully accounting blocks, and shmem_uncharge(), when __delete_from_page_cache() already decremented nrpages itself: oh well, just add a comment on that to them both. And shmem_recalc_inode() is supposed to be called when the accounting is expected to be in balance (so it can deduce from imbalance that reclaim discarded some pages): so change shmem_charge() to update nrpages earlier (though it's rare for the difference to matter at all). Link: http://lkml.kernel.org/r/[email protected] Fixes: 800d8c63b2e98 ("shmem: add huge pages support") Fixes: f3f0e1d2150b2 ("khugepaged: add support of collapse for tmpfs/shmem pages") Signed-off-by: Hugh Dickins <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Cc: Jerome Glisse <[email protected]> Cc: Konstantin Khlebnikov <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: <[email protected]> [4.8+] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-11-30mm/khugepaged: collapse_shmem() stop if punched or truncatedHugh Dickins1-0/+11
Huge tmpfs testing showed that although collapse_shmem() recognizes a concurrently truncated or hole-punched page correctly, its handling of holes was liable to refill an emptied extent. Add check to stop that. Link: http://lkml.kernel.org/r/[email protected] Fixes: f3f0e1d2150b2 ("khugepaged: add support of collapse for tmpfs/shmem pages") Signed-off-by: Hugh Dickins <[email protected]> Reviewed-by: Matthew Wilcox <[email protected]> Cc: Kirill A. Shutemov <[email protected]> Cc: Jerome Glisse <[email protected]> Cc: Konstantin Khlebnikov <[email protected]> Cc: <[email protected]> [4.8+] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-11-30mm/huge_memory: fix lockdep complaint on 32-bit i_size_read()Hugh Dickins1-6/+13
Huge tmpfs testing, on 32-bit kernel with lockdep enabled, showed that __split_huge_page() was using i_size_read() while holding the irq-safe lru_lock and page tree lock, but the 32-bit i_size_read() uses an irq-unsafe seqlock which should not be nested inside them. Instead, read the i_size earlier in split_huge_page_to_list(), and pass the end offset down to __split_huge_page(): all while holding head page lock, which is enough to prevent truncation of that extent before the page tree lock has been taken. Link: http://lkml.kernel.org/r/[email protected] Fixes: baa355fd33142 ("thp: file pages support for split_huge_page()") Signed-off-by: Hugh Dickins <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Cc: Jerome Glisse <[email protected]> Cc: Konstantin Khlebnikov <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: <[email protected]> [4.8+] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-11-30mm/huge_memory: splitting set mapping+index before unfreezeHugh Dickins1-6/+6
Huge tmpfs stress testing has occasionally hit shmem_undo_range()'s VM_BUG_ON_PAGE(page_to_pgoff(page) != index, page). Move the setting of mapping and index up before the page_ref_unfreeze() in __split_huge_page_tail() to fix this: so that a page cache lookup cannot get a reference while the tail's mapping and index are unstable. In fact, might as well move them up before the smp_wmb(): I don't see an actual need for that, but if I'm missing something, this way round is safer than the other, and no less efficient. You might argue that VM_BUG_ON_PAGE(page_to_pgoff(page) != index, page) is misplaced, and should be left until after the trylock_page(); but left as is has not crashed since, and gives more stringent assurance. Link: http://lkml.kernel.org/r/[email protected] Fixes: e9b61f19858a5 ("thp: reintroduce split_huge_page()") Requires: 605ca5ede764 ("mm/huge_memory.c: reorder operations in __split_huge_page_tail()") Signed-off-by: Hugh Dickins <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Cc: Konstantin Khlebnikov <[email protected]> Cc: Jerome Glisse <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: <[email protected]> [4.8+] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-11-30mm/huge_memory: rename freeze_page() to unmap_page()Hugh Dickins2-16/+9
The term "freeze" is used in several ways in the kernel, and in mm it has the particular meaning of forcing page refcount temporarily to 0. freeze_page() is just too confusing a name for a function that unmaps a page: rename it unmap_page(), and rename unfreeze_page() remap_page(). Went to change the mention of freeze_page() added later in mm/rmap.c, but found it to be incorrect: ordinary page reclaim reaches there too; but the substance of the comment still seems correct, so edit it down. Link: http://lkml.kernel.org/r/[email protected] Fixes: e9b61f19858a5 ("thp: reintroduce split_huge_page()") Signed-off-by: Hugh Dickins <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Cc: Jerome Glisse <[email protected]> Cc: Konstantin Khlebnikov <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: <[email protected]> [4.8+] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-11-30initramfs: clean old path before creating a hardlinkLi Zhijian1-10/+12
sys_link() can fail due to the new path already existing. This case ofen occurs when we use a concated initrd, for example: 1) prepare a basic rootfs, it contains a regular files rc.local lizhijian@:~/yocto-tiny-i386-2016-04-22$ cat etc/rc.local #!/bin/sh echo "Running /etc/rc.local..." yocto-tiny-i386-2016-04-22$ find . | sed 's,^\./,,' | cpio -o -H newc | gzip -n -9 >../rootfs.cgz 2) create a extra initrd which also includes a etc/rc.local lizhijian@:~/lkp-x86_64/etc$ echo "append initrd" >rc.local lizhijian@:~/lkp/lkp-x86_64/etc$ cat rc.local append initrd lizhijian@:~/lkp/lkp-x86_64/etc$ ln rc.local rc.local.hardlink append initrd lizhijian@:~/lkp/lkp-x86_64/etc$ stat rc.local rc.local.hardlink File: 'rc.local' Size: 14 Blocks: 8 IO Block: 4096 regular file Device: 801h/2049d Inode: 11296086 Links: 2 Access: (0664/-rw-rw-r--) Uid: ( 1002/lizhijian) Gid: ( 1002/lizhijian) Access: 2018-11-15 16:08:28.654464815 +0800 Modify: 2018-11-15 16:07:57.514903210 +0800 Change: 2018-11-15 16:08:24.180228872 +0800 Birth: - File: 'rc.local.hardlink' Size: 14 Blocks: 8 IO Block: 4096 regular file Device: 801h/2049d Inode: 11296086 Links: 2 Access: (0664/-rw-rw-r--) Uid: ( 1002/lizhijian) Gid: ( 1002/lizhijian) Access: 2018-11-15 16:08:28.654464815 +0800 Modify: 2018-11-15 16:07:57.514903210 +0800 Change: 2018-11-15 16:08:24.180228872 +0800 Birth: - lizhijian@:~/lkp/lkp-x86_64$ find . | sed 's,^\./,,' | cpio -o -H newc | gzip -n -9 >../rc-local.cgz lizhijian@:~/lkp/lkp-x86_64$ gzip -dc ../rc-local.cgz | cpio -t . etc etc/rc.local.hardlink <<< it will be extracted first at this initrd etc/rc.local 3) concate 2 initrds and boot lizhijian@:~/lkp$ cat rootfs.cgz rc-local.cgz >concate-initrd.cgz lizhijian@:~/lkp$ qemu-system-x86_64 -nographic -enable-kvm -cpu host -smp 1 -m 1024 -kernel ~/lkp/linux/arch/x86/boot/bzImage -append "console=ttyS0 earlyprint=ttyS0 ignore_loglevel" -initrd ./concate-initr.cgz -serial stdio -nodefaults In this case, sys_link(2) will fail and return -EEXIST, so we can only get the rc.local at rootfs.cgz instead of rc-local.cgz [[email protected]: move code to avoid forward declaration] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Li Zhijian <[email protected]> Cc: Philip Li <[email protected]> Cc: Dominik Brodowski <[email protected]> Cc: Li Zhijian <[email protected]> Cc: Al Viro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-11-30kernel/kcov.c: mark funcs in __sanitizer_cov_trace_pc() as notraceAnders Roxell1-2/+2
Since __sanitizer_cov_trace_pc() is marked as notrace, function calls in __sanitizer_cov_trace_pc() shouldn't be traced either. ftrace_graph_caller() gets called for each function that isn't marked 'notrace', like canonicalize_ip(). This is the call trace from a run: [ 139.644550] ftrace_graph_caller+0x1c/0x24 [ 139.648352] canonicalize_ip+0x18/0x28 [ 139.652313] __sanitizer_cov_trace_pc+0x14/0x58 [ 139.656184] sched_clock+0x34/0x1e8 [ 139.659759] trace_clock_local+0x40/0x88 [ 139.663722] ftrace_push_return_trace+0x8c/0x1f0 [ 139.667767] prepare_ftrace_return+0xa8/0x100 [ 139.671709] ftrace_graph_caller+0x1c/0x24 Rework so that check_kcov_mode() and canonicalize_ip() that are called from __sanitizer_cov_trace_pc() are also marked as notrace. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnd Bergmann <[email protected]> Signen-off-by: Anders Roxell <[email protected]> Co-developed-by: Arnd Bergmann <[email protected]> Acked-by: Steven Rostedt (VMware) <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-11-30psi: make disabling/enabling easier for vendor kernelsJohannes Weiner5-14/+40
Mel Gorman reports a hackbench regression with psi that would prohibit shipping the suse kernel with it default-enabled, but he'd still like users to be able to opt in at little to no cost to others. With the current combination of CONFIG_PSI and the psi_disabled bool set from the commandline, this is a challenge. Do the following things to make it easier: 1. Add a config option CONFIG_PSI_DEFAULT_DISABLED that allows distros to enable CONFIG_PSI in their kernel but leave the feature disabled unless a user requests it at boot-time. To avoid double negatives, rename psi_disabled= to psi=. 2. Make psi_disabled a static branch to eliminate any branch costs when the feature is disabled. In terms of numbers before and after this patch, Mel says: : The following is a comparision using CONFIG_PSI=n as a baseline against : your patch and a vanilla kernel : : 4.20.0-rc4 4.20.0-rc4 4.20.0-rc4 : kconfigdisable-v1r1 vanilla psidisable-v1r1 : Amean 1 1.3100 ( 0.00%) 1.3923 ( -6.28%) 1.3427 ( -2.49%) : Amean 3 3.8860 ( 0.00%) 4.1230 * -6.10%* 3.8860 ( -0.00%) : Amean 5 6.8847 ( 0.00%) 8.0390 * -16.77%* 6.7727 ( 1.63%) : Amean 7 9.9310 ( 0.00%) 10.8367 * -9.12%* 9.9910 ( -0.60%) : Amean 12 16.6577 ( 0.00%) 18.2363 * -9.48%* 17.1083 ( -2.71%) : Amean 18 26.5133 ( 0.00%) 27.8833 * -5.17%* 25.7663 ( 2.82%) : Amean 24 34.3003 ( 0.00%) 34.6830 ( -1.12%) 32.0450 ( 6.58%) : Amean 30 40.0063 ( 0.00%) 40.5800 ( -1.43%) 41.5087 ( -3.76%) : Amean 32 40.1407 ( 0.00%) 41.2273 ( -2.71%) 39.9417 ( 0.50%) : : It's showing that the vanilla kernel takes a hit (as the bisection : indicated it would) and that disabling PSI by default is reasonably : close in terms of performance for this particular workload on this : particular machine so; Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Johannes Weiner <[email protected]> Tested-by: Mel Gorman <[email protected]> Reported-by: Mel Gorman <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-11-30proc: fixup map_files test on armAlexey Dobriyan1-2/+7
https://bugs.linaro.org/show_bug.cgi?id=3782 Turns out arm doesn't permit mapping address 0, so try minimum virtual address instead. Link: http://lkml.kernel.org/r/20181113165446.GA28157@avx2 Signed-off-by: Alexey Dobriyan <[email protected]> Reported-by: Rafael David Tinoco <[email protected]> Tested-by: Rafael David Tinoco <[email protected]> Acked-by: Cyrill Gorcunov <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-11-30debugobjects: avoid recursive calls with kmemleakQian Cai1-3/+2
CONFIG_DEBUG_OBJECTS_RCU_HEAD does not play well with kmemleak due to recursive calls. fill_pool kmemleak_ignore make_black_object put_object __call_rcu (kernel/rcu/tree.c) debug_rcu_head_queue debug_object_activate debug_object_init fill_pool kmemleak_ignore make_black_object ... So add SLAB_NOLEAKTRACE to kmem_cache_create() to not register newly allocated debug objects at all. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Qian Cai <[email protected]> Suggested-by: Catalin Marinas <[email protected]> Acked-by: Waiman Long <[email protected]> Acked-by: Catalin Marinas <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Yang Shi <[email protected]> Cc: Arnd Bergmann <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-11-30userfaultfd: shmem: UFFDIO_COPY: set the page dirty if VM_WRITE is not setAndrea Arcangeli1-0/+11
Set the page dirty if VM_WRITE is not set because in such case the pte won't be marked dirty and the page would be reclaimed without writepage (i.e. swapout in the shmem case). This was found by source review. Most apps (certainly including QEMU) only use UFFDIO_COPY on PROT_READ|PROT_WRITE mappings or the app can't modify the memory in the first place. This is for correctness and it could help the non cooperative use case to avoid unexpected data loss. Link: http://lkml.kernel.org/r/[email protected] Reviewed-by: Hugh Dickins <[email protected]> Cc: [email protected] Fixes: 4c27fe4c4c84 ("userfaultfd: shmem: add shmem_mcopy_atomic_pte for userfaultfd support") Reported-by: Hugh Dickins <[email protected]> Signed-off-by: Andrea Arcangeli <[email protected]> Cc: "Dr. David Alan Gilbert" <[email protected]> Cc: Jann Horn <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Peter Xu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-11-30userfaultfd: shmem: add i_size checksAndrea Arcangeli2-4/+40
With MAP_SHARED: recheck the i_size after taking the PT lock, to serialize against truncate with the PT lock. Delete the page from the pagecache if the i_size_read check fails. With MAP_PRIVATE: check the i_size after the PT lock before mapping anonymous memory or zeropages into the MAP_PRIVATE shmem mapping. A mostly irrelevant cleanup: like we do the delete_from_page_cache() pagecache removal after dropping the PT lock, the PT lock is a spinlock so drop it before the sleepable page lock. Link: http://lkml.kernel.org/r/[email protected] Fixes: 4c27fe4c4c84 ("userfaultfd: shmem: add shmem_mcopy_atomic_pte for userfaultfd support") Signed-off-by: Andrea Arcangeli <[email protected]> Reviewed-by: Mike Rapoport <[email protected]> Reviewed-by: Hugh Dickins <[email protected]> Reported-by: Jann Horn <[email protected]> Cc: <[email protected]> Cc: "Dr. David Alan Gilbert" <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Peter Xu <[email protected]> Cc: [email protected] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-11-30userfaultfd: shmem/hugetlbfs: only allow to register VM_MAYWRITE vmasAndrea Arcangeli2-9/+21
After the VMA to register the uffd onto is found, check that it has VM_MAYWRITE set before allowing registration. This way we inherit all common code checks before allowing to fill file holes in shmem and hugetlbfs with UFFDIO_COPY. The userfaultfd memory model is not applicable for readonly files unless it's a MAP_PRIVATE. Link: http://lkml.kernel.org/r/[email protected] Fixes: ff62a3421044 ("hugetlb: implement memfd sealing") Signed-off-by: Andrea Arcangeli <[email protected]> Reviewed-by: Mike Rapoport <[email protected]> Reviewed-by: Hugh Dickins <[email protected]> Reported-by: Jann Horn <[email protected]> Fixes: 4c27fe4c4c84 ("userfaultfd: shmem: add shmem_mcopy_atomic_pte for userfaultfd support") Cc: <[email protected]> Cc: "Dr. David Alan Gilbert" <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Peter Xu <[email protected]> Cc: [email protected] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-11-30userfaultfd: shmem: allocate anonymous memory for MAP_PRIVATE shmemAndrea Arcangeli1-2/+13
Userfaultfd did not create private memory when UFFDIO_COPY was invoked on a MAP_PRIVATE shmem mapping. Instead it wrote to the shmem file, even when that had not been opened for writing. Though, fortunately, that could only happen where there was a hole in the file. Fix the shmem-backed implementation of UFFDIO_COPY to create private memory for MAP_PRIVATE mappings. The hugetlbfs-backed implementation was already correct. This change is visible to userland, if userfaultfd has been used in unintended ways: so it introduces a small risk of incompatibility, but is necessary in order to respect file permissions. An app that uses UFFDIO_COPY for anything like postcopy live migration won't notice the difference, and in fact it'll run faster because there will be no copy-on-write and memory waste in the tmpfs pagecache anymore. Userfaults on MAP_PRIVATE shmem keep triggering only on file holes like before. The real zeropage can also be built on a MAP_PRIVATE shmem mapping through UFFDIO_ZEROPAGE and that's safe because the zeropage pte is never dirty, in turn even an mprotect upgrading the vma permission from PROT_READ to PROT_READ|PROT_WRITE won't make the zeropage pte writable. Link: http://lkml.kernel.org/r/[email protected] Fixes: 4c27fe4c4c84 ("userfaultfd: shmem: add shmem_mcopy_atomic_pte for userfaultfd support") Signed-off-by: Andrea Arcangeli <[email protected]> Reported-by: Mike Rapoport <[email protected]> Reviewed-by: Hugh Dickins <[email protected]> Cc: <[email protected]> Cc: "Dr. David Alan Gilbert" <[email protected]> Cc: Jann Horn <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Peter Xu <[email protected]> Cc: [email protected] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-11-30userfaultfd: use ENOENT instead of EFAULT if the atomic copy user failsAndrea Arcangeli3-5/+5
Patch series "userfaultfd shmem updates". Jann found two bugs in the userfaultfd shmem MAP_SHARED backend: the lack of the VM_MAYWRITE check and the lack of i_size checks. Then looking into the above we also fixed the MAP_PRIVATE case. Hugh by source review also found a data loss source if UFFDIO_COPY is used on shmem MAP_SHARED PROT_READ mappings (the production usages incidentally run with PROT_READ|PROT_WRITE, so the data loss couldn't happen in those production usages like with QEMU). The whole patchset is marked for stable. We verified QEMU postcopy live migration with guest running on shmem MAP_PRIVATE run as well as before after the fix of shmem MAP_PRIVATE. Regardless if it's shmem or hugetlbfs or MAP_PRIVATE or MAP_SHARED, QEMU unconditionally invokes a punch hole if the guest mapping is filebacked and a MADV_DONTNEED too (needed to get rid of the MAP_PRIVATE COWs and for the anon backend). This patch (of 5): We internally used EFAULT to communicate with the caller, switch to ENOENT, so EFAULT can be used as a non internal retval. Link: http://lkml.kernel.org/r/[email protected] Fixes: 4c27fe4c4c84 ("userfaultfd: shmem: add shmem_mcopy_atomic_pte for userfaultfd support") Signed-off-by: Andrea Arcangeli <[email protected]> Reviewed-by: Mike Rapoport <[email protected]> Reviewed-by: Hugh Dickins <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Jann Horn <[email protected]> Cc: Peter Xu <[email protected]> Cc: "Dr. David Alan Gilbert" <[email protected]> Cc: <[email protected]> Cc: [email protected] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-11-30lib/test_kmod.c: fix rmmod double freeLuis Chamberlain1-1/+0
We free the misc device string twice on rmmod; fix this. Without this we cannot remove the module without crashing. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Luis Chamberlain <[email protected]> Reported-by: Randy Dunlap <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: <[email protected]> [4.12+] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-11-30hfsplus: do not free node before usingPan Bian1-1/+2
hfs_bmap_free() frees node via hfs_bnode_put(node). However it then reads node->this when dumping error message on an error path, which may result in a use-after-free bug. This patch frees node only when it is never used. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Pan Bian <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Ernesto A. Fernandez <[email protected]> Cc: Joe Perches <[email protected]> Cc: Viacheslav Dubeyko <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-11-30hfs: do not free node before usingPan Bian1-1/+2
hfs_bmap_free() frees the node via hfs_bnode_put(node). However, it then reads node->this when dumping error message on an error path, which may result in a use-after-free bug. This patch frees the node only when it is never again used. Link: http://lkml.kernel.org/r/[email protected] Fixes: a1185ffa2fc ("HFS rewrite") Signed-off-by: Pan Bian <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Joe Perches <[email protected]> Cc: Ernesto A. Fernandez <[email protected]> Cc: Viacheslav Dubeyko <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-11-30proc: update MAINTAINERS with proc.txtAlexey Dobriyan1-0/+1
Turns out that /proc has official documentation and people even trying to keep it uptodate. Link: http://lkml.kernel.org/r/20181116134630.GA8004@avx2 Signed-off-by: Alexey Dobriyan <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-11-30mm/page_alloc.c: fix calculation of pgdat->nr_zonesWei Yang1-1/+3
init_currently_empty_zone() will adjust pgdat->nr_zones and set it to 'zone_idx(zone) + 1' unconditionally. This is correct in the normal case, while not exact in hot-plug situation. This function is used in two places: * free_area_init_core() * move_pfn_range_to_zone() In the first case, we are sure zone index increase monotonically. While in the second one, this is under users control. One way to reproduce this is: ---------------------------- 1. create a virtual machine with empty node1 -m 4G,slots=32,maxmem=32G \ -smp 4,maxcpus=8 \ -numa node,nodeid=0,mem=4G,cpus=0-3 \ -numa node,nodeid=1,mem=0G,cpus=4-7 2. hot-add cpu 3-7 cpu-add [3-7] 2. hot-add memory to nod1 object_add memory-backend-ram,id=ram0,size=1G device_add pc-dimm,id=dimm0,memdev=ram0,node=1 3. online memory with following order echo online_movable > memory47/state echo online > memory40/state After this, node1 will have its nr_zones equals to (ZONE_NORMAL + 1) instead of (ZONE_MOVABLE + 1). Michal said: "Having an incorrect nr_zones might result in all sorts of problems which would be quite hard to debug (e.g. reclaim not considering the movable zone). I do not expect many users would suffer from this it but still this is trivial and obviously right thing to do so backporting to the stable tree shouldn't be harmful (last famous words)" Link: http://lkml.kernel.org/r/[email protected] Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online") Signed-off-by: Wei Yang <[email protected]> Acked-by: Michal Hocko <[email protected]> Reviewed-by: Oscar Salvador <[email protected]> Cc: Anshuman Khandual <[email protected]> Cc: Dave Hansen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-11-30mm: use swp_offset as key in shmem_replace_page()Yu Zhao1-2/+4
We changed the key of swap cache tree from swp_entry_t.val to swp_offset. We need to do so in shmem_replace_page() as well. Hugh said: "shmem_replace_page() has been wrong since the day I wrote it: good enough to work on swap "type" 0, which is all most people ever use (especially those few who need shmem_replace_page() at all), but broken once there are any non-0 swp_type bits set in the higher order bits" Link: http://lkml.kernel.org/r/[email protected] Fixes: f6ab1f7f6b2d ("mm, swap: use offset of swap entry as key of swap cache") Signed-off-by: Yu Zhao <[email protected]> Reviewed-by: Matthew Wilcox <[email protected]> Acked-by: Hugh Dickins <[email protected]> Cc: <[email protected]> [4.9+] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-11-30mm: cleancache: fix corruption on missed inode invalidationPavel Tikhomirov1-2/+6
If all pages are deleted from the mapping by memory reclaim and also moved to the cleancache: __delete_from_page_cache (no shadow case) unaccount_page_cache_page cleancache_put_page page_cache_delete mapping->nrpages -= nr (nrpages becomes 0) We don't clean the cleancache for an inode after final file truncation (removal). truncate_inode_pages_final check (nrpages || nrexceptional) is false no truncate_inode_pages no cleancache_invalidate_inode(mapping) These way when reading the new file created with same inode we may get these trash leftover pages from cleancache and see wrong data instead of the contents of the new file. Fix it by always doing truncate_inode_pages which is already ready for nrpages == 0 && nrexceptional == 0 case and just invalidates inode. [[email protected]: add comment, per Jan] Link: http://lkml.kernel.org/r/[email protected] Fixes: commit 91b0abe36a7b ("mm + fs: store shadow entries in page cache") Signed-off-by: Pavel Tikhomirov <[email protected]> Reviewed-by: Vasily Averin <[email protected]> Reviewed-by: Andrey Ryabinin <[email protected]> Reviewed-by: Jan Kara <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Andi Kleen <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-11-30ocfs2: fix deadlock caused by ocfs2_defrag_extent()Larry Chen1-21/+26
ocfs2_defrag_extent may fall into deadlock. ocfs2_ioctl_move_extents ocfs2_ioctl_move_extents ocfs2_move_extents ocfs2_defrag_extent ocfs2_lock_allocators_move_extents ocfs2_reserve_clusters inode_lock GLOBAL_BITMAP_SYSTEM_INODE __ocfs2_flush_truncate_log inode_lock GLOBAL_BITMAP_SYSTEM_INODE As backtrace shows above, ocfs2_reserve_clusters() will call inode_lock against the global bitmap if local allocator has not sufficient cluters. Once global bitmap could meet the demand, ocfs2_reserve_cluster will return success with global bitmap locked. After ocfs2_reserve_cluster(), if truncate log is full, __ocfs2_flush_truncate_log() will definitely fall into deadlock because it needs to inode_lock global bitmap, which has already been locked. To fix this bug, we could remove from ocfs2_lock_allocators_move_extents() the code which intends to lock global allocator, and put the removed code after __ocfs2_flush_truncate_log(). ocfs2_lock_allocators_move_extents() is referred by 2 places, one is here, the other does not need the data allocator context, which means this patch does not affect the caller so far. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Larry Chen <[email protected]> Reviewed-by: Changwei Ge <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Joseph Qi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-11-30mm/gup: finish consolidating error handlingJohn Hubbard1-2/+1
Commit df06b37ffe5a ("mm/gup: cache dev_pagemap while pinning pages") attempted to operate on each page that get_user_pages had retrieved. In order to do that, it created a common exit point from the routine. However, one case was missed, which this patch fixes up. Also, there was still an unnecessary shadow declaration (with a different type) of the "ret" variable, which this patch removes. Keith's description of the situation is: This also fixes a potentially leaked dev_pagemap reference count if a failure occurs when an iteration crosses a vma boundary. I don't think it's normal to have different vma's on a users mapped zone device memory, but good to fix anyway. I actually thought that this code: /* first iteration or cross vma bound */ if (!vma || start >= vma->vm_end) { vma = find_extend_vma(mm, start); if (!vma && in_gate_area(mm, start)) { ret = get_gate_page(mm, start & PAGE_MASK, gup_flags, &vma, pages ? &pages[i] : NULL); if (ret) goto out; dealt with the "you're trying to pin the gate page, as part of this call", rather than the generic case of crossing a vma boundary. (I think there's a fine point that I must be overlooking.) But it's still a valid case, either way. Link: http://lkml.kernel.org/r/[email protected] Fixes: df06b37ffe5a4 ("mm/gup: cache dev_pagemap while pinning pages") Signed-off-by: John Hubbard <[email protected]> Reviewed-by: Keith Busch <[email protected]> Cc: Dan Williams <[email protected]> Cc: Kirill A. Shutemov <[email protected]> Cc: Dave Hansen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-11-30MAINTAINERS: name change for LuisLuis Chamberlain1-6/+6
My name has changed, works better than Global Entry I tell ya. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Luis Chamberlain <[email protected]> Cc: Kees Cook <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-11-30unifdef: use memcpy instead of strncpyLinus Torvalds1-2/+2
New versions of gcc reasonably warn about the odd pattern of strncpy(p, q, strlen(q)); which really doesn't make sense: the strncpy() ends up being just a slow and odd way to write memcpy() in this case. There was a comment about _why_ the code used strncpy - to avoid the terminating NUL byte, but memcpy does the same and avoids the warning. Signed-off-by: Linus Torvalds <[email protected]>
2018-11-30Merge tag 'char-misc-4.20-rc5' of ↵Linus Torvalds8-25/+67
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc fixes from Greg KH: "Here are a few small char/misc driver fixes for 4.20-rc5 that resolve a number of reported issues. The "largest" here is the thunderbolt patch, which resolves an issue with NVM upgrade, the smallest being some fsi driver fixes. There's also a hyperv bugfix, and the usual binder bugfixes. All of these have been in linux-next with no reported issues" * tag 'char-misc-4.20-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: misc: mic/scif: fix copy-paste error in scif_create_remote_lookup thunderbolt: Prevent root port runtime suspend during NVM upgrade Drivers: hv: vmbus: check the creation_status in vmbus_establish_gpadl() binder: fix race that allows malicious free of live buffer fsi: fsi-scom.c: Remove duplicate header fsi: master-ast-cf: select GENERIC_ALLOCATOR
2018-11-30Merge tag 'driver-core-4.20-rc5' of ↵Linus Torvalds1-2/+8
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fix from Greg KH: "Here is a single driver core fix for 4.20-rc5 It resolves an issue with the data alignment in 'struct devres' for the ARC platform. The full details are in the commit changelog, but the short summary is the change is a single line: - unsigned long long data[]; /* guarantee ull alignment */ + u8 __aligned(ARCH_KMALLOC_MINALIGN) data[]; This has been in linux-next for a while with no reported issues" * tag 'driver-core-4.20-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: devres: Align data[] to ARCH_KMALLOC_MINALIGN
2018-11-30Merge tag 'staging-4.20-rc5' of ↵Linus Torvalds24-72/+103
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging and IIO driver fixes from Greg KH: "Here are some small IIO and staging driver fixes for 4.20-rc5. Nothing major, the IIO fix ended up touching the HID drivers at the same time, but the HID maintainer acked it. The staging fixes are all minor patches for reported issues and regressions, full details are in the shortlog. All of these have been in linux-next for a while with no reported issues" * tag 'staging-4.20-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: iio/hid-sensors: Fix IIO_CHAN_INFO_RAW returning wrong values for signed numbers staging: vchiq_arm: fix compat VCHIQ_IOC_AWAIT_COMPLETION staging: mt7621-pinctrl: fix uninitialized variable ngroups staging: rtl8723bs: Add missing return for cfg80211_rtw_get_station staging: most: use format specifier "%s" in snprintf staging: rtl8723bs: Fix incorrect sense of ether_addr_equal staging: mt7621-dma: fix potentially dereferencing uninitialized 'tx_desc' staging: comedi: clarify/unify macros for NI macro-defined terminals drivers: staging: cedrus: find ctx before dereferencing it ctx staging: rtl8723bs: Fix the return value in case of error in 'rtw_wx_read32()' staging: comedi: ni_mio_common: scale ao INSN_CONFIG_GET_CMD_TIMING_CONSTRAINTS iio:st_magn: Fix enable device after trigger
2018-11-30Merge tag 'usb-4.20-rc5' of ↵Linus Torvalds8-85/+86
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB/PHY driver fixes from Greg KH: "Here are some small USB and PHY driver fixes for 4.20-rc5 Nothing big at all, just the usual handful of USB fixes for reported issues, along with some gadget and PHY driver bug fixes. All of these have been in linux-next with no reported issues. Note, the USB gadget fixes were in linux-next on its own branch, not in mine, it just got merged into here yesterday and missed linux-next of today" * tag 'usb-4.20-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: usb: gadget: u_ether: fix unsafe list iteration USB: omap_udc: fix rejection of out transfers when DMA is used USB: omap_udc: fix USB gadget functionality on Palm Tungsten E USB: omap_udc: fix omap_udc_start() on 15xx machines USB: omap_udc: fix crashes on probe error and module removal USB: omap_udc: use devm_request_irq() usb: core: quirks: add RESET_RESUME quirk for Cherry G230 Stream series USB: usb-storage: Add new IDs to ums-realtek Revert "usb: dwc3: gadget: skip Set/Clear Halt when invalid" phy: qcom-qusb2: Fix HSTX_TRIM tuning with fused value for SDM845 phy: qcom-qusb2: Use HSTX_TRIM fused value as is dt-bindings: phy-qcom-qmp: Fix several mistakes from prior commits phy: uniphier-pcie: Depend on HAS_IOMEM
2018-11-30Merge tag 'mtd/fixes-for-4.20-rc5' of git://git.infradead.org/linux-mtdLinus Torvalds2-3/+31
Pull mtd fixes from Boris Brezillon: "NAND fix: - Fix BBT cache allocation done in nanddev_bbt_init() SPI NOR fixes: - Fix the erase type selection logic" * tag 'mtd/fixes-for-4.20-rc5' of git://git.infradead.org/linux-mtd: mtd: nand: Fix memory allocation in nanddev_bbt_init() mtd: spi-nor: fix erase_type array to indicate current map conf
2018-11-30test_hexdump: use memcpy instead of strncpyLinus Torvalds1-1/+1
New versions of gcc reasonably warn about the odd pattern of strncpy(p, q, strlen(q)); which really doesn't make sense: the strncpy() ends up being just a slow and odd way to write memcpy() in this case. Apparently there was a patch for this floating around earlier, but it got lost. Acked-again-by: Andy Shevchenko <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-11-30Merge tag 'imx-fixes-4.20-2' of ↵Olof Johansson1-6/+0
git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into fixes i.MX fixes for 4.20, round 2: - Reomve non-existing EEPROM device from imx51-zii-rdu1 board. It was added by mistake. * tag 'imx-fixes-4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux: ARM: dts: imx51-zii-rdu1: Remove EEPROM node Signed-off-by: Olof Johansson <[email protected]>
2018-11-30MAINTAINERS: Remove unused Qualcomm SoC mailing listAndy Gross1-1/+0
This patch removes the linux-soc mailing list from the Qualcomm SoC entry. We use the linux-msm and there is no need to have the second one and this clears the list for use by others. Signed-off-by: Andy Gross <[email protected]> Signed-off-by: Olof Johansson <[email protected]>
2018-11-30Merge tag 'omap-for-v4.20/fixes-rc4' of ↵Olof Johansson6-5/+8
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes Few minor fixes for omaps for v4.20-rc cycle This set of fixes contains minor regression fixes for LogicPD dts files for MMC pinctrl and interrupts. There is also one section annotation fix that shows up with Clang, and a fix for an unitialized field for omap1. * tag 'omap-for-v4.20/fixes-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: OMAP1: ams-delta: Fix possible use of uninitialized field ARM: dts: am3517-som: Fix WL127x Wifi interrupt ARM: dts: logicpd-somlv: Fix interrupt on mmc3_dat1 ARM: dts: LogicPD Torpedo: Fix mmc3_dat1 interrupt ARM: dts: am3517: Fix pinmuxing for CD on MMC1 ARM: OMAP2+: prm44xx: Fix section annotation on omap44xx_prm_enable_io_wakeup Signed-off-by: Olof Johansson <[email protected]>
2018-11-30Merge tag 'davinci-fixes-for-v4.20' of ↵Olof Johansson9-3/+152
git://git.kernel.org/pub/scm/linux/kernel/git/nsekhar/linux-davinci into fixes DaVinci: fix GPIO breakage after v4.19 This set of changes is needed to fix the broken GPIO support for DaVinci boards in legacy mode after certain changes made to the GPIO driver in 4.19, namely: commits 587f7a694f01 ("gpio: davinci: Use dev name for label and automatic base selection") and eb3744a2dd01 ("gpio: davinci: Do not assume continuous IRQ numbering"). * tag 'davinci-fixes-for-v4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/nsekhar/linux-davinci: ARM: davinci: dm644x: set the GPIO base to 0 ARM: davinci: da830: set the GPIO base to 0 ARM: davinci: dm355: set the GPIO base to 0 ARM: davinci: dm646x: set the GPIO base to 0 ARM: davinci: dm365: set the GPIO base to 0 ARM: davinci: da850: set the GPIO base to 0 gpio: davinci: restore a way to manually specify the GPIO base ARM: davinci: dm644x: define gpio interrupts as separate resources ARM: davinci: dm355: define gpio interrupts as separate resources ARM: davinci: dm646x: define gpio interrupts as separate resources ARM: davinci: dm365: define gpio interrupts as separate resources ARM: davinci: da8xx: define gpio interrupts as separate resources Signed-off-by: Olof Johansson <[email protected]>
2018-11-30Merge tag 'v4.20-rockchip-dts64fixes-1' of ↵Olof Johansson2-13/+1
git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into fixes Removal of vdd_log regulator on rk960 to fix a stability issue and fixup of the pcie reset polarity on puma-haikou. * tag 'v4.20-rockchip-dts64fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip: arm64: dts: rockchip: Fix PCIe reset polarity for rk3399-puma-haikou. arm64: dts: rockchip: remove vdd_log from rock960 to fix a stability issues Signed-off-by: Olof Johansson <[email protected]>
2018-11-30Merge tag 'v4.20-rockchip-dts32fixes-1' of ↵Olof Johansson1-1/+5
git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into fixes Moving the veyron memory node from memory@0 back to memory, as the firmware on these devices as issues identifying the formally correct node. * tag 'v4.20-rockchip-dts32fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip: ARM: dts: rockchip: Remove @0 from the veyron memory node Signed-off-by: Olof Johansson <[email protected]>
2018-11-30Merge tag 'at91-4.20-fixes' of ↵Olof Johansson1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/at91/linux into fixes AT91 fixes for 4.20 - Fix the SMC parent clock * tag 'at91-4.20-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/at91/linux: ARM: dts: at91: sama5d2: use the divided clock for SMC Signed-off-by: Olof Johansson <[email protected]>
2018-11-30Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds12-82/+16
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Misc fixes: - MCE related boot crash fix on certain AMD systems - FPU exception handling fix - FPU handling race fix - revert+rewrite of the RSDP boot protocol extension, use boot_params instead - documentation fix" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/MCE/AMD: Fix the thresholding machinery initialization order x86/fpu: Use the correct exception table macro in the XSTATE_OP wrapper x86/fpu: Disable bottom halves while loading FPU registers x86/acpi, x86/boot: Take RSDP address from boot params if available x86/boot: Mostly revert commit ae7e1238e68f2a ("Add ACPI RSDP address to setup_header") x86/ptrace: Fix documentation for tracehook_report_syscall_entry()
2018-11-30Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds21-47/+166
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Misc fixes: - counter freezing related regression fix - uprobes race fix - Intel PMU unusual event combination fix - .. and diverse tooling fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: uprobes: Fix handle_swbp() vs. unregister() + register() race once more perf/x86/intel: Disallow precise_ip on BTS events perf/x86/intel: Add generic branch tracing check to intel_pmu_has_bts() perf/x86/intel: Move branch tracing setup to the Intel-specific source file perf/x86/intel: Fix regression by default disabling perfmon v4 interrupt handling perf tools beauty ioctl: Support new ISO7816 commands tools uapi asm-generic: Synchronize ioctls.h tools arch x86: Update tools's copy of cpufeatures.h tools headers uapi: Synchronize i915_drm.h perf tools: Restore proper cwd on return from mnt namespace tools build feature: Check if get_current_dir_name() is available perf tools: Fix crash on synthesizing the unit
2018-11-30Merge branch 'efi-urgent-for-linus' of ↵Linus Torvalds1-10/+26
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI fix from Ingo Molnar: "An arm64 warning fix" * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efi: Prevent GICv3 WARN() by mapping the memreserve table before first use
2018-11-30Merge branch 'core-urgent-for-linus' of ↵Linus Torvalds1-4/+15
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool fixes from Ingo Molnar: "Two fixes for boundary conditions" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool: Fix segfault in .cold detection with -ffunction-sections objtool: Fix double-free in .cold detection error path
2018-11-30Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds12-125/+189
Pull vfs fixes from Al Viro: "Assorted fixes all over the place. The iov_iter one is this cycle regression (splice from UDP triggering WARN_ON()), the rest is older" * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: afs: Use d_instantiate() rather than d_add() and don't d_drop() afs: Fix missing net error handling afs: Fix validation/callback interaction iov_iter: teach csum_and_copy_to_iter() to handle pipe-backed ones exportfs: do not read dentry after free exportfs: fix 'passing zero to ERR_PTR()' warning aio: fix failure to put the file pointer sysv: return 'err' instead of 0 in __sysv_write_inode
2018-11-30Merge tag 'trace-v4.20-rc4' of ↵Linus Torvalds5-6/+65
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull more tracing fixes from Steven Rostedt: "Two more fixes: - Change idx variable in DO_TRACE macro to __idx to avoid name conflicts. A kvm event had "idx" as a parameter and it confused the macro. - Fix a race where interrupts would be traced when set_graph_function was set. The previous patch set increased a race window that tricked the function graph tracer to think it should trace interrupts when it really should not have. The bug has been there before, but was seldom hit. Only the last patch series made it more common" * tag 'trace-v4.20-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing/fgraph: Fix set_graph_function from showing interrupts tracepoint: Use __idx instead of idx in DO_TRACE macro to make it unique
2018-11-30Merge tag 'trace-v4.20-rc3' of ↵Linus Torvalds17-175/+78
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "While rewriting the function graph tracer, I discovered a design flaw that was introduced by a patch that tried to fix one bug, but by doing so created another bug. As both bugs corrupt the output (but they do not crash the kernel), I decided to fix the design such that it could have both bugs fixed. The original fix, fixed time reporting of the function graph tracer when doing a max_depth of one. This was code that can test how much the kernel interferes with userspace. But in doing so, it could corrupt the time keeping of the function profiler. The issue is that the curr_ret_stack variable was being used for two different meanings. One was to keep track of the stack pointer on the ret_stack (shadow stack used by the function graph tracer), and the other use case was the graph call depth. Although, the two may be closely related, where they got updated was the issue that lead to the two different bugs that required the two use cases to be updated differently. The big issue with this fix is that it requires changing each architecture. The good news is, I was able to remove a lot of code that was duplicated within the architectures and place it into a single location. Then I could make the fix in one place. I pushed this code into linux-next to let it settle over a week, and before doing so, I cross compiled all the affected architectures to make sure that they built fine. In the mean time, I also pulled in a patch that fixes the sched_switch previous tasks state output, that was not actually correct" * tag 'trace-v4.20-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: sched, trace: Fix prev_state output in sched_switch tracepoint function_graph: Have profiler use curr_ret_stack and not depth function_graph: Reverse the order of pushing the ret_stack and the callback function_graph: Move return callback before update of curr_ret_stack function_graph: Use new curr_ret_depth to manage depth instead of curr_ret_stack function_graph: Make ftrace_push_return_trace() static sparc/function_graph: Simplify with function_graph_enter() sh/function_graph: Simplify with function_graph_enter() s390/function_graph: Simplify with function_graph_enter() riscv/function_graph: Simplify with function_graph_enter() powerpc/function_graph: Simplify with function_graph_enter() parisc: function_graph: Simplify with function_graph_enter() nds32: function_graph: Simplify with function_graph_enter() MIPS: function_graph: Simplify with function_graph_enter() microblaze: function_graph: Simplify with function_graph_enter() arm64: function_graph: Simplify with function_graph_enter() ARM: function_graph: Simplify with function_graph_enter() x86/function_graph: Simplify with function_graph_enter() function_graph: Create function_graph_enter() to consolidate architecture code
2018-11-30ACPI/IORT: Fix iort_get_platform_device_domain() uninitialized pointer valueLorenzo Pieralisi1-1/+1
Running the Clang static analyzer on IORT code detected the following error: Logic error: Branch condition evaluates to a garbage value in iort_get_platform_device_domain() If the named component associated with a given device has no IORT mappings, iort_get_platform_device_domain() exits its MSI mapping loop with msi_parent pointer containing garbage, which can lead to erroneous code path execution. Initialize the msi_parent pointer, fixing the bug. Fixes: d4f54a186667 ("ACPI: platform: setup MSI domain for ACPI based platform device") Reported-by: Patrick Bellasi <[email protected]> Reviewed-by: Hanjun Guo <[email protected]> Acked-by: Will Deacon <[email protected]> Cc: Sudeep Holla <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Signed-off-by: Lorenzo Pieralisi <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>