aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2019-06-29Merge branch 'efi-urgent-for-linus' of ↵Linus Torvalds4-13/+18
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI fixes from Ingo Molnar: "Four fixes: - fix a kexec crash on arm64 - fix a reboot crash on some Android platforms - future-proof the code for upcoming ACPI 6.2 changes - fix a build warning on x86" * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efibc: Replace variable set function in notifier call x86/efi: fix a -Wtype-limits compilation warning efi/bgrt: Drop BGRT status field reserved bits check efi/memreserve: deal with memreserve entries in unmapped memory
2019-06-29Merge tag 'pm-5.2-rc7' of ↵Linus Torvalds3-6/+31
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fix from Rafael Wysocki: "Avoid skipping bus-level PCI power management during system resume for PCIe ports left in D0 during the preceding suspend transition on platforms where the power states of those ports can change out of the PCI layer's control" * tag 'pm-5.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PCI: PM: Avoid skipping bus-level PM on platforms without ACPI
2019-06-29x86/timer: Skip PIT initialization on modern chipsetsThomas Gleixner6-3/+63
Recent Intel chipsets including Skylake and ApolloLake have a special ITSSPRC register which allows the 8254 PIT to be gated. When gated, the 8254 registers can still be programmed as normal, but there are no IRQ0 timer interrupts. Some products such as the Connex L1430 and exone go Rugged E11 use this register to ship with the PIT gated by default. This causes Linux to fail to boot: Kernel panic - not syncing: IO-APIC + timer doesn't work! Boot with apic=debug and send a report. The panic happens before the framebuffer is initialized, so to the user, it appears as an early boot hang on a black screen. Affected products typically have a BIOS option that can be used to enable the 8254 and make Linux work (Chipset -> South Cluster Configuration -> Miscellaneous Configuration -> 8254 Clock Gating), however it would be best to make Linux support the no-8254 case. Modern sytems allow to discover the TSC and local APIC timer frequencies, so the calibration against the PIT is not required. These systems have always running timers and the local APIC timer works also in deep power states. So the setup of the PIT including the IO-APIC timer interrupt delivery checks are a pointless exercise. Skip the PIT setup and the IO-APIC timer interrupt checks on these systems, which avoids the panic caused by non ticking PITs and also speeds up the boot process. Thanks to Daniel for providing the changelog, initial analysis of the problem and testing against a variety of machines. Reported-by: Daniel Drake <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Daniel Drake <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2019-06-29Merge tag 'xarray-5.2-rc6' of git://git.infradead.org/users/willy/linux-daxLinus Torvalds6-5/+108
Pull XArray fixes from Matthew Wilcox: - Account XArray nodes for the page cache to the appropriate cgroup (Johannes Weiner) - Fix idr_get_next() when called under the RCU lock (Matthew Wilcox) - Add a test for xa_insert() (Matthew Wilcox) * tag 'xarray-5.2-rc6' of git://git.infradead.org/users/willy/linux-dax: XArray tests: Add check_insert idr: Fix idr_get_next race with idr_remove mm: fix page cache convergence regression
2019-06-29Merge branch 'akpm' (patches from Andrew)Linus Torvalds20-80/+97
Merge misc fixes from Andrew Morton: "15 fixes" * emailed patches from Andrew Morton <[email protected]>: linux/kernel.h: fix overflow for DIV_ROUND_UP_ULL mm, swap: fix THP swap out fork,memcg: alloc_thread_stack_node needs to set tsk->stack MAINTAINERS: add CLANG/LLVM BUILD SUPPORT info mm/vmalloc.c: avoid bogus -Wmaybe-uninitialized warning mm/page_idle.c: fix oops because end_pfn is larger than max_pfn initramfs: fix populate_initrd_image() section mismatch mm/oom_kill.c: fix uninitialized oc->constraint mm: hugetlb: soft-offline: dissolve_free_huge_page() return zero on !PageHuge mm: soft-offline: return -EBUSY if set_hwpoison_free_buddy_page() fails signal: remove the wrong signal_pending() check in restore_user_sigmask() fs/binfmt_flat.c: make load_flat_shared_library() work mm/mempolicy.c: fix an incorrect rebind node in mpol_rebind_nodemask fs/proc/array.c: allow reporting eip/esp for all coredumping threads mm/dev_pfn: exclude MEMORY_DEVICE_PRIVATE while computing virtual address
2019-06-29Merge tag 'arc-5.2-rc7' of ↵Linus Torvalds2-8/+157
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc Pull ARC fixes from Vineet Gupta: - hsdk platform unifying apertures - build system CROSS_COMPILE prefix * tag 'arc-5.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: ARC: [plat-hsdk]: unify memory apertures configuration ARC: build: Try to guess CROSS_COMPILE with cc-cross-prefix
2019-06-29Merge tag 'riscv-for-v5.2/fixes-rc7' of ↵Linus Torvalds6-16/+39
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Paul Walmsley: "Minor RISC-V fixes and one defconfig update. The fixes have no functional impact: - Fix some comment text in the memory management vmalloc_fault path. - Fix some warnings from the DT compiler in our newly-added DT files. - Change the newly-added DT bindings such that SoC IP blocks with external I/O are marked as "disabled" by default, then enable them explicitly in board DT files when the devices are used on the board. This aligns the bindings with existing upstream practice. - Add the MIT license as an option for a minor header file, at the request of one of the U-Boot maintainers. The RISC-V defconfig update builds the SiFive SPI driver and the MMC-SPI driver by default. The intention here is to make v5.2 more usable for testers and users with RISC-V hardware" * tag 'riscv-for-v5.2/fixes-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: mm: Fix code comment dt-bindings: clock: sifive: add MIT license as an option for the header file dt-bindings: riscv: resolve 'make dt_binding_check' warnings riscv: dts: Re-organize the DT nodes RISC-V: defconfig: enable MMC & SPI for RISC-V
2019-06-29Merge tag 'nfs-for-5.2-4' of git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds2-9/+9
Pull two more NFS client fixes from Anna Schumaker: "These are both stable fixes. One to calculate the correct client message length in the case of partial transmissions. And the other to set the proper TCP timeout for flexfiles" * tag 'nfs-for-5.2-4' of git://git.linux-nfs.org/projects/anna/linux-nfs: NFS/flexfiles: Use the correct TCP timeout for flexfiles I/O SUNRPC: Fix up calculation of client message length
2019-06-29Merge tag 'ceph-for-5.2-rc7' of git://github.com/ceph/ceph-clientLinus Torvalds1-1/+2
Pull ceph fix from Ilya Dryomov: "A small fix for a potential -rc1 regression from Jeff" * tag 'ceph-for-5.2-rc7' of git://github.com/ceph/ceph-client: ceph: fix ceph_mdsc_build_path to not stop on first component
2019-06-29Merge tag 'scsi-fixes' of ↵Linus Torvalds1-2/+4
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fix from James Bottomley: "One simple fix for a driver use after free" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: vmw_pscsi: Fix use-after-free in pvscsi_queue_lck()
2019-06-29Merge tag 'for-linus-20190628' of git://git.kernel.dk/linux-blockLinus Torvalds2-4/+3
Pull block fixes from Jens Axboe: "Just two small fixes. One from Paolo, fixing a silly mistake in BFQ. The other one is from me, ensuring that we have ->file cleared in the io_uring request a bit earlier. That avoids a use-before-free, if we encounter an error before ->file is assigned" * tag 'for-linus-20190628' of git://git.kernel.dk/linux-block: block, bfq: fix operator in BFQQ_TOTALLY_SEEKY io_uring: ensure req->file is cleared on allocation
2019-06-29Merge tag 'pinctrl-v5.2-3' of ↵Linus Torvalds3-27/+33
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control fixes from Linus Walleij: "Sorry to bomb in fixes this late. Maybe I can comfort you by saying it is only driver fixes, and mostly IRQ handling which is something GPIO and pin control drivers never get right. You think it works and then it doesn't. Summary: - Fix IRQ setup in the MCP23s08. - Fix pin setup on pins > 31 in the Ocelot driver. - Fix IRQs in the Mediatek driver" * tag 'pinctrl-v5.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: mediatek: Update cur_mask in mask/mask ops pinctrl: mediatek: Ignore interrupts that are wake only during resume pinctrl: ocelot: fix pinmuxing for pins after 31 pinctrl: ocelot: fix gpio direction for pins after 31 pinctrl: mcp23s08: Fix add_data and irqchip_add_nested call order
2019-06-29linux/kernel.h: fix overflow for DIV_ROUND_UP_ULLVinod Koul1-1/+2
DIV_ROUND_UP_ULL adds the two arguments and then invokes DIV_ROUND_DOWN_ULL. But on a 32bit system the addition of two 32 bit values can overflow. DIV_ROUND_DOWN_ULL does it correctly and stashes the addition into a unsigned long long so cast the result to unsigned long long here to avoid the overflow condition. [[email protected]: DIV_ROUND_UP_ULL must be an rval] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Vinod Koul <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Bjorn Andersson <[email protected]> Cc: Randy Dunlap <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-06-29mm, swap: fix THP swap outHuang Ying1-5/+2
0-Day test system reported some OOM regressions for several THP (Transparent Huge Page) swap test cases. These regressions are bisected to 6861428921b5 ("block: always define BIO_MAX_PAGES as 256"). In the commit, BIO_MAX_PAGES is set to 256 even when THP swap is enabled. So the bio_alloc(gfp_flags, 512) in get_swap_bio() may fail when swapping out THP. That causes the OOM. As in the patch description of 6861428921b5 ("block: always define BIO_MAX_PAGES as 256"), THP swap should use multi-page bvec to write THP to swap space. So the issue is fixed via doing that in get_swap_bio(). BTW: I remember I have checked the THP swap code when 6861428921b5 ("block: always define BIO_MAX_PAGES as 256") was merged, and thought the THP swap code needn't to be changed. But apparently, I was wrong. I should have done this at that time. Link: http://lkml.kernel.org/r/[email protected] Fixes: 6861428921b5 ("block: always define BIO_MAX_PAGES as 256") Signed-off-by: "Huang, Ying" <[email protected]> Reviewed-by: Ming Lei <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Daniel Jordan <[email protected]> Cc: Jens Axboe <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-06-29fork,memcg: alloc_thread_stack_node needs to set tsk->stackAndrea Arcangeli1-1/+5
Commit 5eed6f1dff87 ("fork,memcg: fix crash in free_thread_stack on memcg charge fail") corrected two instances, but there was a third instance of this bug. Without setting tsk->stack, if memcg_charge_kernel_stack fails, it'll execute free_thread_stack() on a dangling pointer. Enterprise kernels are compiled with VMAP_STACK=y so this isn't critical, but custom VMAP_STACK=n builds should have some performance advantage, with the drawback of risking to fail fork because compaction didn't succeed. So as long as VMAP_STACK=n is a supported option it's worth fixing it upstream. Link: http://lkml.kernel.org/r/[email protected] Fixes: 9b6f7e163cd0 ("mm: rework memcg kernel stack accounting") Signed-off-by: Andrea Arcangeli <[email protected]> Reviewed-by: Rik van Riel <[email protected]> Acked-by: Roman Gushchin <[email protected]> Acked-by: Michal Hocko <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-06-29MAINTAINERS: add CLANG/LLVM BUILD SUPPORT infoNick Desaulniers1-0/+8
Add keyword support so that our mailing list gets cc'ed for clang/llvm patches. We're pretty active on our mailing list so far as code review. There are numerous Googlers like myself that are paid to support building the Linux kernel with Clang and LLVM. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Nick Desaulniers <[email protected]> Reviewed-by: Nathan Chancellor <[email protected]> Cc: Joe Perches <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-06-29mm/vmalloc.c: avoid bogus -Wmaybe-uninitialized warningArnd Bergmann1-2/+2
gcc gets confused in pcpu_get_vm_areas() because there are too many branches that affect whether 'lva' was initialized before it gets used: mm/vmalloc.c: In function 'pcpu_get_vm_areas': mm/vmalloc.c:991:4: error: 'lva' may be used uninitialized in this function [-Werror=maybe-uninitialized] insert_vmap_area_augment(lva, &va->rb_node, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ &free_vmap_area_root, &free_vmap_area_list); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ mm/vmalloc.c:916:20: note: 'lva' was declared here struct vmap_area *lva; ^~~ Add an intialization to NULL, and check whether this has changed before the first use. [[email protected]: tweak comments] Link: http://lkml.kernel.org/r/[email protected] Fixes: 68ad4a330433 ("mm/vmalloc.c: keep track of free blocks for vmap allocation") Signed-off-by: Arnd Bergmann <[email protected]> Reviewed-by: Uladzislau Rezki (Sony) <[email protected]> Cc: Joel Fernandes <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-06-29mm/page_idle.c: fix oops because end_pfn is larger than max_pfnColin Ian King1-2/+2
Currently the calcuation of end_pfn can round up the pfn number to more than the actual maximum number of pfns, causing an Oops. Fix this by ensuring end_pfn is never more than max_pfn. This can be easily triggered when on systems where the end_pfn gets rounded up to more than max_pfn using the idle-page stress-ng stress test: sudo stress-ng --idle-page 0 BUG: unable to handle kernel paging request at 00000000000020d8 #PF error: [normal kernel read fault] PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 1 PID: 11039 Comm: stress-ng-idle- Not tainted 5.0.0-5-generic #6-Ubuntu Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 RIP: 0010:page_idle_get_page+0xc8/0x1a0 Code: 0f b1 0a 75 7d 48 8b 03 48 89 c2 48 c1 e8 33 83 e0 07 48 c1 ea 36 48 8d 0c 40 4c 8d 24 88 49 c1 e4 07 4c 03 24 d5 00 89 c3 be <49> 8b 44 24 58 48 8d b8 80 a1 02 00 e8 07 d5 77 00 48 8b 53 08 48 RSP: 0018:ffffafd7c672fde8 EFLAGS: 00010202 RAX: 0000000000000005 RBX: ffffe36341fff700 RCX: 000000000000000f RDX: 0000000000000284 RSI: 0000000000000275 RDI: 0000000001fff700 RBP: ffffafd7c672fe00 R08: ffffa0bc34056410 R09: 0000000000000276 R10: ffffa0bc754e9b40 R11: ffffa0bc330f6400 R12: 0000000000002080 R13: ffffe36341fff700 R14: 0000000000080000 R15: ffffa0bc330f6400 FS: 00007f0ec1ea5740(0000) GS:ffffa0bc7db00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000020d8 CR3: 0000000077d68000 CR4: 00000000000006e0 Call Trace: page_idle_bitmap_write+0x8c/0x140 sysfs_kf_bin_write+0x5c/0x70 kernfs_fop_write+0x12e/0x1b0 __vfs_write+0x1b/0x40 vfs_write+0xab/0x1b0 ksys_write+0x55/0xc0 __x64_sys_write+0x1a/0x20 do_syscall_64+0x5a/0x110 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Link: http://lkml.kernel.org/r/[email protected] Fixes: 33c3fc71c8cf ("mm: introduce idle page tracking") Signed-off-by: Colin Ian King <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Acked-by: Vladimir Davydov <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Stephen Rothwell <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-06-29initramfs: fix populate_initrd_image() section mismatchGeert Uytterhoeven1-2/+2
With gcc-4.6.3: WARNING: vmlinux.o(.text.unlikely+0x140): Section mismatch in reference from the function populate_initrd_image() to the variable .init.ramfs.info:__initramfs_size The function populate_initrd_image() references the variable __init __initramfs_size. This is often because populate_initrd_image lacks a __init annotation or the annotation of __initramfs_size is wrong. WARNING: vmlinux.o(.text.unlikely+0x14c): Section mismatch in reference from the function populate_initrd_image() to the function .init.text:unpack_to_rootfs() The function populate_initrd_image() references the function __init unpack_to_rootfs(). This is often because populate_initrd_image lacks a __init annotation or the annotation of unpack_to_rootfs is wrong. WARNING: vmlinux.o(.text.unlikely+0x198): Section mismatch in reference from the function populate_initrd_image() to the function .init.text:xwrite() The function populate_initrd_image() references the function __init xwrite(). This is often because populate_initrd_image lacks a __init annotation or the annotation of xwrite is wrong. Indeed, if the compiler decides not to inline populate_initrd_image(), a warning is generated. Fix this by adding the missing __init annotations. Link: http://lkml.kernel.org/r/[email protected] Fixes: 7c184ecd262fe64f ("initramfs: factor out a helper to populate the initrd image") Signed-off-by: Geert Uytterhoeven <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-06-29mm/oom_kill.c: fix uninitialized oc->constraintYafang Shao1-7/+5
In dump_oom_summary() oc->constraint is used to show oom_constraint_text, but it hasn't been set before. So the value of it is always the default value 0. We should inititialize it before. Bellow is the output when memcg oom occurs, before this patch: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null), cpuset=/,mems_allowed=0,oom_memcg=/foo,task_memcg=/foo,task=bash,pid=7997,uid=0 after this patch: oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null), cpuset=/,mems_allowed=0,oom_memcg=/foo,task_memcg=/foo,task=bash,pid=13681,uid=0 Link: http://lkml.kernel.org/r/[email protected] Fixes: ef8444ea01d7 ("mm, oom: reorganize the oom report in dump_header") Signed-off-by: Yafang Shao <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Wind Yu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-06-29mm: hugetlb: soft-offline: dissolve_free_huge_page() return zero on !PageHugeNaoya Horiguchi2-13/+21
madvise(MADV_SOFT_OFFLINE) often returns -EBUSY when calling soft offline for hugepages with overcommitting enabled. That was caused by the suboptimal code in current soft-offline code. See the following part: ret = migrate_pages(&pagelist, new_page, NULL, MPOL_MF_MOVE_ALL, MIGRATE_SYNC, MR_MEMORY_FAILURE); if (ret) { ... } else { /* * We set PG_hwpoison only when the migration source hugepage * was successfully dissolved, because otherwise hwpoisoned * hugepage remains on free hugepage list, then userspace will * find it as SIGBUS by allocation failure. That's not expected * in soft-offlining. */ ret = dissolve_free_huge_page(page); if (!ret) { if (set_hwpoison_free_buddy_page(page)) num_poisoned_pages_inc(); } } return ret; Here dissolve_free_huge_page() returns -EBUSY if the migration source page was freed into buddy in migrate_pages(), but even in that case we actually has a chance that set_hwpoison_free_buddy_page() succeeds. So that means current code gives up offlining too early now. dissolve_free_huge_page() checks that a given hugepage is suitable for dissolving, where we should return success for !PageHuge() case because the given hugepage is considered as already dissolved. This change also affects other callers of dissolve_free_huge_page(), which are cleaned up together. [[email protected]: v3] Link: http://lkml.kernel.org/r/[email protected]: http://lkml.kernel.org/r/[email protected] Fixes: 6bc9b56433b76 ("mm: fix race on soft-offlining") Signed-off-by: Naoya Horiguchi <[email protected]> Reported-by: Chen, Jerry T <[email protected]> Tested-by: Chen, Jerry T <[email protected]> Reviewed-by: Mike Kravetz <[email protected]> Reviewed-by: Oscar Salvador <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Xishi Qiu <[email protected]> Cc: "Chen, Jerry T" <[email protected]> Cc: "Zhuo, Qiuxu" <[email protected]> Cc: <[email protected]> [4.19+] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-06-29mm: soft-offline: return -EBUSY if set_hwpoison_free_buddy_page() failsNaoya Horiguchi1-0/+2
The pass/fail of soft offline should be judged by checking whether the raw error page was finally contained or not (i.e. the result of set_hwpoison_free_buddy_page()), but current code do not work like that. It might lead us to misjudge the test result when set_hwpoison_free_buddy_page() fails. Without this fix, there are cases where madvise(MADV_SOFT_OFFLINE) may not offline the original page and will not return an error. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Naoya Horiguchi <[email protected]> Fixes: 6bc9b56433b76 ("mm: fix race on soft-offlining") Reviewed-by: Mike Kravetz <[email protected]> Reviewed-by: Oscar Salvador <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Xishi Qiu <[email protected]> Cc: "Chen, Jerry T" <[email protected]> Cc: "Zhuo, Qiuxu" <[email protected]> Cc: <[email protected]> [4.19+] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-06-29signal: remove the wrong signal_pending() check in restore_user_sigmask()Oleg Nesterov6-28/+36
This is the minimal fix for stable, I'll send cleanups later. Commit 854a6ed56839 ("signal: Add restore_user_sigmask()") introduced the visible change which breaks user-space: a signal temporary unblocked by set_user_sigmask() can be delivered even if the caller returns success or timeout. Change restore_user_sigmask() to accept the additional "interrupted" argument which should be used instead of signal_pending() check, and update the callers. Eric said: : For clarity. I don't think this is required by posix, or fundamentally to : remove the races in select. It is what linux has always done and we have : applications who care so I agree this fix is needed. : : Further in any case where the semantic change that this patch rolls back : (aka where allowing a signal to be delivered and the select like call to : complete) would be advantage we can do as well if not better by using : signalfd. : : Michael is there any chance we can get this guarantee of the linux : implementation of pselect and friends clearly documented. The guarantee : that if the system call completes successfully we are guaranteed that no : signal that is unblocked by using sigmask will be delivered? Link: http://lkml.kernel.org/r/[email protected] Fixes: 854a6ed56839a40f6b5d02a2962f48841482eec4 ("signal: Add restore_user_sigmask()") Signed-off-by: Oleg Nesterov <[email protected]> Reported-by: Eric Wong <[email protected]> Tested-by: Eric Wong <[email protected]> Acked-by: "Eric W. Biederman" <[email protected]> Acked-by: Arnd Bergmann <[email protected]> Acked-by: Deepa Dinamani <[email protected]> Cc: Michael Kerrisk <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Jason Baron <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Al Viro <[email protected]> Cc: David Laight <[email protected]> Cc: <[email protected]> [5.0+] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-06-29fs/binfmt_flat.c: make load_flat_shared_library() workJann Horn1-16/+7
load_flat_shared_library() is broken: It only calls load_flat_file() if prepare_binprm() returns zero, but prepare_binprm() returns the number of bytes read - so this only happens if the file is empty. Instead, call into load_flat_file() if the number of bytes read is non-negative. (Even if the number of bytes is zero - in that case, load_flat_file() will see nullbytes and return a nice -ENOEXEC.) In addition, remove the code related to bprm creds and stop using prepare_binprm() - this code is loading a library, not a main executable, and it only actually uses the members "buf", "file" and "filename" of the linux_binprm struct. Instead, call kernel_read() directly. Link: http://lkml.kernel.org/r/[email protected] Fixes: 287980e49ffc ("remove lots of IS_ERR_VALUE abuses") Signed-off-by: Jann Horn <[email protected]> Cc: Alexander Viro <[email protected]> Cc: Kees Cook <[email protected]> Cc: Nicolas Pitre <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Russell King <[email protected]> Cc: Greg Ungerer <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-06-29mm/mempolicy.c: fix an incorrect rebind node in mpol_rebind_nodemaskzhong jiang1-1/+1
mpol_rebind_nodemask() is called for MPOL_BIND and MPOL_INTERLEAVE mempoclicies when the tasks's cpuset's mems_allowed changes. For policies created without MPOL_F_STATIC_NODES or MPOL_F_RELATIVE_NODES, it works by remapping the policy's allowed nodes (stored in v.nodes) using the previous value of mems_allowed (stored in w.cpuset_mems_allowed) as the domain of map and the new mems_allowed (passed as nodes) as the range of the map (see the comment of bitmap_remap() for details). The result of remapping is stored back as policy's nodemask in v.nodes, and the new value of mems_allowed should be stored in w.cpuset_mems_allowed to facilitate the next rebind, if it happens. However, 213980c0f23b ("mm, mempolicy: simplify rebinding mempolicies when updating cpusets") introduced a bug where the result of remapping is stored in w.cpuset_mems_allowed instead. Thus, a mempolicy's allowed nodes can evolve in an unexpected way after a series of rebinding due to cpuset mems_allowed changes, possibly binding to a wrong node or a smaller number of nodes which may e.g. overload them. This patch fixes the bug so rebinding again works as intended. [[email protected]: new changlog] Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Fixes: 213980c0f23b ("mm, mempolicy: simplify rebinding mempolicies when updating cpusets") Signed-off-by: zhong jiang <[email protected]> Reviewed-by: Vlastimil Babka <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Anshuman Khandual <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: Ralph Campbell <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-06-29fs/proc/array.c: allow reporting eip/esp for all coredumping threadsJohn Ogness1-1/+1
0a1eb2d474ed ("fs/proc: Stop reporting eip and esp in /proc/PID/stat") stopped reporting eip/esp and fd7d56270b52 ("fs/proc: Report eip/esp in /prod/PID/stat for coredumping") reintroduced the feature to fix a regression with userspace core dump handlers (such as minicoredumper). Because PF_DUMPCORE is only set for the primary thread, this didn't fix the original problem for secondary threads. Allow reporting the eip/esp for all threads by checking for PF_EXITING as well. This is set for all the other threads when they are killed. coredump_wait() waits for all the tasks to become inactive before proceeding to invoke a core dumper. Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Fixes: fd7d56270b526ca3 ("fs/proc: Report eip/esp in /prod/PID/stat for coredumping") Signed-off-by: John Ogness <[email protected]> Reported-by: Jan Luebbe <[email protected]> Tested-by: Jan Luebbe <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-06-29mm/dev_pfn: exclude MEMORY_DEVICE_PRIVATE while computing virtual addressAnshuman Khandual1-1/+1
The presence of struct page does not guarantee linear mapping for the pfn physical range. Device private memory which is non-coherent is excluded from linear mapping during devm_memremap_pages() though they will still have struct page coverage. Change pfn_t_to_virt() to just check for device private memory before giving out virtual address for a given pfn. pfn_t_to_virt() actually has no callers. Let's fix it for the 5.2 kernel and remove it in 5.3. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Anshuman Khandual <[email protected]> Cc: Dan Williams <[email protected]> Cc: Jérôme Glisse <[email protected]> Cc: Laurent Dufour <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-06-28drm/panfrost: Fix a double-free errorBoris Brezillon1-1/+1
drm_gem_shmem_create_with_handle() returns a GEM object and attach a handle to it. When the user closes the DRM FD, the core releases all GEM handles along with their backing GEM objs, which can lead to a double-free issue if panfrost_ioctl_create_bo() failed and went through the err_free path where drm_gem_object_put_unlocked() is called without deleting the associate handle. Replace this drm_gem_object_put_unlocked() call by a drm_gem_handle_delete() one to fix that. Fixes: f3ba91228e8e ("drm/panfrost: Add initial panfrost driver") Cc: <[email protected]> Signed-off-by: Boris Brezillon <[email protected]> Signed-off-by: Rob Herring <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2019-06-28tracing/snapshot: Resize spare buffer if size changedEiichi Tsukata1-4/+6
Current snapshot implementation swaps two ring_buffers even though their sizes are different from each other, that can cause an inconsistency between the contents of buffer_size_kb file and the current buffer size. For example: # cat buffer_size_kb 7 (expanded: 1408) # echo 1 > events/enable # grep bytes per_cpu/cpu0/stats bytes: 1441020 # echo 1 > snapshot // current:1408, spare:1408 # echo 123 > buffer_size_kb // current:123, spare:1408 # echo 1 > snapshot // current:1408, spare:123 # grep bytes per_cpu/cpu0/stats bytes: 1443700 # cat buffer_size_kb 123 // != current:1408 And also, a similar per-cpu case hits the following WARNING: Reproducer: # echo 1 > per_cpu/cpu0/snapshot # echo 123 > buffer_size_kb # echo 1 > per_cpu/cpu0/snapshot WARNING: WARNING: CPU: 0 PID: 1946 at kernel/trace/trace.c:1607 update_max_tr_single.part.0+0x2b8/0x380 Modules linked in: CPU: 0 PID: 1946 Comm: bash Not tainted 5.2.0-rc6 #20 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-2.fc30 04/01/2014 RIP: 0010:update_max_tr_single.part.0+0x2b8/0x380 Code: ff e8 dc da f9 ff 0f 0b e9 88 fe ff ff e8 d0 da f9 ff 44 89 ee bf f5 ff ff ff e8 33 dc f9 ff 41 83 fd f5 74 96 e8 b8 da f9 ff <0f> 0b eb 8d e8 af da f9 ff 0f 0b e9 bf fd ff ff e8 a3 da f9 ff 48 RSP: 0018:ffff888063e4fca0 EFLAGS: 00010093 RAX: ffff888066214380 RBX: ffffffff99850fe0 RCX: ffffffff964298a8 RDX: 0000000000000000 RSI: 00000000fffffff5 RDI: 0000000000000005 RBP: 1ffff1100c7c9f96 R08: ffff888066214380 R09: ffffed100c7c9f9b R10: ffffed100c7c9f9a R11: 0000000000000003 R12: 0000000000000000 R13: 00000000ffffffea R14: ffff888066214380 R15: ffffffff99851060 FS: 00007f9f8173c700(0000) GS:ffff88806d000000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000714dc0 CR3: 0000000066fa6000 CR4: 00000000000006f0 Call Trace: ? trace_array_printk_buf+0x140/0x140 ? __mutex_lock_slowpath+0x10/0x10 tracing_snapshot_write+0x4c8/0x7f0 ? trace_printk_init_buffers+0x60/0x60 ? selinux_file_permission+0x3b/0x540 ? tracer_preempt_off+0x38/0x506 ? trace_printk_init_buffers+0x60/0x60 __vfs_write+0x81/0x100 vfs_write+0x1e1/0x560 ksys_write+0x126/0x250 ? __ia32_sys_read+0xb0/0xb0 ? do_syscall_64+0x1f/0x390 do_syscall_64+0xc1/0x390 entry_SYSCALL_64_after_hwframe+0x49/0xbe This patch adds resize_buffer_duplicate_size() to check if there is a difference between current/spare buffer sizes and resize a spare buffer if necessary. Link: http://lkml.kernel.org/r/[email protected] Cc: [email protected] Fixes: ad909e21bbe69 ("tracing: Add internal tracing_snapshot() functions") Signed-off-by: Eiichi Tsukata <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2019-06-28dt: leds-lm36274.txt: fix a broken reference to ti-lmu.txtMauro Carvalho Chehab1-1/+1
There's a typo there: ti_lmu.txt -> ti-lmu.txt Signed-off-by: Mauro Carvalho Chehab <[email protected]> Acked-by: Pavel Machek <[email protected]> Acked-by: Dan Murphy <[email protected]> Signed-off-by: Jacek Anaszewski <[email protected]>
2019-06-28docs: leds: convert to ReSTMauro Carvalho Chehab25-779/+997
Rename the leds documentation files to ReST, add an index for them and adjust in order to produce a nice html output via the Sphinx build system. At its new index.rst, let's add a :orphan: while this is not linked to the main index.rst file, in order to avoid build warnings. Signed-off-by: Mauro Carvalho Chehab <[email protected]> Acked-by: Pavel Machek <[email protected]> Signed-off-by: Jacek Anaszewski <[email protected]>
2019-06-28tracing: Fix memory leak in tracing_err_log_open()Takeshi Misawa1-1/+13
When tracing_err_log_open() calls seq_open(), allocated memory is not freed. kmemleak report: unreferenced object 0xffff92c0781d1100 (size 128): comm "tail", pid 15116, jiffies 4295163855 (age 22.704s) hex dump (first 32 bytes): 00 f0 08 e5 c0 92 ff ff 00 10 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<000000000d0687d5>] kmem_cache_alloc+0x11f/0x1e0 [<000000003e3039a8>] seq_open+0x2f/0x90 [<000000008dd36b7d>] tracing_err_log_open+0x67/0x140 [<000000005a431ae2>] do_dentry_open+0x1df/0x3a0 [<00000000a2910603>] vfs_open+0x2f/0x40 [<0000000038b0a383>] path_openat+0x2e8/0x1690 [<00000000fe025bda>] do_filp_open+0x9b/0x110 [<00000000483a5091>] do_sys_open+0x1ba/0x260 [<00000000c558b5fd>] __x64_sys_openat+0x20/0x30 [<000000006881ec07>] do_syscall_64+0x5a/0x130 [<00000000571c2e94>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fix this by calling seq_release() in tracing_err_log_fops.release(). Link: http://lkml.kernel.org/r/20190628105640.GA1863@DESKTOP Fixes: 8a062902be725 ("tracing: Add tracing error log") Reviewed-by: Tom Zanussi <[email protected]> Signed-off-by: Takeshi Misawa <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2019-06-28ftrace/x86: Add a comment to why we take text_mutex in ↵Steven Rostedt (VMware)1-0/+5
ftrace_arch_code_modify_prepare() Taking the text_mutex in ftrace_arch_code_modify_prepare() is to fix a race against module loading and live kernel patching that might try to change the text permissions while ftrace has it as read/write. This really needs to be documented in the code. Add a comment that does such. Link: http://lkml.kernel.org/r/[email protected] Suggested-by: Josh Poimboeuf <[email protected]> Reviewed-by: Josh Poimboeuf <[email protected]> Reviewed-by: Petr Mladek <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2019-06-28ftrace/x86: Remove possible deadlock between register_kprobe() and ↵Petr Mladek2-9/+4
ftrace_run_update_code() The commit 9f255b632bf12c4dd7 ("module: Fix livepatch/ftrace module text permissions race") causes a possible deadlock between register_kprobe() and ftrace_run_update_code() when ftrace is using stop_machine(). The existing dependency chain (in reverse order) is: -> #1 (text_mutex){+.+.}: validate_chain.isra.21+0xb32/0xd70 __lock_acquire+0x4b8/0x928 lock_acquire+0x102/0x230 __mutex_lock+0x88/0x908 mutex_lock_nested+0x32/0x40 register_kprobe+0x254/0x658 init_kprobes+0x11a/0x168 do_one_initcall+0x70/0x318 kernel_init_freeable+0x456/0x508 kernel_init+0x22/0x150 ret_from_fork+0x30/0x34 kernel_thread_starter+0x0/0xc -> #0 (cpu_hotplug_lock.rw_sem){++++}: check_prev_add+0x90c/0xde0 validate_chain.isra.21+0xb32/0xd70 __lock_acquire+0x4b8/0x928 lock_acquire+0x102/0x230 cpus_read_lock+0x62/0xd0 stop_machine+0x2e/0x60 arch_ftrace_update_code+0x2e/0x40 ftrace_run_update_code+0x40/0xa0 ftrace_startup+0xb2/0x168 register_ftrace_function+0x64/0x88 klp_patch_object+0x1a2/0x290 klp_enable_patch+0x554/0x980 do_one_initcall+0x70/0x318 do_init_module+0x6e/0x250 load_module+0x1782/0x1990 __s390x_sys_finit_module+0xaa/0xf0 system_call+0xd8/0x2d0 Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(text_mutex); lock(cpu_hotplug_lock.rw_sem); lock(text_mutex); lock(cpu_hotplug_lock.rw_sem); It is similar problem that has been solved by the commit 2d1e38f56622b9b ("kprobes: Cure hotplug lock ordering issues"). Many locks are involved. To be on the safe side, text_mutex must become a low level lock taken after cpu_hotplug_lock.rw_sem. This can't be achieved easily with the current ftrace design. For example, arm calls set_all_modules_text_rw() already in ftrace_arch_code_modify_prepare(), see arch/arm/kernel/ftrace.c. This functions is called: + outside stop_machine() from ftrace_run_update_code() + without stop_machine() from ftrace_module_enable() Fortunately, the problematic fix is needed only on x86_64. It is the only architecture that calls set_all_modules_text_rw() in ftrace path and supports livepatching at the same time. Therefore it is enough to move text_mutex handling from the generic kernel/trace/ftrace.c into arch/x86/kernel/ftrace.c: ftrace_arch_code_modify_prepare() ftrace_arch_code_modify_post_process() This patch basically reverts the ftrace part of the problematic commit 9f255b632bf12c4dd7 ("module: Fix livepatch/ftrace module text permissions race"). And provides x86_64 specific-fix. Some refactoring of the ftrace code will be needed when livepatching is implemented for arm or nds32. These architectures call set_all_modules_text_rw() and use stop_machine() at the same time. Link: http://lkml.kernel.org/r/[email protected] Fixes: 9f255b632bf12c4dd7 ("module: Fix livepatch/ftrace module text permissions race") Acked-by: Thomas Gleixner <[email protected]> Reported-by: Miroslav Benes <[email protected]> Reviewed-by: Miroslav Benes <[email protected]> Reviewed-by: Josh Poimboeuf <[email protected]> Signed-off-by: Petr Mladek <[email protected]> [ As reviewed by Miroslav Benes <[email protected]>, removed return value of ftrace_run_update_code() as it is a void function. ] Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2019-06-28Merge branch 'for-mingo' of ↵Ingo Molnar61-466/+845
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull rcu/next + tools/memory-model changes from Paul E. McKenney: - RCU flavor consolidation cleanups and optmizations - Documentation updates - Miscellaneous fixes - SRCU updates - RCU-sync flavor consolidation - Torture-test updates - Linux-kernel memory-consistency-model updates, most notably the addition of plain C-language accesses Signed-off-by: Ingo Molnar <[email protected]>
2019-06-28NFS/flexfiles: Use the correct TCP timeout for flexfiles I/OTrond Myklebust1-1/+1
Fix a typo where we're confusing the default TCP retrans value (NFS_DEF_TCP_RETRANS) for the default TCP timeout value. Fixes: 15d03055cf39f ("pNFS/flexfiles: Set reasonable default ...") Cc: [email protected] # 4.8+ Signed-off-by: Trond Myklebust <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2019-06-28SUNRPC: Fix up calculation of client message lengthTrond Myklebust1-8/+8
In the case where a record marker was used, xs_sendpages() needs to return the length of the payload + record marker so that we operate correctly in the case of a partial transmission. When the callers check return value, they therefore need to take into account the record marker length. Fixes: 06b5fc3ad94e ("Merge tag 'nfs-rdma-for-5.1-1'...") Cc: [email protected] # 5.1+ Signed-off-by: Trond Myklebust <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2019-06-28spi: stm32-qspi: remove signal sensitive on completionLudovic Barre1-7/+3
On umount step a sigkill signal is set (without user specific action), due to sigkill signal the completion will be interrupted and the data transfer can't be finished if a sync is needed. Signed-off-by: Ludovic Barre <[email protected]> Signed-off-by: Mark Brown <[email protected]>
2019-06-28dt-bindings: spi: stm32-qspi: add dma propertiesLudovic Barre1-1/+4
This patch adds description of dma properties (optional). Signed-off-by: Ludovic Barre <[email protected]> Signed-off-by: Mark Brown <[email protected]>
2019-06-28video: fbdev: s3c-fb: fix sparse warnings about using incorrect typesBartlomiej Zolnierkiewicz1-6/+6
Use ->screen_buffer instead of ->screen_base to fix sparse warnings. [ Please see commit 17a7b0b4d974 ("fb.h: Provide alternate screen_base pointer") for details. ] Reported-by: kbuild test robot <[email protected]> Acked-by: Jingoo Han <[email protected]> Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]>
2019-06-28video: fbdev: don't print error message on framebuffer_alloc() failureBartlomiej Zolnierkiewicz42-123/+40
framebuffer_alloc() can fail only on kzalloc() memory allocation failure and since kzalloc() will print error message in such case we can omit printing extra error message in drivers (which BTW is what the majority of framebuffer_alloc() users is doing already). Cc: "Bruno Prémont" <[email protected]> Cc: Jiri Kosina <[email protected]> Cc: Benjamin Tissoires <[email protected]> Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]>
2019-06-28video: fbdev: intelfb: return -ENOMEM on framebuffer_alloc() failureBartlomiej Zolnierkiewicz1-1/+1
Fix error code from -ENODEV to -ENOMEM. Cc: Maik Broemme <[email protected]> Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]>
2019-06-28video: fbdev: s3c-fb: return -ENOMEM on framebuffer_alloc() failureBartlomiej Zolnierkiewicz1-1/+1
Fix error code from -ENOENT to -ENOMEM. Acked-by: Jingoo Han <[email protected]> Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]>
2019-06-28ALSA: seq: fix incorrect order of dest_client/dest_ports argumentsColin Ian King2-2/+2
There are two occurrances of a call to snd_seq_oss_fill_addr where the dest_client and dest_port arguments are in the wrong order. Fix this by swapping them around. Addresses-Coverity: ("Arguments in wrong order") Signed-off-by: Colin Ian King <[email protected]> Cc: <[email protected]> Signed-off-by: Takashi Iwai <[email protected]>
2019-06-28ALSA: hda/realtek - Change front mic location for Lenovo M710qDennis Wassenberg1-0/+1
On M710q Lenovo ThinkCentre machine, there are two front mics, we change the location for one of them to avoid conflicts. Signed-off-by: Dennis Wassenberg <[email protected]> Cc: <[email protected]> Signed-off-by: Takashi Iwai <[email protected]>
2019-06-28drm/etnaviv: add missing failure path to destroy suballocLucas Stach1-2/+5
When something goes wrong in the GPU init after the cmdbuf suballocator has been constructed, we fail to destroy it properly. This causes havok later when the GPU is unbound due to a module unload or similar. Fixes: e66774dd6f6a (drm/etnaviv: add cmdbuf suballocator) Signed-off-by: Lucas Stach <[email protected]> Tested-by: Russell King <[email protected]>
2019-06-28ALSA: usb-audio: fix sign unintended sign extension on left shiftsColin Ian King1-2/+2
There are a couple of left shifts of unsigned 8 bit values that first get promoted to signed ints and hence get sign extended on the shift if the top bit of the 8 bit values are set. Fix this by casting the 8 bit values to unsigned ints to stop the unintentional sign extension. Addresses-Coverity: ("Unintended sign extension") Signed-off-by: Colin Ian King <[email protected]> Cc: <[email protected]> Signed-off-by: Takashi Iwai <[email protected]>
2019-06-28cifs: fix crash querying symlinks stored as reparse-pointsRonnie Sahlberg2-5/+73
We never parsed/returned any data from .get_link() when the object is a windows reparse-point containing a symlink. This results in the VFS layer oopsing accessing an uninitialized buffer: ... [ 171.407172] Call Trace: [ 171.408039] readlink_copy+0x29/0x70 [ 171.408872] vfs_readlink+0xc1/0x1f0 [ 171.409709] ? readlink_copy+0x70/0x70 [ 171.410565] ? simple_attr_release+0x30/0x30 [ 171.411446] ? getname_flags+0x105/0x2a0 [ 171.412231] do_readlinkat+0x1b7/0x1e0 [ 171.412938] ? __ia32_compat_sys_newfstat+0x30/0x30 ... Fix this by adding code to handle these buffers and make sure we do return a valid buffer to .get_link() CC: Stable <[email protected]> Signed-off-by: Ronnie Sahlberg <[email protected]> Signed-off-by: Steve French <[email protected]>
2019-06-28x86/mtrr: Skip cache flushes on CPUs with cache self-snoopingRicardo Neri1-2/+13
Programming MTRR registers in multi-processor systems is a rather lengthy process. Furthermore, all processors must program these registers in lock step and with interrupts disabled; the process also involves flushing caches and TLBs twice. As a result, the process may take a considerable amount of time. On some platforms, this can lead to a large skew of the refined-jiffies clock source. Early when booting, if no other clock is available (e.g., booting with hpet=disabled), the refined-jiffies clock source is used to monitor the TSC clock source. If the skew of refined-jiffies is too large, Linux wrongly assumes that the TSC is unstable: clocksource: timekeeping watchdog on CPU1: Marking clocksource 'tsc-early' as unstable because the skew is too large: clocksource: 'refined-jiffies' wd_now: fffedc10 wd_last: fffedb90 mask: ffffffff clocksource: 'tsc-early' cs_now: 5eccfddebc cs_last: 5e7e3303d4 mask: ffffffffffffffff tsc: Marking TSC unstable due to clocksource watchdog As per measurements, around 98% of the time needed by the procedure to program MTRRs in multi-processor systems is spent flushing caches with wbinvd(). As per the Section 11.11.8 of the Intel 64 and IA 32 Architectures Software Developer's Manual, it is not necessary to flush caches if the CPU supports cache self-snooping. Thus, skipping the cache flushes can reduce by several tens of milliseconds the time needed to complete the programming of the MTRR registers: Platform Before After 104-core (208 Threads) Skylake 1437ms 28ms 2-core ( 4 Threads) Haswell 114ms 2ms Reported-by: Mohammad Etemadi <[email protected]> Signed-off-by: Ricardo Neri <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Alan Cox <[email protected]> Cc: Tony Luck <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Andy Shevchenko <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Hans de Goede <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Jordan Borgner <[email protected]> Cc: "Ravi V. Shankar" <[email protected]> Cc: Ricardo Neri <[email protected]> Cc: Andy Shevchenko <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Peter Feiner <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Link: https://lkml.kernel.org/r/1561689337-19390-3-git-send-email-ricardo.neri-calderon@linux.intel.com
2019-06-28x86/cpu/intel: Clear cache self-snoop capability in CPUs with known errataRicardo Neri1-0/+27
Processors which have self-snooping capability can handle conflicting memory type across CPUs by snooping its own cache. However, there exists CPU models in which having conflicting memory types still leads to unpredictable behavior, machine check errors, or hangs. Clear this feature on affected CPUs to prevent its use. Suggested-by: Alan Cox <[email protected]> Signed-off-by: Ricardo Neri <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Tony Luck <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Andy Shevchenko <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Hans de Goede <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Jordan Borgner <[email protected]> Cc: "Ravi V. Shankar" <[email protected]> Cc: Mohammad Etemadi <[email protected]> Cc: Ricardo Neri <[email protected]> Cc: Andy Shevchenko <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Peter Feiner <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Link: https://lkml.kernel.org/r/1561689337-19390-2-git-send-email-ricardo.neri-calderon@linux.intel.com