aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2016-08-11mm/hugetlb: fix incorrect hugepages count during mem hotplugzhong jiang1-0/+1
When memory hotplug operates, free hugepages will be freed if the movable node is offline. Therefore, /proc/sys/vm/nr_hugepages will be incorrect. Fix it by reducing max_huge_pages when the node is offlined. [email protected] said: : dissolve_free_huge_page intends to break a hugepage into buddy, and the : destination hugepage is supposed to be allocated from the pool of the : destination node, so the system-wide pool size is reduced. So adding : h->max_huge_pages-- makes sense to me. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: zhong jiang <[email protected]> Cc: Mike Kravetz <[email protected]> Acked-by: Naoya Horiguchi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-08-11Merge tag 'fixes-for-linus' of ↵Linus Torvalds18-19/+29
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Arnd Bergmann: "A couple of bug fixes have come in for v4.8 so far. Since the first few were originally meant to go into -rc1 (but didn't get sent in time for travel reasons), the branch is unfortunately based on top of a commit in the middle of the merge window rather than -rc1. Content-wise we have: - a fix for the last remaining broken build in kernelci, getting mach-shmobile to build again with SMP disabled - a fix for a realview regression that broke real hardware but not the qemu model that everyone uses in practice (needed for v4.7 as well) - a merge conflict fix for Tegra that also broke v4.7 - two Kconfig fixes for arm64 build regressions - a couple of arm32 build warning fixes (all harmless) - fix the RTC on Exynos7 Espresso (which apparently never worked right)" * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: Merge tag 'pxa-fixes-v4.8' of https://github.com/rjarzmik/linux into randconfig-4.8 arm64: Kconfig: select HISILICON_IRQ_MBIGEN only if PCI is selected arm64: Kconfig: select ALPINE_MSI only if PCI is selected ARM: dts: realview: Fix PBX-A9 cache description ARM: tegra: fix erroneous address in dts ARM: dts: add syscon compatible string for AP syscon ARM: dts: add syscon compatible string for CP syscon ARM: oxnas: select reset controller framework ARM: hide mach-*/ include for ARM_SINGLE_ARMV7M ARM: don't include removed directories Revert "ARM: aspeed: adapt defconfigs for new CONFIG_PRINTK_TIME" ARM: shmobile: don't call platform_can_secondary_boot on UP MAINTAINER: alpine: add a mailing list ARM: do away with final ARCH_REQUIRE_GPIOLIB arm64: dts: Fix RTC by providing rtc_src clock
2016-08-11Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds8-24/+40
Pull virtio/vhost fixes and cleanups from Michael Tsirkin: "Misc fixes and cleanups all over the place" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: virtio/s390: deprecate old transport virtio/s390: keep early_put_chars virtio_blk: Fix a slient kernel panic virtio-vsock: fix include guard typo vhost/vsock: fix vhost virtio_vsock_pkt use-after-free 9p/trans_virtio: use kvfree() for iov_iter_get_pages_alloc() virtio: fix error handling for debug builds virtio: fix memory leak in virtqueue_add()
2016-08-11Merge tag 'ceph-for-4.8-rc2' of https://github.com/ceph/ceph-clientLinus Torvalds6-19/+9
Pull ceph fixes from Ilya Dryomov: "A patch for a NULL dereference bug introduced in 4.8-rc1 and a handful of static checker fixes" * tag 'ceph-for-4.8-rc2' of https://github.com/ceph/ceph-client: ceph: initialize pathbase in the !dentry case in encode_caps_cb() rbd: nuke the 32-bit pool id check rbd: destroy header_oloc in rbd_dev_release() ceph: fix null pointer dereference in ceph_flush_snaps() libceph: using kfree_rcu() to simplify the code libceph: make cancel_generic_request() static libceph: fix return value check in alloc_msg_with_page_vector()
2016-08-11nfsd: Fix race between FREE_STATEID and LOCKChuck Lever1-12/+28
When running LTP's nfslock01 test, the Linux client can send a LOCK and a FREE_STATEID request at the same time. The outcome is: Frame 324 R OPEN stateid [2,O] Frame 115004 C LOCK lockowner_is_new stateid [2,O] offset 672000 len 64 Frame 115008 R LOCK stateid [1,L] Frame 115012 C WRITE stateid [0,L] offset 672000 len 64 Frame 115016 R WRITE NFS4_OK Frame 115019 C LOCKU stateid [1,L] offset 672000 len 64 Frame 115022 R LOCKU NFS4_OK Frame 115025 C FREE_STATEID stateid [2,L] Frame 115026 C LOCK lockowner_is_new stateid [2,O] offset 672128 len 64 Frame 115029 R FREE_STATEID NFS4_OK Frame 115030 R LOCK stateid [3,L] Frame 115034 C WRITE stateid [0,L] offset 672128 len 64 Frame 115038 R WRITE NFS4ERR_BAD_STATEID In other words, the server returns stateid L in a successful LOCK reply, but it has already released it. Subsequent uses of stateid L fail. To address this, protect the generation check in nfsd4_free_stateid with the st_mutex. This should guarantee that only one of two outcomes occurs: either LOCK returns a fresh valid stateid, or FREE_STATEID returns NFS4ERR_LOCKS_HELD. Reported-by: Alexey Kodanev <[email protected]> Fix-suggested-by: Jeff Layton <[email protected]> Signed-off-by: Chuck Lever <[email protected]> Tested-by: Alexey Kodanev <[email protected]> Cc: [email protected] Signed-off-by: J. Bruce Fields <[email protected]>
2016-08-11arm64: Remove stack duplicating code from jprobesDavid A. Long2-28/+5
Because the arm64 calling standard allows stacked function arguments to be anywhere in the stack frame, do not attempt to duplicate the stack frame for jprobes handler functions. Documentation changes to describe this issue have been broken out into a separate patch in order to simultaneously address them in other architecture(s). Signed-off-by: David A. Long <[email protected]> Acked-by: Masami Hiramatsu <[email protected]> Acked-by: Marc Zyngier <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2016-08-11nfsd: fix dentry refcounting on createJosef Bacik1-3/+6
b44061d0b9 introduced a dentry ref counting bug. Previously we were grabbing one ref to dchild in nfsd_create(), but with the creation of nfsd_create_locked() we have a ref for dchild from the lookup in nfsd_create(), and then another ref in nfsd_create_locked(). The ref from the lookup in nfsd_create() is never dropped and results in dentries still in use at unmount. Signed-off-by: Josef Bacik <[email protected]> Fixes: b44061d0b9 "nfsd: reorganize nfsd_create" Reported-by: kernel test robot <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Acked-by: Al Viro <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2016-08-11bvec: avoid variable shadowing warningJohannes Berg1-1/+2
Due to the (indirect) nesting of min(..., min(...)), sparse will show a variable shadowing warning whenever bvec.h is included. Avoid that by assigning the inner min() to a temporary variable first. Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2016-08-11doc: update block/queue-sysfs.txt entriesJoe Lawrence1-0/+18
Add descriptions for dax, io_poll, and write_same_max_bytes files. Signed-off-by: Joe Lawrence <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2016-08-11nvme: Suspend all queues before deletionGabriel Krisman Bertazi1-12/+8
When nvme_delete_queue fails in the first pass of the nvme_disable_io_queues() loop, we return early, failing to suspend all of the IO queues. Later, on the nvme_pci_disable path, this causes us to disable MSI without actually having freed all the IRQs, which triggers the BUG_ON in free_msi_irqs(), as show below. This patch refactors nvme_disable_io_queues to suspend all queues before start submitting delete queue commands. This way, we ensure that we have at least returned every IRQ before continuing with the removal path. [ 487.529200] kernel BUG at ../drivers/pci/msi.c:368! cpu 0x46: Vector: 700 (Program Check) at [c0000078c5b83650] pc: c000000000627a50: free_msi_irqs+0x90/0x200 lr: c000000000627a40: free_msi_irqs+0x80/0x200 sp: c0000078c5b838d0 msr: 9000000100029033 current = 0xc0000078c5b40000 paca = 0xc000000002bd7600 softe: 0 irq_happened: 0x01 pid = 1376, comm = kworker/70:1H kernel BUG at ../drivers/pci/msi.c:368! Linux version 4.7.0.mainline+ (root@iod76) (gcc version 5.3.1 20160413 (Ubuntu/IBM 5.3.1-14ubuntu2.1) ) #104 SMP Fri Jul 29 09:20:17 CDT 2016 enter ? for help [c0000078c5b83920] d0000000363b0cd8 nvme_dev_disable+0x208/0x4f0 [nvme] [c0000078c5b83a10] d0000000363b12a4 nvme_timeout+0xe4/0x250 [nvme] [c0000078c5b83ad0] c0000000005690e4 blk_mq_rq_timed_out+0x64/0x110 [c0000078c5b83b40] c00000000056c930 bt_for_each+0x160/0x170 [c0000078c5b83bb0] c00000000056d928 blk_mq_queue_tag_busy_iter+0x78/0x110 [c0000078c5b83c00] c0000000005675d8 blk_mq_timeout_work+0xd8/0x1b0 [c0000078c5b83c50] c0000000000e8cf0 process_one_work+0x1e0/0x590 [c0000078c5b83ce0] c0000000000e9148 worker_thread+0xa8/0x660 [c0000078c5b83d80] c0000000000f2090 kthread+0x110/0x130 [c0000078c5b83e30] c0000000000095f0 ret_from_kernel_thread+0x5c/0x6c Signed-off-by: Gabriel Krisman Bertazi <[email protected]> Cc: Brian King <[email protected]> Cc: Keith Busch <[email protected]> Cc: [email protected] Signed-off-by: Jens Axboe <[email protected]>
2016-08-11x86/apic/x2apic, smp/hotplug: Don't use before alloc in x2apic_cluster_probe()Sebastian Andrzej Siewior1-4/+9
I made a mistake while converting the driver to the hotplug state machine and as a result x2apic_cluster_probe() was accessing cpus_in_cluster before allocating it. This patch fixes it by setting the cpumask after the allocation the memory succeeded. While at it, I marked two functions static which are only used within this file. Reported-by: Laura Abbott <[email protected]> Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Fixes: 6b2c28471de5 ("x86/x2apic: Convert to CPU hotplug state machine") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-08-11sched/cputime: Fix omitted ticks passed in parameterFrederic Weisbecker1-1/+2
Commit: f9bcf1e0e014 ("sched/cputime: Fix steal time accounting") ... fixes a leak on steal time accounting but forgets to account the ticks passed in parameters, assuming there is only one to take into account. Let's consider that parameter back. Signed-off-by: Frederic Weisbecker <[email protected]> Acked-by: Wanpeng Li <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Radim <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Wanpeng Li <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/20160811125822.GB4214@lerouge Signed-off-by: Ingo Molnar <[email protected]>
2016-08-11efi/capsule: Allocate whole capsule into virtual memoryAustin Christ2-6/+8
According to UEFI 2.6 section 7.5.3, the capsule should be in contiguous virtual memory and firmware may consume the capsule immediately. To correctly implement this functionality, the kernel driver needs to vmap the entire capsule at the time it is made available to firmware. The virtual allocation of the capsule update has been changed from kmap, which was only allocating the first page of the update, to vmap, and allocates the entire data payload. Signed-off-by: Austin Christ <[email protected]> Signed-off-by: Matt Fleming <[email protected]> Reviewed-by: Matt Fleming <[email protected]> Reviewed-by: Lee, Chun-Yi <[email protected]> Cc: <[email protected]> # v4.7 Cc: Andy Lutomirski <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Bryan O'Donoghue <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Kweh Hock Leong <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-08-11x86/platform/uv: Skip UV runtime services mapping in the ↵Alex Thorlton1-1/+2
efi_runtime_disabled case This problem has actually been in the UV code for a while, but we didn't catch it until recently, because we had been relying on EFI_OLD_MEMMAP to allow our systems to boot for a period of time. We noticed the issue when trying to kexec a recent community kernel, where we hit this NULL pointer dereference in efi_sync_low_kernel_mappings(): [ 0.337515] BUG: unable to handle kernel NULL pointer dereference at 0000000000000880 [ 0.346276] IP: [<ffffffff8105df8d>] efi_sync_low_kernel_mappings+0x5d/0x1b0 The problem doesn't show up with EFI_OLD_MEMMAP because we skip the chunk of setup_efi_state() that sets the efi_loader_signature for the kexec'd kernel. When the kexec'd kernel boots, it won't set EFI_BOOT in setup_arch, so we completely avoid the bug. We always kexec with noefi on the command line, so this shouldn't be an issue, but since we're not actually checking for efi_runtime_disabled in uv_bios_init(), we end up trying to do EFI runtime callbacks when we shouldn't be. This patch just adds a check for efi_runtime_disabled in uv_bios_init() so that we don't map in uv_systab when runtime_disabled == true. Signed-off-by: Alex Thorlton <[email protected]> Signed-off-by: Matt Fleming <[email protected]> Cc: <[email protected]> # v4.7 Cc: Andy Lutomirski <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mike Travis <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Russ Anderson <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-08-11x86/efi: Allocate a trampoline if needed in efi_free_boot_services()Andy Lutomirski1-0/+21
On my Dell XPS 13 9350 with firmware 1.4.4 and SGX on, if I boot Fedora 24's grub2-efi off a hard disk, my first 1MB of RAM looks like: efi: mem00: [Runtime Data |RUN| | | | | | | |WB|WT|WC|UC] range=[0x0000000000000000-0x0000000000000fff] (0MB) efi: mem01: [Boot Data | | | | | | | | |WB|WT|WC|UC] range=[0x0000000000001000-0x0000000000027fff] (0MB) efi: mem02: [Loader Data | | | | | | | | |WB|WT|WC|UC] range=[0x0000000000028000-0x0000000000029fff] (0MB) efi: mem03: [Reserved | | | | | | | | |WB|WT|WC|UC] range=[0x000000000002a000-0x000000000002bfff] (0MB) efi: mem04: [Runtime Data |RUN| | | | | | | |WB|WT|WC|UC] range=[0x000000000002c000-0x000000000002cfff] (0MB) efi: mem05: [Loader Data | | | | | | | | |WB|WT|WC|UC] range=[0x000000000002d000-0x000000000002dfff] (0MB) efi: mem06: [Conventional Memory| | | | | | | | |WB|WT|WC|UC] range=[0x000000000002e000-0x0000000000057fff] (0MB) efi: mem07: [Reserved | | | | | | | | |WB|WT|WC|UC] range=[0x0000000000058000-0x0000000000058fff] (0MB) efi: mem08: [Conventional Memory| | | | | | | | |WB|WT|WC|UC] range=[0x0000000000059000-0x000000000009ffff] (0MB) My EBDA is at 0x2c000, which blocks off everything from 0x2c000 and up, and my trampoline is 0x6000 bytes (6 pages), so it doesn't fit in the loader data range at 0x28000. Without this patch, it panics due to a failure to allocate the trampoline. With this patch, it works: [ +0.001744] Base memory trampoline at [ffff880000001000] 1000 size 24576 Signed-off-by: Andy Lutomirski <[email protected]> Reviewed-by: Matt Fleming <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mario Limonciello <[email protected]> Cc: Matt Fleming <[email protected]> Cc: Matthew Garrett <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/998c77b3bf709f3dfed85cb30701ed1a5d8a438b.1470821230.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2016-08-11x86/boot: Rework reserve_real_mode() to allow multiple triesAndy Lutomirski2-8/+30
If reserve_real_mode() fails, panicing immediately means we're doomed. Make it safe to try more than once to allocate the trampoline: - Degrade a failure from panic() to pr_info(). (If we make it to setup_real_mode() without reserving the trampoline, we'll panic them.) - Factor out helpers so that platform code can supply a specific address to try. - Warn if reserve_real_mode() is called after we're done with the memblock allocator. If that were to happen, we would behave unpredictably. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mario Limonciello <[email protected]> Cc: Matt Fleming <[email protected]> Cc: Matthew Garrett <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/876e383038f3e9971aa72fd20a4f5da05f9d193d.1470821230.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2016-08-11x86/boot: Defer setup_real_mode() to early_initcall timeAndy Lutomirski3-6/+12
There's no need to run setup_real_mode() as early as we run it. Defer it to the same early_initcall that sets up the page permissions for the real mode code. This should be a code size reduction. More importantly, it give us a longer window in which we can allocate the real mode trampoline. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mario Limonciello <[email protected]> Cc: Matt Fleming <[email protected]> Cc: Matthew Garrett <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/fd62f0da4f79357695e9bf3e365623736b05f119.1470821230.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2016-08-11x86/boot: Synchronize trampoline_cr4_features and mmu_cr4_features directlyAndy Lutomirski2-8/+12
The initialization process for trampoline_cr4_features and mmu_cr4_features was confusing. The intent is for mmu_cr4_features and *trampoline_cr4_features to stay in sync, but trampoline_cr4_features is NULL until setup_real_mode() runs. The old code synchronized *trampoline_cr4_features *twice*, once in setup_real_mode() and once in setup_arch(). It also initialized mmu_cr4_features in setup_real_mode(), which causes the actual value of mmu_cr4_features to potentially depend on when setup_real_mode() is called. With this patch, mmu_cr4_features is initialized directly in setup_arch(), and *trampoline_cr4_features is synchronized to mmu_cr4_features when the trampoline is set up. After this patch, it should be safe to defer setup_real_mode(). Signed-off-by: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mario Limonciello <[email protected]> Cc: Matt Fleming <[email protected]> Cc: Matthew Garrett <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/d48a263f9912389b957dd495a7127b009259ffe0.1470821230.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2016-08-11x86/boot: Run reserve_bios_regions() after we initialize the memory mapAndy Lutomirski3-3/+2
reserve_bios_regions() is a quirk that reserves memory that we might otherwise think is available. There's no need to run it so early, and running it before we have the memory map initialized with its non-quirky inputs makes it hard to make reserve_bios_regions() more intelligent. Move it right after we populate the memblock state. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mario Limonciello <[email protected]> Cc: Matt Fleming <[email protected]> Cc: Matthew Garrett <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/59f58618911005c799c6c9979ce6ae4881d907c2.1470821230.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2016-08-11x86/irq: Do not substract irq_tlb_count from irq_call_countAaron Lu2-6/+1
Since commit: 52aec3308db8 ("x86/tlb: replace INVALIDATE_TLB_VECTOR by CALL_FUNCTION_VECTOR") the TLB remote shootdown is done through call function vector. That commit didn't take care of irq_tlb_count, which a later commit: fd0f5869724f ("x86: Distinguish TLB shootdown interrupts from other functions call interrupts") ... tried to fix. The fix assumes every increase of irq_tlb_count has a corresponding increase of irq_call_count. So the irq_call_count is always bigger than irq_tlb_count and we could substract irq_tlb_count from irq_call_count. Unfortunately this is not true for the smp_call_function_single() case. The IPI is only sent if the target CPU's call_single_queue is empty when adding a csd into it in generic_exec_single. That means if two threads are both adding flush tlb csds to the same CPU's call_single_queue, only one IPI is sent. In other words, the irq_call_count is incremented by 1 but irq_tlb_count is incremented by 2. Over time, irq_tlb_count will be bigger than irq_call_count and the substract will produce a very large irq_call_count value due to overflow. Considering that: 1) it's not worth to send more IPIs for the sake of accurate counting of irq_call_count in generic_exec_single(); 2) it's not easy to tell if the call function interrupt is for TLB shootdown in __smp_call_function_single_interrupt(). Not to exclude TLB shootdown from call function count seems to be the simplest fix and this patch just does that. This bug was found by LKP's cyclic performance regression tracking recently with the vm-scalability test suite. I have bisected to commit: 3dec0ba0be6a ("mm/rmap: share the i_mmap_rwsem") This commit didn't do anything wrong but revealed the irq_call_count problem. IIUC, the commit makes rwc->remap_one in rmap_walk_file concurrent with multiple threads. When remap_one is try_to_unmap_one(), then multiple threads could queue flush TLB to the same CPU but only one IPI will be sent. Since the commit was added in Linux v3.19, the counting problem only shows up from v3.19 onwards. Signed-off-by: Aaron Lu <[email protected]> Cc: Alex Shi <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Huang Ying <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tomoki Sekiyama <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-08-11x86/mm: Fix swap entry comment and macroDave Hansen1-2/+2
A recent patch changed the format of a swap PTE. The comment explaining the format of the swap PTE is wrong about the bits used for the swap type field. Amusingly, the ASCII art and the patch description are correct, but the comment itself is wrong. As I was looking at this, I also noticed that the SWP_OFFSET_FIRST_BIT has an off-by-one error. This does not really hurt anything. It just wasted a bit of space in the PTE, giving us 2^59 bytes of addressable space in our swapfiles instead of 2^60. But, it doesn't match with the comments, and it wastes a bit of space, so fix it. Signed-off-by: Dave Hansen <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Luis R. Rodriguez <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Toshi Kani <[email protected]> Fixes: 00839ee3b299 ("x86/mm: Move swap offset/type up in PTE to work around erratum") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-08-11sched/cputime: Fix steal time accountingWanpeng Li1-2/+9
Commit: 57430218317 ("sched/cputime: Count actually elapsed irq & softirq time") ... didn't take steal time into consideration with passing the noirqtime kernel parameter. As Paolo pointed out before: | Why not? If idle=poll, for example, any time the guest is suspended (and | thus cannot poll) does count as stolen time. This patch fixes it by reducing steal time from idle time accounting when the noirqtime parameter is true. The average idle time drops from 56.8% to 54.75% for nohz idle kvm guest(noirqtime, idle=poll, four vCPUs running on one pCPU). Signed-off-by: Wanpeng Li <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Peter Zijlstra (Intel) <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Radim <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-08-11x86/mm/kaslr: Fix -Wformat-security warningNicolas Iooss1-1/+1
debug_putstr() is used to output strings without using printf-like formatting but debug_putstr(v) is defined as early_printk(v) in arch/x86/lib/kaslr.c. This makes clang reports the following warning when building with -Wformat-security: arch/x86/lib/kaslr.c:57:15: warning: format string is not a string literal (potentially insecure) [-Wformat-security] debug_putstr(purpose); ^~~~~~~ Fix this by using "%s" in early_printk(). Signed-off-by: Nicolas Iooss <[email protected]> Acked-by: Kees Cook <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-08-11Merge tag 'pxa-fixes-v4.8' of https://github.com/rjarzmik/linux into ↵Arnd Bergmann2-0/+2
randconfig-4.8 This is the pxa changes for v4.8 cycle. This is a tiny fix couple to enable changes in includes in gpio API without breaking pxa boards. * tag 'pxa-fixes-v4.8' of https://github.com/rjarzmik/linux: ARM: pxa: add module.h for corgi symbol_get/symbol_put usage ARM: pxa: add module.h for spitz symbol_get/symbol_put usage
2016-08-10Merge branch 'akpm' (patches from Andrew)Linus Torvalds6-39/+60
Merge misc fixes from Andrew Morton: "8 fixes" * emailed patches from Andrew Morton <[email protected]>: mm/slub.c: run free_partial() outside of the kmem_cache_node->list_lock rmap: fix compound check logic in page_remove_file_rmap mm, rmap: fix false positive VM_BUG() in page_add_file_rmap() mm/page_alloc.c: recalculate some of node threshold when on/offline memory mm/page_alloc.c: fix wrong initialization when sysctl_min_unmapped_ratio changes thp: move shmem_huge_enabled() outside of SYSFS ifdef revert "ARM: keystone: dts: add psci command definition" rapidio: dereferencing an error pointer
2016-08-10mm/slub.c: run free_partial() outside of the kmem_cache_node->list_lockChris Wilson1-1/+5
With debugobjects enabled and using SLAB_DESTROY_BY_RCU, when a kmem_cache_node is destroyed the call_rcu() may trigger a slab allocation to fill the debug object pool (__debug_object_init:fill_pool). Everywhere but during kmem_cache_destroy(), discard_slab() is performed outside of the kmem_cache_node->list_lock and avoids a lockdep warning about potential recursion: ============================================= [ INFO: possible recursive locking detected ] 4.8.0-rc1-gfxbench+ #1 Tainted: G U --------------------------------------------- rmmod/8895 is trying to acquire lock: (&(&n->list_lock)->rlock){-.-...}, at: [<ffffffff811c80d7>] get_partial_node.isra.63+0x47/0x430 but task is already holding lock: (&(&n->list_lock)->rlock){-.-...}, at: [<ffffffff811cbda4>] __kmem_cache_shutdown+0x54/0x320 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&(&n->list_lock)->rlock); lock(&(&n->list_lock)->rlock); *** DEADLOCK *** May be due to missing lock nesting notation 5 locks held by rmmod/8895: #0: (&dev->mutex){......}, at: driver_detach+0x42/0xc0 #1: (&dev->mutex){......}, at: driver_detach+0x50/0xc0 #2: (cpu_hotplug.dep_map){++++++}, at: get_online_cpus+0x2d/0x80 #3: (slab_mutex){+.+.+.}, at: kmem_cache_destroy+0x3c/0x220 #4: (&(&n->list_lock)->rlock){-.-...}, at: __kmem_cache_shutdown+0x54/0x320 stack backtrace: CPU: 6 PID: 8895 Comm: rmmod Tainted: G U 4.8.0-rc1-gfxbench+ #1 Hardware name: Gigabyte Technology Co., Ltd. H87M-D3H/H87M-D3H, BIOS F11 08/18/2015 Call Trace: __lock_acquire+0x1646/0x1ad0 lock_acquire+0xb2/0x200 _raw_spin_lock+0x36/0x50 get_partial_node.isra.63+0x47/0x430 ___slab_alloc.constprop.67+0x1a7/0x3b0 __slab_alloc.isra.64.constprop.66+0x43/0x80 kmem_cache_alloc+0x236/0x2d0 __debug_object_init+0x2de/0x400 debug_object_activate+0x109/0x1e0 __call_rcu.constprop.63+0x32/0x2f0 call_rcu+0x12/0x20 discard_slab+0x3d/0x40 __kmem_cache_shutdown+0xdb/0x320 shutdown_cache+0x19/0x60 kmem_cache_destroy+0x1ae/0x220 i915_gem_load_cleanup+0x14/0x40 [i915] i915_driver_unload+0x151/0x180 [i915] i915_pci_remove+0x14/0x20 [i915] pci_device_remove+0x34/0xb0 __device_release_driver+0x95/0x140 driver_detach+0xb6/0xc0 bus_remove_driver+0x53/0xd0 driver_unregister+0x27/0x50 pci_unregister_driver+0x25/0x70 i915_exit+0x1a/0x1e2 [i915] SyS_delete_module+0x193/0x1f0 entry_SYSCALL_64_fastpath+0x1c/0xac Fixes: 52b4b950b507 ("mm: slab: free kmem_cache_node after destroy sysfs file") Link: http://lkml.kernel.org/r/[email protected] Reported-by: Dave Gordon <[email protected]> Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Vladimir Davydov <[email protected]> Acked-by: Christoph Lameter <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: David Rientjes <[email protected]> Cc: Joonsoo Kim <[email protected]> Cc: Dmitry Safonov <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Dave Gordon <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-08-10rmap: fix compound check logic in page_remove_file_rmapSteve Capper1-1/+1
In page_remove_file_rmap(.) we have the following check: VM_BUG_ON_PAGE(compound && !PageTransHuge(page), page); This is meant to check for either HugeTLB pages or THP when a compound page is passed in. Unfortunately, if one disables CONFIG_TRANSPARENT_HUGEPAGE, then PageTransHuge(.) will always return false, provoking BUGs when one runs the libhugetlbfs test suite. This patch replaces PageTransHuge(), with PageHead() which will work for both HugeTLB and THP. Fixes: dd78fedde4b9 ("rmap: support file thp") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Steve Capper <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Cc: Huang Shijie <[email protected]> Cc: Will Deacon <[email protected]> Cc: Catalin Marinas <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-08-10mm, rmap: fix false positive VM_BUG() in page_add_file_rmap()Kirill A. Shutemov1-2/+3
PageTransCompound() doesn't distinguish THP from from any other type of compound pages. This can lead to false-positive VM_BUG_ON() in page_add_file_rmap() if called on compound page from a driver[1]. I think we can exclude such cases by checking if the page belong to a mapping. The VM_BUG_ON_PAGE() is downgraded to VM_WARN_ON_ONCE(). This path should not cause any harm to non-THP page, but good to know if we step on anything else. [1] http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Kirill A. Shutemov <[email protected]> Reported-by: Laura Abbott <[email protected]> Tested-by: Laura Abbott <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-08-10mm/page_alloc.c: recalculate some of node threshold when on/offline memoryJoonsoo Kim1-15/+35
Some of node threshold depends on number of managed pages in the node. When memory is going on/offline, it can be changed and we need to adjust them. Add recalculation to appropriate places and clean-up related functions for better maintenance. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Joonsoo Kim <[email protected]> Acked-by: Mel Gorman <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Minchan Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-08-10mm/page_alloc.c: fix wrong initialization when sysctl_min_unmapped_ratio changesJoonsoo Kim1-1/+1
Before resetting min_unmapped_pages, we need to initialize min_unmapped_pages rather than min_slab_pages. Fixes: a5f5f91da6 (mm: convert zone_reclaim to node_reclaim) Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Joonsoo Kim <[email protected]> Acked-by: Mel Gorman <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Minchan Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-08-10thp: move shmem_huge_enabled() outside of SYSFS ifdefArnd Bergmann1-1/+3
The newly introduced shmem_huge_enabled() function has two definitions, but neither of them is visible if CONFIG_SYSFS is disabled, leading to a build error: mm/khugepaged.o: In function `khugepaged': khugepaged.c:(.text.khugepaged+0x3ca): undefined reference to `shmem_huge_enabled' This changes the #ifdef guards around the definition to match those that are used in the header file. Fixes: e496cf3d7821 ("thp: introduce CONFIG_TRANSPARENT_HUGE_PAGECACHE") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnd Bergmann <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-08-10revert "ARM: keystone: dts: add psci command definition"Andrew Morton1-8/+0
Revert commit 51d5d12b8f3d ("ARM: keystone: dts: add psci command definition"), which was inadvertently added twice. Cc: Russell King - ARM Linux <[email protected]> Cc: Vitaly Andrianov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-08-10rapidio: dereferencing an error pointerDan Carpenter1-11/+13
Original patch: https://lkml.org/lkml/2016/8/4/32 If riocm_ch_alloc() fails then we end up dereferencing the error pointer. The problem is that we're not unwinding in the reverse order from how we allocate things so it gets confusing. I've changed this around so now "ch" is NULL when we are done with it after we call riocm_put_channel(). That way we can check if it's NULL and avoid calling riocm_put_channel() on it twice. I renamed err_nodev to err_put_new_ch so that it better reflects what the goto does. Then because we had flipping things around, it means we don't neeed to initialize the pointers to NULL and we can remove an if statement and pull things in an indent level. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Alexandre Bounine <[email protected]> Cc: Matt Porter <[email protected]> Cc: Andre van Herk <[email protected]> Cc: Barry Wood <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-08-10tools/testing/nvdimm: fix SIGTERM vs hotplug crashDan Williams1-0/+2
The unit tests crash when hotplug races the previous probe. This race requires that the loading of the nfit_test module be terminated with SIGTERM, and the module to be unloaded while the ars scan is still running. In contrast to the normal nfit driver, the unit test calls acpi_nfit_init() twice to simulate hotplug, whereas the nominal case goes through the acpi_nfit_notify() event handler. The acpi_nfit_notify() path is careful to flush the previous region registration before servicing the hotplug event. The unit test was missing this guarantee. BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff810cdce7>] pwq_activate_delayed_work+0x47/0x170 [..] Call Trace: [<ffffffff810ce186>] pwq_dec_nr_in_flight+0x66/0xa0 [<ffffffff810ce490>] process_one_work+0x2d0/0x680 [<ffffffff810ce331>] ? process_one_work+0x171/0x680 [<ffffffff810ce88e>] worker_thread+0x4e/0x480 [<ffffffff810ce840>] ? process_one_work+0x680/0x680 [<ffffffff810ce840>] ? process_one_work+0x680/0x680 [<ffffffff810d5343>] kthread+0xf3/0x110 [<ffffffff8199846f>] ret_from_fork+0x1f/0x40 [<ffffffff810d5250>] ? kthread_create_on_node+0x230/0x230 Cc: <[email protected]> Signed-off-by: Dan Williams <[email protected]>
2016-08-10arm64: Kconfig: select HISILICON_IRQ_MBIGEN only if PCI is selectedSudeep Holla1-1/+1
Even when PCI is disabled, ARCH_HISI selects HISILICON_IRQ_MBIGEN triggerring the following config warning: warning: (ARM64 && HISILICON_IRQ_MBIGEN) selects ARM_GIC_V3_ITS which has unmet direct dependencies (PCI && PCI_MSI) This patch makes selection of HISILICON_IRQ_MBIGEN conditional on PCI. Cc: Ma Jun <[email protected]> Cc: Arnd Bergmann <[email protected]> Signed-off-by: Sudeep Holla <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]>
2016-08-10arm64: Kconfig: select ALPINE_MSI only if PCI is selectedSudeep Holla1-1/+1
Even when PCI is disabled, ARCH_ALPINE selects ALPINE_MSI triggerring the following config warning: warning: (ARCH_ALPINE) selects ALPINE_MSI which has unmet direct dependencies (PCI) This patch makes selection of ALPINE_MSI conditional on PCI. Cc: Arnd Bergmann <[email protected]> Acked-by: Antoine Tenart <[email protected]> Signed-off-by: Sudeep Holla <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]>
2016-08-10ARM: dts: realview: Fix PBX-A9 cache descriptionRobin Murphy1-5/+4
Clearly QEMU is very permissive in how its PL310 model may be set up, but the real hardware turns out to be far more particular about things actually being correct. Fix up the DT description so that the real thing actually boots: - The arm,data-latency and arm,tag-latency properties need 3 cells to be valid, otherwise we end up retaining the default 8-cycle latencies which leads pretty quickly to lockup. - The arm,dirty-latency property is only relevant to L210/L220, so get rid of it. - The cache geometry override also leads to lockup and/or general misbehaviour. Irritatingly, the manual doesn't state the actual PL310 configuration, but based on the boardfile code and poking registers from the Boot Monitor, it would seem to be 8 sets of 16KB ways. With that, we can successfully boot to enjoy the fun of mismatched FPUs... Cc: [email protected] Signed-off-by: Robin Murphy <[email protected]> Tested-by: Mark Rutland <[email protected]> Signed-off-by: Linus Walleij <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]>
2016-08-10ARM: tegra: fix erroneous address in dtsRalf Ramsauer1-2/+2
c90bb7b enabled the high speed UARTs of the Jetson TK1. Due to a merge quirk, wrong addresses were introduced. Fix it and use the correct addresses. Thierry let me know, that there is another patch (b5896f67ab3c in linux-next) in preparation which removes all the '0,' prefixes of unit addresses on Tegra124 and is planned to go upstream in 4.8, so this patch will get reverted then. But for the moment, this patch is necessary to fix current misbehaviour. Fixes: c90bb7b9b9 ("ARM: tegra: Add high speed UARTs to Jetson TK1 device tree") Signed-off-by: Ralf Ramsauer <[email protected]> Acked-by: Thierry Reding <[email protected]> Cc: [email protected] # v4.7 Cc: [email protected] Signed-off-by: Arnd Bergmann <[email protected]>
2016-08-10ARM: dts: add syscon compatible string for AP sysconLinus Walleij1-1/+1
This syscon needs to be looked up by clocks, flash protection and other consumers. Signed-off-by: Linus Walleij <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]>
2016-08-10ARM: dts: add syscon compatible string for CP sysconLinus Walleij1-1/+1
This syscon needs to be looked up by flash protection, CLCD display output settings and other consumers. Signed-off-by: Linus Walleij <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]>
2016-08-10ARM: oxnas: select reset controller frameworkArnd Bergmann1-0/+2
For unknown reasons, we have to enable three symbols for a platform to use a reset controller driver, otherwise we get a Kconfig warning: warning: (MACH_OX810SE) selects RESET_OXNAS which has unmet direct dependencies (RESET_CONTROLLER) This selects the other two symbols for oxnas. Signed-off-by: Arnd Bergmann <[email protected]> Acked-by: Neil Armstrong <[email protected]>
2016-08-10ARM: hide mach-*/ include for ARM_SINGLE_ARMV7MArnd Bergmann1-0/+2
The machine specific header files are exported for traditional platforms, but not for the ones that use ARCH_MULTIPLATFORM, as they could conflict with one another. In case of ARM_SINGLE_ARMV7M, we end up also exporting them, but that appears to be a mistake, and we should treat it the same way as ARCH_MULTIPLATFORM here. 'make W=1' warns about this because it passes -Wmissing-includes to gcc and the directories are not actually present. Signed-off-by: Arnd Bergmann <[email protected]>
2016-08-10ARM: don't include removed directoriesArnd Bergmann3-5/+3
Three platforms used to have header files in include/mach that are now all gone, but the removed directories are still being included, which leads to -Wmissing-include-dirs warnings. This removes the extra -I flags. Signed-off-by: Arnd Bergmann <[email protected]>
2016-08-10arm: oabi compat: add missing access checksDave Weinstein1-1/+7
Add access checks to sys_oabi_epoll_wait() and sys_oabi_semtimedop(). This fixes CVE-2016-3857, a local privilege escalation under CONFIG_OABI_COMPAT. Cc: [email protected] Reported-by: Chiachih Wu <[email protected]> Reviewed-by: Kees Cook <[email protected]> Reviewed-by: Nicolas Pitre <[email protected]> Signed-off-by: Dave Weinstein <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-08-10Merge branch 'for-linus-4.8' of ↵Linus Torvalds6-59/+283
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fixes from Chris Mason: "Some fixes for btrfs send/recv and fsync from Filipe and Robbie Ko. Bonus points to Filipe for already having xfstests in place for many of these" * 'for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: remove unused function btrfs_add_delayed_qgroup_reserve() Btrfs: improve performance on fsync against new inode after rename/unlink Btrfs: be more precise on errors when getting an inode from disk Btrfs: send, don't bug on inconsistent snapshots Btrfs: send, avoid incorrect leaf accesses when sending utimes operations Btrfs: send, fix invalid leaf accesses due to incorrect utimes operations Btrfs: send, fix warning due to late freeing of orphan_dir_info structures Btrfs: incremental send, fix premature rmdir operations Btrfs: incremental send, fix invalid paths for rename operations Btrfs: send, add missing error check for calls to path_loop() Btrfs: send, fix failure to move directories with the same name around Btrfs: add missing check for writeback errors on fsync
2016-08-10Merge tag 'metag-for-v4.8-rc2' of ↵Linus Torvalds1-1/+0
git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag Pull metag architecture fix from James Hogan: "A single fix for a boot crash since a commit in the merge window. Metag was unusual in calling show_mem() early, before setup_per_cpu_pageset(), which is no longer safe. It doesn't add much value to the log, so the fix just drops the call" * tag 'metag-for-v4.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag: metag: Drop show_mem() from mem_init()
2016-08-10get_maintainer: Don't check if STDIN exists in a VCS repositoryJoe Perches1-1/+1
If get_maintainer is not given any filename arguments on the command line, the standard input is read for a patch. But checking if a VCS has a file named &STDIN is not a good idea and fails. Verify the nominal input file is not &STDIN before checking the VCS. Fixes: 4cad35a7ca69 ("get_maintainer.pl: reduce need for command-line option -f") Reported-by: Christopher Covington <[email protected]> Signed-off-by: Joe Perches <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-08-10x86/mm/pkeys: Fix compact mode by removing protection keys' XSAVE buffer ↵Dave Hansen1-121/+17
manipulation The Memory Protection Keys "rights register" (PKRU) is XSAVE-managed, and is saved/restored along with the FPU state. When kernel code accesses FPU regsisters, it does a delicate dance with preempt. Otherwise, the context switching code can get confused as to whether the most up-to-date state is in the registers themselves or in the XSAVE buffer. But, PKRU is not a normal FPU register. Using it does not generate the normal device-not-available (#NM) exceptions which means we can not manage it lazily, and the kernel completley disallows using lazy mode when it is enabled. The dance with preempt *only* occurs when managing the FPU lazily. Since we never manage PKRU lazily, we do not have to do the dance with preempt; we can access it directly. Doing it this way saves a ton of complicated code (and is faster too). Further, the XSAVES reenabling failed to patch a bit of code in fpu__xfeature_set_state() the checked for compacted buffers. That check caused fpu__xfeature_set_state() to silently refuse to work when the kernel is using compacted XSAVE buffers. This broke execute-only and future pkey_mprotect() support when using compact XSAVE buffers. But, removing fpu__xfeature_set_state() gets rid of this issue, in addition to the nice cleanup and speedup. This fixes the same thing as a fix that Sai posted: https://lkml.org/lkml/2016/7/25/637 The fix that he posted is a much more obviously correct, but I think we should just do this instead. Reported-by: Sai Praneeth Prakhya <[email protected]> Signed-off-by: Dave Hansen <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Fenghua Yu <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Quentin Casasnovas <[email protected]> Cc: Ravi Shankar <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Yu-Cheng Yu <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-08-10x86/build: Reduce the W=1 warnings noise when compiling x86 syscall tablesValdis Kletnieks1-0/+2
Building an X86_64 kernel with W=1 throws a total of 9,948 lines of warnings of this form for both 32-bit and 64-bit syscall tables. Given that the entire rest of the build for my config only generates 8,375 lines of output, this is a big reduction in the warnings generated. The warnings follow this pattern: ./arch/x86/include/generated/asm/syscalls_32.h:885:21: warning: initialized field overwritten [-Woverride-init] __SYSCALL_I386(379, compat_sys_pwritev2, ) ^ arch/x86/entry/syscall_32.c:13:46: note: in definition of macro '__SYSCALL_I386' #define __SYSCALL_I386(nr, sym, qual) [nr] = sym, ^~~ ./arch/x86/include/generated/asm/syscalls_32.h:885:21: note: (near initialization for 'ia32_sys_call_table[379]') __SYSCALL_I386(379, compat_sys_pwritev2, ) ^ arch/x86/entry/syscall_32.c:13:46: note: in definition of macro '__SYSCALL_I386' #define __SYSCALL_I386(nr, sym, qual) [nr] = sym, Since we intentionally build the syscall tables this way, ignore that one warning in the two files. Signed-off-by: Valdis Kletnieks <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-08-10x86/platform/UV: Fix kernel panic running RHEL kdump kernel on UV systemsMike Travis1-0/+5
The latest UV kernel support panics when RHEL7 kexec's the kdump kernel to make a dumpfile. This patch fixes the problem by turning off all UV support if NUMA is off. Tested-by: Frank Ramsay <[email protected]> Tested-by: John Estabrook <[email protected]> Signed-off-by: Mike Travis <[email protected]> Reviewed-by: Dimitri Sivanich <[email protected]> Reviewed-by: Nathan Zimmer <[email protected]> Cc: Alex Thorlton <[email protected]> Cc: Andrew Banman <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Russ Anderson <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>