aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2016-07-15x86/dumpstack: Rename thread_struct::sig_on_uaccess_error to sig_on_uaccess_errIngo Molnar3-7/+7
Rename it to match the thread_struct::uaccess_err pattern and also because it was too long. Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-07-15x86/uaccess: Move thread_info::uaccess_err and ↵Andy Lutomirski6-9/+10
thread_info::sig_on_uaccess_err to thread_struct struct thread_info is a legacy mess. To prepare for its partial removal, move the uaccess control fields out -- they're straightforward. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/d0ac4d01c8e4d4d756264604e47445d5acc7900e.1468527351.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2016-07-15x86/dumpstack: When OOPSing, rewind the stack before do_exit()Andy Lutomirski3-1/+31
If we call do_exit() with a clean stack, we greatly reduce the risk of recursive oopses due to stack overflow in do_exit, and we allow do_exit to work even if we OOPS from an IST stack. The latter gives us a much better chance of surviving long enough after we detect a stack overflow to write out our logs. Signed-off-by: Andy Lutomirski <[email protected]> Reviewed-by: Josh Poimboeuf <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/32f73ceb372ec61889598da5e5b145889b9f2e19.1468527351.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2016-07-15x86/mm/64: In vmalloc_fault(), use CR3 instead of current->active_mmAndy Lutomirski1-1/+1
If we get a vmalloc fault while current->active_mm->pgd doesn't match CR3, we'll crash without this change. I've seen this failure mode on heavily instrumented kernels with virtually mapped stacks. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/4650d7674185f165ed8fdf9ac4c5c35c5c179ba8.1468527351.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2016-07-15x86/dumpstack/64: Handle faults when printing the "Stack: " part of an OOPSAndy Lutomirski1-2/+10
If we overflow the stack into a guard page, we'll recursively fault when trying to dump the contents of the guard page. Use probe_kernel_address() so we can recover if this happens. Signed-off-by: Andy Lutomirski <[email protected]> Reviewed-by: Josh Poimboeuf <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/e626d47a55d7b04dcb1b4d33faa95e8505b217c8.1468527351.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2016-07-15x86/dumpstack: Try harder to get a call trace on stack overflowAndy Lutomirski1-1/+9
If we overflow the stack, print_context_stack() will abort. Detect this case and rewind back into the valid part of the stack so that we can trace it. Signed-off-by: Andy Lutomirski <[email protected]> Reviewed-by: Josh Poimboeuf <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/ee1690eb2715ccc5dc187fde94effa4ca0ccbbcd.1468527351.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2016-07-15x86/mm: Remove kernel_unmap_pages_in_pgd() and efi_cleanup_page_tables()Andy Lutomirski6-41/+0
kernel_unmap_pages_in_pgd() is dangerous: if a PGD entry in init_mm.pgd were to be cleared, callers would need to ensure that the pgd entry hadn't been propagated to any other pgd. Its only caller was efi_cleanup_page_tables(), and that, in turn, was unused, so just delete both functions. This leaves a couple of other helpers unused, so delete them, too. Signed-off-by: Andy Lutomirski <[email protected]> Reviewed-by: Matt Fleming <[email protected]> Acked-by: Borislav Petkov <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/77ff20fdde3b75cd393be5559ad8218870520248.1468527351.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2016-07-15x86/mm/cpa: In populate_pgd(), don't set the PGD entry until it's populatedAndy Lutomirski1-3/+6
This avoids pointless races in which another CPU or task might see a partially populated global PGD entry. These races should normally be harmless, but, if another CPU propagates the entry via vmalloc_fault() and then populate_pgd() fails (due to memory allocation failure, for example), this prevents a use-after-free of the PGD entry. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/bf99df27eac6835f687005364bd1fbd89130946c.1468527351.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2016-07-15x86/mm/hotplug: Don't remove PGD entries in remove_pagetable()Ingo Molnar1-27/+0
So when memory hotplug removes a piece of physical memory from pagetable mappings, it also frees the underlying PGD entry. This complicates PGD management, so don't do this. We can keep the PGD mapped and the PUD table all clear - it's only a single 4K page per 512 GB of memory hotplugged. Signed-off-by: Ingo Molnar <[email protected]> Signed-off-by: Andy Lutomirski <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Waiman Long <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/064ff6c7275734537f969e876f6cd0baa954d2cc.1468527351.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2016-07-15Merge branch 'x86/asm' into x86/mm, to resolve conflictsIngo Molnar162-817/+3790
Conflicts: tools/testing/selftests/x86/Makefile Signed-off-by: Ingo Molnar <[email protected]>
2016-07-13x86/mm: Use pte_none() to test for empty PTEDave Hansen3-8/+8
The page table manipulation code seems to have grown a couple of sites that are looking for empty PTEs. Just in case one of these entries got a stray bit set, use pte_none() instead of checking for a zero pte_val(). The use pte_same() makes me a bit nervous. If we were doing a pte_same() check against two cleared entries and one of them had a stray bit set, it might fail the pte_same() check. But, I don't think we ever _do_ pte_same() for cleared entries. It is almost entirely used for checking for races in fault-in paths. Signed-off-by: Dave Hansen <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Luis R. Rodriguez <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Toshi Kani <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-07-13x86/mm: Disallow running with 32-bit PTEs to work around erratumDave Hansen5-0/+38
The Intel(R) Xeon Phi(TM) Processor x200 Family (codename: Knights Landing) has an erratum where a processor thread setting the Accessed or Dirty bits may not do so atomically against its checks for the Present bit. This may cause a thread (which is about to page fault) to set A and/or D, even though the Present bit had already been atomically cleared. These bits are truly "stray". In the case of the Dirty bit, the thread associated with the stray set was *not* allowed to write to the page. This means that we do not have to launder the bit(s); we can simply ignore them. If the PTE is used for storing a swap index or a NUMA migration index, the A bit could be misinterpreted as part of the swap type. The stray bits being set cause a software-cleared PTE to be interpreted as a swap entry. In some cases (like when the swap index ends up being for a non-existent swapfile), the kernel detects the stray value and WARN()s about it, but there is no guarantee that the kernel can always detect it. When we have 64-bit PTEs (64-bit mode or 32-bit PAE), we were able to move the swap PTE format around to avoid these troublesome bits. But, 32-bit non-PAE is tight on bits. So, disallow it from running on this hardware. I can't imagine anyone wanting to run 32-bit non-highmem kernels on this hardware, but disallowing them from running entirely is surely the safe thing to do. Signed-off-by: Dave Hansen <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Luis R. Rodriguez <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Toshi Kani <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-07-13x86/mm: Ignore A/D bits in pte/pmd/pud_none()Dave Hansen2-3/+16
The erratum we are fixing here can lead to stray setting of the A and D bits. That means that a pte that we cleared might suddenly have A/D set. So, stop considering those bits when determining if a pte is pte_none(). The same goes for the other pmd_none() and pud_none(). pgd_none() can be skipped because it is not affected; we do not use PGD entries for anything other than pagetables on affected configurations. This adds a tiny amount of overhead to all pte_none() checks. I doubt we'll be able to measure it anywhere. Signed-off-by: Dave Hansen <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Luis R. Rodriguez <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Toshi Kani <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-07-13x86/mm: Move swap offset/type up in PTE to work around erratumDave Hansen1-6/+20
This erratum can result in Accessed/Dirty getting set by the hardware when we do not expect them to be (on !Present PTEs). Instead of trying to fix them up after this happens, we just allow the bits to get set and try to ignore them. We do this by shifting the layout of the bits we use for swap offset/type in our 64-bit PTEs. It looks like this: bitnrs: | ... | 11| 10| 9|8|7|6|5| 4| 3|2|1|0| names: | ... |SW3|SW2|SW1|G|L|D|A|CD|WT|U|W|P| before: | OFFSET (9-63) |0|X|X| TYPE(1-5) |0| after: | OFFSET (14-63) | TYPE (9-13) |0|X|X|X| X| X|X|X|0| Note that D was already a don't care (X) even before. We just move TYPE up and turn its old spot (which could be hit by the A bit) into all don't cares. We take 5 bits away from the offset, but that still leaves us with 50 bits which lets us index into a 62-bit swapfile (4 EiB). I think that's probably fine for the moment. We could theoretically reclaim 5 of the bits (1, 2, 3, 4, 7) but it doesn't gain us anything. Signed-off-by: Dave Hansen <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Luis R. Rodriguez <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Toshi Kani <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-07-10x86/entry: Inline enter_from_user_mode()Paolo Bonzini1-1/+1
This matches what is already done for prepare_exit_to_usermode(), and saves about 60 clock cycles (4% speedup) with the benchmark in the previous commit message. Signed-off-by: Paolo Bonzini <[email protected]> Reviewed-by: Rik van Riel <[email protected]> Reviewed-by: Andy Lutomirski <[email protected]> Acked-by: Paolo Bonzini <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-07-10x86/entry: Avoid interrupt flag save and restorePaolo Bonzini2-2/+17
Thanks to all the work that was done by Andy Lutomirski and others, enter_from_user_mode() and prepare_exit_to_usermode() are now called only with interrupts disabled. Let's provide them a version of user_enter()/user_exit() that skips saving and restoring the interrupt flag. On an AMD-based machine I tested this patch on, with force-enabled context tracking, the speed-up in system calls was 90 clock cycles or 6%, measured with the following simple benchmark: #include <sys/signal.h> #include <time.h> #include <unistd.h> #include <stdio.h> unsigned long rdtsc() { unsigned long result; asm volatile("rdtsc; shl $32, %%rdx; mov %%eax, %%eax\n" "or %%rdx, %%rax" : "=a" (result) : : "rdx"); return result; } int main() { unsigned long tsc1, tsc2; int pid = getpid(); int i; tsc1 = rdtsc(); for (i = 0; i < 100000000; i++) kill(pid, SIGWINCH); tsc2 = rdtsc(); printf("%ld\n", tsc2 - tsc1); } Signed-off-by: Paolo Bonzini <[email protected]> Reviewed-by: Rik van Riel <[email protected]> Reviewed-by: Andy Lutomirski <[email protected]> Acked-by: Paolo Bonzini <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-07-09Merge branch 'linus' into x86/asm, to pick up fixes before merging new changesIngo Molnar862-5045/+8630
Signed-off-by: Ingo Molnar <[email protected]>
2016-07-08Merge tag 'scsi-fixes' of ↵Linus Torvalds3-5/+8
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Three fixes. One is the qla24xx MSI regression, one is a theoretical problem over blacklist matching, which would bite USB badly if it ever triggered and one is a system hang with a particular type of IPR device" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: qla2xxx: Fix NULL pointer deref in QLA interrupt SCSI: fix new bug in scsi_dev_info_list string matching ipr: Clear interrupt on croc/crocodile when running with LSI
2016-07-08Merge tag 'ecryptfs-4.7-rc7-fixes' of ↵Linus Torvalds4-20/+23
git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs Pull eCryptfs fixes from Tyler Hicks: "Provide a more concise fix for CVE-2016-1583: - Additionally fixes linux-stable regressions caused by the cherry-picking of the original fix Some very minor changes that have queued up: - Fix typos in code comments - Remove unnecessary check for NULL before destroying kmem_cache" * tag 'ecryptfs-4.7-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs: ecryptfs: don't allow mmap when the lower fs doesn't support it Revert "ecryptfs: forbid opening files without mmap handler" ecryptfs: fix spelling mistakes eCryptfs: fix typos in comment ecryptfs: drop null test before destroy functions
2016-07-08Merge tag 'iommu-fixes-v4.7-rc6' of ↵Linus Torvalds2-4/+14
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU fixes from Joerg Roedel: "Two Fixes: - Intel VT-d fix for a suspend/resume issue, introduced with the scalability improvements in this cycle. - AMD IOMMU fix for systems that have unity mappings defined. There was a race where translation got enabled before the unity mappings were in place. This issue was seen on some HP servers" * tag 'iommu-fixes-v4.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/amd: Fix unity mapping initialization race iommu/vt-d: Fix infinite loop in free_all_cpu_cached_iovas
2016-07-08Merge tag 'for-linus-4.7b-rc6-tag' of ↵Linus Torvalds3-45/+14
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen bug fixes from David Vrabel: - Fix two bugs in the handling of xenbus transactions. - Make the xen acpi driver compatible with Xen 4.7. * tag 'for-linus-4.7b-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/acpi: allow xen-acpi-processor driver to load on Xen 4.7 xenbus: simplify xenbus_dev_request_and_reply() xenbus: don't bail early from xenbus_dev_request_and_reply() xenbus: don't BUG() on user mode induced condition
2016-07-08Merge tag 'arm64-fixes' of ↵Linus Torvalds6-3/+30
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "A couple of late fixes here, but one that we've been sitting on for a few weeks while the details were worked out. Specifically, we now enforce USER_DS on taking exceptions whilst in the kernel, which avoids leaking kernel data to userspace through things like perf. The other patch is an update to a workaround for a hardware erratum on some Cavium SoCs. Summary: - Enforce USER_DS on exception entry from EL1 - Apply workaround for Cavium errata #27456 on Thunderx-81xx parts" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: Enable workaround for Cavium erratum 27456 on thunderx-81xx arm64: kernel: Save and restore UAO and addr_limit on exception entry
2016-07-08Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds4-8/+8
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Three fixes: - A boot crash fix with certain configs - a MAINTAINERS entry update - Documentation typo fixes" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/Documentation: Fix various typos in Documentation/x86/ files x86/amd_nb: Fix boot crash on non-AMD systems MAINTAINERS: Update the Calgary IOMMU entry
2016-07-08Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds1-22/+20
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "Two load-balancing fixes for cgroups-intense workloads" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/fair: Fix calc_cfs_shares() fixed point arithmetics width confusion sched/fair: Fix effective_load() to consistently use smoothed load
2016-07-08Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds4-8/+59
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Various fixes: - 32-bit callgraph bug fix - suboptimal event group scheduling bug fix - event constraint fixes for Broadwell/Skylake - RAPL module name collision fix" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/core: Fix pmu::filter_match for SW-led groups x86/perf/intel/rapl: Fix module name collision with powercap intel-rapl perf/x86: Fix 32-bit perf user callgraph collection perf/x86/intel: Update event constraints when HT is off
2016-07-08Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Ingo Molnar: "Two MIPS-GIC irqchip driver fixes to unbreak certain MIPS boards" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/mips-gic: Match IPI IRQ domain by bus token only irqchip/mips-gic: Map to VPs using HW VPNum
2016-07-08Merge tag 'gpio-v4.7-5' of ↵Linus Torvalds4-52/+31
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO fixes from Linus Walleij: "I don't like to toss in last minute patches, but these are all for things that are broken, and have bitten people for real. Two of them go into stable. Maybe all of them if the compile test problem is a pain in the ass also for stable folks. Final (hopefully) GPIO fixes for v4.7: - Fix an oops on the Asus Eee PC 1201 - Revert a patch trying to split GPIO parsing and GPIO configuration - Revert a too liberal compile testing thing" * tag 'gpio-v4.7-5' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: Revert "gpio: gpiolib-of: Allow compile testing" Revert "gpiolib: Split GPIO flags parsing and GPIO configuration" gpio: sch: Fix Oops on module load on Asus Eee PC 1201
2016-07-08Merge tag 'drm-fixes-for-v4.7-rc7' of ↵Linus Torvalds8-24/+32
git://people.freedesktop.org/~airlied/linux Pull drm fixes from Dave Airlie: "One nouveau fix, and a few AMD Polaris fixes and some Allwinner fixes. I've got some vmware fixes that I might send separate over the weekend, they fix some black screens, but I'm still debating them" * tag 'drm-fixes-for-v4.7-rc7' of git://people.freedesktop.org/~airlied/linux: drm/amd/powerplay: Update CKS on/ CKS off voltage offset calculation. drm/amd/powerplay: fix bug that get wrong polaris evv voltage. drm/amd/powerplay: incorrectly use of the function return value drm/amd/powerplay: fix incorrect voltage table value for tonga drm/amd/powerplay: fix incorrect voltage table value for polaris10 drm/nouveau/disp/sor/gf119: select correct sor when poking training pattern gpu: drm: sun4i_drv: add missing of_node_put after calling of_parse_phandle drm/sun4i: Send vblank event when the CRTC is disabled drm/sun4i: Report proper vblank
2016-07-08ecryptfs: don't allow mmap when the lower fs doesn't support itJeff Mahoney1-1/+14
There are legitimate reasons to disallow mmap on certain files, notably in sysfs or procfs. We shouldn't emulate mmap support on file systems that don't offer support natively. CVE-2016-1583 Signed-off-by: Jeff Mahoney <[email protected]> Cc: [email protected] [tyhicks: clean up f_op check by using ecryptfs_file_to_lower()] Signed-off-by: Tyler Hicks <[email protected]>
2016-07-08x86/asm/entry: Make thunk's restore a local labelBorislav Petkov1-3/+3
No need to have it appear in objdump output. No functionality change. Signed-off-by: Borislav Petkov <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-07-08xen/acpi: allow xen-acpi-processor driver to load on Xen 4.7Jan Beulich1-32/+3
As of Xen 4.7 PV CPUID doesn't expose either of CPUID[1].ECX[7] and CPUID[0x80000007].EDX[7] anymore, causing the driver to fail to load on both Intel and AMD systems. Doing any kind of hardware capability checks in the driver as a prerequisite was wrong anyway: With the hypervisor being in charge, all such checking should be done by it. If ACPI data gets uploaded despite some missing capability, the hypervisor is free to ignore part or all of that data. Ditch the entire check_prereq() function, and do the only valid check (xen_initial_domain()) in the caller in its place. Signed-off-by: Jan Beulich <[email protected]> Cc: <[email protected]> Signed-off-by: David Vrabel <[email protected]>
2016-07-08selftests/x86: Add vDSO mremap() testDmitry Safonov2-1/+112
Should print this on vDSO remapping success (on new kernels): [root@localhost ~]# ./test_mremap_vdso_32 AT_SYSINFO_EHDR is 0xf773f000 [NOTE] Moving vDSO: [f773f000, f7740000] -> [a000000, a001000] [OK] Or print that mremap() for vDSOs is unsupported: [root@localhost ~]# ./test_mremap_vdso_32 AT_SYSINFO_EHDR is 0xf773c000 [NOTE] Moving vDSO: [0xf773c000, 0xf773d000] -> [0xf7737000, 0xf7738000] [FAIL] mremap() of the vDSO does not work on this kernel! Suggested-by: Andy Lutomirski <[email protected]> Signed-off-by: Dmitry Safonov <[email protected]> Acked-by: Andy Lutomirski <[email protected]> Cc: [email protected] Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-07-08x86/vdso: Add mremap hook to vm_special_mappingDmitry Safonov3-5/+55
Add possibility for 32-bit user-space applications to move the vDSO mapping. Previously, when a user-space app called mremap() for the vDSO address, in the syscall return path it would land on the previous address of the vDSOpage, resulting in segmentation violation. Now it lands fine and returns to userspace with a remapped vDSO. This will also fix the context.vdso pointer for 64-bit, which does not affect the user of vDSO after mremap() currently, but this may change in the future. As suggested by Andy, return -EINVAL for mremap() that would split the vDSO image: that operation cannot possibly result in a working system so reject it. Renamed and moved the text_mapping structure declaration inside map_vdso(), as it used only there and now it complements the vvar_mapping variable. There is still a problem for remapping the vDSO in glibc applications: the linker relocates addresses for syscalls on the vDSO page, so you need to relink with the new addresses. Without that the next syscall through glibc may fail: Program received signal SIGSEGV, Segmentation fault. #0 0xf7fd9b80 in __kernel_vsyscall () #1 0xf7ec8238 in _exit () from /usr/lib32/libc.so.6 Signed-off-by: Dmitry Safonov <[email protected]> Acked-by: Andy Lutomirski <[email protected]> Cc: [email protected] Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-07-08xenbus: simplify xenbus_dev_request_and_reply()Jan Beulich1-4/+3
No need to retain a local copy of the full request message, only the type is really needed. Signed-off-by: Jan Beulich <[email protected]> Signed-off-by: David Vrabel <[email protected]>
2016-07-08xenbus: don't bail early from xenbus_dev_request_and_reply()Jan Beulich1-3/+0
xenbus_dev_request_and_reply() needs to track whether a transaction is open. For XS_TRANSACTION_START messages it calls transaction_start() and for XS_TRANSACTION_END messages it calls transaction_end(). If sending an XS_TRANSACTION_START message fails or responds with an an error, the transaction is not open and transaction_end() must be called. If sending an XS_TRANSACTION_END message fails, the transaction is still open, but if an error response is returned the transaction is closed. Commit 027bd7e89906 ("xen/xenbus: Avoid synchronous wait on XenBus stalling shutdown/restart") introduced a regression where failed XS_TRANSACTION_START messages were leaving the transaction open. This can cause problems with suspend (and migration) as all transactions must be closed before suspending. It appears that the problematic change was added accidentally, so just remove it. Signed-off-by: Jan Beulich <[email protected]> Cc: Konrad Rzeszutek Wilk <[email protected]> Cc: <[email protected]> Signed-off-by: David Vrabel <[email protected]>
2016-07-08x86/mm/pat, /dev/mem: Remove superfluous error messageJiri Kosina2-9/+2
Currently it's possible for broken (or malicious) userspace to flood a kernel log indefinitely with messages a-la Program dmidecode tried to access /dev/mem between f0000->100000 because range_is_allowed() is case of CONFIG_STRICT_DEVMEM being turned on dumps this information each and every time devmem_is_allowed() fails. Reportedly userspace that is able to trigger contignuous flow of these messages exists. It would be possible to rate limit this message, but that'd have a questionable value; the administrator wouldn't get information about all the failing accessess, so then the information would be both superfluous and incomplete at the same time :) Returning EPERM (which is what is actually happening) is enough indication for userspace what has happened; no need to log this particular error as some sort of special condition. Signed-off-by: Jiri Kosina <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Kees Cook <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Luis R. Rodriguez <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Toshi Kani <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-07-08Merge tag 'v4.7-rc6' into x86/mm, to merge fixes before applying new changesIngo Molnar10354-203375/+507933
Signed-off-by: Ingo Molnar <[email protected]>
2016-07-07Merge branch 'for-linus' of ↵Linus Torvalds1-17/+19
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull apparmor fix from James Morris. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: apparmor: fix oops, validate buffer size in apparmor_setprocattr()
2016-07-07Merge tag 'acpi-4.7-rc7' of ↵Linus Torvalds6-25/+60
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "All of these fix recent regressions in ACPICA, in the ACPI PCI IRQ management code and in the ACPI AML debugger. Specifics: - Fix a lock ordering issue in ACPICA introduced by a recent commit that attempted to fix a deadlock in the dynamic table loading code which in turn appeared after changes related to the handling of module-level AML also made in this cycle (Lv Zheng). - Fix a recent regression in the ACPI IRQ management code that may cause PCI drivers to be unable to register an IRQ if that IRQ happens to be shared with a device on the ISA bus, like the parallel port, by reverting one commit entirely and restoring the previous behavior in two other places (Sinan Kaya). - Fix a recent regression in the ACPI AML debugger introduced by the commit that removed incorrect usage of IS_ERR_VALUE() from multiple places (Lv Zheng)" * tag 'acpi-4.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI / debugger: Fix regression introduced by IS_ERR_VALUE() removal ACPICA: Namespace: Fix namespace/interpreter lock ordering ACPI,PCI,IRQ: separate ISA penalty calculation Revert "ACPI, PCI, IRQ: remove redundant code in acpi_irq_penalty_init()" ACPI,PCI,IRQ: factor in PCI possible
2016-07-07Merge tag 'pm-4.7-rc7' of ↵Linus Torvalds3-51/+113
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "One fix for a recent cpuidle core change that, against all odds, introduced a functional regression on Power systems and the fix for the crash during resume from hibernation on x86-64 that has been in the works for the last few weeks (it actually was ready last week, but I wanted to allow the reporters to test if for some more time). Specifics: - Fix a recent performance regression on Power systems (powernv and pseries) introduced by a core cpuidle commit that decreased the precision of the last_residency conversion from nano- to microseconds, which should not matter in theory, but turned out to play not-so-well with the special "snooze" idle state on Power (Shreyas B Prabhu). - Fix a crash during resume from hibernation on x86-64 caused by possible corruption of the kernel text part of page tables in the last phase of image restoration exposed by a security-related change during the 4.3 development cycle (Rafael Wysocki)" * tag 'pm-4.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpuidle: Fix last_residency division x86/power/64: Fix kernel text mapping corruption during image restoration
2016-07-08Merge tag 'sunxi-drm-fixes-for-4.7-2' of ↵Dave Airlie2-1/+10
https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux into drm-fixes Allwinner DRM driver fixes for 4.7, take 2 A new set of fixes for the sun4i driver, mostly related to vblank handling, and a minor fix to release a reference on the device tree nodes we're parsing in the probe logic. * tag 'sunxi-drm-fixes-for-4.7-2' of https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux: gpu: drm: sun4i_drv: add missing of_node_put after calling of_parse_phandle drm/sun4i: Send vblank event when the CRTC is disabled drm/sun4i: Report proper vblank
2016-07-08apparmor: fix oops, validate buffer size in apparmor_setprocattr()Vegard Nossum1-17/+19
When proc_pid_attr_write() was changed to use memdup_user apparmor's (interface violating) assumption that the setprocattr buffer was always a single page was violated. The size test is not strictly speaking needed as proc_pid_attr_write() will reject anything larger, but for the sake of robustness we can keep it in. SMACK and SELinux look safe to me, but somebody else should probably have a look just in case. Based on original patch from Vegard Nossum <[email protected]> modified for the case that apparmor provides null termination. Fixes: bb646cdb12e75d82258c2f2e7746d5952d3e321a Reported-by: Vegard Nossum <[email protected]> Cc: Al Viro <[email protected]> Cc: John Johansen <[email protected]> Cc: Paul Moore <[email protected]> Cc: Stephen Smalley <[email protected]> Cc: Eric Paris <[email protected]> Cc: Casey Schaufler <[email protected]> Cc: [email protected] Signed-off-by: John Johansen <[email protected]> Reviewed-by: Tyler Hicks <[email protected]> Signed-off-by: James Morris <[email protected]>
2016-07-07Revert "ecryptfs: forbid opening files without mmap handler"Jeff Mahoney1-11/+2
This reverts commit 2f36db71009304b3f0b95afacd8eba1f9f046b87. It fixed a local root exploit but also introduced a dependency on the lower file system implementing an mmap operation just to open a file, which is a bit of a heavy hammer. The right fix is to have mmap depend on the existence of the mmap handler instead. Signed-off-by: Jeff Mahoney <[email protected]> Cc: [email protected] Signed-off-by: Tyler Hicks <[email protected]>
2016-07-07Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds3-52/+43
Pull block IO fixes from Jens Axboe: "Three small fixes that have been queued up and tested for this series: - A bug fix for xen-blkfront from Bob Liu, fixing an issue with incomplete requests during migration. - A fix for an ancient issue in retrieving the IO priority of a different PID than self, preventing that task from going away while we access it. From Omar. - A writeback fix from Tahsin, fixing a case where we'd call ihold() with a zero ref count inode" * 'for-linus' of git://git.kernel.dk/linux-block: block: fix use-after-free in sys_ioprio_get() writeback: inode cgroup wb switch should not call ihold() xen-blkfront: save uncompleted reqs in blkfront_resume()
2016-07-07Merge tag 'configfs-for-4.7' of git://git.infradead.org/users/hch/configfsLinus Torvalds1-2/+0
Pull configfs fix from Christoph Hellwig: "A fix from Marek for ppos handling in configfs_write_bin_file, which was introduced in Linux 4.5, but didn't have any users until recently" * tag 'configfs-for-4.7' of git://git.infradead.org/users/hch/configfs: configfs: Remove ppos increment in configfs_write_bin_file
2016-07-07Merge branches 'acpica-fixes', 'acpi-pci-fixes' and 'acpi-debug-fixes'Rafael J. Wysocki9695-193607/+491791
* acpica-fixes: ACPICA: Namespace: Fix namespace/interpreter lock ordering * acpi-pci-fixes: ACPI,PCI,IRQ: separate ISA penalty calculation Revert "ACPI, PCI, IRQ: remove redundant code in acpi_irq_penalty_init()" ACPI,PCI,IRQ: factor in PCI possible * acpi-debug-fixes: ACPI / debugger: Fix regression introduced by IS_ERR_VALUE() removal
2016-07-07Merge branches 'pm-cpuidle-fixes' and 'pm-sleep-fixes'Rafael J. Wysocki3-51/+113
* pm-cpuidle-fixes: cpuidle: Fix last_residency division * pm-sleep-fixes: x86/power/64: Fix kernel text mapping corruption during image restoration
2016-07-07arm64: Enable workaround for Cavium erratum 27456 on thunderx-81xxGanapatrao Kulkarni2-0/+8
Cavium erratum 27456 commit 104a0c02e8b1 ("arm64: Add workaround for Cavium erratum 27456") is applicable for thunderx-81xx pass1.0 SoC as well. Adding code to enable to 81xx. Signed-off-by: Ganapatrao Kulkarni <[email protected]> Reviewed-by: Andrew Pinski <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2016-07-07arm64: kernel: Save and restore UAO and addr_limit on exception entryJames Morse4-3/+22
If we take an exception while at EL1, the exception handler inherits the original context's addr_limit and PSTATE.UAO values. To be consistent always reset addr_limit and PSTATE.UAO on (re-)entry to EL1. This prevents accidental re-use of the original context's addr_limit. Based on a similar patch for arm from Russell King. Cc: <[email protected]> # 4.6- Acked-by: Will Deacon <[email protected]> Reviewed-by: Mark Rutland <[email protected]> Signed-off-by: James Morse <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2016-07-07xenbus: don't BUG() on user mode induced conditionJan Beulich1-6/+8
Inability to locate a user mode specified transaction ID should not lead to a kernel crash. For other than XS_TRANSACTION_START also don't issue anything to xenbus if the specified ID doesn't match that of any active transaction. Signed-off-by: Jan Beulich <[email protected]> Cc: <[email protected]> Signed-off-by: David Vrabel <[email protected]>