aboutsummaryrefslogtreecommitdiff
path: root/arch/x86/entry/vdso
AgeCommit message (Collapse)AuthorFilesLines
2016-06-07GCC plugin infrastructureEmese Revfy1-1/+2
This patch allows to build the whole kernel with GCC plugins. It was ported from grsecurity/PaX. The infrastructure supports building out-of-tree modules and building in a separate directory. Cross-compilation is supported too. Currently the x86, arm, arm64 and uml architectures enable plugins. The directory of the gcc plugins is scripts/gcc-plugins. You can use a file or a directory there. The plugins compile with these options: * -fno-rtti: gcc is compiled with this option so the plugins must use it too * -fno-exceptions: this is inherited from gcc too * -fasynchronous-unwind-tables: this is inherited from gcc too * -ggdb: it is useful for debugging a plugin (better backtrace on internal errors) * -Wno-narrowing: to suppress warnings from gcc headers (ipa-utils.h) * -Wno-unused-variable: to suppress warnings from gcc headers (gcc_version variable, plugin-version.h) The infrastructure introduces a new Makefile target called gcc-plugins. It supports all gcc versions from 4.5 to 6.0. The scripts/gcc-plugin.sh script chooses the proper host compiler (gcc-4.7 can be built by either gcc or g++). This script also checks the availability of the included headers in scripts/gcc-plugins/gcc-common.h. The gcc-common.h header contains frequently included headers for GCC plugins and it has a compatibility layer for the supported gcc versions. The gcc-generate-*-pass.h headers automatically generate the registration structures for GIMPLE, SIMPLE_IPA, IPA and RTL passes. Note that 'make clean' keeps the *.so files (only the distclean or mrproper targets clean all) because they are needed for out-of-tree modules. Based on work created by the PaX Team. Signed-off-by: Emese Revfy <[email protected]> Acked-by: Kees Cook <[email protected]> Signed-off-by: Michal Marek <[email protected]>
2016-05-26Merge branch 'kbuild' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild Pull kbuild updates from Michal Marek: - new option CONFIG_TRIM_UNUSED_KSYMS which does a two-pass build and unexports symbols which are not used in the current config [Nicolas Pitre] - several kbuild rule cleanups [Masahiro Yamada] - warning option adjustments for gcov etc [Arnd Bergmann] - a few more small fixes * 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: (31 commits) kbuild: move -Wunused-const-variable to W=1 warning level kbuild: fix if_change and friends to consider argument order kbuild: fix adjust_autoksyms.sh for modules that need only one symbol kbuild: fix ksym_dep_filter when multiple EXPORT_SYMBOL() on the same line gcov: disable -Wmaybe-uninitialized warning gcov: disable tree-loop-im to reduce stack usage gcov: disable for COMPILE_TEST Kbuild: disable 'maybe-uninitialized' warning for CONFIG_PROFILE_ALL_BRANCHES Kbuild: change CC_OPTIMIZE_FOR_SIZE definition kbuild: forbid kernel directory to contain spaces and colons kbuild: adjust ksym_dep_filter for some cmd_* renames kbuild: Fix dependencies for final vmlinux link kbuild: better abstract vmlinux sequential prerequisites kbuild: fix call to adjust_autoksyms.sh when output directory specified kbuild: Get rid of KBUILD_STR kbuild: rename cmd_as_s_S to cmd_cpp_s_S kbuild: rename cmd_cc_i_c to cmd_cpp_i_c kbuild: drop redundant "PHONY += FORCE" kbuild: delete unnecessary "@:" kbuild: mark help target as PHONY ...
2016-05-23vdso: make arch_setup_additional_pages wait for mmap_sem for write killableMichal Hocko1-1/+2
most architectures are relying on mmap_sem for write in their arch_setup_additional_pages. If the waiting task gets killed by the oom killer it would block oom_reaper from asynchronous address space reclaim and reduce the chances of timely OOM resolving. Wait for the lock in the killable mode and return with EINTR if the task got killed while waiting. Signed-off-by: Michal Hocko <[email protected]> Acked-by: Andy Lutomirski <[email protected]> [x86 vdso] Acked-by: Vlastimil Babka <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-04-20kbuild: drop FORCE from PHONY targetsMasahiro Yamada1-2/+2
These targets are marked as PHONY. No need to add FORCE to their dependency. Signed-off-by: Masahiro Yamada <[email protected]> Signed-off-by: Michal Marek <[email protected]>
2016-04-13x86/vdso: Remove direct HPET access through the vDSOAndy Lutomirski3-29/+2
Allowing user code to map the HPET is problematic. HPET implementations are notoriously buggy, and there are probably many machines on which even MMIO reads from bogus HPET addresses are problematic. We have a report that the Dell Precision M2800 with: ACPI: HPET 0x00000000C8FE6238 000038 (v01 DELL CBX3 01072009 AMI. 00000005) is either so slow when accessing the HPET or actually hangs in some regard, causing soft lockups to be reported if users do unexpected things to the HPET. The vclock HPET code has also always been a questionable speedup. Accessing an HPET is exceedingly slow (on the order of several microseconds), so the added overhead in requiring a syscall to read the HPET is a small fraction of the total code of accessing it. To avoid future problems, let's just delete the code entirely. In the long run, this could actually be a speedup. Waiman Long as a patch to optimize the case where multiple CPUs contend for the HPET, but that won't help unless all the accesses are mediated by the kernel. Reported-by: Rasmus Villemoes <[email protected]> Signed-off-by: Andy Lutomirski <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Acked-by: Borislav Petkov <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Waiman Long <[email protected]> Cc: Waiman Long <[email protected]> Link: http://lkml.kernel.org/r/d2f90bba98db9905041cff294646d290d378f67a.1460074438.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2016-03-24Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Misc fixes: - fix hotplug bugs - fix irq live lock - fix various topology handling bugs - fix APIC ACK ordering - fix PV iopl handling - fix speling - fix/tweak memcpy_mcsafe() return value - fix fbcon bug - remove stray prototypes" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/msr: Remove unused native_read_tscp() x86/apic: Remove declaration of unused hw_nmi_is_cpu_stuck x86/oprofile/nmi: Add missing hotplug FROZEN handling x86/hpet: Use proper mask to modify hotplug action x86/apic/uv: Fix the hotplug notifier x86/apb/timer: Use proper mask to modify hotplug action x86/topology: Use total_cpus not nr_cpu_ids for logical packages x86/topology: Fix Intel HT disable x86/topology: Fix logical package mapping x86/irq: Cure live lock in fixup_irqs() x86/tsc: Prevent NULL pointer deref in calibrate_delay_is_known() x86/apic: Fix suspicious RCU usage in smp_trace_call_function_interrupt() x86/iopl: Fix iopl capability check on Xen PV x86/iopl/64: Properly context-switch IOPL on Xen PV selftests/x86: Add an iopl test x86/mm, x86/mce: Fix return type/value for memcpy_mcsafe() x86/video: Don't assume all FB devices are PCI devices arch/x86/irq: Purge useless handler declarations from hw_irq.h x86: Fix misspellings in comments
2016-03-22kernel: add kcov code coverageDmitry Vyukov1-0/+3
kcov provides code coverage collection for coverage-guided fuzzing (randomized testing). Coverage-guided fuzzing is a testing technique that uses coverage feedback to determine new interesting inputs to a system. A notable user-space example is AFL (http://lcamtuf.coredump.cx/afl/). However, this technique is not widely used for kernel testing due to missing compiler and kernel support. kcov does not aim to collect as much coverage as possible. It aims to collect more or less stable coverage that is function of syscall inputs. To achieve this goal it does not collect coverage in soft/hard interrupts and instrumentation of some inherently non-deterministic or non-interesting parts of kernel is disbled (e.g. scheduler, locking). Currently there is a single coverage collection mode (tracing), but the API anticipates additional collection modes. Initially I also implemented a second mode which exposes coverage in a fixed-size hash table of counters (what Quentin used in his original patch). I've dropped the second mode for simplicity. This patch adds the necessary support on kernel side. The complimentary compiler support was added in gcc revision 231296. We've used this support to build syzkaller system call fuzzer, which has found 90 kernel bugs in just 2 months: https://github.com/google/syzkaller/wiki/Found-Bugs We've also found 30+ bugs in our internal systems with syzkaller. Another (yet unexplored) direction where kcov coverage would greatly help is more traditional "blob mutation". For example, mounting a random blob as a filesystem, or receiving a random blob over wire. Why not gcov. Typical fuzzing loop looks as follows: (1) reset coverage, (2) execute a bit of code, (3) collect coverage, repeat. A typical coverage can be just a dozen of basic blocks (e.g. an invalid input). In such context gcov becomes prohibitively expensive as reset/collect coverage steps depend on total number of basic blocks/edges in program (in case of kernel it is about 2M). Cost of kcov depends only on number of executed basic blocks/edges. On top of that, kernel requires per-thread coverage because there are always background threads and unrelated processes that also produce coverage. With inlined gcov instrumentation per-thread coverage is not possible. kcov exposes kernel PCs and control flow to user-space which is insecure. But debugfs should not be mapped as user accessible. Based on a patch by Quentin Casasnovas. [[email protected]: make task_struct.kcov_mode have type `enum kcov_mode'] [[email protected]: unbreak allmodconfig] [[email protected]: follow x86 Makefile layout standards] Signed-off-by: Dmitry Vyukov <[email protected]> Reviewed-by: Kees Cook <[email protected]> Cc: syzkaller <[email protected]> Cc: Vegard Nossum <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Tavis Ormandy <[email protected]> Cc: Will Deacon <[email protected]> Cc: Quentin Casasnovas <[email protected]> Cc: Kostya Serebryany <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Kees Cook <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Sasha Levin <[email protected]> Cc: David Drysdale <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Kirill A. Shutemov <[email protected]> Cc: Jiri Slaby <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-20Merge branch 'core-objtool-for-linus' of ↵Linus Torvalds1-2/+4
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull 'objtool' stack frame validation from Ingo Molnar: "This tree adds a new kernel build-time object file validation feature (ONFIG_STACK_VALIDATION=y): kernel stack frame correctness validation. It was written by and is maintained by Josh Poimboeuf. The motivation: there's a category of hard to find kernel bugs, most of them in assembly code (but also occasionally in C code), that degrades the quality of kernel stack dumps/backtraces. These bugs are hard to detect at the source code level. Such bugs result in incorrect/incomplete backtraces most of time - but can also in some rare cases result in crashes or other undefined behavior. The build time correctness checking is done via the new 'objtool' user-space utility that was written for this purpose and which is hosted in the kernel repository in tools/objtool/. The tool's (very simple) UI and source code design is shaped after Git and perf and shares quite a bit of infrastructure with tools/perf (which tooling infrastructure sharing effort got merged via perf and is already upstream). Objtool follows the well-known kernel coding style. Objtool does not try to check .c or .S files, it instead analyzes the resulting .o generated machine code from first principles: it decodes the instruction stream and interprets it. (Right now objtool supports the x86-64 architecture.) From tools/objtool/Documentation/stack-validation.txt: "The kernel CONFIG_STACK_VALIDATION option enables a host tool named objtool which runs at compile time. It has a "check" subcommand which analyzes every .o file and ensures the validity of its stack metadata. It enforces a set of rules on asm code and C inline assembly code so that stack traces can be reliable. Currently it only checks frame pointer usage, but there are plans to add CFI validation for C files and CFI generation for asm files. For each function, it recursively follows all possible code paths and validates the correct frame pointer state at each instruction. It also follows code paths involving special sections, like .altinstructions, __jump_table, and __ex_table, which can add alternative execution paths to a given instruction (or set of instructions). Similarly, it knows how to follow switch statements, for which gcc sometimes uses jump tables." When this new kernel option is enabled (it's disabled by default), the tool, if it finds any suspicious assembly code pattern, outputs warnings in compiler warning format: warning: objtool: rtlwifi_rate_mapping()+0x2e7: frame pointer state mismatch warning: objtool: cik_tiling_mode_table_init()+0x6ce: call without frame pointer save/setup warning: objtool:__schedule()+0x3c0: duplicate frame pointer save warning: objtool:__schedule()+0x3fd: sibling call from callable instruction with changed frame pointer ... so that scripts that pick up compiler warnings will notice them. All known warnings triggered by the tool are fixed by the tree, most of the commits in fact prepare the kernel to be warning-free. Most of them are bugfixes or cleanups that stand on their own, but there are also some annotations of 'special' stack frames for justified cases such entries to JIT-ed code (BPF) or really special boot time code. There are two other long-term motivations behind this tool as well: - To improve the quality and reliability of kernel stack frames, so that they can be used for optimized live patching. - To create independent infrastructure to check the correctness of CFI stack frames at build time. CFI debuginfo is notoriously unreliable and we cannot use it in the kernel as-is without extra checking done both on the kernel side and on the build side. The quality of kernel stack frames matters to debuggability as well, so IMO we can merge this without having to consider the live patching or CFI debuginfo angle" * 'core-objtool-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits) objtool: Only print one warning per function objtool: Add several performance improvements tools: Copy hashtable.h into tools directory objtool: Fix false positive warnings for functions with multiple switch statements objtool: Rename some variables and functions objtool: Remove superflous INIT_LIST_HEAD objtool: Add helper macros for traversing instructions objtool: Fix false positive warnings related to sibling calls objtool: Compile with debugging symbols objtool: Detect infinite recursion objtool: Prevent infinite recursion in noreturn detection objtool: Detect and warn if libelf is missing and don't break the build tools: Support relative directory path for 'O=' objtool: Support CROSS_COMPILE x86/asm/decoder: Use explicitly signed chars objtool: Enable stack metadata validation on 64-bit x86 objtool: Add CONFIG_STACK_VALIDATION option objtool: Add tool to perform compile-time stack metadata validation x86/kprobes: Mark kretprobe_trampoline() stack frame as non-standard sched: Always inline context_switch() ...
2016-03-17Merge branch 'x86/cleanups' into x86/urgentIngo Molnar1-1/+1
Pull in some merge window leftovers. Signed-off-by: Ingo Molnar <[email protected]>
2016-03-15Merge branch 'x86-asm-for-linus' of ↵Linus Torvalds4-57/+80
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 asm updates from Ingo Molnar: "This is another big update. Main changes are: - lots of x86 system call (and other traps/exceptions) entry code enhancements. In particular the complex parts of the 64-bit entry code have been migrated to C code as well, and a number of dusty corners have been refreshed. (Andy Lutomirski) - vDSO special mapping robustification and general cleanups (Andy Lutomirski) - cpufeature refactoring, cleanups and speedups (Borislav Petkov) - lots of other changes ..." * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (64 commits) x86/cpufeature: Enable new AVX-512 features x86/entry/traps: Show unhandled signal for i386 in do_trap() x86/entry: Call enter_from_user_mode() with IRQs off x86/entry/32: Change INT80 to be an interrupt gate x86/entry: Improve system call entry comments x86/entry: Remove TIF_SINGLESTEP entry work x86/entry/32: Add and check a stack canary for the SYSENTER stack x86/entry/32: Simplify and fix up the SYSENTER stack #DB/NMI fixup x86/entry: Only allocate space for tss_struct::SYSENTER_stack if needed x86/entry: Vastly simplify SYSENTER TF (single-step) handling x86/entry/traps: Clear DR6 early in do_debug() and improve the comment x86/entry/traps: Clear TIF_BLOCKSTEP on all debug exceptions x86/entry/32: Restore FLAGS on SYSEXIT x86/entry/32: Filter NT and speed up AC filtering in SYSENTER x86/entry/compat: In SYSENTER, sink AC clearing below the existing FLAGS test selftests/x86: In syscall_nt, test NT|TF as well x86/asm-offsets: Remove PARAVIRT_enabled x86/entry/32: Introduce and use X86_BUG_ESPFIX instead of paravirt_enabled uprobes: __create_xol_area() must nullify xol_mapping.fault x86/cpufeature: Create a new synthetic cpu capability for machine check recovery ...
2016-02-29objtool: Mark non-standard object files and directoriesJosh Poimboeuf1-2/+4
Code which runs outside the kernel's normal mode of operation often does unusual things which can cause a static analysis tool like objtool to emit false positive warnings: - boot image - vdso image - relocation - realmode - efi - head - purgatory - modpost Set OBJECT_FILES_NON_STANDARD for their related files and directories, which will tell objtool to skip checking them. It's ok to skip them because they don't affect runtime stack traces. Also skip the following code which does the right thing with respect to frame pointers, but is too "special" to be validated by a tool: - entry - mcount Also skip the test_nx module because it modifies its exception handling table at runtime, which objtool can't understand. Fortunately it's just a test module so it doesn't matter much. Currently objtool is the only user of OBJECT_FILES_NON_STANDARD, but it might eventually be useful for other tools. Signed-off-by: Josh Poimboeuf <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Bernd Petrovitsch <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Chris J Arges <[email protected]> Cc: Jiri Slaby <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Michal Marek <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Pedro Alves <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/366c080e3844e8a5b6a0327dc7e8c2b90ca3baeb.1456719558.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <[email protected]>
2016-02-24x86: Fix misspellings in commentsAdam Buchbinder1-1/+1
Signed-off-by: Adam Buchbinder <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-02-22x86/vdso: Mark the vDSO code read-only after initKees Cook1-1/+1
The vDSO does not need to be writable after __init, so mark it as __ro_after_init. The result kills the exploit method of writing to the vDSO from kernel space resulting in userspace executing the modified code, as shown here to bypass SMEP restrictions: http://itszn.com/blog/?p=21 The memory map (with added vDSO address reporting) shows the vDSO moving into read-only memory: Before: [ 0.143067] vDSO @ ffffffff82004000 [ 0.143551] vDSO @ ffffffff82006000 ---[ High Kernel Mapping ]--- 0xffffffff80000000-0xffffffff81000000 16M pmd 0xffffffff81000000-0xffffffff81800000 8M ro PSE GLB x pmd 0xffffffff81800000-0xffffffff819f3000 1996K ro GLB x pte 0xffffffff819f3000-0xffffffff81a00000 52K ro NX pte 0xffffffff81a00000-0xffffffff81e00000 4M ro PSE GLB NX pmd 0xffffffff81e00000-0xffffffff81e05000 20K ro GLB NX pte 0xffffffff81e05000-0xffffffff82000000 2028K ro NX pte 0xffffffff82000000-0xffffffff8214f000 1340K RW GLB NX pte 0xffffffff8214f000-0xffffffff82281000 1224K RW NX pte 0xffffffff82281000-0xffffffff82400000 1532K RW GLB NX pte 0xffffffff82400000-0xffffffff83200000 14M RW PSE GLB NX pmd 0xffffffff83200000-0xffffffffc0000000 974M pmd After: [ 0.145062] vDSO @ ffffffff81da1000 [ 0.146057] vDSO @ ffffffff81da4000 ---[ High Kernel Mapping ]--- 0xffffffff80000000-0xffffffff81000000 16M pmd 0xffffffff81000000-0xffffffff81800000 8M ro PSE GLB x pmd 0xffffffff81800000-0xffffffff819f3000 1996K ro GLB x pte 0xffffffff819f3000-0xffffffff81a00000 52K ro NX pte 0xffffffff81a00000-0xffffffff81e00000 4M ro PSE GLB NX pmd 0xffffffff81e00000-0xffffffff81e0b000 44K ro GLB NX pte 0xffffffff81e0b000-0xffffffff82000000 2004K ro NX pte 0xffffffff82000000-0xffffffff8214c000 1328K RW GLB NX pte 0xffffffff8214c000-0xffffffff8227e000 1224K RW NX pte 0xffffffff8227e000-0xffffffff82400000 1544K RW GLB NX pte 0xffffffff82400000-0xffffffff83200000 14M RW PSE GLB NX pmd 0xffffffff83200000-0xffffffffc0000000 974M pmd Based on work by PaX Team and Brad Spengler. Signed-off-by: Kees Cook <[email protected]> Acked-by: Andy Lutomirski <[email protected]> Acked-by: H. Peter Anvin <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brad Spengler <[email protected]> Cc: Brian Gerst <[email protected]> Cc: David Brown <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: Emese Revfy <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mathias Krause <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: PaX Team <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: linux-arch <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-01-30x86/vdso: Use static_cpu_has()Borislav Petkov1-1/+1
... and simplify and speed up a tad. Signed-off-by: Borislav Petkov <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-01-30x86/cpufeature: Carve out X86_FEATURE_*Borislav Petkov3-2/+2
Move them to a separate header and have the following dependency: x86/cpufeatures.h <- x86/processor.h <- x86/cpufeature.h This makes it easier to use the header in asm code and not include the whole cpufeature.h and add guards for asm. Suggested-by: H. Peter Anvin <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-01-29Merge tag 'v4.5-rc1' into x86/asm, to refresh the branch before merging new ↵Ingo Molnar2-6/+7
changes Signed-off-by: Ingo Molnar <[email protected]>
2016-01-20UBSAN: run-time undefined behavior sanity checkerAndrey Ryabinin1-0/+1
UBSAN uses compile-time instrumentation to catch undefined behavior (UB). Compiler inserts code that perform certain kinds of checks before operations that could cause UB. If check fails (i.e. UB detected) __ubsan_handle_* function called to print error message. So the most of the work is done by compiler. This patch just implements ubsan handlers printing errors. GCC has this capability since 4.9.x [1] (see -fsanitize=undefined option and its suboptions). However GCC 5.x has more checkers implemented [2]. Article [3] has a bit more details about UBSAN in the GCC. [1] - https://gcc.gnu.org/onlinedocs/gcc-4.9.0/gcc/Debugging-Options.html [2] - https://gcc.gnu.org/onlinedocs/gcc/Debugging-Options.html [3] - http://developerblog.redhat.com/2014/10/16/gcc-undefined-behavior-sanitizer-ubsan/ Issues which UBSAN has found thus far are: Found bugs: * out-of-bounds access - 97840cb67ff5 ("netfilter: nfnetlink: fix insufficient validation in nfnetlink_bind") undefined shifts: * d48458d4a768 ("jbd2: use a better hash function for the revoke table") * 10632008b9e1 ("clockevents: Prevent shift out of bounds") * 'x << -1' shift in ext4 - http://lkml.kernel.org/r/<[email protected]> * undefined rol32(0) - http://lkml.kernel.org/r/<[email protected]> * undefined dirty_ratelimit calculation - http://lkml.kernel.org/r/<[email protected]> * undefined roundown_pow_of_two(0) - http://lkml.kernel.org/r/<[email protected]> * [WONTFIX] undefined shift in __bpf_prog_run - http://lkml.kernel.org/r/<CACT4Y+ZxoR3UjLgcNdUm4fECLMx2VdtfrENMtRRCdgHB2n0bJA@mail.gmail.com> WONTFIX here because it should be fixed in bpf program, not in kernel. signed overflows: * 32a8df4e0b33f ("sched: Fix odd values in effective_load() calculations") * mul overflow in ntp - http://lkml.kernel.org/r/<[email protected]> * incorrect conversion into rtc_time in rtc_time64_to_tm() - http://lkml.kernel.org/r/<[email protected]> * unvalidated timespec in io_getevents() - http://lkml.kernel.org/r/<CACT4Y+bBxVYLQ6LtOKrKtnLthqLHcw-BMp3aqP3mjdAvr9FULQ@mail.gmail.com> * [NOTABUG] signed overflow in ktime_add_safe() - http://lkml.kernel.org/r/<CACT4Y+aJ4muRnWxsUe1CMnA6P8nooO33kwG-c8YZg=0Xc8rJqw@mail.gmail.com> [[email protected]: fix unused local warning] [[email protected]: fix __int128 build woes] Signed-off-by: Andrey Ryabinin <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sasha Levin <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Rasmus Villemoes <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Michal Marek <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Yury Gribov <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Konstantin Khlebnikov <[email protected]> Cc: Kostya Serebryany <[email protected]> Cc: Johannes Berg <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-01-13x86/vdso/pvclock: Protect STABLE check with the seqcountAndy Lutomirski1-6/+6
If the clock becomes unstable while we're reading it, we need to bail. We can do this by simply moving the check into the seqcount loop. Reported-by: Marcelo Tosatti <[email protected]> Signed-off-by: Andy Lutomirski <[email protected]> Cc: Alexander Graf <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Radim Krcmar <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/755dcedb17269e1d7ce12a9a713dea303835137e.1451949191.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2016-01-12x86/vdso: Disallow vvar access to vclock IO for never-used vclocksAndy Lutomirski1-2/+2
It makes me uncomfortable that even modern systems grant every process direct read access to the HPET. While fixing this for real without regressing anything is a mess (unmapping the HPET is tricky because we don't adequately track all the mappings), we can do almost as well by tracking which vclocks have ever been used and only allowing pages associated with used vclocks to be faulted in. This will cause rogue programs that try to peek at the HPET to get SIGBUS instead on most systems. We can't restrict faults to vclock pages that are associated with the currently selected vclock due to a race: a process could start to access the HPET for the first time and race against a switch away from the HPET as the current clocksource. We can't segfault the process trying to peek at the HPET in this case, even though the process isn't going to do anything useful with the data. Signed-off-by: Andy Lutomirski <[email protected]> Reviewed-by: Kees Cook <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Fenghua Yu <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Quentin Casasnovas <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/e79d06295625c02512277737ab55085a498ac5d8.1451446564.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2016-01-12x86/vdso: Use ->fault() instead of remap_pfn_range() for the vvar mappingAndy Lutomirski1-40/+57
This is IMO much less ugly, and it also opens the door to disallowing unprivileged userspace HPET access on systems with usable TSCs. Signed-off-by: Andy Lutomirski <[email protected]> Reviewed-by: Kees Cook <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Fenghua Yu <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Quentin Casasnovas <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/c19c2909e5ee3c3d8742f916586676bb7c40345f.1451446564.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2016-01-12x86/vdso: Use .fault for the vDSO text mappingAndy Lutomirski2-14/+19
The old scheme for mapping the vDSO text is rather complicated. vdso2c generates a struct vm_special_mapping and a blank .pages array of the correct size for each vdso image. Init code in vdso/vma.c populates the .pages array for each vDSO image, and the mapping code selects the appropriate struct vm_special_mapping. With .fault, we can use a less roundabout approach: vdso_fault() just returns the appropriate page for the selected vDSO image. Signed-off-by: Andy Lutomirski <[email protected]> Reviewed-by: Kees Cook <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Fenghua Yu <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Quentin Casasnovas <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/f886954c186bafd74e1b967c8931d852ae199aa2.1451446564.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2016-01-12x86/vdso: Track each mm's loaded vDSO image as well as its baseAndy Lutomirski1-0/+1
As we start to do more intelligent things with the vDSO at runtime (as opposed to just at mm initialization time), we'll need to know which vDSO is in use. In principle, we could guess based on the mm type, but that's over-complicated and error-prone. Instead, just track it in the mmu context. Signed-off-by: Andy Lutomirski <[email protected]> Reviewed-by: Kees Cook <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Fenghua Yu <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Quentin Casasnovas <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/c99ac48681bad709ca7ad5ee899d9042a3af6b00.1451446564.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2016-01-11Merge branch 'x86-asm-for-linus' of ↵Linus Torvalds4-79/+92
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 asm updates from Ingo Molnar: "The main changes in this cycle were: - vDSO and asm entry improvements (Andy Lutomirski) - Xen paravirt entry enhancements (Boris Ostrovsky) - asm entry labels enhancement (Borislav Petkov) - and other misc changes (Thomas Gleixner, me)" * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/vsdo: Fix build on PARAVIRT_CLOCK=y, KVM_GUEST=n Revert "x86/kvm: On KVM re-enable (e.g. after suspend), update clocks" x86/entry/64_compat: Make labels local x86/platform/uv: Include clocksource.h for clocksource_touch_watchdog() x86/vdso: Enable vdso pvclock access on all vdso variants x86/vdso: Remove pvclock fixmap machinery x86/vdso: Get pvclock data from the vvar VMA instead of the fixmap x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader x86/kvm: On KVM re-enable (e.g. after suspend), update clocks x86/entry/64: Bypass enter_from_user_mode on non-context-tracking boots x86/asm: Add asm macros for static keys/jump labels x86/asm: Error out if asm/jump_label.h is included inappropriately context_tracking: Switch to new static_branch API x86/entry, x86/paravirt: Remove the unused usergs_sysret32 PV op x86/paravirt: Remove the unused irq_enable_sysexit pv op x86/xen: Avoid fast syscall path for Xen PV guests
2015-12-21x86/entry: Restore traditional SYSENTER calling conventionAndy Lutomirski1-10/+42
It turns out that some Android versions hardcode the SYSENTER calling convention. This is buggy and will cause problems no matter what the kernel does. Nonetheless, we should try to support it. Credit goes to Linus for pointing out a clean way to handle the SYSENTER/SYSCALL clobber differences while preserving straightforward DWARF annotations. I believe that the original offending Android commit was: https://android.googlesource.com/platform%2Fbionic/+/7dc3684d7a2587e43e6d2a8e0e3f39bf759bd535 Reported-by: Qiuxu Zhuo <[email protected]> Signed-off-by: Andy Lutomirski <[email protected]> Reviewed-and-tested-by: Borislav Petkov <[email protected]> Cc: <[email protected]> Cc: Su Tao <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Mingwei Shi <[email protected]> Cc: Linus Torvalds <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]>
2015-12-21x86/entry: Fix some commentsAndy Lutomirski1-1/+1
Signed-off-by: Andy Lutomirski <[email protected]> Reviewed-and-tested-by: Borislav Petkov <[email protected]> Cc: <[email protected]> Cc: Su Tao <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Mingwei Shi <[email protected]> Cc: Linus Torvalds <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]>
2015-12-11x86/vdso: Enable vdso pvclock access on all vdso variantsAndy Lutomirski1-51/+40
Now that pvclock doesn't require access to the fixmap, all vdso variants can use it. The kernel side isn't wired up for 32-bit kernels yet, but this covers 32-bit and x32 userspace on 64-bit kernels. Signed-off-by: Andy Lutomirski <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/a7ef693b7a4c88dd2173dc1d4bf6bc27023626eb.1449702533.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-12-11x86/vdso: Remove pvclock fixmap machineryAndy Lutomirski2-1/+1
Signed-off-by: Andy Lutomirski <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/4933029991103ae44672c82b97a20035f5c1fe4f.1449702533.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-12-11x86/vdso: Get pvclock data from the vvar VMA instead of the fixmapAndy Lutomirski4-13/+26
Signed-off-by: Andy Lutomirski <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/9d37826fdc7e2d2809efe31d5345f97186859284.1449702533.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-12-11x86, vdso, pvclock: Simplify and speed up the vdso pvclock readerAndy Lutomirski1-35/+46
The pvclock vdso code was too abstracted to understand easily and excessively paranoid. Simplify it for a huge speedup. This opens the door for additional simplifications, as the vdso no longer accesses the pvti for any vcpu other than vcpu 0. Before, vclock_gettime using kvm-clock took about 45ns on my machine. With this change, it takes 29ns, which is almost as fast as the pure TSC implementation. Signed-off-by: Andy Lutomirski <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/6b51dcc41f1b101f963945c5ec7093d72bdac429.1449702533.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-11-03Merge branch 'x86-mm-for-linus' of ↵Linus Torvalds1-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 mm changes from Ingo Molnar: "The main changes are: continued PAT work by Toshi Kani, plus a new boot time warning about insecure RWX kernel mappings, by Stephen Smalley. The new CONFIG_DEBUG_WX=y warning is marked default-y if CONFIG_DEBUG_RODATA=y is already eanbled, as a special exception, as these bugs are hard to notice and this check already found several live bugs" * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm: Warn on W^X mappings x86/mm: Fix no-change case in try_preserve_large_page() x86/mm: Fix __split_large_page() to handle large PAT bit x86/mm: Fix try_preserve_large_page() to handle large PAT bit x86/mm: Fix gup_huge_p?d() to handle large PAT bit x86/mm: Fix slow_virt_to_phys() to handle large PAT bit x86/mm: Fix page table dump to show PAT bit x86/asm: Add pud_pgprot() and pmd_pgprot() x86/asm: Fix pud/pmd interfaces to handle large PAT bit x86/asm: Add pud/pmd mask interfaces to handle large PAT bit x86/asm: Move PUD_PAGE macros to page_types.h x86/vdso32: Define PGTABLE_LEVELS to 32bit VDSO
2015-10-09x86/entry/32: Re-implement SYSENTER using the new C pathAndy Lutomirski1-0/+2
Signed-off-by: Andy Lutomirski <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/5b99659e8be70f3dd10cd8970a5c90293d9ad9a7.1444091585.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-10-09x86/vdso/compat: Wire up SYSENTER and SYSCSALL for compat userspaceAndy Lutomirski1-0/+8
What, you didn't realize that SYSENTER and SYSCALL were actually the same thing? :) Unlike the old code, this actually passes the ptrace_syscall_32 test on AMD systems. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/b74615af58d785aa02d917213ec64e2022a2c796.1444091585.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-10-09x86/vdso/32: Save extra registers in the INT80 vsyscall pathAndy Lutomirski2-1/+25
The goal is to integrate the SYSENTER and SYSCALL32 entry paths with the INT80 path. SYSENTER clobbers ESP and EIP. SYSCALL32 clobbers ECX (and, invisibly, R11). SYSRETL (long mode to compat mode) clobbers ECX and, invisibly, R11. SYSEXIT (which we only need for native 32-bit) clobbers ECX and EDX. This means that we'll need to provide ESP to the kernel in a register (I chose ECX, since it's only needed for SYSENTER) and we need to provide the args that normally live in ECX and EDX in memory. The epilogue needs to restore ECX and EDX, since user code relies on regs being preserved. We don't need to do anything special about EIP, since the kernel already knows where we are. The kernel will eventually need to know where int $0x80 lands, so add a vdso_image entry for it. The only user-visible effect of this code is that ptrace-induced changes to ECX and EDX during fast syscalls will be lost. This is already the case for the SYSENTER path. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/b860925adbee2d2627a0671fbfe23a7fd04127f8.1444091584.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-10-09x86/vdso: Replace hex int80 CFI annotations with GAS directivesAndy Lutomirski1-40/+8
Maintaining the current CFI annotations written in R'lyehian is difficult for most of us. Translate them to something a little closer to English. This will remove the CFI data for kernels built with extremely old versions of binutils. I think this is a fair tradeoff for the ability for mortals to edit the asm. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/ae3ff4ff5278b4bfc1e1dab368823469866d4b71.1444091584.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-10-09x86/vdso: Define BUILD_VDSO while building and emit .eh_frame in asmAndy Lutomirski1-2/+2
For the vDSO, user code wants runtime unwind info. Make sure that, if we use .cfi directives, we generate it. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/16e29ad8855e6508197000d8c41f56adb00d7580.1444091584.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-10-07x86/vdso: Remove runtime 32-bit vDSO selectionAndy Lutomirski7-255/+13
32-bit userspace will now always see the same vDSO, which is exactly what used to be the int80 vDSO. Subsequent patches will clean it up and make it support SYSENTER and SYSCALL using alternatives. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/e7e6b3526fa442502e6125fe69486aab50813c32.1444091584.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-09-22x86/vdso32: Define PGTABLE_LEVELS to 32bit VDSOToshi Kani1-0/+2
In case of CONFIG_X86_64, vdso32/vclock_gettime.c fakes a 32-bit non-PAE kernel configuration by re-defining it to CONFIG_X86_32. However, it does not re-define CONFIG_PGTABLE_LEVELS leaving it as 4 levels. This mismatch leads <asm/pgtable_type.h> to NOT include <asm-generic/ pgtable-nopud.h> and <asm-generic/pgtable-nopmd.h>, which will cause compile errors when a later patch enhances <asm/pgtable_type.h> to use PUD_SHIFT and PMD_SHIFT. These -nopud & -nopmd headers define these SHIFTs for the 32-bit non-PAE kernel. Fix it by re-defining CONFIG_PGTABLE_LEVELS to 2 levels. Signed-off-by: Toshi Kani <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Juergen Gross <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Konrad Wilk <[email protected]> Cc: Robert Elliot <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-08-08x86/vdso: Emit a GNU hashAndy Lutomirski1-1/+1
Some dynamic loaders may be slightly faster if a GNU hash is available. Strangely, this seems to have no effect at all on the vdso size. This is unlikely to have any measurable effect on the time it takes to resolve vdso symbols (since there are so few of them). In some contexts, it can be a win for a different reason: if every DSO has a GNU hash section, then libc can avoid calculating SysV hashes at all. Both musl and glibc appear to have this optimization. It's plausible that this breaks some ancient glibc version. If so, then, depending on what glibc versions break, we could either require COMPAT_VDSO for them or consider reverting. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Isaac Dunham <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Nathan Lynch <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rich Felker <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] <[email protected]> Link: http://lkml.kernel.org/r/fd56cc057a2d62ab31c56a48d04fccb435b3fd4f.1438897382.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-07-06x86/compat: Don't build the 32-bit VDSO if not neededBrian Gerst2-5/+8
Build the 32-bit vdso only for native 32-bit or 32-bit compat is enabled. x32 should not force it to build. Signed-off-by: Brian Gerst <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-07-06x86/asm/tsc: Add rdtsc_ordered() and use it in trivial call sitesAndy Lutomirski1-14/+2
rdtsc_barrier(); rdtsc() is an unnecessary mouthful and requires more thought than should be necessary. Add an rdtsc_ordered() helper and replace the trivial call sites with it. This should not change generated code. The duplication of the fence asm is temporary. Signed-off-by: Andy Lutomirski <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Huang Rui <[email protected]> Cc: John Stultz <[email protected]> Cc: Len Brown <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ralf Baechle <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: kvm ML <[email protected]> Link: http://lkml.kernel.org/r/dddbf98a2af53312e9aa73a5a2b1622fe5d6f52b.1434501121.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-07-06x86/asm/tsc: Rename native_read_tsc() to rdtsc()Andy Lutomirski1-1/+1
Now that there is no paravirt TSC, the "native" is inappropriate. The function does RDTSC, so give it the obvious name: rdtsc(). Suggested-by: Borislav Petkov <[email protected]> Signed-off-by: Andy Lutomirski <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Huang Rui <[email protected]> Cc: John Stultz <[email protected]> Cc: Len Brown <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ralf Baechle <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: kvm ML <[email protected]> Link: http://lkml.kernel.org/r/fd43e16281991f096c1e4d21574d9e1402c62d39.1434501121.git.luto@kernel.org [ Ported it to v4.2-rc1. ] Signed-off-by: Ingo Molnar <[email protected]>
2015-07-06x86/asm/tsc: Inline native_read_tsc() and remove __native_read_tsc()Andy Lutomirski1-1/+1
In the following commit: cdc7957d1954 ("x86: move native_read_tsc() offline") ... native_read_tsc() was moved out of line, presumably for some now-obsolete vDSO-related reason. Undo it. The entire rdtsc, shl, or sequence is only 11 bytes, and calls via rdtscl() and similar helpers were already inlined. Signed-off-by: Andy Lutomirski <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Huang Rui <[email protected]> Cc: John Stultz <[email protected]> Cc: Len Brown <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ralf Baechle <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: kvm ML <[email protected]> Link: http://lkml.kernel.org/r/d05ffe2aaf8468ca475ebc00efad7b2fa174af19.1434501121.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-06-03x86/asm/entry, x86/vdso: Move the vDSO code to arch/x86/entry/vdso/Ingo Molnar22-0/+2142
Cc: Borislav Petkov <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Signed-off-by: Ingo Molnar <[email protected]>