aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2018-04-02RISC-V: Support ADD32 relocation type in kernel moduleZong Li1-0/+8
Signed-off-by: Zong Li <[email protected]> Signed-off-by: Palmer Dabbelt <[email protected]>
2018-04-02RISC-V: Support ALIGN relocation type in kernel moduleZong Li1-0/+10
Just fail on align type. Kernel modules loader didn't do relax like linker, it is difficult to remove or migrate the code, but the remnant nop instructions harm the performaace of module. We expect the building module with the no-relax option. Signed-off-by: Zong Li <[email protected]> Signed-off-by: Palmer Dabbelt <[email protected]>
2018-04-02RISC-V: Support RVC_BRANCH/JUMP relocation type in kernel modulewqZong Li1-0/+35
Signed-off-by: Zong Li <[email protected]> Signed-off-by: Palmer Dabbelt <[email protected]>
2018-04-02RISC-V: Support HI20/LO12_I/LO12_S relocation type in kernel moduleZong Li1-0/+42
HI20 and LO12_I/LO12_S relocate the absolute address, the range of offset must in 32-bit. Signed-off-by: Zong Li <[email protected]> Signed-off-by: Palmer Dabbelt <[email protected]>
2018-04-02RISC-V: Support CALL relocation type in kernel moduleZong Li1-0/+22
Signed-off-by: Zong Li <[email protected]> Signed-off-by: Palmer Dabbelt <[email protected]>
2018-04-02RISC-V: Support GOT_HI20/CALL_PLT relocation type in kernel moduleZong Li1-10/+52
For CALL_PLT, emit the plt entry only when offset is more than 32-bit. For PCREL_LO12, it uses the location of corresponding HI20 to get the address of external symbol. It should check the HI20 type is the PCREL_HI20 or GOT_HI20, because sometime the location will have two or more relocation types. For example: 0: 00000797 auipc a5,0x0 0: R_RISCV_ALIGN *ABS* 0: R_RISCV_GOT_HI20 SYMBOL 4: 0007b783 ld a5,0(a5) # 0 <SYMBOL> 4: R_RISCV_PCREL_LO12_I .L0 4: R_RISCV_RELAX *ABS* Signed-off-by: Zong Li <[email protected]> Signed-off-by: Palmer Dabbelt <[email protected]>
2018-04-02RISC-V: Add section of GOT.PLT for kernel moduleZong Li3-17/+45
Separate the function symbol address from .plt to .got.plt section. The original plt entry has trampoline code with symbol address, there is a 32-bit padding bwtween jar instruction and symbol address. Extract the symbol address to .got.plt to reduce the module size. Signed-off-by: Zong Li <[email protected]> Signed-off-by: Palmer Dabbelt <[email protected]>
2018-04-02RISC-V: Add sections of PLT and GOT for kernel moduleZong Li6-0/+260
The address of external symbols will locate more than 32-bit offset in 64-bit kernel with sv39 or sv48 virtual addressing. Module loader emits the GOT and PLT entries for data symbols and function symbols respectively. The PLT entry is a trampoline code for jumping to the 64-bit real address. The GOT entry is just the data symbol address. Signed-off-by: Zong Li <[email protected]> Signed-off-by: Palmer Dabbelt <[email protected]>
2018-04-02riscv/atomic: Strengthen implementations with fencesAndrea Parri2-220/+588
Atomics present the same issue with locking: release and acquire variants need to be strengthened to meet the constraints defined by the Linux-kernel memory consistency model [1]. Atomics present a further issue: implementations of atomics such as atomic_cmpxchg() and atomic_add_unless() rely on LR/SC pairs, which do not give full-ordering with .aqrl; for example, current implementations allow the "lr-sc-aqrl-pair-vs-full-barrier" test below to end up with the state indicated in the "exists" clause. In order to "synchronize" LKMM and RISC-V's implementation, this commit strengthens the implementations of the atomics operations by replacing .rl and .aq with the use of ("lightweigth") fences, and by replacing .aqrl LR/SC pairs in sequences such as: 0: lr.w.aqrl %0, %addr bne %0, %old, 1f ... sc.w.aqrl %1, %new, %addr bnez %1, 0b 1: with sequences of the form: 0: lr.w %0, %addr bne %0, %old, 1f ... sc.w.rl %1, %new, %addr /* SC-release */ bnez %1, 0b fence rw, rw /* "full" fence */ 1: following Daniel's suggestion. These modifications were validated with simulation of the RISC-V memory consistency model. C lr-sc-aqrl-pair-vs-full-barrier {} P0(int *x, int *y, atomic_t *u) { int r0; int r1; WRITE_ONCE(*x, 1); r0 = atomic_cmpxchg(u, 0, 1); r1 = READ_ONCE(*y); } P1(int *x, int *y, atomic_t *v) { int r0; int r1; WRITE_ONCE(*y, 1); r0 = atomic_cmpxchg(v, 0, 1); r1 = READ_ONCE(*x); } exists (u=1 /\ v=1 /\ 0:r1=0 /\ 1:r1=0) [1] https://marc.info/?l=linux-kernel&m=151930201102853&w=2 https://groups.google.com/a/groups.riscv.org/forum/#!topic/isa-dev/hKywNHBkAXM https://marc.info/?l=linux-kernel&m=151633436614259&w=2 Suggested-by: Daniel Lustig <[email protected]> Signed-off-by: Andrea Parri <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Albert Ou <[email protected]> Cc: Daniel Lustig <[email protected]> Cc: Alan Stern <[email protected]> Cc: Will Deacon <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: David Howells <[email protected]> Cc: Jade Alglave <[email protected]> Cc: Luc Maranget <[email protected]> Cc: "Paul E. McKenney" <[email protected]> Cc: Akira Yokosawa <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Palmer Dabbelt <[email protected]>
2018-04-02riscv/spinlock: Strengthen implementations with fencesAndrea Parri2-14/+27
Current implementations map locking operations using .rl and .aq annotations. However, this mapping is unsound w.r.t. the kernel memory consistency model (LKMM) [1]: Referring to the "unlock-lock-read-ordering" test reported below, Daniel wrote: "I think an RCpc interpretation of .aq and .rl would in fact allow the two normal loads in P1 to be reordered [...] The intuition would be that the amoswap.w.aq can forward from the amoswap.w.rl while that's still in the store buffer, and then the lw x3,0(x4) can also perform while the amoswap.w.rl is still in the store buffer, all before the l1 x1,0(x2) executes. That's not forbidden unless the amoswaps are RCsc, unless I'm missing something. Likewise even if the unlock()/lock() is between two stores. A control dependency might originate from the load part of the amoswap.w.aq, but there still would have to be something to ensure that this load part in fact performs after the store part of the amoswap.w.rl performs globally, and that's not automatic under RCpc." Simulation of the RISC-V memory consistency model confirmed this expectation. In order to "synchronize" LKMM and RISC-V's implementation, this commit strengthens the implementations of the locking operations by replacing .rl and .aq with the use of ("lightweigth") fences, resp., "fence rw, w" and "fence r , rw". C unlock-lock-read-ordering {} /* s initially owned by P1 */ P0(int *x, int *y) { WRITE_ONCE(*x, 1); smp_wmb(); WRITE_ONCE(*y, 1); } P1(int *x, int *y, spinlock_t *s) { int r0; int r1; r0 = READ_ONCE(*y); spin_unlock(s); spin_lock(s); r1 = READ_ONCE(*x); } exists (1:r0=1 /\ 1:r1=0) [1] https://marc.info/?l=linux-kernel&m=151930201102853&w=2 https://groups.google.com/a/groups.riscv.org/forum/#!topic/isa-dev/hKywNHBkAXM https://marc.info/?l=linux-kernel&m=151633436614259&w=2 Signed-off-by: Andrea Parri <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Albert Ou <[email protected]> Cc: Daniel Lustig <[email protected]> Cc: Alan Stern <[email protected]> Cc: Will Deacon <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: David Howells <[email protected]> Cc: Jade Alglave <[email protected]> Cc: Luc Maranget <[email protected]> Cc: "Paul E. McKenney" <[email protected]> Cc: Akira Yokosawa <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Palmer Dabbelt <[email protected]>
2018-04-02riscv/barrier: Define __smp_{store_release,load_acquire}Andrea Parri1-0/+15
Introduce __smp_{store_release,load_acquire}, and rely on the generic definitions for smp_{store_release,load_acquire}. This avoids the use of full ("rw,rw") fences on SMP. Signed-off-by: Andrea Parri <[email protected]> Signed-off-by: Palmer Dabbelt <[email protected]>
2018-04-02riscv/ftrace: Add HAVE_FUNCTION_GRAPH_RET_ADDR_PTR supportAlan Kao2-1/+7
In walk_stackframe, the pc now receives the address from calling ftrace_graph_ret_addr instead of manual calculation. Note that the original calculation, pc = frame->ra - 4 is buggy when the instruction at the return address happened to be a compressed inst. But since it is not a critical part of ftrace, it is ignored for now to ease the review process. Cc: Greentime Hu <[email protected]> Signed-off-by: Alan Kao <[email protected]> Signed-off-by: Palmer Dabbelt <[email protected]>
2018-04-02riscv/ftrace: Add DYNAMIC_FTRACE_WITH_REGS supportAlan Kao4-0/+141
Cc: Greentime Hu <[email protected]> Signed-off-by: Alan Kao <[email protected]> Signed-off-by: Palmer Dabbelt <[email protected]>
2018-04-02riscv/ftrace: Add ARCH_SUPPORTS_FTRACE_OPS supportAlan Kao2-0/+4
Cc: Greentime Hu <[email protected]> Signed-off-by: Alan Kao <[email protected]> Signed-off-by: Palmer Dabbelt <[email protected]>
2018-04-02riscv/ftrace: Add dynamic function graph tracer supportAlan Kao2-1/+118
Once the function_graph tracer is enabled, a filtered function has the following call sequence: * ftracer_caller ==> on/off by ftrace_make_call/ftrace_make_nop * ftrace_graph_caller * ftrace_graph_call ==> on/off by ftrace_en/disable_ftrace_graph_caller * prepare_ftrace_return Considering the following DYNAMIC_FTRACE_WITH_REGS feature, it would be more extendable to have a ftrace_graph_caller function, instead of calling prepare_ftrace_return directly in ftrace_caller. Cc: Greentime Hu <[email protected]> Signed-off-by: Alan Kao <[email protected]> Signed-off-by: Palmer Dabbelt <[email protected]>
2018-04-02riscv/ftrace: Add dynamic function tracer supportAlan Kao6-12/+223
We now have dynamic ftrace with the following added items: * ftrace_make_call, ftrace_make_nop (in kernel/ftrace.c) The two functions turn each recorded call site of filtered functions into a call to ftrace_caller or nops * ftracce_update_ftrace_func (in kernel/ftrace.c) turns the nops at ftrace_call into a call to a generic entry for function tracers. * ftrace_caller (in kernel/mcount-dyn.S) The entry where each _mcount call sites calls to once they are filtered to be traced. Also, this patch fixes the semantic problems in mcount.S, which will be treated as only a reference implementation once we have the dynamic ftrace. Cc: Greentime Hu <[email protected]> Signed-off-by: Alan Kao <[email protected]> Signed-off-by: Palmer Dabbelt <[email protected]>
2018-04-02riscv/ftrace: Add RECORD_MCOUNT supportAlan Kao3-0/+9
Now recordmcount.pl recognizes RISC-V object files. For the mechanism to work, we have to disable the linker relaxation. Cc: Greentime Hu <[email protected]> Signed-off-by: Alan Kao <[email protected]> Signed-off-by: Palmer Dabbelt <[email protected]>
2018-04-02Merge tag 'nds32-for-linus-4.17' of ↵Linus Torvalds130-21/+11864
git://git.kernel.org/pub/scm/linux/kernel/git/greentime/linux Pull nds32 architecture support from Greentime Hu: "This contains the core nds32 Linux port (including interrupt controller driver and timer driver), which has been through seven rounds of review on mailing list. It is able to boot to shell and passes most LTP-2017 testsuites in nds32 AE3XX platform: Total Tests: 1901 Total Skipped Tests: 618 Total Failures: 78" Reviewed-by: Arnd Bergmann <[email protected]> * tag 'nds32-for-linus-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/greentime/linux: (44 commits) nds32: To use the generic dump_stack() nds32: fix building failed if using elf toolchain. nios2: add ioremap_nocache declaration before include asm-generic/io.h. nds32: fix building failed if using older version gcc. dt-bindings: timer: Add andestech atcpit100 timer binding doc clocksource/drivers/atcpit100: VDSO support clocksource/drivers/atcpit100: Add andestech atcpit100 timer net: faraday add nds32 support. irqchip: Andestech Internal Vector Interrupt Controller driver dt-bindings: interrupt-controller: Andestech Internal Vector Interrupt Controller dt-bindings: nds32 SoC Bindings dt-bindings: nds32 L2 cache controller Bindings dt-bindings: nds32 CPU Bindings MAINTAINERS: Add nds32 nds32: Build infrastructure nds32: defconfig nds32: Miscellaneous header files nds32: Device tree support nds32: Generic timers support nds32: Loadable modules ...
2018-04-02Merge tag 'm68k-for-v4.17-tag1' of ↵Linus Torvalds19-149/+296
git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k Pull m68k updates from Geert Uytterhoeven: - Macintosh enhancements and fixes - Remove useless memory layout printing using hashed pointers - Add missing Amiga Zorro bus DMA mask - Small fixes and cleanups - Defconfig updates * tag 'm68k-for-v4.17-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k: m68k/mac: Remove bogus "FIXME" comment m68k/mac: Enable RTC for 100-series PowerBooks m68k/mac: Clean up whitespace and remove redundant parentheses m68k/defconfig: Update defconfigs for v4.16-rc5 zorro: Set up z->dev.dma_mask for the DMA API m68k/time: Stop validating rtc_time in .read_time m68k/mm: Stop printing the virtual memory layout macintosh/via-pmu68k: Initialize PMU driver with setup_arch and arch_initcall m68k/mac: Fix apparent race condition in Baboon interrupt dispatch m68k/mac: Enable PDMA support for PowerBook 190
2018-04-02Merge branch 'efi-core-for-linus' of ↵Linus Torvalds18-108/+141
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI updates from Ingo Molnar: "The main EFI changes in this cycle were: - Fix the apple-properties code (Andy Shevchenko) - Add WARN() on arm64 if UEFI Runtime Services corrupt the reserved x18 register (Ard Biesheuvel) - Use efi_switch_mm() on x86 instead of manipulating %cr3 directly (Sai Praneeth) - Fix early memremap leak in ESRT code (Ard Biesheuvel) - Switch to L"xxx" notation for wide string literals (Ard Biesheuvel) - ... plus misc other cleanups and bugfixes" * 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/efi: Use efi_switch_mm() rather than manually twiddling with %cr3 x86/efi: Replace efi_pgd with efi_mm.pgd efi: Use string literals for efi_char16_t variable initializers efi/esrt: Fix handling of early ESRT table mapping efi: Use efi_mm in x86 as well as ARM efi: Make const array 'apple' static efi/apple-properties: Use memremap() instead of ioremap() efi: Reorder pr_notice() with add_device_randomness() call x86/efi: Replace GFP_ATOMIC with GFP_KERNEL in efi_query_variable_store() efi/arm64: Check whether x18 is preserved by runtime services calls efi/arm*: Stop printing addresses of virtual mappings efi/apple-properties: Remove redundant attribute initialization from unmarshal_key_value_pairs() efi/arm*: Only register page tables when they exist
2018-04-02Merge branch 'x86-dma-for-linus' of ↵Linus Torvalds30-558/+182
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 dma mapping updates from Ingo Molnar: "This tree, by Christoph Hellwig, switches over the x86 architecture to the generic dma-direct and swiotlb code, and also unifies more of the dma-direct code between architectures. The now unused x86-only primitives are removed" * 'x86-dma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: dma-mapping: Don't clear GFP_ZERO in dma_alloc_attrs swiotlb: Make swiotlb_{alloc,free}_buffer depend on CONFIG_DMA_DIRECT_OPS dma/swiotlb: Remove swiotlb_{alloc,free}_coherent() dma/direct: Handle force decryption for DMA coherent buffers in common code dma/direct: Handle the memory encryption bit in common code dma/swiotlb: Remove swiotlb_set_mem_attributes() set_memory.h: Provide set_memory_{en,de}crypted() stubs x86/dma: Remove dma_alloc_coherent_gfp_flags() iommu/intel-iommu: Enable CONFIG_DMA_DIRECT_OPS=y and clean up intel_{alloc,free}_coherent() iommu/amd_iommu: Use CONFIG_DMA_DIRECT_OPS=y and dma_direct_{alloc,free}() x86/dma/amd_gart: Use dma_direct_{alloc,free}() x86/dma/amd_gart: Look at dev->coherent_dma_mask instead of GFP_DMA x86/dma: Use generic swiotlb_ops x86/dma: Use DMA-direct (CONFIG_DMA_DIRECT_OPS=y) x86/dma: Remove dma_alloc_coherent_mask()
2018-04-02Merge branch 'sched-wait-for-linus' of ↵Linus Torvalds19-170/+147
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull wait_var_event updates from Ingo Molnar: "This introduces the new wait_var_event() API, which is a more flexible waiting primitive than wait_on_atomic_t(). All wait_on_atomic_t() users are migrated over to the new API and wait_on_atomic_t() is removed. The migration fixes one bug and should result in no functional changes for the other usecases" * 'sched-wait-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/wait: Improve __var_waitqueue() code generation sched/wait: Remove the wait_on_atomic_t() API sched/wait, arch/mips: Fix and convert wait_on_atomic_t() usage to the new wait_var_event() API sched/wait, fs/ocfs2: Convert wait_on_atomic_t() usage to the new wait_var_event() API sched/wait, fs/nfs: Convert wait_on_atomic_t() usage to the new wait_var_event() API sched/wait, fs/fscache: Convert wait_on_atomic_t() usage to the new wait_var_event() API sched/wait, fs/btrfs: Convert wait_on_atomic_t() usage to the new wait_var_event() API sched/wait, fs/afs: Convert wait_on_atomic_t() usage to the new wait_var_event() API sched/wait, drivers/media: Convert wait_on_atomic_t() usage to the new wait_var_event() API sched/wait, drivers/drm: Convert wait_on_atomic_t() usage to the new wait_var_event() API sched/wait: Introduce wait_var_event()
2018-04-02Merge branch 'x86-timers-for-linus' of ↵Linus Torvalds5-5/+43
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 timer updates from Ingo Molnar: "Two changes: add the new convert_art_ns_to_tsc() API for upcoming Intel Goldmont+ drivers, and remove the obsolete rdtscll() API" * 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/tsc: Get rid of rdtscll() x86/tsc: Convert ART in nanoseconds to TSC
2018-04-02Merge branch 'x86-platform-for-linus' of ↵Linus Torvalds22-103/+185
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 platform updates from Ingo Molnar: "The main changes in this cycle were: - Add "Jailhouse" hypervisor support (Jan Kiszka) - Update DeviceTree support (Ivan Gorinov) - Improve DMI date handling (Andy Shevchenko)" * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/PCI: Fix a potential regression when using dmi_get_bios_year() firmware/dmi_scan: Uninline dmi_get_bios_year() helper x86/devicetree: Use CPU description from Device Tree of/Documentation: Specify local APIC ID in "reg" MAINTAINERS: Add entry for Jailhouse x86/jailhouse: Allow to use PCI_MMCONFIG without ACPI x86: Consolidate PCI_MMCONFIG configs x86: Align x86_64 PCI_MMCONFIG with 32-bit variant x86/jailhouse: Enable PCI mmconfig access in inmates PCI: Scan all functions when running over Jailhouse jailhouse: Provide detection for non-x86 systems x86/devicetree: Fix device IRQ settings in DT x86/devicetree: Initialize device tree before using it pci: Simplify code by using the new dmi_get_bios_year() helper ACPI/sleep: Simplify code by using the new dmi_get_bios_year() helper x86/pci: Simplify code by using the new dmi_get_bios_year() helper dmi: Introduce the dmi_get_bios_year() helper function x86/platform/quark: Re-use DEFINE_SHOW_ATTRIBUTE() macro x86/platform/atom: Re-use DEFINE_SHOW_ATTRIBUTE() macro
2018-04-02Merge branch 'x86-mm-for-linus' of ↵Linus Torvalds68-1016/+1657
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 mm updates from Ingo Molnar: - Extend the memmap= boot parameter syntax to allow the redeclaration and dropping of existing ranges, and to support all e820 range types (Jan H. Schönherr) - Improve the W+X boot time security checks to remove false positive warnings on Xen (Jan Beulich) - Support booting as Xen PVH guest (Juergen Gross) - Improved 5-level paging (LA57) support, in particular it's possible now to have a single kernel image for both 4-level and 5-level hardware (Kirill A. Shutemov) - AMD hardware RAM encryption support (SME/SEV) fixes (Tom Lendacky) - Preparatory commits for hardware-encrypted RAM support on Intel CPUs. (Kirill A. Shutemov) - Improved Intel-MID support (Andy Shevchenko) - Show EFI page tables in page_tables debug files (Andy Lutomirski) - ... plus misc fixes and smaller cleanups * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (56 commits) x86/cpu/tme: Fix spelling: "configuation" -> "configuration" x86/boot: Fix SEV boot failure from change to __PHYSICAL_MASK_SHIFT x86/mm: Update comment in detect_tme() regarding x86_phys_bits x86/mm/32: Remove unused node_memmap_size_bytes() & CONFIG_NEED_NODE_MEMMAP_SIZE logic x86/mm: Remove pointless checks in vmalloc_fault x86/platform/intel-mid: Add special handling for ACPI HW reduced platforms ACPI, x86/boot: Introduce the ->reduced_hw_early_init() ACPI callback ACPI, x86/boot: Split out acpi_generic_reduce_hw_init() and export x86/pconfig: Provide defines and helper to run MKTME_KEY_PROG leaf x86/pconfig: Detect PCONFIG targets x86/tme: Detect if TME and MKTME is activated by BIOS x86/boot/compressed/64: Handle 5-level paging boot if kernel is above 4G x86/boot/compressed/64: Use page table in trampoline memory x86/boot/compressed/64: Use stack from trampoline memory x86/boot/compressed/64: Make sure we have a 32-bit code segment x86/mm: Do not use paravirtualized calls in native_set_p4d() kdump, vmcoreinfo: Export pgtable_l5_enabled value x86/boot/compressed/64: Prepare new top-level page table for trampoline x86/boot/compressed/64: Set up trampoline memory x86/boot/compressed/64: Save and restore trampoline memory ...
2018-04-02Merge branch 'x86-cleanups-for-linus' of ↵Linus Torvalds12-130/+119
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups and msr updates from Ingo Molnar: "The main change is a performance/latency improvement to /dev/msr access. The rest are misc cleanups" * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/msr: Make rdmsrl_safe_on_cpu() scheduling safe as well x86/cpuid: Allow cpuid_read() to schedule x86/msr: Allow rdmsr_safe_on_cpu() to schedule x86/rtc: Stop using deprecated functions x86/dumpstack: Unify show_regs() x86/fault: Do not print IP in show_fault_oops() x86/MSR: Move native_* variants to msr.h
2018-04-02Merge branch 'x86-build-for-linus' of ↵Linus Torvalds6-32/+16
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 build updates from Ingo Molnar: "The biggest change is the forcing of asm-goto support on x86, which effectively increases the GCC minimum supported version to gcc-4.5 (on x86)" * 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/build: Don't pass in -D__KERNEL__ multiple times x86: Remove FAST_FEATURE_TESTS x86: Force asm-goto x86/build: Drop superfluous ALIGN from the linker script
2018-04-02Merge branch 'x86-asm-for-linus' of ↵Linus Torvalds2-3/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 asm fixlets from Ingo Molnar: "A clobber list fix and cleanups" * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/asm: Trim clear_page.S includes x86/asm: Clobber flags in clear_page()
2018-04-02Merge branch 'x86-apic-for-linus' of ↵Linus Torvalds15-117/+101
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 apic updates from Ingo Molnar: "The main x86 APIC/IOAPIC changes in this cycle were: - Robustify kexec support to more carefully restore IRQ hardware state before calling into kexec/kdump kernels. (Baoquan He) - Clean up the local APIC code a bit (Dou Liyang) - Remove unused callbacks (David Rientjes)" * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/apic: Finish removing unused callbacks x86/apic: Drop logical_smp_processor_id() inline x86/apic: Modernize the pending interrupt code x86/apic: Move pending interrupt check code into it's own function x86/apic: Set up through-local-APIC mode on the boot CPU if 'noapic' specified x86/apic: Rename variables and functions related to x86_io_apic_ops x86/apic: Remove the (now) unused disable_IO_APIC() function x86/apic: Fix restoring boot IRQ mode in reboot and kexec/kdump x86/apic: Split disable_IO_APIC() into two functions to fix CONFIG_KEXEC_JUMP=y x86/apic: Split out restore_boot_irq_mode() from disable_IO_APIC() x86/apic: Make setup_local_APIC() static x86/apic: Simplify init_bsp_APIC() usage x86/x2apic: Mark set_x2apic_phys_mode() as __init
2018-04-02Merge branch 'smp-hotplug-for-linus' of ↵Linus Torvalds1-36/+24
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull SMP hotplug updates from Ingo Molnar: "Simplify the CPU hot-plug state machine" * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: cpu/hotplug: Fix unused function warning cpu/hotplug: Merge cpuhp_bp_states and cpuhp_ap_states
2018-04-02Merge branch 'sched-core-for-linus' of ↵Linus Torvalds41-1529/+2082
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "The main scheduler changes in this cycle were: - NUMA balancing improvements (Mel Gorman) - Further load tracking improvements (Patrick Bellasi) - Various NOHZ balancing cleanups and optimizations (Peter Zijlstra) - Improve blocked load handling, in particular we can now reduce and eventually stop periodic load updates on 'very idle' CPUs. (Vincent Guittot) - On isolated CPUs offload the final 1Hz scheduler tick as well, plus related cleanups and reorganization. (Frederic Weisbecker) - Core scheduler code cleanups (Ingo Molnar)" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (45 commits) sched/core: Update preempt_notifier_key to modern API sched/cpufreq: Rate limits for SCHED_DEADLINE sched/fair: Update util_est only on util_avg updates sched/cpufreq/schedutil: Use util_est for OPP selection sched/fair: Use util_est in LB and WU paths sched/fair: Add util_est on top of PELT sched/core: Remove TASK_ALL sched/completions: Use bool in try_wait_for_completion() sched/fair: Update blocked load when newly idle sched/fair: Move idle_balance() sched/nohz: Merge CONFIG_NO_HZ_COMMON blocks sched/fair: Move rebalance_domains() sched/nohz: Optimize nohz_idle_balance() sched/fair: Reduce the periodic update duration sched/nohz: Stop NOHZ stats when decayed sched/cpufreq: Provide migration hint sched/nohz: Clean up nohz enter/exit sched/fair: Update blocked load from NEWIDLE sched/fair: Add NOHZ stats balancing sched/fair: Restructure nohz_balance_kick() ...
2018-04-02Merge branch 'ras-core-for-linus' of ↵Linus Torvalds6-117/+156
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 RAS updates from Ingo Molnar: "The main changes in this cycle were: - AMD MCE support/decoding improvements (Yazen Ghannam) - general MCE header cleanups and reorganization (Borislav Petkov)" * 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Revert "x86/mce/AMD: Collect error info even if valid bits are not set" x86/MCE: Cleanup and complete struct mce fields definitions x86/mce/AMD: Carve out SMCA get_block_address() code x86/mce/AMD: Get address from already initialized block x86/mce/AMD, EDAC/mce_amd: Enumerate Reserved SMCA bank type x86/mce/AMD: Pass the bank number to smca_get_bank_type() x86/mce/AMD: Collect error info even if valid bits are not set x86/mce: Issue the 'mcelog --ascii' message only on !AMD x86/mce: Convert 'struct mca_config' bools to a bitfield x86/mce: Put private structures and definitions into the internal header
2018-04-02bpf: whitelist all syscalls for error injectionHoward McLauchlan2-0/+6
Error injection is a useful mechanism to fail arbitrary kernel functions. However, it is often hard to guarantee an error propagates appropriately to user space programs. By injecting into syscalls, we can return arbitrary values to user space directly; this increases flexibility and robustness in testing, allowing us to test user space error paths effectively. The following script, for example, fails calls to sys_open() from a given pid: from bcc import BPF from sys import argv pid = argv[1] prog = r""" int kprobe__SyS_open(struct pt_regs *ctx, const char *pathname, int flags) { u32 pid = bpf_get_current_pid_tgid(); if (pid == %s) bpf_override_return(ctx, -ENOMEM); return 0; } """ % pid b = BPF(text=prog) while 1: b.perf_buffer_poll() This patch whitelists all syscalls defined with SYSCALL_DEFINE and COMPAT_SYSCALL_DEFINE for error injection. These changes are not intended to be considered stable, and would normally be configured off. Signed-off-by: Howard McLauchlan <[email protected]> Signed-off-by: Dominik Brodowski <[email protected]>
2018-04-02kernel/sys_ni: remove {sys_,sys_compat} from cond_syscall definitionsDominik Brodowski2-216/+219
This keeps it in line with the SYSCALL_DEFINEx() / COMPAT_SYSCALL_DEFINEx() calling convention. Signed-off-by: Dominik Brodowski <[email protected]>
2018-04-02kernel/sys_ni: sort cond_syscall() entriesDominik Brodowski1-174/+332
Shuffle the cond_syscall() entries in kernel/sys_ni.c around so that they are kept in the same order as in include/uapi/asm-generic/unistd.h. For better structuring, add the same comments as in that file, but keep a few additional comments and extend the commentary where it seems useful. Signed-off-by: Dominik Brodowski <[email protected]>
2018-04-02syscalls/x86: auto-create compat_sys_*() prototypesDominik Brodowski4-80/+4
compat_sys_*() functions are no longer called from within the kernel on x86 except from the system call table. Linking the system call does not require compat_sys_*() function prototypes at least on x86. Therefore, generate compat_sys_*() prototypes on-the-fly within the COMPAT_SYSCALL_DEFINEx() macro, and remove x86-specific prototypes from various header files. Suggested-by: Andy Lutomirski <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: David S. Miller <[email protected]> Cc: [email protected] Cc: Thomas Gleixner <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Al Viro <[email protected]> Cc: [email protected] Signed-off-by: Dominik Brodowski <[email protected]>
2018-04-02syscalls: sort syscall prototypes in include/linux/compat.hDominik Brodowski1-277/+378
Shuffle the syscall prototypes in include/linux/compat.h around so that they are kept in the same order as in include/uapi/asm-generic/unistd.h. The individual entries are kept the same, and neither modified to bring them in line with kernel coding style nor wrapped in proper ifdefs -- as an exception to this, add the prefix "asmlinkage" where it was missing. Cc: Arnd Bergmann <[email protected]> Cc: Andrew Morton <[email protected]> Signed-off-by: Dominik Brodowski <[email protected]>
2018-04-02net: remove compat_sys_*() prototypes from net/compat.hDominik Brodowski1-11/+0
As the syscall functions should only be called from the system call table but not from elsewhere in the kernel, it is sufficient that they are defined in linux/compat.h. Cc: David S. Miller <[email protected]> Cc: [email protected] Signed-off-by: Dominik Brodowski <[email protected]>
2018-04-02syscalls: sort syscall prototypes in include/linux/syscalls.hDominik Brodowski1-537/+667
Shuffle the syscall prototypes in include/linux/syscalls.h around so that they are kept in the same order as in include/uapi/asm-generic/unistd.h. The individual entries are kept the same, and neither modified to bring them in line with kernel coding style nor wrapped in proper ifdefs. Cc: Arnd Bergmann <[email protected]> Cc: Andrew Morton <[email protected]> Signed-off-by: Dominik Brodowski <[email protected]>
2018-04-02kexec: move sys_kexec_load() prototype to syscalls.hDominik Brodowski2-4/+4
As the syscall function should only be called from the system call table but not from elsewhere in the kernel, move the prototype for sys_kexec_load() to include/syscall.h. Cc: Eric Biederman <[email protected]> Cc: [email protected] Signed-off-by: Dominik Brodowski <[email protected]>
2018-04-02x86/sigreturn: use SYSCALL_DEFINE0Tautschnig, Michael1-2/+3
All definitions of syscalls in x86 except for those patched here have already been using the appropriate SYSCALL_DEFINE*. Signed-off-by: Michael Tautschnig <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Jaswinder Singh <[email protected]> Cc: Andi Kleen <[email protected]> Cc: [email protected] Reviewed-by: Thomas Gleixner <[email protected]> Signed-off-by: Dominik Brodowski <[email protected]>
2018-04-02x86: fix sys_sigreturn() return type to be long, not unsigned longDominik Brodowski2-2/+2
Same as with other system calls, sys_sigreturn() should return a value of type long, not unsigned long. This also matches the behaviour for IA32_EMULATION, see sys32_sigreturn() in arch/x86/ia32/ia32_signal.c . Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Slaby <[email protected]> Cc: [email protected] Cc: Michael Tautschnig <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Signed-off-by: Dominik Brodowski <[email protected]>
2018-04-02x86/ioport: add ksys_ioperm() helper; remove in-kernel calls to sys_ioperm()Dominik Brodowski3-4/+10
Using this helper allows us to avoid the in-kernel calls to the sys_ioperm() syscall. The ksys_ prefix denotes that this function is meant as a drop-in replacement for the syscall. In particular, it uses the same calling convention as sys_ioperm(). This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/[email protected] Cc: Ingo Molnar <[email protected]> Cc: Jiri Slaby <[email protected]> Cc: [email protected] Acked-by: Greg Kroah-Hartman <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Signed-off-by: Dominik Brodowski <[email protected]>
2018-04-02mm: add ksys_readahead() helper; remove in-kernel calls to sys_readahead()Dominik Brodowski8-7/+13
Using this helper allows us to avoid the in-kernel calls to the sys_readahead() syscall. The ksys_ prefix denotes that this function is meant as a drop-in replacement for the syscall. In particular, it uses the same calling convention as sys_readahead(). This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/[email protected] Cc: Andrew Morton <[email protected]> Cc: [email protected] Signed-off-by: Dominik Brodowski <[email protected]>
2018-04-02mm: add ksys_mmap_pgoff() helper; remove in-kernel calls to sys_mmap_pgoff()Dominik Brodowski21-41/+60
Using this helper allows us to avoid the in-kernel calls to the sys_mmap_pgoff() syscall. The ksys_ prefix denotes that this function is meant as a drop-in replacement for the syscall. In particular, it uses the same calling convention as sys_mmap_pgoff(). This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/[email protected] Cc: Andrew Morton <[email protected]> Cc: [email protected] Signed-off-by: Dominik Brodowski <[email protected]>
2018-04-02mm: add ksys_fadvise64_64() helper; remove in-kernel call to sys_fadvise64_64()Dominik Brodowski12-27/+43
Using the ksys_fadvise64_64() helper allows us to avoid the in-kernel calls to the sys_fadvise64_64() syscall. The ksys_ prefix denotes that this function is meant as a drop-in replacement for the syscall. In particular, it uses the same calling convention as ksys_fadvise64_64(). Some compat stubs called sys_fadvise64(), which then just passed through the arguments to sys_fadvise64_64(). Get rid of this indirection, and call ksys_fadvise64_64() directly. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/[email protected] Cc: Andrew Morton <[email protected]> Cc: [email protected] Signed-off-by: Dominik Brodowski <[email protected]>
2018-04-02fs: add ksys_fallocate() wrapper; remove in-kernel calls to sys_fallocate()Dominik Brodowski8-12/+18
Using the ksys_fallocate() wrapper allows us to get rid of in-kernel calls to the sys_fallocate() syscall. The ksys_ prefix denotes that this function is meant as a drop-in replacement for the syscall. In particular, it uses the same calling convention as sys_fallocate(). This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/[email protected] Cc: Al Viro <[email protected]> Cc: Andrew Morton <[email protected]> Signed-off-by: Dominik Brodowski <[email protected]>
2018-04-02fs: add ksys_p{read,write}64() helpers; remove in-kernel calls to syscallsDominik Brodowski9-20/+36
Using the ksys_p{read,write}64() wrappers allows us to get rid of in-kernel calls to the sys_pread64() and sys_pwrite64() syscalls. The ksys_ prefix denotes that this function is meant as a drop-in replacement for the syscall. In particular, it uses the same calling convention as sys_p{read,write}64(). This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/[email protected] Cc: Al Viro <[email protected]> Cc: Andrew Morton <[email protected]> Signed-off-by: Dominik Brodowski <[email protected]>
2018-04-02fs: add ksys_truncate() wrapper; remove in-kernel calls to sys_truncate()Dominik Brodowski8-9/+17
Using the ksys_truncate() wrapper allows us to get rid of in-kernel calls to the sys_truncate() syscall. The ksys_ prefix denotes that this function is meant as a drop-in replacement for the syscall. In particular, it uses the same calling convention as sys_truncate(). This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/[email protected] Cc: Al Viro <[email protected]> Cc: Andrew Morton <[email protected]> Signed-off-by: Dominik Brodowski <[email protected]>
2018-04-02fs: add ksys_sync_file_range helper(); remove in-kernel calls to syscallDominik Brodowski8-11/+19
Using this helper allows us to avoid the in-kernel calls to the sys_sync_file_range() syscall. The ksys_ prefix denotes that this function is meant as a drop-in replacement for the syscall. In particular, it uses the same calling convention as sys_sync_file_range(). This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/[email protected] Cc: Al Viro <[email protected]> Cc: Andrew Morton <[email protected]> Signed-off-by: Dominik Brodowski <[email protected]>