aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2017-06-01powerpc/64s: Add dt_cpu_ftrs boot time setup optionNicholas Piggin3-14/+57
Provide a dt_cpu_ftrs= cmdline option to disable the dt_cpu_ftrs CPU feature discovery, and fall back to the "cputable" based version. Also allow control of advertising unknown features to userspace and with this parameter, and remove the clunky CONFIG option. Signed-off-by: Nicholas Piggin <[email protected]> [mpe: Add explicit early check of bootargs in dt_cpu_ftrs_init()] Signed-off-by: Michael Ellerman <[email protected]>
2017-05-30powerpc: Link warning for orphan sectionsNicholas Piggin3-2/+27
Add --orphan-handling=warn to final link flags. This ensures we can handle all sections explicitly. This would have caught subtle breakage such as 7de3b27bac47da9de08409df1d69664acbb72197 at build-time. Also bring existing orphan sections into the fold: - .text.hot and .text.unlikely are compiler generated sections. - .sdata2, .dynsbss, .plt are used by PPC32 - We previously did not specify DWARF_DEBUG or STABS_DEBUG - DWARF_DEBUG did not include all DWARF sections that can be emitted - A number of sections are unused and can be discarded. Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-05-30powerpc/64: Tool to check head sections location sanityNicholas Piggin4-25/+87
Use a tool to check that the location of "fixed sections" are where we expected them to be, which catches cases the linker script can't (stubs being added to start of .text section), and which ends up being neater. Sample output: ERROR: start_text address is c000000000008100, should be c000000000008000 ERROR: see comments in arch/powerpc/tools/head_check.sh Signed-off-by: Nicholas Piggin <[email protected]> [mpe: Fold in fix from Nick for 4.6 era toolchains] Signed-off-by: Michael Ellerman <[email protected]>
2017-05-30powerpc/64: Handle linker stubs in low .text codeNicholas Piggin3-0/+34
Very large kernels may require linker stubs for branches from HEAD text code. The linker may place these stubs before the HEAD text sections, which breaks the assumption that HEAD text is located at 0 (or the .text section being located at 0x7000/0x8000 on Book3S kernels). Provide an option to create a small section just before the .text section with an empty 256 - 4 bytes, and adjust the start of the .text section to match. The linker will tend to put stubs in that section and not break our relative-to-absolute offset assumptions. This causes a small waste of space on common kernels, but allows large kernels to build and boot. For now, it is an EXPERT config option, defaulting to =n, but a reference is provided for it in the build-time check for such breakage. This is good enough for allyesconfig and custom users / hackers. Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-05-30powerpc/64s: Tool to flag direct branches from unrelocated interrupt vectorsNicholas Piggin2-1/+65
Direct banches from code below __end_interrupts to code above __end_interrupts when built with CONFIG_RELOCATABLE are disallowed because they will break when the kernel is not located at 0. Sample output: WARNING: Unrelocated relative branches c000000000000118 bl-> 0xc000000000038fb8 <pnv_restore_hyp_resource> c00000000000013c b-> 0xc0000000001068a4 <kvm_start_guest> c000000000000148 b-> 0xc00000000003919c <pnv_wakeup_loss> c00000000000014c b-> 0xc00000000003923c <pnv_wakeup_noloss> c0000000000005a4 b-> 0xc000000000106ffc <kvmppc_interrupt_hv> c000000000001af0 b-> 0xc000000000106ffc <kvmppc_interrupt_hv> c000000000001b24 b-> 0xc000000000106ffc <kvmppc_interrupt_hv> c000000000001b58 b-> 0xc000000000106ffc <kvmppc_interrupt_hv> Signed-off-by: Balbir Singh <[email protected]> Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-05-30powerpc/64: Linker on-demand sfpr functions for modulesNicholas Piggin2-5/+18
For final link, the powerpc64 linker generates fpr save/restore functions on-demand, placing them in the .sfpr section. Starting with binutils 2.25, these can be provided for non-final links with --save-restore-funcs. Use that where possible for module links. This saves about 200 bytes per module (~60kB) on powernv defconfig build. Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-05-30powerpc/64: Do not create new section for save/restore functionsNicholas Piggin1-4/+2
There is no need to create a new section for these. Consolidate with 32-bit and just use .text. Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-05-30powerpc/64: Do not link crtsaveres.o in bootNicholas Piggin2-5/+8
crtsaveres.S is empty with 64-bit builds already, so just don't build and link it to match the vmlinux build. Signed-off-by: Nicholas Piggin <[email protected]> [mpe: Use CONFIG_PPC64_BOOT_WRAPPER not CONFIG_PPC32 to fix BE build] Signed-off-by: Michael Ellerman <[email protected]>
2017-05-30powerpc/64: Do not link crtsavres.o in vmlinuxNicholas Piggin1-2/+6
The 64-bit linker creates save/restore functions on demand with final links, so vmlinux does not require crtsavres.o. Make crtsavres.o extra-y on 64-bit (it is still required by modules). Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-05-30powerpc/64: Place sfpr section explicitly with the linker scriptNicholas Piggin1-0/+8
Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-05-30powerpc: Use uapi/asm-generic/sockios.hStephen Rothwell2-20/+1
The arch version is identical except for comments and white space. Signed-off-by: Stephen Rothwell <[email protected]> Acked-by: Arnd Bergmann <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-05-30powerpc: Use the asm-generic versions of some uapi includesStephen Rothwell5-9/+5
These are completely obvious as all they do is include the asm-generic versions. Signed-off-by: Stephen Rothwell <[email protected]> Acked-by: Arnd Bergmann <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-05-30powerpc/[booke|4xx]: Don't clobber TCR[WP] when setting TCR[DIE]Ivan Mikhaylov1-3/+11
Prevent a kernel panic caused by unintentionally clearing TCR watchdog bits. At this point in the kernel boot, the watchdog may have already been enabled by u-boot. The original code's attempt to write to the TCR register results in an inadvertent clearing of the watchdog configuration bits, causing the 476 to reset. Signed-off-by: Ivan Mikhaylov <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-05-30powerpc/44x/fsp2: Add defconfig for FSP2 boardIvan Mikhaylov1-0/+126
This patch adds default FSP2 config for main usage. Signed-off-by: Ivan Mikhaylov <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-05-30powerpc/44x/fsp2: Add device tree for FSP2 boardIvan Mikhaylov1-0/+608
Add a device tree for FSP2 board (476 based). Signed-off-by: Ivan Mikhaylov <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-05-30powerpc/44x/fsp2: Platform support for FSP2 (476fpe) boardIvan Mikhaylov3-0/+75
Add platform code support for FSP2 (476fpe) board. Signed-off-by: Ivan Mikhaylov <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-05-30cpuidle-powernv: Allow Deep stop states that don't stop timeGautham R. Shenoy1-6/+10
The current code in the cpuidle-powernv intialization only allows deep stop states (indicated by OPAL_PM_STOP_INST_DEEP) which lose timebase (indicated by OPAL_PM_TIMEBASE_STOP). This assumption goes back to POWER8 time where deep states used to lose the timebase. However, on POWER9, we do have stop states that are deep (they lose hypervisor state) but retain the timebase. Fix the initialization code in the cpuidle-powernv driver to allow such deep states. Further, there is a bug in cpuidle-powernv driver with CONFIG_TICK_ONESHOT=n where we end up incrementing the nr_idle_states even if a platform idle state which loses time base was not added to the cpuidle table. Fix this by ensuring that the nr_idle_states variable gets incremented only when the platform idle state was added to the cpuidle table. Signed-off-by: Gautham R. Shenoy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-05-30powerpc/powernv/idle: Use Requested Level for restoring state on P9 DD1Gautham R. Shenoy3-1/+15
On Power9 DD1 due to a hardware bug the Power-Saving Level Status field (PLS) of the PSSCR for a thread waking up from a deep state can under-report if some other thread in the core is in a shallow stop state. The scenario in which this can manifest is as follows: 1) All the threads of the core are in deep stop. 2) One of the threads is woken up. The PLS for this thread will correctly reflect that it is waking up from deep stop. 3) The thread that has woken up now executes a shallow stop. 4) When some other thread in the core is woken, its PLS will reflect the shallow stop state. Thus, the subsequent thread for which the PLS is under-reporting the wakeup state will not restore the hypervisor resources. Hence, on DD1 systems, use the Requested Level (RL) field as a workaround to restore the contents of the hypervisor resources on the wakeup from the stop state. Signed-off-by: Gautham R. Shenoy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-05-30powerpc/powernv/idle: Restore SPRs for deep idle states via stop API.Akshay Adiga1-31/+52
Some of the SPR values (HID0, MSR, SPRG0) don't change during the run time of a booted kernel, once they have been initialized. The contents of these SPRs are lost when the CPUs enter deep stop states. So instead saving and restoring SPRs from the kernel, use the stop-api provided by the firmware by which the firmware can restore the contents of these SPRs to their initialized values after wakeup from a deep stop state. Apart from these, program the PSSCR value to that of the deepest stop state via the stop-api. This will be used to indicate to the underlying firmware as to what stop state to put the threads that have been woken up by a special-wakeup. And while we are at programming SPRs via stop-api, ensure that HID1, HID4 and HID5 registers which are only available on POWER8 are not requested to be restored by the firware on POWER9. Signed-off-by: Akshay Adiga <[email protected]> Signed-off-by: Gautham R. Shenoy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-05-30powerpc/powernv/idle: Restore LPCR on wakeup from deep-stopGautham R. Shenoy1-3/+10
On wakeup from a deep stop state which is supposed to lose the hypervisor state, we don't restore the LPCR to the old value but set it to a "sane" value via cur_cpu_spec->cpu_restore(). The problem is that the "sane" value doesn't include UPRT and the HR bits which are required to run correctly in Radix mode. Fix this on POWER9 onwards by restoring the LPCR value whatever it was before executing the stop instruction. Signed-off-by: Gautham R. Shenoy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-05-30powerpc/powernv/idle: Decouple Timebase restore & Per-core SPRs restoreGautham R. Shenoy1-3/+4
On POWER8, in case of - nap: both timebase and hypervisor state is retained. - fast-sleep: timebase is lost. But the hypervisor state is retained. - winkle: timebase and hypervisor state is lost. Hence, the current code for handling exit from a idle state assumes that if the timebase value is retained, then so is the hypervisor state. Thus, the current code doesn't restore per-core hypervisor state in such cases. But that is no longer the case on POWER9 where we do have stop states in which timebase value is retained, but the hypervisor state is lost. So we have to ensure that the per-core hypervisor state gets restored in such cases. Fix this by ensuring that even in the case when timebase is retained, we explicitly check if we are waking up from a deep stop that loses per-core hypervisor state (indicated by cr4 being eq or gt), and if this is the case, we restore the per-core hypervisor state. Signed-off-by: Gautham R. Shenoy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-05-30powerpc/powernv/idle: Correctly initialize core_idle_state_ptrGautham R. Shenoy1-10/+19
The lower 8 bits of core_idle_state_ptr tracks the number of non-idle threads in the core. This is supposed to be initialized to bit-map corresponding to the threads_per_core. However, currently it is initialized to PNV_CORE_IDLE_THREAD_BITS (0xFF). This is correct for POWER8 which has 8 threads per core, but not for POWER9 which has 4 threads per core. As a result, on POWER9, core_idle_state_ptr gets initialized to 0xFF. In case when all the threads of the core are idle, the bits corresponding tracking the idle-threads are non-zero. As a result, the idle entry/exit code fails to save/restore per-core hypervisor state since it assumes that there are threads in the cores which are still active. Fix this by correctly initializing the lower bits of the core_idle_state_ptr on the basis of threads_per_core. Signed-off-by: Gautham R. Shenoy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-05-30powerpc: Add HAVE_IRQ_TIME_ACCOUNTINGAnton Blanchard1-0/+1
Allow us to enable IRQ_TIME_ACCOUNTING. Even though we currently use VIRT_CPU_ACCOUNTING_NATIVE, that option is quite heavy weight and IRQ_TIME_ACCOUNTING might be better in some cases. Signed-off-by: Anton Blanchard <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-05-30powerpc/sequoia: Fix NAND partitions not to overlapPavel Machek1-1/+1
Currently the DTS defines two partitions at the same addresses, if you use one, you will corrupt information on the other one. Fix it by shifting the second partition. Signed-off-by: Pavel Machek <[email protected]> [mpe: Reconstruct change log from email thread] Signed-off-by: Michael Ellerman <[email protected]>
2017-05-30powerpc: Tweak copy selection parameter in __copy_tofrom_user_power7()Andrew Jeffery1-2/+2
Experiments with the netperf benchmark indicated that the size selecting VMX-based copies in __copy_tofrom_user_power7() was suboptimal on POWER8. Measurements showed that parity was in the neighbourhood of 3328 bytes, rather than greater than 4096. The change gives a 1.5-2.0% improvement in performance for 4096-byte buffers, reducing the relative time spent in __copy_tofrom_user_power7() from approximately 7% to approximately 5% in the TCP_RR benchmark. Signed-off-by: Andrew Jeffery <[email protected]> Acked-by: Anton Blanchard <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-05-30powerpc/xmon: Fix compile error with PPC_8xx=yNicholas Piggin1-4/+4
Rearrange the code so that mode and badaddr are only defined when they're used. Also unsplit the string for easier grepping, and switch from CONFIG_8xx which is deprecated to CONFIG_PPC_8xx. Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-05-30powerpc/powernv: Fix CPU_HOTPLUG=n idle.c compile errorNicholas Piggin1-0/+2
Fixes: a7cd88da97 ("powerpc/powernv: Move CPU-Offline idle state invocation from smp.c to idle.c") Cc: Gautham R. Shenoy <[email protected]> Signed-off-by: Nicholas Piggin <[email protected]> Acked-by: Gautham R. Shenoy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-05-30powerpc/64s: Fix OPAL_CALL non-maskable interrupt reentrancyNicholas Piggin1-3/+3
OPAL_CALL uses SRR[01] with MSR_RI=1, which gets corrupted if there is an interleaving system reset or machine check interrupt. Use HSRR[01] instead, which does not require MSR_RI=0. Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-05-30powerpc/64s: Fix FIXUP_ENDIAN non-maskable interrupt reentrancyNicholas Piggin2-9/+14
FIXUP_ENDIAN uses SRR[01] with MSR_RI=1, which gets corrupted if there is an interleaving system reset or machine check interrupt. Set MSR_RI=0 before setting SRRs. The rfid will restore MSR. Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-05-28Linux 4.12-rc3Linus Torvalds1-1/+1
2017-05-28Merge branch 'fixes' of ↵Linus Torvalds4-17/+11
git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal Pull thermal SoC management fixes from Eduardo Valentin: - fixes to TI SoC driver, Broadcom, qoriq - small sparse warning fix on thermal core * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal: thermal: broadcom: ns-thermal: default on iProc SoCs ti-soc-thermal: Fix a typo in a comment line ti-soc-thermal: Delete error messages for failed memory allocations in ti_bandgap_build() ti-soc-thermal: Use devm_kcalloc() in ti_bandgap_build() thermal: core: make thermal_emergency_poweroff static thermal: qoriq: remove useless call for of_thermal_get_trip_points()
2017-05-27Merge tag 'tty-4.12-rc3' of ↵Linus Torvalds14-49/+162
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial fixes from Greg KH: "Here are some serial and tty fixes for 4.12-rc3. They are a bit bigger than normal, which is why I had them bake in linux-next for a few weeks and didn't send them to you for -rc2. They revert a few of the serdev patches from 4.12-rc1, and bring things back to how they were in 4.11, to try to make things a bit more stable there. Rob and Johan both agree that this is the way forward, so this isn't people squabbling over semantics. Other than that, just a few minor serial driver fixes that people have had problems with. All of these have been in linux-next for a few weeks with no reported issues" * tag 'tty-4.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: serial: altera_uart: call iounmap() at driver remove serial: imx: ensure UCR3 and UFCR are setup correctly MAINTAINERS/serial: Change maintainer of jsm driver serial: enable serdev support tty/serdev: add serdev registration interface serdev: Restore serdev_device_write_buf for atomic context serial: core: fix crash in uart_suspend_port tty: fix port buffer locking tty: ehv_bytechan: clean up init error handling serial: ifx6x60: fix use-after-free on module unload serial: altera_jtaguart: adding iounmap() serial: exar: Fix stuck MSIs serial: efm32: Fix parity management in 'efm32_uart_console_get_options()' serdev: fix tty-port client deregistration Revert "tty_port: register tty ports with serdev bus" drivers/tty: 8250: only call fintek_8250_probe when doing port I/O
2017-05-27Merge tag 'powerpc-4.12-4' of ↵Linus Torvalds6-6/+12
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "Fix running SPU programs on Cell, and a few other minor fixes. Thanks to Alistair Popple, Jeremy Kerr, Michael Neuling, Nicholas Piggin" * tag 'powerpc-4.12-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc: Add PPC_FEATURE userspace bits for SCV and DARN instructions powerpc/spufs: Fix hash faults for kernel regions powerpc: Fix booting P9 hash with CONFIG_PPC_RADIX_MMU=N powerpc/powernv/npu-dma.c: Fix opal_npu_destroy_context() call selftests/powerpc: Fix TM resched DSCR test with some compilers
2017-05-27Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds9-37/+81
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "A series of fixes for X86: - The final fix for the end-of-stack issue in the unwinder - Handle non PAT systems gracefully - Prevent access to uninitiliazed memory - Move early delay calaibration after basic init - Fix Kconfig help text - Fix a cross compile issue - Unbreak older make versions" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/timers: Move simple_udelay_calibration past init_hypervisor_platform x86/alternatives: Prevent uninitialized stack byte read in apply_alternatives() x86/PAT: Fix Xorg regression on CPUs that don't support PAT x86/watchdog: Fix Kconfig help text file path reference to lockup watchdog documentation x86/build: Permit building with old make versions x86/unwind: Add end-of-stack check for ftrace handlers Revert "x86/entry: Fix the end of the stack for newly forked tasks" x86/boot: Use CROSS_COMPILE prefix for readelf
2017-05-27Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds1-8/+16
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixlet from Thomas Gleixner: "Silence dmesg spam by making the posix cpu timer printks depend on print_fatal_signals" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: posix-timers: Make signal printks conditional
2017-05-27Merge branch 'ras-urgent-for-linus' of ↵Linus Torvalds3-8/+8
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RAS fixes from Thomas Gleixner: "Two fixlets for RAS: - Export memory_error() so the NFIT module can utilize it - Handle memory errors in NFIT correctly" * 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: acpi, nfit: Fix the memory error check in nfit_handle_mce() x86/MCE: Export memory_error()
2017-05-27Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds18-47/+176
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf tooling fixes from Thomas Gleixner: - Synchronization of tools and kernel headers - A series of fixes for perf report addressing various failures: * Handle invalid maps proper * Plug a memory leak * Handle frames and callchain order correctly - Fixes for handling inlines and children mode * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tools/include: Sync kernel ABI headers with tooling headers perf tools: Put caller above callee in --children mode perf report: Do not drop last inlined frame perf report: Always honor callchain order for inlined nodes perf script: Add --inline option for debugging perf report: Fix off-by-one for non-activation frames perf report: Fix memory leak in addr2line when called by addr2inlines perf report: Don't crash on invalid maps in `-g srcline` mode
2017-05-27Merge branch 'locking-urgent-for-linus' of ↵Linus Torvalds1-6/+18
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fix from Thomas Gleixner: "A fix for a state leak which was introduced in the recent rework of futex/rtmutex interaction" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: futex,rt_mutex: Fix rt_mutex_cleanup_proxy_lock()
2017-05-27Merge branch 'core-urgent-for-linus' of ↵Linus Torvalds1-5/+12
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull kthread fix from Thomas Gleixner: "A single fix which prevents a use after free when kthread fork fails" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: kthread: Fix use-after-free if kthread fork fails
2017-05-27Merge tag 'trace-v4.12-rc2' of ↵Linus Torvalds6-9/+47
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull ftrace fixes from Steven Rostedt: "There's been a few memory issues found with ftrace. One was simply a memory leak where not all was being freed that should have been in releasing a file pointer on set_graph_function. Then Thomas found that the ftrace trampolines were marked for read/write as well as execute. To shrink the possible attack surface, he added calls to set them to ro. Which also uncovered some other issues with freeing module allocated memory that had its permissions changed. Kprobes had a similar issue which is fixed and a selftest was added to trigger that issue again" * tag 'trace-v4.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: x86/ftrace: Make sure that ftrace trampolines are not RWX x86/mm/ftrace: Do not bug in early boot on irqs_disabled in cpu_flush_range() selftests/ftrace: Add a testcase for many kprobe events kprobes/x86: Fix to set RWX bits correctly before releasing trampoline ftrace: Fix memory leak in ftrace_graph_release()
2017-05-26x86/ftrace: Make sure that ftrace trampolines are not RWXThomas Gleixner1-6/+14
ftrace use module_alloc() to allocate trampoline pages. The mapping of module_alloc() is RWX, which makes sense as the memory is written to right after allocation. But nothing makes these pages RO after writing to them. Add proper set_memory_rw/ro() calls to protect the trampolines after modification. Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1705251056410.1862@nanos Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2017-05-26x86/mm/ftrace: Do not bug in early boot on irqs_disabled in cpu_flush_range()Steven Rostedt (VMware)1-1/+1
With function tracing starting in early bootup and having its trampoline pages being read only, a bug triggered with the following: kernel BUG at arch/x86/mm/pageattr.c:189! invalid opcode: 0000 [#1] SMP Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 4.12.0-rc2-test+ #3 Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014 task: ffffffffb4222500 task.stack: ffffffffb4200000 RIP: 0010:change_page_attr_set_clr+0x269/0x302 RSP: 0000:ffffffffb4203c88 EFLAGS: 00010046 RAX: 0000000000000046 RBX: 0000000000000000 RCX: 00000001b6000000 RDX: ffffffffb4203d40 RSI: 0000000000000000 RDI: ffffffffb4240d60 RBP: ffffffffb4203d18 R08: 00000001b6000000 R09: 0000000000000001 R10: ffffffffb4203aa8 R11: 0000000000000003 R12: ffffffffc029b000 R13: ffffffffb4203d40 R14: 0000000000000001 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff9a639ea00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff9a636b384000 CR3: 00000001ea21d000 CR4: 00000000000406b0 Call Trace: change_page_attr_clear+0x1f/0x21 set_memory_ro+0x1e/0x20 arch_ftrace_update_trampoline+0x207/0x21c ? ftrace_caller+0x64/0x64 ? 0xffffffffc029b000 ftrace_startup+0xf4/0x198 register_ftrace_function+0x26/0x3c function_trace_init+0x5e/0x73 tracer_init+0x1e/0x23 tracing_set_tracer+0x127/0x15a register_tracer+0x19b/0x1bc init_function_trace+0x90/0x92 early_trace_init+0x236/0x2b3 start_kernel+0x200/0x3f5 x86_64_start_reservations+0x29/0x2b x86_64_start_kernel+0x17c/0x18f secondary_startup_64+0x9f/0x9f ? secondary_startup_64+0x9f/0x9f Interrupts should not be enabled at this early in the boot process. It is also fine to leave interrupts enabled during this time as there's only one CPU running, and on_each_cpu() means to only run on the current CPU. If early_boot_irqs_disabled is set, it is safe to run cpu_flush_range() with interrupts disabled. Don't trigger a BUG_ON() in that case. Link: http://lkml.kernel.org/r/[email protected] Suggested-by: Thomas Gleixner <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2017-05-26selftests/ftrace: Add a testcase for many kprobe eventsMasami Hiramatsu1-0/+21
Add a testcase to test kprobes via ftrace interface with many concurrent kprobe events. This tries to add many kprobe events (up to 256) on kernel functions. To avoid making ftrace-based kprobes (kprobes on fentry), it skips first N bytes (on x86 N=5, on ppc or arm N=4) of function entry. After that, it enables all those events, disable it, and remove it. Since the unoptimization buffer reclaiming will be delayed, after removing events, it will wait enough time. Link: http://lkml.kernel.org/r/149577388470.11702.11832460851769204511.stgit@devbox Signed-off-by: Masami Hiramatsu <[email protected]> Suggested-by: Steven Rostedt <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2017-05-26kprobes/x86: Fix to set RWX bits correctly before releasing trampolineMasami Hiramatsu2-1/+10
Fix kprobes to set(recover) RWX bits correctly on trampoline buffer before releasing it. Releasing readonly page to module_memfree() crash the kernel. Without this fix, if kprobes user register a bunch of kprobes in function body (since kprobes on function entry usually use ftrace) and unregister it, kernel hits a BUG and crash. Link: http://lkml.kernel.org/r/149570868652.3518.14120169373590420503.stgit@devbox Signed-off-by: Masami Hiramatsu <[email protected]> Fixes: d0381c81c2f7 ("kprobes/x86: Set kprobes pages read-only") Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2017-05-26ftrace: Fix memory leak in ftrace_graph_release()Luis Henriques1-1/+1
ftrace_hash is being kfree'ed in ftrace_graph_release(), however the ->buckets field is not. This results in a memory leak that is easily captured by kmemleak: unreferenced object 0xffff880038afe000 (size 8192): comm "trace-cmd", pid 238, jiffies 4294916898 (age 9.736s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff815f561e>] kmemleak_alloc+0x4e/0xb0 [<ffffffff8113964d>] __kmalloc+0x12d/0x1a0 [<ffffffff810bf6d1>] alloc_ftrace_hash+0x51/0x80 [<ffffffff810c0523>] __ftrace_graph_open.isra.39.constprop.46+0xa3/0x100 [<ffffffff810c05e8>] ftrace_graph_open+0x68/0xa0 [<ffffffff8114003d>] do_dentry_open.isra.1+0x1bd/0x2d0 [<ffffffff81140df7>] vfs_open+0x47/0x60 [<ffffffff81150f95>] path_openat+0x2a5/0x1020 [<ffffffff81152d6a>] do_filp_open+0x8a/0xf0 [<ffffffff811411df>] do_sys_open+0x12f/0x200 [<ffffffff811412ce>] SyS_open+0x1e/0x20 [<ffffffff815fa6e0>] entry_SYSCALL_64_fastpath+0x13/0x94 [<ffffffffffffffff>] 0xffffffffffffffff Link: http://lkml.kernel.org/r/[email protected] Cc: [email protected] Fixes: b9b0c831bed2 ("ftrace: Convert graph filter to use hash tables") Signed-off-by: Luis Henriques <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2017-05-26Merge branch 'for-linus' of ↵Linus Torvalds5-17/+20
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input layer fixes from Dmitry Torokhov: "Just a few fixups to a couple of drivers" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: elan_i2c - ignore signals when finishing updating firmware Input: elan_i2c - clear INT before resetting controller Input: atmel_mxt_ts - add T100 as a readable object Input: edt-ft5x06 - increase allowed data range for threshold parameter
2017-05-26Merge tag 'led_fixes_for_4-12-rc3' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds Pull LED fix from Jacek Anaszewski: "A single LED fix for 4.12-rc3. leds-pca955x driver uses only i2c_smbus API and thus it should pass I2C_FUNC_SMBUS_BYTE_DATA flag to i2c_check_functionality" * tag 'led_fixes_for_4-12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds: leds: pca955x: Correct I2C Functionality
2017-05-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds57-254/+701
Pull networking fixes from David Miller: 1) Fix state pruning in bpf verifier wrt. alignment, from Daniel Borkmann. 2) Handle non-linear SKBs properly in SCTP ICMP parsing, from Davide Caratti. 3) Fix bit field definitions for rss_hash_type of descriptors in mlx5 driver, from Jesper Brouer. 4) Defer slave->link updates until bonding is ready to do a full commit to the new settings, from Nithin Sujir. 5) Properly reference count ipv4 FIB metrics to avoid use after free situations, from Eric Dumazet and several others including Cong Wang and Julian Anastasov. 6) Fix races in llc_ui_bind(), from Lin Zhang. 7) Fix regression of ESP UDP encapsulation for TCP packets, from Steffen Klassert. 8) Fix mdio-octeon driver Kconfig deps, from Randy Dunlap. 9) Fix regression in setting DSCP on ipv6/GRE encapsulation, from Peter Dawson. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (43 commits) ipv4: add reference counting to metrics net: ethernet: ax88796: don't call free_irq without request_irq first ip6_tunnel, ip6_gre: fix setting of DSCP on encapsulated packets sctp: fix ICMP processing if skb is non-linear net: llc: add lock_sock in llc_ui_bind to avoid a race condition bonding: Don't update slave->link until ready to commit test_bpf: Add a couple of tests for BPF_JSGE. bpf: add various verifier test cases bpf: fix wrong exposure of map_flags into fdinfo for lpm bpf: add bpf_clone_redirect to bpf_helper_changes_pkt_data bpf: properly reset caller saved regs after helper call and ld_abs/ind bpf: fix incorrect pruning decision when alignment must be tracked arp: fixed -Wuninitialized compiler warning tcp: avoid fastopen API to be used on AF_UNSPEC net: move somaxconn init from sysctl code net: fix potential null pointer dereference geneve: fix fill_info when using collect_metadata virtio-net: enable TSO/checksum offloads for Q-in-Q vlans be2net: Fix offload features for Q-in-Q packets vlan: Fix tcp checksum offloads in Q-in-Q vlans ...
2017-05-26Merge tag 'xfs-4.12-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds6-74/+66
Pull XFS fixes from Darrick Wong: "A few miscellaneous bug fixes & cleanups: - Fix indlen block reservation accounting bug when splitting delalloc extent - Fix warnings about unused variables that appeared in -rc1. - Don't spew errors when bmapping a local format directory - Fix an off-by-one error in a delalloc eof assertion - Make fsmap only return inode information for CAP_SYS_ADMIN - Fix a potential mount time deadlock recovering cow extents - Fix unaligned memory access in _btree_visit_blocks - Fix various SEEK_HOLE/SEEK_DATA bugs" * tag 'xfs-4.12-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: Move handling of missing page into one place in xfs_find_get_desired_pgoff() xfs: Fix off-by-in in loop termination in xfs_find_get_desired_pgoff() xfs: Fix missed holes in SEEK_HOLE implementation xfs: fix off-by-one on max nr_pages in xfs_find_get_desired_pgoff() xfs: fix unaligned access in xfs_btree_visit_blocks xfs: avoid mount-time deadlock in CoW extent recovery xfs: only return detailed fsmap info if the caller has CAP_SYS_ADMIN xfs: bad assertion for delalloc an extent that start at i_size xfs: fix warnings about unused stack variables xfs: BMAPX shouldn't barf on inline-format directories xfs: fix indlen accounting error on partial delalloc conversion
2017-05-26ipv4: add reference counting to metricsEric Dumazet5-23/+45
Andrey Konovalov reported crashes in ipv4_mtu() I could reproduce the issue with KASAN kernels, between 10.246.7.151 and 10.246.7.152 : 1) 20 concurrent netperf -t TCP_RR -H 10.246.7.152 -l 1000 & 2) At the same time run following loop : while : do ip ro add 10.246.7.152 dev eth0 src 10.246.7.151 mtu 1500 ip ro del 10.246.7.152 dev eth0 src 10.246.7.151 mtu 1500 done Cong Wang attempted to add back rt->fi in commit 82486aa6f1b9 ("ipv4: restore rt->fi for reference counting") but this proved to add some issues that were complex to solve. Instead, I suggested to add a refcount to the metrics themselves, being a standalone object (in particular, no reference to other objects) I tried to make this patch as small as possible to ease its backport, instead of being super clean. Note that we believe that only ipv4 dst need to take care of the metric refcount. But if this is wrong, this patch adds the basic infrastructure to extend this to other families. Many thanks to Julian Anastasov for reviewing this patch, and Cong Wang for his efforts on this problem. Fixes: 2860583fe840 ("ipv4: Kill rt->fi") Signed-off-by: Eric Dumazet <[email protected]> Reported-by: Andrey Konovalov <[email protected]> Reviewed-by: Julian Anastasov <[email protected]> Acked-by: Cong Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>