aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2014-06-06tracing: Only calculate stats of tracepoint benchmarks for 2^32 timesSteven Rostedt (Red Hat)1-7/+30
When calculating the average and standard deviation, it is required that the count be less than UINT_MAX, otherwise the do_div() will get undefined results. After 2^32 counts of data, the average and standard deviation should pretty much be set anyway. Signed-off-by: Steven Rostedt <[email protected]>
2014-06-05tracing: Convert stddev into u64 in tracepoint benchmarkSteven Rostedt (Red Hat)1-1/+1
I've been told that do_div() expects an unsigned 64 bit number, and is undefined if a signed is used. This gave a warning on the MIPS build. I'm not sure if a signed 64 bit dividend is really an issue or not, but the calculation this is used for is standard deviation, and that isn't going to be negative. We can just convert it to unsigned and be safe. Reported-by: David Daney <[email protected]> Signed-off-by: Steven Rostedt <[email protected]>
2014-06-05Merge tag 'microblaze-3.16-rc1' of git://git.monstr.eu/linux-2.6-microblaze ↵Linus Torvalds12-144/+9
into next Pull Microblaze updates from Michal Simek: - cleanup PCI and DMA handling - use generic device.h - some cleanups * tag 'microblaze-3.16-rc1' of git://git.monstr.eu/linux-2.6-microblaze: microblaze: Fix typo in head.S s/substract/subtract/ microblaze: remove check for CONFIG_XILINX_CONSOLE microblaze: Use generic device.h microblaze: Do not setup empty unmap_sg function microblaze: Remove device_to_mask microblaze: Clean device dma_ops structure microblaze: Cleanup PCI_DRAM_OFFSET handling microblaze: Do not setup pci_dma_ops microblaze: Return default dma operations microblaze: Enable SERIAL_OF_PLATFORM
2014-06-05Merge branch 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm into nextLinus Torvalds141-1558/+2408
Pull ARM updates from Russell King: - Major clean-up of the L2 cache support code. The existing mess was becoming rather unmaintainable through all the additions that others have done over time. This turns it into a much nicer structure, and implements a few performance improvements as well. - Clean up some of the CP15 control register tweaks for alignment support, moving some code and data into alignment.c - DMA properties for ARM, from Santosh and reviewed by DT people. This adds DT properties to specify bus translations we can't discover automatically, and to indicate whether devices are coherent. - Hibernation support for ARM - Make ftrace work with read-only text in modules - add suspend support for PJ4B CPUs - rework interrupt masking for undefined instruction handling, which allows us to enable interrupts earlier in the handling of these exceptions. - support for big endian page tables - fix stacktrace support to exclude stacktrace functions from the trace, and add save_stack_trace_regs() implementation so that kprobes can record stack traces. - Add support for the Cortex-A17 CPU. - Remove last vestiges of ARM710 support. - Removal of ARM "meminfo" structure, finally converting us solely to memblock to handle the early memory initialisation. * 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: (142 commits) ARM: ensure C page table setup code follows assembly code (part II) ARM: ensure C page table setup code follows assembly code ARM: consolidate last remaining open-coded alignment trap enable ARM: remove global cr_no_alignment ARM: remove CPU_CP15 conditional from alignment.c ARM: remove unused adjust_cr() function ARM: move "noalign" command line option to alignment.c ARM: provide common method to clear bits in CPU control register ARM: 8025/1: Get rid of meminfo ARM: 8060/1: mm: allow sub-architectures to override PCI I/O memory type ARM: 8066/1: correction for ARM patch 8031/2 ARM: 8049/1: ftrace/add save_stack_trace_regs() implementation ARM: 8065/1: remove last use of CONFIG_CPU_ARM710 ARM: 8062/1: Modify ldrt fixup handler to re-execute the userspace instruction ARM: 8047/1: rwsem: use asm-generic rwsem implementation ARM: l2c: trial at enabling some Cortex-A9 optimisations ARM: l2c: add warnings for stuff modifying aux_ctrl register values ARM: l2c: print a warning with L2C-310 caches if the cache size is modified ARM: l2c: remove old .set_debug method ARM: l2c: kill L2X0_AUX_CTRL_MASK before anyone else makes use of this ...
2014-06-05Merge branch 'arm64-efi-for-linus' of ↵Linus Torvalds21-26/+1619
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next Pull ARM64 EFI update from Peter Anvin: "By agreement with the ARM64 EFI maintainers, we have agreed to make -tip the upstream for all EFI patches. That is why this patchset comes from me :) This patchset enables EFI stub support for ARM64, like we already have on x86" * 'arm64-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: arm64: efi: only attempt efi map setup if booting via EFI efi/arm64: ignore dtb= when UEFI SecureBoot is enabled doc: arm64: add description of EFI stub support arm64: efi: add EFI stub doc: arm: add UEFI support documentation arm64: add EFI runtime services efi: Add shared FDT related functions for ARM/ARM64 arm64: Add function to create identity mappings efi: add helper function to get UEFI params from FDT doc: efi-stub.txt updates for ARM lib: add fdt_empty_tree.c
2014-06-05Merge tag 'efi-urgent' into x86/urgentH. Peter Anvin1-0/+3
* Fix earlyprintk=efi,keep support by switching to an ioremap() mapping of the framebuffer when early_ioremap() is no longer available and dropping __init from functions that may be invoked after free_initmem() - Dave Young * We shouldn't be exporting the EFI runtime map in sysfs if not using the new 1:1 EFI mapping code since in that case the mappings are not static across a kexec reboot - Dave Young Signed-off-by: H. Peter Anvin <[email protected]>
2014-06-05Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds2-4/+7
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Two last minute tooling fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf probe: Fix perf probe to find correct variable DIE perf probe: Fix a segfault if asked for variable it doesn't find
2014-06-05Merge branch 'futex-fixes' (futex fixes from Thomas Gleixner)Linus Torvalds1-53/+160
Merge futex fixes from Thomas Gleixner: "So with more awake and less futex wreckaged brain, I went through my list of points again and came up with the following 4 patches. 1) Prevent pi requeueing on the same futex I kept Kees check for uaddr1 == uaddr2 as a early check for private futexes and added a key comparison to both futex_requeue and futex_wait_requeue_pi. Sebastian, sorry for the confusion yesterday night. I really misunderstood your question. You are right the check is pointless for shared futexes where the same physical address is mapped to two different virtual addresses. 2) Sanity check atomic acquisiton in futex_lock_pi_atomic That's basically what Darren suggested. I just simplified it to use futex_top_waiter() to find kernel internal state. If state is found return -EINVAL and do not bother to fix up the user space variable. It's corrupted already. 3) Ensure state consistency in futex_unlock_pi The code is silly versus the owner died bit. There is no point to preserve it on unlock when the user space thread owns the futex. What's worse is that it does not update the user space value when the owner died bit is set. So the kernel itself creates observable inconsistency. Another "optimization" is to retry an atomic unlock. That's pointless as in a sane environment user space would not call into that code if it could have unlocked it atomically. So we always check whether there is kernel state around and only if there is none, we do the unlock by setting the user space value to 0. 4) Sanitize lookup_pi_state lookup_pi_state is ambigous about TID == 0 in the user space value. This can be a valid state even if there is kernel state on this uaddr, but we miss a few corner case checks. I tried to come up with a smaller solution hacking the checks into the current cruft, but it turned out to be ugly as hell and I got more confused than I was before. So I rewrote the sanity checks along the state documentation with awful lots of commentry" * emailed patches from Thomas Gleixner <[email protected]>: futex: Make lookup_pi_state more robust futex: Always cleanup owner tid in unlock_pi futex: Validate atomic acquisition in futex_lock_pi_atomic() futex-prevent-requeue-pi-on-same-futex.patch futex: Forbid uaddr == uaddr2 in futex_requeue(..., requeue_pi=1)
2014-06-05futex: Make lookup_pi_state more robustThomas Gleixner1-28/+106
The current implementation of lookup_pi_state has ambigous handling of the TID value 0 in the user space futex. We can get into the kernel even if the TID value is 0, because either there is a stale waiters bit or the owner died bit is set or we are called from the requeue_pi path or from user space just for fun. The current code avoids an explicit sanity check for pid = 0 in case that kernel internal state (waiters) are found for the user space address. This can lead to state leakage and worse under some circumstances. Handle the cases explicit: Waiter | pi_state | pi->owner | uTID | uODIED | ? [1] NULL | --- | --- | 0 | 0/1 | Valid [2] NULL | --- | --- | >0 | 0/1 | Valid [3] Found | NULL | -- | Any | 0/1 | Invalid [4] Found | Found | NULL | 0 | 1 | Valid [5] Found | Found | NULL | >0 | 1 | Invalid [6] Found | Found | task | 0 | 1 | Valid [7] Found | Found | NULL | Any | 0 | Invalid [8] Found | Found | task | ==taskTID | 0/1 | Valid [9] Found | Found | task | 0 | 0 | Invalid [10] Found | Found | task | !=taskTID | 0/1 | Invalid [1] Indicates that the kernel can acquire the futex atomically. We came came here due to a stale FUTEX_WAITERS/FUTEX_OWNER_DIED bit. [2] Valid, if TID does not belong to a kernel thread. If no matching thread is found then it indicates that the owner TID has died. [3] Invalid. The waiter is queued on a non PI futex [4] Valid state after exit_robust_list(), which sets the user space value to FUTEX_WAITERS | FUTEX_OWNER_DIED. [5] The user space value got manipulated between exit_robust_list() and exit_pi_state_list() [6] Valid state after exit_pi_state_list() which sets the new owner in the pi_state but cannot access the user space value. [7] pi_state->owner can only be NULL when the OWNER_DIED bit is set. [8] Owner and user space value match [9] There is no transient state which sets the user space TID to 0 except exit_robust_list(), but this is indicated by the FUTEX_OWNER_DIED bit. See [4] [10] There is no transient state which leaves owner and user space TID out of sync. Signed-off-by: Thomas Gleixner <[email protected]> Cc: Kees Cook <[email protected]> Cc: Will Drewry <[email protected]> Cc: Darren Hart <[email protected]> Cc: [email protected] Signed-off-by: Linus Torvalds <[email protected]>
2014-06-05futex: Always cleanup owner tid in unlock_piThomas Gleixner1-22/+18
If the owner died bit is set at futex_unlock_pi, we currently do not cleanup the user space futex. So the owner TID of the current owner (the unlocker) persists. That's observable inconsistant state, especially when the ownership of the pi state got transferred. Clean it up unconditionally. Signed-off-by: Thomas Gleixner <[email protected]> Cc: Kees Cook <[email protected]> Cc: Will Drewry <[email protected]> Cc: Darren Hart <[email protected]> Cc: [email protected] Signed-off-by: Linus Torvalds <[email protected]>
2014-06-05futex: Validate atomic acquisition in futex_lock_pi_atomic()Thomas Gleixner1-3/+11
We need to protect the atomic acquisition in the kernel against rogue user space which sets the user space futex to 0, so the kernel side acquisition succeeds while there is existing state in the kernel associated to the real owner. Verify whether the futex has waiters associated with kernel state. If it has, return -EINVAL. The state is corrupted already, so no point in cleaning it up. Subsequent calls will fail as well. Not our problem. [ tglx: Use futex_top_waiter() and explain why we do not need to try restoring the already corrupted user space state. ] Signed-off-by: Darren Hart <[email protected]> Cc: Kees Cook <[email protected]> Cc: Will Drewry <[email protected]> Cc: [email protected] Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-06-05futex-prevent-requeue-pi-on-same-futex.patch futex: Forbid uaddr == uaddr2 ↵Thomas Gleixner1-0/+25
in futex_requeue(..., requeue_pi=1) If uaddr == uaddr2, then we have broken the rule of only requeueing from a non-pi futex to a pi futex with this call. If we attempt this, then dangling pointers may be left for rt_waiter resulting in an exploitable condition. This change brings futex_requeue() in line with futex_wait_requeue_pi() which performs the same check as per commit 6f7b0a2a5c0f ("futex: Forbid uaddr == uaddr2 in futex_wait_requeue_pi()") [ tglx: Compare the resulting keys as well, as uaddrs might be different depending on the mapping ] Fixes CVE-2014-3153. Reported-by: Pinkie Pie Signed-off-by: Will Drewry <[email protected]> Signed-off-by: Kees Cook <[email protected]> Cc: [email protected] Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Darren Hart <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-06-05IB/core: Fix kobject leak on device register error flowHaggai Eran1-26/+26
The ports kobject isn't being released during error flow in device registration. This patch refactors the ports kobject cleanup into a single function called from both the error flow in device registration and from the unregistration function. A couple of attributes aren't being deleted (iw_stats_group, and ib_class_attributes). While this may be handled implicitly by the destruction of their kobjects, it seems better to handle all the attributes the same way. Signed-off-by: Haggai Eran <[email protected]> [ Make free_port_list_attributes() static. - Roland ] Signed-off-by: Roland Dreier <[email protected]>
2014-06-05tracing: Introduce saved_cmdlines_size fileYoshihiro YUNOMAE1-24/+154
Introduce saved_cmdlines_size file for changing the number of saved pid-comms. saved_cmdlines currently stores 128 command names using SAVED_CMDLINES, but 'no-existing processes' names are often lost in saved_cmdlines when we read the trace data. So, by introducing saved_cmdlines_size file, we can now change the 128 command names saved to something much larger if needed. When we write a value to saved_cmdlines_size, the number of the value will be stored in pid-comm list: # echo 1024 > /sys/kernel/debug/tracing/saved_cmdlines_size Here, 1024 command names can be stored. The default number is 128 and the maximum number is PID_MAX_DEFAULT (=32768 if CONFIG_BASE_SMALL is not set). So, if we want to avoid losing any command names, we need to set 32768 to saved_cmdlines_size. We can read the maximum number of the list: # cat /sys/kernel/debug/tracing/saved_cmdlines_size 128 Link: http://lkml.kernel.org/p/20140605012427.22115.16173.stgit@yunodevel Signed-off-by: Yoshihiro YUNOMAE <[email protected]> Signed-off-by: Steven Rostedt <[email protected]>
2014-06-05RDMA/cxgb4: add missing padding at end of struct c4iw_alloc_ucontext_respYann Droneaud2-2/+4
The i386 ABI disagrees with most other ABIs regarding alignment of data types larger than 4 bytes: on most ABIs a padding must be added at end of the structures, while it is not required on i386. So for most ABI struct c4iw_alloc_ucontext_resp gets implicitly padded to be aligned on a 8 bytes multiple, while for i386, such padding is not added. The tool pahole can be used to find such implicit padding: $ pahole --anon_include \ --nested_anon_include \ --recursive \ --class_name c4iw_alloc_ucontext_resp \ drivers/infiniband/hw/cxgb4/iw_cxgb4.o Then, structure layout can be compared between i386 and x86_64: +++ obj-i386/drivers/infiniband/hw/cxgb4/iw_cxgb4.o.pahole.txt 2014-03-28 11:43:05.547432195 +0100 --- obj-x86_64/drivers/infiniband/hw/cxgb4/iw_cxgb4.o.pahole.txt 2014-03-28 10:55:10.990133017 +0100 @@ -2,9 +2,8 @@ struct c4iw_alloc_ucontext_resp { __u64 status_page_key; /* 0 8 */ __u32 status_page_size; /* 8 4 */ - /* size: 12, cachelines: 1, members: 2 */ - /* last cacheline: 12 bytes */ + /* size: 16, cachelines: 1, members: 2 */ + /* padding: 4 */ + /* last cacheline: 16 bytes */ }; This ABI disagreement will make an x86_64 kernel try to write past the buffer provided by an i386 binary. When boundary check will be implemented, the x86_64 kernel will refuse to write past the i386 userspace provided buffer and the uverbs will fail. If the structure is on a page boundary and the next page is not mapped, ib_copy_to_udata() will fail and the uverb will fail. Additionally, as reported by Dan Carpenter, without the implicit padding being properly cleared, an information leak would take place in most architectures. This patch adds an explicit padding to struct c4iw_alloc_ucontext_resp, and, like 92b0ca7cb149 ("IB/mlx5: Fix stack info leak in mlx5_ib_alloc_ucontext()"), makes function c4iw_alloc_ucontext() not writting this padding field to userspace. This way, x86_64 kernel will be able to write struct c4iw_alloc_ucontext_resp as expected by unpatched and patched i386 libcxgb4. Link: http://marc.info/[email protected] Link: http://marc.info/[email protected] Link: http://marc.info/?i=20140328082428.GH25192@mwanda Cc: <[email protected]> Fixes: 05eb23893c2c ("cxgb4/iw_cxgb4: Doorbell Drop Avoidance Bug Fixes") Reported-by: Yann Droneaud <[email protected]> Reported-by: Dan Carpenter <[email protected]> Signed-off-by: Yann Droneaud <[email protected]> Acked-by: Steve Wise <[email protected]> Signed-off-by: Roland Dreier <[email protected]>
2014-06-05Merge branch 'x86-efi-for-linus' of ↵Linus Torvalds11-263/+361
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next Pull x86 EFI updates from Peter Anvin: "A collection of EFI changes. The perhaps most important one is to fully save and restore the FPU state around each invocation of EFI runtime, and to not choke on non-ASCII characters in the boot stub" * 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efivars: Add compatibility code for compat tasks efivars: Refactor sanity checking code into separate function efivars: Stop passing a struct argument to efivar_validate() efivars: Check size of user object efivars: Use local variables instead of a pointer dereference x86/efi: Save and restore FPU context around efi_calls (i386) x86/efi: Save and restore FPU context around efi_calls (x86_64) x86/efi: Implement a __efi_call_virt macro x86, fpu: Extend the use of static_cpu_has_safe x86/efi: Delete most of the efi_call* macros efi: x86: Handle arbitrary Unicode characters efi: Add get_dram_base() helper function efi: Add shared printk wrapper for consistent prefixing efi: create memory map iteration helper efi: efi-stub-helper cleanup
2014-06-05Merge branch 'x86/vdso' of ↵Linus Torvalds40-608/+794
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next Pull x86 cdso updates from Peter Anvin: "Vdso cleanups and improvements largely from Andy Lutomirski. This makes the vdso a lot less ''special''" * 'x86/vdso' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/vdso, build: Make LE access macros clearer, host-safe x86/vdso, build: Fix cross-compilation from big-endian architectures x86/vdso, build: When vdso2c fails, unlink the output x86, vdso: Fix an OOPS accessing the HPET mapping w/o an HPET x86, mm: Replace arch_vma_name with vm_ops->name for vsyscalls x86, mm: Improve _install_special_mapping and fix x86 vdso naming mm, fs: Add vm_ops->name as an alternative to arch_vma_name x86, vdso: Fix an OOPS accessing the HPET mapping w/o an HPET x86, vdso: Remove vestiges of VDSO_PRELINK and some outdated comments x86, vdso: Move the vvar and hpet mappings next to the 64-bit vDSO x86, vdso: Move the 32-bit vdso special pages after the text x86, vdso: Reimplement vdso.so preparation in build-time C x86, vdso: Move syscall and sysenter setup into kernel/cpu/common.c x86, vdso: Clean up 32-bit vs 64-bit vdso params x86, mm: Ensure correct alignment of the fixmap
2014-06-05Merge branch 'x86/espfix' of ↵Linus Torvalds14-41/+387
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next Pull x86-64 espfix changes from Peter Anvin: "This is the espfix64 code, which fixes the IRET information leak as well as the associated functionality problem. With this code applied, 16-bit stack segments finally work as intended even on a 64-bit kernel. Consequently, this patchset also removes the runtime option that we added as an interim measure. To help the people working on Linux kernels for very small systems, this patchset also makes these compile-time configurable features" * 'x86/espfix' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Revert "x86-64, modify_ldt: Make support for 16-bit segments a runtime option" x86, espfix: Make it possible to disable 16-bit support x86, espfix: Make espfix64 a Kconfig option, fix UML x86, espfix: Fix broken header guard x86, espfix: Move espfix definitions into a separate header file x86-32, espfix: Remove filter for espfix32 due to race x86-64, espfix: Don't leak bits 31:16 of %esp returning to 16-bit stack
2014-06-05Merge branch 'x86-x32-for-linus' of ↵Linus Torvalds1-2/+4
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next Pull x86 x32 ABI fix from Peter Anvin: "A single fix for the x32 ABI: the io_setup() and io_submit() system call need to use the compat stubs" * 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, x32: Use compat shims for io_{setup,submit}
2014-06-05x86/smpboot: Initialize secondary CPU only if master CPU will wait for itIgor Mammedov2-79/+47
Hang is observed on virtual machines during CPU hotplug, especially in big guests with many CPUs. (It reproducible more often if host is over-committed). It happens because master CPU gives up waiting on secondary CPU and allows it to run wild. As result AP causes locking or crashing system. For example as described here: https://lkml.org/lkml/2014/3/6/257 If master CPU have sent STARTUP IPI successfully, and AP signalled to master CPU that it's ready to start initialization, make master CPU wait indefinitely till AP is onlined. To ensure that AP won't ever run wild, make it wait at early startup till master CPU confirms its intention to wait for AP. If AP doesn't respond in 10 seconds, the master CPU will timeout and cancel AP onlining. Signed-off-by: Igor Mammedov <[email protected]> Acked-by: Toshi Kani <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2014-06-05x86/smpboot: Log error on secondary CPU wakeup failure at ERR levelIgor Mammedov1-1/+1
If system is running without debug level logging, it will not log error if do_boot_cpu() failed to wakeup AP. It may lead to silent AP bringup failures at boot time. Change message level to KERN_ERR to make error visible to user as it's done on other architectures. Signed-off-by: Igor Mammedov <[email protected]> Acked-by: Toshi Kani <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2014-06-05x86: Fix list/memory corruption on CPU hotplugIgor Mammedov1-3/+0
currently if AP wake up is failed, master CPU marks AP as not present in do_boot_cpu() by calling set_cpu_present(cpu, false). That leads to following list corruption on the next physical CPU hotplug: [ 418.107336] WARNING: CPU: 1 PID: 45 at lib/list_debug.c:33 __list_add+0xbe/0xd0() [ 418.115268] list_add corruption. prev->next should be next (ffff88003dc57600), but was ffff88003e20c3a0. (prev=ffff88003e20c3a0). [ 418.123693] Modules linked in: nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT ipt_REJECT cfg80211 xt_conntrack rfkill ee [ 418.138979] CPU: 1 PID: 45 Comm: kworker/u10:1 Not tainted 3.14.0-rc6+ #387 [ 418.149989] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007 [ 418.165750] Workqueue: kacpi_hotplug acpi_hotplug_work_fn [ 418.166433] 0000000000000021 ffff880038ca7988 ffffffff8159b22d 0000000000000021 [ 418.176460] ffff880038ca79d8 ffff880038ca79c8 ffffffff8106942c ffff880038ca79e8 [ 418.177453] ffff88003e20c3a0 ffff88003dc57600 ffff88003e20c3a0 00000000ffffffea [ 418.178445] Call Trace: [ 418.185811] [<ffffffff8159b22d>] dump_stack+0x49/0x5c [ 418.186440] [<ffffffff8106942c>] warn_slowpath_common+0x8c/0xc0 [ 418.187192] [<ffffffff81069516>] warn_slowpath_fmt+0x46/0x50 [ 418.191231] [<ffffffff8136ef51>] ? acpi_ns_get_node+0xb7/0xc7 [ 418.193889] [<ffffffff812f796e>] __list_add+0xbe/0xd0 [ 418.196649] [<ffffffff812e2aa9>] kobject_add_internal+0x79/0x200 [ 418.208610] [<ffffffff812e2e18>] kobject_add_varg+0x38/0x60 [ 418.213831] [<ffffffff812e2ef4>] kobject_add+0x44/0x70 [ 418.229961] [<ffffffff813e2c60>] device_add+0xd0/0x550 [ 418.234991] [<ffffffff813f0e95>] ? pm_runtime_init+0xe5/0xf0 [ 418.250226] [<ffffffff813e32be>] device_register+0x1e/0x30 [ 418.255296] [<ffffffff813e82a3>] register_cpu+0xe3/0x130 [ 418.266539] [<ffffffff81592be5>] arch_register_cpu+0x65/0x150 [ 418.285845] [<ffffffff81355c0d>] acpi_processor_hotadd_init+0x5a/0x9b ... Which is caused by the fact that generic_processor_info() allocates logical CPU id by calling: cpu = cpumask_next_zero(-1, cpu_present_mask); which returns id of previously failed to wake up CPU, since its bit is cleared by do_boot_cpu() and as result register_cpu() tries to register another CPU with the same id as already present but failed to be onlined CPU. Taking in account that AP will not do anything if master CPU failed to wake it up, there is no reason to mark that AP as not present and break next cpu hotplug attempts. As a side effect of not marking AP as not present, user would be allowed to online it again later. Also fix memory corruption in acpi_unmap_lsapic() if during CPU hotplug master CPU failed to wake up AP it set percpu x86_cpu_to_apicid to BAD_APICID=0xFFFF for AP. However following attempt to unplug that CPU will lead to out of bound write access to __apicid_to_node[] which is 32768 items long on x86_64 kernel. So with above fix of cpu_present_mask make sure that a present CPU has a valid APIC ID by not setting x86_cpu_to_apicid to BAD_APICID in do_boot_cpu() on failure and allow acpi_processor_remove()->acpi_unmap_lsapic() cleanly remove CPU. Signed-off-by: Igor Mammedov <[email protected]> Acked-by: Toshi Kani <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2014-06-05cgroup: disallow disabled controllers on the default hierarchyLi Zefan1-5/+9
After booting with cgroup_disable=memory, I still saw memcg files in the default hierarchy, and I can write to them, though it won't take effect. # dmesg ... Disabling memory control group subsystem ... # mount -t cgroup -o __DEVEL__sane_behavior xxx /cgroup # ls /cgroup ... memory.failcnt memory.move_charge_at_immigrate memory.force_empty memory.numa_stat memory.limit_in_bytes memory.oom_control ... # cat /cgroup/memory.usage_in_bytes 0 tj: Minor comment update. Signed-off-by: Li Zefan <[email protected]> Signed-off-by: Tejun Heo <[email protected]>
2014-06-05i2c: pca954x: Fix compilation without CONFIG_GPIOLIBLaurent Pinchart1-2/+1
The pca954x driver recently switched to the GPIO descriptor API without including the correct <linux/gpio/consumer.h> header. This breaks compilation without CONFIG_GPIOLIB. drivers/i2c/muxes/i2c-mux-pca954x.c: In function ‘pca954x_probe’: drivers/i2c/muxes/i2c-mux-pca954x.c:204:2: error: implicit declaration of function ‘devm_gpiod_get’ [-Werror=implicit-function-declaration] gpio = devm_gpiod_get(&client->dev, "reset"); ^ drivers/i2c/muxes/i2c-mux-pca954x.c:204:7: warning: assignment makes pointer from integer without a cast [enabled by default] gpio = devm_gpiod_get(&client->dev, "reset"); ^ drivers/i2c/muxes/i2c-mux-pca954x.c:206:3: error: implicit declaration of function ‘gpiod_direction_output’ [-Werror=implicit-function-declaration] gpiod_direction_output(gpio, 0); ^ cc1: some warnings being treated as errors make[3]: *** [drivers/i2c/muxes/i2c-mux-pca954x.o] Error 1 Fix it by including the right header. Reported-by: Jim Davis <[email protected]> Signed-off-by: Laurent Pinchart <[email protected]> Signed-off-by: Wolfram Sang <[email protected]>
2014-06-05Merge branch 'devel-stable' into for-nextRussell King7-9/+218
2014-06-05Merge branches 'alignment', 'fixes', 'l2c' (early part) and 'misc' into for-nextRussell King135-1471/+2098
2014-06-05microblaze: Fix typo in head.S s/substract/subtract/Antonio Ospite1-1/+1
Signed-off-by: Antonio Ospite <[email protected]> Cc: Michal Simek <[email protected]> Cc: "Edgar E. Iglesias" <[email protected]> Signed-off-by: Michal Simek <[email protected]>
2014-06-05sched/fair: Fix tg_set_cfs_bandwidth() deadlock on rq->lockRoman Gushchin3-7/+6
tg_set_cfs_bandwidth() sets cfs_b->timer_active to 0 to force the period timer restart. It's not safe, because can lead to deadlock, described in commit 927b54fccbf0: "__start_cfs_bandwidth calls hrtimer_cancel while holding rq->lock, waiting for the hrtimer to finish. However, if sched_cfs_period_timer runs for another loop iteration, the hrtimer can attempt to take rq->lock, resulting in deadlock." Three CPUs must be involved: CPU0 CPU1 CPU2 take rq->lock period timer fired ... take cfs_b lock ... ... tg_set_cfs_bandwidth() throttle_cfs_rq() release cfs_b lock take cfs_b lock ... distribute_cfs_runtime() timer_active = 0 take cfs_b->lock wait for rq->lock ... __start_cfs_bandwidth() {wait for timer callback break if timer_active == 1} So, CPU0 and CPU1 are deadlocked. Instead of resetting cfs_b->timer_active, tg_set_cfs_bandwidth can wait for period timer callbacks (ignoring cfs_b->timer_active) and restart the timer explicitly. Signed-off-by: Roman Gushchin <[email protected]> Reviewed-by: Ben Segall <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/87wqdi9g8e.wl\%[email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: Linus Torvalds <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2014-06-05sched/dl: Fix race in dl_task_timer()Kirill Tkhai1-1/+9
Throttled task is still on rq, and it may be moved to other cpu if user is playing with sched_setaffinity(). Therefore, unlocked task_rq() access makes the race. Juri Lelli reports he got this race when dl_bandwidth_enabled() was not set. Other thing, pointed by Peter Zijlstra: "Now I suppose the problem can still actually happen when you change the root domain and trigger a effective affinity change that way". To fix that we do the same as made in __task_rq_lock(). We do not use __task_rq_lock() itself, because it has a useful lockdep check, which is not correct in case of dl_task_timer(). We do not need pi_lock locked here. This case is an exception (PeterZ): "The only reason we don't strictly need ->pi_lock now is because we're guaranteed to have p->state == TASK_RUNNING here and are thus free of ttwu races". Signed-off-by: Kirill Tkhai <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Cc: <[email protected]> # v3.14+ Cc: Linus Torvalds <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2014-06-05sched: Fix sched_policy < 0 comparisonRichard Weinberger1-1/+1
attr.sched_policy is u32, therefore a comparison against < 0 is never true. Fix this by casting sched_policy to int. This issue was reported by coverity CID 1219934. Fixes: dbdb22754fde ("sched: Disallow sched_attr::sched_policy < 0") Signed-off-by: Richard Weinberger <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Cc: Michael Kerrisk <[email protected]> Cc: Linus Torvalds <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2014-06-05sched/numa: Fix use of spin_{un}lock_irq() when interrupts are disabledSteven Rostedt1-3/+4
As Peter Zijlstra told me, we have the following path: do_exit() exit_itimers() itimer_delete() spin_lock_irqsave(&timer->it_lock, &flags); timer_delete_hook(timer); kc->timer_del(timer) := posix_cpu_timer_del() put_task_struct() __put_task_struct() task_numa_free() spin_lock(&grp->lock); Which means that task_numa_free() can be called with interrupts disabled, which means that we should not be using spin_lock_irq() but spin_lock_irqsave() instead. Otherwise we are enabling interrupts while holding an interrupt unsafe lock! Signed-off-by: Steven Rostedt <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner<[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Linus Torvalds <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2014-06-05Merge tag 'perf-urgent-for-mingo' of ↵Ingo Molnar2-4/+7
git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf into perf/urgent Pull perf/urgent fixes from Jiri Olsa: * Fix perf probe to find correct variable DIE (Masami Hiramatsu) * Fix a segfault in perf probe if asked for variable it doesn't find (Masami Hiramatsu) Signed-off-by: Jiri Olsa <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2014-06-04cgroup: don't destroy the default rootLi Zefan1-1/+4
The default root is allocated and initialized at boot phase, so we shouldn't destroy the default root when it's umounted, otherwise it will lead to disaster. Just try mount and then umount the default root, and the kernel will crash immediately. v2: - No need to check for CSS_NO_REF in cgroup_get/put(). (Tejun) - Better call cgroup_put() for the default root in kill_sb(). (Tejun) - Add a comment. Signed-off-by: Li Zefan <[email protected]> Signed-off-by: Tejun Heo <[email protected]>
2014-06-05drivers: sh: Enable PM runtime for new R-Car Gen2 SoCsGeert Uytterhoeven1-0/+3
The PM runtime code should also be enabled for: - r8a7792 (R-Car V2H) - r8a7793 (R-Car M2-N) - r8a7794 (R-Car E2) Signed-off-by: Geert Uytterhoeven <[email protected]> Signed-off-by: Simon Horman <[email protected]>
2014-06-04Merge branch 'akpm' (patchbomb from Andrew) into nextLinus Torvalds279-3514/+4712
Merge misc updates from Andrew Morton: - a few fixes for 3.16. Cc'ed to stable so they'll get there somehow. - various misc fixes and cleanups - most of the ocfs2 queue. Review is slow... - most of MM. The MM queue is pretty huge this time, but not much in the way of feature work. - some tweaks under kernel/ - printk maintenance work - updates to lib/ - checkpatch updates - tweaks to init/ * emailed patches from Andrew Morton <[email protected]>: (276 commits) fs/autofs4/dev-ioctl.c: add __init to autofs_dev_ioctl_init fs/ncpfs/getopt.c: replace simple_strtoul by kstrtoul init/main.c: remove an ifdef kthreads: kill CLONE_KERNEL, change kernel_thread(kernel_init) to avoid CLONE_SIGHAND init/main.c: add initcall_blacklist kernel parameter init/main.c: don't use pr_debug() fs/binfmt_flat.c: make old_reloc() static fs/binfmt_elf.c: fix bool assignements fs/efs: convert printk(KERN_DEBUG to pr_debug fs/efs: add pr_fmt / use __func__ fs/efs: convert printk to pr_foo() scripts/checkpatch.pl: device_initcall is not the only __initcall substitute checkpatch: check stable email address checkpatch: warn on unnecessary void function return statements checkpatch: prefer kstrto<foo> to sscanf(buf, "%<lhuidx>", &bar); checkpatch: add warning for kmalloc/kzalloc with multiply checkpatch: warn on #defines ending in semicolon checkpatch: make --strict a default for files in drivers/net and net/ checkpatch: always warn on missing blank line after variable declaration block checkpatch: fix wildcard DT compatible string checking ...
2014-06-04fs/autofs4/dev-ioctl.c: add __init to autofs_dev_ioctl_initFabian Frederick1-1/+1
autofs_dev_ioctl_init is only called by __init init_autofs4_fs Signed-off-by: Fabian Frederick <[email protected]> Acked-by: Ian Kent <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-06-04fs/ncpfs/getopt.c: replace simple_strtoul by kstrtoulFabian Frederick1-7/+6
Remove obsolete simple_strtoul in ncp_getopt Signed-off-by: Fabian Frederick <[email protected]> Cc: Petr Vandrovec <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-06-04init/main.c: remove an ifdefAndrew Morton2-2/+4
Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-06-04kthreads: kill CLONE_KERNEL, change kernel_thread(kernel_init) to avoid ↵Oleg Nesterov2-7/+1
CLONE_SIGHAND 1. Remove CLONE_KERNEL, it has no users and it is dangerous. The (old) comment says "List of flags we want to share for kernel threads" but this is not true, we do not want to share ->sighand by default. This flag can only be used if the caller is sure that both parent/child will never play with signals (say, allow_signal/etc). 2. Change rest_init() to clone kernel_init() without CLONE_SIGHAND. In this case CLONE_SIGHAND does not really hurt, and it looks like optimization because copy_sighand() can avoid kmem_cache_alloc(). But in fact this only adds the minor pessimization. kernel_init() is going to exec the init process, and de_thread() will need to unshare ->sighand and do kmem_cache_alloc(sighand_cachep) anyway, but it needs to do more work and take tasklist_lock and siglock. Signed-off-by: Oleg Nesterov <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Acked-by: Steven Rostedt <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-06-04init/main.c: add initcall_blacklist kernel parameterPrarit Bhargava2-0/+72
When a module is built into the kernel the module_init() function becomes an initcall. Sometimes debugging through dynamic debug can help, however, debugging built in kernel modules is typically done by changing the .config, recompiling, and booting the new kernel in an effort to determine exactly which module caused a problem. This patchset can be useful stand-alone or combined with initcall_debug. There are cases where some initcalls can hang the machine before the console can be flushed, which can make initcall_debug output inaccurate. Having the ability to skip initcalls can help further debugging of these scenarios. Usage: initcall_blacklist=<list of comma separated initcalls> ex) added "initcall_blacklist=sgi_uv_sysfs_init" as a kernel parameter and the log contains: blacklisting initcall sgi_uv_sysfs_init ... ... initcall sgi_uv_sysfs_init blacklisted ex) added "initcall_blacklist=foo_bar,sgi_uv_sysfs_init" as a kernel parameter and the log contains: blacklisting initcall foo_bar blacklisting initcall sgi_uv_sysfs_init ... ... initcall sgi_uv_sysfs_init blacklisted [[email protected]: tweak printk text] Signed-off-by: Prarit Bhargava <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Josh Boyer <[email protected]> Cc: Rob Landley <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Frederic Weisbecker <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-06-04init/main.c: don't use pr_debug()Andrew Morton1-2/+2
Pertially revert commit ea676e846a81 ("init/main.c: convert to pr_foo()"). Unbeknownst to me, pr_debug() is different from the other pr_foo() levels: pr_debug() is a no-op when DEBUG is not defined. Happily, init/main.c does have a #define DEBUG so we didn't break initcall_debug. But the functioning of initcall_debug should not be dependent upon the presence of that #define DEBUG. Reported-by: Russell King <[email protected]> Cc: Joe Perches <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-06-04fs/binfmt_flat.c: make old_reloc() staticAxel Lin1-1/+1
old_reloc() is only used in this file, make it static. Signed-off-by: Axel Lin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-06-04fs/binfmt_elf.c: fix bool assignementsFabian Frederick1-2/+2
Fix coccinelle warnings. Signed-off-by: Fabian Frederick <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-06-04fs/efs: convert printk(KERN_DEBUG to pr_debugFabian Frederick3-26/+17
All KERN_DEBUG callsites being under #ifdef DEBUG we can safely convert everything to pr_debug without changing current behaviour. Remove #ifdef DEBUG around pr_debugs only (suggested by Joe Perches) Signed-off-by: Fabian Frederick <[email protected]> Cc: Joe Perches <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-06-04fs/efs: add pr_fmt / use __func__Fabian Frederick6-35/+49
Also uniformize function arguments. Signed-off-by: Fabian Frederick <[email protected]> Cc: Joe Perches <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-06-04fs/efs: convert printk to pr_foo()Fabian Frederick5-35/+32
Convert all except KERN_DEBUG (pr_debug doesn't work the same as printk(KERN_DEBUG and requires special check) Signed-off-by: Fabian Frederick <[email protected]> Cc: Joe Perches <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-06-04scripts/checkpatch.pl: device_initcall is not the only __initcall substituteFabian Frederick1-2/+2
This patch adds a link to init.h to find appropriate initcall function to replace obsolete __initcall Signed-off-by: Fabian Frederick <[email protected]> Cc: Andy Whitcroft <[email protected]> Cc: Joe Perches <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-06-04checkpatch: check stable email addressJoe Perches1-0/+6
It should be [email protected], not [email protected]. Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-06-04checkpatch: warn on unnecessary void function return statementsJoe Perches1-0/+7
void function lines that use a single tab then "return;" are generally unnecessary. Signed-off-by: Joe Perches <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-06-04checkpatch: prefer kstrto<foo> to sscanf(buf, "%<lhuidx>", &bar);Joe Perches1-0/+21
Use the kstrto<foo> functions in preference to sscanf. Signed-off-by: Joe Perches <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>