aboutsummaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)AuthorFilesLines
2014-07-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller21-119/+253
Conflicts: drivers/infiniband/hw/cxgb4/device.c The cxgb4 conflict was simply overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-21tracing: Fix wraparound problems in "uptime" trace clockTony Luck2-5/+6
The "uptime" trace clock added in: commit 8aacf017b065a805d27467843490c976835eb4a5 tracing: Add "uptime" trace clock that uses jiffies has wraparound problems when the system has been up more than 1 hour 11 minutes and 34 seconds. It converts jiffies to nanoseconds using: (u64)jiffies_to_usecs(jiffy) * 1000ULL but since jiffies_to_usecs() only returns a 32-bit value, it truncates at 2^32 microseconds. An additional problem on 32-bit systems is that the argument is "unsigned long", so fixing the return value only helps until 2^32 jiffies (49.7 days on a HZ=1000 system). Avoid these problems by using jiffies_64 as our basis, and not converting to nanoseconds (we do convert to clock_t because user facing API must not be dependent on internal kernel HZ values). Link: http://lkml.kernel.org/p/99d63c5bfe9b320a3b428d773825a37095bf6a51.1405708254.git.tony.luck@intel.com Cc: stable@vger.kernel.org # 3.10+ Fixes: 8aacf017b065 "tracing: Add "uptime" trace clock that uses jiffies" Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-07-19Merge branch 'locking-urgent-for-linus' of ↵Linus Torvalds7-45/+85
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Thomas Gleixner: "The locking department delivers: - A rather large and intrusive bundle of fixes to address serious performance regressions introduced by the new rwsem / mcs technology. Simpler solutions have been discussed, but they would have been ugly bandaids with more risk than doing the right thing. - Make the rwsem spin on owner technology opt-in for architectures and enable it only on the known to work ones. - A few fixes to the lockdep userspace library" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/rwsem: Add CONFIG_RWSEM_SPIN_ON_OWNER locking/mutex: Disable optimistic spinning on some architectures locking/rwsem: Reduce the size of struct rw_semaphore locking/rwsem: Rename 'activity' to 'count' locking/spinlocks/mcs: Micro-optimize osq_unlock() locking/spinlocks/mcs: Introduce and use init macro and function for osq locks locking/spinlocks/mcs: Convert osq lock to atomic_t to reduce overhead locking/spinlocks/mcs: Rename optimistic_spin_queue() to optimistic_spin_node() locking/rwsem: Allow conservative optimistic spinning when readers have lock tools/liblockdep: Account for bitfield changes in lockdeps lock_acquire tools/liblockdep: Remove debug print left over from development tools/liblockdep: Fix comparison of a boolean value with a value of 2
2014-07-19Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fix from Thomas Gleixner: "Prevent a possible divide by zero in the debugging code" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Fix possible divide by zero in avg_atom() calculation
2014-07-19Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds1-2/+18
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Thomas Gleixner: "A single fix for a long standing issue in the alarm timer subsystem, which was noticed recently when people finally started to use alarm timers for serious work" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: alarmtimer: Fix bug where relative alarm timers were treated as absolute
2014-07-19Merge branch 'core-urgent-for-linus' of ↵Linus Torvalds5-56/+121
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU fixes from Thomas Gleixner: "Two RCU patches: - Address a serious performance regression on open/close caused by commit ac1bea85781e ("Make cond_resched() report RCU quiescent states") - Export RCU debug functions. Not a regression, but enablement to address a serious recursion bug in the sl*b allocators in 3.17" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: rcu: Reduce overhead of cond_resched() checks for RCU rcu: Export debug_init_rcu_head() and and debug_init_rcu_head()
2014-07-19Merge tag 'irqchip-core-3.17-3' of ↵Thomas Gleixner1-2/+3
git://git.infradead.org/users/jcooper/linux into irq/core irqchip core changes for v3.17 (round #3) from Jason Cooper * gic: Add GICv3 driver * atmel: Move atmel aic driver from arch code to irqchip/
2014-07-18Merge tag 'pm+acpi-3.16-rc6' of ↵Linus Torvalds2-2/+3
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI and power management fixes from Rafael Wysocki: "These are a few recent regression fixes, a revert of the ACPI video commit I promised, a system resume fix related to request_firmware(), an ACPI video quirk for one more Win8-oriented BIOS, an ACPI device enumeration documentation update and a few fixes for ARM cpufreq drivers. Specifics: - Fix for a recently introduced NULL pointer dereference in the core system suspend code occuring when platforms without ACPI attempt to use the "freeze" sleep state from Zhang Rui. - Fix for a recently introduced build warning in cpufreq headers from Brian W Hart. - Fix for a 3.13 cpufreq regression related to sysem resume that triggers on some systems with multiple CPU clusters from Viresh Kumar. - Fix for a 3.4 regression in request_firmware() resulting in WARN_ON()s on some systems during system resume from Takashi Iwai. - Revert of the ACPI video commit that changed the default value of the video.brightness_switch_enabled command line argument to 0 as it has been reported to break existing setups. - ACPI device enumeration documentation update to take recent code changes into account and make the documentation match the code again from Darren Hart. - Fixes for the sa1110, imx6q, kirkwood, and cpu0 cpufreq drivers from Linus Walleij, Nicolas Del Piano, Quentin Armitage, Viresh Kumar. - New ACPI video blacklist entry for HP ProBook 4540s from Hans de Goede" * tag 'pm+acpi-3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: make table sentinel macros unsigned to match use cpufreq: move policy kobj to policy->cpu at resume cpufreq: cpu0: OPPs can be populated at runtime cpufreq: kirkwood: Reinstate cpufreq driver for ARCH_KIRKWOOD cpufreq: imx6q: Select PM_OPP cpufreq: sa1110: set memory type for h3600 ACPI / video: Add use_native_backlight quirk for HP ProBook 4540s PM / sleep: fix freeze_ops NULL pointer dereferences PM / sleep: Fix request_firmware() error at resume Revert "ACPI / video: change acpi-video brightness_switch_enabled default to 0" ACPI / documentation: Remove reference to acpi_platform_device_ids from enumeration.txt
2014-07-18tracing: Convert local function_graph functions to staticSteven Rostedt (Red Hat)1-4/+4
Local functions should be static. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-07-18workqueue: wake regular worker if need_more_worker() when rescuer leave the poolLai Jiangshan1-2/+2
We don't need to wake up regular worker when nr_running==1, so need_more_worker() is sufficient here. And need_more_worker() gives us better readability due to the name of "keep_working()" implies the rescuer should keep working now but the rescuer is actually leaving. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2014-07-18ftrace: Do not copy old hash when resettingWang Nan1-10/+5
Do not waste time copying the old hash if the hash is going to be reset. Just allocate a new hash and free the old one, as that is the same result as copying te old one and then resetting it. Link: http://lkml.kernel.org/p/1405384820-48837-1-git-send-email-wangnan0@huawei.com Signed-off-by: Wang Nan <wangnan0@huawei.com> [ SDR: Removed unused ftrace_filter_reset() function ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-07-18tracing: let user specify tracing_thresh after selecting function_graphStanislav Fomichev3-7/+67
Currently, tracing_thresh works only if we specify it before selecting function_graph tracer. If we do the opposite, tracing_thresh will change it's value, but it will not be applied. To fix it, we add update_thresh callback which is called whenever tracing_thresh is updated and for function_graph tracer we register handler which reinitializes tracer depending on tracing_thresh. Link: http://lkml.kernel.org/p/20140718111727.GA3206@stfomichev-desktop.yandex.net Signed-off-by: Stanislav Fomichev <stfomichev@yandex-team.ru> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-07-18seccomp: implement SECCOMP_FILTER_FLAG_TSYNCKees Cook1-1/+134
Applying restrictive seccomp filter programs to large or diverse codebases often requires handling threads which may be started early in the process lifetime (e.g., by code that is linked in). While it is possible to apply permissive programs prior to process start up, it is difficult to further restrict the kernel ABI to those threads after that point. This change adds a new seccomp syscall flag to SECCOMP_SET_MODE_FILTER for synchronizing thread group seccomp filters at filter installation time. When calling seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FILTER_FLAG_TSYNC, filter) an attempt will be made to synchronize all threads in current's threadgroup to its new seccomp filter program. This is possible iff all threads are using a filter that is an ancestor to the filter current is attempting to synchronize to. NULL filters (where the task is running as SECCOMP_MODE_NONE) are also treated as ancestors allowing threads to be transitioned into SECCOMP_MODE_FILTER. If prctrl(PR_SET_NO_NEW_PRIVS, ...) has been set on the calling thread, no_new_privs will be set for all synchronized threads too. On success, 0 is returned. On failure, the pid of one of the failing threads will be returned and no filters will have been applied. The race conditions against another thread are: - requesting TSYNC (already handled by sighand lock) - performing a clone (already handled by sighand lock) - changing its filter (already handled by sighand lock) - calling exec (handled by cred_guard_mutex) The clone case is assisted by the fact that new threads will have their seccomp state duplicated from their parent before appearing on the tasklist. Holding cred_guard_mutex means that seccomp filters cannot be assigned while in the middle of another thread's exec (potentially bypassing no_new_privs or similar). The call to de_thread() may kill threads waiting for the mutex. Changes across threads to the filter pointer includes a barrier. Based on patches by Will Drewry. Suggested-by: Julien Tinnes <jln@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Andy Lutomirski <luto@amacapital.net>
2014-07-18seccomp: allow mode setting across threadsKees Cook1-11/+25
This changes the mode setting helper to allow threads to change the seccomp mode from another thread. We must maintain barriers to keep TIF_SECCOMP synchronized with the rest of the seccomp state. Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Andy Lutomirski <luto@amacapital.net>
2014-07-18seccomp: introduce writer lockingKees Cook2-2/+63
Normally, task_struct.seccomp.filter is only ever read or modified by the task that owns it (current). This property aids in fast access during system call filtering as read access is lockless. Updating the pointer from another task, however, opens up race conditions. To allow cross-thread filter pointer updates, writes to the seccomp fields are now protected by the sighand spinlock (which is shared by all threads in the thread group). Read access remains lockless because pointer updates themselves are atomic. However, writes (or cloning) often entail additional checking (like maximum instruction counts) which require locking to perform safely. In the case of cloning threads, the child is invisible to the system until it enters the task list. To make sure a child can't be cloned from a thread and left in a prior state, seccomp duplication is additionally moved under the sighand lock. Then parent and child are certain have the same seccomp state when they exit the lock. Based on patches by Will Drewry and David Drysdale. Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Andy Lutomirski <luto@amacapital.net>
2014-07-18seccomp: split filter prep from check and applyKees Cook1-30/+67
In preparation for adding seccomp locking, move filter creation away from where it is checked and applied. This will allow for locking where no memory allocation is happening. The validation, filter attachment, and seccomp mode setting can all happen under the future locks. For extreme defensiveness, I've added a BUG_ON check for the calculated size of the buffer allocation in case BPF_MAXINSN ever changes, which shouldn't ever happen. The compiler should actually optimize out this check since the test above it makes it impossible. Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Andy Lutomirski <luto@amacapital.net>
2014-07-18sched: move no_new_privs into new atomic flagsKees Cook2-3/+3
Since seccomp transitions between threads requires updates to the no_new_privs flag to be atomic, the flag must be part of an atomic flag set. This moves the nnp flag into a separate task field, and introduces accessors. Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Andy Lutomirski <luto@amacapital.net>
2014-07-18seccomp: add "seccomp" syscallKees Cook2-5/+53
This adds the new "seccomp" syscall with both an "operation" and "flags" parameter for future expansion. The third argument is a pointer value, used with the SECCOMP_SET_MODE_FILTER operation. Currently, flags must be 0. This is functionally equivalent to prctl(PR_SET_SECCOMP, ...). In addition to the TSYNC flag later in this patch series, there is a non-zero chance that this syscall could be used for configuring a fixed argument area for seccomp-tracer-aware processes to pass syscall arguments in the future. Hence, the use of "seccomp" not simply "seccomp_add_filter" for this syscall. Additionally, this syscall uses operation, flags, and user pointer for arguments because strictly passing arguments via a user pointer would mean seccomp itself would be unable to trivially filter the seccomp syscall itself. Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Andy Lutomirski <luto@amacapital.net>
2014-07-18seccomp: split mode setting routinesKees Cook1-23/+48
Separates the two mode setting paths to make things more readable with fewer #ifdefs within function bodies. Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Andy Lutomirski <luto@amacapital.net>
2014-07-18seccomp: extract check/assign mode helpersKees Cook1-4/+18
To support splitting mode 1 from mode 2, extract the mode checking and assignment logic into common functions. Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Andy Lutomirski <luto@amacapital.net>
2014-07-18seccomp: create internal mode-setting functionKees Cook1-2/+14
In preparation for having other callers of the seccomp mode setting logic, split the prctl entry point away from the core logic that performs seccomp mode setting. Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Andy Lutomirski <luto@amacapital.net>
2014-07-18ring-buffer: Always run per-cpu ring buffer resize with schedule_work_on()Corey Minyard1-20/+4
The code for resizing the trace ring buffers has to run the per-cpu resize on the CPU itself. The code was using preempt_off() and running the code for the current CPU directly, otherwise calling schedule_work_on(). At least on RT this could result in the following: |BUG: sleeping function called from invalid context at kernel/rtmutex.c:673 |in_atomic(): 1, irqs_disabled(): 0, pid: 607, name: bash |3 locks held by bash/607: |CPU: 0 PID: 607 Comm: bash Not tainted 3.12.15-rt25+ #124 |(rt_spin_lock+0x28/0x68) |(free_hot_cold_page+0x84/0x3b8) |(free_buffer_page+0x14/0x20) |(rb_update_pages+0x280/0x338) |(ring_buffer_resize+0x32c/0x3dc) |(free_snapshot+0x18/0x38) |(tracing_set_tracer+0x27c/0x2ac) probably via |cd /sys/kernel/debug/tracing/ |echo 1 > events/enable ; sleep 2 |echo 1024 > buffer_size_kb If we just always use schedule_work_on(), there's no need for the preempt_off(). So do that. Link: http://lkml.kernel.org/p/1405537633-31518-1-git-send-email-cminyard@mvista.com Reported-by: Stanislav Meduna <stano@meduna.org> Signed-off-by: Corey Minyard <cminyard@mvista.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-07-18tracing: Remove function_trace_stop and HAVE_FUNCTION_TRACE_MCOUNT_TESTSteven Rostedt (Red Hat)2-8/+0
All users of function_trace_stop and HAVE_FUNCTION_TRACE_MCOUNT_TEST have been removed. We can safely remove them from the kernel. Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-07-18ftrace: Remove function_trace_stop check from list funcSteven Rostedt (Red Hat)1-6/+2
function_trace_stop is no longer used to stop function tracing. Remove the check from __ftrace_ops_list_func(). Also, call FTRACE_WARN_ON() instead of setting function_trace_stop if a ops has no func to call. Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-07-18ftrace: Do no disable function tracing on enabling function tracingSteven Rostedt (Red Hat)1-7/+0
When function tracing is being updated function_trace_stop is set to keep from tracing the updates. This was fine when function tracing was done from stop machine. But it is no longer done that way and this can cause real tracing to be missed. Remove it. Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-07-18ftrace-graph: Remove usage of ftrace_stop() in ftrace_graph_stop()Steven Rostedt (Red Hat)1-5/+0
All archs now use ftrace_graph_is_dead() to stop function graph tracing. Remove the usage of ftrace_stop() as that is no longer needed. Cc: Frederic Weisbecker <fweisbec@gmail.com> Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-07-18kprobes: Fix "Failed to find blacklist" probing errors on ia64 and ppc64Masami Hiramatsu1-5/+9
On ia64 and ppc64, function pointers do not point to the entry address of the function, but to the address of a function descriptor (which contains the entry address and misc data). Since the kprobes code passes the function pointer stored by NOKPROBE_SYMBOL() to kallsyms_lookup_size_offset() for initalizing its blacklist, it fails and reports many errors, such as: Failed to find blacklist 0001013168300000 Failed to find blacklist 0001013000f0a000 [...] To fix this bug, use arch_deref_entry_point() to get the function entry address for kallsyms_lookup_size_offset() instead of the raw function pointer. Suzuki also pointed out that blacklist entries should also be updated as well. Reported-by: Tony Luck <tony.luck@gmail.com> Fixed-by: Suzuki K. Poulose <suzuki@in.ibm.com> Tested-by: Tony Luck <tony.luck@intel.com> Tested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (for powerpc) Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: sparse@chrisli.org Cc: Paul Mackerras <paulus@samba.org> Cc: akataria@vmware.com Cc: anil.s.keshavamurthy@intel.com Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Chris Wright <chrisw@sous-sol.org> Cc: yrl.pp-manager.tt@hitachi.com Cc: Kevin Hao <haokexin@gmail.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: rdunlap@infradead.org Cc: dl9pf@gmx.de Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: David S. Miller <davem@davemloft.net> Cc: linux-ia64@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/20140717114411.13401.2632.stgit@kbuild-fedora.novalocal Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-17Merge tag 'trace-fixes-v3.16-rc5-v2' of ↵Linus Torvalds4-8/+19
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "A few more fixes for ftrace infrastructure. I was cleaning out my INBOX and found two fixes from zhangwei from a year ago that were lost in my mail. These fix an inconsistency between trace_puts() and the way trace_printk() works. The reason this is important to fix is because when trace_printk() doesn't have any arguments, it turns into a trace_puts(). Not being able to enable a stack trace against trace_printk() because it does not have any arguments is quite confusing. Also, the fix is rather trivial and low risk. While porting some changes to PowerPC I discovered that it still has the function graph tracer filter bug that if you also enable stack tracing the function graph tracer filter is ignored. I fixed that up. Finally, Martin Lau, fixed a bug that would cause readers of the ftrace ring buffer to block forever even though it was suppose to be NONBLOCK" This also includes the fix from an earlier pull request: "Oleg Nesterov fixed a memory leak that happens if a user creates a tracing instance, sets up a filter in an event, and then removes that instance. The filter allocates memory that is never freed when the instance is destroyed" * tag 'trace-fixes-v3.16-rc5-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: ring-buffer: Fix polling on trace_pipe tracing: Add TRACE_ITER_PRINTK flag check in __trace_puts/__trace_bputs tracing: Fix graph tracer with stack tracer on other archs tracing: Add ftrace_trace_stack into __trace_puts/__trace_bputs tracing: instance_rmdir() leaks ftrace_event_file->filter
2014-07-17ftrace-graph: Remove dependency of ftrace_stop() from ftrace_graph_stop()Steven Rostedt (Red Hat)2-5/+35
ftrace_stop() is going away as it disables parts of function tracing that affects users that should not be affected. But ftrace_graph_stop() is built on ftrace_stop(). Here's another example of killing all of function tracing because something went wrong with function graph tracing. Instead of disabling all users of function tracing on function graph error, disable only function graph tracing. A new function is created called ftrace_graph_is_dead(). This is called in strategic paths to prevent function graph from doing more harm and allowing at least a warning to be printed before the system crashes. NOTE: ftrace_stop() is still used until all the archs are converted over to use ftrace_graph_is_dead(). After that, ftrace_stop() will be removed. Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-07-17PM / Sleep: Remove ftrace_stop/start() from suspend and hibernateSteven Rostedt (Red Hat)2-8/+0
ftrace_stop() and ftrace_start() were added to the suspend and hibernate process because there was some function within the work flow that caused the system to reboot if it was traced. This function has recently been found (restore_processor_state()). Now there's no reason to disable function tracing while we are going into suspend or hibernate, which means that being able to trace this will help tremendously in debugging any issues with suspend or hibernate. This also means that the ftrace_stop/start() functions can be removed and simplify the function tracing code a bit. Link: http://lkml.kernel.org/r/1518201.VD9cU33jRU@vostro.rjw.lan Acked-by: "Rafael J. Wysocki" <rjw@rjwysocki.net> Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-07-17KEYS: validate certificate trust only with builtin keysDmitry Kasatkin1-0/+1
Instead of allowing public keys, with certificates signed by any key on the system trusted keyring, to be added to a trusted keyring, this patch further restricts the certificates to those signed only by builtin keys on the system keyring. This patch defines a new option 'builtin' for the kernel parameter 'keys_ownerid' to allow trust validation using builtin keys. Simplified Mimi's "KEYS: define an owner trusted keyring" patch Changelog v7: - rename builtin_keys to use_builtin_keys Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-07-17genirq: generic chip: Export irq_map_generic_chip functionBoris BREZILLON1-2/+3
Export the generic irq map function in order to provide irq_domain ops with generic mapping and specific of xlate function (needed by the new atmel AIC driver). Signed-off-by: Boris BREZILLON <boris.brezillon@free-electrons.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/1405012462-766-2-git-send-email-boris.brezillon@free-electrons.com Signed-off-by: Jason Cooper <jason@lakedaemon.net>
2014-07-17arch, locking: Ciao arch_mutex_cpu_relax()Davidlohr Bueso5-16/+13
The arch_mutex_cpu_relax() function, introduced by 34b133f, is hacky and ugly. It was added a few years ago to address the fact that common cpu_relax() calls include yielding on s390, and thus impact the optimistic spinning functionality of mutexes. Nowadays we use this function well beyond mutexes: rwsem, qrwlock, mcs and lockref. Since the macro that defines the call is in the mutex header, any users must include mutex.h and the naming is misleading as well. This patch (i) renames the call to cpu_relax_lowlatency ("relax, but only if you can do it with very low latency") and (ii) defines it in each arch's asm/processor.h local header, just like for regular cpu_relax functions. On all archs, except s390, cpu_relax_lowlatency is simply cpu_relax, and thus we can take it out of mutex.h. While this can seem redundant, I believe it is a good choice as it allows us to move out arch specific logic from generic locking primitives and enables future(?) archs to transparently define it, similarly to System Z. Signed-off-by: Davidlohr Bueso <davidlohr@hp.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Anton Blanchard <anton@samba.org> Cc: Aurelien Jacquiot <a-jacquiot@ti.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Bharat Bhushan <r65777@freescale.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chen Liqin <liqin.linux@gmail.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Chris Zankel <chris@zankel.net> Cc: David Howells <dhowells@redhat.com> Cc: David S. Miller <davem@davemloft.net> Cc: Deepthi Dharwar <deepthi@linux.vnet.ibm.com> Cc: Dominik Dingel <dingel@linux.vnet.ibm.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: James Hogan <james.hogan@imgtec.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Joe Perches <joe@perches.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Joseph Myers <joseph@codesourcery.com> Cc: Kees Cook <keescook@chromium.org> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Cc: Lennox Wu <lennox.wu@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Salter <msalter@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Neuling <mikey@neuling.org> Cc: Michal Simek <monstr@monstr.eu> Cc: Mikael Starvik <starvik@axis.com> Cc: Nicolas Pitre <nico@linaro.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Qais Yousef <qais.yousef@imgtec.com> Cc: Qiaowei Ren <qiaowei.ren@intel.com> Cc: Rafael Wysocki <rafael.j.wysocki@intel.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Steven Miao <realmz6@gmail.com> Cc: Steven Rostedt <srostedt@redhat.com> Cc: Stratos Karafotis <stratosk@semaphore.gr> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Vasily Kulikov <segoon@openwall.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com> Cc: Waiman Long <Waiman.Long@hp.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Wolfram Sang <wsa@the-dreams.de> Cc: adi-buildroot-devel@lists.sourceforge.net Cc: linux390@de.ibm.com Cc: linux-alpha@vger.kernel.org Cc: linux-am33-list@redhat.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux-c6x-dev@linux-c6x.org Cc: linux-cris-kernel@axis.com Cc: linux-hexagon@vger.kernel.org Cc: linux-ia64@vger.kernel.org Cc: linux@lists.openrisc.net Cc: linux-m32r-ja@ml.linux-m32r.org Cc: linux-m32r@ml.linux-m32r.org Cc: linux-m68k@lists.linux-m68k.org Cc: linux-metag@vger.kernel.org Cc: linux-mips@linux-mips.org Cc: linux-parisc@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-s390@vger.kernel.org Cc: linux-sh@vger.kernel.org Cc: linux-xtensa@linux-xtensa.org Cc: sparclinux@vger.kernel.org Link: http://lkml.kernel.org/r/1404079773.2619.4.camel@buesod1.americas.hpqcorp.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-17locking/lockdep: Only ask for /proc/lock_stat output when availableAndreas Gruenbacher1-0/+2
When lockdep turns itself off, the following message is logged: Please attach the output of /proc/lock_stat to the bug report Omit this message when CONFIG_LOCK_STAT is off, and /proc/lock_stat doesn't exist. Signed-off-by: Andreas Gruenbacher <andreas.gruenbacher@gmail.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1405451452-3824-1-git-send-email-andreas.gruenbacher@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-17Merge branch 'locking/urgent' into locking/core, before applying larger ↵Ingo Molnar26-147/+397
changes and to refresh the branch with fixes Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-17Merge branch 'rcu/next' of ↵Ingo Molnar10-155/+450
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU updates from Paul E. McKenney: * Update RCU documentation. * Miscellaneous fixes. * Maintainership changes. * Torture-test updates. * Callback-offloading changes. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller9-63/+122
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-16Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Tooling fixes and an Intel PMU driver fixlet" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Do not allow optimized switch for non-cloned events perf/x86/intel: ignore CondChgd bit to avoid false NMI handling perf symbols: Get kernel start address by symbol name perf tools: Fix segfault in cumulative.callchain report
2014-07-16Merge tag 'v3.16-rc5' into timers/coreThomas Gleixner70-2312/+4230
Reason: Bring in upstream modifications, so the pending changes which depend on them can be queued.
2014-07-16tracing: Kill "filter_string" arg of replace_preds()Oleg Nesterov1-6/+3
Cosmetic, but replace_preds() doesn't need/use "char *filter_string". Remove it to microsimplify the code. Link: http://lkml.kernel.org/p/20140715184832.GA20519@redhat.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-07-16tracing: Change apply_subsystem_event_filter() paths to check file->system ↵Oleg Nesterov1-23/+16
== dir filter_free_subsystem_preds(), filter_free_subsystem_filters() and replace_system_preds() can simply check file->system->subsystem and avoid strcmp(call->class->system). Better yet, we can pass "struct ftrace_subsystem_dir *dir" instead of event_subsystem and just check file->system == dir. Thanks to Namhyung Kim who pointed out that replace_system_preds() can be changed too. Link: http://lkml.kernel.org/p/20140715184829.GA20516@redhat.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-07-16tracing/uprobes: Kill the dead TRACE_EVENT_FL_USE_CALL_FILTER logicOleg Nesterov1-2/+1
alloc_trace_uprobe() sets TRACE_EVENT_FL_USE_CALL_FILTER for unknown reason and this is simply wrong. Fortunately this has no effect because register_uprobe_event() clears call->flags after that. Kill both. This trace_uprobe was kzalloc'ed and we rely on this fact anyway. Link: http://lkml.kernel.org/p/20140715184824.GA20505@redhat.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-07-16tracing: Kill call_filter_disable()Oleg Nesterov1-6/+1
It seems that the only purpose of call_filter_disable() is to make filter_disable() less clear and symmetrical, remove it. Link: http://lkml.kernel.org/p/20140715184821.GA20498@redhat.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-07-16tracing: Kill destroy_call_preds()Oleg Nesterov2-7/+2
Remove destroy_call_preds(). Its only caller, __trace_remove_event_call(), can use free_event_filter() and nullify ->filter by hand. Perhaps we could keep this trivial helper although imo it is pointless, but then it should be static in trace_events.c. Link: http://lkml.kernel.org/p/20140715184816.GA20495@redhat.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-07-16tracing: Kill destroy_preds() and destroy_file_preds()Oleg Nesterov2-21/+0
destroy_preds() makes no sense. The only caller, event_remove(), actually wants destroy_file_preds(). __trace_remove_event_call() does destroy_call_preds() which takes care of call->filter. And after the previous change we can simply remove destroy_preds() from event_remove(), we are going to call remove_event_from_tracers() which in turn calls remove_event_file_dir()->free_event_filter(). Link: http://lkml.kernel.org/p/20140715184813.GA20488@redhat.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-07-16rcu: Allow for NULL tick_nohz_full_mask when nohz_full= missingPaul E. McKenney1-3/+4
If there isn't a nohz_full= kernel parameter specified, then tick_nohz_full_mask can legitimately be NULL. This can cause problems when RCU's boot code tries to cpumask_or() this value into rcu_nocb_mask. In addition, if NO_HZ_FULL_ALL=y, there is no point in doing the cpumask_or() in the first place because this will cause RCU_NOCB_CPU_ALL=y, which in turn will have all bits already set in rcu_nocb_mask. This commit therefore avoids the cpumask_or() if NO_HZ_FULL_ALL=y and checks for !tick_nohz_full_running otherwise, this latter check catching cases when there was no nohz_full= kernel parameter specified. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-07-16ftrace: Allow archs to specify if they need a separate function graph trampolineSteven Rostedt (Red Hat)1-2/+4
Currently if an arch supports function graph tracing, the core code will just assign the function graph trampoline to the function graph addr that gets called. But as the old method for function graph tracing always calls the function trampoline first and that calls the function graph trampoline, some archs may have the function graph trampoline dependent on operations that were done in the function trampoline. This causes function graph tracer to break on those archs. Instead of having the default be to set the function graph ftrace_ops to the function graph trampoline, have it instead just set it to zero which will keep it from jumping to a trampoline that is not set up to be jumped directly too. Link: http://lkml.kernel.org/r/53BED155.9040607@nvidia.com Reported-by: Tuomas Tynkkynen <ttynkkynen@nvidia.com> Tested-by: Tuomas Tynkkynen <ttynkkynen@nvidia.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-07-16sched: Allow wait_on_bit_action() functions to support a timeoutNeilBrown1-8/+8
It is currently not possible for various wait_on_bit functions to implement a timeout. While the "action" function that is called to do the waiting could certainly use schedule_timeout(), there is no way to carry forward the remaining timeout after a false wake-up. As false-wakeups a clearly possible at least due to possible hash collisions in bit_waitqueue(), this is a real problem. The 'action' function is currently passed a pointer to the word containing the bit being waited on. No current action functions use this pointer. So changing it to something else will be a little noisy but will have no immediate effect. This patch changes the 'action' function to take a pointer to the "struct wait_bit_key", which contains a pointer to the word containing the bit so nothing is really lost. It also adds a 'private' field to "struct wait_bit_key", which is initialized to zero. An action function can now implement a timeout with something like static int timed_out_waiter(struct wait_bit_key *key) { unsigned long waited; if (key->private == 0) { key->private = jiffies; if (key->private == 0) key->private -= 1; } waited = jiffies - key->private; if (waited > 10 * HZ) return -EAGAIN; schedule_timeout(waited - 10 * HZ); return 0; } If any other need for context in a waiter were found it would be easy to use ->private for some other purpose, or even extend "struct wait_bit_key". My particular need is to support timeouts in nfs_release_page() to avoid deadlocks with loopback mounted NFS. While wait_on_bit_timeout() would be a cleaner interface, it will not meet my need. I need the timeout to be sensitive to the state of the connection with the server, which could change. So I need to use an 'action' interface. Signed-off-by: NeilBrown <neilb@suse.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steve French <sfrench@samba.org> Cc: David Howells <dhowells@redhat.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20140707051604.28027.41257.stgit@notabene.brown Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16sched: Remove proliferation of wait_on_bit() action functionsNeilBrown2-7/+19
The current "wait_on_bit" interface requires an 'action' function to be provided which does the actual waiting. There are over 20 such functions, many of them identical. Most cases can be satisfied by one of just two functions, one which uses io_schedule() and one which just uses schedule(). So: Rename wait_on_bit and wait_on_bit_lock to wait_on_bit_action and wait_on_bit_lock_action to make it explicit that they need an action function. Introduce new wait_on_bit{,_lock} and wait_on_bit{,_lock}_io which are *not* given an action function but implicitly use a standard one. The decision to error-out if a signal is pending is now made based on the 'mode' argument rather than being encoded in the action function. All instances of the old wait_on_bit and wait_on_bit_lock which can use the new version have been changed accordingly and their action functions have been discarded. wait_on_bit{_lock} does not return any specific error code in the event of a signal so the caller must check for non-zero and interpolate their own error code as appropriate. The wait_on_bit() call in __fscache_wait_on_invalidate() was ambiguous as it specified TASK_UNINTERRUPTIBLE but used fscache_wait_bit_interruptible as an action function. David Howells confirms this should be uniformly "uninterruptible" The main remaining user of wait_on_bit{,_lock}_action is NFS which needs to use a freezer-aware schedule() call. A comment in fs/gfs2/glock.c notes that having multiple 'action' functions is useful as they display differently in the 'wchan' field of 'ps'. (and /proc/$PID/wchan). As the new bit_wait{,_io} functions are tagged "__sched", they will not show up at all, but something higher in the stack. So the distinction will still be visible, only with different function names (gds2_glock_wait versus gfs2_glock_dq_wait in the gfs2/glock.c case). Since first version of this patch (against 3.15) two new action functions appeared, on in NFS and one in CIFS. CIFS also now uses an action function that makes the same freezer aware schedule call as NFS. Signed-off-by: NeilBrown <neilb@suse.de> Acked-by: David Howells <dhowells@redhat.com> (fscache, keys) Acked-by: Steven Whitehouse <swhiteho@redhat.com> (gfs2) Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steve French <sfrench@samba.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20140707051603.28027.72349.stgit@notabene.brown Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16Merge tag 'v3.16-rc5' into sched/core, to refresh the branch before applying ↵Ingo Molnar22-137/+530
bigger tree-wide changes Signed-off-by: Ingo Molnar <mingo@kernel.org>