path: root/kernel
Age  Commit message  Author  Files  Lines
2013-08-26kernel/nsproxy.c: Improving a snippet of code.Raphael S.Carvalho1-1/+2
It seems GCC generates better code this way, so I changed that statement. Both forms have the same semantics, so this patch is purely a performance improvement. Acked-by: Serge E. Hallyn <[email protected]> Signed-off-by: Raphael S.Carvalho <[email protected]> Signed-off-by: Eric W. Biederman <[email protected]>
2013-08-27Merge branch 'pm-sleep'Rafael J. Wysocki1-2/+2
* pm-sleep:
  PM / Sleep: new trace event to print device suspend and resume times
  PM / Sleep: increase ftrace coverage in suspend/resume
2013-08-27Merge branch 'acpi-processor'Rafael J. Wysocki1-6/+3
* acpi-processor:
  ACPI / processor: Acquire writer lock to update CPU maps
  ACPI / processor: Remove acpi_processor_get_limit_info()
2013-08-26cgroup: make cgroup_write_event_control() use css_from_dir() instead of __d_cgrp()Tejun Heo1-13/+5
cgroup_event will be moved to its only user - memcg. Replace __d_cgrp() usage with css_from_dir(), which is already exported. This also simplifies the code a bit. Signed-off-by: Tejun Heo <[email protected]> Acked-by: Li Zefan <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]>
2013-08-26cgroup: make cgroup_event hold onto cgroup_subsys_state instead of cgroupTejun Heo1-14/+12
Currently, each registered cgroup_event holds an extra reference to the cgroup. This is a bit weird as events are subsystem specific and will also be incorrect in the planned unified hierarchy as css (cgroup_subsys_state) may come and go dynamically across the lifetime of a cgroup. Holding onto cgroup won't prevent the target css from going away. Update cgroup_event to hold onto the css the target file belongs to instead of cgroup. Signed-off-by: Tejun Heo <[email protected]> Acked-by: Li Zefan <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]>
2013-08-26cgroup: implement CFTYPE_NO_PREFIXTejun Heo1-1/+2
When cgroup files are created, cgroup core automatically prepends the name of the subsystem as prefix. This patch adds CFTYPE_NO_PREFIX which disables the automatic prefix. This is to work around historical baggage and shouldn't be used for new files. This will be used to move "cgroup.event_control" from cgroup core to memcg. Signed-off-by: Tejun Heo <[email protected]> Acked-by: Li Zefan <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Cc: Glauber Costa <[email protected]>
2013-08-26cgroup: make cgroup_css() take cgroup_subsys * instead and allow NULL subsysTejun Heo1-47/+43
cgroup_css() is no longer used in hot paths. Make it take struct cgroup_subsys * and allow the users to specify NULL subsys to obtain the dummy_css. This removes open-coded NULL subsystem testing in a couple users and generally simplifies the code. After this patch, css_from_dir() also allows NULL @ss and returns the matching dummy_css. This behavior change doesn't affect its only user - perf. Signed-off-by: Tejun Heo <[email protected]> Acked-by: Li Zefan <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]>
2013-08-26cgroup: rename cgroup_css_from_dir() to css_from_dir() and update its syntaxTejun Heo2-17/+11
cgroup_css_from_dir() will grow another user. In preparation, make the following changes.
* All css functions are prefixed with just "css_", rename it to css_from_dir().
* Take dentry * instead of file * as dentry is what ultimately identifies a cgroup and file may not always be available. Note that the function now checks whether @dentry->d_inode is NULL as the caller now may specify a negative dentry.
* Make it take cgroup_subsys * instead of integer subsys_id. This simplifies the function and allows specifying no subsystem for cgroup->dummy_css.
* Make return section a bit less verbose.
This patch doesn't introduce any behavior changes. Signed-off-by: Tejun Heo <[email protected]> Acked-by: Li Zefan <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]>
2013-08-23workqueue: convert bus code to use dev_groupsGreg Kroah-Hartman1-12/+15
The dev_attrs field of struct bus_type is going away soon, dev_groups should be used instead. This converts the workqueue bus code to use the correct field. Acked-by: Tejun Heo <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2013-08-23Merge branch 'for-3.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroupLinus Torvalds1-5/+9
Pull cgroup fix from Tejun Heo: "A late fix for cgroup. This fixes a behavior regression visible to userland which was created by a commit merged during -rc1. While the behavior change isn't too likely to be noticeable, the fix is relatively low risk and we'll need to backport it through -stable anyway if the bug gets released"
* 'for-3.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cpuset: fix a regression in validating config change
2013-08-22tracing: Make tracing_cpumask available for all instancesAlexander Z Lam2-17/+21
Allow tracer instances to disable tracing by cpu by moving the static global tracing_cpumask into trace_array. Link: http://lkml.kernel.org/r/921622317f239bfc2283cac2242647801ef584f2.1375980149.git.azl@google.com Cc: Vaibhav Nagarnaik <[email protected]> Cc: David Sharp <[email protected]> Cc: Alexander Z Lam <[email protected]> Signed-off-by: Alexander Z Lam <[email protected]> Signed-off-by: Steven Rostedt <[email protected]>
2013-08-21tracing: Kill the !CONFIG_MODULES code in trace_events.cOleg Nesterov1-12/+6
Move trace_module_nb under CONFIG_MODULES and kill the dummy trace_module_notify(). Imho it doesn't make sense to define "struct notifier_block" and its .notifier_call just to avoid "ifdef" in event_trace_init(), and all other !CONFIG_MODULES code has already gone away. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Oleg Nesterov <[email protected]> Signed-off-by: Steven Rostedt <[email protected]>
2013-08-21tracing: Don't pass file_operations array to event_create_dir()Oleg Nesterov1-34/+12
Now that event_create_dir() and __trace_add_new_event() always use the same file_operations we can kill these arguments and simplify the code. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Oleg Nesterov <[email protected]> Signed-off-by: Steven Rostedt <[email protected]>
2013-08-21tracing: Kill trace_create_file_ops() and friendsOleg Nesterov1-144/+9
trace_create_file_ops() allocates the copy of id/filter/format/enable file_operations to set "f_op->owner = mod" for fops_get(). However after the recent changes there is no reason to prevent rmmod even if one of these files is opened. A file operation can do nothing but fail after remove_event_file_dir() clears ->i_private for every file removed by trace_module_remove_events(). Kill "struct ftrace_module_file_ops" and fix the compilation errors. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Oleg Nesterov <[email protected]> Signed-off-by: Steven Rostedt <[email protected]>
2013-08-21tracing/syscalls: Annotate raw_init function with __initLi Zefan1-5/+5
init_syscall_trace() can only be called during kernel bootup, so we can mark it and the functions it calls as __init. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Li Zefan <[email protected]> Signed-off-by: Steven Rostedt <[email protected]>
2013-08-21workqueue: Fix manage_workers() RETURNS descriptionLibin1-2/+5
No functional change. The RETURNS description in the comment for manage_workers() is obviously wrong, as is the CONTEXT description. Fix it. Signed-off-by: Libin <[email protected]> Signed-off-by: Tejun Heo <[email protected]>
2013-08-21workqueue: Comment correction in file headerLibin1-3/+4
No functional change. There are two worker pools for each cpu in current implementation (one for normal work items and the other for high priority ones). tj: Whitespace adjustments. Signed-off-by: Libin <[email protected]> Signed-off-by: Tejun Heo <[email protected]>
2013-08-21cpuset: fix a regression in validating config changeLi Zefan1-5/+9
It's not allowed to clear masks of a cpuset if there're tasks in it, but it's broken:
  # mkdir /cgroup/sub
  # echo 0 > /cgroup/sub/cpuset.cpus
  # echo 0 > /cgroup/sub/cpuset.mems
  # echo $$ > /cgroup/sub/tasks
  # echo > /cgroup/sub/cpuset.cpus
  (should fail)
This bug was introduced by commit 88fa523bff295f1d60244a54833480b02f775152 ("cpuset: allow to move tasks to empty cpusets"). tj: Dropped temp bool variables and nested the conditionals directly. Signed-off-by: Li Zefan <[email protected]> Signed-off-by: Tejun Heo <[email protected]>
2013-08-20rcu: Simplify _rcu_barrier() processingPaul E. McKenney1-2/+13
This commit drops an unneeded ACCESS_ONCE() and simplifies an "our work is done" check in _rcu_barrier(). This applies feedback from Linus (https://lkml.org/lkml/2013/7/26/777) that he gave to similar code in an unrelated patch. Signed-off-by: Paul E. McKenney <[email protected]> Reviewed-by: Josh Triplett <[email protected]> [ paulmck: Fix comment to match code, reported by Lai Jiangshan. ]
2013-08-20rcu: Make rcutorture emit online failures if verbosePaul E. McKenney1-1/+7
Although rcutorture counts CPU-hotplug online failures, it does not explicitly record which CPUs were having trouble coming online. This commit therefore emits a console message when online failure occurs. Signed-off-by: Paul E. McKenney <[email protected]> Reviewed-by: Josh Triplett <[email protected]>
2013-08-20rcu: Remove unused variable from rcu_torture_writer()Paul E. McKenney1-2/+0
The oldbatch variable in rcu_torture_writer() is stored to, but never loaded from. This commit therefore removes it. Signed-off-by: Paul E. McKenney <[email protected]> Reviewed-by: Josh Triplett <[email protected]>
2013-08-20rcu: Sort rcutorture module parametersPaul E. McKenney1-52/+49
There are getting to be too many module parameters to permit the current semi-random order, so this patch orders them. Signed-off-by: Paul E. McKenney <[email protected]> Reviewed-by: Josh Triplett <[email protected]>
2013-08-20rcu: Increase rcutorture test coveragePaul E. McKenney1-163/+63
Currently, rcutorture has separate torture_types to test synchronous, asynchronous, and expedited grace-period primitives. This has two disadvantages: (1) Three times the number of runs to cover the combinations and (2) Little testing of concurrent combinations of the three options. This commit therefore adds a pair of module parameters that control normal and expedited state, with the default being both types, randomly selected, by the fakewriter processes, thus reducing source-code size and increasing test coverage. In addition, the writer task switches between asynchronous-normal and expedited grace-period primitives driven by the same pair of module parameters. Signed-off-by: Paul E. McKenney <[email protected]> Reviewed-by: Josh Triplett <[email protected]>
2013-08-20rcu: Add duplicate-callback tests to rcutorturePaul E. McKenney1-0/+61
This commit adds an object_debug option to rcutorture to allow the debug-object-based checks for duplicate call_rcu() invocations to be deterministically tested. Signed-off-by: Paul E. McKenney <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Sedat Dilek <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Linus Torvalds <[email protected]> Tested-by: Sedat Dilek <[email protected]> [ paulmck: Banish mid-function ifdef, more or less per Josh Triplett. ] Reviewed-by: Josh Triplett <[email protected]> [ paulmck: Improve duplicate-callback test, per Lai Jiangshan. ]
2013-08-20workqueue: fix some scripts/kernel-doc warningsYacine Belkadi1-41/+66
When building the htmldocs (in verbose mode), scripts/kernel-doc reports the following type of warnings:
  Warning(kernel/workqueue.c:653): No description found for return value of 'get_work_pool'
Fix them by:
- Using "Return:" sections to introduce descriptions of return values
- Adding some missing descriptions
Signed-off-by: Yacine Belkadi <[email protected]> Signed-off-by: Jiri Kosina <[email protected]>
2013-08-20kernel/params.c: use scnprintf() instead of sprintf()Chen Gang1-3/+4
Some strings (e.g. the version string) are permitted to be longer than PAGE_SIZE (although that is meaningless), so use scnprintf() instead of sprintf(). Signed-off-by: Chen Gang <[email protected]> Signed-off-by: Rusty Russell <[email protected]>
2013-08-20kernel/module.c: use scnprintf() instead of sprintf()Chen Gang1-1/+1
Some strings are permitted to be longer than PAGE_SIZE, so scnprintf() must be used instead of sprintf() to avoid problems. One case: if a module version is defined with a length greater than PAGE_SIZE, the 'modinfo' command still works (it prints the full contents), but "cat /sys/module/'modname'/version" causes problems in the kernel. Signed-off-by: Chen Gang <[email protected]> Signed-off-by: Rusty Russell <[email protected]>
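The difference between the two calls can be illustrated in user space. This is a minimal sketch, not the kernel implementation: sketch_scnprintf() is a hypothetical stand-in built on the C library's vsnprintf(), assuming the kernel helper's documented behaviour of returning the number of characters actually stored (never more than size - 1).

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical user-space stand-in for scnprintf(): formats into buf,
 * never writes more than size bytes (including the trailing NUL), and
 * returns the number of characters actually stored, not the number
 * that would have been needed.
 */
int sketch_scnprintf(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int n;

	if (size == 0)
		return 0;
	assert(buf != NULL);

	va_start(args, fmt);
	n = vsnprintf(buf, size, fmt, args);
	va_end(args);

	if (n < 0)		/* formatting error: nothing usable stored */
		return 0;
	if ((size_t)n >= size)	/* output was truncated */
		return (int)(size - 1);
	return n;
}
```

With an 8-byte buffer and a longer version string, the call stores 7 characters plus the NUL and returns 7, where sprintf() would have written past the end of the buffer.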
2013-08-20module: Add NOARG flag for ops with param_set_bool_enable_only() set functionSteven Rostedt1-0/+1
An ops that uses param_set_bool_enable_only() as its set function can easily handle being used without an argument. There's no reason to fail the loading of the module if it does not have one. Signed-off-by: Steven Rostedt <[email protected]> Signed-off-by: Rusty Russell <[email protected]>
2013-08-20module: Add flag to allow mod params to have no argumentsSteven Rostedt1-2/+4
Currently the params.c code allows only two "set" functions to have no arguments. If a parameter does not have an argument, then it looks at the set function and tests if it is either param_set_bool() or param_set_bint(). If it is not one of these functions, then it fails the loading of the module. But there may be module parameters that have different set functions and still allow no arguments. Unless each of these cases adds its function to the if statement, it won't be allowed to have no arguments. This method gets rather messy and does not scale. Instead, introduce a flags field to the kernel_param_ops, where if the flag KERNEL_PARAM_FL_NOARG is set, the parameter will not fail if it does not contain an argument. It will be expected that the corresponding set function can handle a NULL pointer as "val". Signed-off-by: Steven Rostedt <[email protected]> Signed-off-by: Rusty Russell <[email protected]>
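The flag-based check described above can be sketched in user space as follows. Every name prefixed with sketch_ or SKETCH_ is hypothetical and only mirrors the idea; the real structure is kernel_param_ops and the real flag is KERNEL_PARAM_FL_NOARG.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical mirror of the NOARG flag described in the commit. */
#define SKETCH_PARAM_FL_NOARG (1u << 0)

/* Hypothetical, reduced stand-in for struct kernel_param_ops. */
struct sketch_param_ops {
	unsigned int flags;
	int (*set)(const char *val, void *arg);
};

/* An "enable only" setter: a missing argument simply means "enable". */
int sketch_set_enable_only(const char *val, void *arg)
{
	(void)val;		/* NULL is acceptable here */
	*(int *)arg = 1;
	return 0;
}

/*
 * Reject a missing argument unless the ops opted in via the flag,
 * instead of comparing ops->set against a fixed list of functions.
 */
int sketch_parse_one(const struct sketch_param_ops *ops,
		     const char *val, void *arg)
{
	assert(ops != NULL && ops->set != NULL);

	if (val == NULL && !(ops->flags & SKETCH_PARAM_FL_NOARG))
		return -EINVAL;
	return ops->set(val, arg);
}
```

The design point is that a new setter opts in by setting one flag in its own ops structure, so the generic parsing code no longer needs to know about it.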
2013-08-20module: fix sprintf format specifier in param_get_byte()Christoph Jaeger1-1/+1
In param_get_byte(), to which the macro STANDARD_PARAM_DEF(byte, ...) expands, "%c" is used to print an unsigned char. So it gets printed as a character, which is not intended here. Use "%hhu" instead.
[Rusty: note drivers which would be affected:
  drivers/net/wireless/cw1200/main.c
  drivers/ntb/ntb_transport.c:68
  drivers/scsi/lpfc/lpfc_attr.c
  drivers/usb/atm/speedtch.c
  drivers/usb/gadget/g_ffs.c ]
Acked-by: Jon Mason <[email protected]> (for ntb) Acked-by: Michal Nazarewicz <[email protected]> (for g_ffs.c) Signed-off-by: Christoph Jaeger <[email protected]> Signed-off-by: Rusty Russell <[email protected]>
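The effect of the two format specifiers is easy to see in user space. This is a small sketch using the C library's snprintf(); the two helper names are hypothetical and are not the macro-generated kernel function.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Old behaviour: "%c" emits the byte as a single character. */
int sketch_byte_as_char(char *buf, size_t len, unsigned char v)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%c", v);
}

/* New behaviour: "%hhu" emits the byte as a decimal number. */
int sketch_byte_as_number(char *buf, size_t len, unsigned char v)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%hhu", v);
}
```

For a parameter value of 65, the first helper produces the text "A" and the second produces "65", which is what a reader of a numeric parameter expects.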
2013-08-19Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds2-4/+3
Pull timer fixes from Ingo Molnar: "Three small fixlets"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  nohz: fix compile warning in tick_nohz_init()
  nohz: Do not warn about unstable tsc unless user uses nohz_full
  sched_clock: Fix integer overflow
2013-08-19kernel: fix new kernel-doc warning in wait.cRandy Dunlap1-2/+1
Fix new kernel-doc warnings in kernel/wait.c:
  Warning(kernel/wait.c:374): No description found for parameter 'p'
  Warning(kernel/wait.c:374): Excess function parameter 'word' description in 'wake_up_atomic_t'
  Warning(kernel/wait.c:374): Excess function parameter 'bit' description in 'wake_up_atomic_t'
Signed-off-by: Randy Dunlap <[email protected]> Cc: David Howells <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-08-19cgroup: fix cgroup_write_event_control()Tejun Heo1-4/+21
81eeaf0411 ("cgroup: make cftype->[un]register_event() deal with cgroup_subsys_state instead of cgroup") updated the cftype event methods to take @css (cgroup_subsys_state) instead of @cgroup; however, it incorrectly used @css passed to cgroup_write_event_control(), which is the dummy_css for the cgroup as the file is a cgroup core file. This leads to oops on event registration. Fix it by using the css matching the event target file. Note that cgroup_write_event_control() now disallows cgroup core files from being event sources. This is for simplicity and doesn't matter as cgroup_event will be moved and made specific to memcg. Signed-off-by: Tejun Heo <[email protected]> Acked-by: Li Zefan <[email protected]>
2013-08-19cgroup: fix subsystem file accesses on the root cgroupTejun Heo1-14/+10
105347ba5 ("cgroup: make cgroup_file_open() rcu_read_lock() around cgroup_css() and add cfent->css") added cfent->css to cache the associated cgroup_subsys_state across file operations. A cfent is associated with single css throughout its lifetime and the original commit initialized the cache pointer during cgroup_add_file() and verified that it matches the actual one in cgroup_file_open(). While this works fine for !root cgroups, it's broken for root cgroups as files in a root cgroup are created before the css's are associated with the cgroup and thus cgroup_css() call in cgroup_add_file() returns NULL associating all cfents in the root cgroup with NULL css. This makes cgroup_file_open() trigger WARN and fail with -ENODEV for all !core subsystem files in the root cgroups. There's no reason to initialize cfent->css separately from cgroup_add_file(). As the association never changes, cgroup_file_open() can set it unconditionally every time and containing the logic in cgroup_file_open() makes more sense anyway as the only reason it's necessary is file->private_data being already occupied. Fix it by setting cfent->css unconditionally from cgroup_file_open(). Signed-off-by: Tejun Heo <[email protected]> Acked-by: Li Zefan <[email protected]>
2013-08-19cgroup: change cgroup_from_id() to css_from_id()Li Zefan1-0/+22
Now we want cgroup core to always provide the css to use to the subsystems, so change this API to css_from_id(). Uninline css_from_id(), because it's getting bigger and cgroup_css() has been unexported. While at it, remove the #ifdef, and shuffle the order of the args. Signed-off-by: Li Zefan <[email protected]> Signed-off-by: Tejun Heo <[email protected]>
2013-08-19generic-ipi/locking: Fix misleading smp_call_function_any() descriptionXie XiuQi1-2/+0
Fix locking description: after commit 8969a5ede0f9e17da4b9437 ("generic-ipi: remove kmalloc()"), wait = 0 can be guaranteed because we don't kmalloc() anymore. Signed-off-by: Xie XiuQi <[email protected]> Cc: Sheng Yang <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Rusty Russell <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2013-08-18nohz_full: Add full-system-idle arguments to APIPaul E. McKenney1-7/+18
This commit adds an isidle and jiffies argument to force_qs_rnp(), dyntick_save_progress_counter(), and rcu_implicit_dynticks_qs() to enable RCU's force-quiescent-state process to check for full-system idle. Signed-off-by: Paul E. McKenney <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Lai Jiangshan <[email protected]> [ paulmck: Use true and false for boolean constants per Lai Jiangshan. ] Reviewed-by: Josh Triplett <[email protected]>
2013-08-18nohz_full: Add full-system idle states and variablesPaul E. McKenney1-0/+17
This commit adds control variables and states for full-system idle. The system will progress through the states in numerical order when the system is fully idle (other than the timekeeping CPU), and reset down to the initial state if any non-timekeeping CPU goes non-idle. The current state is kept in full_sysidle_state. One flavor of RCU will be in charge of driving the state machine, defined by rcu_sysidle_state. This should be the busiest flavor of RCU. Signed-off-by: Paul E. McKenney <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Steven Rostedt <[email protected]> Reviewed-by: Josh Triplett <[email protected]>
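The progression described above can be sketched as a small state machine. The state names and the number of states here are hypothetical placeholders; the commit message only says that states advance in numerical order while the system stays fully idle and reset to the initial state when a non-timekeeping CPU goes non-idle.

```c
#include <assert.h>

/* Hypothetical state names; only their numerical ordering matters. */
enum sketch_sysidle_state {
	SKETCH_SYSIDLE_NOT = 0,	/* some non-timekeeping CPU is busy */
	SKETCH_SYSIDLE_SHORT,	/* all idle, observed briefly */
	SKETCH_SYSIDLE_LONG,	/* all idle, observed for longer */
	SKETCH_SYSIDLE_FULL	/* treated as fully idle */
};

/*
 * One step of the state machine: advance by one state while every
 * non-timekeeping CPU is idle, otherwise fall back to the start.
 */
enum sketch_sysidle_state
sketch_sysidle_step(enum sketch_sysidle_state cur, int all_idle)
{
	assert(cur >= SKETCH_SYSIDLE_NOT && cur <= SKETCH_SYSIDLE_FULL);

	if (!all_idle)
		return SKETCH_SYSIDLE_NOT;
	if (cur < SKETCH_SYSIDLE_FULL)
		return (enum sketch_sysidle_state)(cur + 1);
	return cur;
}
```

Stepping through intermediate states, rather than jumping straight to the final one, means a brief idle period is not mistaken for full-system idle.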
2013-08-18nohz_full: Add per-CPU idle-state trackingPaul E. McKenney3-0/+85
This commit adds the code that updates the rcu_dyntick structure's new fields to track the per-CPU idle state based on interrupts and transitions into and out of the idle loop (NMIs are ignored because NMI handlers cannot cleanly read out the time anyway). This code is similar to the code that maintains RCU's idea of per-CPU idleness, but differs in that RCU treats CPUs running in user mode as idle, where this new code does not. Signed-off-by: Paul E. McKenney <[email protected]> Acked-by: Frederic Weisbecker <[email protected]> Cc: Steven Rostedt <[email protected]> Reviewed-by: Josh Triplett <[email protected]>
2013-08-18nohz_full: Add rcu_dyntick data for scalable detection of all-idle statePaul E. McKenney3-0/+33
This commit adds fields to the rcu_dyntick structure that are used to detect idle CPUs. These new fields differ from the existing ones in that the existing ones consider a CPU executing in user mode to be idle, where the new ones consider CPUs executing in user mode to be busy. The handling of these new fields is otherwise quite similar to that for the existing fields. This commit also adds the initialization required for these fields. So, why is usermode execution treated differently, with RCU considering it a quiescent state equivalent to idle, while in contrast the new full-system idle state detection considers usermode execution to be non-idle? It turns out that although one of RCU's quiescent states is usermode execution, it is not a full-system idle state. This is because the purpose of the full-system idle state is not RCU, but rather determining when accurate timekeeping can safely be disabled. Whenever accurate timekeeping is required in a CONFIG_NO_HZ_FULL kernel, at least one CPU must keep the scheduling-clock tick going. If even one CPU is executing in user mode, accurate timekeeping is required, particularly for architectures where gettimeofday() and friends do not enter the kernel. Only when all CPUs are really and truly idle can accurate timekeeping be disabled, allowing all CPUs to turn off the scheduling clock interrupt, thus greatly improving energy efficiency. This naturally raises the question "Why is this code in RCU rather than in timekeeping?", and the answer is that RCU has the data and infrastructure to efficiently make this determination. Signed-off-by: Paul E. McKenney <[email protected]> Acked-by: Frederic Weisbecker <[email protected]> Cc: Steven Rostedt <[email protected]> Reviewed-by: Josh Triplett <[email protected]>
2013-08-18nohz_full: Add Kconfig parameter for scalable detection of all-idle statePaul E. McKenney1-0/+23
At least one CPU must keep the scheduling-clock tick running for timekeeping purposes whenever there is a non-idle CPU. However, with the new nohz_full adaptive-idle machinery, it is difficult to distinguish between all CPUs really being idle as opposed to all non-idle CPUs being in adaptive-ticks mode. This commit therefore adds a Kconfig parameter as a first step towards enabling a scalable detection of full-system idle state. Signed-off-by: Paul E. McKenney <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Steven Rostedt <[email protected]> [ paulmck: Update help text per Frederic Weisbecker. ] Reviewed-by: Josh Triplett <[email protected]>
2013-08-18rcu: Eliminate unused APIs intended for adaptive ticksPaul E. McKenney1-43/+0
The rcu_user_enter_after_irq() and rcu_user_exit_after_irq() functions were intended for use by adaptive ticks, but changes in implementation have rendered them unnecessary. This commit therefore removes them. Reported-by: Frederic Weisbecker <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Reviewed-by: Josh Triplett <[email protected]>
2013-08-18rcu: Avoid redundant grace-period kthread wakeupsPaul E. McKenney1-3/+5
When setting up an in-the-future "advanced" grace period, the code needs to wake up the relevant grace-period kthread, which it currently does unconditionally. However, this results in needless wakeups in the case where the advanced grace period is being set up by the grace-period kthread itself, which is a non-uncommon situation. This commit therefore checks to see if the running thread is the grace-period kthread, and avoids doing the irq_work_queue()-mediated wakeup in that case. Signed-off-by: Paul E. McKenney <[email protected]> Reviewed-by: Josh Triplett <[email protected]>
2013-08-18rcu: Make call_rcu() leak callbacks for debug-object errorsPaul E. McKenney2-4/+20
If someone does a duplicate call_rcu(), the worst thing the second call_rcu() could do would be to actually queue the callback the second time because doing so corrupts whatever list the callback was already queued on. This commit therefore makes __call_rcu() check the new return value from debug-objects and leak the callback upon error. This commit also substitutes rcu_leak_callback() for whatever callback function was previously in place in order to avoid freeing the callback out from under any readers that might still be referencing it. These changes increase the probability that the debug-objects error messages will actually make it somewhere visible. Signed-off-by: Paul E. McKenney <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Sedat Dilek <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Linus Torvalds <[email protected]> Tested-by: Sedat Dilek <[email protected]> Reviewed-by: Josh Triplett <[email protected]>
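The leak-on-duplicate policy can be sketched with a toy callback list. This is a user-space illustration under stated assumptions, not the kernel code: the sketch_ names are hypothetical, and the queued field stands in for the state that the debug-objects infrastructure tracks.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for a callback head. */
struct sketch_head {
	struct sketch_head *next;
	void (*func)(struct sketch_head *head);
	int queued;	/* stand-in for debug-objects state tracking */
};

/* Singly linked list with a tail pointer. */
struct sketch_list {
	struct sketch_head *first;
	struct sketch_head **tail;
};

void sketch_list_init(struct sketch_list *list)
{
	list->first = NULL;
	list->tail = &list->first;
}

/* An ordinary callback, used only for illustration. */
void sketch_normal_callback(struct sketch_head *head)
{
	head->queued = 0;
}

/* Deliberately does nothing, so the object is leaked, never freed. */
void sketch_leak_callback(struct sketch_head *head)
{
	(void)head;
}

/* Returns 0 if queued, -1 if a duplicate was detected and leaked. */
int sketch_call_rcu(struct sketch_list *list, struct sketch_head *head,
		    void (*func)(struct sketch_head *head))
{
	assert(list->tail != NULL);

	if (head->queued) {
		/* Re-linking would corrupt the list; leak instead. */
		head->func = sketch_leak_callback;
		return -1;
	}
	head->queued = 1;
	head->func = func;
	head->next = NULL;
	*list->tail = head;
	list->tail = &head->next;
	return 0;
}
```

On a duplicate, the list links are left untouched and the already-queued callback is redirected to the leaking function, so nothing is freed while readers might still reference it.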
2013-08-18rcu: Simplify debug-objects fixupsPaul E. McKenney1-100/+0
The current debug-objects fixups are complex and heavyweight, and the fixups are not complete: Even with the fixups, RCU's callback lists can still be corrupted. This commit therefore strips the fixups down to their minimal form, eliminating two of the three. It would be even better if (for example) call_rcu() simply leaked any problematic callbacks, but for that to happen, the debug-objects system would need to inform its caller of suspicious situations. This is the subject of a later commit in this series. Signed-off-by: Paul E. McKenney <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Sedat Dilek <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Linus Torvalds <[email protected]> Tested-by: Sedat Dilek <[email protected]> Reviewed-by: Josh Triplett <[email protected]>
2013-08-18rcu: Expedite grace periods during suspend/resumeBorislav Petkov1-0/+21
CONFIG_RCU_FAST_NO_HZ can increase grace-period durations by up to a factor of four, which can result in long suspend and resume times. Thus, this commit temporarily switches to expedited grace periods when suspending the box and returns to normal settings when resuming. Similar logic is applied to hibernation. Because expedited grace periods are of dubious benefit on very large systems, this commit restricts their automated use during suspend and resume to systems of 256 or fewer CPUs. (Some day a number of Linux-kernel facilities, including RCU's expedited grace periods, will be more scalable, but I need to see bug reports first.) [ paulmck: This also papers over an audio/irq bug, but hopefully that will be fixed soon. ] Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Bjørn Mork <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Reviewed-by: Josh Triplett <[email protected]>
2013-08-18Merge branch 'for-3.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroupLinus Torvalds1-2/+4
Pull cgroup fix from Tejun Heo: "This contains one patch to fix the return value of cpuset's cgroups interface function, which used to always return -ENODEV for the writes on the 'memory_pressure_enabled' file"
* 'for-3.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cpuset: fix the return value of cpuset_write_u64()
2013-08-16Merge tag 'pm-3.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pmLinus Torvalds1-7/+13
Pull power management fix from Rafael Wysocki: "The removal of delayed_work_pending() checks from kernel/power/qos.c done in 3.9 introduced a deadlock in pm_qos_work_fn(). Fix from Stephen Boyd"
* tag 'pm-3.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM / QoS: Fix workqueue deadlock when using pm_qos_update_request_timeout()
2013-08-16perf: Do not compute time values unnecessarilyPeter Zijlstra1-4/+4
We should not be calling calc_timer_values() for events that do not actually have an mmap()'ed userpage. Signed-off-by: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2013-08-16perf: Account freq events globallyFrederic Weisbecker1-11/+8
Freq events may not always be affine to a particular CPU. As such, account_event_cpu() may crash if we account per cpu a freq event that has event->cpu == -1. To solve this, let's account freq events globally. In practice this doesn't change the picture much because perf tools create per-task perf events with one event per CPU by default. Profiling a single CPU is usually a corner case so there is not much point in optimizing things that way. Reported-by: Jiri Olsa <[email protected]> Suggested-by: Peter Zijlstra <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Tested-by: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Stephane Eranian <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>