path: root/kernel
Age Commit message Author Files Lines
2012-11-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2-3/+19
Signed-off-by: David S. Miller <[email protected]>
2012-11-29do_coredump(): get rid of pt_regs argumentAl Viro1-1/+1
Signed-off-by: Al Viro <[email protected]>
2012-11-29print_fatal_signal(): get rid of pt_regs argumentAl Viro1-2/+3
Signed-off-by: Al Viro <[email protected]>
2012-11-29ptrace_signal(): get rid of unused argumentsAl Viro1-4/+2
Signed-off-by: Al Viro <[email protected]>
2012-11-29get rid of ptrace_signal_deliver() argumentsAl Viro1-1/+1
the first one is equal to signal_pt_regs(), the second is never used (and always NULL, while we are at it). Signed-off-by: Al Viro <[email protected]>
2012-11-29flagday: kill pt_regs argument of do_fork()Al Viro1-8/+5
Signed-off-by: Al Viro <[email protected]>
2012-11-28death to idle_regs()Al Viro1-6/+0
Signed-off-by: Al Viro <[email protected]>
2012-11-28don't pass regs to copy_process()Al Viro1-4/+2
Signed-off-by: Al Viro <[email protected]>
2012-11-28flagday: don't pass regs to copy_thread()Al Viro1-1/+1
Signed-off-by: Al Viro <[email protected]>
2012-11-28audit: no nested contexts anymore...Al Viro1-81/+21
Signed-off-by: Al Viro <[email protected]>
2012-11-28generic sys_fork / sys_vfork / sys_cloneAl Viro1-0/+43
... and get rid of idiotic struct pt_regs * in asm-generic/syscalls.h prototypes of the same, while we are at it. Eventually we want those in linux/syscalls.h, of course, but that'll have to wait a bit. Note that there are *three* variants of sys_clone() order of arguments. Braindamage galore... Signed-off-by: Al Viro <[email protected]>
2012-11-28kill daemonize()Al Viro1-92/+0
Signed-off-by: Al Viro <[email protected]>
2012-11-28cgroup: list_del_init() on removed eventsGreg Thelen1-2/+2
Use list_del_init() rather than list_del() to remove events from cgrp->event_list. No functional change. This is just defensive coding. Signed-off-by: Greg Thelen <[email protected]> Signed-off-by: Tejun Heo <[email protected]>
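The difference between the two removal styles can be seen in a small userspace sketch. The types and helper names below (`list_node`, `list_remove()`, `list_remove_init()`) are hypothetical stand-ins, not the kernel's `list.h` API. The kernel's `list_del()` additionally poisons the removed node's pointers, which this sketch does not do.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for the kernel's struct list_head (illustrative only). */
struct list_node {
    struct list_node *next, *prev;
};

/* An initialised node points at itself, which also means "empty". */
static void list_init(struct list_node *n)
{
    n->next = n;
    n->prev = n;
}

static bool list_is_empty(const struct list_node *head)
{
    return head->next == head;
}

/* Link n just before head, i.e. at the tail of the list. */
static void list_add_tail_node(struct list_node *n, struct list_node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Plain removal: unlink n but leave its own pointers stale. */
static void list_remove(struct list_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

/* Defensive removal: unlink n, then make it a valid empty list again,
 * so a later emptiness test or a second removal is harmless. */
static void list_remove_init(struct list_node *n)
{
    list_remove(n);
    list_init(n);
}
```

After `list_remove_init()`, the node can be tested with `list_is_empty()` or removed again without touching stale neighbours, which is the defensive property the commit is after.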
2012-11-28cgroup: fix lockdep warning for event_controlGreg Thelen1-3/+8
The cgroup_event_wake() function is called with the wait queue head locked and it takes cgrp->event_list_lock. However, in cgroup_rmdir() remove_wait_queue() was being called after taking cgrp->event_list_lock. Correct the lock ordering by using a temporary list to obtain the event list to remove from the wait queue. Signed-off-by: Greg Thelen <[email protected]> Signed-off-by: Aaron Durbin <[email protected]> Signed-off-by: Tejun Heo <[email protected]>
2012-11-28kernel/ksysfs.c: remove CONFIG_HOTPLUG ifdefsBill Pemberton1-4/+1
Remove conditional code based on CONFIG_HOTPLUG being false. It's always on now in preparation of it going away as an option. Signed-off-by: Bill Pemberton <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2012-11-28sysctl: remove CONFIG_HOTPLUG ifdefsBill Pemberton1-2/+2
Remove conditional code based on CONFIG_HOTPLUG being false. It's always on now in preparation of it going away as an option. Signed-off-by: Bill Pemberton <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2012-11-28cputime: Comment cputime's adjusting codeFrederic Weisbecker1-2/+16
The reason for the scaling and monotonicity correction performed by cputime_adjust() may not be immediately clear to the reviewer. Add some comments to explain what happens there. Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Paul Gortmaker <[email protected]>
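As a rough illustration of the scaling and monotonicity correction, here is a userspace sketch. It assumes plain 64-bit integers for all time values and ignores multiplication overflow. The struct and function names are hypothetical, and this is not the kernel's cputime_adjust() itself.

```c
#include <stdint.h>

/* Previously reported values, kept so results never go backwards. */
struct prev_cputime_sketch {
    uint64_t utime;
    uint64_t stime;
};

/*
 * utime/stime: tick-sampled user and system time (imprecise).
 * rtime:       precise total runtime from the scheduler.
 * The tick-based ratio is applied to the precise total, then both
 * results are clamped so they never drop below earlier reports.
 */
static void cputime_adjust_sketch(uint64_t utime, uint64_t stime, uint64_t rtime,
                                  struct prev_cputime_sketch *prev,
                                  uint64_t *ut, uint64_t *st)
{
    uint64_t total = utime + stime;
    uint64_t scaled_utime;

    if (total)
        scaled_utime = rtime * utime / total; /* overflow ignored in this sketch */
    else
        scaled_utime = rtime;

    /* Monotonicity: keep the larger of the old and new user time. */
    if (scaled_utime > prev->utime)
        prev->utime = scaled_utime;

    /* System time is the remainder, also bounded from below. */
    if (rtime > prev->utime && rtime - prev->utime > prev->stime)
        prev->stime = rtime - prev->utime;

    *ut = prev->utime;
    *st = prev->stime;
}
```

With tick samples of 30 user and 10 system ticks and a precise runtime of 80, the sketch reports 60 and 20. If a later sample shifts the ratio towards system time without more runtime, the reported user time stays at 60 rather than dropping.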
2012-11-28cputime: Consolidate cputime adjustment codeFrederic Weisbecker2-24/+24
task_cputime_adjusted() and thread_group_cputime_adjusted() essentially share the same code. They just don't use the same source: * The first function uses the cputime in the task struct and the previous adjusted snapshot that ensures monotonicity. * The second adds the cputime of all tasks in the group and the previous adjusted snapshot of the whole group from the signal structure. Just consolidate the common code that does the adjustment. These functions just need to fetch the values from the appropriate source. Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Paul Gortmaker <[email protected]>
2012-11-28cputime: Rename thread_group_times to thread_group_cputime_adjustedFrederic Weisbecker3-9/+9
We have thread_group_cputime() and thread_group_times(). The naming doesn't provide enough information about the difference between these two APIs. To lower the confusion, rename thread_group_times() to thread_group_cputime_adjusted(). This name better suggests that it's a version of thread_group_cputime() that does some stabilization on the raw cputime values, i.e. here: scale on top of CFS runtime stats and bound the lower value for monotonicity. Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Paul Gortmaker <[email protected]>
2012-11-28cputime: Move thread_group_cputime() to sched codeFrederic Weisbecker2-24/+28
thread_group_cputime() is a general cputime API that is not only used by posix cpu timer. Let's move this helper to sched code. Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Paul Gortmaker <[email protected]>
2012-11-28cgroup: move list add after list head initializationLi Zhong1-1/+1
2243076ad1 ("cgroup: initialize cgrp->allcg_node in init_cgroup_housekeeping()") initializes cgrp->allcg_node in init_cgroup_housekeeping(). Then in init_cgroup_root(), we should call init_cgroup_housekeeping() before adding it to &root->allcg_list; otherwise, we are initializing an entry already in a list. Signed-off-by: Li Zhong <[email protected]> Signed-off-by: Tejun Heo <[email protected]>
2012-11-27time: export time information for KVM pvclockMarcelo Tosatti1-0/+50
As suggested by John, export time data similarly to how it's done by vsyscall support. This allows KVM to retrieve necessary information to implement vsyscall support in KVM guests. Acked-by: John Stultz <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
2012-11-27sched: add notifier for cross-cpu migrationsMarcelo Tosatti1-0/+15
Originally from Jeremy Fitzhardinge. Acked-by: Ingo Molnar <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
2012-11-26futex: avoid wake_futex() for a PI futex_qDarren Hart1-1/+17
Dave Jones reported a bug with futex_lock_pi() that his trinity test exposed. Sometime between queue_me() and taking the q.lock_ptr, the lock_ptr became NULL, resulting in a crash. While futex_wake() is careful to not call wake_futex() on futex_q's with a pi_state or an rt_waiter (which are either waiting for a futex_unlock_pi() or a PI futex_requeue()), futex_wake_op() and futex_requeue() do not perform the same test. Update futex_wake_op() and futex_requeue() to test for q.pi_state and q.rt_waiter and abort with -EINVAL if detected. To ensure any future breakage is caught, add a WARN() to wake_futex() if the same condition is true. This fix has seen 3 hours of testing with "trinity -c futex" on an x86_64 VM with 4 CPUS. [[email protected]: tidy up the WARN()] Signed-off-by: Darren Hart <[email protected]> Reported-by: Dave Jones <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: John Kacur <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
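The guard described above amounts to a simple test before waking a waiter. The sketch below models it in userspace with a hypothetical `futex_q_sketch` struct that only carries the two fields named in the message. It is an assumption-laden illustration, not the kernel's futex code.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for the kernel's futex_q. */
struct futex_q_sketch {
    void *pi_state;  /* non-NULL: waiter belongs to a PI futex */
    void *rt_waiter; /* non-NULL: waiter is queued for a PI requeue */
};

/*
 * Returns 0 if the waiter may be woken through the non-PI wake path,
 * or -EINVAL if it is a PI waiter that must be left to
 * futex_unlock_pi() or a PI requeue.
 */
static int check_wakeable(const struct futex_q_sketch *q)
{
    if (q->pi_state || q->rt_waiter)
        return -EINVAL;
    return 0;
}
```

A caller in the position of futex_wake_op() or futex_requeue() would run this check on each matching waiter and abort the operation on a non-zero result.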
2012-11-26watchdog: using u64 in get_sample_period()Chuansheng Liu1-2/+2
In get_sample_period(), unsigned long is not wide enough to hold watchdog_thresh * 2 * (NSEC_PER_SEC / 5). Case 1: with the default watchdog_thresh of 10, the sample value is 0xEE6B2800. Case 2: with watchdog_thresh set to 20, the sample value is 0x1DCD65000. In case 2, u64 is needed to express the sample period; otherwise, changing the threshold through proc often fails. Signed-off-by: liu chuansheng <[email protected]> Acked-by: Don Zickus <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
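The arithmetic can be checked in userspace. The sketch below models a 32-bit unsigned long with `uint32_t` (an assumption about the affected configuration) and compares it with a 64-bit computation. The function and macro names are hypothetical.

```c
#include <stdint.h>

#define NSEC_PER_SEC_SKETCH 1000000000ULL

/* Models the computation in a 32-bit type: the product wraps modulo 2^32. */
static uint32_t sample_period_32(uint32_t watchdog_thresh)
{
    return watchdog_thresh * 2u * (uint32_t)(NSEC_PER_SEC_SKETCH / 5);
}

/* The same computation carried out in 64 bits, as the fix does with u64. */
static uint64_t sample_period_64(uint32_t watchdog_thresh)
{
    return (uint64_t)watchdog_thresh * 2 * (NSEC_PER_SEC_SKETCH / 5);
}
```

For a threshold of 10 both versions give 0xEE6B2800 (4,000,000,000 ns). For a threshold of 20 the 64-bit result is 0x1DCD65000 (8,000,000,000 ns), while the 32-bit version wraps to 0xDCD65000.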
2012-11-21Merge branch 'fortglx/3.8/time' of git://git.linaro.org/people/jstultz/linux into timers/coreThomas Gleixner13-262/+230
Fix trivial conflicts in: kernel/time/tick-sched.c Signed-off-by: Thomas Gleixner <[email protected]>
2012-11-20cgroup: remove obsolete guarantee from cgroup_task_migrate.Tao Ma1-5/+3
'guarantee' is already removed from cgroup_task_migrate, so remove the corresponding comments. Some other typos in cgroup are also changed. Cc: Tejun Heo <[email protected]> Cc: Li Zefan <[email protected]> Signed-off-by: Tao Ma <[email protected]> Signed-off-by: Tejun Heo <[email protected]>
2012-11-20proc: Usable inode numbers for the namespace file descriptors.Eric W. Biederman5-1/+46
Assign a unique proc inode to each namespace, and use that inode number to ensure we only allocate at most one proc inode for every namespace in proc. A single proc inode per namespace allows userspace to test to see if two processes are in the same namespace. This has been a long-requested feature and only blocked because a naive implementation would put the id in a global space and would ultimately require having a namespace for the names of namespaces, making migration and certain virtualization tricks impossible. We still don't have per superblock inode numbers for proc, which appears necessary for application unaware checkpoint/restart and migrations (if the application is using namespace file descriptors) but that is now allowed by the design if it becomes important. I have preallocated the ipc and uts initial proc inode numbers so their structures can be statically initialized. Signed-off-by: Eric W. Biederman <[email protected]>
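From userspace, the "same namespace" test can be sketched by comparing the device and inode numbers that stat() reports for two /proc/<pid>/ns/<name> entries. The helper name `same_namespace()` and its string-based interface are assumptions made for illustration only.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Returns 1 if both processes are in the same namespace of type `ns`
 * (for example "uts" or "ipc"), 0 if they are not, and -1 on error.
 * pid_a and pid_b are path components such as "1234" or "self".
 */
static int same_namespace(const char *pid_a, const char *pid_b, const char *ns)
{
    char path_a[64], path_b[64];
    struct stat sa, sb;
    int na, nb;

    na = snprintf(path_a, sizeof(path_a), "/proc/%s/ns/%s", pid_a, ns);
    nb = snprintf(path_b, sizeof(path_b), "/proc/%s/ns/%s", pid_b, ns);
    if (na < 0 || (size_t)na >= sizeof(path_a) ||
        nb < 0 || (size_t)nb >= sizeof(path_b))
        return -1; /* path did not fit in the buffer */

    /* stat() follows the entry to the namespace's proc inode. */
    if (stat(path_a, &sa) != 0 || stat(path_b, &sb) != 0)
        return -1;

    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

For example, `same_namespace("self", "1", "uts")` would compare the caller's UTS namespace with that of PID 1, provided the caller is allowed to inspect that process.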
2012-11-20userns: For /proc/self/{uid,gid}_map derive the lower userns from the struct fileEric W. Biederman1-2/+10
To keep things sane in the context of file descriptor passing, derive the user namespace that uids are mapped into from the opener of the file instead of from current. When writing to the maps file the lower user namespace must always be the parent user namespace, or setting the mapping simply does not make sense. Enforce that the opener of the file was in the parent user namespace or the user namespace whose mapping is being set. Acked-by: Serge E. Hallyn <[email protected]> Signed-off-by: "Eric W. Biederman" <[email protected]>
2012-11-20userns: Implement unshare of the user namespaceEric W. Biederman3-7/+41
- Add CLONE_THREAD to the unshare flags if CLONE_NEWUSER is selected, as changing user namespaces is only valid if there is only a single thread. - Restore the code to add CLONE_VM if CLONE_THREAD is selected and the code to add CLONE_SIGHAND if CLONE_VM is selected, making the constraints in the code clear. Acked-by: Serge Hallyn <[email protected]> Signed-off-by: "Eric W. Biederman" <[email protected]>
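The chain of implied flags can be sketched as a small pure function. The `EX_` constants mirror the usual Linux values of the corresponding CLONE_* flags but are defined locally, and `imply_unshare_flags()` is a hypothetical name, not the kernel function.

```c
/* Local copies of the flag values, prefixed to avoid clashing with <sched.h>. */
#define EX_CLONE_VM      0x00000100UL
#define EX_CLONE_SIGHAND 0x00000800UL
#define EX_CLONE_THREAD  0x00010000UL
#define EX_CLONE_NEWUSER 0x10000000UL

/*
 * Expand the requested unshare flags with everything they imply:
 * a new user namespace implies unsharing the thread group, which
 * implies unsharing the VM, which implies unsharing signal handlers.
 * The order of the tests matters, since each step feeds the next.
 */
static unsigned long imply_unshare_flags(unsigned long flags)
{
    if (flags & EX_CLONE_NEWUSER)
        flags |= EX_CLONE_THREAD;
    if (flags & EX_CLONE_THREAD)
        flags |= EX_CLONE_VM;
    if (flags & EX_CLONE_VM)
        flags |= EX_CLONE_SIGHAND;
    return flags;
}
```

Requesting only the new user namespace therefore ends up with all four bits set, which makes the single-thread constraint explicit in one place.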
2012-11-20userns: Implement proc namespace operationsEric W. Biederman1-17/+73
This allows entering a user namespace, and the ability to store a reference to a user namespace with a bind mount. Addition of missing userns_ns_put in userns_install from Gao feng <[email protected]> Acked-by: Serge Hallyn <[email protected]> Signed-off-by: "Eric W. Biederman" <[email protected]>
2012-11-20userns: Kill task_user_nsEric W. Biederman2-4/+16
The task_user_ns function hides the fact that it is getting the user namespace from struct cred on the task. struct cred may go away as soon as the rcu lock is released. This leads to a race where we can dereference a stale user namespace pointer. To make it obvious a struct cred is involved kill task_user_ns. To kill the race modify the users of task_user_ns to only reference the user namespace while the rcu lock is held. Cc: Kees Cook <[email protected]> Cc: James Morris <[email protected]> Acked-by: Kees Cook <[email protected]> Acked-by: Serge Hallyn <[email protected]> Signed-off-by: "Eric W. Biederman" <[email protected]>
2012-11-20userns: Make create_new_namespaces take a user_ns parameterEric W. Biederman2-14/+17
Modify create_new_namespaces to explicitly take a user namespace parameter, instead of implicitly through the task_struct. This allows an implementation of unshare(CLONE_NEWUSER) where the new user namespace is not stored onto the current task_struct until after all of the namespaces are created. Acked-by: Serge Hallyn <[email protected]> Signed-off-by: "Eric W. Biederman" <[email protected]>
2012-11-20userns: Allow unprivileged use of setns.Eric W. Biederman2-4/+6
- Push the permission check from the core setns syscall into the setns install methods where the user namespace of the target namespace can be determined, and used in a ns_capable call. Acked-by: Serge Hallyn <[email protected]> Signed-off-by: "Eric W. Biederman" <[email protected]>
2012-11-20userns: Allow unprivileged users to create new namespacesEric W. Biederman1-2/+3
If an unprivileged user has the appropriate capabilities in their current user namespace allow the creation of new namespaces. Acked-by: Serge Hallyn <[email protected]> Signed-off-by: "Eric W. Biederman" <[email protected]>
2012-11-20userns: Allow setting a userns mapping to your current uid.Eric W. Biederman1-0/+15
Acked-by: Serge Hallyn <[email protected]> Signed-off-by: "Eric W. Biederman" <[email protected]>
2012-11-19tracing: Remove unnecessary WARN_ONCE's from tracing_buffers_splice_readDave Jones1-2/+0
WARN shouldn't be used as a means of communicating failure to a userspace programmer. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Dave Jones <[email protected]> Signed-off-by: Steven Rostedt <[email protected]>
2012-11-19tracing: Remove unneeded checks from the stack tracerAnton Vorontsov1-4/+0
It seems that the 'ftrace_enabled' flag should not be used inside the tracer functions. The ftrace core is using this flag for internal purposes, and the flag wasn't meant to be used in tracers' runtime checks. The stack tracer is the only tracer that is abusing the flag, so stop it from serving as a bad example. Also, there is a local 'stack_trace_disabled' flag in the stack tracer, which is never updated; so it can be removed as well. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Anton Vorontsov <[email protected]> Signed-off-by: Steven Rostedt <[email protected]>
2012-11-19cgroup: add cgroup->idTejun Heo1-1/+14
With the introduction of generic cgroup hierarchy iterators, css_id is being phased out. It was unnecessarily complex, id'ing the wrong thing (cgroups need IDs, not CSSes) and has other oddities like not being available at ->css_alloc(). This patch adds cgroup->id, which is a simple per-hierarchy ida-allocated ID which is assigned before ->css_alloc() and released after ->css_free(). Signed-off-by: Tejun Heo <[email protected]> Acked-by: Li Zefan <[email protected]> Acked-by: Neil Horman <[email protected]>
2012-11-19cgroup, cpuset: remove cgroup_subsys->post_clone()Tejun Heo2-48/+36
Currently CGRP_CPUSET_CLONE_CHILDREN triggers ->post_clone(). Now that clone_children is cpuset specific, there's no reason to have this rather odd option activation mechanism in cgroup core. cpuset can check the flag from its ->css_alloc() and take the necessary action. Move cpuset_post_clone() logic to the end of cpuset_css_alloc() and remove cgroup_subsys->post_clone(). Loosely based on Glauber's "generalize post_clone into post_create" patch. Signed-off-by: Tejun Heo <[email protected]> Original-patch-by: Glauber Costa <[email protected]> Original-patch: <[email protected]> Acked-by: Serge E. Hallyn <[email protected]> Acked-by: Li Zefan <[email protected]> Cc: Glauber Costa <[email protected]>
2012-11-19cgroup: s/CGRP_CLONE_CHILDREN/CGRP_CPUSET_CLONE_CHILDREN/Tejun Heo1-16/+12
clone_children is only meaningful for cpuset and will stay that way. Rename the flag to reflect that and update documentation. Also, drop clone_children() wrapper in cgroup.c. The thin wrapper is used only a few times and one of them will go away soon. Signed-off-by: Tejun Heo <[email protected]> Acked-by: Serge E. Hallyn <[email protected]> Acked-by: Li Zefan <[email protected]> Cc: Glauber Costa <[email protected]>
2012-11-19cgroup: rename ->create/post_create/pre_destroy/destroy() to ->css_alloc/online/offline/free()Tejun Heo5-52/+53
Rename cgroup_subsys css lifetime related callbacks to better describe what their roles are. Also, update documentation. Signed-off-by: Tejun Heo <[email protected]> Acked-by: Li Zefan <[email protected]>
2012-11-19cgroup: allow ->post_create() to failTejun Heo2-11/+22
There could be cases where controllers want to do initialization operations which may fail from ->post_create(). This patch makes ->post_create() return -errno to indicate failure and online_css() relay such failures. Signed-off-by: Tejun Heo <[email protected]> Acked-by: Li Zefan <[email protected]> Cc: Glauber Costa <[email protected]>
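The online/offline pairing with a fallible ->post_create() can be sketched in userspace as follows. The struct, the flag constant and the example callbacks are hypothetical simplifications. Locking, including the temporary dropping of cgroup_mutex, is left out entirely.

```c
#include <errno.h>
#include <stddef.h>

#define CSS_ONLINE_SKETCH 0x1u

/* Hypothetical, reduced stand-in for a css with its two callbacks. */
struct css_sketch {
    unsigned int flags;
    int (*post_create)(struct css_sketch *css);  /* may fail with -errno */
    void (*pre_destroy)(struct css_sketch *css);
    int pre_destroy_calls;                       /* bookkeeping for the example */
};

/* Run ->post_create() and mark the css online only if it succeeded. */
static int online_css_sketch(struct css_sketch *css)
{
    int ret = 0;

    if (css->post_create)
        ret = css->post_create(css);
    if (!ret)
        css->flags |= CSS_ONLINE_SKETCH;
    return ret; /* relay the failure to the caller */
}

/* Run ->pre_destroy() only for a css that actually went online. */
static void offline_css_sketch(struct css_sketch *css)
{
    if (!(css->flags & CSS_ONLINE_SKETCH))
        return;
    if (css->pre_destroy)
        css->pre_destroy(css);
    css->flags &= ~CSS_ONLINE_SKETCH;
}

/* Example callbacks used to exercise both paths. */
static int post_create_ok(struct css_sketch *css) { (void)css; return 0; }
static int post_create_fail(struct css_sketch *css) { (void)css; return -ENOMEM; }
static void pre_destroy_count(struct css_sketch *css) { css->pre_destroy_calls++; }
```

Because the online flag is set only after a successful ->post_create(), a failure path can call the offline helper unconditionally, and ->pre_destroy() is skipped for anything that never came online.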
2012-11-19cgroup: update cgroup_create() failure pathTejun Heo1-7/+14
cgroup_create() was ignoring failure of cgroupfs files. Update it such that, if file creation fails, it rolls back by calling cgroup_destroy_locked() and returns failure. Note that error out goto labels are renamed. The labels are a bit confusing but will become better w/ later cgroup operation renames. Signed-off-by: Tejun Heo <[email protected]> Acked-by: Li Zefan <[email protected]>
2012-11-19cgroup: use mutex_trylock() when grabbing i_mutex of a new cgroup directoryTejun Heo1-3/+9
All cgroup directory i_mutexes nest outside cgroup_mutex; however, new directory creation is a special case. A new cgroup directory is created while holding cgroup_mutex. Populating the new directory requires both the new directory's i_mutex and cgroup_mutex. Because all directory i_mutexes nest outside cgroup_mutex, grabbing both requires releasing cgroup_mutex first, which isn't a good idea as the new cgroup isn't yet ready to be manipulated by other cgroup operations. This is worked around by grabbing the new directory's i_mutex while holding cgroup_mutex before making it visible. As there's no other user at that point, grabbing the i_mutex under cgroup_mutex can't lead to deadlock. cgroup_create_file() was using I_MUTEX_CHILD to tell lockdep not to worry about the reverse locking order; however, this creates pseudo locking dependency cgroup_mutex -> I_MUTEX_CHILD, which isn't true - all directory i_mutexes are still nested outside cgroup_mutex. This pseudo locking dependency can lead to spurious lockdep warnings. Use mutex_trylock() instead. This will always succeed and lockdep doesn't create any locking dependency for it. Signed-off-by: Tejun Heo <[email protected]> Acked-by: Li Zefan <[email protected]>
2012-11-19cgroup: simplify cgroup_load_subsys() failure pathTejun Heo1-11/+10
Now that cgroup_unload_subsys() can tell whether the root css is online or not, we can safely call cgroup_unload_subsys() after idr init failure in cgroup_load_subsys(). Replace the manual unrolling and invoke cgroup_unload_subsys() on failure. This drops cgroup_mutex in between but should be safe as the subsystem will fail try_module_get() and thus can't be mounted in between. As this means that cgroup_unload_subsys() can be called before css_sets are rehashed, remove BUG_ON() on %NULL css_set->subsys[] from cgroup_unload_subsys(). Signed-off-by: Tejun Heo <[email protected]> Acked-by: Li Zefan <[email protected]>
2012-11-19cgroup: introduce CSS_ONLINE flag and on/offline_css() helpersTejun Heo1-23/+42
New helpers on/offline_css() respectively wrap ->post_create() and ->pre_destroy() invocations. online_css() sets CSS_ONLINE after ->post_create() is complete and offline_css() invokes ->pre_destroy() iff CSS_ONLINE is set and clears it while also handling the temporary dropping of cgroup_mutex. This patch doesn't introduce any behavior change at the moment but will be used to improve cgroup_create() failure path and allow ->post_create() to fail. Signed-off-by: Tejun Heo <[email protected]> Acked-by: Li Zefan <[email protected]>
2012-11-19cgroup: separate out cgroup_destroy_locked()Tejun Heo1-15/+25
Separate out cgroup_destroy_locked() from cgroup_destroy(). This will be later used in cgroup_create() failure path. While at it, add lockdep asserts on i_mutex and cgroup_mutex, and move @d and @parent assignments to their declarations. This patch doesn't introduce any functional difference. Signed-off-by: Tejun Heo <[email protected]> Acked-by: Li Zefan <[email protected]>
2012-11-19cgroup: fix harmless bugs in cgroup_load_subsys() fail path and cgroup_unload_subsys()Tejun Heo1-1/+14
* If idr init fails, cgroup_load_subsys() cleared dummytop->subsys[] before calling ->destroy(), making CSS inaccessible to the callback, and didn't unlink ss->sibling. As no modular controller uses ->use_id, this doesn't cause any actual problems. * cgroup_unload_subsys() was forgetting to free idr, call ->pre_destroy() and clear ->active. As there currently is no modular controller which uses ->use_id, ->pre_destroy() or ->active, this doesn't cause any actual problems. Signed-off-by: Tejun Heo <[email protected]> Acked-by: Li Zefan <[email protected]>
2012-11-19cgroup: lock cgroup_mutex in cgroup_init_subsys()Tejun Heo1-0/+4
Make cgroup_init_subsys() grab cgroup_mutex while initializing a subsystem so that all helpers and callbacks are called under the context they expect. This isn't strictly necessary as cgroup_init_subsys() doesn't race with anybody but will allow adding lockdep assertions. Signed-off-by: Tejun Heo <[email protected]> Acked-by: Li Zefan <[email protected]>