aboutsummaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)AuthorFilesLines
2006-10-03[PATCH] sched: don't print migration cost when only 1 CPUDave Jones1-6/+8
If only a single CPU is present, printing this doesn't make much sense. Signed-off-by: Dave Jones <[email protected]> Acked-by: Ingo Molnar <[email protected]> Acked-by: Nick Piggin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-03[PATCH] sched: remove unnecessary sched group allocationsSiddha, Suresh B1-66/+38
Remove dynamic sched group allocations for MC and SMP domains. These allocations can easily fail on big systems(1024 or so CPUs) and we can live with out these dynamic allocations. [[email protected]: build fix] Signed-off-by: Suresh Siddha <[email protected]> Acked-by: Ingo Molnar <[email protected]> Acked-by: Nick Piggin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-03[PATCH] sched: force /sbin/init off isolated cpusNick Piggin1-2/+11
Force /sbin/init off isolated cpus (unless every CPU is specified as an isolcpu). Users seem to think that the isolated CPUs shouldn't have much running on them to begin with. That's fair enough: intuitive, I guess. It also means that the cpu affinity masks of tasks will not include isolcpus by default, which is also more intuitive, perhaps. /sbin/init is spawned from the boot CPU's idle thread, and /sbin/init starts the rest of userspace. So if the boot CPU is specified to be an isolcpu, then prior to this patch, all of userspace will be run there. (throw in a couple of plausible devinit -> cpuinit conversions I spotted while we're here). Signed-off-by: Nick Piggin <[email protected]> Cc: Dimitri Sivanich <[email protected]> Acked-by: Paul Jackson <[email protected]> Acked-by: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-03[PATCH] kernel-doc for kernel/resource.cRandy Dunlap1-11/+72
Add kernel-doc function headers in kernel/resource.c and use them in DocBook. Signed-off-by: Randy Dunlap <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-03[PATCH] kernel-doc for kernel/dma.cRandy Dunlap1-1/+9
Add kernel-doc function headers in kernel/dma.c and use it in DocBook. Clean up kernel-doc in mca_dma.h (the colon (':') represents a section header). Signed-off-by: Randy Dunlap <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-03[PATCH] Create kallsyms_lookup_size_offset()Franck Bui-Huu2-44/+82
Some uses of kallsyms_lookup() do not need to find out the name of a symbol and its module's name it belongs. This is specially true in arch specific code, which needs to unwind the stack to show the back trace during oops (mips is an example). In this specific case, we just need to retreive the function's size and the offset of the active intruction inside it. Adds a new entry "kallsyms_lookup_size_offset()" This new entry does exactly the same as kallsyms_lookup() but does not require any buffers to store any names. It returns 0 if it fails otherwise 1. Signed-off-by: Franck Bui-Huu <[email protected]> Cc: Rusty Russell <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-02[PATCH] BLOCK: Revert patch to hack around undeclared sigset_t in linux/compat.hDavid Howells1-2/+0
Revert Andrew Morton's patch to temporarily hack around the lack of a declaration of sigset_t in linux/compat.h to make the block-disablement patches build on IA64. This got accidentally pushed to Linus and should be fixed in a different manner. Also make linux/compat.h #include asm/signal.h to gain a definition of sigset_t so that it can externally declare sigset_from_compat(). This has been compile-tested for i386, x86_64, ia64, mips, mips64, frv, ppc and ppc64 and run-tested on frv. Signed-off-by: David Howells <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-02[PATCH] replace cad_pid by a struct pidCedric Le Goater2-6/+30
There are a few places in the kernel where the init task is signaled. The ctrl+alt+del sequence is one them. It kills a task, usually init, using a cached pid (cad_pid). This patch replaces the pid_t by a struct pid to avoid pid wrap around problem. The struct pid is initialized at boot time in init() and can be modified through systctl with /proc/sys/kernel/cad_pid [ I haven't found any distro using it ? ] It also introduces a small helper routine kill_cad_pid() which is used where it seemed ok to use cad_pid instead of pid 1. [[email protected]: cleanups, build fix] Signed-off-by: Cedric Le Goater <[email protected]> Cc: Eric W. Biederman <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Paul Mackerras <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-02[PATCH] introduce get_task_pid() to fix unsafe get_pid()Oleg Nesterov1-0/+9
proc_pid_make_inode: ei->pid = get_pid(task_pid(task)); I think this is not safe. get_pid() can be preempted after checking "pid != NULL". Then the task exits, does detach_pid(), and RCU frees the pid. Signed-off-by: Oleg Nesterov <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-02[PATCH] introduce kernel_execveArnd Bergmann1-3/+2
The use of execve() in the kernel is dubious, since it relies on the __KERNEL_SYSCALLS__ mechanism that stores the result in a global errno variable. As a first step of getting rid of this, change all users to a global kernel_execve function that returns a proper error code. This function is a terrible hack, and a later patch removes it again after the kernel syscalls are gone. Signed-off-by: Arnd Bergmann <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Richard Henderson <[email protected]> Cc: Ivan Kokshaysky <[email protected]> Cc: Russell King <[email protected]> Cc: Ian Molton <[email protected]> Cc: Mikael Starvik <[email protected]> Cc: David Howells <[email protected]> Cc: Yoshinori Sato <[email protected]> Cc: Hirokazu Takata <[email protected]> Cc: Ralf Baechle <[email protected]> Cc: Kyle McMartin <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Paul Mundt <[email protected]> Cc: Kazumoto Kojima <[email protected]> Cc: Richard Curnow <[email protected]> Cc: William Lee Irwin III <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Jeff Dike <[email protected]> Cc: Paolo 'Blaisorblade' Giarrusso <[email protected]> Cc: Miles Bader <[email protected]> Cc: Chris Zankel <[email protected]> Cc: "Luck, Tony" <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Roman Zippel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-02[PATCH] nsproxy cloning error path fixPavel1-1/+1
This patch fixes copy_namespaces()'s error path. when new nsproxy (new_ns) is created pointers to namespaces (ipc, uts) are copied from the old nsproxy. Later in copy_utsname, copy_ipcs, etc. according namespaces are get-ed. On error path needed namespaces are put-ed, so there's no need to put new nsproxy itelf as it woud cause putting namespaces for the second time. Found when incorporating namespaces into OpenVZ kernel. Signed-off-by: Pavel Emelianov <[email protected]> Acked-by: Serge Hallyn <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-02[PATCH] IPC namespace - sysctlsKirill Korotaev1-28/+88
Sysctl tweaks for IPC namespace Signed-off-by: Pavel Emelianiov <[email protected]> Signed-off-by: Kirill Korotaev <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-02[PATCH] IPC namespace - utilsKirill Korotaev1-0/+10
This patch adds basic IPC namespace functionality to IPC utils: - init_ipc_ns - copy/clone/unshare/free IPC ns - /proc preparations Signed-off-by: Pavel Emelianov <[email protected]> Signed-off-by: Kirill Korotaev <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Cedric Le Goater <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-02[PATCH] IPC namespace coreKirill Korotaev2-17/+46
This patch set allows to unshare IPCs and have a private set of IPC objects (sem, shm, msg) inside namespace. Basically, it is another building block of containers functionality. This patch implements core IPC namespace changes: - ipc_namespace structure - new config option CONFIG_IPC_NS - adds CLONE_NEWIPC flag - unshare support [[email protected]: small fix for unshare of ipc namespace] [[email protected]: build fix] Signed-off-by: Pavel Emelianov <[email protected]> Signed-off-by: Kirill Korotaev <[email protected]> Signed-off-by: Cedric Le Goater <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-02[PATCH] uts: copy nsproxy only when neededSerge Hallyn1-6/+14
The nsproxy was being copied in unshare() when anything was being unshared, even if it was something not referenced from nsproxy. This should end up in some cases with far more memory usage than necessary. Signed-off-by: Serge Hallyn <[email protected]> Cc: Kirill Korotaev <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Herbert Poetzl <[email protected]> Cc: Andrey Savochkin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-02[PATCH] namespaces: utsname: implement CLONE_NEWUTS flagSerge E. Hallyn3-4/+71
Implement a CLONE_NEWUTS flag, and use it at clone and sys_unshare. [[email protected]: IPC unshare fix] [[email protected]: cleanup] Signed-off-by: Serge Hallyn <[email protected]> Cc: Kirill Korotaev <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Herbert Poetzl <[email protected]> Cc: Andrey Savochkin <[email protected]> Signed-off-by: Adrian Bunk <[email protected]> Signed-off-by: Cedric Le Goater <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-02[PATCH] namespaces: utsname: sysctlSerge E. Hallyn1-19/+119
Sysctl uts patch. This will need to be done another way, but since sysctl itself needs to be container aware, 'the right thing' is a separate patchset. [[email protected]: ia64 build fix] [[email protected]: cleanup] [[email protected]: add proc_do_utsns_string] Signed-off-by: Serge E. Hallyn <[email protected]> Cc: Kirill Korotaev <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Herbert Poetzl <[email protected]> Cc: Andrey Savochkin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-02[PATCH] namespaces: utsname: implement utsname namespacesSerge E. Hallyn3-0/+57
This patch defines the uts namespace and some manipulators. Adds the uts namespace to task_struct, and initializes a system-wide init namespace. It leaves a #define for system_utsname so sysctl will compile. This define will be removed in a separate patch. [[email protected]: build fix, cleanup] Signed-off-by: Serge Hallyn <[email protected]> Cc: Kirill Korotaev <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Herbert Poetzl <[email protected]> Cc: Andrey Savochkin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-02[PATCH] namespaces: utsname: use init_utsname when appropriateSerge E. Hallyn2-8/+8
In some places, particularly drivers and __init code, the init utsns is the appropriate one to use. This patch replaces those with a the init_utsname helper. Changes: Removed several uses of init_utsname(). Hope I picked all the right ones in net/ipv4/ipconfig.c. These are now changed to utsname() (the per-process namespace utsname) in the previous patch (2/7) [[email protected]: CIFS fix] Signed-off-by: Serge E. Hallyn <[email protected]> Cc: Kirill Korotaev <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Herbert Poetzl <[email protected]> Cc: Andrey Savochkin <[email protected]> Cc: Serge Hallyn <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-02[PATCH] namespaces: utsname: switch to using uts namespacesSerge E. Hallyn1-7/+7
Replace references to system_utsname to the per-process uts namespace where appropriate. This includes things like uname. Changes: Per Eric Biederman's comments, use the per-process uts namespace for ELF_PLATFORM, sunrpc, and parts of net/ipv4/ipconfig.c [[email protected]: UML fix] [[email protected]: cleanup] [[email protected]: build fix] Signed-off-by: Serge E. Hallyn <[email protected]> Cc: Kirill Korotaev <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Herbert Poetzl <[email protected]> Cc: Andrey Savochkin <[email protected]> Signed-off-by: Cedric Le Goater <[email protected]> Cc: Jeff Dike <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-02[PATCH] namespaces: exit_task_namespaces() invalidates nsproxyCedric Le Goater1-1/+1
exit_task_namespaces() has replaced the former exit_namespace(). It invalidates task->nsproxy and associated namespaces. This is an issue for the (futur) pid namespace which is required to be valid in exit_notify(). This patch moves exit_task_namespaces() after exit_notify() to keep nsproxy valid. Signed-off-by: Cedric Le Goater <[email protected]> Cc: Serge E. Hallyn <[email protected]> Cc: Kirill Korotaev <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Herbert Poetzl <[email protected]> Cc: Andrey Savochkin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-02[PATCH] namespaces: incorporate fs namespace into nsproxySerge E. Hallyn3-16/+37
This moves the mount namespace into the nsproxy. The mount namespace count now refers to the number of nsproxies point to it, rather than the number of tasks. As a result, the unshare_namespace() function in kernel/fork.c no longer checks whether it is being shared. Signed-off-by: Serge Hallyn <[email protected]> Cc: Kirill Korotaev <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Herbert Poetzl <[email protected]> Cc: Andrey Savochkin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-02[PATCH] nsproxy: move init_nsproxy into kernel/nsproxy.cSerge E. Hallyn1-0/+3
Move the init_nsproxy definition out of arch/ into kernel/nsproxy.c. This avoids all arches having to be updated. Compiles and boots on s390. Signed-off-by: Serge E. Hallyn <[email protected]> Cc: Kirill Korotaev <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Herbert Poetzl <[email protected]> Cc: Andrey Savochkin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-02[PATCH] namespaces: add nsproxySerge E. Hallyn4-2/+102
This patch adds a nsproxy structure to the task struct. Later patches will move the fs namespace pointer into this structure, and introduce a new utsname namespace into the nsproxy. The vserver and openvz functionality, then, would be implemented in large part by virtualizing/isolating more and more resources into namespaces, each contained in the nsproxy. [[email protected]: build fix] Signed-off-by: Serge Hallyn <[email protected]> Cc: Kirill Korotaev <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Herbert Poetzl <[email protected]> Cc: Andrey Savochkin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-02[PATCH] make kernel/sysctl.c:_proc_do_string() staticAdrian Bunk1-2/+3
This patch makes the needlessly global _proc_do_string() static. Signed-off-by: Adrian Bunk <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-02[PATCH] proc: sysctl: add _proc_do_string helperSam Vilain1-29/+36
The logic in proc_do_string is worth re-using without passing in a ctl_table structure (say, we want to calculate a pointer and pass that in instead); pass in the two fields it uses from that structure as explicit arguments. Signed-off-by: Sam Vilain <[email protected]> Cc: Serge E. Hallyn <[email protected]> Cc: Kirill Korotaev <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Herbert Poetzl <[email protected]> Cc: Andrey Savochkin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-02[PATCH] cpumask: export cpu_online_map and cpu_possible_map consistentlyGreg Banks1-0/+3
cpumask: ensure that the cpu_online_map and cpu_possible_map bitmasks, and hence all the macros in <linux/cpumask.h> that require them, are available to modules for all supported combinations of architecture and CONFIG_SMP. Signed-off-by: Greg Banks <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-02[PATCH] kretprobe spinlock deadlock patchbibo,mao1-4/+11
kprobe_flush_task() possibly calls kfree function during holding kretprobe_lock spinlock, if kfree function is probed by kretprobe that will incur spinlock deadlock. This patch moves kfree function out scope of kretprobe_lock. Signed-off-by: bibo, mao <[email protected]> Signed-off-by: Ananth N Mavinakayanahalli <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-02[PATCH] disallow kprobes on notifier_call_chainbibo,mao1-1/+1
When kprobe is re-entered, the re-entered kprobe kernel path will will call atomic_notifier_call_chain function, if this function is kprobed that will incur numerous kprobe recursive fault. This patch disallows kprobes on atomic_notifier_call_chain function. Signed-off-by: bibo, mao <[email protected]> Signed-off-by: Ananth N Mavinakayanahalli <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-02[PATCH] kprobe whitespace cleanupbibo,mao1-8/+8
Whitespace is used to indent, this patch cleans up these sentences by kernel coding style. Signed-off-by: bibo, mao <[email protected]> Signed-off-by: Ananth N Mavinakayanahalli <[email protected]> Cc: "Luck, Tony" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-02[PATCH] Kprobes: Make kprobe modules more portableAnanth N Mavinakayanahalli2-1/+26
In an effort to make kprobe modules more portable, here is a patch that: o Introduces the "symbol_name" field to struct kprobe. The symbol->address resolution now happens in the kernel in an architecture agnostic manner. 64-bit powerpc users no longer have to specify the ".symbols" o Introduces the "offset" field to struct kprobe to allow a user to specify an offset into a symbol. o The legacy mechanism of specifying the kprobe.addr is still supported. However, if both the kprobe.addr and kprobe.symbol_name are specified, probe registration fails with an -EINVAL. o The symbol resolution code uses kallsyms_lookup_name(). So CONFIG_KPROBES now depends on CONFIG_KALLSYMS o Apparantly kprobe modules were the only legitimate out-of-tree user of the kallsyms_lookup_name() EXPORT. Now that the symbol resolution happens in-kernel, remove the EXPORT as suggested by Christoph Hellwig o Modify tcp_probe.c that uses the kprobe interface so as to make it work on multiple platforms (in its earlier form, the code wouldn't work, say, on powerpc) Signed-off-by: Ananth N Mavinakayanahalli <[email protected]> Signed-off-by: Prasanna S Panchamukhi <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-02[PATCH] usb: fixup usb so it uses struct pidEric W. Biederman1-4/+4
The problem with remembering a user space process by its pid is that it is possible that the process will exit, pid wrap around will occur. Converting to a struct pid avoid that problem, and paves the way for implementing a pid namespace. Also since usb is the only user of kill_proc_info_as_uid rename kill_proc_info_as_uid to kill_pid_info_as_uid and have the new version take a struct pid. Signed-off-by: Eric W. Biederman <[email protected]> Acked-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-02[PATCH] Use struct pspace in next_pidmap and find_ge_pidEric W. Biederman1-6/+7
This updates my proc: readdir race fix (take 3) patch to account for the changes made by: Sukadev Bhattiprolu <[email protected]> to introduce struct pspace. Signed-off-by: Eric W. Biederman <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-02[PATCH] Define struct pspaceSukadev Bhattiprolu1-22/+31
Define a per-container pid space object. And create one instance of this object, init_pspace, to define the entire pid space. Subsequent patches will provide/use interfaces to create/destroy pid spaces. Its a subset/rework of Eric Biederman's patch http://lkml.org/lkml/2006/2/6/285 . Signed-off-by: Eric Biederman <[email protected]> Signed-off-by: Sukadev Bhattiprolu <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Serge Hallyn <[email protected]> Cc: Cedric Le Goater <[email protected]> Cc: Kirill Korotaev <[email protected]> Cc: Andrey Savochkin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-02[PATCH] Move pidmap to pspace.hSukadev Bhattiprolu1-6/+1
Move struct pidmap and PIDMAP_ENTRIES to a new file, include/linux/pspace.h where it will be used in subsequent patches to define pid spaces. Its a subset of Eric Biederman's patch http://lkml.org/lkml/2006/2/6/285 [[email protected]: cleanups] Signed-off-by: Eric W. Biederman <[email protected]> Signed-off-by: Sukadev Bhattiprolu <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Serge Hallyn <[email protected]> Cc: Cedric Le Goater <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-02[PATCH] pids coding style use struct pidmap in next_pidmapEric W. Biederman1-1/+1
Use struct pidmap instead of pidmap_t. This updates my proc: readdir race fix (take 3) patch to account for the changes made by: Sukadev Bhattiprolu <[email protected]> to kill pidmap_t. Signed-off-by: Eric W. Biederman <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-02[PATCH] pids: coding style: use struct pidmapSukadev Bhattiprolu1-5/+5
Use struct pidmap instead of pidmap_t. Its a subset of Eric Biederman's patch http://lkml.org/lkml/2006/2/6/271. Signed-off-by: Eric W. Biederman <[email protected]> Signed-off-by: Sukadev Bhattiprolu <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Serge Hallyn <[email protected]> Cc: Cedric Le Goater <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-02[PATCH] file: modify struct fown_struct to use a struct pidEric W. Biederman1-1/+1
File handles can be requested to send sigio and sigurg to processes. By tracking the destination processes using struct pid instead of pid_t we make the interface safe from all potential pid wrap around problems. Signed-off-by: Eric W. Biederman <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-02[PATCH] pid: export the symbols needed to use struct pid *Eric W. Biederman1-0/+3
pids aren't something that drivers should care about. However there are a lot of helper layers in the kernel that do care, and are built as modules. Before I can convert them to using struct pid instead of pid_t I need to export the appropriate symbols so they can continue to be built. Signed-off-by: Eric W. Biederman <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-02[PATCH] pid: implement signal functions that take a struct pid *Eric W. Biederman1-10/+47
Currently the signal functions all either take a task or a pid_t argument. This patch implements variants that take a struct pid *. After all of the users have been update it is my intention to remove the variants that take a pid_t as using pid_t can be more work (an extra hash table lookup) and difficult to get right in the presence of multiple pid namespaces. There are two kinds of functions introduced in this patch. The are the general use functions kill_pgrp and kill_pid which take a priv argument that is ultimately used to create the appropriate siginfo information, Then there are _kill_pgrp_info, kill_pgrp_info, kill_pid_info the internal implementation helpers that take an explicit siginfo. The distinction is made because filling out an explcit siginfo is tricky, and will be even more tricky when pid namespaces are introduced. Signed-off-by: Eric W. Biederman <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-02[PATCH] proc: readdir race fix (take 3)Eric W. Biederman1-0/+36
The problem: An opendir, readdir, closedir sequence can fail to report process ids that are continually in use throughout the sequence of system calls. For this race to trigger the process that proc_pid_readdir stops at must exit before readdir is called again. This can cause ps to fail to report processes, and it is in violation of posix guarantees and normal application expectations with respect to readdir. Currently there is no way to work around this problem in user space short of providing a gargantuan buffer to user space so the directory read all happens in on system call. This patch implements the normal directory semantics for proc, that guarantee that a directory entry that is neither created nor destroyed while reading the directory entry will be returned. For directory that are either created or destroyed during the readdir you may or may not see them. Furthermore you may seek to a directory offset you have previously seen. These are the guarantee that ext[23] provides and that posix requires, and more importantly that user space expects. Plus it is a simple semantic to implement reliable service. It is just a matter of calling readdir a second time if you are wondering if something new has show up. These better semantics are implemented by scanning through the pids in numerical order and by making the file offset a pid plus a fixed offset. The pid scan happens on the pid bitmap, which when you look at it is remarkably efficient for a brute force algorithm. Given that a typical cache line is 64 bytes and thus covers space for 64*8 == 200 pids. There are only 40 cache lines for the entire 32K pid space. A typical system will have 100 pids or more so this is actually fewer cache lines we have to look at to scan a linked list, and the worst case of having to scan the entire pid bitmap is pretty reasonable. If we need something more efficient we can go to a more efficient data structure for indexing the pids, but for now what we have should be sufficient. In addition this takes no additional locks and is actually less code than what we are doing now. Also another very subtle bug in this area has been fixed. It is possible to catch a task in the middle of de_thread where a thread is assuming the thread of it's thread group leader. This patch carefully handles that case so if we hit it we don't fail to return the pid, that is undergoing the de_thread dance. Thanks to KAMEZAWA Hiroyuki <[email protected]> for providing the first fix, pointing this out and working on it. [[email protected]: fix it] Signed-off-by: Eric W. Biederman <[email protected]> Acked-by: KAMEZAWA Hiroyuki <[email protected]> Signed-off-by: Oleg Nesterov <[email protected]> Cc: Jean Delvare <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-02[PATCH] list module taint flags in Oops/panicRandy Dunlap1-3/+34
When listing loaded modules during an oops or panic, also list each module's Tainted flags if non-zero (P: Proprietary or F: Forced load only). If a module is did not taint the kernel, it is just listed like usbcore but if it did taint the kernel, it is listed like wizmodem(PF) Example: [ 3260.121718] Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP: [ 3260.121729] [<ffffffff8804c099>] :dump_test:proc_dump_test+0x99/0xc8 [ 3260.121742] PGD fe8d067 PUD 264a6067 PMD 0 [ 3260.121748] Oops: 0002 [1] SMP [ 3260.121753] CPU 1 [ 3260.121756] Modules linked in: dump_test(P) snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device ide_cd generic ohci1394 snd_hda_intel snd_hda_codec snd_pcm snd_timer snd ieee1394 snd_page_alloc piix ide_core arcmsr aic79xx scsi_transport_spi usblp [ 3260.121785] Pid: 5556, comm: bash Tainted: P 2.6.18-git10 #1 [Alternatively, I can look into listing tainted flags with 'lsmod', but that won't help in oopsen/panics so much.] [[email protected]: cleanup] Signed-off-by: Randy Dunlap <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-01[PATCH] Support piping into commands in /proc/sys/kernel/core_patternAndi Kleen2-1/+5
Using the infrastructure created in previous patches implement support to pipe core dumps into programs. This is done by overloading the existing core_pattern sysctl with a new syntax: |program When the first character of the pattern is a '|' the kernel will instead threat the rest of the pattern as a command to run. The core dump will be written to the standard input of that program instead of to a file. This is useful for having automatic core dump analysis without filling up disks. The program can do some simple analysis and save only a summary of the core dump. The core dump proces will run with the privileges and in the name space of the process that caused the core dump. I also increased the core pattern size to 128 bytes so that longer command lines fit. Most of the changes comes from allowing core dumps without seeks. They are fairly straight forward though. One small incompatibility is that if someone had a core pattern previously that started with '|' they will get suddenly new behaviour. I think that's unlikely to be a real problem though. Additional background: > Very nice, do you happen to have a program that can accept this kind of > input for crash dumps? I'm guessing that the embedded people will > really want this functionality. I had a cheesy demo/prototype. Basically it wrote the dump to a file again, ran gdb on it to get a backtrace and wrote the summary to a shared directory. Then there was a simple CGI script to generate a "top 10" crashes HTML listing. Unfortunately this still had the disadvantage to needing full disk space for a dump except for deleting it afterwards (in fact it was worse because over the pipe holes didn't work so if you have a holey address map it would require more space). Fortunately gdb seems to be happy to handle /proc/pid/fd/xxx input pipes as cores (at least it worked with zsh's =(cat core) syntax), so it would be likely possible to do it without temporary space with a simple wrapper that calls it in the right way. I ran out of time before doing that though. The demo prototype scripts weren't very good. If there is really interest I can dig them out (they are currently on a laptop disk on the desk with the laptop itself being in service), but I would recommend to rewrite them for any serious application of this and fix the disk space problem. Also to be really useful it should probably find a way to automatically fetch the debuginfos (I cheated and just installed them in advance). If nobody else does it I can probably do the rewrite myself again at some point. My hope at some point was that desktops would support it in their builtin crash reporters, but at least the KDE people I talked too seemed to be happy with their user space only solution. Alan sayeth: I don't believe that piping as such as neccessarily the right model, but the ability to intercept and processes core dumps from user space is asked for by many enterprise users as well. They want to know about, capture, analyse and process core dumps, often centrally and in automated form. [[email protected]: loff_t != unsigned long] Signed-off-by: Andi Kleen <[email protected]> Cc: Alan Cox <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-01[PATCH] Create call_usermodehelper_pipe()Andi Kleen1-1/+54
A new member in the ever growing family of call_usermode* functions is born. The new call_usermodehelper_pipe() function allows to pipe data to the stdin of the called user mode progam and behaves otherwise like the normal call_usermodehelp() (except that it always waits for the child to finish) Signed-off-by: Andi Kleen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-01[PATCH] r/o bind mount prepwork: inc_nlink() helperDave Hansen1-4/+4
This is mostly included for parity with dec_nlink(), where we will have some more hooks. This one should stay pretty darn straightforward for now. Signed-off-by: Dave Hansen <[email protected]> Acked-by: Christoph Hellwig <[email protected]> Cc: Al Viro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-01[PATCH] csa accounting taskstats updateJay Lan1-11/+14
ChangeLog: Feedbacks from Andrew Morton: - define TS_COMM_LEN to 32 - change acct_stimexpd field of task_struct to be of cputime_t, which is to be used to save the tsk->stime of last timer interrupt update. - a new Documentation/accounting/taskstats-struct.txt to describe fields of taskstats struct. Feedback from Balbir Singh: - keep the stime of a task to be zero when both stime and utime are zero as recoreded in task_struct. Misc: - convert accumulated RSS/VM from platform dependent pages-ticks to MBytes-usecs in the kernel Cc: Shailabh Nagar <[email protected]> Cc: Balbir Singh <[email protected]> Cc: Jes Sorensen <[email protected]> Cc: Chris Sturtivant <[email protected]> Cc: Tony Ernst <[email protected]> Cc: Guillaume Thouvenin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-01[PATCH] csa: convert CONFIG tag for extended accounting routinesJay Lan5-31/+33
There were a few accounting data/macros that are used in CSA but are #ifdef'ed inside CONFIG_BSD_PROCESS_ACCT. This patch is to change those ifdef's from CONFIG_BSD_PROCESS_ACCT to CONFIG_TASK_XACCT. A few defines are moved from kernel/acct.c and include/linux/acct.h to kernel/tsacct.c and include/linux/tsacct_kern.h. Signed-off-by: Jay Lan <[email protected]> Cc: Shailabh Nagar <[email protected]> Cc: Balbir Singh <[email protected]> Cc: Jes Sorensen <[email protected]> Cc: Chris Sturtivant <[email protected]> Cc: Tony Ernst <[email protected]> Cc: Guillaume Thouvenin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-01[PATCH] csa: Extended system accounting over taskstatsJay Lan2-0/+23
Add extended system accounting handling over taskstats interface. A CONFIG_TASK_XACCT flag is created to enable the extended accounting code. Signed-off-by: Jay Lan <[email protected]> Cc: Shailabh Nagar <[email protected]> Cc: Balbir Singh <[email protected]> Cc: Jes Sorensen <[email protected]> Cc: Chris Sturtivant <[email protected]> Cc: Tony Ernst <[email protected]> Cc: Guillaume Thouvenin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-01[PATCH] csa: basic accounting over taskstatsJay Lan3-1/+77
Add some basic accounting fields to the taskstats struct, add a new kernel/tsacct.c to handle basic accounting data handling upon exit. A handle is added to taskstats.c to invoke the basic accounting data handling. Signed-off-by: Jay Lan <[email protected]> Cc: Shailabh Nagar <[email protected]> Cc: Balbir Singh <[email protected]> Cc: Jes Sorensen <[email protected]> Cc: Chris Sturtivant <[email protected]> Cc: Tony Ernst <[email protected]> Cc: Guillaume Thouvenin <[email protected]> Cc: "Michal Piotrowski" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-01[PATCH] Fix taskstats size calculation (use the new genetlink utility functions)Balbir Singh1-1/+1
The addition of the CSA patch pushed the size of struct taskstats to 256 bytes. This exposed a problem with prepare_reply(), we were not allocating space for the netlink and genetlink header. It worked earlier because alloc_skb() would align the skb to SMP_CACHE_BYTES, which added some additonal bytes. Signed-off-by: Balbir Singh <[email protected]> Cc: Jamal Hadi <[email protected]> Cc: Shailabh Nagar <[email protected]> Cc: Thomas Graf <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Jay Lan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>