aboutsummaryrefslogtreecommitdiff
path: root/kernel/capability.c
AgeCommit message (Collapse)AuthorFilesLines
2008-11-11Capabilities: BUG when an invalid capability is requestedEric Paris1-0/+5
If an invalid (large) capability is requested the capabilities system may panic as it is dereferencing an array of fixed (short) length. Its possible (and actually often happens) that the capability system accidentally stumbled into a valid memory region but it also regularly happens that it hits invalid memory and BUGs. If such an operation does get past cap_capable then the selinux system is sure to have problems as it already does a (simple) validity check and BUG. This is known to happen by the broken and buggy firegl driver. This patch cleanly checks all capable calls and BUG if a call is for an invalid capability. This will likely break the firegl driver for some situations, but it is the right thing to do. Garbage into a security system gets you killed/bugged Signed-off-by: Eric Paris <[email protected]> Acked-by: Arjan van de Ven <[email protected]> Acked-by: Serge Hallyn <[email protected]> Acked-by: Andrew G. Morgan <[email protected]> Signed-off-by: James Morris <[email protected]>
2008-11-11When the capset syscall is used it is not possible for audit to record theEric Paris1-0/+5
actual capbilities being added/removed. This patch adds a new record type which emits the target pid and the eff, inh, and perm cap sets. example output if you audit capset syscalls would be: type=SYSCALL msg=audit(1225743140.465:76): arch=c000003e syscall=126 success=yes exit=0 a0=17f2014 a1=17f201c a2=80000000 a3=7fff2ab7f060 items=0 ppid=2160 pid=2223 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=1 comm="setcap" exe="/usr/sbin/setcap" subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 key=(null) type=UNKNOWN[1322] msg=audit(1225743140.465:76): pid=0 cap_pi=ffffffffffffffff cap_pp=ffffffffffffffff cap_pe=ffffffffffffffff Signed-off-by: Eric Paris <[email protected]> Acked-by: Serge Hallyn <[email protected]> Signed-off-by: James Morris <[email protected]>
2008-11-06file capabilities: add no_file_caps switch (v4)Serge E. Hallyn1-0/+11
Add a no_file_caps boot option when file capabilities are compiled into the kernel (CONFIG_SECURITY_FILE_CAPABILITIES=y). This allows distributions to ship a kernel with file capabilities compiled in, without forcing users to use (and understand and trust) them. When no_file_caps is specified at boot, then when a process executes a file, any file capabilities stored with that file will not be used in the calculation of the process' new capability sets. This means that booting with the no_file_caps boot option will not be the same as booting a kernel with file capabilities compiled out - in particular a task with CAP_SETPCAP will not have any chance of passing capabilities to another task (which isn't "really" possible anyway, and which may soon by killed altogether by David Howells in any case), and it will instead be able to put new capabilities in its pI. However since fI will always be empty and pI is masked with fI, it gains the task nothing. We also support the extra prctl options, setting securebits and dropping capabilities from the per-process bounding set. The other remaining difference is that killpriv, task_setscheduler, setioprio, and setnice will continue to be hooked. That will be noticable in the case where a root task changed its uid while keeping some caps, and another task owned by the new uid tries to change settings for the more privileged task. Changelog: Nov 05 2008: (v4) trivial port on top of always-start-\ with-clear-caps patch Sep 23 2008: nixed file_caps_enabled when file caps are not compiled in as it isn't used. Document no_file_caps in kernel-parameters.txt. Signed-off-by: Serge Hallyn <[email protected]> Acked-by: Andrew G. Morgan <[email protected]> Signed-off-by: James Morris <[email protected]>
2008-08-14security: Fix setting of PF_SUPERPRIV by __capable()David Howells1-8/+13
Fix the setting of PF_SUPERPRIV by __capable() as it could corrupt the flags the target process if that is not the current process and it is trying to change its own flags in a different way at the same time. __capable() is using neither atomic ops nor locking to protect t->flags. This patch removes __capable() and introduces has_capability() that doesn't set PF_SUPERPRIV on the process being queried. This patch further splits security_ptrace() in two: (1) security_ptrace_may_access(). This passes judgement on whether one process may access another only (PTRACE_MODE_ATTACH for ptrace() and PTRACE_MODE_READ for /proc), and takes a pointer to the child process. current is the parent. (2) security_ptrace_traceme(). This passes judgement on PTRACE_TRACEME only, and takes only a pointer to the parent process. current is the child. In Smack and commoncap, this uses has_capability() to determine whether the parent will be permitted to use PTRACE_ATTACH if normal checks fail. This does not set PF_SUPERPRIV. Two of the instances of __capable() actually only act on current, and so have been changed to calls to capable(). Of the places that were using __capable(): (1) The OOM killer calls __capable() thrice when weighing the killability of a process. All of these now use has_capability(). (2) cap_ptrace() and smack_ptrace() were using __capable() to check to see whether the parent was allowed to trace any process. As mentioned above, these have been split. For PTRACE_ATTACH and /proc, capable() is now used, and for PTRACE_TRACEME, has_capability() is used. (3) cap_safe_nice() only ever saw current, so now uses capable(). (4) smack_setprocattr() rejected accesses to tasks other than current just after calling __capable(), so the order of these two tests have been switched and capable() is used instead. (5) In smack_file_send_sigiotask(), we need to allow privileged processes to receive SIGIO on files they're manipulating. (6) In smack_task_wait(), we let a process wait for a privileged process, whether or not the process doing the waiting is privileged. I've tested this with the LTP SELinux and syscalls testscripts. Signed-off-by: David Howells <[email protected]> Acked-by: Serge Hallyn <[email protected]> Acked-by: Casey Schaufler <[email protected]> Acked-by: Andrew G. Morgan <[email protected]> Acked-by: Al Viro <[email protected]> Signed-off-by: James Morris <[email protected]>
2008-07-24security: filesystem capabilities refactor kernel codeAndrew G. Morgan1-117/+221
To date, we've tried hard to confine filesystem support for capabilities to the security modules. This has left a lot of the code in kernel/capability.c in a state where it looks like it supports something that filesystem support for capabilities actually suppresses when the LSM security/commmoncap.c code runs. What is left is a lot of code that uses sub-optimal locking in the main kernel With this change we refactor the main kernel code and make it explicit which locks are needed and that the only remaining kernel races in this area are associated with non-filesystem capability code. Signed-off-by: Andrew G. Morgan <[email protected]> Acked-by: Serge Hallyn <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-07-04security: filesystem capabilities: fix fragile setuid fixup codeAndrew G. Morgan1-0/+21
This commit includes a bugfix for the fragile setuid fixup code in the case that filesystem capabilities are supported (in access()). The effect of this fix is gated on filesystem capability support because changing securebits is only supported when filesystem capabilities support is configured.) [[email protected]: coding-style fixes] Signed-off-by: Andrew G. Morgan <[email protected]> Acked-by: Serge Hallyn <[email protected]> Acked-by: David Howells <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-05-31capabilities: remain source compatible with 32-bit raw legacy capability ↵Andrew G. Morgan1-38/+73
support. Source code out there hard-codes a notion of what the _LINUX_CAPABILITY_VERSION #define means in terms of the semantics of the raw capability system calls capget() and capset(). Its unfortunate, but true. Since the confusing header file has been in a released kernel, there is software that is erroneously using 64-bit capabilities with the semantics of 32-bit compatibilities. These recently compiled programs may suffer corruption of their memory when sys_getcap() overwrites more memory than they are coded to expect, and the raising of added capabilities when using sys_capset(). As such, this patch does a number of things to clean up the situation for all. It 1. forces the _LINUX_CAPABILITY_VERSION define to always retain its legacy value. 2. adopts a new #define strategy for the kernel's internal implementation of the preferred magic. 3. deprecates v2 capability magic in favor of a new (v3) magic number. The functionality of v3 is entirely equivalent to v2, the only difference being that the v2 magic causes the kernel to log a "deprecated" warning so the admin can find applications that may be using v2 inappropriately. [User space code continues to be encouraged to use the libcap API which protects the application from details like this. libcap-2.10 is the first to support v3 capabilities.] Fixes issue reported in https://bugzilla.redhat.com/show_bug.cgi?id=447518. Thanks to Bojan Smojver for the report. [[email protected]: s/depreciate/deprecate/g] [[email protected]: be robust about put_user size] [[email protected]: coding-style fixes] Signed-off-by: Andrew G. Morgan <[email protected]> Cc: Serge E. Hallyn <[email protected]> Cc: Bojan Smojver <[email protected]> Cc: [email protected] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Chris Wright <[email protected]>
2008-02-05Add 64-bit capability support to the kernelAndrew Morgan1-9/+104
The patch supports legacy (32-bit) capability userspace, and where possible translates 32-bit capabilities to/from userspace and the VFS to 64-bit kernel space capabilities. If a capability set cannot be compressed into 32-bits for consumption by user space, the system call fails, with -ERANGE. FWIW libcap-2.00 supports this change (and earlier capability formats) http://www.kernel.org/pub/linux/libs/security/linux-privs/kernel-2.6/ [[email protected]: coding-syle fixes] [[email protected]: use get_task_comm()] [[email protected]: build fix] [[email protected]: do not initialise statics to 0 or NULL] [[email protected]: unused var] [[email protected]: export __cap_ symbols] Signed-off-by: Andrew G. Morgan <[email protected]> Cc: Stephen Smalley <[email protected]> Acked-by: Serge Hallyn <[email protected]> Cc: Chris Wright <[email protected]> Cc: James Morris <[email protected]> Cc: Casey Schaufler <[email protected]> Signed-off-by: Erez Zadok <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-19Uninline find_pid etc set of functionsPavel Emelyanov1-1/+1
The find_pid/_vpid/_pid_ns functions are used to find the struct pid by its id, depending on whic id - global or virtual - is used. The find_vpid() is a macro that pushes the current->nsproxy->pid_ns on the stack to call another function - find_pid_ns(). It turned out, that this dereference together with the push itself cause the kernel text size to grow too much. Move all these out-of-line. Together with the previous patch this saves a bit less that 400 bytes from .text section. Signed-off-by: Pavel Emelyanov <[email protected]> Cc: Sukadev Bhattiprolu <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Paul Menage <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-19Uninline find_task_by_xxx set of functionsPavel Emelyanov1-4/+2
The find_task_by_something is a set of macros are used to find task by pid depending on what kind of pid is proposed - global or virtual one. All of them are wrappers above the most generic one - find_task_by_pid_type_ns() - and just substitute some args for it. It turned out, that dereferencing the current->nsproxy->pid_ns construction and pushing one more argument on the stack inline cause kernel text size to grow. This patch moves all this stuff out-of-line into kernel/pid.c. Together with the next patch it saves a bit less than 400 bytes from the .text section. Signed-off-by: Pavel Emelyanov <[email protected]> Cc: Sukadev Bhattiprolu <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Paul Menage <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Acked-by: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-19pid namespaces: changes to show virtual ids to userPavel Emelyanov1-6/+8
This is the largest patch in the set. Make all (I hope) the places where the pid is shown to or get from user operate on the virtual pids. The idea is: - all in-kernel data structures must store either struct pid itself or the pid's global nr, obtained with pid_nr() call; - when seeking the task from kernel code with the stored id one should use find_task_by_pid() call that works with global pids; - when showing pid's numerical value to the user the virtual one should be used, but however when one shows task's pid outside this task's namespace the global one is to be used; - when getting the pid from userspace one need to consider this as the virtual one and use appropriate task/pid-searching functions. [[email protected]: build fix] [[email protected]: nuther build fix] [[email protected]: yet nuther build fix] [[email protected]: remove unneeded casts] Signed-off-by: Pavel Emelyanov <[email protected]> Signed-off-by: Alexey Dobriyan <[email protected]> Cc: Sukadev Bhattiprolu <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Paul Menage <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-19pid namespaces: define is_global_init() and is_container_init()Serge E. Hallyn1-1/+2
is_init() is an ambiguous name for the pid==1 check. Split it into is_global_init() and is_container_init(). A cgroup init has it's tsk->pid == 1. A global init also has it's tsk->pid == 1 and it's active pid namespace is the init_pid_ns. But rather than check the active pid namespace, compare the task structure with 'init_pid_ns.child_reaper', which is initialized during boot to the /sbin/init process and never changes. Changelog: 2.6.22-rc4-mm2-pidns1: - Use 'init_pid_ns.child_reaper' to determine if a given task is the global init (/sbin/init) process. This would improve performance and remove dependence on the task_pid(). 2.6.21-mm2-pidns2: - [Sukadev Bhattiprolu] Changed is_container_init() calls in {powerpc, ppc,avr32}/traps.c for the _exception() call to is_global_init(). This way, we kill only the cgroup if the cgroup's init has a bug rather than force a kernel panic. [[email protected]: fix comment] [[email protected]: Use is_global_init() in arch/m32r/mm/fault.c] [[email protected]: kernel/pid.c: remove unused exports] [[email protected]: Fix capability.c to work with threaded init] Signed-off-by: Serge E. Hallyn <[email protected]> Signed-off-by: Sukadev Bhattiprolu <[email protected]> Acked-by: Pavel Emelianov <[email protected]> Cc: Eric W. Biederman <[email protected]> Cc: Cedric Le Goater <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Herbert Poetzel <[email protected]> Cc: Kirill Korotaev <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-18whitespace fixes: capability syscallsDaniel Walker1-95/+95
Large chunks of 5 spaces instead of tabs. Signed-off-by: Daniel Walker <[email protected]> Cc: Chris Wright <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-18V3 file capabilities: alter behavior of cap_setpcapAndrew Morgan1-4/+1
The non-filesystem capability meaning of CAP_SETPCAP is that a process, p1, can change the capabilities of another process, p2. This is not the meaning that was intended for this capability at all, and this implementation came about purely because, without filesystem capabilities, there was no way to use capabilities without one process bestowing them on another. Since we now have a filesystem support for capabilities we can fix the implementation of CAP_SETPCAP. The most significant thing about this change is that, with it in effect, no process can set the capabilities of another process. The capabilities of a program are set via the capability convolution rules: pI(post-exec) = pI(pre-exec) pP(post-exec) = (X(aka cap_bset) & fP) | (pI(post-exec) & fI) pE(post-exec) = fE ? pP(post-exec) : 0 at exec() time. As such, the only influence the pre-exec() program can have on the post-exec() program's capabilities are through the pI capability set. The correct implementation for CAP_SETPCAP (and that enabled by this patch) is that it can be used to add extra pI capabilities to the current process - to be picked up by subsequent exec()s when the above convolution rules are applied. Here is how it works: Let's say we have a process, p. It has capability sets, pE, pP and pI. Generally, p, can change the value of its own pI to pI' where (pI' & ~pI) & ~pP = 0. That is, the only new things in pI' that were not present in pI need to be present in pP. The role of CAP_SETPCAP is basically to permit changes to pI beyond the above: if (pE & CAP_SETPCAP) { pI' = anything; /* ie., even (pI' & ~pI) & ~pP != 0 */ } This capability is useful for things like login, which (say, via pam_cap) might want to raise certain inheritable capabilities for use by the children of the logged-in user's shell, but those capabilities are not useful to or needed by the login program itself. One such use might be to limit who can run ping. You set the capabilities of the 'ping' program to be "= cap_net_raw+i", and then only shells that have (pI & CAP_NET_RAW) will be able to run it. Without CAP_SETPCAP implemented as described above, login(pam_cap) would have to also have (pP & CAP_NET_RAW) in order to raise this capability and pass it on through the inheritable set. Signed-off-by: Andrew Morgan <[email protected]> Signed-off-by: Serge E. Hallyn <[email protected]> Cc: Stephen Smalley <[email protected]> Cc: James Morris <[email protected]> Cc: Casey Schaufler <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-17security/ cleanupsAdrian Bunk1-4/+0
This patch contains the following cleanups that are now possible: - remove the unused security_operations->inode_xattr_getsuffix - remove the no longer used security_operations->unregister_security - remove some no longer required exit code - remove a bunch of no longer used exports Signed-off-by: Adrian Bunk <[email protected]> Acked-by: James Morris <[email protected]> Cc: Chris Wright <[email protected]> Cc: Stephen Smalley <[email protected]> Cc: Serge Hallyn <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-02-12[PATCH] pid: replace do/while_each_task_pid with do/while_each_pid_taskEric W. Biederman1-3/+5
There isn't any real advantage to this change except that it allows the old functions to be removed. Which is easier on maintenance and puts the code in a more uniform style. Signed-off-by: Eric W. Biederman <[email protected]> Cc: Alan Cox <[email protected]> Cc: Oleg Nesterov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-09-29[PATCH] pidspace: is_init()Sukadev Bhattiprolu1-1/+1
This is an updated version of Eric Biederman's is_init() patch. (http://lkml.org/lkml/2006/2/6/280). It applies cleanly to 2.6.18-rc3 and replaces a few more instances of ->pid == 1 with is_init(). Further, is_init() checks pid and thus removes dependency on Eric's other patches for now. Eric's original description: There are a lot of places in the kernel where we test for init because we give it special properties. Most significantly init must not die. This results in code all over the kernel test ->pid == 1. Introduce is_init to capture this case. With multiple pid spaces for all of the cases affected we are looking for only the first process on the system, not some other process that has pid == 1. Signed-off-by: Eric W. Biederman <[email protected]> Signed-off-by: Sukadev Bhattiprolu <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Serge Hallyn <[email protected]> Cc: Cedric Le Goater <[email protected]> Cc: <[email protected]> Acked-by: Paul Mackerras <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-07-03[PATCH] sched: cleanup, remove task_t, convert to struct task_structIngo Molnar1-4/+4
cleanup: remove task_t and convert all the uses to struct task_struct. I introduced it for the scheduler anno and it was a mistake. Conversion was mostly scripted, the result was reviewed and all secondary whitespace and style impact (if any) was fixed up by hand. Signed-off-by: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-03-25[PATCH] refactor capable() to one implementation, add __capable() helperChris Wright1-0/+16
Move capable() to kernel/capability.c and eliminate duplicate implementations. Add __capable() function which can be used to check for capabiilty of any process. Signed-off-by: Chris Wright <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-01-11[PATCH] move capable() to capability.hRandy.Dunlap1-0/+1
- Move capable() from sched.h to capability.h; - Use <linux/capability.h> where capable() is used (in include/, block/, ipc/, kernel/, a few drivers/, mm/, security/, & sound/; many more drivers/ to go) Signed-off-by: Randy Dunlap <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2005-07-27[PATCH] kernel/capability.c: add kerneldocRandy Dunlap1-3/+17
Add kerneldoc to kernel/capability.c Signed-off-by: Randy Dunlap <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2005-04-16Linux-2.6.12-rc2Linus Torvalds1-0/+220
Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!