aboutsummaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorRan Xiaokai <[email protected]>2021-07-28 00:26:29 -0700
committerChristian Brauner <[email protected]>2021-08-12 14:54:25 +0200
commit2863643fb8b92291a7e97ba46e342f1163595fa8 (patch)
tree9c335d15374b68494216167128695634c0fa01d6
parent36a21d51725af2ce0700c6ebcb6b9594aac658a6 (diff)
set_user: add capability check when rlimit(RLIMIT_NPROC) exceeds
in copy_process(): non root users but with capability CAP_SYS_RESOURCE or CAP_SYS_ADMIN will clean PF_NPROC_EXCEEDED flag even rlimit(RLIMIT_NPROC) exceeds. Add the same capability check logic here. Align the permission checks in copy_process() and set_user(). In copy_process() CAP_SYS_RESOURCE or CAP_SYS_ADMIN capable users will be able to circumvent and clear the PF_NPROC_EXCEEDED flag whereas they aren't able to the same in set_user(). There's no obvious logic to this and trying to unearth the reason in the thread didn't go anywhere. The gist seems to be that this code wants to make sure that a program can't successfully exec if it has gone through a set*id() transition while exceeding its RLIMIT_NPROC. A capable but non-INIT_USER caller getting PF_NPROC_EXCEEDED set during a set*id() transition wouldn't be able to exec right away if they still exceed their RLIMIT_NPROC at the time of exec. So their exec would fail in fs/exec.c: if ((current->flags & PF_NPROC_EXCEEDED) && is_ucounts_overlimit(current_ucounts(), UCOUNT_RLIMIT_NPROC, rlimit(RLIMIT_NPROC))) { retval = -EAGAIN; goto out_ret; } However, if the caller were to fork() right after the set*id() transition but before the exec while still exceeding their RLIMIT_NPROC then they would get PF_NPROC_EXCEEDED cleared (while the child would inherit it): retval = -EAGAIN; if (is_ucounts_overlimit(task_ucounts(p), UCOUNT_RLIMIT_NPROC, rlimit(RLIMIT_NPROC))) { if (p->real_cred->user != INIT_USER && !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN)) goto bad_fork_free; } current->flags &= ~PF_NPROC_EXCEEDED; which means a subsequent exec by the capable caller would now succeed even though they could still exceed their RLIMIT_NPROC limit. This seems inconsistent. Allow a CAP_SYS_ADMIN or CAP_SYS_RESOURCE capable user to avoid PF_NPROC_EXCEEDED as they already can in copy_process(). Cc: [email protected], [email protected], [email protected], Ran Xiaokai <[email protected]>, , , Link: https://lore.kernel.org/r/[email protected] Cc: Neil Brown <[email protected]> Cc: Kees Cook <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: James Morris <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Signed-off-by: Ran Xiaokai <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
-rw-r--r--kernel/sys.c3
1 files changed, 2 insertions, 1 deletions
diff --git a/kernel/sys.c b/kernel/sys.c
index ef1a78f5d71c..72c7639e3c98 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -480,7 +480,8 @@ static int set_user(struct cred *new)
* failure to the execve() stage.
*/
if (is_ucounts_overlimit(new->ucounts, UCOUNT_RLIMIT_NPROC, rlimit(RLIMIT_NPROC)) &&
- new_user != INIT_USER)
+ new_user != INIT_USER &&
+ !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN))
current->flags |= PF_NPROC_EXCEEDED;
else
current->flags &= ~PF_NPROC_EXCEEDED;