aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2014-01-23Merge git://git.infradead.org/users/eparis/auditLinus Torvalds20-243/+404
Pull audit update from Eric Paris: "Again we stayed pretty well contained inside the audit system. Venturing out was fixing a couple of function prototypes which were inconsistent (didn't hurt anything, but we used the same value as an int, uint, u32, and I think even a long in a couple of places). We also made a couple of minor changes to when a couple of LSMs called the audit system. We hoped to add aarch64 audit support this go round, but it wasn't ready. I'm disappearing on vacation on Thursday. I should have internet access, but it'll be spotty. If anything goes wrong please be sure to cc [email protected]. He'll make fixing things his top priority" * git://git.infradead.org/users/eparis/audit: (50 commits) audit: whitespace fix in kernel-parameters.txt audit: fix location of __net_initdata for audit_net_ops audit: remove pr_info for every network namespace audit: Modify a set of system calls in audit class definitions audit: Convert int limit uses to u32 audit: Use more current logging style audit: Use hex_byte_pack_upper audit: correct a type mismatch in audit_syscall_exit() audit: reorder AUDIT_TTY_SET arguments audit: rework AUDIT_TTY_SET to only grab spin_lock once audit: remove needless switch in AUDIT_SET audit: use define's for audit version audit: documentation of audit= kernel parameter audit: wait_for_auditd rework for readability audit: update MAINTAINERS audit: log task info on feature change audit: fix incorrect set of audit_sock audit: print error message when fail to create audit socket audit: fix dangling keywords in audit_log_set_loginuid() output audit: log on errors from filter user rules ...
2014-01-23lib/decompress_unlz4.c: always set an error return code on failuresJan Beulich1-0/+1
"ret", being set to -1 early on, gets cleared by the first invocation of lz4_decompress()/lz4_decompress_unknownoutputsize(), and hence subsequent failures wouldn't be noticed by the caller without setting it back to -1 right after those calls. Reported-by: Matthew Daley <[email protected]> Signed-off-by: Jan Beulich <[email protected]> Cc: Kyungsik Lee <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23romfs: fix returm err while getting inode in fill_superRui Xiang1-4/+2
Getting an inode by romfs_iget may lead to an err in fill_super, and the err value should be return. And it should return -ENOMEM instead while d_make_root fails, fix it too. Signed-off-by: Rui Xiang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23drivers/w1/masters/w1-gpio.c: add strong pullup emulationEvgeny Boger3-12/+23
Strong pullup is emulated by driving pin logic high after write command when using tri-state push-pull GPIO. Signed-off-by: Evgeny Boger <[email protected]> Cc: Greg KH <[email protected]> Acked-by: David Fries <[email protected]> Acked-by: Evgeniy Polyakov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23drivers/memstick/host/rtsx_pci_ms.c: fix ms card data transfer bugMicky Ching1-10/+20
This patch is used to add support for ms card. The main difference between ms card and mspro card is long data transfer mode. mspro card can use auto mode DMA for long data transfer, but ms can not use this mode, it should use normal mode DMA. The memstick core added support for ms card, but the original driver will make ms card fail at initialization, because it uses auto mode DMA. This patch makes the ms card work properly. Signed-off-by: Micky Ching <[email protected]> Cc: Maxim Levitsky <[email protected]> Cc: Samuel Ortiz <[email protected]> Cc: Alex Dubov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23userns: relax the posix_acl_valid() checksAndreas Gruenbacher1-10/+0
So far, POSIX ACLs are using a canonical representation that keeps all ACL entries in a strict order; the ACL_USER and ACL_GROUP entries for specific users and groups are ordered by user and group identifier, respectively. The user-space code provides ACL entries in this order; the kernel verifies that the ACL entry order is correct in posix_acl_valid(). User namespaces allow to arbitrary map user and group identifiers which can cause the ACL_USER and ACL_GROUP entry order to differ between user space and the kernel; posix_acl_valid() would then fail. Work around this by allowing ACL_USER and ACL_GROUP entries to be in any order in the kernel. The effect is only minor: file permission checks will pick the first matching ACL_USER entry, and check all matching ACL_GROUP entries. (The libacl user-space library and getfacl / setfacl tools will not create ACLs with duplicate user or group idenfifiers; they will handle ACLs with entries in an arbitrary order correctly.) Signed-off-by: Andreas Gruenbacher <[email protected]> Cc: Eric W. Biederman <[email protected]> Cc: Theodore Tso <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Andreas Dilger <[email protected]> Cc: Jan Kara <[email protected]> Cc: Al Viro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23arch/sh/kernel/dwarf.c: use rbtree postorder iteration helper instead of ↵Cody P Schafer1-14/+4
solution using repeated rb_erase() Use rbtree_postorder_for_each_entry_safe() to destroy the rbtree instead of using repeated rb_erase() calls Signed-off-by: Cody P Schafer <[email protected]> Cc: Michel Lespinasse <[email protected]> Cc: Jan Kara <[email protected]> Cc: Paul Mundt <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23fs-ext3-use-rbtree-postorder-iteration-helper-instead-of-opencoding-fixAndrew Morton1-4/+4
use do{}while - more efficient and it squishes a coccinelle warning Reported-by: Fengguang Wu <[email protected]> Cc: Cody P Schafer <[email protected]> Cc: Jan Kara <[email protected]> Cc: Michel Lespinasse <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23fs/ext3: use rbtree postorder iteration helper instead of opencodingCody P Schafer1-31/+5
Use rbtree_postorder_for_each_entry_safe() to destroy the rbtree instead of opencoding an alternate postorder iteration that modifies the tree Signed-off-by: Cody P Schafer <[email protected]> Cc: Michel Lespinasse <[email protected]> Cc: Jan Kara <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23fs/jffs2: use rbtree postorder iteration helper instead of opencodingCody P Schafer2-49/+5
Use rbtree_postorder_for_each_entry_safe() to destroy the rbtree instead of opencoding an alternate postorder iteration that modifies the tree Signed-off-by: Cody P Schafer <[email protected]> Cc: Michel Lespinasse <[email protected]> Cc: Jan Kara <[email protected]> Cc: David Woodhouse <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23fs/ext4: use rbtree postorder iteration helper instead of opencodingCody P Schafer2-59/+9
Use rbtree_postorder_for_each_entry_safe() to destroy the rbtree instead of opencoding an alternate postorder iteration that modifies the tree Signed-off-by: Cody P Schafer <[email protected]> Reviewed-by: Jan Kara <[email protected]> Cc: Michel Lespinasse <[email protected]> Cc: Theodore Ts'o <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23fs/ubifs: use rbtree postorder iteration helper instead of opencodingCody P Schafer6-114/+17
Use rbtree_postorder_for_each_entry_safe() to destroy the rbtree instead of opencoding an alternate postorder iteration that modifies the tree Signed-off-by: Cody P Schafer <[email protected]> Cc: Michel Lespinasse <[email protected]> Cc: Jan Kara <[email protected]> Cc: Artem Bityutskiy <[email protected]> Cc: Adrian Hunter <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23net/netfilter/ipset/ip_set_hash_netiface.c: use rbtree postorder iteration ↵Cody P Schafer1-23/+4
instead of opencoding Use rbtree_postorder_for_each_entry_safe() to destroy the rbtree instead of opencoding an alternate postorder iteration that modifies the tree Signed-off-by: Cody P Schafer <[email protected]> Cc: Michel Lespinasse <[email protected]> Cc: Jan Kara <[email protected]> Cc: Pablo Neira Ayuso <[email protected]> Cc: Patrick McHardy <[email protected]> Cc: Jozsef Kadlecsik <[email protected]> Cc: "David S. Miller" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23rbtree/test: test rbtree_postorder_for_each_entry_safe()Cody P Schafer1-0/+11
Signed-off-by: Cody P Schafer <[email protected]> Cc: Michel Lespinasse <[email protected]> Cc: Jan Kara <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23rbtree/test: move rb_node to the middle of the test structCody P Schafer1-1/+1
Avoid making the rb_node the first entry to catch some bugs around NULL checking the rb_node. Signed-off-by: Cody P Schafer <[email protected]> Cc: Michel Lespinasse <[email protected]> Cc: Jan Kara <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23rapidio: add modular rapidio core build into powerpc and mips branchesAlexandre Bounine2-3/+3
Allow modular build option for RapidIO subsystem core in MIPS and PowerPC architectural branches. At this moment modular RapidIO subsystem build is enabled only for platforms that use PCI/PCIe based RapidIO controllers (e.g. Tsi721). Signed-off-by: Alexandre Bounine <[email protected]> Cc: Matt Porter <[email protected]> Cc: Jean Delvare <[email protected]> Cc: Ralf Baechle <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Li Yang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23partitions/efi: complete documentation of gpt kernel param purposeDavidlohr Bueso1-1/+3
The usage of the 'gpt' kernel parameter is twofold: (i) skip any mbr integrity checks and (ii) enable the backup GPT header to be used in situations where the primary one is corrupted. This last "feature" is not obvious and needs to be properly documented in the kernel-parameters document. Addresses https://bugzilla.kernel.org/show_bug.cgi?id=63591 Signed-off-by: Davidlohr Bueso <[email protected]> Cc: Matt Domsch <[email protected]> Cc: Matt Fleming <[email protected]> Cc: "Chandramouleeswaran,Aswin" <[email protected]> Cc: Chris Murphy <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23kdump: add /sys/kernel/vmcoreinfo ABI documentationVivek Goyal1-0/+14
/sys/kernel/vmcoreinfo was introduced long back but there is no ABI documentation. This patch adds the documentation. Signed-off-by: Vivek Goyal <[email protected]> Cc: Ken'ichi Ohmichi <[email protected]> Cc: Dan Aloni <[email protected]> Cc: Greg KH <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23kdump: fix exported size of vmcoreinfo noteVivek Goyal1-1/+1
Right now we seem to be exporting the max data size contained inside vmcoreinfo note. But this does not include the size of meta data around vmcore info data. Like name of the note and starting and ending elf_note. I think user space expects total size and that size is put in PT_NOTE elf header. Things seem to be fine so far because we are not using vmcoreinfo note to the maximum capacity. But as it starts filling up, to capacity, at some point of time, problem will be visible. I don't think user space will be broken with this change. So there is no need to introduce vmcoreinfo2. This change is safe and backward compatible. More explanation on why this change is safe is below. vmcoreinfo contains information about kernel which user space needs to know to do things like filtering. For example, various kernel config options or information about size or offset of some data structures etc. All this information is commmunicated to user space with an ELF note present in ELF /proc/vmcore file. Currently vmcoreinfo data size is 4096. With some elf note meta data around it, actual size is 4132 bytes. But we are using barely 25% of that size. Rest is empty. So even if we tell user space that size of ELf note is 4096 and not 4132, nothing will be broken becase after around 1000 bytes, everything is zero anyway. But once we start filling up the note to the capacity, and not report the full size of note, bad things will start happening. Either some data will be lost or tools will be confused that they did not fine the zero note at the end. So I think this change is safe and should not break existing tools. Signed-off-by: Vivek Goyal <[email protected]> Cc: Ken'ichi Ohmichi <[email protected]> Cc: Dan Aloni <[email protected]> Cc: Greg KH <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23kexec: add sysctl to disable kexec_loadKees Cook4-2/+30
For general-purpose (i.e. distro) kernel builds it makes sense to build with CONFIG_KEXEC to allow end users to choose what kind of things they want to do with kexec. However, in the face of trying to lock down a system with such a kernel, there needs to be a way to disable kexec_load (much like module loading can be disabled). Without this, it is too easy for the root user to modify kernel memory even when CONFIG_STRICT_DEVMEM and modules_disabled are set. With this change, it is still possible to load an image for use later, then disable kexec_load so the image (or lack of image) can't be altered. The intention is for using this in environments where "perfect" enforcement is hard. Without a verified boot, along with verified modules, and along with verified kexec, this is trying to give a system a better chance to defend itself (or at least grow the window of discoverability) against attack in the face of a privilege escalation. In my mind, I consider several boot scenarios: 1) Verified boot of read-only verified root fs loading fd-based verification of kexec images. 2) Secure boot of writable root fs loading signed kexec images. 3) Regular boot loading kexec (e.g. kcrash) image early and locking it. 4) Regular boot with no control of kexec image at all. 1 and 2 don't exist yet, but will soon once the verified kexec series has landed. 4 is the state of things now. The gap between 2 and 4 is too large, so this change creates scenario 3, a middle-ground above 4 when 2 and 1 are not possible for a system. Signed-off-by: Kees Cook <[email protected]> Acked-by: Rik van Riel <[email protected]> Cc: Vivek Goyal <[email protected]> Cc: Eric Biederman <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23fs/exec.c: call arch_pick_mmap_layout() only onceRichard Weinberger1-1/+0
Currently both setup_new_exec() and flush_old_exec() issue a call to arch_pick_mmap_layout(). As setup_new_exec() and flush_old_exec() are always called pairwise arch_pick_mmap_layout() is called twice. This patch removes one call from setup_new_exec() to have it only called once. Signed-off-by: Richard Weinberger <[email protected]> Tested-by: Pat Erley <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23exec: avoid propagating PF_NO_SETAFFINITY into userspace childZhang Yi1-2/+2
Userspace process doesn't want the PF_NO_SETAFFINITY, but its parent may be a kernel worker thread which has PF_NO_SETAFFINITY set, and this worker thread can do kernel_thread() to create the child. Clearing this flag in usersapce child to enable its migrating capability. Signed-off-by: Zhang Yi <[email protected]> Acked-by: Oleg Nesterov <[email protected]> Cc: Tejun Heo <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23kernel/signal.c: change do_signal_stop/do_sigaction to use while_each_thread()Oleg Nesterov1-4/+3
Change do_signal_stop() and do_sigaction() to avoid next_thread() and use while_each_thread() instead. Signed-off-by: Oleg Nesterov <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Cc: Al Viro <[email protected]> Cc: Kees Cook <[email protected]> Reviewed-by: Sameer Nanda <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23kernel/sys.c: k_getrusage() can use while_each_thread()Oleg Nesterov1-2/+1
Change k_getrusage() to use while_each_thread(), no changes in the compiled code. Signed-off-by: Oleg Nesterov <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Cc: Al Viro <[email protected]> Cc: Kees Cook <[email protected]> Reviewed-by: Sameer Nanda <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23fs/proc/array.c: change do_task_stat() to use while_each_thread()Oleg Nesterov1-2/+1
Change the remaining next_thread (ab)users to use while_each_thread(). The last user which should be changed is next_tid(), but we can't do this now. __exit_signal() and complete_signal() are fine, they actually need next_thread() logic. This patch (of 3): do_task_stat() can use while_each_thread(), no changes in the compiled code. Signed-off-by: Oleg Nesterov <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Cc: Al Viro <[email protected]> Cc: Kees Cook <[email protected]> Reviewed-by: Sameer Nanda <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23exec: kill task_struct->did_execOleg Nesterov4-6/+2
We can kill either task->did_exec or PF_FORKNOEXEC, they are mutually exclusive. The patch kills ->did_exec because it has a single user. Signed-off-by: Oleg Nesterov <[email protected]> Acked-by: KOSAKI Motohiro <[email protected]> Cc: Al Viro <[email protected]> Cc: Kees Cook <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23exec: move the final allow_write_access/fput into free_bprm()Oleg Nesterov1-15/+5
Both success/failure paths cleanup bprm->file, we can move this code into free_bprm() to simlify and cleanup this logic. Signed-off-by: Oleg Nesterov <[email protected]> Acked-by: KOSAKI Motohiro <[email protected]> Cc: Al Viro <[email protected]> Acked-by: Kees Cook <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23exec:check_unsafe_exec: kill the dead -EAGAIN and clear_in_exec logicOleg Nesterov1-21/+8
fs_struct->in_exec == T means that this ->fs is used by a single process (thread group), and one of the treads does do_execve(). To avoid the mt-exec races this code has the following complications: 1. check_unsafe_exec() returns -EBUSY if ->in_exec was already set by another thread. 2. do_execve_common() records "clear_in_exec" to ensure that the error path can only clear ->in_exec if it was set by current. However, after 9b1bf12d5d51 "signals: move cred_guard_mutex from task_struct to signal_struct" we do not need these complications: 1. We can't race with our sub-thread, this is called under per-process ->cred_guard_mutex. And we can't race with another CLONE_FS task, we already checked that this fs is not shared. We can remove the dead -EAGAIN logic. 2. "out_unmark:" in do_execve_common() is either called under ->cred_guard_mutex, or after de_thread() which kills other threads, so we can't race with sub-thread which could set ->in_exec. And if ->fs is shared with another process ->in_exec should be false anyway. We can clear in_exec unconditionally. This also means that check_unsafe_exec() can be void. Signed-off-by: Oleg Nesterov <[email protected]> Acked-by: KOSAKI Motohiro <[email protected]> Cc: Al Viro <[email protected]> Cc: Kees Cook <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23exec:check_unsafe_exec: use while_each_thread() rather than next_thread()Oleg Nesterov1-1/+2
next_thread() should be avoided, change check_unsafe_exec() to use while_each_thread(). Nobody except signal->curr_target actually needs next_thread-like code, and we need to change (fix) this interface. This particular code is fine, p == current. But in general the code like this can loop forever if p exits and next_thread(t) can't reach the unhashed thread. This also saves 32 bytes. Signed-off-by: Oleg Nesterov <[email protected]> Acked-by: KOSAKI Motohiro <[email protected]> Cc: Al Viro <[email protected]> Cc: Kees Cook <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23kernel/fork.c: remove redundant NULL check in dup_mm()Daeseok Youn1-3/+0
current->mm doesn't need a NULL check in dup_mm(). Becasue dup_mm() is used only in copy_mm() and current->mm is checked whether it is NULL or not in copy_mm() before calling dup_mm(). Signed-off-by: Daeseok Youn <[email protected]> Acked-by: Oleg Nesterov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23kernel/fork.c: fix coding style issuesDaeseok Youn1-2/+2
Fix errors reported by checkpatch.pl. One error is parentheses, the other is a whitespace issue. Signed-off-by: Daeseok Youn <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23kernel/fork.c: make dup_mm() staticDaeSeok Youn2-3/+1
dup_mm() is used only in kernel/fork.c Signed-off-by: Daeseok Youn <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23fs/proc: don't use module_init for non-modular core codePaul Gortmaker16-16/+16
PROC_FS is a bool, so this code is either present or absent. It will never be modular, so using module_init as an alias for __initcall is rather misleading. Fix this up now, so that we can relocate module_init from init.h into module.h in the future. If we don't do this, we'd have to add module.h to obviously non-modular code, and that would be ugly at best. Note that direct use of __initcall is discouraged, vs. one of the priority categorized subgroups. As __initcall gets mapped onto device_initcall, our use of fs_initcall (which makes sense for fs code) will thus change these registrations from level 6-device to level 5-fs (i.e. slightly earlier). However no observable impact of that small difference has been observed during testing, or is expected. Also note that this change uncovers a missing semicolon bug in the registration of vmcore_init as an initcall. Signed-off-by: Paul Gortmaker <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23fs/proc_namespace.c: simplify testing nsp and nsp->mnt_nsAxel Lin1-6/+1
Trivial cleanup to eliminate a goto. Signed-off-by: Axel Lin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23fs/proc/proc_devtree.c: remove empty /proc/device-tree when no openfirmware ↵Dave Jones1-0/+1
exists. Distribution kernels might want to build in support for /proc/device-tree for kernels that might end up running on hardware that doesn't support openfirmware. This results in an empty /proc/device-tree existing. Remove it if the OFW root node doesn't exist. This situation actually confuses grub2, resulting in install failures. grub2 sees the /proc/device-tree and picks the wrong install target cf. http://bzr.savannah.gnu.org/lh/grub/trunk/grub/annotate/4300/util/grub-install.in#L311 grub should be more robust, but still, leaving an empty proc dir seems pointless. Addresses https://bugzilla.redhat.com/show_bug.cgi?id=818378. Signed-off-by: Dave Jones <[email protected]> Cc: Al Viro <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Josh Boyer <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23proc: set attributes of pde using accessor functionsRui Xiang2-4/+3
Use existing accessors proc_set_user() and proc_set_size() to set attributes. Just a cleanup. Signed-off-by: Rui Xiang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23proc: fix ->f_pos overflows in first_tid()Oleg Nesterov1-5/+9
1. proc_task_readdir()->first_tid() path truncates f_pos to int, this is wrong even on 64bit. We could check that f_pos < PID_MAX or even INT_MAX in proc_task_readdir(), but this patch simply checks the potential overflow in first_tid(), this check is nop on 64bit. We do not care if it was negative and the new unsigned value is huge, all we need to ensure is that we never wrongly return !NULL. 2. Remove the 2nd "nr != 0" check before get_nr_threads(), nr_threads == 0 is not distinguishable from !pid_task() above. Signed-off-by: Oleg Nesterov <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Sameer Nanda <[email protected]> Cc: Sergey Dyasly <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23proc: don't (ab)use ->group_leader in proc_task_readdir() pathsOleg Nesterov1-28/+24
proc_task_readdir() does not really need "leader", first_tid() has to revalidate it anyway. Just pass proc_pid(inode) to first_tid() instead, it can do pid_task(PIDTYPE_PID) itself and read ->group_leader only if necessary. The patch also extracts the "inode is dead" code from pid_delete_dentry(dentry) into the new trivial helper, proc_inode_is_dead(inode), proc_task_readdir() uses it to return -ENOENT if this dir was removed. This is a bit racy, but the race is very inlikely and the getdents() after openndir() can see the empty "." + ".." dir only once. Signed-off-by: Oleg Nesterov <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Sameer Nanda <[email protected]> Cc: Sergey Dyasly <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23proc: change first_tid() to use while_each_thread() rather than next_thread()Oleg Nesterov1-10/+10
Rerwrite the main loop to use while_each_thread() instead of next_thread(). We are going to fix or replace while_each_thread(), next_thread() should be avoided whenever possible. Signed-off-by: Oleg Nesterov <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Sameer Nanda <[email protected]> Cc: Sergey Dyasly <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23proc: fix the potential use-after-free in first_tid()Oleg Nesterov1-0/+3
proc_task_readdir() verifies that the result of get_proc_task() is pid_alive() and thus its ->group_leader is fine too. However this is not necessarily true after rcu_read_unlock(), we need to recheck this again after first_tid() does rcu_read_lock(). Otherwise leader->thread_group.next (used by next_thread()) can be invalid if the rcu grace period expires in between. The race is subtle and unlikely, but still it is possible afaics. To simplify lets ignore the "likely" case when tid != 0, f_version can be cleared by proc_task_operations->llseek(). Suppose we have a main thread M and its subthread T. Suppose that f_pos == 3, iow first_tid() should return T. Now suppose that the following happens between rcu_read_unlock() and rcu_read_lock(): 1. T execs and becomes the new leader. This removes M from ->thread_group but next_thread(M) is still T. 2. T creates another thread X which does exec as well, T goes away. 3. X creates another subthread, this increments nr_threads. 4. first_tid() does next_thread(M) and returns the already dead T. Note also that we need 2. and 3. only because of get_nr_threads() check, and this check was supposed to be optimization only. Signed-off-by: Oleg Nesterov <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Sameer Nanda <[email protected]> Cc: Sergey Dyasly <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23proc: cleanup/simplify get_task_state/task_state_arrayOleg Nesterov2-13/+4
get_task_state() and task_state_array[] look confusing and suboptimal, it is not clear what it can actually report to user-space and task_state_array[] blows .data for no reason. 1. state = (tsk->state & TASK_REPORT) | tsk->exit_state is not clear. TASK_REPORT is self-documenting but it is not clear what ->exit_state can add. Move the potential exit_state's (EXIT_ZOMBIE and EXIT_DEAD) into TASK_REPORT and use it to calculate the final result. 2. With the change above it is obvious that task_state_array[] has the unused entries just to make BUILD_BUG_ON() happy. Change this BUILD_BUG_ON() to use TASK_REPORT rather than TASK_STATE_MAX and shrink task_state_array[]. 3. Turn the "while (state)" loop into fls(state). Signed-off-by: Oleg Nesterov <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: David Laight <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Tejun Heo <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23coredump: make __get_dumpable/get_dumpable inline, kill fs/coredump.hOleg Nesterov4-29/+17
1. Remove fs/coredump.h. It is not clear why do we need it, it only declares __get_dumpable(), signal.c includes it for no reason. 2. Now that get_dumpable() and __get_dumpable() are really trivial make them inline in linux/sched.h. Signed-off-by: Oleg Nesterov <[email protected]> Acked-by: Kees Cook <[email protected]> Cc: Alex Kelly <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Josh Triplett <[email protected]> Cc: Petr Matousek <[email protected]> Cc: Vasily Kulikov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23coredump: kill MMF_DUMPABLE and MMF_DUMP_SECURELYOleg Nesterov2-18/+7
Nobody actually needs MMF_DUMPABLE/MMF_DUMP_SECURELY, they are only used to enforce the encoding of SUID_DUMP_* enum in mm->flags & MMF_DUMPABLE_MASK. Now that set_dumpable() updates both bits atomically we can kill them and simply store the value "as is" in 2 lower bits. Signed-off-by: Oleg Nesterov <[email protected]> Acked-by: Kees Cook <[email protected]> Cc: Alex Kelly <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Josh Triplett <[email protected]> Cc: Petr Matousek <[email protected]> Cc: Vasily Kulikov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23coredump: set_dumpable: fix the theoretical race with itselfOleg Nesterov1-34/+15
set_dumpable() updates MMF_DUMPABLE_MASK in a non-trivial way to ensure that get_dumpable() can't observe the intermediate state, but this all can't help if multiple threads call set_dumpable() at the same time. And in theory commit_creds()->set_dumpable(SUID_DUMP_ROOT) racing with sys_prctl()->set_dumpable(SUID_DUMP_DISABLE) can result in SUID_DUMP_USER. Change this code to update both bits atomically via cmpxchg(). Note: this assumes that it is safe to mix bitops and cmpxchg. IOW, if, say, an architecture implements cmpxchg() using the locking (like arch/parisc/lib/bitops.c does), then it should use the same locks for set_bit/etc. Signed-off-by: Oleg Nesterov <[email protected]> Acked-by: Kees Cook <[email protected]> Cc: Alex Kelly <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Josh Triplett <[email protected]> Cc: Petr Matousek <[email protected]> Cc: Vasily Kulikov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23Documentation/cpu-hotplug.txt: fix a typo in example codeSangjung Woo1-1/+1
As the notifier_block name (i.e. foobar_cpu_notifer) is different from the parameter (i.e.foobar_cpu_notifier) of register function, that is definitely error and it also makes readers confused. Signed-off-by: Sangjung Woo <[email protected]> Reviewed-by: Srivatsa S. Bhat <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23Kconfig: update flightly outdated CONFIG_SMP documentationRobert Graffham11-40/+40
Remove an outdated reference to "most personal computers" having only one CPU, and change the use of "singleprocessor" and "single processor" in CONFIG_SMP's documentation to "uniprocessor" across all arches where that documentation is present. Signed-off-by: Robert Graffham <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23Documentation/filesystems/00-INDEX: updatesFabian Frederick1-13/+45
Add the following documentation-files with description : -autofs4-mount-control.txt -btrfs.txt -debugfs.txt -devpts.txt -fiemap.txt -gfs2-glocks.txt -gfs2-uevents.txt -omfs.txt -path-lookup.txt -qnx6.txt -quota.txt -squashfs.txt -sysfs-tagging.txt -ubifs.txt -xfs-delayed-logging-design.txt -xfs-self-describing-metadata.txt Add the following documentation directories with description : -caching -cifs (replacing cifs.txt) -pohmelfs Remove the following documentation-files reference: -dentry-locking.txt -reiser4.txt Signed-off-by: Fabian Frederick <[email protected]> Cc: Randy Dunlap <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23Documentation/blockdev/ramdisk.txt: updatesFabian Frederick1-6/+15
- ramdisk_blocksize doesn't exist anymore - Module parameters added to documentation Signed-off-by: Fabian Frederick <[email protected]> Acked-by: Randy Dunlap <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23Documentation/filesystems/sysfs.txt: fix device_attribute declarationAndre Richter1-3/+3
Fix a wrong device_attribute declaration example. Signed-off-by: Andre Richter <[email protected]> Cc: Greg KH <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-23hfsplus: remove hfsplus_file_lookup()Sougata Santra1-59/+0
HFS+ resource fork lookup breaks opendir() library function. Since opendir first calls open() with O_DIRECTORY flag set. O_DIRECTORY means "refuse to open if not a directory". The open system call in the kernel does a check for inode->i_op->lookup and returns -ENOTDIR. So if hfsplus_file_lookup is set it allows opendir() for plain files. Also resource fork lookup in HFS+ does not work. Since it is never invoked after VFS permission checking. It will always return with -EACCES. When we call opendir() on a file, it does not return NULL. opendir() library call is based on open with O_DIRECTORY flag passed and then layered on top of getdents() system call. O_DIRECTORY means "refuse to open if not a directory". The open() system call in the kernel does a check for: do_sys_open() -->..--> can_lookup() i.e it only checks inode->i_op->lookup and returns ENOTDIR if this function pointer is not set. In OSX, we can open "file/rsrc" to get the resource fork of "file". This behavior is emulated inside hfsplus on Linux, which means that to some degree every file acts like a directory. That is the reason lookup() inode operations is supported for files, and it is possible to do a lookup on this specific name. As a result of this open succeeds without returning ENOTDIR for HFS+ Please see the LKML discussion thread on this issue: http://marc.info/?l=linux-fsdevel&m=122823343730412&w=2 I tried to test file/rsrc lookup in HFS+ driver and the feature does not work. From OSX: $ touch test $ echo "1234" > test/..namedfork/rsrc $ ls -l test..namedfork/rsrc --rw-r--r-- 1 tuxera staff 5 10 dec 12:59 test/..namedfork/rsrc [sougata@ultrabook tmp]$ id uid=1000(sougata) gid=1000(sougata) groups=1000(sougata),5(tty),18(dialout),1001(vboxusers) [sougata@ultrabook tmp]$ mount /dev/sdb1 on /mnt/tmp type hfsplus (rw,relatime,umask=0,uid=1000,gid=1000,nls=utf8) [sougata@ultrabook tmp]$ ls -l test/rsrc ls: cannot access test/rsrc: Permission denied According to this LKML thread it is expected behavior. http://marc.info/?t=121139033800008&r=1&w=4 I guess now that permission checking happens in vfs generic_permission() ? So it turns out that even though the lookup() inode_operation exists for HFS+ files. It cannot really get invoked ?. So if we can disable this feature to make opendir() work for HFS+. Signed-off-by: Sougata Santra <[email protected]> Acked-by: Christoph Hellwig <[email protected]> Cc: Vyacheslav Dubeyko <[email protected]> Cc: Anton Altaparmakov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>