aboutsummaryrefslogtreecommitdiff
path: root/fs/proc
AgeCommit message (Collapse)AuthorFilesLines
2011-03-31Fix common misspellingsLucas De Marchi1-1/+1
Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <[email protected]>
2011-03-27proc: fix oops on invalid /proc/<pid>/maps accessLinus Torvalds1-1/+2
When m_start returns an error, the seq_file logic will still call m_stop with that error entry, so we'd better make sure that we check it before using it as a vma. Introduced by commit ec6fd8a4355c ("report errors in /proc/*/*map* sanely"), which replaced NULL with various ERR_PTR() cases. (On ia64, you happen to get a unaligned fault instead of a page fault, since the address used is generally some random error code like -EPERM) Reported-by: Anca Emanuel <[email protected]> Reported-by: Tony Luck <[email protected]> Cc: Al Viro <[email protected]> Cc: Américo Wang <[email protected]> Cc: Stephen Wilson <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-03-23Merge branch 'for-linus' of ↵Linus Torvalds3-78/+132
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: deal with races in /proc/*/{syscall,stack,personality} proc: enable writing to /proc/pid/mem proc: make check_mem_permission() return an mm_struct on success proc: hold cred_guard_mutex in check_mem_permission() proc: disable mem_write after exec mm: implement access_remote_vm mm: factor out main logic of access_process_vm mm: use mm_struct to resolve gate vma's in __get_user_pages mm: arch: rename in_gate_area_no_task to in_gate_area_no_mm mm: arch: make in_gate_area take an mm_struct instead of a task_struct mm: arch: make get_gate_vma take an mm_struct instead of a task_struct x86: mark associated mm when running a task in 32 bit compatibility mode x86: add context tag to mark mm when running a task in 32-bit compatibility mode auxv: require the target to be tracable (or yourself) close race in /proc/*/environ report errors in /proc/*/*map* sanely pagemap: close races with suid execve make sessionid permissions in /proc/*/task/* match those in /proc/* fix leaks in path_lookupat() Fix up trivial conflicts in fs/proc/base.c
2011-03-23procfs: kill the global proc_mnt variableOleg Nesterov3-6/+4
After the previous cleanup in proc_get_sb() the global proc_mnt has no reasons to exists, kill it. Signed-off-by: Oleg Nesterov <[email protected]> Signed-off-by: Eric W. Biederman <[email protected]> Signed-off-by: Daniel Lezcano <[email protected]> Cc: Alexey Dobriyan <[email protected]> Acked-by: Serge E. Hallyn <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-03-23pidns: call pid_ns_prepare_proc() from create_pid_namespace()Eric W. Biederman1-18/+7
Reorganize proc_get_sb() so it can be called before the struct pid of the first process is allocated. Signed-off-by: Eric W. Biederman <[email protected]> Signed-off-by: Daniel Lezcano <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Alexey Dobriyan <[email protected]> Acked-by: Serge E. Hallyn <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-03-23proc: protect mm start_code/end_code in /proc/pid/statKees Cook1-2/+2
While mm->start_stack was protected from cross-uid viewing (commit f83ce3e6b02d5 ("proc: avoid information leaks to non-privileged processes")), the start_code and end_code values were not. This would allow the text location of a PIE binary to leak, defeating ASLR. Note that the value "1" is used instead of "0" for a protected value since "ps", "killall", and likely other readers of /proc/pid/stat, take start_code of "0" to mean a kernel thread and will misbehave. Thanks to Brad Spengler for pointing this out. Addresses CVE-2011-0726 Signed-off-by: Kees Cook <[email protected]> Cc: <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: David Howells <[email protected]> Cc: Eugene Teo <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Brad Spengler <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-03-23proc: make struct proc_dir_entry::namelen unsigned intAlexey Dobriyan1-4/+4
1. namelen is declared "unsigned short" which hints for "maybe space savings". Indeed in 2.4 struct proc_dir_entry looked like: struct proc_dir_entry { unsigned short low_ino; unsigned short namelen; Now, low_ino is "unsigned int", all savings were gone for a long time. "struct proc_dir_entry" is not that countless to worry about it's size, anyway. 2. converting from unsigned short to int/unsigned int can only create problems, we better play it safe. Space is not really conserved, because of natural alignment for the next field. sizeof(struct proc_dir_entry) remains the same. Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-03-23procfs: fix some wrong error code usageJovi Zhang1-2/+5
[root@wei 1]# cat /proc/1/mem cat: /proc/1/mem: No such process error code -ESRCH is wrong in this situation. Return -EPERM instead. Signed-off-by: Jovi Zhang <[email protected]> Reviewed-by: KOSAKI Motohiro <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Al Viro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-03-23procfs: fix /proc/<pid>/maps heap checkAaro Koskinen1-2/+2
The current code fails to print the "[heap]" marking if the heap is split into multiple mappings. Fix the check so that the marking is displayed in all possible cases: 1. vma matches exactly the heap 2. the heap vma is merged e.g. with bss 3. the heap vma is splitted e.g. due to locked pages Test cases. In all cases, the process should have mapping(s) with [heap] marking: (1) vma matches exactly the heap #include <stdio.h> #include <unistd.h> #include <sys/types.h> int main (void) { if (sbrk(4096) != (void *)-1) { printf("check /proc/%d/maps\n", (int)getpid()); while (1) sleep(1); } return 0; } # ./test1 check /proc/553/maps [1] + Stopped ./test1 # cat /proc/553/maps | head -4 00008000-00009000 r-xp 00000000 01:00 3113640 /test1 00010000-00011000 rw-p 00000000 01:00 3113640 /test1 00011000-00012000 rw-p 00000000 00:00 0 [heap] 4006f000-40070000 rw-p 00000000 00:00 0 (2) the heap vma is merged #include <stdio.h> #include <unistd.h> #include <sys/types.h> char foo[4096] = "foo"; char bar[4096]; int main (void) { if (sbrk(4096) != (void *)-1) { printf("check /proc/%d/maps\n", (int)getpid()); while (1) sleep(1); } return 0; } # ./test2 check /proc/556/maps [2] + Stopped ./test2 # cat /proc/556/maps | head -4 00008000-00009000 r-xp 00000000 01:00 3116312 /test2 00010000-00012000 rw-p 00000000 01:00 3116312 /test2 00012000-00014000 rw-p 00000000 00:00 0 [heap] 4004a000-4004b000 rw-p 00000000 00:00 0 (3) the heap vma is splitted (this fails without the patch) #include <stdio.h> #include <unistd.h> #include <sys/mman.h> #include <sys/types.h> int main (void) { if ((sbrk(4096) != (void *)-1) && !mlockall(MCL_FUTURE) && (sbrk(4096) != (void *)-1)) { printf("check /proc/%d/maps\n", (int)getpid()); while (1) sleep(1); } return 0; } # ./test3 check /proc/559/maps [1] + Stopped ./test3 # cat /proc/559/maps|head -4 00008000-00009000 r-xp 00000000 01:00 3119108 /test3 00010000-00011000 rw-p 00000000 01:00 3119108 /test3 00011000-00012000 rw-p 00000000 00:00 0 [heap] 00012000-00013000 rw-p 00000000 00:00 0 [heap] It looks like the bug has been there forever, and since it only results in some information missing from a procfile, it does not fulfil the -stable "critical issue" criteria. Signed-off-by: Aaro Koskinen <[email protected]> Reviewed-by: KOSAKI Motohiro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-03-23proc: hide kernel addresses via %pK in /proc/<pid>/stackKonstantin Khlebnikov1-1/+1
This file is readable for the task owner. Hide kernel addresses from unprivileged users, leave them function names and offsets. Signed-off-by: Konstantin Khlebnikov <[email protected]> Acked-by: Kees Cook <[email protected]> Cc: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-03-23deal with races in /proc/*/{syscall,stack,personality}Al Viro1-19/+50
All of those are rw-r--r-- and all are broken for suid - if you open a file before the target does suid-root exec, you'll be still able to access it. For personality it's not a big deal, but for syscall and stack it's a real problem. Fix: check that task is tracable for you at the time of read(). Signed-off-by: Al Viro <[email protected]>
2011-03-23proc: enable writing to /proc/pid/memStephen Wilson1-5/+0
With recent changes there is no longer a security hazard with writing to /proc/pid/mem. Remove the #ifdef. Signed-off-by: Stephen Wilson <[email protected]> Signed-off-by: Al Viro <[email protected]>
2011-03-23proc: make check_mem_permission() return an mm_struct on successStephen Wilson1-24/+34
This change allows us to take advantage of access_remote_vm(), which in turn eliminates a security issue with the mem_write() implementation. The previous implementation of mem_write() was insecure since the target task could exec a setuid-root binary between the permission check and the actual write. Holding a reference to the target mm_struct eliminates this vulnerability. Signed-off-by: Stephen Wilson <[email protected]> Signed-off-by: Al Viro <[email protected]>
2011-03-23proc: hold cred_guard_mutex in check_mem_permission()Stephen Wilson1-4/+22
Avoid a potential race when task exec's and we get a new ->mm but check against the old credentials in ptrace_may_access(). Holding of the mutex is implemented by factoring out the body of the code into a helper function __check_mem_permission(). Performing this factorization now simplifies upcoming changes and minimizes churn in the diff's. Signed-off-by: Stephen Wilson <[email protected]> Signed-off-by: Al Viro <[email protected]>
2011-03-23proc: disable mem_write after execStephen Wilson1-0/+4
This change makes mem_write() observe the same constraints as mem_read(). This is particularly important for mem_write as an accidental leak of the fd across an exec could result in arbitrary modification of the target process' memory. IOW, /proc/pid/mem is implicitly close-on-exec. Signed-off-by: Stephen Wilson <[email protected]> Signed-off-by: Al Viro <[email protected]>
2011-03-23mm: arch: make get_gate_vma take an mm_struct instead of a task_structStephen Wilson1-3/+5
Morally, the presence of a gate vma is more an attribute of a particular mm than a particular task. Moreover, dropping the dependency on task_struct will help make both existing and future operations on mm's more flexible and convenient. Signed-off-by: Stephen Wilson <[email protected]> Reviewed-by: Michel Lespinasse <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Signed-off-by: Al Viro <[email protected]>
2011-03-23auxv: require the target to be tracable (or yourself)Al Viro1-3/+3
same as for environ, except that we didn't do any checks to prevent access after suid execve Signed-off-by: Al Viro <[email protected]>
2011-03-23close race in /proc/*/environAl Viro1-6/+4
Switch to mm_for_maps(). Maybe we ought to make it r--r--r--, since we do checks on IO anyway... Signed-off-by: Al Viro <[email protected]>
2011-03-23report errors in /proc/*/*map* sanelyAl Viro3-11/+13
Signed-off-by: Al Viro <[email protected]>
2011-03-23pagemap: close races with suid execveAl Viro2-7/+4
just use mm_for_maps() Signed-off-by: Al Viro <[email protected]>
2011-03-23make sessionid permissions in /proc/*/task/* match those in /proc/*Al Viro1-1/+1
Signed-off-by: Al Viro <[email protected]>
2011-03-22smaps: have smaps show transparent huge pagesDave Hansen1-0/+4
Now that the mere act of _looking_ at /proc/$pid/smaps will not destroy transparent huge pages, tell how much of the VMA is actually mapped with them. This way, we can make sure that we're getting THPs where we expect to see them. Signed-off-by: Dave Hansen <[email protected]> Acked-by: Mel Gorman <[email protected]> Acked-by: David Rientjes <[email protected]> Reviewed-by: Eric B Munson <[email protected]> Tested-by: Eric B Munson <[email protected]> Cc: Michael J Wolf <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Matt Mackall <[email protected]> Cc: Jeremy Fitzhardinge <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-03-22smaps: teach smaps_pte_range() about THP pmdsDave Hansen1-2/+21
This adds code to explicitly detect and handle pmd_trans_huge() pmds. It then passes HPAGE_SIZE units in to the smap_pte_entry() function instead of PAGE_SIZE. This means that using /proc/$pid/smaps now will no longer cause THPs to be broken down in to small pages. Signed-off-by: Dave Hansen <[email protected]> Reviewed-by: Eric B Munson <[email protected]> Tested-by: Eric B Munson <[email protected]> Acked-by: Andrea Arcangeli <[email protected]> Acked-by: David Rientjes <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Michael J Wolf <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Matt Mackall <[email protected]> Cc: Jeremy Fitzhardinge <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-03-22smaps: pass pte size argument in to smaps_pte_entry()Dave Hansen1-12/+12
Add an argument to the new smaps_pte_entry() function to let it account in things other than PAGE_SIZE units. I changed all of the PAGE_SIZE sites, even though not all of them can be reached for transparent huge pages, just so this will continue to work without changes as THPs are improved. Signed-off-by: Dave Hansen <[email protected]> Acked-by: Mel Gorman <[email protected]> Acked-by: Johannes Weiner <[email protected]> Acked-by: David Rientjes <[email protected]> Reviewed-by: Eric B Munson <[email protected]> Tested-by: Eric B Munson <[email protected]> Cc: Michael J Wolf <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: Matt Mackall <[email protected]> Cc: Jeremy Fitzhardinge <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-03-22smaps: break out smaps_pte_entry() from smaps_pte_range()Dave Hansen1-40/+47
We will use smaps_pte_entry() in a moment to handle both small and transparent large pages. But, we must break it out of smaps_pte_range() first. Signed-off-by: Dave Hansen <[email protected]> Acked-by: Mel Gorman <[email protected]> Acked-by: Johannes Weiner <[email protected]> Acked-by: David Rientjes <[email protected]> Reviewed-by: Eric B Munson <[email protected]> Tested-by: Eric B Munson <[email protected]> Cc: Michael J Wolf <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: Matt Mackall <[email protected]> Cc: Jeremy Fitzhardinge <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-03-22pagewalk: only split huge pages when necessaryDave Hansen1-0/+6
Right now, if a mm_walk has either ->pte_entry or ->pmd_entry set, it will unconditionally split any transparent huge pages it runs in to. In practice, that means that anyone doing a cat /proc/$pid/smaps will unconditionally break down every huge page in the process and depend on khugepaged to re-collapse it later. This is fairly suboptimal. This patch changes that behavior. It teaches each ->pmd_entry handler (there are five) that they must break down the THPs themselves. Also, the _generic_ code will never break down a THP unless a ->pte_entry handler is actually set. This means that the ->pmd_entry handlers can now choose to deal with THPs without breaking them down. [[email protected]: coding-style fixes] Signed-off-by: Dave Hansen <[email protected]> Acked-by: Mel Gorman <[email protected]> Acked-by: David Rientjes <[email protected]> Reviewed-by: Eric B Munson <[email protected]> Tested-by: Eric B Munson <[email protected]> Cc: Michael J Wolf <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Matt Mackall <[email protected]> Cc: Jeremy Fitzhardinge <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-03-16Merge branch 'next' into for-linusJames Morris1-1/+0
2011-03-10/proc/self is never going to be invalidated...Al Viro1-30/+0
Signed-off-by: Al Viro <[email protected]>
2011-03-08unfuck proc_sysctl ->d_compare()Al Viro2-4/+11
a) struct inode is not going to be freed under ->d_compare(); however, the thing PROC_I(inode)->sysctl points to just might. Fortunately, it's enough to make freeing that sucker delayed, provided that we don't step on its ->unregistering, clear the pointer to it in PROC_I(inode) before dropping the reference and check if it's NULL in ->d_compare(). b) I'm not sure that we *can* walk into NULL inode here (we recheck dentry->seq between verifying that it's still hashed / fetching dentry->d_inode and passing it to ->d_compare() and there's no negative hashed dentries in /proc/sys/*), but if we can walk into that, we really should not have ->d_compare() return 0 on it! Said that, I really suspect that this check can be simply killed. Nick? Signed-off-by: Al Viro <[email protected]>
2011-03-08Merge branch 'master' of git://git.infradead.org/users/eparis/selinux into nextJames Morris1-1/+0
2011-03-02of/flattree: Drop an uninteresting message to pr_debug levelPaul Bolle1-1/+1
This message looks like an error (which it isn't) when booting with a flattened device tree. Remove the message from normal kernel builds. Signed-off-by: Paul Bolle <[email protected]> Signed-off-by: Grant Likely <[email protected]>
2011-02-15s390: remove task_show_regsMartin Schwidefsky1-3/+0
task_show_regs used to be a debugging aid in the early bringup days of Linux on s390. /proc/<pid>/status is a world readable file, it is not a good idea to show the registers of a process. The only correct fix is to remove task_show_regs. Reported-by: Al Viro <[email protected]> Signed-off-by: Martin Schwidefsky <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-02-01security/selinux: fix /proc/sys/ labelingLucian Adrian Grijincu1-1/+0
This fixes an old (2007) selinux regression: filesystem labeling for /proc/sys returned -r--r--r-- unknown /proc/sys/fs/file-nr instead of -r--r--r-- system_u:object_r:sysctl_fs_t:s0 /proc/sys/fs/file-nr Events that lead to breaking of /proc/sys/ selinux labeling: 1) sysctl was reimplemented to route all calls through /proc/sys/ commit 77b14db502cb85a031fe8fde6c85d52f3e0acb63 [PATCH] sysctl: reimplement the sysctl proc support 2) proc_dir_entry was removed from ctl_table: commit 3fbfa98112fc3962c416452a0baf2214381030e6 [PATCH] sysctl: remove the proc_dir_entry member for the sysctl tables 3) selinux still walked the proc_dir_entry tree to apply labeling. Because ctl_tables don't have a proc_dir_entry, we did not label /proc/sys/ inodes any more. To achieve this the /proc/sys/ inodes were marked private and private inodes were ignored by selinux. commit bbaca6c2e7ef0f663bc31be4dad7cf530f6c4962 [PATCH] selinux: enhance selinux to always ignore private inodes commit 86a71dbd3e81e8870d0f0e56b87875f57e58222b [PATCH] sysctl: hide the sysctl proc inodes from selinux Access control checks have been done by means of a special sysctl hook that was called for read/write accesses to any /proc/sys/ entry. We don't have to do this because, instead of walking the proc_dir_entry tree we can walk the dentry tree (as done in this patch). With this patch: * we don't mark /proc/sys/ inodes as private * we don't need the sysclt security hook * we walk the dentry tree to find the path to the inode. We have to strip the PID in /proc/PID/ entries that have a proc_dir_entry because selinux does not know how to label paths like '/1/net/rpc/nfsd.fh' (and defaults to 'proc_t' labeling). Selinux does know of '/net/rpc/nfsd.fh' (and applies the 'sysctl_rpc_t' label). PID stripping from the path was done implicitly in the previous code because the proc_dir_entry tree had the root in '/net' in the example from above. The dentry tree has the root in '/1'. Signed-off-by: Eric W. Biederman <[email protected]> Signed-off-by: Lucian Adrian Grijincu <[email protected]> Signed-off-by: Eric Paris <[email protected]>
2011-01-26console: rename acquire/release_console_sem() to console_lock/unlock()Torben Hohn1-2/+2
The -rt patches change the console_semaphore to console_mutex. As a result, a quite large chunk of the patches changes all acquire/release_console_sem() to acquire/release_console_mutex() This commit makes things use more neutral function names which dont make implications about the underlying lock. The only real change is the return value of console_trylock which is inverted from try_acquire_console_sem() This patch also paves the way to switching console_sem from a semaphore to a mutex. [[email protected]: coding-style fixes] [[email protected]: make console_trylock return 1 on success, per Geert] Signed-off-by: Torben Hohn <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Greg KH <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-01-20kconfig: rename CONFIG_EMBEDDED to CONFIG_EXPERTDavid Rientjes1-3/+3
The meaning of CONFIG_EMBEDDED has long since been obsoleted; the option is used to configure any non-standard kernel with a much larger scope than only small devices. This patch renames the option to CONFIG_EXPERT in init/Kconfig and fixes references to the option throughout the kernel. A new CONFIG_EMBEDDED option is added that automatically selects CONFIG_EXPERT when enabled and can be used in the future to isolate options that should only be considered for embedded systems (RISC architectures, SLOB, etc). Calling the option "EXPERT" more accurately represents its intention: only expert users who understand the impact of the configuration changes they are making should enable it. Reviewed-by: Ingo Molnar <[email protected]> Acked-by: David Woodhouse <[email protected]> Signed-off-by: David Rientjes <[email protected]> Cc: Greg KH <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Robin Holt <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-01-13thp: remove PG_buddyAndrea Arcangeli1-6/+8
PG_buddy can be converted to _mapcount == -2. So the PG_compound_lock can be added to page->flags without overflowing (because of the sparse section bits increasing) with CONFIG_X86_PAE=y and CONFIG_X86_PAT=y. This also has to move the memory hotplug code from _mapcount to lru.next to avoid any risk of clashes. We can't use lru.next for PG_buddy removal, but memory hotplug can use lru.next even more easily than the mapcount instead. Signed-off-by: Andrea Arcangeli <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-01-13thp: transparent hugepage vmstatAndrea Arcangeli1-1/+13
Add hugepage stat information to /proc/vmstat and /proc/meminfo. Signed-off-by: Andrea Arcangeli <[email protected]> Acked-by: Rik van Riel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-01-13oom: allow a non-CAP_SYS_RESOURCE proces to oom_score_adj downMandeep Singh Baines1-1/+3
We'd like to be able to oom_score_adj a process up/down as it enters/leaves the foreground. Currently, it is not possible to oom_adj down without CAP_SYS_RESOURCE. This patch allows a task to decrease its oom_score_adj back to the value that a CAP_SYS_RESOURCE thread set it to or its inherited value at fork. Assuming the thread that has forked it has oom_score_adj of 0, each process could decrease it back from 0 upon activation unless a CAP_SYS_RESOURCE thread elevated it to something higher. Alternative considered: * a setuid binary * a daemon with CAP_SYS_RESOURCE Since you don't wan't all processes to be able to reduce their oom_adj, a setuid or daemon implementation would be complex. The alternatives also have much higher overhead. This patch updated from original patch based on feedback from David Rientjes. Signed-off-by: Mandeep Singh Baines <[email protected]> Acked-by: David Rientjes <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Ying Han <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-01-13mm: smaps: export mlock informationNikanth Karthikesan1-2/+5
Currently there is no way to find whether a process has locked its pages in memory or not. And which of the memory regions are locked in memory. Add a new field "Locked" to export this information via the smaps file. Signed-off-by: Nikanth Karthikesan <[email protected]> Acked-by: Balbir Singh <[email protected]> Acked-by: Wu Fengguang <[email protected]> Cc: Matt Mackall <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-01-13/proc/kcore: fix seekingDave Anderson1-1/+1
Commit 34aacb2920 ("procfs: Use generic_file_llseek in /proc/kcore") broke seeking on /proc/kcore. This changes it back to use default_llseek in order to restore the original behavior. The problem with generic_file_llseek is that it only allows seeks up to inode->i_sb->s_maxbytes, which is 2GB-1 on procfs, where the memory file offset values in the /proc/kcore PT_LOAD segments may exceed or start beyond that offset value. A similar revert was made for /proc/vmcore. Signed-off-by: Dave Anderson <[email protected]> Acked-by: Frederic Weisbecker <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-01-13proc: move proc_console.c to fs/proc/consoles.cAlexey Dobriyan2-3/+3
Filename is supposed to match procfile name for random junk. Add __init while I'm at it. Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-01-13proc: less LOCK/UNLOCK in remove_proc_entry()Alexey Dobriyan1-4/+1
For the common case where a proc entry is being removed and nobody is in the process of using it, save a LOCK/UNLOCK pair. Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-01-13kpagecount: add slab page checking because _mapcount is in a unionPetr Holasek1-1/+1
Add a PageSlab() check before adding the _mapcount value to /kpagecount. page->_mapcount is in a union with the SLAB structure so for pages controlled by SLAB, page_mapcount() returns nonsense. Signed-off-by: Petr Holasek <[email protected]> Cc: Wu Fengguang <[email protected]> Cc: Matt Mackall <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-01-13proc: use single_open() correctlyJovi Zhang1-26/+3
single_open()'s third argument is for copying into seq_file->private. Use that, rather than open-coding it. Signed-off-by: Jovi Zhang <[email protected]> Acked-by: David Rientjes <[email protected]> Acked-by: Alexey Dobriyan <[email protected]> Reviewed-by: KOSAKI Motohiro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-01-13proc: ->low_ino cleanupAlexey Dobriyan3-15/+6
- ->low_ino is write-once field -- reading it under locks is unnecessary. - /proc/$PID stuff never reaches pde_put()/free_proc_entry() -- PROC_DYNAMIC_FIRST check never triggers. - in proc_get_inode(), inode number always matches proc dir entry, so save one parameter. Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-01-13proc: use seq_puts()/seq_putc() where possibleAlexey Dobriyan6-31/+31
For string without format specifiers, use seq_puts(). For seq_printf("\n"), use seq_putc('\n'). text data bss dec hex filename 61866 488 112 62466 f402 fs/proc/proc.o 61729 488 112 62329 f379 fs/proc/proc.o ---------------------------------------------------- -139 Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-01-13proc: use unsigned long inside /proc/*/statmAlexey Dobriyan4-9/+12
/proc/*/statm code needlessly truncates data from unsigned long to int. One needs only 8+ TB of RAM to make truncation visible. Signed-off-by: Alexey Dobriyan <[email protected]> Reviewed-by: WANG Cong <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-01-13fs/proc/base.c, kernel/latencytop.c: convert sprintf_symbol() to %psJoe Perches1-14/+8
Use temporary lr for struct latency_record for improved readability and fewer columns used. Removed trailing space from output. [[email protected]: coding-style fixes] Signed-off-by: Joe Perches <[email protected]> Cc: Jiri Kosina <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-01-07Merge branch 'tty-next' of ↵Linus Torvalds2-0/+115
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6 * 'tty-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6: (36 commits) serial: apbuart: Fixup apbuart_console_init() TTY: Add tty ioctl to figure device node of the system console. tty: add 'active' sysfs attribute to tty0 and console device drivers: serial: apbuart: Handle OF failures gracefully Serial: Avoid unbalanced IRQ wake disable during resume tty: fix typos/errors in tty_driver.h comments pch_uart : fix warnings for 64bit compile 8250: fix uninitialized FIFOs ip2: fix compiler warning on ip2main_pci_tbl specialix: fix compiler warning on specialix_pci_tbl rocket: fix compiler warning on rocket_pci_ids 8250: add a UPIO_DWAPB32 for 32 bit accesses 8250: use container_of() instead of casting serial: omap-serial: Add support for kernel debugger serial: fix pch_uart kconfig & build drivers: char: hvc: add arm JTAG DCC console support RS485 documentation: add 16C950 UART description serial: ifx6x60: fix memory leak serial: ifx6x60: free IRQ on error Serial: EG20T: add PCH_UART driver ... Fixed up conflicts in drivers/serial/apbuart.c with evil merge that makes the code look fairly sane (unlike either side).
2011-01-07Merge branch 'vfs-scale-working' of ↵Linus Torvalds4-29/+68
git://git.kernel.org/pub/scm/linux/kernel/git/npiggin/linux-npiggin * 'vfs-scale-working' of git://git.kernel.org/pub/scm/linux/kernel/git/npiggin/linux-npiggin: (57 commits) fs: scale mntget/mntput fs: rename vfsmount counter helpers fs: implement faster dentry memcmp fs: prefetch inode data in dcache lookup fs: improve scalability of pseudo filesystems fs: dcache per-inode inode alias locking fs: dcache per-bucket dcache hash locking bit_spinlock: add required includes kernel: add bl_list xfs: provide simple rcu-walk ACL implementation btrfs: provide simple rcu-walk ACL implementation ext2,3,4: provide simple rcu-walk ACL implementation fs: provide simple rcu-walk generic_check_acl implementation fs: provide rcu-walk aware permission i_ops fs: rcu-walk aware d_revalidate method fs: cache optimise dentry and inode for rcu-walk fs: dcache reduce branches in lookup path fs: dcache remove d_mounted fs: fs_struct use seqlock fs: rcu-walk for path lookup ...