aboutsummaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)AuthorFilesLines
2008-04-29trivial: fix user-visible typo in hfsplusDave Jones1-1/+1
Signed-off-by: Dave Jones <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds2-1/+91
* 'for-linus' of git://git.kernel.dk/linux-2.6-block: block: Skip I/O merges when disabled block: add large command support block: replace sizeof(rq->cmd) with BLK_MAX_CDB ide: use blk_rq_init() to initialize the request block: use blk_rq_init() to initialize the request block: rename and export rq_init() block: no need to initialize rq->cmd with blk_get_request block: no need to initialize rq->cmd in prepare_flush_fn hook block/blk-barrier.c:blk_ordered_cur_seq() mustn't be inline block/elevator.c:elv_rq_merge_ok() mustn't be inline block: make queue flags non-atomic block: add dma alignment and padding support to blk_rq_map_kern unexport blk_max_pfn ps3disk: Remove superfluous cast block: make rq_init() do a full memset() relay: fix splice problem
2008-04-29aio: fix misleading commentsJeff Moyer1-4/+1
The FIXME comments are inaccurate. The locking comment over lookup_ioctx() is wrong. Signed-off-by: Jeff Moyer <[email protected]> Signed-off-by: Zach Brown <[email protected]> Signed-off-by: Shen Feng <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29ncpfs: use get/put_unaligned_* helpersHarvey Harrison1-20/+19
[[email protected]: coding-style fixes] Signed-off-by: Harvey Harrison <[email protected]> Cc: Al Viro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29isofs: use get/put_unaligned_* helpersHarvey Harrison1-6/+6
Signed-off-by: Harvey Harrison <[email protected]> Cc: Jan Kara <[email protected]> Cc: Al Viro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29hfsplus: use get/put_unaligned_* helpersHarvey Harrison1-1/+1
Signed-off-by: Harvey Harrison <[email protected]> Cc: Roman Zippel <[email protected]> Cc: Al Viro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29fat: use get/put_unaligned_* helpersHarvey Harrison1-5/+3
Signed-off-by: Harvey Harrison <[email protected]> Acked-by: OGAWA Hirofumi <[email protected]> Cc: Al Viro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29afs: support the CB.ProbeUuid RPC opDavid Howells2-0/+108
Add support for the CB.ProbeUuid cache manager RPC op. This allows a modern OpenAFS server to quickly ask if the client has been rebooted. Signed-off-by: David Howells <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29afs: the AFS RPC op CBGetCapabilities is actually CBTellMeAboutYourselfDavid Howells2-14/+14
The AFS RxRPC op CBGetCapabilities is actually CBTellMeAboutYourself. Signed-off-by: David Howells <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29afs: use the shorter LIST_HEAD for brevityRobert P. J. Day1-1/+1
Signed-off-by: Robert P. J. Day <[email protected]> Acked-by: David Howells <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29Remove duplicated unlikely() in IS_ERR()Hirofumi Nakagawa4-6/+6
Some drivers have duplicated unlikely() macros. IS_ERR() already has unlikely() in itself. This patch cleans up such pointless code. Signed-off-by: Hirofumi Nakagawa <[email protected]> Acked-by: David S. Miller <[email protected]> Acked-by: Jeff Garzik <[email protected]> Cc: Paul Clements <[email protected]> Cc: Richard Purdie <[email protected]> Cc: Alessandro Zummo <[email protected]> Cc: David Brownell <[email protected]> Cc: James Bottomley <[email protected]> Cc: Michael Halcrow <[email protected]> Cc: Anton Altaparmakov <[email protected]> Cc: Al Viro <[email protected]> Cc: Carsten Otte <[email protected]> Cc: Patrick McHardy <[email protected]> Cc: Paul Mundt <[email protected]> Cc: Jaroslav Kysela <[email protected]> Cc: Takashi Iwai <[email protected]> Acked-by: Mike Frysinger <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29sysctl: add the ->permissions callback on the ctl_table_rootPavel Emelyanov1-2/+2
When reading from/writing to some table, a root, which this table came from, may affect this table's permissions, depending on who is working with the table. The core hunk is at the bottom of this patch. All the rest is just pushing the ctl_table_root argument up to the sysctl_perm() function. This will be mostly (only?) used in the net sysctls. Signed-off-by: Pavel Emelyanov <[email protected]> Acked-by: David S. Miller <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Denis V. Lunev <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29sysctl: merge equal proc_sys_read and proc_sys_writePavel Emelyanov1-39/+11
Many (most of) sysctls do not have a per-container sense. E.g. kernel.print_fatal_signals, vm.panic_on_oom, net.core.netdev_budget and so on and so forth. Besides, tuning then from inside a container is not even secure. On the other hand, hiding them completely from the container's tasks sometimes causes user-space to stop working. When developing net sysctl, the common practice was to duplicate a table and drop the write bits in table->mode, but this approach was not very elegant, lead to excessive memory consumption and was not suitable in general. Here's the alternative solution. To facilitate the per-container sysctls ctl_table_root-s were introduced. Each root contains a list of ctl_table_header-s that are visible to different namespaces. The idea of this set is to add the permissions() callback on the ctl_table_root to allow ctl root limit permissions to the same ctl_table-s. The main user of this functionality is the net-namespaces code, but later this will (should) be used by more and more namespaces, containers and control groups. Actually, this idea's core is in a single hunk in the third patch. First two patches are cleanups for sysctl code, while the third one mostly extends the arguments set of some sysctl functions. This patch: These ->read and ->write callbacks act in a very similar way, so merge these paths to reduce the number of places to patch later and shrink the .text size (a bit). Signed-off-by: Pavel Emelyanov <[email protected]> Acked-by: "David S. Miller" <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Denis V. Lunev <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29jbd2: use non-racy method for proc entries creationDenis V. Lunev1-13/+4
Use proc_create()/proc_create_data() to make sure that ->proc_fops and ->data be setup before gluing PDE to main tree. Signed-off-by: Denis V. Lunev <[email protected]> Cc: <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29reiserfs: use non-racy method for proc entries creationDenis V. Lunev1-6/+3
Use proc_create()/proc_create_data() to make sure that ->proc_fops and ->data be setup before gluing PDE to main tree. /proc entry owner is also added. Signed-off-by: Denis V. Lunev <[email protected]> Cc: Jeff Mahoney <[email protected]> Cc: Chris Mason <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29ext4: use non-racy method for proc entries creationDenis V. Lunev1-11/+4
Use proc_create()/proc_create_data() to make sure that ->proc_fops and ->data be setup before gluing PDE to main tree. Signed-off-by: Denis V. Lunev <[email protected]> Cc: <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29afs: use non-racy method for proc entries creationDenis V. Lunev1-19/+14
Use proc_create()/proc_create_data() to make sure that ->proc_fops and ->data be setup before gluing PDE to main tree. Signed-off-by: Denis V. Lunev <[email protected]> Acked-by: David Howells <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29nfs: use proc_create to setup de->proc_fopsDenis V. Lunev1-8/+6
Use proc_create() to make sure that ->proc_fops be setup before gluing PDE to main tree. Signed-off-by: Denis V. Lunev <[email protected]> Cc: "J. Bruce Fields" <[email protected]> Cc: Trond Myklebust <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29nfsd: use proc_create to setup de->proc_fopsDenis V. Lunev1-2/+2
Use proc_create() to make sure that ->proc_fops be setup before gluing PDE to main tree. Signed-off-by: Denis V. Lunev <[email protected]> Cc: Neil Brown <[email protected]> Cc: "J. Bruce Fields" <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29proc: introduce proc_create_data to setup de->dataDenis V. Lunev2-4/+6
This set of patches fixes an proc ->open'less usage due to ->proc_fops flip in the most part of the kernel code. The original OOPS is described in the commit 2d3a4e3666325a9709cc8ea2e88151394e8f20fc: Typical PDE creation code looks like: pde = create_proc_entry("foo", 0, NULL); if (pde) pde->proc_fops = &foo_proc_fops; Notice that PDE is first created, only then ->proc_fops is set up to final value. This is a problem because right after creation a) PDE is fully visible in /proc , and b) ->proc_fops are proc_file_operations which do not have ->open callback. So, it's possible to ->read without ->open (see one class of oopses below). The fix is new API called proc_create() which makes sure ->proc_fops are set up before gluing PDE to main tree. Typical new code looks like: pde = proc_create("foo", 0, NULL, &foo_proc_fops); if (!pde) return -ENOMEM; Fix most networking users for a start. In the long run, create_proc_entry() for regular files will go. In addition to this, proc_create_data is introduced to fix reading from proc without PDE->data. The race is basically the same as above. create_proc_entries is replaced in the entire kernel code as new method is also simply better. This patch: The problem is the same as for de->proc_fops. Right now PDE becomes visible without data set. So, the entry could be looked up without data. This, in most cases, will simply OOPS. proc_create_data call is created to address this issue. proc_create now becomes a wrapper around it. Signed-off-by: Denis V. Lunev <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: "J. Bruce Fields" <[email protected]> Cc: Alessandro Zummo <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Bartlomiej Zolnierkiewicz <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Chris Mason <[email protected]> Acked-by: David Howells <[email protected]> Cc: Dmitry Torokhov <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Grant Grundler <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Haavard Skinnemoen <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Bottomley <[email protected]> Cc: Jaroslav Kysela <[email protected]> Cc: Jeff Garzik <[email protected]> Cc: Jeff Mahoney <[email protected]> Cc: Jesper Nilsson <[email protected]> Cc: Karsten Keil <[email protected]> Cc: Kyle McMartin <[email protected]> Cc: Len Brown <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Mauro Carvalho Chehab <[email protected]> Cc: Mikael Starvik <[email protected]> Cc: Nadia Derbey <[email protected]> Cc: Neil Brown <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Osterlund <[email protected]> Cc: Pierre Peiffer <[email protected]> Cc: Russell King <[email protected]> Cc: Takashi Iwai <[email protected]> Cc: Tony Luck <[email protected]> Cc: Trond Myklebust <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29proc: convert /proc/tty/ldiscs to seq_file interfaceAlexey Dobriyan1-31/+45
Note: THIS_MODULE and header addition aren't technically needed because this code is not modular, but let's keep it anyway because people can copy this code into modular code. Signed-off-by: Alexey Dobriyan <[email protected]> Cc: Alan Cox <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29proc: remove ->get_info infrastructureAlexey Dobriyan1-6/+1
Now that last dozen or so users of ->get_info were removed, ditch it too. Everyone sane shouldd have switched to seq_file interface long ago. P.S.: Co-existing 3 interfaces (->get_info/->read_proc/->proc_fops) for proc is long-standing crap, BTW, thus a) put ->read_proc/->write_proc/read_proc_entry() users on death row, b) new such users should be rejected, c) everyone is encouraged to convert his favourite ->read_proc user or I'll do it, lazy bastards. Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29proc: remove proc_root from driversAlexey Dobriyan3-2/+2
Remove proc_root export. Creation and removal works well if parent PDE is supplied as NULL -- it worked always that way. So, one useless export removed and consistency added, some drivers created PDEs with &proc_root as parent but removed them as NULL and so on. Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29proc: remove proc_root_driverAlexey Dobriyan1-4/+1
Use creation by full path: "driver/foo". Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29proc: remove proc_root_fsAlexey Dobriyan5-14/+12
Use creation by full path instead: "fs/foo". Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29proc: remove proc_busAlexey Dobriyan1-3/+2
Remove proc_bus export and variable itself. Using pathnames works fine and is slightly more understandable and greppable. Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29proc: drop several "PDE valid/invalid" checksAlexey Dobriyan2-56/+46
proc-misc code is noticeably full of "if (de)" checks when PDE passed is always valid. Remove them. Addition of such check in proc_lookup_de() is for failed lookup case. Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29proc: less special case in xlate codeAlexey Dobriyan1-3/+6
If valid "parent" is passed to proc_create/remove_proc_entry(), then name of PDE should consist of only one path component, otherwise creation or or removal will fail. However, if NULL is passed as parent then create/remove accept full path as a argument. This is arbitrary restriction -- all infrastructure is in place. So, patch allows the following to succeed: create_proc_entry("foo/bar", 0, pde_baz); remove_proc_entry("baz/foo/bar", &proc_root); Also makes the following to behave identically: create_proc_entry("foo/bar", 0, NULL); create_proc_entry("foo/bar", 0, &proc_root); Discrepancy noticed by Den Lunev (IIRC). Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29proc: simplify locking in remove_proc_entry()Alexey Dobriyan1-42/+40
proc_subdir_lock protects only modifying and walking through PDE lists, so after we've found PDE to remove and actually removed it from lists, there is no need to hold proc_subdir_lock for the rest of operation. Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29procfs: mem permission cleanupRoland McGrath1-9/+29
This cleans up the permission checks done for /proc/PID/mem i/o calls. It puts all the logic in a new function, check_mem_permission(). The old code repeated the (!MAY_PTRACE(task) || !ptrace_may_attach(task)) magical expression multiple times. The new function does all that work in one place, with clear comments. The old code called security_ptrace() twice on successful checks, once in MAY_PTRACE() and once in __ptrace_may_attach(). Now it's only called once, and only if all other checks have succeeded. Signed-off-by: Roland McGrath <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Oleg Nesterov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29proc: switch to proc_create()Alexey Dobriyan4-51/+24
Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29procfs task exe symlinkMatt Helsley6-70/+79
The kernel implements readlink of /proc/pid/exe by getting the file from the first executable VMA. Then the path to the file is reconstructed and reported as the result. Because of the VMA walk the code is slightly different on nommu systems. This patch avoids separate /proc/pid/exe code on nommu systems. Instead of walking the VMAs to find the first executable file-backed VMA we store a reference to the exec'd file in the mm_struct. That reference would prevent the filesystem holding the executable file from being unmounted even after unmapping the VMAs. So we track the number of VM_EXECUTABLE VMAs and drop the new reference when the last one is unmapped. This avoids pinning the mounted filesystem. [[email protected]: improve comments] [[email protected]: fix dup_mmap] Signed-off-by: Matt Helsley <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: David Howells <[email protected]> Cc:"Eric W. Biederman" <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Al Viro <[email protected]> Cc: Hugh Dickins <[email protected]> Signed-off-by: YAMAMOTO Takashi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29proc: print more information when removing non-empty directoriesAlexey Dobriyan1-1/+6
This usually saves one recompile to insert similar printk like below. :) Sample nastygram: remove_proc_entry: removing non-empty directory '/proc/foo', leaking at least 'bar' ------------[ cut here ]------------ WARNING: at fs/proc/generic.c:776 remove_proc_entry+0x18a/0x200() Modules linked in: foo(-) container fan battery dock sbs ac sbshc backlight ipv6 loop af_packet amd_rng sr_mod i2c_amd8111 i2c_amd756 cdrom i2c_core button thermal processor Pid: 3034, comm: rmmod Tainted: G M 2.6.25-rc1 #5 Call Trace: [<ffffffff80231974>] warn_on_slowpath+0x64/0x90 [<ffffffff80232a6e>] printk+0x4e/0x60 [<ffffffff802d6c8a>] remove_proc_entry+0x18a/0x200 [<ffffffff8045cd88>] mutex_lock_nested+0x1c8/0x2d0 [<ffffffff8025f0f0>] __try_stop_module+0x0/0x40 [<ffffffff8025effd>] sys_delete_module+0x14d/0x200 [<ffffffff8045df3d>] lockdep_sys_exit_thunk+0x35/0x67 [<ffffffff8031c307>] __up_read+0x27/0xa0 [<ffffffff8045decc>] trace_hardirqs_on_thunk+0x35/0x3a [<ffffffff8020b6ab>] system_call_after_swapgs+0x7b/0x80 ---[ end trace 10ef850597e89c54 ]--- Signed-off-by: Alexey Dobriyan <[email protected]> Cc: Arjan van de Ven <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29elf: fix shadowed variables in fs/binfmt_elf.cWANG Cong1-11/+10
Fix these sparse warings: fs/binfmt_elf.c:1749:29: warning: symbol 'tmp' shadows an earlier one fs/binfmt_elf.c:1734:28: originally declared here fs/binfmt_elf.c:2009:26: warning: symbol 'vma' shadows an earlier one fs/binfmt_elf.c:1892:24: originally declared here [[email protected]: chose better variable name] Signed-off-by: WANG Cong <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29BINFMT: fill_elf_header cleanup - use straight memset firstCyrill Gorcunov1-6/+3
This patch does simplify fill_elf_header function by setting to zero the whole elf header first. So we fillup the fields we really need only. before: text data bss dec hex filename 11735 80 0 11815 2e27 fs/binfmt_elf.o after: text data bss dec hex filename 11710 80 0 11790 2e0e fs/binfmt_elf.o viola, 25 bytes of text is freed Signed-off-by: Cyrill Gorcunov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29cgroups: add an owner to the mm_structBalbir Singh1-0/+1
Remove the mem_cgroup member from mm_struct and instead adds an owner. This approach was suggested by Paul Menage. The advantage of this approach is that, once the mm->owner is known, using the subsystem id, the cgroup can be determined. It also allows several control groups that are virtually grouped by mm_struct, to exist independent of the memory controller i.e., without adding mem_cgroup's for each controller, to mm_struct. A new config option CONFIG_MM_OWNER is added and the memory resource controller selects this config option. This patch also adds cgroup callbacks to notify subsystems when mm->owner changes. The mm_cgroup_changed callback is called with the task_lock() of the new task held and is called just prior to changing the mm->owner. I am indebted to Paul Menage for the several reviews of this patchset and helping me make it lighter and simpler. This patch was tested on a powerpc box, it was compiled with both the MM_OWNER config turned on and off. After the thread group leader exits, it's moved to init_css_state by cgroup_exit(), thus all future charges from runnings threads would be redirected to the init_css_set's subsystem. Signed-off-by: Balbir Singh <[email protected]> Cc: Pavel Emelianov <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Sudhir Kumar <[email protected]> Cc: YAMAMOTO Takashi <[email protected]> Cc: Hirokazu Takahashi <[email protected]> Cc: David Rientjes <[email protected]>, Cc: Balbir Singh <[email protected]> Acked-by: KAMEZAWA Hiroyuki <[email protected]> Acked-by: Pekka Enberg <[email protected]> Reviewed-by: Paul Menage <[email protected]> Cc: Oleg Nesterov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29cgroups: implement device whitelistSerge E. Hallyn1-0/+9
Implement a cgroup to track and enforce open and mknod restrictions on device files. A device cgroup associates a device access whitelist with each cgroup. A whitelist entry has 4 fields. 'type' is a (all), c (char), or b (block). 'all' means it applies to all types and all major and minor numbers. Major and minor are either an integer or * for all. Access is a composition of r (read), w (write), and m (mknod). The root device cgroup starts with rwm to 'all'. A child devcg gets a copy of the parent. Admins can then remove devices from the whitelist or add new entries. A child cgroup can never receive a device access which is denied its parent. However when a device access is removed from a parent it will not also be removed from the child(ren). An entry is added using devices.allow, and removed using devices.deny. For instance echo 'c 1:3 mr' > /cgroups/1/devices.allow allows cgroup 1 to read and mknod the device usually known as /dev/null. Doing echo a > /cgroups/1/devices.deny will remove the default 'a *:* mrw' entry. CAP_SYS_ADMIN is needed to change permissions or move another task to a new cgroup. A cgroup may not be granted more permissions than the cgroup's parent has. Any task can move itself between cgroups. This won't be sufficient, but we can decide the best way to adequately restrict movement later. [[email protected]: coding-style fixes] [[email protected]: fix may-be-used-uninitialized warning] Signed-off-by: Serge E. Hallyn <[email protected]> Acked-by: James Morris <[email protected]> Looks-good-to: Pavel Emelyanov <[email protected]> Cc: Daniel Hokka Zakrisson <[email protected]> Cc: Li Zefan <[email protected]> Cc: Paul Menage <[email protected]> Cc: Balbir Singh <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29eCryptfs: protect crypt_stat->flags in ecryptfs_open()Michael Halcrow1-0/+2
Make sure crypt_stat->flags is protected with a lock in ecryptfs_open(). Signed-off-by: Michael Halcrow <[email protected]> Cc: Al Viro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29eCryptfs: make key module subsystem respect namespacesMichael Halcrow4-64/+135
Make eCryptfs key module subsystem respect namespaces. Since I will be removing the netlink interface in a future patch, I just made changes to the netlink.c code so that it will not break the build. With my recent patches, the kernel module currently defaults to the device handle interface rather than the netlink interface. [[email protected]: export free_user_ns()] Signed-off-by: Michael Halcrow <[email protected]> Acked-by: Serge Hallyn <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29eCryptfs: integrate eCryptfs device handle into the module.Michael Halcrow6-226/+435
Update the versioning information. Make the message types generic. Add an outgoing message queue to the daemon struct. Make the functions to parse and write the packet lengths available to the rest of the module. Add functions to create and destroy the daemon structs. Clean up some of the comments and make the code a little more consistent with itself. [[email protected]: printk fixes] Signed-off-by: Michael Halcrow <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29eCryptfs: introduce device handle for userspace daemon communicationsMichael Halcrow1-0/+580
A regular device file was my real preference from the get-go, but I went with netlink at the time because I thought it would be less complex for managing send queues (i.e., just do a unicast and move on). It turns out that we do not really get that much complexity reduction with netlink, and netlink is more heavyweight than a device handle. In addition, the netlink interface to eCryptfs has been broken since 2.6.24. I am assuming this is a bug in how eCryptfs uses netlink, since the other in-kernel users of netlink do not seem to be having any problems. I have had one report of a user successfully using eCryptfs with netlink on 2.6.24, but for my own systems, when starting the userspace daemon, the initial helo message sent to the eCryptfs kernel module results in an oops right off the bat. I spent some time looking at it, but I have not yet found the cause. The netlink interface breaking gave me the motivation to just finish my patch to migrate to a regular device handle. If I cannot find out soon why the netlink interface in eCryptfs broke, I am likely to just send a patch to disable it in 2.6.24 and 2.6.25. I would like the device handle to be the preferred means of communicating with the userspace daemon from 2.6.26 on forward. This patch: Functions to facilitate reading and writing to the eCryptfs miscellaneous device handle. This will replace the netlink interface as the preferred mechanism for communicating with the userspace eCryptfs daemon. Each user has his own daemon, which registers itself by opening the eCryptfs device handle. Only one daemon per euid may be registered at any given time. The eCryptfs module sends a message to a daemon by adding its message to the daemon's outgoing message queue. The daemon reads the device handle to get the oldest message off the queue. Incoming messages from the userspace daemon are immediately handled. If the message is a response, then the corresponding process that is blocked waiting for the response is awakened. Signed-off-by: Michael Halcrow <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29ecryptfs: add missing lock around notify_changeMiklos Szeredi1-0/+2
Callers of notify_change() need to hold i_mutex. Signed-off-by: Miklos Szeredi <[email protected]> Cc: Michael Halcrow <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29ecryptfs: replace remaining __FUNCTION__ occurrencesHarvey Harrison6-36/+36
__FUNCTION__ is gcc-specific, use __func__ Signed-off-by: Harvey Harrison <[email protected]> Cc: Michael Halcrow <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29remove ecryptfs_header_cache_0Adrian Bunk1-1/+0
Remove the no longer used ecryptfs_header_cache_0. Signed-off-by: Adrian Bunk <[email protected]> Cc: Michael Halcrow <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29vfs: fix unconditional write_super() call in file_fsync()OGAWA Hirofumi1-1/+1
We need to check ->s_dirt before calling write_super(). It became the cause of an unneeded write. This bug was noticed by Sudhanshu Saxena. Signed-off-by: OGAWA Hirofumi <[email protected]> Cc: Al Viro <[email protected]> Cc: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29xattr: add missing consts to function argumentsDavid Howells1-20/+21
Add missing consts to xattr function arguments. Signed-off-by: David Howells <[email protected]> Cc: Andreas Gruenbacher <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29vfs: remove lives_below_in_same_fs()Jan Blunck1-12/+1
Remove lives_below_in_same_fs() since is_subdir() from fs/dcache.c is providing the same functionality. Signed-off-by: Jan Blunck <[email protected]> Acked-by: Miklos Szeredi <[email protected]> Cc: Al Viro <[email protected]> Cc: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29fs/inode.c: use hlist_for_each_entry()Matthias Kaehlcke1-4/+2
fs/inode.c: use hlist_for_each_entry() in find_inode() and find_inode_fast() [[email protected]: coding-style fixes] Signed-off-by: Matthias Kaehlcke <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29vfs: skip inodes without pages to free in drop_pagecache_sb()Jan Kara1-0/+2
Many inodes have no pagecache, so we can avoid lots of lock-takings. Signed-off-by: Jan Kara <[email protected]> Cc: Fengguang Wu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-29vfs: fix lock inversion in drop_pagecache_sb()Jan Kara1-1/+7
Fix longstanding lock inversion in drop_pagecache_sb by dropping inode_lock before calling __invalidate_mapping_pages(). We just have to make sure inode won't go away from under us by keeping reference to it and putting the reference only after we have safely resumed the scan of the inode list. A bit tricky but not too bad... Signed-off-by: Jan Kara <[email protected]> Cc: Fengguang Wu <[email protected]> Cc: David Chinner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>