aboutsummaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)AuthorFilesLines
2012-10-08btrfs: improved readablity for add_inode_refJan Schmidt1-81/+97
Moved part of the code into a sub function and replaced most of the gotos by ifs, hoping that it will be easier to read now. Signed-off-by: Jan Schmidt <[email protected]> Signed-off-by: Mark Fasheh <[email protected]>
2012-10-08Btrfs: handle not finding the extent exactly when logging changed extentsJosef Bacik1-6/+40
I started hitting warnings when running xfstest 68 in a loop because there were EM's that were not lined up properly with the physical extents. This is ok, if we do something like punch a hole or write to a preallocated space or something like that we can have an EM that doesn't cover the entire physical extent. So fix the tree logging stuff to cope with this case so we don't just commit the transaction. With this patch I no longer see the warnings from the tree logging code. Thanks, Signed-off-by: Josef Bacik <[email protected]>
2012-10-08btrfs: move transaction aborts to the point of failureDavid Sterba4-47/+80
Call btrfs_abort_transaction as early as possible when an error condition is detected, that way the line number reported is useful and we're not clueless anymore which error path led to the abort. Signed-off-by: David Sterba <[email protected]>
2012-10-08Btrfs: fix the missing error information in create_pending_snapshot()Miao Xie1-22/+35
The macro btrfs_abort_transaction() can get the line number of the code where the problem happens, so we should invoke it in the place that the error occurs, or we will lose the line number. Reported-by: David Sterba <[email protected]> Signed-off-by: Miao Xie <[email protected]>
2012-10-08Btrfs: fix off-by-one in file cloneLiu Bo1-9/+9
Btrfs uses inclusive range end for lock_extent(), unlock_extent() and related functions, so we made off-by-one errors in file clone. This fixes it and also fixes some style problems. Signed-off-by: Liu Bo <[email protected]>
2012-10-09exec: make de_thread() killableOleg Nesterov1-2/+14
Change de_thread() to use KILLABLE rather than UNINTERRUPTIBLE while waiting for other threads. The only complication is that we should clear ->group_exit_task and ->notify_count before we return, and we should do this under tasklist_lock. -EAGAIN is used to match the initial signal_group_exit() check/return, it doesn't really matter. This fixes the (unlikely) race with coredump. de_thread() checks signal_group_exit() before it starts to kill the subthreads, but this can't help if another CLONE_VM (but non CLONE_THREAD) task starts the coredumping after de_thread() unlocks ->siglock. In this case the killed sub-thread can block in exit_mm() waiting for coredump_finish(), execing thread waits for that sub-thead, and the coredumping thread waits for execing thread. Deadlock. Signed-off-by: Oleg Nesterov <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-10-07cifs: reinstate the forcegid optionJeff Layton1-0/+9
Apparently this was lost when we converted to the standard option parser in 8830d7e07a5e38bc47650a7554b7c1cfd49902bf Cc: Sachin Prabhu <[email protected]> Cc: [email protected] # v3.4+ Reported-by: Gregory Lee Bartholomew <[email protected]> Signed-off-by: Jeff Layton <[email protected]> Signed-off-by: Steve French <[email protected]>
2012-10-07Convert properly UTF-8 to UTF-16Frediano Ziglio1-0/+22
wchar_t is currently 16bit so converting a utf8 encoded characters not in plane 0 (>= 0x10000) to wchar_t (that is calling char2uni) lead to a -EINVAL return. This patch detect utf8 in cifs_strtoUTF16 and add special code calling utf8s_to_utf16s. Signed-off-by: Frediano Ziglio <[email protected]> Acked-by: Jeff Layton <[email protected]> Signed-off-by: Steve French <[email protected]>
2012-10-07[CIFS] WARN_ON_ONCE if kernel_sendmsg() returns -ENOSPCSteve French1-0/+6
kernel_sendmsg() is less likely to return -ENOSPC and it might be a bug to do so. However, in the past there might have been cases where a -ENOSPC was returned from a low level driver. Add a WARN_ON_ONCE() to ensure that it is safe to assume that -ENOSPC is no longer returned. This -ENOSPC specific handling will be removed once we are sure it is no longer returned. Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Suresh Jayaraman <[email protected]> Signed-off-by: Steve French <[email protected]>
2012-10-08Merge branch 'for-linus' of ↵Linus Torvalds6-27/+46
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull ceph updates from Sage Weil: "The bulk of this pull is a series from Alex that refactors and cleans up the RBD code to lay the groundwork for supporting the new image format and evolving feature set. There are also some cleanups in libceph, and for ceph there's fixed validation of file striping layouts and a bugfix in the code handling a shrinking MDS cluster." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (71 commits) ceph: avoid 32-bit page index overflow ceph: return EIO on invalid layout on GET_DATALOC ioctl rbd: BUG on invalid layout ceph: propagate layout error on osd request creation libceph: check for invalid mapping ceph: convert to use le32_add_cpu() ceph: Fix oops when handling mdsmap that decreases max_mds rbd: update remaining header fields for v2 rbd: get snapshot name for a v2 image rbd: get the snapshot context for a v2 image rbd: get image features for a v2 image rbd: get the object prefix for a v2 rbd image rbd: add code to get the size of a v2 rbd image rbd: lay out header probe infrastructure rbd: encapsulate code that gets snapshot info rbd: add an rbd features field rbd: don't use index in __rbd_add_snap_dev() rbd: kill create_snap sysfs entry rbd: define rbd_dev_image_id() rbd: define some new format constants ...
2012-10-08Merge tag 'ext4_for_linus' of ↵Linus Torvalds22-775/+1353
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: "The big new feature added this time is supporting online resizing using the meta_bg feature. This allows us to resize file systems which are greater than 16TB. In addition, the speed of online resizing has been improved in general. We also fix a number of races, some of which could lead to deadlocks, in ext4's Asynchronous I/O and online defrag support, thanks to good work by Dmitry Monakhov. There are also a large number of more minor bug fixes and cleanups from a number of other ext4 contributors, quite of few of which have submitted fixes for the first time." * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (69 commits) ext4: fix ext4_flush_completed_IO wait semantics ext4: fix mtime update in nodelalloc mode ext4: fix ext_remove_space for punch_hole case ext4: punch_hole should wait for DIO writers ext4: serialize truncate with owerwrite DIO workers ext4: endless truncate due to nonlocked dio readers ext4: serialize unlocked dio reads with truncate ext4: serialize dio nonlocked reads with defrag workers ext4: completed_io locking cleanup ext4: fix unwritten counter leakage ext4: give i_aiodio_unwritten a more appropriate name ext4: ext4_inode_info diet ext4: convert to use leXX_add_cpu() ext4: ext4_bread usage audit fs: reserve fallocate flag codepoint ext4: remove redundant offset check in mext_check_arguments() ext4: don't clear orphan list on ro mount with errors jbd2: fix assertion failure in commit code due to lacking transaction credits ext4: release donor reference when EXT4_IOC_MOVE_EXT ioctl fails ext4: enable FITRIM ioctl on bigalloc file system ...
2012-10-07Merge tag 'for-v3.7' of git://git.infradead.org/users/cbou/linux-pstoreLinus Torvalds5-15/+125
Pull pstore changes from Anton Vorontsov: 1) We no longer ad-hoc to the function tracer "high level" infrastructure and no longer use its debugfs knobs. The change slightly touches kernel/trace directory, but it got the needed ack from Steven Rostedt: http://lkml.org/lkml/2012/8/21/688 2) Added maintainers entry; 3) A bunch of fixes, nothing special. * tag 'for-v3.7' of git://git.infradead.org/users/cbou/linux-pstore: pstore: Avoid recursive spinlocks in the oops_in_progress case pstore/ftrace: Convert to its own enable/disable debugfs knob pstore/ram: Add missing platform_device_unregister MAINTAINERS: Add pstore maintainers pstore/ram: Mark ramoops_pstore_write_buf() as notrace pstore/ram: Fix printk format warning pstore/ram: Fix possible NULL dereference
2012-10-06omfs: convert to use beXX_add_cpu()Wei Yongjun1-3/+2
Convert cpu_to_beXX(beXX_to_cpu(E1) + E2) to use beXX_add_cpu(). dpatch engine is used to auto generate this patch. (https://github.com/weiyj/dpatch) Signed-off-by: Wei Yongjun <[email protected]> Acked-by: Bob Copeland <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-10-06fs/proc/root.c: use NULL instead of 0 for pointerSachin Kamat1-1/+1
This cleanup also fixes the following sparse warning: fs/proc/root.c:64:45: warning: Using plain integer as NULL pointer Signed-off-by: Sachin Kamat <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-10-06proc_sysctl.c: use BUG_ON instead of BUGPrasad Joshi1-2/+1
The use of if (!head) BUG(); can be replaced with the single line BUG_ON(!head). Signed-off-by: Prasad Joshi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-10-06proc: use kzalloc instead of kmalloc and memsetyan1-7/+6
Part of the memory will be written twice after this change, but that should be negligible. [[email protected]: fix __proc_create() coding-style issues, remove unneeded zero-initialisations] Signed-off-by: yan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-10-06proc: no need to initialize proc_inode->fd in proc_get_inode()yan1-1/+0
proc_get_inode() obtains the inode via a call to iget_locked(). iget_locked() calls alloc_inode() which will call proc_alloc_inode() which clears proc_inode.fd, so there is no need to clear this field in proc_get_inode(). If iget_locked() instead found the inode via find_inode_fast(), that inode will not have I_NEW set so this change has no effect. Signed-off-by: yan <[email protected]> Cc: Al Viro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-10-06proc: return -ENOMEM when inode allocation failedyan1-1/+1
If proc_get_inode() returns NULL then presumably it encountered memory exhaustion. proc_lookup_de() should return -ENOMEM in this case, not -EINVAL. Signed-off-by: yan <[email protected]> Cc: Ryan Mallon <[email protected]> Cc: Cong Wang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-10-06coredump: extend core dump note section to contain file names of mapped filesDenys Vlasenko2-4/+107
This note has the following format: long count -- how many files are mapped long page_size -- units for file_ofs array of [COUNT] elements of long start long end long file_ofs followed by COUNT filenames in ASCII: "FILE1" NUL "FILE2" NUL... Signed-off-by: Denys Vlasenko <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Amerigo Wang <[email protected]> Cc: "Jonathan M. Foote" <[email protected]> Cc: Roland McGrath <[email protected]> Cc: Pedro Alves <[email protected]> Cc: Fengguang Wu <[email protected]> Cc: Stephen Rothwell <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-10-06coredump: add a new elf note with siginfo of the signalDenys Vlasenko2-2/+31
Existing PRSTATUS note contains only si_signo, si_code, si_errno fields from the siginfo of the signal which caused core to be dumped. There are tools which try to analyze crashes for possible security implications, and they want to use, among other data, si_addr field from the SIGSEGV. This patch adds a new elf note, NT_SIGINFO, which contains the complete siginfo_t of the signal which killed the process. Signed-off-by: Denys Vlasenko <[email protected]> Reviewed-by: Oleg Nesterov <[email protected]> Cc: Amerigo Wang <[email protected]> Cc: "Jonathan M. Foote" <[email protected]> Cc: Roland McGrath <[email protected]> Cc: Pedro Alves <[email protected]> Cc: Fengguang Wu <[email protected]> Cc: Stephen Rothwell <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-10-06coredump: pass siginfo_t* to do_coredump() and below, not merely signrDenys Vlasenko5-17/+17
This is a preparatory patch for the introduction of NT_SIGINFO elf note. With this patch we pass "siginfo_t *siginfo" instead of "int signr" to do_coredump() and put it into coredump_params. It will be used by the next patch. Most changes are simple s/signr/siginfo->si_signo/. Signed-off-by: Denys Vlasenko <[email protected]> Reviewed-by: Oleg Nesterov <[email protected]> Cc: Amerigo Wang <[email protected]> Cc: "Jonathan M. Foote" <[email protected]> Cc: Roland McGrath <[email protected]> Cc: Pedro Alves <[email protected]> Cc: Fengguang Wu <[email protected]> Cc: Stephen Rothwell <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-10-06coredump: use SUID_DUMPABLE_ENABLED rather than hardcoded 1Oleg Nesterov2-2/+2
Cosmetic. Change setup_new_exec() and task_dumpable() to use SUID_DUMPABLE_ENABLED for /bin/grep. [[email protected]: checkpatch fixes] Signed-off-by: Oleg Nesterov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-10-06coredump: add support for %d=__get_dumpable() in core nameOleg Nesterov1-3/+7
Some coredump handlers want to create a core file in a way compatible with standard behavior. Standard behavior with fs.suid_dumpable = 2 is to create core file with uid=gid=0. However, there was no way for coredump handler to know that the process being dumped was suid'ed. This patch adds the new %d specifier for format_corename() which simply reports __get_dumpable(mm->flags), this is compatible with /proc/sys/fs/suid_dumpable we already have. Addresses https://bugzilla.redhat.com/show_bug.cgi?id=787135 Developed during a discussion with Denys Vlasenko. Signed-off-by: Oleg Nesterov <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: Alex Kelly <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Cong Wang <[email protected]> Cc: Jiri Moskovcak <[email protected]> Acked-by: Neil Horman <[email protected]> Cc: Alan Cox <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-10-06coredump: update coredump-related headersAlex Kelly3-0/+9
Create a new header file, fs/coredump.h, which contains functions only used by the new coredump.c. It also moves do_coredump to the include/linux/coredump.h header file, for consistency. Signed-off-by: Alex Kelly <[email protected]> Reviewed-by: Josh Triplett <[email protected]> Acked-by: Serge Hallyn <[email protected]> Acked-by: Kees Cook <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-10-06coredump: make core dump functionality optionalAlex Kelly3-26/+37
Adds an expert Kconfig option, CONFIG_COREDUMP, which allows disabling of core dump. This saves approximately 2.6k in the compiled kernel, and complements CONFIG_ELF_CORE, which now depends on it. CONFIG_COREDUMP also disables coredump-related sysctls, except for suid_dumpable and related functions, which are necessary for ptrace. [[email protected]: fix binfmt_aout.c build] Signed-off-by: Alex Kelly <[email protected]> Reviewed-by: Josh Triplett <[email protected]> Acked-by: Serge Hallyn <[email protected]> Acked-by: Kees Cook <[email protected]> Cc: Randy Dunlap <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-10-06fat: simplify writeback_inode()Namjae Jeon1-9/+5
[[email protected]: checkpatch fixes] Signed-off-by: Namjae Jeon <[email protected]> Signed-off-by: Amit Sahrawat <[email protected]> Acked-by: OGAWA Hirofumi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-10-06fat: no need to reset EOF in ent_put for FAT32Namjae Jeon1-3/+0
#define FAT_ENT_EOF(EOF_FAT32) there is no need to reset value of 'new' for FAT32 as the values is already correct Signed-off-by: Namjae Jeon <[email protected]> Signed-off-by: Amit Sahrawat <[email protected]> Acked-by: OGAWA Hirofumi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-10-06fs/fat: fix checkpatch issues in fatent.cCruz Julian Bishop1-4/+6
1: Stop any lines going over 80 characters 2: Remove a blank line before EXPORT_SYMBOL_GPL Signed-off-by: Cruz Julian Bishop <[email protected]> Acked-by: OGAWA Hirofumi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-10-06fs/fat: fix all other checkpatch issues in dir.cCruz Julian Bishop1-9/+14
1: Import linux/uaccess.h instead of asm.uaccess.h 2: Stop any lines going over 80 characters 3: Stopped setting any variables in if statements 4: Stopped splitting quoted strings 5: Removed unneeded parentheses Signed-off-by: Cruz Julian Bishop <[email protected]> Acked-by: OGAWA Hirofumi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-10-06fs/fat: fix some small checkpatch issues in dir.cCruz Julian Bishop1-7/+0
Simply remove the spacing between function definitions and EXPORT_SYMBOL_GPL calls, which were previously generating warnings. Signed-off-by: Cruz Julian Bishop <[email protected]> Acked-by: OGAWA Hirofumi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-10-06fs/fat: fix two checkpatch issues in cache.cCruz Julian Bishop1-4/+6
This does the following: 1: Splits the arguments of a function call to stop it from exceeding 80 characters 2: Re-indents the arguments of another function call to prevent the splitting of a quoted string. Signed-off-by: Cruz Julian Bishop <[email protected]> Acked-by: OGAWA Hirofumi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-10-06fs/fat: chang indentation of some comments in fat.hCruz Julian Bishop1-36/+36
The comments were not lined up properly, so I just re-indented them. This also fixes a stupid checkpatch issue unknowingly Signed-off-by: Cruz Julian Bishop <[email protected]> Acked-by: OGAWA Hirofumi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-10-06fs/fat: fix some checkpatch issues in fat.hCruz Julian Bishop1-3/+3
Mainly fix spacing issues such as "foo * bar" and "foo= bar" Signed-off-by: Cruz Julian Bishop <[email protected]> Acked-by: OGAWA Hirofumi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-10-06fs/fat: fix a checkpatch issue in namei_msdos.cCruz Julian Bishop1-1/+1
Add a space before an equals sign/operator in line 410. Signed-off-by: Cruz Julian Bishop <[email protected]> Acked-by: OGAWA Hirofumi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-10-06fat (exportfs): fix dentry reconnectionSteven J. Magnani6-137/+141
Maintain an index of directory inodes by starting cluster, so that fat_get_parent() can return the proper cached inode rather than inventing one that cannot be traced back to the filesystem root. Add a new msdos/vfat binary mount option "nfs" so that FAT filesystems that are _not_ exported via NFS are not saddled with maintenance of an index they will never use. Finally, simplify NFS file handle generation and lookups. An ext2-congruent implementation is adequate for FAT needs. Signed-off-by: Steven J. Magnani <[email protected]> Acked-by: OGAWA Hirofumi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-10-06fat (exportfs): move NFS support codeSteven J. Magnani4-131/+174
Under memory pressure, the system may evict dentries from cache. When the FAT driver receives a NFS request involving an evicted dentry, it is unable to reconnect it to the filesystem root. This causes the request to fail, often with ENOENT. This is partially due to ineffectiveness of the current FAT NFS implementation, and partially due to an unimplemented fh_to_parent method. The latter can cause file accesses to fail on shares exported with subtree_check. This patch set provides the FAT driver with the ability to reconnect dentries. NFS file handle generation and lookups are simplified and made congruent with ext2. Testing has involved a memory-starved virtual machine running 3.5-rc5 that exports a ~2 GB vfat filesystem containing a kernel tree (~770 MB, ~40000 files, 9 levels). Both 'cp -r' and 'ls -lR' operations were performed from a client, some overlapping, some consecutive. Exports with 'subtree_check' and 'no_subtree_check' have been tested. Note that while this patch set improves FAT's NFS support, it does not eliminate ESTALE errors completely. The following should be considered for NFS clients who are sensitive to ESTALE: * Mounting with lookupcache=none Unfortunately this can degrade performance severely, particularly for deep filesystems. * Incorporating VFS patches to retry ESTALE failures on the client-side, such as https://lkml.org/lkml/2012/6/29/381 * Handling ESTALE errors in client application code This patch: Move NFS-related code into its own C file. No functional changes. Signed-off-by: Steven J. Magnani <[email protected]> Acked-by: OGAWA Hirofumi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-10-06fat: use accessor function for msdos_dir_entry 'start'Namjae Jeon1-4/+2
Use accessor function for msdos_dir_entry 'start' Signed-off-by: Namjae Jeon <[email protected]> Signed-off-by: Amit Sahrawat <[email protected]> Acked-by: OGAWA Hirofumi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-10-06hpfs: convert to use leXX_add_cpu()Wei Yongjun2-17/+17
Convert cpu_to_leXX(leXX_to_cpu(E1) + E2) to use leXX_add_cpu(). dpatch engine is used to auto generate this patch. (https://github.com/weiyj/dpatch) Signed-off-by: Wei Yongjun <[email protected]> Cc: Mikulas Patocka <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-10-06binfmt_elf: Uninitialized variableAlan Cox1-1/+1
load_elf_interp() has interp_map_addr carefully described as "uninitialized_var" and marked so as to avoid a warning. However if you trace the code it is passed into load_elf_interp and then this value is checked against NULL. As this return value isn't used this is actually safe but it freaks various analysis tools that see un-initialized memory addresses being read before their value is ever defined. Set it to NULL as a matter of programming good taste if nothing else Signed-off-by: Alan Cox <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-10-06epoll: support for disabling items, and a self-test appPaton J. Lewis1-3/+35
Enhanced epoll_ctl to support EPOLL_CTL_DISABLE, which disables an epoll item. If epoll_ctl doesn't return -EBUSY in this case, it is then safe to delete the epoll item in a multi-threaded environment. Also added a new test_epoll self- test app to both demonstrate the need for this feature and test it. Signed-off-by: Paton J. Lewis <[email protected]> Cc: Alexander Viro <[email protected]> Cc: Jason Baron <[email protected]> Cc: Paul Holland <[email protected]> Cc: Davide Libenzi <[email protected]> Cc: Michael Kerrisk <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-10-06idr: rename MAX_LEVEL to MAX_IDR_LEVELFengguang Wu1-1/+1
To avoid name conflicts: drivers/video/riva/fbdev.c:281:9: sparse: preprocessor token MAX_LEVEL redefined While at it, also make the other names more consistent and add parentheses. [[email protected]: repair fallout] [[email protected]: IB/mlx4: fix for MAX_ID_MASK to MAX_IDR_MASK name change] Signed-off-by: Fengguang Wu <[email protected]> Cc: Bernd Petrovitsch <[email protected]> Cc: walter harms <[email protected]> Cc: Glauber Costa <[email protected]> Signed-off-by: Stephen Rothwell <[email protected]> Cc: Roland Dreier <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-10-05ext4: fix ext4_flush_completed_IO wait semanticsDmitry Monakhov6-13/+19
BUG #1) All places where we call ext4_flush_completed_IO are broken because buffered io and DIO/AIO goes through three stages 1) submitted io, 2) completed io (in i_completed_io_list) conversion pended 3) finished io (conversion done) And by calling ext4_flush_completed_IO we will flush only requests which were in (2) stage, which is wrong because: 1) punch_hole and truncate _must_ wait for all outstanding unwritten io regardless to it's state. 2) fsync and nolock_dio_read should also wait because there is a time window between end_page_writeback() and ext4_add_complete_io() As result integrity fsync is broken in case of buffered write to fallocated region: fsync blkdev_completion ->filemap_write_and_wait_range ->ext4_end_bio ->end_page_writeback <-- filemap_write_and_wait_range return ->ext4_flush_completed_IO sees empty i_completed_io_list but pended conversion still exist ->ext4_add_complete_io BUG #2) Race window becomes wider due to the 'ext4: completed_io locking cleanup V4' patch series This patch make following changes: 1) ext4_flush_completed_io() now first try to flush completed io and when wait for any outstanding unwritten io via ext4_unwritten_wait() 2) Rename function to more appropriate name. 3) Assert that all callers of ext4_flush_unwritten_io should hold i_mutex to prevent endless wait Signed-off-by: Dmitry Monakhov <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]> Reviewed-by: Jan Kara <[email protected]>
2012-10-04Merge branch 'for_linus' of ↵Linus Torvalds6-51/+136
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull ext3 & udf fixes from Jan Kara: "Shortlog pretty much says it all. The interesting bits are UDF support for direct IO and ext3 fix for a long standing oops in data=journal mode." * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: jbd: Fix assertion failure in commit code due to lacking transaction credits UDF: Add support for O_DIRECT ext3: Replace 0 with NULL for pointer in super.c file udf: add writepages support for udf ext3: don't clear orphan list on ro mount with errors reiserfs: Make reiserfs_xattr_handlers static
2012-10-04btrfs: allow setting NOCOW for a zero sized file via ioctlDavid Sterba1-4/+27
Hi, the patch si simple, but it has user visible impact and I'm not quite sure how to resolve it. In short, $subj says it, chattr -C supports it and we want to use it. The conditions that acutally allow to change the NOCOW flag are clear. What if I try to set the flag on a file that is not empty? Options: 1) whole ioctl will fail, EINVAL 2.1) ioctl will succeed, the NOCOW flag will be silently removed, but the file will stay COW-ed and checksummed 2.2) ioctl will succeed, flag will not be removed and a syslog message will warn that the COW flag has not been changed 2.2.1) dtto, no syslog message Man page of chattr states that "If it is set on a file which already has data blocks, it is undefined when the blocks assigned to the file will be fully stable." Yes, it's undefined and with current implementation it'll never happen. So from this end, the user cannot expect anything. I'm trying to find a reasonable behaviour, so that a command like 'chattr -R -aijS +C' to tweak a broad set of flags in a deep directory does not fail unnecessarily and does not pollute the log. My personal preference is 2.2.1, but my dev's oppinion is skewed, not counting the fact that I know the code and otherwise would look there before consulting the documentation. The patch implements 2.2.1. david -------------8<------------------- From: David Sterba <[email protected]> It's safe to turn off checksums for a zero sized file. http://thread.gmane.org/gmane.comp.file-systems.btrfs/18030 "We cannot switch on NODATASUM for a file that already has extents that are checksummed. The invariant here is that either all the extents or none are checksummed. Theoretically it's possible to add/remove all checksums from a given file, but it's a potentially longtime operation, the file has to be in some intermediate state where the checksums partially exist but have to be ignored (for the csum->nocsum) until the file is fully converted, this brings more special cases to extent handling, it has to survive power failure and remain consistent, and probably needs to be restarted after next mount." Signed-off-by: David Sterba <[email protected]>
2012-10-04Btrfs: fix punch hole when no extent existsJosef Bacik1-1/+3
I saw the warning in btrfs_drop_extent_cache where our end is less than our start while running xfstests 68 in a loop. This is because we unconditionally do drop_end = min(end, extent_end) in __btrfs_drop_extents(), even though we may not have found an extent in the range we were looking to drop. So keep track of wether or not we found something, and if we didn't just use our end. Thanks, Signed-off-by: Josef Bacik <[email protected]>
2012-10-04Btrfs: don't do anything in our ->freeze_fs and ->unfreeze_fsJosef Bacik1-6/+0
We do not need to do anything special to freeze or unfreeze, it's all taken care of by the generic work, and what we currently have is wrong anyway since we shouldn't be returnning to userspace with mutexes held anyway. Thanks, Signed-off-by: Josef Bacik <[email protected]>
2012-10-04Btrfs: remove unused write cache pages hookJosef Bacik1-47/+0
The btree inode has it's own write cache pages so we can remove this write cache pages hook as it's not used. Thanks, Signed-off-by: Josef Bacik <[email protected]>
2012-10-04Btrfs: fix race when getting the eb out of page->privateJosef Bacik1-4/+19
We can race when checking wether PagePrivate is set on a page and we actually have an eb saved in the pages private pointer. We could have easily written out this page and released it in the time that we did the pagevec lookup and actually got around to looking at this page. So use mapping->private_lock to ensure we get a consistent view of the page->private pointer. This is inline with the alloc and releasepage paths which use private_lock when manipulating page->private. Thanks, Reported-by: David Sterba <[email protected]> Signed-off-by: Josef Bacik <[email protected]>
2012-10-04Btrfs: do not hold the write_lock on the extent tree while loggingJosef Bacik3-5/+20
Dave Sterba pointed out a sleeping while atomic bug while doing fsync. This is because I'm an idiot and didn't realize that rwlock's were spin locks, so we've been holding this thing while doing allocations and such which is not good. This patch fixes this by dropping the write lock before we do anything heavy and re-acquire it when it is done. We also need to take a ref on the em's in case their corresponding pages are evicted and mark them as being logged so that releasepage does not remove them and doesn't remove them from our local list. Thanks, Reported-by: Dave Sterba <[email protected]> Signed-off-by: Josef Bacik <[email protected]>
2012-10-04Btrfs: fix race with freeze and free space inodesJosef Bacik1-2/+11
So we start our freeze, somebody comes in and does an fsync() on a file where we have to commit a transaction for whatever reason, and we will deadlock because the freeze is waiting on FS_FREEZE people to stop writing to the file system, but the transaction is waiting for its free space inodes to be written out, which are in turn waiting on sb_start_intwrite while trying to write the file extents. To fix this we'll just skip the sb_start_intwrite() if we TRANS_JOIN_NOLOCK since we're being waited on by a transaction commit so we're safe wrt to freeze and this will keep us from deadlocking. Thanks, Signed-off-by: Josef Bacik <[email protected]>