aboutsummaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)AuthorFilesLines
2015-04-17fs/affs/affs.h: add mount option manipulation macrosFabian Frederick1-0/+4
Add clear/set/test affs mount option macros. Signed-off-by: Fabian Frederick <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-04-17fs/affs: use AFFS_MOUNT prefix for mount optionsFabian Frederick6-48/+54
Currently, affs still uses direct access on mount_options. This patch prepares to use affs_clear/set/test_opt() like other filesystems. Signed-off-by: Fabian Frederick <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-04-17adfs: return correct return valuesSanidhya Kashyap2-5/+16
Fix the wrong values returned by various functions such as EIO and ENOMEM. Signed-off-by: Sanidhya Kashyap <[email protected]> Cc: Fabian Frederick <[email protected]> Cc: Joe Perches <[email protected]> Cc: Taesoo kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-04-17fs/exec.c:de_thread: move notify_count write under lockKirill Tkhai1-1/+5
We set sig->notify_count = -1 between RELEASE and ACQUIRE operations: spin_unlock_irq(lock); ... if (!thread_group_leader(tsk)) { ... for (;;) { sig->notify_count = -1; write_lock_irq(&tasklist_lock); There are no restriction on it so other processors may see this STORE mixed with other STOREs in both areas limited by the spinlocks. Probably, it may be reordered with the above sig->group_exit_task = tsk; sig->notify_count = zap_other_threads(tsk); in some way. Set it under tasklist_lock locked to be sure nothing will be reordered. Signed-off-by: Kirill Tkhai <[email protected]> Acked-by: Oleg Nesterov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-04-17prctl: avoid using mmap_sem for exe_file serializationDavidlohr Bueso1-0/+6
Oleg cleverly suggested using xchg() to set the new mm->exe_file instead of calling set_mm_exe_file() which requires some form of serialization -- mmap_sem in this case. For archs that do not have atomic rmw instructions we still fallback to a spinlock alternative, so this should always be safe. As such, we only need the mmap_sem for looking up the backing vm_file, which can be done sharing the lock. Naturally, this means we need to manually deal with both the new and old file reference counting, and we need not worry about the MMF_EXE_FILE_CHANGED bits, which can probably be deleted in the future anyway. Signed-off-by: Davidlohr Bueso <[email protected]> Suggested-by: Oleg Nesterov <[email protected]> Acked-by: Oleg Nesterov <[email protected]> Reviewed-by: Konstantin Khlebnikov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-04-17mm: rcu-protected get_mm_exe_file()Konstantin Khlebnikov1-2/+1
This patch removes mm->mmap_sem from mm->exe_file read side. Also it kills dup_mm_exe_file() and moves exe_file duplication into dup_mmap() where both mmap_sems are locked. [[email protected]: fix comment typo] Signed-off-by: Konstantin Khlebnikov <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Al Viro <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: "Paul E. McKenney" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-04-17fs/fat: comment fix, fat_bits can be also 32Alexander Kuleshov1-1/+1
Signed-off-by: Alexander Kuleshov <[email protected]> Acked-by: OGAWA Hirofumi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-04-17fs/fat: remove unnecessary includesAlexander Kuleshov9-33/+1
'fat.h' includes <linux/buffer_head.h> which includes <linux/fs.h> which includes all the header files required for all *.c files fat filesystem. [[email protected]: fs/fat/iode.c needs seq_file.h] [[email protected]: put one actually necessary include file back] Signed-off-by: Alexander Kuleshov <[email protected]> Acked-by: OGAWA Hirofumi <[email protected]> Signed-off-by: Stephen Rothwell <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-04-17fs/fat: remove unnecessary defintionAlexander Kuleshov1-2/+1
'*sb' never used, so let's remote it and pass inode->i_sb directly to the MSDOS_SB. Signed-off-by: Alexander Kuleshov <[email protected]> Acked-by: OGAWA Hirofumi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-04-17hfsplus: don't store special "osx" xattr prefix on-diskThomas Hebb1-5/+14
On Mac OS X, HFS+ extended attributes are not namespaced. Since we want to be compatible with OS X filesystems and yet still support the Linux namespacing system, the hfsplus driver implements a special "osx" namespace that is reported for any attribute that is not namespaced on-disk. However, the current code for getting and setting these unprefixed attributes is broken. hfsplus_osx_setattr() and hfsplus_osx_getattr() are passed names that have already had their "osx." prefixes stripped by the generic functions. The functions first, quite correctly, check those names to make sure that they aren't prefixed with a known namespace, which would allow namespace access restrictions to be bypassed. However, the functions then prepend "osx." to the name they're given before passing it on to hfsplus_getattr() and hfsplus_setattr(). Not only does this cause the "osx." prefix to be stored on-disk, defeating its purpose, it also breaks the check for the special "com.apple.FinderInfo" attribute, which is reported for all files, and as a consequence makes some userspace applications (e.g. GNU patch) fail even when extended attributes are not otherwise in use. There are five commits which have touched this particular code: 127e5f5ae51e ("hfsplus: rework functionality of getting, setting and deleting of extended attributes") b168fff72109 ("hfsplus: use xattr handlers for removexattr") bf29e886b242 ("hfsplus: correct usage of HFSPLUS_ATTR_MAX_STRLEN for non-English attributes") fcacbd95e121 ("fs/hfsplus: move xattr_name allocation in hfsplus_getxattr()") ec1bbd346f18 ("fs/hfsplus: move xattr_name allocation in hfsplus_setxattr()") The first commit creates the functions to begin with. The namespace is prepended by the original code, which I believe was correct at the time, since hfsplus_?etattr() stripped the prefix if found. The second commit removes this behavior from hfsplus_?etattr() and appears to have been intended to also remove the prefixing from hfsplus_osx_?etattr(). However, what it actually does is remove a necessary strncpy() call completely, breaking the osx namespace entirely. The third commit re-adds the strncpy() call as it was originally, but doesn't mention it in its commit message. The final two commits refactor the code and don't affect its functionality. This commit does what b168fff attempted to do (prevent the prefix from being added), but does it properly, instead of passing in an empty buffer (which is what b168fff actually did). Fixes: b168fff72109 ("hfsplus: use xattr handlers for removexattr") Signed-off-by: Thomas Hebb <[email protected]> Cc: Hin-Tak Leung <[email protected]> Cc: Sergei Antonov <[email protected]> Cc: Anton Altaparmakov <[email protected]> Cc: Fabian Frederick <[email protected]> Cc: Christian Kujau <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Al Viro <[email protected]> Cc: Viacheslav Dubeyko <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-04-17hfsplus: fix expand when not enough available spaceSergei Antonov1-0/+6
Fix a bug which is reproduced as follows. Create a file: echo abc > test_file Try to expand the file beyond available space: truncate --size=<size exceeding available space> test_file Since HFS+ does not support file size > allocated size, truncate should fail. However, it ends successfully. The driver returns success despite having been unable to allocate the requested space for the file. Also filesystem check finds an error: Checking catalog file. Incorrect size for file test_file (It should be 469094400 instead of 1000000000) Add a piece of code analogous to code in the fat driver. Now a proper error is returned and filesystem remains consistent. Signed-off-by: Sergei Antonov <[email protected]> Cc: Vyacheslav Dubeyko <[email protected]> Cc: Hin-Tak Leung <[email protected]> Reviewed-by: Anton Altaparmakov <[email protected]> Cc: Al Viro <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Sougata Santra <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-04-17hfsplus: incorrect return valueChengyu Song1-2/+2
In case of memory allocation error, the return should be -ENOMEM, instead of -ENOSPC. Signed-off-by: Chengyu Song <[email protected]> Reviewed-by: Sergei Antonov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-04-17fs/hfsplus: replace if/BUG by BUG_ONFabian Frederick1-3/+1
Signed-off-by: Fabian Frederick <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-04-17fs/hfsplus: use bool instead of int for is_known_namespace() return valueFabian Frederick1-1/+1
is_known_namespace() only returns true/false. Also remove inline and let compiler decide what to do with static functions. Signed-off-by: Fabian Frederick <[email protected]> Cc: "David S. Miller" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-04-17fs/hfsplus: atomically set inode->i_flagsFabian Frederick1-7/+5
According to commit 5f16f3225b06 ("ext4: atomically set inode->i_flags in ext4_set_inode_flags()"). Signed-off-by: Fabian Frederick <[email protected]> Cc: "Theodore Ts'o" <[email protected]> Cc: Al Viro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-04-17fs/hfsplus: move xattr_name allocation in hfsplus_setxattr()Fabian Frederick5-65/+35
security/trusted/user/osx setxattr did the same xattr_name initialization. Move that operation in hfsplus_setxattr(). Tested with security/trusted/user getfattr/setfattr Signed-off-by: Fabian Frederick <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-04-17fs/hfsplus: move xattr_name allocation in hfsplus_getxattr()Fabian Frederick5-67/+38
security/trusted/user/osx getxattr did the same xattr_name initialization. Move that operation in hfsplus_getxattr(). Tested with security/trusted/user getfattr/setfattr Signed-off-by: Fabian Frederick <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-04-17hfsplus: add missing curly braces in hfsplus_delete_cat()Dan Carpenter1-1/+2
This doesn't change how the code works, but clearly the curly braces were intended. Signed-off-by: Dan Carpenter <[email protected]> Cc: Vyacheslav Dubeyko <[email protected]> Cc: Sougata Santra <[email protected]> Cc: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-04-17hfs: incorrect return valuesChengyu Song1-2/+2
In case of memory allocation error, the return should be -ENOMEM, instead of -ENOSPC. Signed-off-by: Chengyu Song <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-04-17nilfs2: use inode_set_flags() in nilfs_set_inode_flags()Ryusuke Konishi1-7/+8
Use inode_set_flags() to atomically set i_flags instead of clearing out the S_IMMUTABLE, S_APPEND, etc. flags and then setting them from the FS_IMMUTABLE_FL, FS_APPEND_FL flags to avoid a race where an immutable file has the immutable flag cleared for a brief window of time. This is a similar fix to commit 5f16f3225b06 ("ext4: atomically set inode->i_flags in ext4_set_inode_flags()"). Signed-off-by: Ryusuke Konishi <[email protected]> Cc: "Theodore Ts'o" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-04-17nilfs2: put out gfp mask manipulation from nilfs_set_inode_flags()Ryusuke Konishi1-2/+2
nilfs_set_inode_flags() function adjusts gfp-mask of inode->i_mapping as well as i_flags, however, this coupling of operations is not appropriate. For instance, nilfs_ioctl_setflags(), one of three callers of nilfs_set_inode_flags(), doesn't need to reinitialize the gfp-mask at all. In addition, nilfs_new_inode(), another caller of nilfs_set_inode_flags(), doesn't either because it has already initialized the gfp-mask. Only __nilfs_read_inode(), the remaining caller, needs it. So, this moves the gfp mask manipulation to __nilfs_read_inode() from nilfs_set_inode_flags(). Signed-off-by: Ryusuke Konishi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-04-17nilfs2: fix gcc warning at nilfs_checkpoint_is_mounted()Ryusuke Konishi1-1/+1
Fix the following build warning: fs/nilfs2/super.c: In function 'nilfs_checkpoint_is_mounted': fs/nilfs2/super.c:1023:10: warning: comparison of unsigned expression < 0 is always false [-Wtype-limits] if (cno < 0 || cno > nilfs->ns_cno) ^ This warning indicates that the comparision "cno < 0" is useless because variable "cno" has an unsigned integer type "__u64". Signed-off-by: Ryusuke Konishi <[email protected]> Reported-by: David Binderman <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-04-17nilfs2: improve execution time of NILFS_IOCTL_GET_CPINFO ioctlRyusuke Konishi1-6/+52
The older a filesystem gets, the slower lscp command becomes. This is because nilfs_cpfile_do_get_cpinfo() function meets more hole blocks as the start offset of valid checkpoint numbers gets bigger. This reduces the overhead by skipping hole blocks efficiently with nilfs_mdt_find_block() helper. A measurement result of this patch is as follows: Before: $ time lscp CNO DATE TIME MODE FLG BLKCNT ICNT 5769303 2015-02-22 19:31:33 cp - 108 1 5769304 2015-02-22 19:38:54 cp - 108 1 real 0m0.182s user 0m0.003s sys 0m0.180s After: $ time lscp CNO DATE TIME MODE FLG BLKCNT ICNT 5769303 2015-02-22 19:31:33 cp - 108 1 5769304 2015-02-22 19:38:54 cp - 108 1 real 0m0.003s user 0m0.001s sys 0m0.002s Signed-off-by: Ryusuke Konishi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-04-17nilfs2: add helper to find existent block on metadata fileRyusuke Konishi2-0/+57
Add a new metadata file function, nilfs_mdt_find_block(), which finds an existent block on a metadata file in a given range of blocks. This function skips continuous hole blocks efficiently by using nilfs_bmap_seek_key(). Signed-off-by: Ryusuke Konishi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-04-17nilfs2: add bmap function to seek a valid keyRyusuke Konishi4-1/+115
Add a new bmap function, nilfs_bmap_seek_key(), which seeks a valid entry and returns its key starting from a given key. This function can be used to skip hole blocks efficiently. Signed-off-by: Ryusuke Konishi <[email protected]> Cc: Dan Carpenter <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-04-17nilfs2: unify type of key arguments in bmap interfaceRyusuke Konishi4-20/+16
The type of key arguments in block mapping interface varies depending on function. For instance, nilfs_bmap_lookup_at_level() takes "__u64" for its key argument whereas nilfs_bmap_lookup() takes "unsigned long". This fits them to "__u64" to eliminate the variation. Signed-off-by: Ryusuke Konishi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-04-17nilfs2: use bgl_lock_ptr()Ryusuke Konishi1-2/+5
Simplify nilfs_mdt_bgl_lock() by utilizing bgl_lock_ptr() helper in <linux/blockgroup_lock.h>. Signed-off-by: Ryusuke Konishi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-04-17nilfs2: use set_mask_bits() for operations on buffer state bitmapRyusuke Konishi2-20/+18
nilfs_forget_buffer(), nilfs_clear_dirty_page(), and nilfs_segctor_complete_write() are using a bunch of atomic bit operations against buffer state bitmap. This reduces the number of them by utilizing set_mask_bits() macro. Signed-off-by: Ryusuke Konishi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-04-17nilfs2: do not use async write flag for segment summary buffersRyusuke Konishi1-3/+0
The async write flag is introduced to nilfs2 in the commit 7f42ec394156 ("nilfs2: fix issue with race condition of competition between segments for dirty blocks"), but the flag only makes sense for data buffers and btree node buffers. It is not needed for segment summary buffers. This gets rid of the latter uses as part of refactoring of atomic bit operations on buffer state bitmap. Signed-off-by: Ryusuke Konishi <[email protected]> Cc: Vyacheslav Dubeyko <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-04-17befs: replace typedef befs_inode_info by structureFabian Frederick2-7/+7
See Documentation/CodingStyle Signed-off-by: Fabian Frederick <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-04-17befs: replace typedef befs_sb_info by structureFabian Frederick5-13/+11
See Documenation/CodingStyle Signed-off-by: Fabian Frederick <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-04-17befs: replace typedef befs_mount_options by structureFabian Frederick2-5/+5
See Documentation/CodingStyle Signed-off-by: Fabian Frederick <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-04-17fs/binfmt_misc.c: simplify entry_status()Rasmus Villemoes1-21/+9
sprintf() reliably returns the number of characters printed, so we don't need to ask strlen() where we are. Also replace calling sprintf("%02x") in a loop with the much simpler bin2hex(). [[email protected]: it's odd to include kernel.h after everything else] Signed-off-by: Rasmus Villemoes <[email protected]> Cc: Al Viro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-04-16Merge branch 'for-linus' of ↵Linus Torvalds45-562/+439
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull third hunk of vfs changes from Al Viro: "This contains the ->direct_IO() changes from Omar + saner generic_write_checks() + dealing with fcntl()/{read,write}() races (mirroring O_APPEND/O_DIRECT into iocb->ki_flags and instead of repeatedly looking at ->f_flags, which can be changed by fcntl(2), check ->ki_flags - which cannot) + infrastructure bits for dhowells' d_inode annotations + Christophs switch of /dev/loop to vfs_iter_write()" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (30 commits) block: loop: switch to VFS ITER_BVEC configfs: Fix inconsistent use of file_inode() vs file->f_path.dentry->d_inode VFS: Make pathwalk use d_is_reg() rather than S_ISREG() VFS: Fix up debugfs to use d_is_dir() in place of S_ISDIR() VFS: Combine inode checks with d_is_negative() and d_is_positive() in pathwalk NFS: Don't use d_inode as a variable name VFS: Impose ordering on accesses of d_inode and d_flags VFS: Add owner-filesystem positive/negative dentry checks nfs: generic_write_checks() shouldn't be done on swapout... ocfs2: use __generic_file_write_iter() mirror O_APPEND and O_DIRECT into iocb->ki_flags switch generic_write_checks() to iocb and iter ocfs2: move generic_write_checks() before the alignment checks ocfs2_file_write_iter: stop messing with ppos udf_file_write_iter: reorder and simplify fuse: ->direct_IO() doesn't need generic_write_checks() ext4_file_write_iter: move generic_write_checks() up xfs_file_aio_write_checks: switch to iocb/iov_iter generic_write_checks(): drop isblk argument blkdev_write_iter: expand generic_file_checks() call in there ...
2015-04-16Merge branch 'for_linus' of ↵Linus Torvalds25-308/+461
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull quota and udf updates from Jan Kara: "The pull contains quota changes which complete unification of XFS and VFS quota interfaces (so tools can use either interface to manipulate any filesystem). There's also a patch to support project quotas in VFS quota subsystem from Li Xi. Finally there's a bunch of UDF fixes and cleanups and tiny cleanup in reiserfs & ext3" * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: (21 commits) udf: Update ctime and mtime when directory is modified udf: return correct errno for udf_update_inode() ext3: Remove useless condition in if statement. vfs: Add general support to enforce project quota limits reiserfs: fix __RASSERT format string udf: use int for allocated blocks instead of sector_t udf: remove redundant buffer_head.h includes udf: remove else after return in __load_block_bitmap() udf: remove unused variable in udf_table_free_blocks() quota: Fix maximum quota limit settings quota: reorder flags in quota state quota: paranoia: check quota tree root quota: optimize i_dquot access quota: Hook up Q_XSETQLIM for id 0 to ->set_info xfs: Add support for Q_SETINFO quota: Make ->set_info use structure with neccesary info to VFS and XFS quota: Remove ->get_xstate and ->get_xstatev callbacks gfs2: Convert to using ->get_state callback xfs: Convert to using ->get_state callback quota: Wire up Q_GETXSTATE and Q_GETXSTATV calls to work with ->get_state ...
2015-04-16Merge branch 'for-4.1/core' of git://git.kernel.dk/linux-blockLinus Torvalds1-15/+30
Pull block layer core bits from Jens Axboe: "This is the core pull request for 4.1. Not a lot of stuff in here for this round, mostly little fixes or optimizations. This pull request contains: - An optimization that speeds up queue runs on blk-mq, especially for the case where there's a large difference between nr_cpu_ids and the actual mapped software queues on a hardware queue. From Chong Yuan. - Honor node local allocations for requests on legacy devices. From David Rientjes. - Cleanup of blk_mq_rq_to_pdu() from me. - exit_aio() fixup from me, greatly speeding up exiting multiple IO contexts off exit_group(). For my particular test case, fio exit took ~6 seconds. A typical case of both exposing RCU grace periods to user space, and serializing exit of them. - Make blk_mq_queue_enter() honor the gfp mask passed in, so we only wait if __GFP_WAIT is set. From Keith Busch. - blk-mq exports and two added helpers from Mike Snitzer, which will be used by the dm-mq code. - Cleanups of blk-mq queue init from Wei Fang and Xiaoguang Wang" * 'for-4.1/core' of git://git.kernel.dk/linux-block: blk-mq: reduce unnecessary software queue looping aio: fix serial draining in exit_aio() blk-mq: cleanup blk_mq_rq_to_pdu() blk-mq: put blk_queue_rq_timeout together in blk_mq_init_queue() block: remove redundant check about 'set->nr_hw_queues' in blk_mq_alloc_tag_set() block: allocate request memory local to request queue blk-mq: don't wait in blk_mq_queue_enter() if __GFP_WAIT isn't set blk-mq: export blk_mq_run_hw_queues blk-mq: add blk_mq_init_allocated_queue and export blk_mq_register_disk
2015-04-16Merge tag 'powerpc-4.1-1' of ↵Linus Torvalds1-0/+3
git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux Pull powerpc updates from Michael Ellerman: - Numerous minor fixes, cleanups etc. - More EEH work from Gavin to remove its dependency on device_nodes. - Memory hotplug implemented entirely in the kernel from Nathan Fontenot. - Removal of redundant CONFIG_PPC_OF by Kevin Hao. - Rewrite of VPHN parsing logic & tests from Greg Kurz. - A fix from Nish Aravamudan to reduce memory usage by clamping nodes_possible_map. - Support for pstore on powernv from Hari Bathini. - Removal of old powerpc specific byte swap routines by David Gibson. - Fix from Vasant Hegde to prevent the flash driver telling you it was flashing your firmware when it wasn't. - Patch from Ben Herrenschmidt to add an OPAL heartbeat driver. - Fix for an oops causing get/put_cpu_var() imbalance in perf by Jan Stancek. - Some fixes for migration from Tyrel Datwyler. - A new syscall to switch the cpu endian by Michael Ellerman. - Large series from Wei Yang to implement SRIOV, reviewed and acked by Bjorn. - A fix for the OPAL sensor driver from Cédric Le Goater. - Fixes to get STRICT_MM_TYPECHECKS building again by Michael Ellerman. - Large series from Daniel Axtens to make our PCI hooks per PHB rather than per machine. - Small patch from Sam Bobroff to explicitly abort non-suspended transactions on syscalls, plus a test to exercise it. - Numerous reworks and fixes for the 24x7 PMU from Sukadev Bhattiprolu. - Small patch to enable the hard lockup detector from Anton Blanchard. - Fix from Dave Olson for missing L2 cache information on some CPUs. - Some fixes from Michael Ellerman to get Cell machines booting again. - Freescale updates from Scott: Highlights include BMan device tree nodes, an MSI erratum workaround, a couple minor performance improvements, config updates, and misc fixes/cleanup. * tag 'powerpc-4.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: (196 commits) powerpc/powermac: Fix build error seen with powermac smp builds powerpc/pseries: Fix compile of memory hotplug without CONFIG_MEMORY_HOTREMOVE powerpc: Remove PPC32 code from pseries specific find_and_init_phbs() powerpc/cell: Fix iommu breakage caused by controller_ops change powerpc/eeh: Fix crash in eeh_add_device_early() on Cell powerpc/perf: Cap 64bit userspace backtraces to PERF_MAX_STACK_DEPTH powerpc/perf/hv-24x7: Fail 24x7 initcall if create_events_from_catalog() fails powerpc/pseries: Correct memory hotplug locking powerpc: Fix missing L2 cache size in /sys/devices/system/cpu powerpc: Add ppc64 hard lockup detector support oprofile: Disable oprofile NMI timer on ppc64 powerpc/perf/hv-24x7: Add missing put_cpu_var() powerpc/perf/hv-24x7: Break up single_24x7_request powerpc/perf/hv-24x7: Define update_event_count() powerpc/perf/hv-24x7: Whitespace cleanup powerpc/perf/hv-24x7: Define add_event_to_24x7_request() powerpc/perf/hv-24x7: Rename hv_24x7_event_update powerpc/perf/hv-24x7: Move debug prints to separate function powerpc/perf/hv-24x7: Drop event_24x7_request() powerpc/perf/hv-24x7: Use pr_devel() to log message ... Conflicts: tools/testing/selftests/powerpc/Makefile tools/testing/selftests/powerpc/tm/Makefile
2015-04-16f2fs: pass checkpoint reason on roll-forward recoveryJaegeuk Kim3-2/+7
This patch adds CP_RECOVERY to remain recovery information for checkpoint. And, it makes sure writing checkpoint in this case. Signed-off-by: Jaegeuk Kim <[email protected]>
2015-04-16f2fs: avoid abnormal behavior on broken symlinkJaegeuk Kim1-1/+19
When f2fs_symlink was triggered and checkpoint was done before syncing its link path, f2fs can get broken symlink like "xxx -> \0\0\0". This incurs abnormal path_walk by VFS. Signed-off-by: Jaegeuk Kim <[email protected]>
2015-04-16f2fs: flush symlink path to avoid broken symlink after PORJaegeuk Kim1-0/+11
This patch tries to avoid broken symlink case after POR in best effort. This results in performance regression. But, if f2fs has inline_data and the target path is under 3KB-sized long, the page would be stored in its inode_block, so that there would be no performance regression. Note that, if user wants to keep this file atomically, it needs to trigger dir->fsync. And, there is still a hole to produce broken symlink. Signed-off-by: Jaegeuk Kim <[email protected]>
2015-04-16Merge branch 'xfs-dio-extend-fix' into for-nextDave Chinner3-82/+239
Conflicts: fs/xfs/xfs_file.c
2015-04-16xfs: using generic_file_direct_write() is unnecessaryDave Chinner1-3/+20
generic_file_direct_write() does all sorts of things to make DIO work "sorta ok" with mixed buffered IO workloads. We already do most of this work in xfs_file_aio_dio_write() because of the locking requirements, so there's only a couple of things it does for us. The first thing is that it does a page cache invalidation after the ->direct_IO callout. This can easily be added to the XFS code. The second thing it does is that if data was written, it updates the iov_iter structure to reflect the data written, and then does EOF size updates if necessary. For XFS, these EOF size updates are now not necessary, as we do them safely and race-free in IO completion context. That leaves just the iov_iter update, and that's also moved to the XFS code. Therefore we don't need to call generic_file_direct_write() and in doing so remove redundant buffered writeback and page cache invalidation calls from the DIO submission path. We also remove a racy EOF size update, and make the DIO submission code in XFS much easier to follow. Wins all round, really. Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Brian Foster <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
2015-04-16xfs: direct IO EOF zeroing needs to drain AIODave Chinner1-0/+10
When we are doing AIO DIO writes, the IOLOCK only provides an IO submission barrier. When we need to do EOF zeroing, we need to ensure that no other IO is in progress and all pending in-core EOF updates have been completed. This requires us to wait for all outstanding AIO DIO writes to the inode to complete and, if necessary, run their EOF updates. Once all the EOF updates are complete, we can then restart xfs_file_aio_write_checks() while holding the IOLOCK_EXCL, knowing that EOF is up to date and we have exclusive IO access to the file so we can run EOF block zeroing if we need to without interference. This gives EOF zeroing the same exclusivity against other IO as we provide truncate operations. Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Brian Foster <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
2015-04-16xfs: DIO write completion size updates raceDave Chinner2-1/+19
xfs_end_io_direct_write() can race with other IO completions when updating the in-core inode size. The IO completion processing is not serialised for direct IO - they are done either under the IOLOCK_SHARED for non-AIO DIO, and without any IOLOCK held at all during AIO DIO completion. Hence the non-atomic test-and-set update of the in-core inode size is racy and can result in the in-core inode size going backwards if the race if hit just right. If the inode size goes backwards, this can trigger the EOF zeroing code to run incorrectly on the next IO, which then will zero data that has successfully been written to disk by a previous DIO. To fix this bug, we need to serialise the test/set updates of the in-core inode size. This first patch introduces locking around the relevant updates and checks in the DIO path. Because we now have an ioend in xfs_end_io_direct_write(), we know exactly then we are doing an IO that requires an in-core EOF update, and we know that they are not running in interrupt context. As such, we do not need to use irqsave() spinlock variants to protect against interrupts while the lock is held. Hence we can use an existing spinlock in the inode to do this serialisation and so not need to grow the struct xfs_inode just to work around this problem. This patch does not address the test/set EOF update in generic_file_write_direct() for various reasons - that will be done as a followup with separate explanation. Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Brian Foster <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
2015-04-16xfs: DIO writes within EOF don't need an ioendDave Chinner2-30/+40
DIO writes that lie entirely within EOF have nothing to do in IO completion. In this case, we don't need no steekin' ioend, and so we can avoid allocating an ioend until we have a mapping that spans EOF. This means that IO completion has two contexts - deferred completion to the dio workqueue that uses an ioend, and interrupt completion that does nothing because there is nothing that can be done in this context. Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Brian Foster <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
2015-04-16xfs: handle DIO overwrite EOF update completion correctlyDave Chinner2-31/+31
Currently a DIO overwrite that extends the EOF (e.g sub-block IO or write into allocated blocks beyond EOF) requires a transaction for the EOF update. Thi is done in IO completion context, but we aren't explicitly handling this situation properly and so it can run in interrupt context. Ensure that we defer IO that spans EOF correctly to the DIO completion workqueue, and now that we have an ioend in IO completion we can use the common ioend completion path to do all the work. Note: we do not preallocate the append transaction as we can have multiple mapping and allocation calls per direct IO. hence preallocating can still leave us with nested transactions by attempting to map and allocate more blocks after we've preallocated an append transaction. Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Brian Foster <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
2015-04-16xfs: DIO needs an ioend for writesDave Chinner2-10/+85
Currently we can only tell DIO completion that an IO requires unwritten extent completion. This is done by a hacky non-null private pointer passed to Io completion, but the private pointer does not actually contain any information that is used. We also need to pass to IO completion the fact that the IO may be beyond EOF and so a size update transaction needs to be done. This is currently determined by checks in the io completion, but we need to determine if this is necessary at block mapping time as we need to defer the size update transactions to a completion workqueue, just like unwritten extent conversion. To do this, first we need to allocate and pass an ioend to to IO completion. Add this for unwritten extent conversion; we'll do the EOF updates in the next commit. Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Brian Foster <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
2015-04-16xfs: move DIO mapping size calculationDave Chinner1-33/+46
The mapping size calculation is done last in __xfs_get_blocks(), but we are going to need the actual mapping size we will use to map the direct IO correctly in xfs_map_direct(). Factor out the calculation for code clarity, and move the call to be the first operation in mapping the extent to the returned buffer. Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Brian Foster <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
2015-04-16xfs: factor DIO write mapping from get_blocksDave Chinner1-13/+27
Clarify and separate the buffer mapping logic so that the direct IO mapping is not tangled up in propagating the extent status to teh mapping buffer. This makes it easier to extend the direct IO mapping to use an ioend in future. Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Brian Foster <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
2015-04-16ext4 crypto: enable encryption feature flagTheodore Ts'o6-24/+79
Also add the test dummy encryption mode flag so we can more easily test the encryption patches using xfstests. Signed-off-by: Michael Halcrow <[email protected]> Signed-off-by: Theodore Ts'o <[email protected]>