aboutsummaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)AuthorFilesLines
2013-10-09of: remove HAVE_ARCH_DEVTREE_FIXUPSRob Herring1-3/+0
HAVE_ARCH_DEVTREE_FIXUPS appears to always be needed except for sparc, but it is only used for /proc/device-teee and sparc does not enable /proc/device-tree. So this option is redundant. Remove the option and always enable it. This has the side effect of fixing /proc/device-tree on arches such as arm64 which failed to define this option. Signed-off-by: Rob Herring <[email protected]> Acked-by: Vineet Gupta <[email protected]> Acked-by: Grant Likely <[email protected]> Cc: Russell King <[email protected]> Cc: James Hogan <[email protected]> Cc: Michal Simek <[email protected]> Cc: Jonas Bonn <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: [email protected] Cc: Chris Zankel <[email protected]> Cc: Max Filippov <[email protected]>
2013-10-09sched/numa: Call task_numa_free() from do_execve()Rik van Riel1-0/+1
It is possible for a task in a numa group to call exec, and have the new (unrelated) executable inherit the numa group association from its former self. This has the potential to break numa grouping, and is trivial to fix. Signed-off-by: Rik van Riel <[email protected]> Signed-off-by: Mel Gorman <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Srikar Dronamraju <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2013-10-09sched/numa: Report a NUMA task group IDMel Gorman1-0/+2
It is desirable to model from userspace how the scheduler groups tasks over time. This patch adds an ID to the numa_group and reports it via /proc/PID/status. Signed-off-by: Mel Gorman <[email protected]> Reviewed-by: Rik van Riel <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Srikar Dronamraju <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2013-10-08xfs: clean up xfs_inactive() error handling, kill VN_INACTIVE_[NO]CACHEBrian Foster3-26/+12
The xfs_inactive() return value is meaningless. Turn xfs_inactive() into a void function and clean up the error handling appropriately. Kill the VN_INACTIVE_[NO]CACHE directives as they are not relevant to Linux. Signed-off-by: Brian Foster <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Signed-off-by: Ben Myers <[email protected]>
2013-10-08xfs: push down inactive transaction mgmt for ifreeBrian Foster1-50/+71
Push the inode free work performed during xfs_inactive() down into a new xfs_inactive_ifree() helper. This clears xfs_inactive() from all inode locking and transaction management more directly associated with freeing the inode xattrs, extents and the inode itself. Signed-off-by: Brian Foster <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Signed-off-by: Ben Myers <[email protected]>
2013-10-08xfs: push down inactive transaction mgmt for truncateBrian Foster1-49/+68
Create the new xfs_inactive_truncate() function to handle the truncate portion of xfs_inactive(). Push the locking and transaction management into the new function. Signed-off-by: Brian Foster <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Signed-off-by: Ben Myers <[email protected]>
2013-10-08xfs: push down inactive transaction mgmt for remote symlinksBrian Foster3-54/+49
Push down the transaction management for remote symlinks from xfs_inactive() down to xfs_inactive_symlink_rmt(). The latter is cleaned up to avoid transaction management intended for the calling context (i.e., trans duplication, reservation, item attachment). Signed-off-by: Brian Foster <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Signed-off-by: Ben Myers <[email protected]>
2013-10-08xfs: add the inode directory type support to XFS_IOC_FSGEOMMark Tinguely2-3/+5
Add the inode type directory type support to XFS_IOC_FSGEOM so that xfs_repair/xfs_info knows if the superblock v4 filesystem enabled the feature. Signed-off-by: Mark Tinguely <[email protected]> Reviewed-by: Carlos Maiolino <[email protected]> Signed-off-by: Ben Myers <[email protected]>
2013-10-08f2fs: fix writing incorrect orphan blocksJaegeuk Kim1-2/+2
Previously, there was a erroneous scenario like below. thread 1: thread 2: f2fs_unlink - acquire_orphan_inode : sbi->n_orphans++ write_checkpoint - block_operations : f2fs_lock_all - do_checkpoint : write orphan blocks with sbi->n_orphans - unblock_operations - f2fs_lock_op - release_orphan_inode - f2fs_unlock_op During the checkpoint by thread 2, f2fs stores a wrong orphan block according to the wrong sbi->n_orphans. To avoid this, simply we should make cover acquire_orphan_inode too with f2fs_lock_op. Signed-off-by: Jaegeuk Kim <[email protected]>
2013-10-08f2fs: avoid unnecessary checkpointsJaegeuk Kim1-1/+3
During the f2fs_put_super procedure, we don't need to conduct checkpoint all the time, since we don't need to do that if superblock is clean. Signed-off-by: Jaegeuk Kim <[email protected]>
2013-10-07cifs: Allow LANMAN auth method for servers supporting unencapsulated ↵Sachin Prabhu1-2/+2
authentication methods This allows users to use LANMAN authentication on servers which support unencapsulated authentication. The patch fixes a regression where users using plaintext authentication were no longer able to do so because of changed bought in by patch 3f618223dc0bdcbc8d510350e78ee2195ff93768 https://bugzilla.redhat.com/show_bug.cgi?id=1011621 Reported-by: Panos Kavalagios <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Sachin Prabhu <[email protected]> Signed-off-by: Steve French <[email protected]>
2013-10-07cifs: Fix inability to write files >2GB to SMB2/3 sharesJan Klos1-2/+4
When connecting to SMB2/3 shares, maximum file size is set to non-LFS maximum in superblock. This is due to cap_large_files bit being different for SMB1 and SMB2/3 (where it is just an internal flag that is not negotiated and the SMB1 one corresponds to multichannel capability, so maybe LFS works correctly if server sends 0x08 flag) while capabilities are checked always for the SMB1 bit in cifs_read_super(). The patch fixes this by checking for the correct bit according to the protocol version. CC: Stable <[email protected]> Signed-off-by: Jan Klos <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Steve French <[email protected]>
2013-10-07f2fs: handle remount options correctlyKelly Anderson1-0/+17
The current f2fs code errors if the xattr or acl options are passed when remounting. This is important in a typical scenario where f2fs is mounted as a "ro" root file-system by the boot loader and then the init process wants to remount it "rw" with the "remount,rw" option. Signed-off-by: Kelly Anderson <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2013-10-07f2fs: use rw_sem instead of fs_lock(locks mutex)Gu Zheng9-116/+83
The fs_locks is used to block other ops(ex, recovery) when doing checkpoint. And each other operate routine(besides checkpoint) needs to acquire a fs_lock, there is a terrible problem here, if these are too many concurrency threads acquiring fs_lock, so that they will block each other and may lead to some performance problem, but this is not the phenomenon we want to see. Though there are some optimization patches introduced to enhance the usage of fs_lock, but the thorough solution is using a *rw_sem* to replace the fs_lock. Checkpoint routine takes write_sem, and other ops take read_sem, so that we can block other ops(ex, recovery) when doing checkpoint, and other ops will not disturb each other, this can avoid the problem described above completely. Because of the weakness of rw_sem, the above change may introduce a potential problem that the checkpoint thread might get starved if other threads are intensively locking the read semaphore for I/O.(Pointed out by Xu Jin) In order to avoid this, a wait_list is introduced, the appending read semaphore ops will be dropped into the wait_list if checkpoint thread is waiting for write semaphore, and will be waked up when checkpoint thread gives up write semaphore. Thanks to Kim's previous review and test, and will be very glad to see other guys' performance tests about this patch. V2: -fix the potential starvation problem. -use more suitable func name suggested by Xu Jin. Signed-off-by: Gu Zheng <[email protected]> [Jaegeuk Kim: adjust minor coding standard] Signed-off-by: Jaegeuk Kim <[email protected]>
2013-10-06cifs: Avoid umount hangs with smb2 when server is unresponsiveShirish Pargaonkar2-2/+13
Do not send SMB2 Logoff command when reconnecting, the way smb1 code base works. Also, no need to wait for a credit for an echo command when one is already in flight. Without these changes, umount command hangs if the server is unresponsive e.g. hibernating. Signed-off-by: Shirish Pargaonkar <[email protected]> Acked-by: Jeff Layton <[email protected]> Signed-off-by: Steve French <[email protected]>
2013-10-05do not treat non-symlink reparse points as valid symlinksSteve French3-14/+71
Windows 8 and later can create NFS symlinks (within reparse points) which we were assuming were normal NTFS symlinks and thus reporting corrupt paths for. Add check for reparse points to make sure that they really are normal symlinks before we try to parse the pathname. We also should not be parsing other types of reparse points (DFS junctions etc) as if they were a symlink so return EOPNOTSUPP on those. Also fix endian errors (we were not parsing symlink lengths as little endian). This fixes commit d244bf2dfbebfded05f494ffd53659fa7b1e32c1 which implemented follow link for non-Unix CIFS mounts CC: Stable <[email protected]> Reviewed-by: Andrew Bartlett <[email protected]> Signed-off-by: Steve French <[email protected]>
2013-10-05sysfs: merge regular and bin file handlingTejun Heo6-501/+28
With the previous changes, sysfs regular file code is ready to handle bin files too. This patch makes bin files share the regular file path. * sysfs_create/remove_bin_file() are moved to fs/sysfs/file.c. * sysfs_init_inode() is updated to use the new sysfs_bin_operations instead of bin_fops for bin files. * fs/sysfs/bin.c and the related pieces are removed. This patch shouldn't introduce any behavior difference to bin file accesses. Overall, this unification reduces the amount of duplicate logic, makes behaviors more consistent and paves the road for building simpler and more versatile interface which will allow other subsystems to make use of sysfs for their pseudo filesystems. v2: Stale fs/sysfs/bin.c reference dropped from Documentation/DocBook/filesystems.tmpl. Reported by kbuild test robot. Signed-off-by: Tejun Heo <[email protected]> Cc: Kay Sievers <[email protected]> Cc: kbuild test robot <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2013-10-05sysfs: prepare open path for unified regular / bin file handlingTejun Heo1-25/+33
sysfs bin file handling will be merged into the regular file support. This patch prepares the open path. This patch updates sysfs_open_file() such that it can handle both regular and bin files. This is a preparation and the new bin file path isn't used yet. Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2013-10-05sysfs: copy bin mmap support from fs/sysfs/bin.c to fs/sysfs/file.cTejun Heo3-1/+249
sysfs bin file handling will be merged into the regular file support. This patch copies mmap support from bin so that fs/sysfs/file.c can handle mmapping bin files. The code is copied mostly verbatim with the following updates. * ->mmapped and ->vm_ops are added to sysfs_open_file and bin_buffer references are replaced with sysfs_open_file ones. * Symbols are prefixed with sysfs_. * sysfs_unmap_bin_file() grabs sysfs_open_dirent and traverses ->files. Invocation of this function is added to sysfs_addrm_finish(). * sysfs_bin_mmap() is added to sysfs_bin_operations. This is a preparation and the new mmap path isn't used yet. Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2013-10-05sysfs: add sysfs_bin_read()Tejun Heo1-0/+65
sysfs bin file handling will be merged into the regular file support. This patch prepares the read path. Copy fs/sysfs/bin.c::read() to fs/sysfs/file.c and make it use sysfs_open_file instead of bin_buffer. The function is identical copy except for the use of sysfs_open_file. The new function is added to sysfs_bin_operations. This isn't used yet but will eventually replace fs/sysfs/bin.c. Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2013-10-05sysfs: prepare path write for unified regular / bin file handlingTejun Heo2-6/+35
sysfs bin file handling will be merged into the regular file support. This patch prepares the write path. bin file write is almost identical to regular file write except that the write length is capped by the inode size and @off is passed to the write method. This patch adds bin file handling to sysfs_write_file() so that it can handle both regular and bin files. A new file_operations struct sysfs_bin_operations is added, which currently only hosts sysfs_write_file() and generic_file_llseek(). This isn't used yet but will eventually replace fs/sysfs/bin.c. Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2013-10-05sysfs: collapse fs/sysfs/bin.c::fill_read() into read()Tejun Heo1-21/+15
read() is simple enough and fill_read() being in a separate function doesn't add anything. Let's collapse it into read(). This will make merging bin file handling with regular file. Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2013-10-05sysfs: skip bin_buffer->buffer while readingTejun Heo1-13/+8
After b31ca3f5dfc ("sysfs: fix deadlock"), bin read() first writes data to bb->buffer and bounces it to a transient kernel buffer which is then copied out to userland. The double bouncing doesn't add anything. Let's just use the transient buffer directly. While at it, rename @temp to @buf for clarity. Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2013-10-05sysfs: use seq_file when reading regular filesTejun Heo1-91/+73
sysfs read path implements its own buffering scheme between userland and kernel callbacks, which essentially is a degenerate duplicate of seq_file. This patch replaces the custom read buffering implementation in sysfs with seq_file. While the amount of code reduction is small, this reduces low level hairiness and enables future development of a new versatile API based on seq_file so that sysfs features can be shared with other subsystems. As write path was already converted to not use sysfs_open_file->page, this patch makes ->page and ->count unused and removes them. Userland behavior remains the same except for some extreme corner cases - e.g. sysfs will now regenerate the content each time a file is read after a non-contiguous seek whereas the original code would keep using the same content. While this is a userland visible behavior change, it is extremely unlikely to be noticeable and brings sysfs behavior closer to that of procfs. Signed-off-by: Tejun Heo <[email protected]> Cc: Kay Sievers <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2013-10-05sysfs: use transient write bufferTejun Heo1-62/+52
There isn't much to be gained by keeping around kernel buffer while a file is open especially as the read path planned to be converted to use seq_file and won't use the buffer. This patch makes sysfs_write_file() use per-write transient buffer instead of sysfs_open_file->page. This simplifies the write path, enables removing sysfs_open_file->page once read path is updated and will help merging bin file write path which already requires the use of a transient buffer due to a locking order issue. As the function comments of flush_write_buffer() and sysfs_write_buffer() are being updated anyway, reformat them so that they're more conventional. v2: Use min_t() instead of min() in sysfs_write_file() to avoid build warning on arm. Reported by build test robot. Signed-off-by: Tejun Heo <[email protected]> Cc: kbuild test robot <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2013-10-05sysfs: add sysfs_open_file->sd and ->fileTejun Heo1-11/+12
sysfs will be converted to use seq_file for read path, which will make it difficult to pass around multiple pointers directly. This patch adds sysfs_open_file->sd and ->file so that we can reach all the necessary data structures from sysfs_open_file. flush_write_buffer() is updated to drop @dentry which was used to discover the sysfs_dirent as it's now available through sysfs_open_file->sd. This patch doesn't cause any behavior difference. Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2013-10-05sysfs: rename sysfs_buffer to sysfs_open_fileTejun Heo1-64/+63
sysfs read path will be converted to use seq_file which will handle buffering making sysfs_buffer a misnomer. Rename sysfs_buffer to sysfs_open_file, and sysfs_open_dirent->buffers to ->files. This path is pure rename. Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2013-10-05sysfs: add sysfs_open_file_mutexTejun Heo1-6/+12
Add a separate mutex to protect sysfs_open_dirent->buffers list. This will allow performing sleepable operations while traversing sysfs_buffers, which will be renamed to sysfs_open_file. Note that currently sysfs_open_dirent->buffers list isn't being used for anything and this patch doesn't make any functional difference. It will be used to merge regular and bin file supports. Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2013-10-05sysfs: remove sysfs_buffer->opsTejun Heo1-12/+21
Currently, sysfs_ops is fetched during sysfs_open_file() and cached in sysfs_buffer->ops to be used while the file is open. This patch removes the caching and makes each operation directly fetch sysfs_ops. This patch doesn't introduce any behavior difference and is to prepare for merging regular and bin file supports. Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2013-10-05Merge branch 'for-linus' of ↵Linus Torvalds6-17/+39
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fixes from Chris Mason: "This is a small collection of fixes, including a regression fix from Liu Bo that solves rare crashes with compression on. I've merged my for-linus up to 3.12-rc3 because the top commit is only meant for 3.12. The rest of the fixes are also available in my master branch on top of my last 3.11 based pull" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: btrfs: Fix crash due to not allocating integrity data for a bioset Btrfs: fix a use-after-free bug in btrfs_dev_replace_finishing Btrfs: eliminate races in worker stopping code Btrfs: fix crash of compressed writes Btrfs: fix transid verify errors when recovering log tree
2013-10-05sysfs: remove sysfs_buffer->needs_read_fillTejun Heo1-12/+12
->needs_read_fill is used to implement the following behaviors. 1. Ensure buffer filling on the first read. 2. Force buffer filling after a write. 3. Force buffer filling after a successful poll. However, #2 and #3 don't really work as sysfs doesn't reset file position. While the read buffer would be refilled, the next read would continue from the position after the last read or write, requiring an explicit seek to the start for it to be useful, which makes ->needs_read_fill superflous as read buffer is always refilled if f_pos == 0. Update sysfs_read_file() to test buffer->page for #1 instead and remove ->needs_read_fill. While this changes behavior in extreme corner cases - e.g. re-reading a sysfs file after seeking to non-zero position after a write or poll, it's highly unlikely to lead to actual breakage. This change is to prepare for using seq_file in the read path. While at it, reformat a comment in fill_write_buffer(). Signed-off-by: Tejun Heo <[email protected]> Cc: Kay Sievers <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2013-10-05sysfs: remove unused sysfs_buffer->posTejun Heo1-1/+0
Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2013-10-05btrfs: Fix crash due to not allocating integrity data for a biosetDarrick J. Wong1-0/+8
When btrfs creates a bioset, we must also allocate the integrity data pool. Otherwise btrfs will crash when it tries to submit a bio to a checksumming disk: BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 IP: [<ffffffff8111e28a>] mempool_alloc+0x4a/0x150 PGD 2305e4067 PUD 23063d067 PMD 0 Oops: 0000 [#1] PREEMPT SMP Modules linked in: btrfs scsi_debug xfs ext4 jbd2 ext3 jbd mbcache sch_fq_codel eeprom lpc_ich mfd_core nfsd exportfs auth_rpcgss af_packet raid6_pq xor zlib_deflate libcrc32c [last unloaded: scsi_debug] CPU: 1 PID: 4486 Comm: mount Not tainted 3.12.0-rc1-mcsum #2 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 task: ffff8802451c9720 ti: ffff880230698000 task.ti: ffff880230698000 RIP: 0010:[<ffffffff8111e28a>] [<ffffffff8111e28a>] mempool_alloc+0x4a/0x150 RSP: 0018:ffff880230699688 EFLAGS: 00010286 RAX: 0000000000000001 RBX: 0000000000000000 RCX: 00000000005f8445 RDX: 0000000000000001 RSI: 0000000000000010 RDI: 0000000000000000 RBP: ffff8802306996f8 R08: 0000000000011200 R09: 0000000000000008 R10: 0000000000000020 R11: ffff88009d6e8000 R12: 0000000000011210 R13: 0000000000000030 R14: ffff8802306996b8 R15: ffff8802451c9720 FS: 00007f25b8a16800(0000) GS:ffff88024fc80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000018 CR3: 0000000230576000 CR4: 00000000000007e0 Stack: ffff8802451c9720 0000000000000002 ffffffff81a97100 0000000000281250 ffffffff81a96480 ffff88024fc99150 ffff880228d18200 0000000000000000 0000000000000000 0000000000000040 ffff880230e8c2e8 ffff8802459dc900 Call Trace: [<ffffffff811b2208>] bio_integrity_alloc+0x48/0x1b0 [<ffffffff811b26fc>] bio_integrity_prep+0xac/0x360 [<ffffffff8111e298>] ? mempool_alloc+0x58/0x150 [<ffffffffa03e8041>] ? alloc_extent_state+0x31/0x110 [btrfs] [<ffffffff81241579>] blk_queue_bio+0x1c9/0x460 [<ffffffff8123e58a>] generic_make_request+0xca/0x100 [<ffffffff8123e639>] submit_bio+0x79/0x160 [<ffffffffa03f865e>] btrfs_map_bio+0x48e/0x5b0 [btrfs] [<ffffffffa03c821a>] btree_submit_bio_hook+0xda/0x110 [btrfs] [<ffffffffa03e7eba>] submit_one_bio+0x6a/0xa0 [btrfs] [<ffffffffa03ef450>] read_extent_buffer_pages+0x250/0x310 [btrfs] [<ffffffff8125eef6>] ? __radix_tree_preload+0x66/0xf0 [<ffffffff8125f1c5>] ? radix_tree_insert+0x95/0x260 [<ffffffffa03c66f6>] btree_read_extent_buffer_pages.constprop.128+0xb6/0x120 [btrfs] [<ffffffffa03c8c1a>] read_tree_block+0x3a/0x60 [btrfs] [<ffffffffa03caefd>] open_ctree+0x139d/0x2030 [btrfs] [<ffffffffa03a282a>] btrfs_mount+0x53a/0x7d0 [btrfs] [<ffffffff8113ab0b>] ? pcpu_alloc+0x8eb/0x9f0 [<ffffffff81167305>] ? __kmalloc_track_caller+0x35/0x1e0 [<ffffffff81176ba0>] mount_fs+0x20/0xd0 [<ffffffff81191096>] vfs_kern_mount+0x76/0x120 [<ffffffff81193320>] do_mount+0x200/0xa40 [<ffffffff81135cdb>] ? strndup_user+0x5b/0x80 [<ffffffff81193bf0>] SyS_mount+0x90/0xe0 [<ffffffff8156d31d>] system_call_fastpath+0x1a/0x1f Code: 4c 8d 75 a8 4c 89 6d e8 45 89 e0 4c 8d 6f 30 48 89 5d d8 41 83 e0 af 48 89 fb 49 83 c6 18 4c 89 7d f8 65 4c 8b 3c 25 c0 b8 00 00 <48> 8b 73 18 44 89 c7 44 89 45 98 ff 53 20 48 85 c0 48 89 c2 74 RIP [<ffffffff8111e28a>] mempool_alloc+0x4a/0x150 RSP <ffff880230699688> CR2: 0000000000000018 ---[ end trace 7a96042017ed21e2 ]--- Signed-off-by: Darrick J. Wong <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-10-05Merge branch 'for-linus' into for-linus-3.12Chris Mason6-17/+31
2013-10-04Merge branch 'for-linus' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds10-115/+74
Pull CIFS fixes from Steve French: "Small set of cifs fixes. Most important is Jeff's fix that works around disconnection problems which can be caused by simultaneous use of user space tools (starting a long running smbclient backup then doing a cifs kernel mount) or multiple cifs mounts through a NAT, and Jim's fix to deal with reexport of cifs share. I expect to send two more cifs fixes next week (being tested now) - fixes to address an SMB2 unmount hang when server dies and a fix for cifs symlink handling of Windows "NFS" symlinks" * 'for-linus' of git://git.samba.org/sfrench/cifs-2.6: [CIFS] update cifs.ko version [CIFS] Remove ext2 flags that have been moved to fs.h [CIFS] Provide sane values for nlink cifs: stop trying to use virtual circuits CIFS: FS-Cache: Uncache unread pages in cifs_readpages() before freeing them
2013-10-04Merge tag 'xfs-for-linus-v3.12-rc4' of git://oss.sgi.com/xfs/xfsLinus Torvalds6-42/+45
Pull xfs bugfixes from Ben Myers: "There are lockdep annotations for project quotas, a fix for dirent dtype support on v4 filesystems, a fix for a memory leak in recovery, and a fix for the build error that resulted from it. D'oh" * tag 'xfs-for-linus-v3.12-rc4' of git://oss.sgi.com/xfs/xfs: xfs: Use kmem_free() instead of free() xfs: fix memory leak in xlog_recover_add_to_trans xfs: dirent dtype presence is dependent on directory magic numbers xfs: lockdep needs to know about 3 dquot-deep nesting
2013-10-04Btrfs: fix a use-after-free bug in btrfs_dev_replace_finishingIlya Dryomov2-5/+7
free_device rcu callback, scheduled from btrfs_rm_dev_replace_srcdev, can be processed before btrfs_scratch_superblock is called, which would result in a use-after-free on btrfs_device contents. Fix this by zeroing the superblock before the rcu callback is registered. Cc: Stefan Behrens <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]> Signed-off-by: Josef Bacik <[email protected]>
2013-10-04Btrfs: eliminate races in worker stopping codeIlya Dryomov2-6/+21
The current implementation of worker threads in Btrfs has races in worker stopping code, which cause all kinds of panics and lockups when running btrfs/011 xfstest in a loop. The problem is that btrfs_stop_workers is unsynchronized with respect to check_idle_worker, check_busy_worker and __btrfs_start_workers. E.g., check_idle_worker race flow: btrfs_stop_workers(): check_idle_worker(aworker): - grabs the lock - splices the idle list into the working list - removes the first worker from the working list - releases the lock to wait for its kthread's completion - grabs the lock - if aworker is on the working list, moves aworker from the working list to the idle list - releases the lock - grabs the lock - puts the worker - removes the second worker from the working list ...... btrfs_stop_workers returns, aworker is on the idle list FS is umounted, memory is freed ...... aworker is waken up, fireworks ensue With this applied, I wasn't able to trigger the problem in 48 hours, whereas previously I could reliably reproduce at least one of these races within an hour. Reported-by: David Sterba <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]> Signed-off-by: Josef Bacik <[email protected]>
2013-10-04Btrfs: fix crash of compressed writesLiu Bo1-1/+1
The crash[1] is found by xfstests/generic/208 with "-o compress", it's not reproduced everytime, but it does panic. The bug is quite interesting, it's actually introduced by a recent commit (573aecafca1cf7a974231b759197a1aebcf39c2a, Btrfs: actually limit the size of delalloc range). Btrfs implements delay allocation, so during writeback, we (1) get a page A and lock it (2) search the state tree for delalloc bytes and lock all pages within the range (3) process the delalloc range, including find disk space and create ordered extent and so on. (4) submit the page A. It runs well in normal cases, but if we're in a racy case, eg. buffered compressed writes and aio-dio writes, sometimes we may fail to lock all pages in the 'delalloc' range, in which case, we need to fall back to search the state tree again with a smaller range limit(max_bytes = PAGE_CACHE_SIZE - offset). The mentioned commit has a side effect, that is, in the fallback case, we can find delalloc bytes before the index of the page we already have locked, so we're in the case of (delalloc_end <= *start) and return with (found > 0). This ends with not locking delalloc pages but making ->writepage still process them, and the crash happens. This fixes it by just thinking that we find nothing and returning to caller as the caller knows how to deal with it properly. [1]: ------------[ cut here ]------------ kernel BUG at mm/page-writeback.c:2170! [...] CPU: 2 PID: 11755 Comm: btrfs-delalloc- Tainted: G O 3.11.0+ #8 [...] RIP: 0010:[<ffffffff810f5093>] [<ffffffff810f5093>] clear_page_dirty_for_io+0x1e/0x83 [...] [ 4934.248731] Stack: [ 4934.248731] ffff8801477e5dc8 ffffea00049b9f00 ffff8801869f9ce8 ffffffffa02b841a [ 4934.248731] 0000000000000000 0000000000000000 0000000000000fff 0000000000000620 [ 4934.248731] ffff88018db59c78 ffffea0005da8d40 ffffffffa02ff860 00000001810016c0 [ 4934.248731] Call Trace: [ 4934.248731] [<ffffffffa02b841a>] extent_range_clear_dirty_for_io+0xcf/0xf5 [btrfs] [ 4934.248731] [<ffffffffa02a8889>] compress_file_range+0x1dc/0x4cb [btrfs] [ 4934.248731] [<ffffffff8104f7af>] ? detach_if_pending+0x22/0x4b [ 4934.248731] [<ffffffffa02a8bad>] async_cow_start+0x35/0x53 [btrfs] [ 4934.248731] [<ffffffffa02c694b>] worker_loop+0x14b/0x48c [btrfs] [ 4934.248731] [<ffffffffa02c6800>] ? btrfs_queue_worker+0x25c/0x25c [btrfs] [ 4934.248731] [<ffffffff810608f5>] kthread+0x8d/0x95 [ 4934.248731] [<ffffffff81060868>] ? kthread_freezable_should_stop+0x43/0x43 [ 4934.248731] [<ffffffff814fe09c>] ret_from_fork+0x7c/0xb0 [ 4934.248731] [<ffffffff81060868>] ? kthread_freezable_should_stop+0x43/0x43 [ 4934.248731] Code: ff 85 c0 0f 94 c0 0f b6 c0 59 5b 5d c3 0f 1f 44 00 00 55 48 89 e5 41 54 53 48 89 fb e8 2c de 00 00 49 89 c4 48 8b 03 a8 01 75 02 <0f> 0b 4d 85 e4 74 52 49 8b 84 24 80 00 00 00 f6 40 20 01 75 44 [ 4934.248731] RIP [<ffffffff810f5093>] clear_page_dirty_for_io+0x1e/0x83 [ 4934.248731] RSP <ffff8801869f9c48> [ 4934.280307] ---[ end trace 36f06d3f8750236a ]--- Signed-off-by: Liu Bo <[email protected]> Signed-off-by: Josef Bacik <[email protected]>
2013-10-04Btrfs: fix transid verify errors when recovering log treeJosef Bacik1-5/+2
If we crash with a log, remount and recover that log, and then crash before we can commit another transaction we will get transid verify errors on the next mount. This is because we were not zero'ing out the log when we committed the transaction after recovery. This is ok as long as we commit another transaction at some point in the future, but if you abort or something else goes wrong you can end up in this weird state because the recovery stuff says that the tree log should have a generation+1 of the super generation, which won't be the case of the transaction that was started for recovery. Fix this by removing the check and _always_ zero out the log portion of the super when we commit a transaction. This fixes the transid verify issues I was seeing with my force errors tests. Thanks, Signed-off-by: Josef Bacik <[email protected]>
2013-10-04xfs: Use kmem_free() instead of free()Thierry Reding1-1/+1
This fixes a build failure caused by calling the free() function which does not exist in the Linux kernel. Signed-off-by: Thierry Reding <[email protected]> Reviewed-by: Mark Tinguely <[email protected]> Signed-off-by: Ben Myers <[email protected]> (cherry picked from commit aaaae98022efa4f3c31042f1fdf9e7a0c5f04663)
2013-10-04xfs: fix memory leak in xlog_recover_add_to_trans[email protected]1-0/+1
Free the memory in error path of xlog_recover_add_to_trans(). Normally this memory is freed in recovery pass2, but is leaked in the error path. Signed-off-by: Mark Tinguely <[email protected]> Reviewed-by: Eric Sandeen <[email protected]> Signed-off-by: Ben Myers <[email protected]> (cherry picked from commit 519ccb81ac1c8e3e4eed294acf93be00b43dcad6)
2013-10-04xfs: dirent dtype presence is dependent on directory magic numbersDave Chinner4-39/+28
The determination of whether a directory entry contains a dtype field originally was dependent on the filesystem having CRCs enabled. This meant that the format for dtype beign enabled could be determined by checking the directory block magic number rather than doing a feature bit check. This was useful in that it meant that we didn't need to pass a struct xfs_mount around to functions that were already supplied with a directory block header. Unfortunately, the introduction of dtype fields into the v4 structure via a feature bit meant this "use the directory block magic number" method of discriminating the dirent entry sizes is broken. Hence we need to convert the places that use magic number checks to use feature bit checks so that they work correctly and not by chance. The current code works on v4 filesystems only because the dirent size roundup covers the extra byte needed by the dtype field in the places where this problem occurs. Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Ben Myers <[email protected]> Signed-off-by: Ben Myers <[email protected]> (cherry picked from commit 367993e7c6428cb7617ab7653d61dca54e2fdede)
2013-10-04xfs: lockdep needs to know about 3 dquot-deep nestingDave Chinner1-3/+16
Michael Semon reported that xfs/299 generated this lockdep warning: ============================================= [ INFO: possible recursive locking detected ] 3.12.0-rc2+ #2 Not tainted --------------------------------------------- touch/21072 is trying to acquire lock: (&xfs_dquot_other_class){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64 but task is already holding lock: (&xfs_dquot_other_class){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&xfs_dquot_other_class); lock(&xfs_dquot_other_class); *** DEADLOCK *** May be due to missing lock nesting notation 7 locks held by touch/21072: #0: (sb_writers#10){++++.+}, at: [<c11185b6>] mnt_want_write+0x1e/0x3e #1: (&type->i_mutex_dir_key#4){+.+.+.}, at: [<c11078ee>] do_last+0x245/0xe40 #2: (sb_internal#2){++++.+}, at: [<c122c9e0>] xfs_trans_alloc+0x1f/0x35 #3: (&(&ip->i_lock)->mr_lock/1){+.+...}, at: [<c126cd1b>] xfs_ilock+0x100/0x1f1 #4: (&(&ip->i_lock)->mr_lock){++++-.}, at: [<c126cf52>] xfs_ilock_nowait+0x105/0x22f #5: (&dqp->q_qlock){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64 #6: (&xfs_dquot_other_class){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64 The lockdep annotation for dquot lock nesting only understands locking for user and "other" dquots, not user, group and quota dquots. Fix the annotations to match the locking heirarchy we now have. Reported-by: Michael L. Semon <[email protected]> Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Ben Myers <[email protected]> Signed-off-by: Ben Myers <[email protected]> (cherry picked from commit f112a049712a5c07de25d511c3c6587a2b1a015e)
2013-10-04Merge branch 'for-linus' of ↵Linus Torvalds3-13/+32
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse bugfixes from Miklos Szeredi: "This contains two more fixes by Maxim for writeback/truncate races and fixes for RCU walk in fuse_dentry_revalidate()" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: no RCU mode in fuse_access() fuse: readdirplus: fix RCU walk fuse: don't check_submounts_and_drop() in RCU walk fuse: fix fallocate vs. ftruncate race fuse: wait for writeback in fuse_file_fallocate()
2013-10-04GFS2: Protect quota sync generationSteven Whitehouse3-2/+6
Now that gfs2_quota_sync can be potentially called from multiple threads, we should protect this bit of code, and the sync generation number in particular in order to ensure that there are no races when syncing quotas. Signed-off-by: Steven Whitehouse <[email protected]> Cc: Abhijith Das <[email protected]>
2013-10-04GFS2: Inline qd_trylock into gfs2_quota_unlockSteven Whitehouse1-25/+20
The function qd_trylock was not a trylock despite its name and can be inlined into gfs2_quota_unlock in order to make the code a bit clearer. There should be no functional change as a result of this patch. Signed-off-by: Steven Whitehouse <[email protected]> Cc: Abhijith Das <[email protected]>
2013-10-04GFS2: Make two similar quota code fragments into a functionSteven Whitehouse1-34/+26
There should be no functional change bar the removal of a test of the MS_READONLY flag which would never be reachable. This merges the common code from qd_fish and qd_trylock into a single function and calls it from both those places. Signed-off-by: Steven Whitehouse <[email protected]> Cc: Abhijith Das <[email protected]>
2013-10-04GFS2: Remove obsolete quota tunableSteven Whitehouse4-5/+1
There is no need for a paramater which relates to the internals of quota to be exposed to users. The only possible use would be to turn it up so large that the memory allocation fails. So lets remove it and set it to a sensible value which ensures that we don't ask for multipage allocations. Currently the size of struct gfs2_holder means that the caluclated value is identical to the previous default value, so there should be no functional change. Signed-off-by: Steven Whitehouse <[email protected]> Cc: Abhijith Das <[email protected]>
2013-10-03sysfs: introduce [__]sysfs_remove()Tejun Heo4-28/+29
Given a sysfs_dirent, there is no reason to have multiple versions of removal functions. A function which removes the specified sysfs_dirent and its descendants is enough. This patch intorduces [__}sysfs_remove() which replaces all internal variations of removal functions. This will be the only removal function in the planned new sysfs_dirent based interface. Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>