path: root/fs
Age | Commit message | Author | Files | Lines
2016-03-09 | xfs: remove impossible condition | Luis de Bethencourt | 1 | -4/+1
    bp_release is set to 0 just before the break out of the for loop that precedes the conditional check (in line 458). The other exit from the loop is a goto that skips the dead code.

    Addresses-Coverity-Id: 102338
    Signed-off-by: Luis de Bethencourt <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Dave Chinner <[email protected]>
2016-03-09 | xfs: check sizes of XFS on-disk structures at compile time | Darrick J. Wong | 2 | -0/+120
    Check the sizes of XFS on-disk structures when compiling the kernel. Use this to catch inadvertent changes in structure size due to padding and alignment issues, etc.

    Signed-off-by: Darrick J. Wong <[email protected]>
    Reviewed-by: Brian Foster <[email protected]>
    Signed-off-by: Dave Chinner <[email protected]>
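The idea of pinning an on-disk layout at build time can be sketched in user space roughly as follows. This is only an illustration: `struct demo_disk_rec` and the `DEMO_CHECK_*` macros are hypothetical names, and C11 `static_assert` stands in for whatever build-time assertion helper the kernel patch itself uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical on-disk record; NOT a real XFS structure. */
struct demo_disk_rec {
	uint32_t magic;   /* expected at offset 0 */
	uint16_t level;   /* expected at offset 4 */
	uint16_t numrecs; /* expected at offset 6 */
	uint64_t blkno;   /* expected at offset 8 */
};

/* Break the build, rather than the filesystem, if the layout drifts. */
#define DEMO_CHECK_STRUCT_SIZE(type, size) \
	static_assert(sizeof(type) == (size), "unexpected size for " #type)
#define DEMO_CHECK_OFFSET(type, member, off) \
	static_assert(offsetof(type, member) == (off), \
		      "unexpected offset for " #member)

DEMO_CHECK_STRUCT_SIZE(struct demo_disk_rec, 16);
DEMO_CHECK_OFFSET(struct demo_disk_rec, blkno, 8);
```

If someone later inserts a field or changes a type so that padding shifts, compilation fails at the offending check instead of producing a kernel that misreads existing filesystems.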
2016-03-07 | jffs2: reduce the breakage on recovery from halfway failed rename() | Al Viro | 1 | -3/+8
    d_instantiate(new_dentry, old_inode) is absolutely the wrong thing to do: it will oops if new_dentry used to be positive, for starters. What we need is to d_invalidate() the target and be done with that.

    Cc: [email protected] # v3.18+
    Signed-off-by: Al Viro <[email protected]>
2016-03-07 | ncpfs: fix a braino in OOM handling in ncp_fill_cache() | Al Viro | 1 | -1/+1
    Failing to allocate an inode for child means that cache for *parent* is incompletely populated. So it's parent directory inode ('dir') that needs NCPI_DIR_CACHE flag removed, *not* the child inode ('inode', which is what we'd failed to allocate in the first place).

    Fucked-up-in: commit 5e993e25 ("ncpfs: get rid of d_validate() nonsense")
    Fucked-up-by: Al Viro <[email protected]>
    Cc: [email protected] # v3.19
    Signed-off-by: Al Viro <[email protected]>
2016-03-07 | mtd: kill the ecclayout->oobavail field | Boris BREZILLON | 1 | -4/+2
    ecclayout->oobavail is just redundant with the mtd->oobavail field. Moreover, it prevents static const definition of ecc layouts since the NAND framework is calculating this value based on the ecclayout->oobfree field.

    Signed-off-by: Boris Brezillon <[email protected]>
    Signed-off-by: Brian Norris <[email protected]>
2016-03-07 | Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs | Linus Torvalds | 3 | -6/+19
    Pull overlayfs fixes from Miklos Szeredi:
     "Overlayfs bug fixes. All marked as -stable material"

    * 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
      ovl: copy new uid/gid into overlayfs runtime inode
      ovl: ignore lower entries when checking purity of non-directory entries
      ovl: fix getcwd() failure after unsuccessful rmdir
      ovl: fix working on distributed fs as lower layer
2016-03-07 | Merge tag 'v4.5-rc7' into x86/asm, to pick up SMAP fix | Ingo Molnar | 59 | -314/+686
    Signed-off-by: Ingo Molnar <[email protected]>
2016-03-07 | Merge branch 'xfs-misc-fixes-4.6-2' into for-next | Dave Chinner | 9 | -282/+304
2016-03-07 | Merge branch 'xfs-dax-fixes-4.6' into for-next | Dave Chinner | 2 | -9/+102
2016-03-07 | Merge branch 'xfs-writepage-rework-4.6' into for-next | Dave Chinner | 2 | -467/+270
2016-03-07 | xfs: ioends require logically contiguous file offsets | Darrick J. Wong | 1 | -1/+2
    We need to create a new ioend if the current writepage call isn't logically contiguous with the range contained in the previous ioend. Hopefully writepage gets called in order of increasing file offset.

    Signed-off-by: Darrick J. Wong <[email protected]>
    Reviewed-by: Dave Chinner <[email protected]>
    Signed-off-by: Dave Chinner <[email protected]>
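The contiguity rule described in this commit can be illustrated with a small user-space sketch. `struct demo_ioend` and `demo_ioend_add()` are hypothetical stand-ins, not the XFS data structures; the only point is the test "does the new write start exactly where the current range ends?".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a pending I/O completion range. */
struct demo_ioend {
	int64_t offset; /* file offset where the range starts */
	int64_t size;   /* bytes covered so far; 0 means "empty" */
};

/*
 * Add a write of 'size' bytes at 'offset' to 'io'.
 * The range is extended only when the write is logically contiguous with
 * it; otherwise a new range is started. Returns true if a new range had
 * to be started.
 */
static bool demo_ioend_add(struct demo_ioend *io, int64_t offset, int64_t size)
{
	assert(io != NULL && size > 0);

	if (io->size > 0 && offset == io->offset + io->size) {
		io->size += size; /* contiguous: grow the existing range */
		return false;
	}

	io->offset = offset;     /* gap or empty: start over */
	io->size = size;
	return true;
}
```

A caller that sees `true` would submit the previous range (if any) before continuing, which is why calls arriving in increasing file-offset order keep the number of ranges small.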
2016-03-07 | Merge branch 'xfs-buf-macro-cleanup-4.6' into for-next | Dave Chinner | 8 | -56/+26
2016-03-07 | Merge branch 'xfs-gut-icdinode-4.6' into for-next | Dave Chinner | 23 | -308/+429
2016-03-07 | Merge branch 'xfs-misc-fixes-4.6' into for-next | Dave Chinner | 12 | -49/+62
2016-03-07 | Merge branch 'xfs-dio-fix-4.6' into for-next | Dave Chinner | 8 | -207/+142
2016-03-07 | Merge branch 'xfs-get-next-dquot-4.6' into for-next | Dave Chinner | 10 | -86/+377
2016-03-07 | Merge branch 'xfs-rt-fixes-4.6' into for-next | Dave Chinner | 5 | -2/+42
2016-03-07 | Merge branch 'xfs-torn-log-fixes-4.5' into for-next | Dave Chinner | 1 | -103/+168
2016-03-07 | xfs: use named array initializers for log item dumping | Darrick J. Wong | 1 | -64/+68
    Use named array initializers for the string arrays used to dump log items, rather than depending on the order being maintained correctly.

    Signed-off-by: Darrick J. Wong <[email protected]>
    Reviewed-by: Dave Chinner <[email protected]>
    Signed-off-by: Dave Chinner <[email protected]>
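For readers unfamiliar with the technique, here is a minimal sketch of named (designated) array initializers in C. The enum and strings are hypothetical and unrelated to the actual XFS log item types; the point is that each string is bound to its index by name, so the source order of the entries no longer matters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical item types; not the XFS log item type values. */
enum demo_item { DEMO_BUF, DEMO_INODE, DEMO_DQUOT, DEMO_ITEM_COUNT };

/* Entries are deliberately out of order: the index names do the binding. */
static const char *const demo_item_names[] = {
	[DEMO_DQUOT] = "dquot",
	[DEMO_BUF]   = "buf",
	[DEMO_INODE] = "inode",
};

/* Catch a forgotten trailing entry at compile time. */
static_assert(sizeof(demo_item_names) / sizeof(demo_item_names[0]) ==
	      DEMO_ITEM_COUNT, "demo_item_names is missing entries");

/* Map a type to its name, tolerating out-of-range or unset slots. */
static const char *demo_item_name(int type)
{
	if (type < 0 || type >= DEMO_ITEM_COUNT || demo_item_names[type] == NULL)
		return "unknown";
	return demo_item_names[type];
}
```

With positional initializers, reordering the enum silently mislabels every item; with named initializers, the table stays correct and any unset slot in the middle is simply NULL.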
2016-03-07 | xfs: fix computation of inode btree maxlevels | Darrick J. Wong | 1 | -2/+2
    Commit 88740da18[1] introduced a function to compute the maximum height of the inode btree back in 1994. Back then, apparently, the freespace and inode btrees shared the same geometry; however, it has long since been the case that the inode and freespace btrees have different record and key sizes. Therefore, we must use m_inobt_mnr if we want a correct calculation/log reservation/etc.

    (Yes, this bug has been around for 21 years and ten months.)

    (Yes, I was in middle school when this bug was committed.)

    [1] http://oss.sgi.com/cgi-bin/gitweb.cgi?p=archive/xfs-import.git;a=commitdiff;h=88740da18ddd9d7ba3ebaa9502fefc6ef2fd19cd

    Historical-research-by: Dave Chinner <[email protected]>
    Signed-off-by: Darrick J. Wong <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Dave Chinner <[email protected]>
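Why the minimum-records values matter can be seen from a generic worst-case height calculation. The function below is a sketch of the general technique, not the kernel routine: `demo_btree_maxlevels()` and its parameters are hypothetical names, with the two "min recs" arguments playing the role of per-btree minimum record counts for leaf and node blocks.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Worst-case btree height: assume every block holds only the minimum
 * number of records, then count how many levels are needed until a
 * single root block remains.
 */
static unsigned int demo_btree_maxlevels(uint64_t max_leaf_recs,
					 unsigned int min_leaf_recs,
					 unsigned int min_node_recs)
{
	uint64_t blocks;
	unsigned int level;

	/* min_node_recs must exceed 1 or the loop would never converge. */
	assert(min_leaf_recs > 0 && min_node_recs > 1);

	/* Number of leaf blocks needed, rounding up. */
	blocks = (max_leaf_recs + min_leaf_recs - 1) / min_leaf_recs;

	/* Each extra level divides the block count by the node fan-out. */
	for (level = 1; blocks > 1; level++)
		blocks = (blocks + min_node_recs - 1) / min_node_recs;

	return level;
}
```

For 1000 records the result is 3 levels with minimums of 10 and 5 levels with minimums of 5, so plugging in another btree's geometry yields a different height, and with it a different log reservation.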
2016-03-07 | xfs: reinitialise per-AG structures if geometry changes during recovery | Dave Chinner | 1 | -9/+13
    If a crash occurs immediately after a filesystem grow operation, the updated superblock geometry is found only in the log. After we recover the log, the superblock is reread and re-initialised and so has the new geometry in memory. If the new geometry has more AGs than prior to the grow operation, then the new AGs will not have in-memory xfs_perag structures associated with them.

    This will result in an oops when the first metadata buffer from a new AG is looked up in the buffer cache, as the block lies within the new geometry but then fails to find a perag structure on lookup. This is easily fixed by simply re-initialising the perag structure after re-reading the superblock at the conclusion of the first phase of log recovery.

    This, however, does not fix the case of log recovery requiring access to metadata in the newly grown space. Fortunately for us, because the in-core superblock has not been updated, this will result in detection of access beyond the end of the filesystem and so recovery will fail at that point. If this proves to be a problem, then we can address it separately to the current reported issue.

    Reported-by: Alex Lyakas <[email protected]>
    Tested-by: Alex Lyakas <[email protected]>
    Signed-off-by: Dave Chinner <[email protected]>
2016-03-07 | xfs: only run torn log write detection on dirty logs | Brian Foster | 1 | -11/+31
    XFS uses CRC verification over a sub-range of the head of the log to detect and handle torn writes. This torn log write detection currently runs unconditionally at mount time, regardless of whether the log is dirty or clean. This is problematic in cases where a filesystem might end up being moved across different, incompatible (i.e., opposite byte-endianness) architectures.

    The problem lies in the fact that log data is not necessarily written in an architecture independent format. For example, certain bits of data are written in native endian format. Further, the size of certain log data structures differs (i.e., struct xlog_rec_header) depending on the word size of the cpu. This leads to false positive crc verification errors and ultimately failed mounts when a cleanly unmounted filesystem is mounted on a system with an incompatible architecture from data that was written near the head of the log.

    Update the log head/tail discovery code to run torn write detection only when the log is not clean. This means something other than an unmount record resides at the head of the log and log recovery is imminent. It is a requirement to run log recovery on the same type of host that had written the content of the dirty log and therefore CRC failures are legitimate corruptions in that scenario.

    Reported-by: Jan Beulich <[email protected]>
    Tested-by: Jan Beulich <[email protected]>
    Signed-off-by: Brian Foster <[email protected]>
    Reviewed-by: Dave Chinner <[email protected]>
    Signed-off-by: Dave Chinner <[email protected]>
2016-03-07 | xfs: refactor in-core log state update to helper | Brian Foster | 1 | -19/+33
    Once the record at the head of the log is identified and verified, the in-core log state is updated based on the record. This includes information such as the current head block and cycle, the start block of the last record written to the log, the tail lsn, etc.

    Once torn write detection is conditional, this logic will need to be reused. Factor the code to update the in-core log data structures into a new helper function. This patch does not change behavior.

    Signed-off-by: Brian Foster <[email protected]>
    Reviewed-by: Dave Chinner <[email protected]>
    Signed-off-by: Dave Chinner <[email protected]>
2016-03-07 | xfs: refactor unmount record detection into helper | Brian Foster | 1 | -60/+93
    Once the mount sequence has identified the head and tail blocks of the physical log, the record at the head of the log is located and examined for an unmount record to determine if the log is clean. This currently occurs after torn write verification of the head region of the log.

    This must ultimately be separated from torn write verification and may need to be called again if the log head is walked back due to a torn write (to determine whether the new head record is an unmount record). Separate this logic into a new helper function. This patch does not change behavior.

    Signed-off-by: Brian Foster <[email protected]>
    Reviewed-by: Dave Chinner <[email protected]>
    Signed-off-by: Dave Chinner <[email protected]>
2016-03-07 | xfs: separate log head record discovery from verification | Brian Foster | 1 | -22/+20
    The code that locates the log record at the head of the log is buried in the log head verification function. This is fine when torn write verification occurs unconditionally, but this behavior is problematic for filesystems that might be moved across systems with different architectures.

    In preparation for separating examination of the log head for unmount records from torn write detection, lift the record location logic out of the log verification function and into the caller. This patch does not change behavior.

    Signed-off-by: Brian Foster <[email protected]>
    Reviewed-by: Dave Chinner <[email protected]>
    Signed-off-by: Dave Chinner <[email protected]>
2016-03-06 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client | Linus Torvalds | 6 | -3/+48
    Pull ceph fix from Sage Weil:
     "This is a final commit we missed to align the protocol compatibility with the feature bits. It decodes a few extra fields in two different messages and reports EIO when they are used (not yet supported)"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
      ceph: initial CEPH_FEATURE_FS_FILE_LAYOUT_V2 support
2016-03-06 | configfs: switch ->default groups to a linked list | Christoph Hellwig | 4 | -72/+33
    Replace the current NULL-terminated array of default groups with a linked list. This gets rid of lots of nasty code to size and/or dynamically allocate the array.

    While we're at it also provide a convenient helper to remove the default groups.

    Signed-off-by: Christoph Hellwig <[email protected]>
    Acked-by: Felipe Balbi <[email protected]> [drivers/usb/gadget]
    Acked-by: Joel Becker <[email protected]>
    Acked-by: Nicholas Bellinger <[email protected]>
    Reviewed-by: Sagi Grimberg <[email protected]>
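The data-structure change can be pictured with a small user-space sketch. Everything here is hypothetical (`struct demo_group` and the `demo_*` helpers are illustrative names, and a hand-rolled singly linked list stands in for the kernel's list primitives); it only shows why a list removes the need to size or reallocate an array when default groups are registered.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical group type; not the configfs structure. */
struct demo_group {
	const char *name;
	struct demo_group *next_default;   /* link in the parent's list */
	struct demo_group *default_groups; /* head of this group's defaults */
};

/* Register 'child' as a default group of 'parent': O(1), no allocation. */
static void demo_add_default_group(struct demo_group *child,
				   struct demo_group *parent)
{
	assert(child != NULL && parent != NULL && child->next_default == NULL);
	child->next_default = parent->default_groups;
	parent->default_groups = child;
}

/* Convenience helper: detach every default group from 'parent'. */
static void demo_remove_default_groups(struct demo_group *parent)
{
	while (parent->default_groups != NULL) {
		struct demo_group *g = parent->default_groups;

		parent->default_groups = g->next_default;
		g->next_default = NULL;
	}
}

/* Walk the list; replaces scanning an array for its NULL terminator. */
static size_t demo_count_default_groups(const struct demo_group *parent)
{
	size_t n = 0;

	for (const struct demo_group *g = parent->default_groups; g != NULL;
	     g = g->next_default)
		n++;
	return n;
}
```

Note that this sketch prepends, so iteration order is the reverse of registration order; a real implementation that cares about ordering would append instead.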
2016-03-05 | lookup_dcache(): lift d_alloc() into callers | Al Viro | 1 | -18/+17
    ... and kill need_lookup thing

    Signed-off-by: Al Viro <[email protected]>
2016-03-05 | do_last(): reorder and simplify a bit | Al Viro | 1 | -11/+11
    bugger off on negatives a bit earlier, simplify the tests

    Signed-off-by: Al Viro <[email protected]>
2016-03-05 | Merge branch 'for-linus' into work.lookups | Al Viro | 41 | -355/+494
    for the sake of namei.c fixes
2016-03-04 | Merge branch 'for-linus2' of git://git.kernel.dk/linux-block | Linus Torvalds | 2 | -13/+42
    Pull block fixes from Jens Axboe:
     "Round 2 of this. I cut back to the bare necessities, the patch is still larger than it usually would be at this time, due to the number of NVMe fixes in there.

      This pull request contains:

       - The 4 core fixes from Ming, that fix both problems with exceeding the virtual boundary limit in case of merging, and the gap checking for cloned bio's.

       - NVMe fixes from Keith and Christoph:

          - Regression on larger user commands, causing problems with reading log pages (for instance). This touches both NVMe, and the block core since that is now generally utilized also for these types of commands.

          - Hot removal fixes.

          - User exploitable issue with passthrough IO commands, if !length is given, causing us to fault on writing to the zero page.

          - Fix for a hang under error conditions

       - And finally, the current series regression for umount with cgroup writeback, where the final flush would happen async and hence open up window after umount where the device wasn't consistent. fsck right after umount would show this. From Tejun"

    * 'for-linus2' of git://git.kernel.dk/linux-block:
      block: support large requests in blk_rq_map_user_iov
      block: fix blk_rq_get_max_sectors for driver private requests
      nvme: fix max_segments integer truncation
      nvme: set queue limits for the admin queue
      writeback: flush inode cgroup wb switches instead of pinning super_block
      NVMe: Fix 0-length integrity payload
      NVMe: Don't allow unsupported flags
      NVMe: Move error handling to failed reset handler
      NVMe: Simplify device reset failure
      NVMe: Fix namespace removal deadlock
      NVMe: Use IDA for namespace disk naming
      NVMe: Don't unmap controller registers on reset
      block: merge: get the 1st and last bvec via helpers
      block: get the 1st and last bvec via helpers
      block: check virt boundary in bio_will_gap()
      block: bio: introduce helpers to get the 1st and last bvec
2016-03-04 | Merge tag 'for-linus-20160304' of git://git.infradead.org/linux-mtd | Linus Torvalds | 5 | -51/+91
    Pull jffs2 fixes from David Woodhouse:
     "This contains two important JFFS2 fixes marked for stable:

       - a lock ordering problem between the page lock and the internal f->sem mutex, which was causing occasional deadlocks in garbage collection

       - a scan failure causing moved directories to sometimes end up appearing to have hard links.

      There are also a couple of trivial MAINTAINERS file updates"

    * tag 'for-linus-20160304' of git://git.infradead.org/linux-mtd:
      MAINTAINERS: add maintainer entry for FREESCALE GPMI NAND driver
      Fix directory hardlinks from deleted directories
      jffs2: Fix page lock / f->sem deadlock
      Revert "jffs2: Fix lock acquisition order bug in jffs2_write_begin"
      MAINTAINERS: update Han's email
2016-03-04 | Merge branch 'for-linus-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs | Linus Torvalds | 1 | -1/+9
    Pull btrfs fix from Chris Mason:
     "Filipe nailed down a problem where tree log replay would do some work that orphan code wasn't expecting to be done yet, leading to BUG_ON"

    * 'for-linus-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
      Btrfs: fix loading of orphan roots leading to BUG_ON
2016-03-04 | ceph: initial CEPH_FEATURE_FS_FILE_LAYOUT_V2 support | Yan, Zheng | 6 | -3/+48
    Add support for the format change of MClientReply/MclientCaps. Also add code that denies access to inodes with pool_ns layouts.

    Signed-off-by: Yan, Zheng <[email protected]>
    Reviewed-by: Sage Weil <[email protected]>
2016-03-04 | direct-io: only use block polling if explicitly requested | Christoph Hellwig | 1 | -1/+2
    Signed-off-by: Christoph Hellwig <[email protected]>
    Reviewed-by: Stephen Bates <[email protected]>
    Tested-by: Stephen Bates <[email protected]>
    Acked-by: Jeff Moyer <[email protected]>
    Signed-off-by: Al Viro <[email protected]>
2016-03-04 | vfs: add the RWF_HIPRI flag for preadv2/pwritev2 | Christoph Hellwig | 1 | -2/+4
    This adds a flag that tells the file system that this is a high priority request for which it's worth polling the hardware. The flag is purely advisory and can be ignored if not supported.

    Signed-off-by: Christoph Hellwig <[email protected]>
    Reviewed-by: Stephen Bates <[email protected]>
    Tested-by: Stephen Bates <[email protected]>
    Acked-by: Jeff Moyer <[email protected]>
    Signed-off-by: Al Viro <[email protected]>
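From user space, the flag would be used roughly as sketched below. This assumes a Linux system whose C library exposes `preadv2()` in `<sys/uio.h>` (glibc does so under `_GNU_SOURCE`); `demo_read_hipri()` and `demo_roundtrip()` are hypothetical helper names, and the fallback to `preadv()` on ENOSYS, EOPNOTSUPP, or EINVAL is a defensive choice of this sketch rather than something the commit prescribes.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#ifndef RWF_HIPRI
#define RWF_HIPRI 0x00000001 /* value from the Linux UAPI headers */
#endif

/* Read 'len' bytes at 'off', hinting that polling for completion is OK. */
static ssize_t demo_read_hipri(int fd, void *buf, size_t len, off_t off)
{
	struct iovec iov = { .iov_base = buf, .iov_len = len };
	ssize_t n;

	assert(buf != NULL);
	n = preadv2(fd, &iov, 1, off, RWF_HIPRI);
	/* The flag is advisory, so a plain read is an acceptable fallback. */
	if (n < 0 && (errno == ENOSYS || errno == EOPNOTSUPP || errno == EINVAL))
		n = preadv(fd, &iov, 1, off);
	return n;
}

/* Write a small temp file and read it back; returns 0 on success. */
static int demo_roundtrip(void)
{
	char path[] = "/tmp/hipri-demo-XXXXXX";
	const char msg[] = "hello";
	char buf[sizeof(msg)] = { 0 };
	int fd = mkstemp(path);
	int ok;

	if (fd < 0)
		return -1;
	unlink(path); /* the file stays readable until fd is closed */
	ok = write(fd, msg, sizeof(msg)) == (ssize_t)sizeof(msg) &&
	     demo_read_hipri(fd, buf, sizeof(buf), 0) == (ssize_t)sizeof(msg) &&
	     memcmp(buf, msg, sizeof(msg)) == 0;
	close(fd);
	return ok ? 0 : -1;
}
```

On an ordinary buffered file the hint is expected to make no observable difference; any benefit would only show up with direct I/O on hardware and drivers that support polling.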
2016-03-04 | vfs: vfs: Define new syscalls preadv2,pwritev2 | Milosz Tanski | 1 | -35/+126
    New syscalls that take a flag argument. No flags are added yet in this patch.

    Signed-off-by: Milosz Tanski <[email protected]>
    [hch: rebased on top of my kiocb changes]
    Signed-off-by: Christoph Hellwig <[email protected]>
    Reviewed-by: Stephen Bates <[email protected]>
    Tested-by: Stephen Bates <[email protected]>
    Acked-by: Jeff Moyer <[email protected]>
    Signed-off-by: Al Viro <[email protected]>
2016-03-04 | vfs: pass a flags argument to vfs_readv/vfs_writev | Christoph Hellwig | 3 | -21/+29
    This way we can set kiocb flags also from the sync read/write path for the read_iter/write_iter operations. For now there is no way to pass flags to plain read/write operations as there is no real need for that, and all flags passed are explicitly rejected for these files.

    Signed-off-by: Milosz Tanski <[email protected]>
    [hch: rebased on top of my kiocb changes]
    Signed-off-by: Christoph Hellwig <[email protected]>
    Reviewed-by: Stephen Bates <[email protected]>
    Tested-by: Stephen Bates <[email protected]>
    Acked-by: Jeff Moyer <[email protected]>
    Signed-off-by: Al Viro <[email protected]>
2016-03-03 | Btrfs: fix loading of orphan roots leading to BUG_ON | Filipe Manana | 1 | -1/+9
    When looking for orphan roots during mount we can end up hitting a BUG_ON() (at root-tree.c:btrfs_find_orphan_roots()) if a log tree is replayed and qgroups are enabled. This is because after a log tree is replayed, a transaction commit is made, which triggers qgroup extent accounting which in turn does backref walking which ends up reading and inserting all roots in the radix tree fs_info->fs_root_radix, including orphan roots (deleted snapshots). So after the log tree is replayed, when finding orphan roots we hit the BUG_ON with the following trace:

      [118209.182438] ------------[ cut here ]------------
      [118209.183279] kernel BUG at fs/btrfs/root-tree.c:314!
      [118209.184074] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
      [118209.185123] Modules linked in: btrfs dm_flakey dm_mod crc32c_generic ppdev xor raid6_pq evdev sg parport_pc parport acpi_cpufreq tpm_tis tpm psmouse processor i2c_piix4 serio_raw pcspkr i2c_core button loop autofs4 ext4 crc16 mbcache jbd2 sd_mod sr_mod cdrom ata_generic virtio_scsi ata_piix libata virtio_pci virtio_ring virtio scsi_mod e1000 floppy [last unloaded: btrfs]
      [118209.186318] CPU: 14 PID: 28428 Comm: mount Tainted: G W 4.5.0-rc5-btrfs-next-24+ #1
      [118209.186318] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS by qemu-project.org 04/01/2014
      [118209.186318] task: ffff8801ec131040 ti: ffff8800af34c000 task.ti: ffff8800af34c000
      [118209.186318] RIP: 0010:[<ffffffffa04237d7>] [<ffffffffa04237d7>] btrfs_find_orphan_roots+0x1fc/0x244 [btrfs]
      [118209.186318] RSP: 0018:ffff8800af34faa8 EFLAGS: 00010246
      [118209.186318] RAX: 00000000ffffffef RBX: 00000000ffffffef RCX: 0000000000000001
      [118209.186318] RDX: 0000000080000000 RSI: 0000000000000001 RDI: 00000000ffffffff
      [118209.186318] RBP: ffff8800af34fb08 R08: 0000000000000001 R09: 0000000000000000
      [118209.186318] R10: ffff8800af34f9f0 R11: 6db6db6db6db6db7 R12: ffff880171b97000
      [118209.186318] R13: ffff8801ca9d65e0 R14: ffff8800afa2e000 R15: 0000160000000000
      [118209.186318] FS: 00007f5bcb914840(0000) GS:ffff88023edc0000(0000) knlGS:0000000000000000
      [118209.186318] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [118209.186318] CR2: 00007f5bcaceb5d9 CR3: 00000000b49b5000 CR4: 00000000000006e0
      [118209.186318] Stack:
      [118209.186318] fffffbffffffffff 010230ffffffffff 0101000000000000 ff84000000000000
      [118209.186318] fbffffffffffffff 30ffffffffffffff 0000000000000101 ffff880082348000
      [118209.186318] 0000000000000000 ffff8800afa2e000 ffff8800afa2e000 0000000000000000
      [118209.186318] Call Trace:
      [118209.186318] [<ffffffffa042e2db>] open_ctree+0x1e37/0x21b9 [btrfs]
      [118209.186318] [<ffffffffa040a753>] btrfs_mount+0x97e/0xaed [btrfs]
      [118209.186318] [<ffffffff8108e1c0>] ? trace_hardirqs_on+0xd/0xf
      [118209.186318] [<ffffffff8117b87e>] mount_fs+0x67/0x131
      [118209.186318] [<ffffffff81192d2b>] vfs_kern_mount+0x6c/0xde
      [118209.186318] [<ffffffffa0409f81>] btrfs_mount+0x1ac/0xaed [btrfs]
      [118209.186318] [<ffffffff8108e1c0>] ? trace_hardirqs_on+0xd/0xf
      [118209.186318] [<ffffffff8108c26b>] ? lockdep_init_map+0xb9/0x1b3
      [118209.186318] [<ffffffff8117b87e>] mount_fs+0x67/0x131
      [118209.186318] [<ffffffff81192d2b>] vfs_kern_mount+0x6c/0xde
      [118209.186318] [<ffffffff81195637>] do_mount+0x8a6/0x9e8
      [118209.186318] [<ffffffff8119598d>] SyS_mount+0x77/0x9f
      [118209.186318] [<ffffffff81493017>] entry_SYSCALL_64_fastpath+0x12/0x6b
      [118209.186318] Code: 64 00 00 85 c0 89 c3 75 24 f0 41 80 4c 24 20 20 49 8b bc 24 f0 01 00 00 4c 89 e6 e8 e8 65 00 00 85 c0 89 c3 74 11 83 f8 ef 75 02 <0f> 0b 4c 89 e7 e8 da 72 00 00 eb 1c 41 83 bc 24 00 01 00 00 00
      [118209.186318] RIP [<ffffffffa04237d7>] btrfs_find_orphan_roots+0x1fc/0x244 [btrfs]
      [118209.186318] RSP <ffff8800af34faa8>
      [118209.230735] ---[ end trace 83938f987d85d477 ]---

    So fix this by not treating the error -EEXIST, returned when attempting to insert a root already inserted by the backref walking code, as an error.

    The following test case for xfstests reproduces the bug:

      seq=`basename $0`
      seqres=$RESULT_DIR/$seq
      echo "QA output created by $seq"
      tmp=/tmp/$$
      status=1	# failure is the default!
      trap "_cleanup; exit \$status" 0 1 2 3 15

      _cleanup()
      {
          _cleanup_flakey
          cd /
          rm -f $tmp.*
      }

      # get standard environment, filters and checks
      . ./common/rc
      . ./common/filter
      . ./common/dmflakey

      # real QA test starts here
      _supported_fs btrfs
      _supported_os Linux
      _require_scratch
      _require_dm_target flakey
      _require_metadata_journaling $SCRATCH_DEV

      rm -f $seqres.full

      _scratch_mkfs >>$seqres.full 2>&1
      _init_flakey
      _mount_flakey

      _run_btrfs_util_prog quota enable $SCRATCH_MNT

      # Create 2 directories with one file in one of them.
      # We use these just to trigger a transaction commit later, moving the file from
      # directory a to directory b and doing an fsync against directory a.
      mkdir $SCRATCH_MNT/a
      mkdir $SCRATCH_MNT/b
      touch $SCRATCH_MNT/a/f
      sync

      # Create our test file with 2 4K extents.
      $XFS_IO_PROG -f -s -c "pwrite -S 0xaa 0 8K" $SCRATCH_MNT/foobar | _filter_xfs_io

      # Create a snapshot and delete it. This doesn't really delete the snapshot
      # immediately, just makes it inaccessible and invisible to user space, the
      # snapshot is deleted later by a dedicated kernel thread (cleaner kthread)
      # which is woken up at the next transaction commit.
      # A root orphan item is inserted into the tree of tree roots, so that if a
      # power failure happens before the dedicated kernel thread does the snapshot
      # deletion, the next time the filesystem is mounted it resumes the snapshot
      # deletion.
      _run_btrfs_util_prog subvolume snapshot $SCRATCH_MNT $SCRATCH_MNT/snap
      _run_btrfs_util_prog subvolume delete $SCRATCH_MNT/snap

      # Now overwrite half of the extents we wrote before. Because we made a snapshot
      # before, which isn't really deleted yet (since no transaction commit happened
      # after we did the snapshot delete request), the non overwritten extents get
      # referenced twice, once by the default subvolume and once by the snapshot.
      $XFS_IO_PROG -c "pwrite -S 0xbb 4K 8K" $SCRATCH_MNT/foobar | _filter_xfs_io

      # Now move file f from directory a to directory b and fsync directory a.
      # The fsync on the directory a triggers a transaction commit (because a file
      # was moved from it to another directory) and the file fsync leaves a log tree
      # with file extent items to replay.
      mv $SCRATCH_MNT/a/f $SCRATCH_MNT/a/b
      $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/a
      $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foobar

      echo "File digest before power failure:"
      md5sum $SCRATCH_MNT/foobar | _filter_scratch

      # Now simulate a power failure and mount the filesystem to replay the log tree.
      # After the log tree was replayed, we used to hit a BUG_ON() when processing
      # the root orphan item for the deleted snapshot. This is because when processing
      # an orphan root the code expected to be the first code inserting the root into
      # the fs_info->fs_root_radix radix tree, while in reality it was the second
      # caller attempting to do it - the first caller was the transaction commit that
      # took place after replaying the log tree, when updating the qgroup counters.
      _flakey_drop_and_remount

      echo "File digest after power failure:"
      # Must match what we got before the power failure.
      md5sum $SCRATCH_MNT/foobar | _filter_scratch

      _unmount_flakey
      status=0
      exit

    Fixes: 2d9e97761087 ("Btrfs: use btrfs_get_fs_root in resolve_indirect_ref")
    Cc: [email protected] # 4.4+
    Signed-off-by: Filipe Manana <[email protected]>
    Reviewed-by: Qu Wenruo <[email protected]>
    Signed-off-by: Chris Mason <[email protected]>
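The shape of the fix, tolerating -EEXIST from an insert that another code path may already have performed, can be sketched in user space as below. The cache type and both functions are hypothetical stand-ins (a small array instead of the radix tree, `demo_*` names instead of the Btrfs functions); only the error-handling pattern is the point.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define DEMO_MAX_ROOTS 8

/* Hypothetical stand-in for the per-filesystem cache of loaded roots. */
struct demo_root_cache {
	unsigned long ids[DEMO_MAX_ROOTS];
	size_t count;
};

/* Insert 'id'. Returns 0, -EEXIST if already cached, -ENOMEM if full. */
static int demo_insert_root(struct demo_root_cache *c, unsigned long id)
{
	for (size_t i = 0; i < c->count; i++)
		if (c->ids[i] == id)
			return -EEXIST;
	if (c->count == DEMO_MAX_ROOTS)
		return -ENOMEM;
	c->ids[c->count++] = id;
	return 0;
}

/*
 * Orphan-root scan step. Another walker may have cached the root first,
 * so "already there" is a normal outcome rather than a fatal error.
 */
static int demo_load_orphan_root(struct demo_root_cache *c, unsigned long id)
{
	int err;

	assert(c != NULL);
	err = demo_insert_root(c, id);
	if (err == -EEXIST)
		return 0; /* inserted earlier by someone else: fine */
	return err;       /* real failures still propagate */
}
```

Before such a change, the equivalent of the `-EEXIST` branch would have been an assertion failure, which is what the trace above shows.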
2016-03-03 | block-dev: enable writeback cgroup support | Shaohua Li | 1 | -1/+5
    block_dev's .writepages/.writepage already handle wbc_init_bio/wbc_account_io, so we only need to set the SB_I_CGROUPWB bit to enable writeback cgroup support.

    Signed-off-by: Shaohua Li <[email protected]>
    Acked-by: Tejun Heo <[email protected]>
    Signed-off-by: Jens Axboe <[email protected]>
2016-03-03 | writeback: flush inode cgroup wb switches instead of pinning super_block | Tejun Heo | 2 | -13/+42
    If cgroup writeback is in use, inodes can be scheduled for asynchronous wb switching. Before 5ff8eaac1636 ("writeback: keep superblock pinned during cgroup writeback association switches"), this could race with umount leading to super_block being destroyed while inodes are pinned for wb switching. 5ff8eaac1636 fixed it by bumping s_active while wb switches are in flight; however, this allowed in-flight wb switches to make umounts asynchronous when the userland expected synchronous behavior - e.g. fsck immediately following umount may fail because the device is still busy.

    This patch removes the problematic super_block pinning and instead makes generic_shutdown_super() flush in-flight wb switches. wb switches are now executed on a dedicated isw_wq so that they can be flushed and isw_nr_in_flight keeps track of the number of in-flight wb switches so that flushing can be avoided in most cases.

    v2: Move cgroup_writeback_umount() further below and add MS_ACTIVE check in inode_switch_wbs() as Jan and Al suggested.

    Signed-off-by: Tejun Heo <[email protected]>
    Reported-by: Tahsin Erdogan <[email protected]>
    Cc: Jan Kara <[email protected]>
    Cc: Al Viro <[email protected]>
    Link: http://lkml.kernel.org/g/CAAeU0aNCq7LGODvVGRU-oU_o-6enii5ey0p1c26D1ZzYwkDc5A@mail.gmail.com
    Fixes: 5ff8eaac1636 ("writeback: keep superblock pinned during cgroup writeback association switches")
    Cc: [email protected] #v4.5
    Reviewed-by: Jan Kara <[email protected]>
    Tested-by: Tahsin Erdogan <[email protected]>
    Signed-off-by: Jens Axboe <[email protected]>
2016-03-03 | Orangefs: improve gossip statements | Mike Marshall | 4 | -21/+49
    Signed-off-by: Mike Marshall <[email protected]>
2016-03-03 | ovl: copy new uid/gid into overlayfs runtime inode | Konstantin Khlebnikov | 1 | -0/+2
    Overlayfs must update uid/gid after chown, otherwise functions like inode_owner_or_capable() will check user against stale uid. Caught by xfstests generic/087, which chowns a file and calls utimes.

    Signed-off-by: Konstantin Khlebnikov <[email protected]>
    Signed-off-by: Miklos Szeredi <[email protected]>
    Cc: <[email protected]>
2016-03-03 | ovl: ignore lower entries when checking purity of non-directory entries | Konstantin Khlebnikov | 2 | -5/+14
    After rename, the file dentry still holds a reference to the lower dentry from its previous location. This doesn't matter for data access because data comes from the upper dentry. But this stale lower dentry taints the dentry at the new location and turns it into non-pure upper. Such a file leaves a visible whiteout entry after removal in a directory which shouldn't have whiteouts at all.

    Overlayfs already tracks pureness of file location in oe->opaque. This patch just uses that for detecting actual path type.

    Comment from Vivek Goyal's patch:

    Here are the details of the problem. Do the following.

      $ mkdir upper lower work merged upper/dir/
      $ touch lower/test
      $ sudo mount -t overlay overlay -olowerdir=lower,upperdir=upper,workdir=work merged
      $ mv merged/test merged/dir/
      $ rm merged/dir/test
      $ ls -l merged/dir/
      /usr/bin/ls: cannot access merged/dir/test: No such file or directory
      total 0
      c????????? ? ? ? ? ? test

    Basic problem seems to be that once a file has been unlinked, a whiteout has been left behind which was not needed and hence it becomes visible.

    Whiteout is visible because parent dir is not of type MERGE, hence od->is_real is set during ovl_dir_open(). And that means ovl_iterate() passes on iterate handling directly to underlying fs. Underlying fs does not know/filter whiteouts so it becomes visible to user.

    Why did we leave a whiteout to begin with when we should not have? ovl_do_remove() checks for OVL_TYPE_PURE_UPPER() and does not leave whiteout if file is pure upper. In this case file is not found to be pure upper hence whiteout is left.

    So why was the file not PURE_UPPER in this case? I think because dentry is still carrying some leftover state which was valid before rename. For example, od->numlower was set to 1 as it was a lower file. After rename, this state is not valid anymore as there is no such file in lower.

    Signed-off-by: Konstantin Khlebnikov <[email protected]>
    Reported-by: Viktor Stanchev <[email protected]>
    Suggested-by: Vivek Goyal <[email protected]>
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=109611
    Acked-by: Vivek Goyal <[email protected]>
    Signed-off-by: Miklos Szeredi <[email protected]>
    Cc: <[email protected]>
2016-03-03 | ovl: fix getcwd() failure after unsuccessful rmdir | Rui Wang | 1 | -1/+2
    ovl_remove_upper() should do d_drop() only after it successfully removes the dir, otherwise a subsequent getcwd() system call will fail, breaking userspace programs.

    This is to fix: https://bugzilla.kernel.org/show_bug.cgi?id=110491

    Signed-off-by: Rui Wang <[email protected]>
    Reviewed-by: Konstantin Khlebnikov <[email protected]>
    Signed-off-by: Miklos Szeredi <[email protected]>
    Cc: <[email protected]>
2016-03-03 | ovl: fix working on distributed fs as lower layer | Konstantin Khlebnikov | 1 | -0/+1
    This adds missing .d_select_inode into alternative dentry_operations.

    Signed-off-by: Konstantin Khlebnikov <[email protected]>
    Fixes: 7c03b5d45b8e ("ovl: allow distributed fs as lower layer")
    Fixes: 4bacc9c9234c ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay")
    Reviewed-by: Nikolay Borisov <[email protected]>
    Tested-by: Nikolay Borisov <[email protected]>
    Signed-off-by: Miklos Szeredi <[email protected]>
    Cc: <[email protected]> # 4.2+
2016-03-03 | quota: Fix possible GPF due to uninitialised pointers | Nikolay Borisov | 1 | -2/+1
    When dqget() in __dquot_initialize() fails e.g. due to IO error, __dquot_initialize() will pass an array of uninitialized pointers to dqput_all() and thus can lead to dereference of random data. Fix the problem by properly initializing the array.

    CC: [email protected]
    Signed-off-by: Nikolay Borisov <[email protected]>
    Signed-off-by: Jan Kara <[email protected]>
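The bug class is easy to reproduce in miniature. In the sketch below, all names (`demo_dquot`, `demo_put_all()`, `demo_initialize()`, `fail_at`) are hypothetical and the allocation is ordinary `malloc()`; the relevant line is the zero-initialisation of the pointer array, which makes the error path safe when acquisition fails partway through.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_MAXQUOTAS 3

/* Hypothetical quota object; not the kernel's struct dquot. */
struct demo_dquot { int type; };

static int demo_put_calls; /* counts non-NULL releases, for illustration */

/* Release every slot; NULL slots are skipped, garbage slots would crash. */
static void demo_put_all(struct demo_dquot **dq)
{
	for (int i = 0; i < DEMO_MAXQUOTAS; i++) {
		if (dq[i] != NULL) {
			demo_put_calls++;
			free(dq[i]);
		}
		dq[i] = NULL;
	}
}

/* Acquire one object per type; 'fail_at' simulates an I/O error (-1: none). */
static int demo_initialize(int fail_at)
{
	/* The fix: every slot starts as NULL, not as stack garbage. */
	struct demo_dquot *got[DEMO_MAXQUOTAS] = { NULL };

	assert(fail_at < DEMO_MAXQUOTAS);
	for (int i = 0; i < DEMO_MAXQUOTAS; i++) {
		if (i == fail_at || (got[i] = malloc(sizeof(*got[i]))) == NULL) {
			demo_put_all(got); /* safe: untouched slots are NULL */
			return -EIO;
		}
		got[i]->type = i;
	}
	demo_put_all(got);
	return 0;
}
```

Without the initialiser, a failure at index 1 would hand `demo_put_all()` an indeterminate pointer in slot 2, which is the user-space analogue of the general protection fault named in the subject line.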
2016-03-02 | nfsd4: resfh unused in nfsd4_secinfo | J. Bruce Fields | 1 | -2/+0
    Signed-off-by: J. Bruce Fields <[email protected]>
2016-03-02 | f2fs: mutex can't be used by down_write_nest_lock() | Yang Shi | 1 | -3/+1
    f2fs_lock_all() calls down_write_nest_lock() to acquire a rw_sem and check a mutex, but down_write_nest_lock() is designed for two rw_sems according to the comment in include/linux/rwsem.h. And, other than f2fs, it is just called in mm/mmap.c with two rwsems.

    So it looks like f2fs is using it incorrectly. And it causes the below compile warning on the -rt kernel too.

      In file included from fs/f2fs/xattr.c:25:0:
      fs/f2fs/f2fs.h: In function 'f2fs_lock_all':
      fs/f2fs/f2fs.h:962:34: warning: passing argument 2 of 'down_write_nest_lock' from incompatible pointer type [-Wincompatible-pointer-types]
        f2fs_down_write(&sbi->cp_rwsem, &sbi->cp_mutex);
                                        ^
      fs/f2fs/f2fs.h:27:55: note: in definition of macro 'f2fs_down_write'
       #define f2fs_down_write(x, y) down_write_nest_lock(x, y)
                                                             ^
      In file included from include/linux/rwsem.h:22:0,
                       from fs/f2fs/xattr.c:21:
      include/linux/rwsem_rt.h:138:20: note: expected 'struct rw_semaphore *' but argument is of type 'struct mutex *'
       static inline void down_write_nest_lock(struct rw_semaphore *sem,

    Signed-off-by: Yang Shi <[email protected]>
    Reviewed-by: Chao Yu <[email protected]>
    Signed-off-by: Jaegeuk Kim <[email protected]>
2016-03-02 | f2fs: recovery missing dot dentries in root directory | Liu Xue | 1 | -0/+7
    If f2fs was corrupted and dot dentries are missing from the root directory, they need to be recovered after fsck.f2fs detects the missing dot dentries and sets the F2FS_INLINE_DOTS flag in the directory inode.

    Signed-off-by: Xue Liu <[email protected]>
    Signed-off-by: Yong Sheng <[email protected]>
    Reviewed-by: Chao Yu <[email protected]>
    Signed-off-by: Jaegeuk Kim <[email protected]>