aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2010-08-01ext4: force block allocation on quota_offDmitry Monakhov1-1/+14
Perform full sync procedure so that any delayed allocation blocks are allocated so quota will be consistent. Signed-off-by: Dmitry Monakhov <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]>
2010-08-01ext4: fix freeze deadlock under IOEric Sandeen1-2/+2
Commit 6b0310fbf087ad6 caused a regression resulting in deadlocks when freezing a filesystem which had active IO; the vfs_check_frozen level (SB_FREEZE_WRITE) did not let the freeze-related IO syncing through. Duh. Changing the test to FREEZE_TRANS should let the normal freeze syncing get through the fs, but still block any transactions from starting once the fs is completely frozen. I tested this by running fsstress in the background while periodically snapshotting the fs and running fsck on the result. I ran into occasional deadlocks, but different ones. I think this is a fine fix for the problem at hand, and the other deadlocky things will need more investigation. Reported-by: Phillip Susi <[email protected]> Signed-off-by: Eric Sandeen <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]>
2010-07-29ext4: drop inode from orphan list if ext4_delete_inode() failsTheodore Ts'o1-0/+1
There were some error paths in ext4_delete_inode() which was not dropping the inode from the orphan list. This could lead to a BUG_ON on umount when the orphan list is discovered to be non-empty. Signed-off-by: "Theodore Ts'o" <[email protected]>
2010-07-27ext4: check to make make sure bd_dev is set before dereferencing itTheodore Ts'o1-4/+13
There are some drivers which may not set bdev->bd_dev. So make sure it is non-NULL before dereferencing it. Google-Bug-Id: 1773557 Signed-off-by: "Theodore Ts'o" <[email protected]>
2010-07-27jbd2: Make barrier messages less scaryEric Sandeen1-4/+4
Saying things like "sync failed" when a device does not support barriers makes users slightly more worried than they need to be; rather than talking about sync failures, let's just state the barrier-based facts. Signed-off-by: Eric Sandeen <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]>
2010-07-27ext4: don't print scary messages for allocation failures post-abortEric Sandeen2-11/+17
I often get emails containing the "This should not happen!!" message, conveniently trimmed to remove things like: sd 0:0:0:0: [sda] Unhandled error code sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 03 13 c9 70 00 00 28 00 end_request: I/O error, dev sda, sector 51628400 Aborting journal on device dm-0-8. EXT4-fs error (device dm-0): ext4_journal_start_sb: Detected aborted journal EXT4-fs (dm-0): Remounting filesystem read-only I don't think there is any value to the verbosity if the reason is due to a filesystem abort; it just obfuscates the root cause. Signed-off-by: Eric Sandeen <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]>
2010-07-27ext4: fix EFBIG edge case when writing to large non-extent fileToshiyuki Okajima1-1/+2
By running the following reproducer, we can confirm that the write system call returns with 0 when it should return the error EFBIG. #!/bin/sh /bin/dd if=/dev/zero of=./img bs=1k count=1 seek=1024k > /dev/null 2>&1 /sbin/mkfs.ext3 -Fq ./img /bin/mount -o loop -t ext4 ./img /mnt /bin/touch /mnt/file strace /bin/dd if=/dev/zero of=/mnt/file conv=notrunc bs=1k count=1 seek=$((2194719883264/1024)) 2>&1 | /bin/egrep "write.* 1024\) = " /bin/umount /mnt exit Signed-off-by: Toshiyuki Okajima <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]> Cc: Eric Sandeen <[email protected]>
2010-07-27ext4: fix ext4_get_blocks referencesEric Sandeen2-9/+6
ext4_get_blocks got renamed to ext4_map_blocks, but left stale comments and a prototype littered around. Signed-off-by: Eric Sandeen <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]>
2010-07-27ext4: Always journal quota file modificationsJan Kara1-14/+5
When journaled quota options are not specified, we do writes to quota files just in data=ordered mode. This actually causes warnings from JBD2 about dirty journaled buffer because ext4_getblk unconditionally treats a block allocated by it as metadata. Since quota actually is filesystem metadata, the easiest way to get rid of the warning is to always treat quota writes as metadata... Signed-off-by: Jan Kara <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]>
2010-07-27ext4: Fix potential memory leak in ext4_fill_superCyrill Gorcunov1-3/+5
Under heavy memory pressure we may hit out of memory situation and as result kstrdup'ed options will not be freed. Fix it. Signed-off-by: Cyrill Gorcunov <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]>
2010-07-27ext4: Don't error out the fs if the user tries to make a file too bigTheodore Ts'o1-4/+2
If the user attempts to make a non-extent-mapped file to be too large, return EFBIG, but don't call ext4_std_err() which will end up marking the file system as containing an error. Thanks to Toshiyuki Okajima-san at Fujitsu for pointing this out. Signed-off-by: "Theodore Ts'o" <[email protected]>
2010-07-27ext4: allocate stripe-multiple IOs on stripe boundariesEric Sandeen1-4/+3
For some reason, today mballoc only allocates IOs which are exactly stripe-sized on a stripe boundary. If you have a multiple (say, a 128k IO on a 64k stripe) you may end up unaligned. It seems to me that a simple change to align stripe-multiple IOs on stripe boundaries would be a very good idea, unless this breaks some other mballoc heuristic for some reason... Reported-by: Mike Snitzer <[email protected]> Signed-off-by: Eric Sandeen <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]>
2010-07-27ext4: move aio completion after unwritten extent conversion[email protected] (Jiaying Zhang)2-6/+15
This patch is to be applied upon Christoph's "direct-io: move aio_complete into ->end_io" patch. It adds iocb and result fields to struct ext4_io_end_t, so that we can call aio_complete from ext4_end_io_nolock() after the extent conversion has finished. I have verified with Christoph's aio-dio test that used to fail after a few runs on an original kernel but now succeeds on the patched kernel. See http://thread.gmane.org/gmane.comp.file-systems.ext4/19659 for details. Signed-off-by: Jiaying Zhang <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]>
2010-07-27direct-io: move aio_complete into ->end_ioChristoph Hellwig6-18/+37
Filesystems with unwritten extent support must not complete an AIO request until the transaction to convert the extent has been commited. That means the aio_complete calls needs to be moved into the ->end_io callback so that the filesystem can control when to call it exactly. This makes a bit of a mess out of dio_complete and the ->end_io callback prototype even more complicated. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Jan Kara <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]>
2010-07-27ext4: Support discard requests when running in no-journal modeJiaying Zhang1-16/+21
Issue discard request in ext4_free_blocks() when ext4 has no journal and is mounted with discard option. Signed-off-by: Jiaying Zhang <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]>
2010-07-27jbd2: Remove __GFP_NOFAIL from jbd2 layerTheodore Ts'o3-23/+57
__GFP_NOFAIL is going away, so add our own retry loop. Also add jbd2__journal_start() and jbd2__journal_restart() which take a gfp mask, so that file systems can optionally (re)start transaction handles using GFP_KERNEL. If they do this, then they need to be prepared to handle receiving an PTR_ERR(-ENOMEM) error, and be ready to reflect that error up to userspace. Signed-off-by: "Theodore Ts'o" <[email protected]>
2010-07-27ext4: Fix block bitmap inconsistencies after a crash when deleting filesAmir G1-22/+13
We have experienced bitmap inconsistencies after crash during file delete under heavy load. The crash is not file system related and I the following patch in ext4_free_branches() fixes the recovery problem. If the transaction is restarted and there is a crash before the new transaction is committed, then after recovery, the blocks that this indirect block points to have been freed, but the indirect block itself has not been freed and may still point to some of the free blocks (because of the ext4_forget()). So ext4_forget() should be called inside ext4_free_blocks() to avoid this problem. Signed-off-by: Amir Goldstein <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]>
2010-07-27ext4: Remove unnecessary casts of private_dataJoe Perches2-2/+2
Signed-off-by: Joe Perches <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]>
2010-07-27ext4: fix potential NULL dereference while tracingTheodore Ts'o2-10/+14
The allocation_context pointer can be NULL. Signed-off-by: "Theodore Ts'o" <[email protected]>
2010-07-27ext4: Define s_jnl_backup_type in superblockTheodore Ts'o1-2/+2
This has been in use by e2fsprogs for a while; define it to keep the super block fields in sync. Signed-off-by: "Theodore Ts'o" <[email protected]>
2010-07-27ext4: Once a day, printk file system error information to dmesgTheodore Ts'o2-0/+62
This allows us to grab any file system error messages by scraping /var/log/messages. This will make it easy for us to do error analysis across the very large number of machines as we deploy ext4 across the fleet. Signed-off-by: "Theodore Ts'o" <[email protected]>
2010-07-27ext4: Save error information to the superblock for analysisTheodore Ts'o5-22/+90
Save number of file system errors, and the time function name, line number, block number, and inode number of the first and most recent errors reported on the file system in the superblock. Signed-off-by: "Theodore Ts'o" <[email protected]>
2010-07-27ext4: Pass line numbers to ext4_error() and friendsTheodore Ts'o8-80/+98
Signed-off-by: "Theodore Ts'o" <[email protected]>
2010-07-27ext4: Cleanup ext4_check_dir_entry so __func__ is now implicitTheodore Ts'o3-16/+17
Also start passing the line number to ext4_check_dir since we're going to need it in upcoming patch. Signed-off-by: "Theodore Ts'o" <[email protected]>
2010-06-29ext4: Pass line number to ext4_journal_abort_handle()Theodore Ts'o3-47/+53
This allows the error messages to include the line number Signed-off-by: "Theodore Ts'o" <[email protected]>
2010-06-29ext4: Enhance ext4_grp_locked_error() to take block and function numbersTheodore Ts'o4-32/+45
Also use a macro definition so that __func__ and __LINE__ is implicit. Signed-off-by: "Theodore Ts'o" <[email protected]>
2010-06-29ext4: clean up ext4_abort() so __func__ is now implicitTheodore Ts'o3-9/+10
Use a macro definition for ext4_abort() to clean up the .c files a wee bit. Signed-off-by: "Theodore Ts'o" <[email protected]>
2010-06-29ext4: Add new superblock fields reserved for the Next3 snapshot featureTheodore Ts'o1-1/+7
Signed-off-by: "Theodore Ts'o" <[email protected]>
2010-06-15ext4: update ctime when changing the file's permission by setfaclJan Kara1-0/+1
ext4 didn't update the ctime of the file when its permission was changed. Steps to reproduce: # touch aaa # stat -c %Z aaa 1275289822 # setfacl -m 'u::x,g::x,o::x' aaa # stat -c %Z aaa 1275289822 <- unchanged But, according to the spec of the ctime, ext4 must update it. Port of ext3 patch by Miao Xie <[email protected]>. CC: [email protected] Signed-off-by: Jan Kara <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]>
2010-06-14ext4: remove vestiges of nobh supportChristoph Hellwig4-44/+15
The nobh option was only supported for writeback mode, but given that all write paths actually create buffer heads it effectively was a no-op already. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]>
2010-06-14ext4: remove initialized but not read variablesAndi Kleen7-50/+10
No real bugs found, just removed some dead code. Found by gcc 4.6's new warnings. Signed-off-by: Andi Kleen <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]>
2010-06-14ext4: Convert more i_flags references to use accessor functionsTheodore Ts'o3-4/+5
These changes are not ones which are likely to result in races, but they should be fixed. Signed-off-by: "Theodore Ts'o" <[email protected]>
2010-06-11ext4: Clean up s_dirt handlingTheodore Ts'o9-16/+36
We don't need to set s_dirt in most of the ext4 code when journaling is enabled. In ext3/4 some of the summary statistics for # of free inodes, blocks, and directories are calculated from the per-block group statistics when the file system is mounted or unmounted. As a result the superblock doesn't have to be updated, either via the journal or by setting s_dirt. There are a few exceptions, most notably when resizing the file system, where the superblock needs to be modified --- and in that case it should be done as a journalled operation if possible, and s_dirt set only in no-journal mode. This patch will optimize out some unneeded disk writes when using ext4 with a journal. Signed-off-by: "Theodore Ts'o" <[email protected]>
2010-06-11Linux 2.6.35-rc3Linus Torvalds1-1/+1
2010-06-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds14-28/+103
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: wimax/i2400m: fix missing endian correction read in fw loader net8139: fix a race at the end of NAPI pktgen: Fix accuracy of inter-packet delay. pkt_sched: gen_estimator: add a new lock net: deliver skbs on inactive slaves to exact matches ipv6: fix ICMP6_MIB_OUTERRORS r8169: fix mdio_read and update mdio_write according to hw specs gianfar: Revive the driver for eTSEC devices (disable timestamping) caif: fix a couple range checks phylib: Add support for the LXT973 phy. net: Print num_rx_queues imbalance warning only when there are allocated queues
2010-06-11Merge branch 'pm-fixes' of ↵Linus Torvalds3-0/+8
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6 * 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6: PM / x86: Save/restore MISC_ENABLE register
2010-06-11Merge branch 'for-linus' of ↵Linus Torvalds9-17/+41
git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: Btrfs: The file argument for fsync() is never null Btrfs: handle ERR_PTR from posix_acl_from_xattr() Btrfs: avoid BUG when dropping root and reference in same transaction Btrfs: prohibit a operation of changing acl's mask when noacl mount option used Btrfs: should add a permission check for setfacl Btrfs: btrfs_lookup_dir_item() can return ERR_PTR Btrfs: btrfs_read_fs_root_no_name() returns ERR_PTRs Btrfs: unwind after btrfs_start_transaction() errors Btrfs: btrfs_iget() returns ERR_PTR Btrfs: handle kzalloc() failure in open_ctree() Btrfs: handle error returns from btrfs_lookup_dir_item() Btrfs: Fix BUG_ON for fs converted from extN Btrfs: Fix null dereference in relocation.c Btrfs: fix remap_file_pages error Btrfs: uninitialized data is check_path_shared() Btrfs: fix fallocate regression Btrfs: fix loop device on top of btrfs
2010-06-11Merge branch 'for-linus' of ↵Linus Torvalds9-130/+17
git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: PCI: clear bridge resource range if BIOS assigned bad one PCI: hotplug/cpqphp, fix NULL dereference Revert "PCI: create function symlinks in /sys/bus/pci/slots/N/" PCI: change resource collision messages from KERN_ERR to KERN_INFO
2010-06-11PCI: clear bridge resource range if BIOS assigned bad oneYinghai Lu4-0/+5
Yannick found that video does not work with 2.6.34. The cause of this bug was that the BIOS had assigned the wrong range to the PCI bridge above the video device. Before 2.6.34 the kernel would have shrunk the size of the bridge window, but since d65245c PCI: don't shrink bridge resources the kernel will avoid shrinking BIOS ranges. So zero out the old range if we fail to claim it at boot time; this will cause us to allocate a new range at startup, restoring the 2.6.34 behavior. Fixes regression https://bugzilla.kernel.org/show_bug.cgi?id=16009. Reported-by: Yannick <[email protected]> Acked-by: Bjorn Helgaas <[email protected]> Signed-off-by: Yinghai Lu <[email protected]> Signed-off-by: Jesse Barnes <[email protected]>
2010-06-11PCI: hotplug/cpqphp, fix NULL dereferenceJiri Slaby1-0/+7
There are devices out there which are PCI Hot-plug controllers with compaq PCI IDs, but are not bridges, hence have pdev->subordinate NULL. But cpqphp expects the pointer to be non-NULL. Add a check to the probe function to avoid oopses like: BUG: unable to handle kernel NULL pointer dereference at 00000050 IP: [<f82e3c41>] cpqhpc_probe+0x951/0x1120 [cpqphp] *pdpt = 0000000033779001 *pde = 0000000000000000 ... The device here was: 00:0b.0 PCI Hot-plug controller [0804]: Compaq Computer Corporation PCI Hotplug Controller [0e11:a0f7] (rev 11) Subsystem: Compaq Computer Corporation Device [0e11:a2f8] Signed-off-by: Jiri Slaby <[email protected]> Cc: Greg KH <[email protected]> Signed-off-by: Jesse Barnes <[email protected]>
2010-06-11Revert "PCI: create function symlinks in /sys/bus/pci/slots/N/"Jesse Barnes3-125/+0
This reverts commit 75568f8094eb0333e9c2109b23cbc8b82d318a3c. Since they're just a convenience anyway, remove these symlinks since they're causing duplicate filename errors in the wild. Acked-by: Alex Chiang <[email protected]> Signed-off-by: Jesse Barnes <[email protected]>
2010-06-11PCI: change resource collision messages from KERN_ERR to KERN_INFOBjorn Helgaas1-5/+5
We can often deal with PCI resource issues by moving devices around. In that case, there's no point in alarming the user with messages like these. There are many bug reports where the message itself is the only problem, e.g., https://bugs.launchpad.net/ubuntu/+source/linux/+bug/413419 . Signed-off-by: Bjorn Helgaas <[email protected]> Signed-off-by: Jesse Barnes <[email protected]>
2010-06-11Btrfs: The file argument for fsync() is never nullDan Carpenter1-1/+1
The "file" argument for fsync is never null so we can remove this check. What drew my attention here is that 7ea8085910e: "drop unused dentry argument to ->fsync" introduced an unconditional dereference at the start of the function and that generated a smatch warning. Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2010-06-11Btrfs: handle ERR_PTR from posix_acl_from_xattr()Dan Carpenter1-0/+2
posix_acl_from_xattr() returns both ERR_PTRs and null, but it's OK to pass null values to set_cached_acl() Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2010-06-11Btrfs: avoid BUG when dropping root and reference in same transactionSage Weil1-3/+0
If btrfs_ioctl_snap_destroy() deletes a snapshot but finishes with end_transaction(), the cleaner kthread may come in and drop the root in the same transaction. If that's the case, the root's refs still == 1 in the tree when btrfs_del_root() deletes the item, because commit_fs_roots() hasn't updated it yet (that happens during the commit). This wasn't a problem before only because btrfs_ioctl_snap_destroy() would commit the transaction before dropping the dentry reference, so the dead root wouldn't get queued up until after the fs root item was updated in the btree. Since it is not an error to drop the root reference and the root in the same transaction, just drop the BUG_ON() in btrfs_del_root(). Signed-off-by: Sage Weil <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2010-06-11Btrfs: prohibit a operation of changing acl's mask when noacl mount option usedShi Weihua1-0/+3
when used Posix File System Test Suite(pjd-fstest) to test btrfs, some cases about setfacl failed when noacl mount option used. I simplified used commands in pjd-fstest, and the following steps can reproduce it. ------------------------ # cd btrfs-part/ # mkdir aaa # setfacl -m m::rw aaa <- successed, but not expected by pjd-fstest. ------------------------ I checked ext3, a warning message occured, like as: setfacl: aaa/: Operation not supported Certainly, it's expected by pjd-fstest. So, i compared acl.c of btrfs and ext3. Based on that, a patch created. Fortunately, it works. Signed-off-by: Shi Weihua <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2010-06-11Btrfs: should add a permission check for setfaclShi Weihua1-0/+3
On btrfs, do the following ------------------ # su user1 # cd btrfs-part/ # touch aaa # getfacl aaa # file: aaa # owner: user1 # group: user1 user::rw- group::rw- other::r-- # su user2 # cd btrfs-part/ # setfacl -m u::rwx aaa # getfacl aaa # file: aaa # owner: user1 # group: user1 user::rwx <- successed to setfacl group::rw- other::r-- ------------------ but we should prohibit it that user2 changing user1's acl. In fact, on ext3 and other fs, a message occurs: setfacl: aaa: Operation not permitted This patch fixed it. Signed-off-by: Shi Weihua <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2010-06-11Btrfs: btrfs_lookup_dir_item() can return ERR_PTRDan Carpenter1-1/+1
btrfs_lookup_dir_item() can return either ERR_PTRs or null. Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2010-06-11Btrfs: btrfs_read_fs_root_no_name() returns ERR_PTRsDan Carpenter1-0/+4
btrfs_read_fs_root_no_name() returns ERR_PTRs on error so I added a check for that. It's not clear to me if it can also return NULL pointers or not so I left the original NULL pointer check as is. Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2010-06-11Btrfs: unwind after btrfs_start_transaction() errorsDan Carpenter1-1/+1
This was added by a22285a6a3: "Btrfs: Integrate metadata reservation with start_transaction". If we goto out here then we skip all the unwinding and there are locks still held etc. Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Chris Mason <[email protected]>