aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2013-07-01ext4: translate flag bits to strings in tracepointsTheodore Ts'o3-41/+98
Translate the bitfields used in various flags argument to strings to make the tracepoint output more human-readable. Signed-off-by: "Theodore Ts'o" <[email protected]>
2013-07-01ext4: fix up error handling for mpage_map_and_submit_extent()Theodore Ts'o1-24/+28
The function mpage_released_unused_page() must only be called once; otherwise the kernel will BUG() when the second call to mpage_released_unused_page() tries to unlock the pages which had been unlocked by the first call. Also restructure the error handling so that we only give up on writing the dirty pages in the case of ENOSPC where retrying the allocation won't help. Otherwise, a transient failure, such as a kmalloc() failure in calling ext4_map_blocks() might cause us to give up on those pages, leading to a scary message in /var/log/messages plus data loss. Signed-off-by: "Theodore Ts'o" <[email protected]> Reviewed-by: Jan Kara <[email protected]>
2013-07-01jbd2: fix theoretical race in jbd2__journal_restartTheodore Ts'o1-1/+1
Once we decrement transaction->t_updates, if this is the last handle holding the transaction from closing, and once we release the t_handle_lock spinlock, it's possible for the transaction to commit and be released. In practice with normal kernels, this probably won't happen, since the commit happens in a separate kernel thread and it's unlikely this could all happen within the space of a few CPU cycles. On the other hand, with a real-time kernel, this could potentially happen, so save the tid found in transaction->t_tid before we release t_handle_lock. It would require an insane configuration, such as one where the jbd2 thread was set to a very high real-time priority, perhaps because a high priority real-time thread is trying to read or write to a file system. But some people who use real-time kernels have been known to do insane things, including controlling laser-wielding industrial robots. :-) Signed-off-by: "Theodore Ts'o" <[email protected]> Cc: [email protected]
2013-07-01ext4: only zero partial blocks in ext4_zero_partial_blocks()Lukas Czerner1-7/+10
Currently if we pass range into ext4_zero_partial_blocks() which covers entire block we would attempt to zero it even though we should only zero unaligned part of the block. Fix this by checking whether the range covers the whole block skip zeroing if so. Signed-off-by: Lukas Czerner <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]>
2013-07-01ext4: check error return from ext4_write_inline_data_end()Theodore Ts'o1-4/+7
The function ext4_write_inline_data_end() can return an error. So we need to assign it to a signed integer variable to check for an error return (since copied is an unsigned int). Signed-off-by: "Theodore Ts'o" <[email protected]> Cc: Zheng Liu <[email protected]> Cc: [email protected]
2013-07-01ext4: delete unnecessary C statementsjon ernst1-9/+1
Comparing unsigned variable with 0 always returns false. err = 0 is duplicated and unnecessary. [ tytso: Also cleaned up error handling in ext4_block_zero_page_range() ] Signed-off-by: "Jon Ernst" <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]>
2013-07-01ext3,ext4: don't mess with dir_file->f_pos in htree_dirblock_to_tree()Al Viro2-10/+4
Both ext3 and ext4 htree_dirblock_to_tree() is just filling the in-core rbtree for use by call_filldir(). All updates of ->f_pos are done by the latter; bumping it here (on error) is obviously wrong - we might very well have it nowhere near the block we'd found an error in. Signed-off-by: Al Viro <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]> Cc: [email protected]
2013-07-01jbd2: move superblock checksum calculation to jbd2_write_superblock()Theodore Ts'o1-1/+2
Some of the functions which modify the jbd2 superblock were not updating the checksum before calling jbd2_write_superblock(). Move the call to jbd2_superblock_csum_set() to jbd2_write_superblock(), so that the checksum is calculated consistently. Signed-off-by: "Theodore Ts'o" <[email protected]> Cc: Darrick J. Wong <[email protected]> Cc: [email protected]
2013-07-01ext4: pass inode pointer instead of file pointer to punch holeAshish Sangwan3-4/+3
No need to pass file pointer when we can directly pass inode pointer. Signed-off-by: Ashish Sangwan <[email protected]> Signed-off-by: Namjae Jeon <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]>
2013-07-01ext4: improve free space calculation for inline_databoxi liu1-1/+1
In ext4 feature inline_data,it use the xattr's space to store the inline data in inode.When we calculate the inline data as the xattr,we add the pad.But in get_max_inline_xattr_value_size() function we count the free space without pad.It cause some contents are moved to a block even if it can be stored in the inode. Signed-off-by: liulei <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]> Reviewed-by: Tao Ma <[email protected]>
2013-07-01ext4: reduce object size when !CONFIG_PRINTKJoe Perches2-20/+75
Reduce the object size ~10% could be useful for embedded systems. Add #ifdef CONFIG_PRINTK #else #endif blocks to hold formats and arguments, passing " " to functions when !CONFIG_PRINTK and still verifying format and arguments with no_printk. $ size fs/ext4/built-in.o* text data bss dec hex filename 239375 610 888 240873 3ace9 fs/ext4/built-in.o.new 264167 738 888 265793 40e41 fs/ext4/built-in.o.old $ grep -E "CONFIG_EXT4|CONFIG_PRINTK" .config # CONFIG_PRINTK is not set CONFIG_EXT4_FS=y CONFIG_EXT4_USE_FOR_EXT23=y CONFIG_EXT4_FS_POSIX_ACL=y # CONFIG_EXT4_FS_SECURITY is not set # CONFIG_EXT4_DEBUG is not set Signed-off-by: Joe Perches <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]>
2013-07-01ext4: improve extent cache shrink mechanism to avoid to burn CPU timeZheng Liu5-25/+68
Now we maintain an proper in-order LRU list in ext4 to reclaim entries from extent status tree when we are under heavy memory pressure. For keeping this order, a spin lock is used to protect this list. But this lock burns a lot of CPU time. We can use the following steps to trigger it. % cd /dev/shm % dd if=/dev/zero of=ext4-img bs=1M count=2k % mkfs.ext4 ext4-img % mount -t ext4 -o loop ext4-img /mnt % cd /mnt % for ((i=0;i<160;i++)); do truncate -s 64g $i; done % for ((i=0;i<160;i++)); do cp $i /dev/null &; done % perf record -a -g % perf report This commit tries to fix this problem. Now a new member called i_touch_when is added into ext4_inode_info to record the last access time for an inode. Meanwhile we never need to keep a proper in-order LRU list. So this can avoid to burns some CPU time. When we try to reclaim some entries from extent status tree, we use list_sort() to get a proper in-order list. Then we traverse this list to discard some entries. In ext4_sb_info, we use s_es_last_sorted to record the last time of sorting this list. When we traverse the list, we skip the inode that is newer than this time, and move this inode to the tail of LRU list. When the head of the list is newer than s_es_last_sorted, we will sort the LRU list again. In this commit, we break the loop if s_extent_cache_cnt == 0 because that means that all extents in extent status tree have been reclaimed. Meanwhile in this commit, ext4_es_{un}register_shrinker()'s prototype is changed to save a local variable in these functions. Reported-by: Dave Hansen <[email protected]> Signed-off-by: Zheng Liu <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]>
2013-07-01ext4: implement error handling of ext4_mb_new_preallocation()Alexey Khoroshilov1-7/+10
If memory allocation in ext4_mb_new_group_pa() is failed, it returns error code, ext4_mb_new_preallocation() propages it, but ext4_mb_new_blocks() ignores it. An observed result was: - allocation fail means ext4_mb_new_group_pa() does not update ext4_allocation_context; - ext4_mb_new_blocks() sets ext4_allocation_request->len (ar->len = ac->ac_b_ex.fe_len;) to number of blocks preallocated (512) instead of number of blocks requested (1); - that activates update cycle in ext4_splice_branch(): for (i = 1; i < blks; i++) <-- blks is 512 instead of 1 here *(where->p + i) = cpu_to_le32(current_block++); - it iterates 511 times and corrupts a chunk of memory including inode structure; - page fault happens at EXT4_SB(inode->i_sb) in ext4_mark_inode_dirty(); - system hangs with 'scheduling while atomic' BUG. The patch implements a check for ext4_mb_new_preallocation() error code and handles its failure as if ext4_mb_regular_allocator() fails. Found by Linux File System Verification project (linuxtesting.org). [ Patch restructed by tytso to make the flow of control easier to follow. ] Signed-off-by: Alexey Khoroshilov <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]>
2013-07-01ext4: fix corruption when online resizing a fs with 1K block sizeMaarten ter Huurne1-3/+1
Subtracting the number of the first data block places the superblock backups one block too early, corrupting the file system. When the block size is larger than 1K, the first data block is 0, so the subtraction has no effect and no corruption occurs. Signed-off-by: Maarten ter Huurne <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]> Reviewed-by: Jan Kara <[email protected]> CC: [email protected]
2013-07-01Merge tag 'v3.10' into sched/coreIngo Molnar885-5164/+13534
Merge in a recent upstream commit: c2853c8df57f include/linux/math64.h: add div64_ul() because: 72a4cf20cb71 sched: Change cfs_rq load avg to unsigned long relies on it. [ We don't rebase sched/core for this, because the handful of followup commits after the broken commit are not behavioral changes so are unlikely to be needed during bisection. ] Signed-off-by: Ingo Molnar <[email protected]>
2013-06-30Linux 3.10Linus Torvalds1-1/+1
2013-06-30Merge branch 'merge' of ↵Linus Torvalds1-1/+2
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull another powerpc fix from Benjamin Herrenschmidt: "I mentioned that while we had fixed the kernel crashes, EEH error recovery didn't always recover... It appears that I had a fix for that already in powerpc-next (with a stable CC). I cherry-picked it today and did a few tests and it seems that things now work quite well. The patch is also pretty simple, so I see no reason to wait before merging it." * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc/eeh: Fix fetching bus for single-dev-PE
2013-06-30Merge tag 'scsi-fixes' of ↵Linus Torvalds11-93/+61
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "This is a set of seven bug fixes. Several fcoe fixes for locking problems, initiator issues and a VLAN API change, all of which could eventually lead to data corruption, one fix for a qla2xxx locking problem which could lead to multiple completions of the same request (and subsequent data corruption) and a use after free in the ipr driver. Plus one minor MAINTAINERS file update" (only six bugfixes in this pull, since I had already pulled the fcoe API fix directly from Robert Love) * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: [SCSI] ipr: Avoid target_destroy accessing memory after it was freed [SCSI] qla2xxx: Fix for locking issue between driver ISR and mailbox routines MAINTAINERS: Fix fcoe mailing list libfc: extend ex_lock to protect all of fc_seq_send libfc: Correct check for initiator role libfcoe: Fix Conflicting FCFs issue in the fabric
2013-06-30Revert "serial: 8250_pci: add support for another kind of NetMos Technology ↵Greg Kroah-Hartman1-4/+0
PCI 9835 Multi-I/O Controller" This reverts commit 8d2f8cd424ca0b99001f3ff4f5db87c4e525f366. As reported by Stefan, this device already works with the parport_serial driver, so the 8250_pci driver should not also try to grab it as well. Reported-by: Stefan Seyfried <[email protected]> Cc: Wang YanQing <[email protected]> Cc: stable <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2013-06-30powerpc/eeh: Fix fetching bus for single-dev-PEGavin Shan1-1/+2
While running Linux as guest on top of phyp, we possiblly have PE that includes single PCI device. However, we didn't return its PCI bus correctly and it leads to failure on recovery from EEH errors for single-dev-PE. The patch fixes the issue. Cc: <[email protected]> # v3.7+ Cc: Steve Best <[email protected]> Signed-off-by: Gavin Shan <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2013-06-29Merge branch 'merge' of ↵Linus Torvalds2-7/+14
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc fixes from Ben Herrenschmidt: "We discovered some breakage in our "EEH" (PCI Error Handling) code while doing error injection, due to a couple of regressions. One of them is due to a patch (37f02195bee9 "powerpc/pci: fix PCI-e devices rescan issue on powerpc platform") that, in hindsight, I shouldn't have merged considering that it caused more problems than it solved. Please pull those two fixes. One for a simple EEH address cache initialization issue. The other one is a patch from Guenter that I had originally planned to put in 3.11 but which happens to also fix that other regression (a kernel oops during EEH error handling and possibly hotplug). With those two, the couple of test machines I've hammered with error injection are remaining up now. EEH appears to still fail to recover on some devices, so there is another problem that Gavin is looking into but at least it's no longer crashing the kernel." * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc/pci: Improve device hotplug initialization powerpc/eeh: Add eeh_dev to the cache during boot
2013-06-29ARM: dt: Only print warning, not WARN() on bad cpu map in device treeOlof Johansson1-2/+3
Due to recent changes and expecations of proper cpu bindings, there are now cases for many of the in-tree devicetrees where a WARN() will hit on boot due to badly formatted /cpus nodes. Downgrade this to a pr_warn() to be less alarmist, since it's not a new problem. Tested on Arndale, Cubox, Seaboard and Panda ES. Panda hits the WARN without this, the others do not. Acked-by: Russell King <[email protected]> Signed-off-by: Olof Johansson <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-06-30powerpc/pci: Improve device hotplug initializationGuenter Roeck1-5/+12
Commit 37f02195b (powerpc/pci: fix PCI-e devices rescan issue on powerpc platform) fixes a problem with interrupt and DMA initialization on hot plugged devices. With this commit, interrupt and DMA initialization for hot plugged devices is handled in the pci device enable function. This approach has a couple of drawbacks. First, it creates two code paths for device initialization, one for hot plugged devices and another for devices known during the initial PCI scan. Second, the initialization code for hot plugged devices is only called when the device is enabled, ie typically in the probe function. Also, the platform specific setup code is called each time pci_enable_device() is called, not only once during device discovery, meaning it is actually called multiple times, once for devices discovered during the initial scan and again each time a driver is re-loaded. The visible result is that interrupt pins are only assigned to hot plugged devices when the device driver is loaded. Effectively this changes the PCI probe API, since pci_dev->irq and the device's dma configuration will now only be valid after pci_enable() was called at least once. A more subtle change is that platform specific PCI device setup is moved from device discovery into the driver's probe function, more specifically into the pci_enable_device() call. To fix the inconsistencies, add new function pcibios_add_device. Call pcibios_setup_device from pcibios_setup_bus_devices if device setup is not complete, and from pcibios_add_device if bus setup is complete. With this change, device setup code is moved back into device initialization, and called exactly once for both static and hot plugged devices. [ This also fixes a regression introduced by the above patch which causes dev->irq to be overwritten under some cirumstances after MSIs have been enabled for the device which leads to crashes due to the MSI core "hijacking" dev->irq to store the base MSI number and not the LSI. --BenH ] Cc: Yuanquan Chen <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Hiroo Matsumoto <[email protected]> Signed-off-by: Guenter Roeck <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2013-06-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds3-13/+14
Pull crypto fix from Herbert Xu: "This fixes a crash in the crypto layer exposed by an SCTP test tool" * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: algboss - Hold ref count on larval
2013-06-29Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds1-0/+5
Pull drm/qxl fix from Dave Airlie: "Bad me forgot an access check, possible security issue, but since this is the first kernel with it, should be fine to just put it in now" * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: drm/qxl: add missing access check for execbuffer ioctl
2013-06-29Fix: kernel/ptrace.c: ptrace_peek_siginfo() missing __put_user() validationMathieu Desnoyers1-9/+11
This __put_user() could be used by unprivileged processes to write into kernel memory. The issue here is that even if copy_siginfo_to_user() fails, the error code is not checked before __put_user() is executed. Luckily, ptrace_peek_siginfo() has been added within the 3.10-rc cycle, so it has not hit a stable release yet. Signed-off-by: Mathieu Desnoyers <[email protected]> Acked-by: Oleg Nesterov <[email protected]> Cc: Andrey Vagin <[email protected]> Cc: Roland McGrath <[email protected]> Cc: Paul McKenney <[email protected]> Cc: David Howells <[email protected]> Cc: Dave Jones <[email protected]> Cc: Pavel Emelyanov <[email protected]> Cc: Pedro Alves <[email protected]> Cc: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-06-29Merge branch 'for-linus' of ↵Linus Torvalds1-1/+5
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph fix from Sage Weil: "This is a recently spotted regression in the snapshot behavior... It turns out several tests weren't being run in the nightlies so this took a while to spot" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: rbd: send snapshot context with writes
2013-06-29Merge branch 'for-linus' of ↵Linus Torvalds1-15/+39
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull ubifs fixes from Al Viro: "A couple of ubifs readdir/lseek race fixes. Stable fodder, really nasty..." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: UBIFS: fix a horrid bug UBIFS: prepare to fix a horrid bug
2013-06-29Merge tag 'for-linus-20130628' of ↵Linus Torvalds2-34/+22
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-mn10300 Pull two MN10300 fixes from David Howells: "The first fixes a problem with passing arrays rather than pointers to get_user() where __typeof__ then wants to declare and initialise an array variable which gcc doesn't like. The second fixes a problem whereby putting mem=xxx into the kernel command line causes init=xxx to get an incorrect value." * tag 'for-linus-20130628' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-mn10300: mn10300: Use early_param() to parse "mem=" parameter mn10300: Allow to pass array name to get_user()
2013-06-29Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds1-2/+5
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Thomas Gleixner: "Correct an ordering issue in the tick broadcast code. I really wish we'd get compensation for pain and suffering for each line of code we write to work around dysfunctional timer hardware." * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tick: Fix tick_broadcast_pending_mask not cleared
2013-06-29Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds1-7/+9
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fix from Ingo Molnar: "One more fix for a recently discovered bug" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Disable monitoring on setuid processes for regular users
2013-06-29[readdir] constify ->actorAl Viro9-68/+57
Signed-off-by: Al Viro <[email protected]>
2013-06-29[readdir] ->readdir() is goneAl Viro7-16/+13
everything's converted to ->iterate() Signed-off-by: Al Viro <[email protected]>
2013-06-29[readdir] convert ecryptfsAl Viro1-20/+15
Signed-off-by: Al Viro <[email protected]>
2013-06-29[readdir] convert codaAl Viro1-58/+19
Signed-off-by: Al Viro <[email protected]>
2013-06-29[readdir] convert ocfs2Al Viro4-113/+61
Signed-off-by: Al Viro <[email protected]>
2013-06-29[readdir] convert fatfsAl Viro1-50/+54
... pox upon the idiotic ioctls; life would be much easier without those. Signed-off-by: Al Viro <[email protected]>
2013-06-29[readdir] convert xfsAl Viro7-61/+44
Signed-off-by: Al Viro <[email protected]>
2013-06-29[readdir] convert btrfsAl Viro3-40/+21
Signed-off-by: Al Viro <[email protected]>
2013-06-29[readdir] convert hostfsAl Viro1-7/+6
Signed-off-by: Al Viro <[email protected]>
2013-06-29[readdir] convert afsAl Viro1-62/+37
Signed-off-by: Al Viro <[email protected]>
2013-06-29[readdir] convert ncpfsAl Viro1-43/+35
Signed-off-by: Al Viro <[email protected]>
2013-06-29[readdir] convert hfsplusAl Viro1-27/+23
Signed-off-by: Al Viro <[email protected]>
2013-06-29[readdir] convert hfsAl Viro1-26/+23
Signed-off-by: Al Viro <[email protected]>
2013-06-29[readdir] convert befsAl Viro1-18/+22
Signed-off-by: Al Viro <[email protected]>
2013-06-29[readdir] convert cifsAl Viro3-100/+82
Signed-off-by: Al Viro <[email protected]>
2013-06-29[readdir] convert freevxfsAl Viro1-32/+23
Signed-off-by: Al Viro <[email protected]>
2013-06-29[readdir] convert fuseAl Viro1-20/+17
Signed-off-by: Al Viro <[email protected]>
2013-06-29[readdir] convert hpfsAl Viro1-27/+29
Signed-off-by: Al Viro <[email protected]>
2013-06-29reiserfs: switch reiserfs_readdir_dentry to inodeAl Viro3-17/+15
... and clean the callers up a bit Signed-off-by: Al Viro <[email protected]>