Age | Commit message | Author | Files | Lines
2013-11-11 | Btrfs: improve jitter performance of the sequential buffered write | Miao Xie | 1 | -3/+4
Performance sometimes dropped when we ran sysbench to measure sequential buffered writes with 2 or more threads. The task scheduler could reorder the writes of the test threads, so an incoming write could land beyond the end of the file. In that case we need to insert dummy file extents and create a hole for the area we skip, and to avoid the ongoing ordered extents in that area we need to wait for them. Unfortunately, the current code doesn't check whether there are ordered extents in the area; it tries to find and flush the dirty pages directly. In fact there is no dirty page in that area, so this step is unnecessary and just wastes time. Sometimes it also increases contention on some locks and makes performance drop suddenly. So we remove the ordered extent flush function before the check, and flush the dirty pages and wait for the ordered extents only when we find them. In my tests, before applying this patch the regression showed up 1-2 times in every 10 runs of the test. After applying this patch, the regression went away. Test Environment: CPU: 1CPU * 4Cores Memory: 6GB Partition: 20GB Test Command: # sysbench --test=fileio --file-total-size=16G --file-test-mode=seqwr --num-threads=512 --file-block-size=16384 --max-time=60 --max-requests=0 run Signed-off-by: Miao Xie <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-11-11 | Btrfs: fix BUG_ON() caused by the reserved space migration | Miao Xie | 3 | -3/+28
When we did space balance and snapshot creation at the same time, we might meet the following oops: kernel BUG at fs/btrfs/inode.c:3038! [SNIP] Call Trace: [<ffffffffa0411ec7>] btrfs_orphan_cleanup+0x293/0x407 [btrfs] [<ffffffffa042dc45>] btrfs_mksubvol.isra.28+0x259/0x373 [btrfs] [<ffffffffa042de85>] btrfs_ioctl_snap_create_transid+0x126/0x156 [btrfs] [<ffffffffa042dff1>] btrfs_ioctl_snap_create_v2+0xd0/0x121 [btrfs] [<ffffffffa0430b2c>] btrfs_ioctl+0x414/0x1854 [btrfs] [<ffffffff813b60b7>] ? __do_page_fault+0x305/0x379 [<ffffffff811215a9>] vfs_ioctl+0x1d/0x39 [<ffffffff81121d7c>] do_vfs_ioctl+0x32d/0x3e2 [<ffffffff81057fe7>] ? finish_task_switch+0x80/0xb8 [<ffffffff81121e88>] SyS_ioctl+0x57/0x83 [<ffffffff813b39ff>] ? do_device_not_available+0x12/0x14 [<ffffffff813b99c2>] system_call_fastpath+0x16/0x1b [SNIP] RIP [<ffffffffa040da40>] btrfs_orphan_add+0xc3/0x126 [btrfs] The cause of the problem is that the relocation root creation stole the reserved space, which was reserved for orphan item deletion. There are several ways to fix this problem. One is to increase the reserved space size of the space balance, and then use that space to create the relocation tree for each fs/file tree. But it is hard to calculate a suitable size because we don't know how many fs/file trees we need to relocate. We fixed this problem by actively reserving the space for relocation root creation, since the space it needs is very small (one tree block, used for the root node copy), and then using that reserved space to create the relocation tree. If we don't reserve space for relocation tree creation, we will use the reserved space of the balance. Signed-off-by: Miao Xie <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-11-11 | btrfs: remove unused parameter from btrfs_header_fsid | Ross Kirk | 4 | -10/+10
Remove unused parameter, 'eb'. Unused since introduction in 5f39d397dfbe140a14edecd4e73c34ce23c4f9ee Updated to be rebased against current upstream and correct diff supplied this time! Signed-off-by: Ross Kirk <[email protected]> Reviewed-by: Eric Sandeen <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-11-11 | Btrfs: fix two use-after-free bugs with transaction cleanup | Josef Bacik | 3 | -82/+52
I was noticing the slab redzone stuff going off every once in a while during transaction aborts. This was caused by two things: 1) We would walk the pending snapshots and set their error to -ECANCELED. We don't need to do this, the snapshot stuff waits for a transaction commit and if there is a problem we just free our pending snapshot object and exit. Doing this was causing us to touch the pending snapshot object after the thing had already been freed. 2) We were freeing the transaction manually with wanton disregard for its use_count reference counter. To fix this I cleaned up the transaction freeing loop to either wait for the transaction commit to finish if it was in the middle of that (since it will be cleaned and freed up there) or to do the cleanup ourselves. I also moved the global "kill all things dirty everywhere" stuff outside of the transaction cleanup loop since that only needs to be done once. With this patch I'm no longer seeing slab corruption because of use after frees. Thanks, Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-11-11 | Btrfs: remove all BUG_ON()'s from commit_cowonly_roots | Josef Bacik | 1 | -5/+8
Noticed this when forcing errors to happen during delayed ref running. Thanks, Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-11-11 | Btrfs: don't delete ordered roots from list during cleanup | Josef Bacik | 1 | -1/+2
During transaction cleanup after an abort we are just removing roots from the ordered roots list which is incorrect. We have a BUG_ON() to make sure that the root is still part of the ordered roots list when we put our ordered extent which we were tripping in this case. So do like we do everywhere else and just move it to the tail of the ordered roots list and allow the normal cleanup to take care of stuff. Thanks, Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-11-11 | Btrfs: cleanup transaction on abort | Josef Bacik | 2 | -1/+6
If we abort outside of a transaction commit we won't clean up anything until we unmount. Unfortunately if we abort in the middle of writing out an ordered extent we won't clean it up and if somebody is waiting on that ordered extent they will wait forever. To fix this just make the transaction kthread call the cleanup transaction stuff if it notices there's an error, and make btrfs_end_transaction wake up the transaction kthread if there is an error. Thanks, Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-11-11 | Btrfs: do not release metadata for space cache inodes | Josef Bacik | 1 | -1/+7
I've been testing our error paths and I was tripping the BUG_ON() in drop_outstanding_extent because our outstanding_extents is 0 for space cache inodes. This is because we don't reserve metadata space for these inodes since we depend on the global block reserve for our space. To fix this we need to make sure the DO_ACCOUNTING stuff doesn't actually call release_metadata for space cache inodes. With this patch I'm no longer panicking. Thanks, Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-11-11 | Btrfs: reset intwrite on transaction abort | Josef Bacik | 1 | -0/+2
If we abort a transaction in the middle of a commit we weren't undoing the intwrite locking. This patch fixes that problem. Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-11-11 | Btrfs: relocate csums properly with prealloc extents | Josef Bacik | 1 | -3/+15
A user reported a problem where they were getting csum errors when running a balance and running systemd's journal. This is because systemd is awesome and fallocate()'s its log space and writes into it. Unfortunately we assume that when we read in all the csums for an extent that they are sequential starting at the bytenr we care about. This obviously isn't the case for prealloc extents, where we could have written to the middle of the prealloc extent only, which means the csum would be for the bytenr in the middle of our range and not the front of our range. Fix this by offsetting the new bytenr we are logging to based on the original bytenr the csum was for. With this patch I no longer see the csum errors I was seeing. Thanks, Cc: [email protected] Reported-by: Chris Murphy <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
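The offset fix described in this message can be illustrated with a small userspace model. This is only a sketch in Python, not the kernel code: the function names `relocated_csum_bytenrs` and `naive_csum_bytenrs`, and the representation of checksums as a plain list of byte numbers, are assumptions made for illustration.

```python
def relocated_csum_bytenrs(csum_bytenrs, old_start, new_start):
    """Map each checksum's original bytenr into the relocated extent.

    Each checksum keeps its distance from the start of the old extent
    instead of being assumed to sit at the front of the range.
    """
    return [new_start + (bytenr - old_start) for bytenr in csum_bytenrs]


def naive_csum_bytenrs(csum_bytenrs, new_start, block_size):
    """Hypothetical model of the old assumption: checksums are sequential
    and start at the front of the range."""
    return [new_start + i * block_size for i in range(len(csum_bytenrs))]


# A prealloc extent at 1 MiB where only the third 4 KiB block was written.
old_start, new_start, block = 1 << 20, 8 << 20, 4096
written = [old_start + 2 * block]
print(relocated_csum_bytenrs(written, old_start, new_start))
print(naive_csum_bytenrs(written, new_start, block))
```

In this model the naive mapping places the single checksum at the front of the new range, while the offset-based mapping keeps it two blocks in, which is where the written data actually lives.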
2013-11-11 | Btrfs: don't leak block group on error | Filipe David Borba Manana | 1 | -2/+1
In extent-tree.c:btrfs_write_dirty_block_groups(), if the call to write_one_cache_group() failed, we would return without putting the block group first. Signed-off-by: Filipe David Borba Manana <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-11-11 | Btrfs: fix sync fs to actually wait for all data to be persisted | Filipe David Borba Manana | 1 | -3/+9
Currently the fs sync function (super.c:btrfs_sync_fs()) doesn't wait for delayed work to finish before returning success to the caller. This change fixes this, ensuring that there's no data loss if a power failure happens right after fs sync returns success to the caller and before the next commit happens. Steps to reproduce the data loss issue: $ mkfs.btrfs -f /dev/sdb3 $ mount /dev/sdb3 /mnt/btrfs $ perl -e '$d = ("\x41" x 6001); open($f,">","/mnt/btrfs/foobar"); print $f $d; close($f);' && btrfs fi sync /mnt/btrfs Right after the btrfs fi sync command (a second or 2 for example), power off the machine and reboot it. The file will be empty, as it can be verified after mounting the filesystem and through btrfs-debug-tree: $ btrfs-debug-tree /dev/sdb3 | egrep '\(257 INODE_ITEM 0\) itemoff' -B 3 -A 8 item 3 key (256 DIR_INDEX 2) itemoff 3751 itemsize 36 location key (257 INODE_ITEM 0) type FILE namelen 6 datalen 0 name: foobar item 4 key (257 INODE_ITEM 0) itemoff 3591 itemsize 160 inode generation 7 transid 7 size 0 block group 0 mode 100644 links 1 item 5 key (257 INODE_REF 256) itemoff 3575 itemsize 16 inode ref index 2 namelen 6 name: foobar checksum tree key (CSUM_TREE ROOT_ITEM 0) leaf 29429760 items 0 free space 3995 generation 7 owner 7 fs uuid 6192815c-af2a-4b75-b3db-a959ffb6166e chunk uuid b529c44b-938c-4d3d-910a-013b4700bcae uuid tree key (UUID_TREE ROOT_ITEM 0) After this patch, the data loss no longer happens after a power failure and btrfs-debug-tree shows: $ btrfs-debug-tree /dev/sdb3 | egrep '\(257 INODE_ITEM 0\) itemoff' -B 3 -A 8 item 3 key (256 DIR_INDEX 2) itemoff 3751 itemsize 36 location key (257 INODE_ITEM 0) type FILE namelen 6 datalen 0 name: foobar item 4 key (257 INODE_ITEM 0) itemoff 3591 itemsize 160 inode generation 6 transid 6 size 6001 block group 0 mode 100644 links 1 item 5 key (257 INODE_REF 256) itemoff 3575 itemsize 16 inode ref index 2 namelen 6 name: foobar item 6 key (257 EXTENT_DATA 0) itemoff 3522 itemsize 53 extent data 
disk byte 12845056 nr 8192 extent data offset 0 nr 8192 ram 8192 extent compression 0 checksum tree key (CSUM_TREE ROOT_ITEM 0) Signed-off-by: Filipe David Borba Manana <[email protected]> Reviewed-by: Miao Xie <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-11-11 | Btrfs: fix tracking of orphan inode count | Filipe David Borba Manana | 1 | -5/+8
In inode.c:btrfs_orphan_add() if we failed to insert the orphan item, we would return without decrementing the orphan count that we just incremented before attempting the insertion, leaving the orphan inode count wrong. In inode.c:btrfs_orphan_del(), we were decrementing the inode orphan count if the bit BTRFS_INODE_ORPHAN_META_RESERVED was set, which is logically wrong because it should be decremented if the bit BTRFS_INODE_HAS_ORPHAN_ITEM was set - after all we increment the count when we set the bit BTRFS_INODE_HAS_ORPHAN_ITEM elsewhere. Signed-off-by: Filipe David Borba Manana <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
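The accounting rule in this message (undo the increment when the insertion fails, and decrement on deletion only when an orphan item really exists) can be sketched with a toy model. The class `OrphanTracker` and its `insert_fails` switch are hypothetical names used only for illustration; this is not the btrfs implementation.

```python
class OrphanTracker:
    """Toy model of an orphan counter paired with a fallible insert step."""

    def __init__(self):
        self.orphan_count = 0
        self.items = set()  # inodes that currently have an orphan item

    def add(self, ino, insert_fails=False):
        # Count first, then attempt the insertion, as the message describes.
        self.orphan_count += 1
        if insert_fails:
            # Undo the increment on failure so the count stays accurate.
            self.orphan_count -= 1
            return False
        self.items.add(ino)
        return True

    def delete(self, ino):
        # Decrement only when an orphan item actually exists for this inode.
        if ino in self.items:
            self.items.remove(ino)
            self.orphan_count -= 1


tracker = OrphanTracker()
tracker.add(257)
tracker.add(258, insert_fails=True)
print(tracker.orphan_count)
```

Keying the decrement on the same state that accompanied the increment is what keeps the two sides of the counter symmetric.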
2013-11-11 | Btrfs: export btrfs space shared info to userspace | Liu Bo | 1 | -1/+33
Similar to ocfs2, btrfs also allows extents to be shared by different inodes, and some userspace tools request this kind of 'space shared information'.[1] ocfs2 uses the flag FIEMAP_EXTENT_SHARED, and so does btrfs. [1]: http://thr3ads.net/ocfs2-devel/2010/09/489052-PATCH-3-3-shared-du-using-fiemap-to-figure-up-the-shared-extents-per-file-and-the-footprint-in Reviewed-by: David Sterba <[email protected]> Signed-off-by: Liu Bo <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-11-11 | Btrfs: remove path arg from btrfs_truncate_free_space_cache | Filipe David Borba Manana | 5 | -15/+3
Not used for anything, and removing it avoids caller's need to allocate a path structure. Signed-off-by: Filipe David Borba Manana <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-11-11 | Btrfs: remove duplicated ino cache's inode lookup | Filipe David Borba Manana | 3 | -9/+5
We're doing an unnecessary extra lookup of the ino cache's inode when we already have it (and hold a reference on it) during the process of saving the ino cache contents to disk. Therefore remove this extra lookup. Signed-off-by: Filipe David Borba Manana <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-11-11 | Btrfs: do a full search every time in btrfs_search_old_slot | Josef Bacik | 1 | -2/+6
While running some snapshot aware defrag tests I noticed I was panicking every once in a while in key_search. This is because of the optimization that says if we find a key at slot 0 it will be at slot 0 all the way down the rest of the tree. This isn't the case for btrfs_search_old_slot since it will likely replay changes to a buffer if something has changed since we took our sequence number. So short circuit this optimization by setting prev_cmp to -1 every time we call key_search so we will do our normal binary search. With this patch I am no longer seeing the panics I was seeing before. Thanks, Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-11-11 | Btrfs: add a sanity test for btrfs_split_item | Josef Bacik | 7 | -7/+283
While looking at somebody's corruption I became completely convinced that btrfs_split_item was broken, so I wrote this test to verify that it was working as it was supposed to. Thankfully it appears to be working as intended, so just add this test to make sure nobody breaks it in the future. Thanks, Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-11-11 | btrfs: drop unused parameter from btrfs_item_nr | Ross Kirk | 8 | -32/+31
Remove unused eb parameter from btrfs_item_nr Signed-off-by: Ross Kirk <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-11-11 | Btrfs: don't store NULL byte in symlink extents | Filipe David Borba Manana | 1 | -2/+2
It is not necessary to store the NULL byte in a symlink inline file extent. There's currently no code that requires the NULL byte to be present in the extent. This change also doesn't break file format compatibility nor the send/receive feature. The VFS also doesn't need the NULL byte to be present in the extent, as it reads up to inode->i_size bytes (which already excluded the NULL byte) and sets the NULL byte for us (in fs/namei.c:page_getlink()). So with this change we save 1 byte per symlink file extent (which is always inlined in the btree leaf) without losing backward and forward compatibility. Signed-off-by: Filipe David Borba Manana <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
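The reasoning in this message, that the stored extent can omit the terminator because the reader takes only `i_size` bytes and appends the terminator itself, can be modelled in a few lines. The helpers `store_symlink` and `read_symlink` are hypothetical names for this sketch and do not correspond to kernel functions.

```python
def store_symlink(target: str) -> bytes:
    """Return the inline extent payload: the target bytes, no trailing NUL."""
    return target.encode("utf-8")


def read_symlink(extent: bytes, i_size: int) -> bytes:
    """Read up to i_size bytes and terminate the result on the reader side."""
    return extent[:i_size] + b"\x00"


target = "/mnt/btrfs/foobar"
extent = store_symlink(target)
i_size = len(target.encode("utf-8"))  # i_size already excludes the terminator
print(len(extent), read_symlink(extent, i_size))
```

Because the reader never looks past `i_size`, an extent written with or without the extra byte yields the same terminated string in this model, which is the compatibility argument the message makes.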
2013-11-11 | Btrfs: eliminate the exceptional root_tree refs=0 | Stefan Behrens | 3 | -15/+10
The fact that btrfs_root_refs() returned 0 for the tree_root caused bugs in the past, therefore it is set to 1 with this patch and (hopefully) all affected code is adapted to this change. I verified this change by temporarily adding WARN_ON() checks everywhere where btrfs_root_refs() is used, checking whether the logic of the code is changed by btrfs_root_refs() returning 1 instead of 0 for root->root_key.objectid == BTRFS_ROOT_TREE_OBJECTID. With these added checks, I ran the xfstests './check -g auto'. The two roots chunk_root and log_root_tree that are only referenced by the superblock and the log_roots below the log_root_tree still have btrfs_root_refs() == 0, only the tree_root is changed. Signed-off-by: Stefan Behrens <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-11-03 | Linux 3.12 | Linus Torvalds | 1 | -1/+1
2013-11-03 | Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus | Linus Torvalds | 3 | -7/+8
Pull MIPS fixes from Ralf Baechle: "Three fixes across arch/mips with the most complex one being the GIC interrupt fix - at nine lines still not monster. I'm confident this are the final MIPS patches even if there should go for an rc8" * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: MIPS: ralink: fix return value check in rt_timer_probe() MIPS: malta: Fix GIC interrupt offsets MIPS: Perf: Fix 74K cache map
2013-11-03 | ipc, msg: forbid negative values for "msg{max,mnb,mni}" | Mathias Krause | 2 | -11/+15
Negative message lengths make no sense -- so don't do negative queue lengths or identifier counts. Prevent them from getting negative. Also change the underlying data types to be unsigned to avoid hairy surprises with sign extensions in cases where those variables get evaluated in unsigned expressions with bigger data types, e.g size_t. In case a user still wants to have "unlimited" sizes she could just use INT_MAX instead. Signed-off-by: Mathias Krause <[email protected]> Cc: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
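The "hairy surprise" mentioned in this message, a negative signed limit being evaluated in an unsigned expression of a wider type, can be reproduced with plain integer arithmetic. The sketch below assumes a 64-bit size_t and models the C conversion with a modulo; `as_size_t` and `message_fits` are hypothetical helpers, not kernel functions, and the comparison shown is a generic example rather than the actual ipc code path.

```python
SIZE_T_BITS = 64  # assumption: a 64-bit size_t


def as_size_t(value: int) -> int:
    """Model converting a signed int to size_t (two's-complement wraparound)."""
    return value % (1 << SIZE_T_BITS)


def message_fits(msg_len: int, msgmax: int) -> bool:
    """Model the C comparison (size_t)msg_len <= (size_t)msgmax."""
    return as_size_t(msg_len) <= as_size_t(msgmax)


INT_MAX = 2**31 - 1
print(as_size_t(-1))                   # a negative limit becomes a huge value
print(message_fits(1 << 40, -1))       # so the limit no longer limits anything
print(message_fits(1 << 40, INT_MAX))  # an explicit INT_MAX is still enforced
```

Rejecting negative values up front and using INT_MAX for "unlimited", as the message suggests, keeps the limit meaningful in both signed and unsigned contexts.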
2013-11-02 | Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux | Linus Torvalds | 2 | -1/+13
Pull ARM kallsyms fix from Rusty Russell: "Last minute perf unbreakage for ARM modules; spent a day in linux-next" * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: scripts/kallsyms: filter symbols not in kernel address space
2013-11-02 | ARC: Incorrect mm reference used in vmalloc fault handler | Vineet Gupta | 1 | -3/+3
A vmalloc fault needs to sync up PGD/PTE entry from init_mm to current task's "active_mm". ARC vmalloc fault handler however was using mm. A vmalloc fault for non user task context (actually pre-userland, from init thread's open for /dev/console) caused the handler to deref NULL mm (for mm->pgd). The reasons it worked so far are amazing: 1. By default (!SMP), vmalloc fault handler uses a cached value of PGD. In SMP that MMU register is repurposed hence need for mm pointer deref. 2. In pre-3.12 SMP kernel, the problem triggering vmalloc didn't exist in pre-userland code path - it was introduced with commit 20bafb3d23d108bc "n_tty: Move buffers into n_tty_data" Signed-off-by: Vineet Gupta <[email protected]> Cc: Gilad Ben-Yossef <[email protected]> Cc: Noam Camus <[email protected]> Cc: [email protected] #3.10 and 3.11 Cc: Peter Hurley <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-11-02 | scripts/kallsyms: filter symbols not in kernel address space | Ming Lei | 2 | -1/+13
This patch uses CONFIG_PAGE_OFFSET to filter out symbols which are not in the kernel address space, because these symbols generally exist only for code generation and can't be run in kernel mode, so we needn't keep them in /proc/kallsyms. For example, on ARM there are some symbols which may be linked in a relocatable code section, and then perf can't parse symbols from /proc/kallsyms any more; this patch fixes the problem (introduced by b9b32bf70f2fb710b07c94e13afbc729afe221da). Cc: Russell King <[email protected]> Cc: [email protected] Cc: Michal Marek <[email protected]> Signed-off-by: Ming Lei <[email protected]> Signed-off-by: Rusty Russell <[email protected]> Cc: [email protected]
2013-11-01 | Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip | Linus Torvalds | 4 | -14/+39
Pull perf fixes from Ingo Molnar: "Two fixes: - Fix 'NMI handler took too long to run' false positives [ Genuine NMI overhead speedups will come for v3.13, this commit only fixes a measurement bug ] - Fix perf ring-buffer missed barrier causing (rare) ring-buffer data corruption on ppc64" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86: Fix NMI measurements perf: Fix perf ring buffer memory ordering
2013-11-01 | Merge tag 'usb-3.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb | Linus Torvalds | 4 | -268/+66
Pull USB fixes from Greg KH: "Here is a set of patches that revert all of the changes done to the pl2303 USB serial driver in the 3.12-rc timeframe, as it turns out they break some devices that work just fine on 3.11. As it's not a good idea to break working systems, drop them all and they will be reworked for future kernel versions such that there is no breakage. I've also included a MAINTAINERS update for the USB serial subsystem and a new device id for the ftdi_sio driver as well" * tag 'usb-3.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: USB: serial: ftdi_sio: add id for Z3X Box device USB: Maintainers change for usb serial drivers Revert "USB: pl2303: restrict the divisor based baud rate encoding method to the "HX" chip type" Revert "usb: pl2303: fix+improve the divsor based baud rate encoding method" Revert "usb: pl2303: do not round to the next nearest standard baud rate for the divisor based baud rate encoding method" Revert "usb: pl2303: remove 500000 baud from the list of standard baud rates" Revert "usb: pl2303: move the two baud rate encoding methods to separate functions" Revert "usb: pl2303: increase the allowed baud rate range for the divisor based encoding method" Revert "usb: pl2303: also use the divisor based baud rate encoding method for baud rates < 115200 with HX chips" Revert "usb: pl2303: add two comments concerning the supported baud rates with HX chips" Revert "pl2303: simplify the else-if contruct for type_1 chips in pl2303_startup()" Revert "pl2303: improve the chip type information output on startup" Revert "pl2303: improve the chip type detection/distinction" Revert "USB: pl2303: distinguish between original and cloned HX chips"
2013-11-01 | Merge tag 'sound-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound | Linus Torvalds | 4 | -1/+9
Pull more sound fixes from Takashi Iwai: "The fixes for random bugs that have been reported lately in the game: a few fixes in ASoC dpam and wm_hubs bugs spotted by Coverity, a one-liner HD-audio fixup, and a fix for Oops with DPCM. They are not so critically urgent bugs, but all small and safe" * tag 'sound-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: fix oops in snd_pcm_info() caused by ASoC DPCM ASoC: wm_hubs: Add missing break in hp_supply_event() ALSA: hda - Add a fixup for ASUS N76VZ ASoC: dapm: Return -ENOMEM in snd_soc_dapm_new_dai_widgets() ASoC: dapm: Fix source list debugfs outputs
2013-11-01 | Merge tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mturquette/linux | Linus Torvalds | 4 | -4/+25
Pull clock subsystem fixes from Mike Turquette. * tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mturquette/linux: clk: fixup argument order when setting VCO parameters clk: socfpga: Fix incorrect sdmmc clock name clk: armada-370: fix tclk frequencies clk: nomadik: set all timers to use 2.4 MHz TIMCLK
2013-11-01 | memcg: remove incorrect underflow check | Greg Thelen | 1 | -1/+0
When a memcg is deleted mem_cgroup_reparent_charges() moves charged memory to the parent memcg. As of v3.11-9444-g3ea67d0 "memcg: add per cgroup writeback pages accounting" there's a bad pointer read. The goal was to check for counter underflow. The counter is a per cpu counter and there are two problems with the code: (1) per cpu access function isn't used, instead a naked pointer is used which easily causes oops. (2) the check doesn't sum all cpus Test: $ cd /sys/fs/cgroup/memory $ mkdir x $ echo 3 > /proc/sys/vm/drop_caches $ (echo $BASHPID >> x/tasks && exec cat) & [1] 7154 $ grep ^mapped x/memory.stat mapped_file 53248 $ echo 7154 > tasks $ rmdir x <OOPS> The fix is to remove the check. It's currently dangerous and isn't worth fixing it to use something expensive, such as percpu_counter_sum(), for each reparented page. __this_cpu_read() isn't enough to fix this because there are no guarantees about the current cpu's count. The only guarantee is that the sum of all per-cpu counters is >= nr_pages. Fixes: 3ea67d06e467 ("memcg: add per cgroup writeback pages accounting") Reported-and-tested-by: Flavio Leitner <[email protected]> Signed-off-by: Greg Thelen <[email protected]> Reviewed-by: Sha Zhengju <[email protected]> Acked-by: Johannes Weiner <[email protected]> Signed-off-by: Hugh Dickins <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
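Why a single CPU's slot cannot be used for an underflow check is easier to see with a toy per-CPU counter. The class `PerCpuCounter` below is a hypothetical userspace model written for illustration; it is not the kernel's per-cpu API.

```python
class PerCpuCounter:
    """Toy per-CPU counter: each CPU updates only its own slot."""

    def __init__(self, nr_cpus):
        self.slots = [0] * nr_cpus

    def add(self, cpu, delta):
        self.slots[cpu] += delta

    def read_one(self, cpu):
        # A single slot says nothing reliable about the logical value.
        return self.slots[cpu]

    def total(self):
        # Only the sum over all CPUs is meaningful.
        return sum(self.slots)


counter = PerCpuCounter(nr_cpus=2)
counter.add(cpu=0, delta=13)   # pages charged while running on CPU 0
counter.add(cpu=1, delta=-10)  # pages uncharged later on CPU 1
print(counter.read_one(1), counter.total())
```

In this model one slot is negative while the logical total is still positive, so a check against the local slot would report an underflow that has not happened; a correct check would have to sum every slot, which is the expensive operation the message declines to do per reparented page.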
2013-11-01 | USB: serial: ftdi_sio: add id for Z3X Box device | Алексей Крамаренко | 2 | -0/+7
Custom VID/PID for Z3X Box device, popular tool for cellphone flashing. Signed-off-by: Alexey E. Kramarenko <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2013-11-01 | USB: Maintainers change for usb serial drivers | Greg KH | 1 | -50/+3
Johan has been conned^Wgracious in accepting the maintainership of the USB serial drivers, especially as he's been doing all of the real work for the past few years. At the same time, remove a bunch of old entries for USB serial drivers that don't make sense anymore, given that the developers are no longer around, and individual driver maintainerships for tiny things like this is pretty pointless. Acked-by: Johan Hovold <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2013-11-01 | Revert "USB: pl2303: restrict the divisor based baud rate encoding method to the "HX" chip type" | Greg Kroah-Hartman | 1 | -5/+1
This reverts commit b8bdad608213caffa081a97d2e937e5fe08c4046. Revert all of the pl2303 changes that went into 3.12-rc1 and -rc2 as they cause regressions on some versions of the chip. This will all be revisited for later kernel versions when we can figure out how to handle this in a way that does not break working devices. Reported-by: Mika Westerberg <[email protected]> Cc: Frank Schäfer <[email protected]> Acked-by: Johan Hovold <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2013-11-01 | Revert "usb: pl2303: fix+improve the divsor based baud rate encoding method" | Greg Kroah-Hartman | 1 | -52/+10
This reverts commit 57ce61aad748ceaa08c859da04043ad7dae7c15e. Revert all of the pl2303 changes that went into 3.12-rc1 and -rc2 as they cause regressions on some versions of the chip. This will all be revisited for later kernel versions when we can figure out how to handle this in a way that does not break working devices. Reported-by: Mika Westerberg <[email protected]> Cc: Frank Schäfer <[email protected]> Acked-by: Johan Hovold <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2013-11-01 | Revert "usb: pl2303: do not round to the next nearest standard baud rate for the divisor based baud rate encoding method" | Greg Kroah-Hartman | 1 | -37/+28
This reverts commit 75417d9f99f89ab241de69d7db15af5842b488c4. Revert all of the pl2303 changes that went into 3.12-rc1 and -rc2 as they cause regressions on some versions of the chip. This will all be revisited for later kernel versions when we can figure out how to handle this in a way that does not break working devices. Reported-by: Mika Westerberg <[email protected]> Cc: Frank Schäfer <[email protected]> Acked-by: Johan Hovold <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2013-11-01 | Revert "usb: pl2303: remove 500000 baud from the list of standard baud rates" | Greg Kroah-Hartman | 1 | -2/+2
This reverts commit b9208c721ce736125fe58d398319513a27850fd8. Revert all of the pl2303 changes that went into 3.12-rc1 and -rc2 as they cause regressions on some versions of the chip. This will all be revisited for later kernel versions when we can figure out how to handle this in a way that does not break working devices. Reported-by: Mika Westerberg <[email protected]> Cc: Frank Schäfer <[email protected]> Acked-by: Johan Hovold <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2013-11-01 | Revert "usb: pl2303: move the two baud rate encoding methods to separate functions" | Greg Kroah-Hartman | 1 | -114/+101
This reverts commit e917ba01d69ad705a4cd6a6c77538f55d84f5907. Revert all of the pl2303 changes that went into 3.12-rc1 and -rc2 as they cause regressions on some versions of the chip. This will all be revisited for later kernel versions when we can figure out how to handle this in a way that does not break working devices. Reported-by: Mika Westerberg <[email protected]> Cc: Frank Schäfer <[email protected]> Acked-by: Johan Hovold <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2013-11-01 | Revert "usb: pl2303: increase the allowed baud rate range for the divisor based encoding method" | Greg Kroah-Hartman | 1 | -12/+4
This reverts commit b5c16c6a031c52cc4b7dda6c3de46462fbc92eab. Revert all of the pl2303 changes that went into 3.12-rc1 and -rc2 as they cause regressions on some versions of the chip. This will all be revisited for later kernel versions when we can figure out how to handle this in a way that does not break working devices. Reported-by: Mika Westerberg <[email protected]> Cc: Frank Schäfer <[email protected]> Acked-by: Johan Hovold <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2013-11-01 | Revert "usb: pl2303: also use the divisor based baud rate encoding method for baud rates < 115200 with HX chips" | Greg Kroah-Hartman | 1 | -1/+1
This reverts commit 61fa8d694b8547894b57ea0d99d0120a58f6ebf8. Revert all of the pl2303 changes that went into 3.12-rc1 and -rc2 as they cause regressions on some versions of the chip. This will all be revisited for later kernel versions when we can figure out how to handle this in a way that does not break working devices. Reported-by: Mika Westerberg <[email protected]> Cc: Frank Schäfer <[email protected]> Acked-by: Johan Hovold <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2013-11-01 | Revert "usb: pl2303: add two comments concerning the supported baud rates with HX chips" | Greg Kroah-Hartman | 1 | -12/+0
This reverts commit c23bda365dfbf56aa4d6d4a97f83136c36050e01. Revert all of the pl2303 changes that went into 3.12-rc1 and -rc2 as they cause regressions on some versions of the chip. This will all be revisited for later kernel versions when we can figure out how to handle this in a way that does not break working devices. Reported-by: Mika Westerberg <[email protected]> Cc: Frank Schäfer <[email protected]> Acked-by: Johan Hovold <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2013-11-01 | Revert "pl2303: simplify the else-if contruct for type_1 chips in pl2303_startup()" | Greg Kroah-Hartman | 1 | -2/+3
This reverts commit 73b583af597542329e6adae44524da6f27afed62. Revert all of the pl2303 changes that went into 3.12-rc1 and -rc2 as they cause regressions on some versions of the chip. This will all be revisited for later kernel versions when we can figure out how to handle this in a way that does not break working devices. Reported-by: Mika Westerberg <[email protected]> Cc: Frank Schäfer <[email protected]> Acked-by: Johan Hovold <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2013-11-01 | Revert "pl2303: improve the chip type information output on startup" | Greg Kroah-Hartman | 1 | -10/+5
This reverts commit a77a8c23e4db9fb1f776147eda0d85117359c700. Revert all of the pl2303 changes that went into 3.12-rc1 and -rc2 as they cause regressions on some versions of the chip. This will all be revisited for later kernel versions when we can figure out how to handle this in a way that does not break working devices. Reported-by: Mika Westerberg <[email protected]> Cc: Frank Schäfer <[email protected]> Acked-by: Johan Hovold <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2013-11-01Revert "pl2303: improve the chip type detection/distinction"Greg Kroah-Hartman1-72/+23
This reverts commit 034d1527adebd302115c87ef343497a889638275. Revert all of the pl2303 changes that went into 3.12-rc1 and -rc2 as they cause regressions on some versions of the chip. This will all be revisited for later kernel versions when we can figure out how to handle this in a way that does not break working devices. Reported-by: Mika Westerberg <[email protected]> Cc: Frank Schäfer <[email protected]> Acked-by: Johan Hovold <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2013-11-01Revert "USB: pl2303: distinguish between original and cloned HX chips"Greg Kroah-Hartman1-32/+11
This reverts commit 7d26a78f62ff4fb08bc5ba740a8af4aa7ac67da4. Revert all of the pl2303 changes that went into 3.12-rc1 and -rc2 as they cause regressions on some versions of the chip. This will all be revisited for later kernel versions when we can figure out how to handle this in a way that does not break working devices. Reported-by: Mika Westerberg <[email protected]> Cc: Frank Schäfer <[email protected]> Acked-by: Johan Hovold <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2013-10-31Merge branch 'akpm' (fixes from Andrew Morton)Linus Torvalds2-30/+27
Merge four more fixes from Andrew Morton. * emailed patches from Andrew Morton <[email protected]>: lib/scatterlist.c: don't flush_kernel_dcache_page on slab page mm: memcg: fix test for child groups mm: memcg: lockdep annotation for memcg OOM lock mm: memcg: use proper memcg in limit bypass
2013-10-31lib/scatterlist.c: don't flush_kernel_dcache_page on slab pageMing Lei1-1/+2
Commit b1adaf65ba03 ("[SCSI] block: add sg buffer copy helper functions") introduces two sg buffer copy helpers, and calls flush_kernel_dcache_page() on pages in the SG list after these pages are written to. Unfortunately, the commit may introduce a potential bug: - Before sending some SCSI commands, a kmalloc() buffer may be passed to the block layer, so flush_kernel_dcache_page() can end up seeing a slab page - According to cachetlb.txt, flush_kernel_dcache_page() is only called on "a user page", which surely can't be a slab page. - An architecture's implementation of flush_kernel_dcache_page() may use page mapping information as an optimization, so page_mapping() will see the slab page and VM_BUG_ON() is triggered. Aaro Koskinen reported the bug on ARM/kirkwood when DEBUG_VM is enabled, and this patch fixes the bug by adding a test of '!PageSlab(miter->page)' before calling flush_kernel_dcache_page(). Signed-off-by: Ming Lei <[email protected]> Reported-by: Aaro Koskinen <[email protected]> Tested-by: Simon Baatz <[email protected]> Cc: Russell King - ARM Linux <[email protected]> Cc: Will Deacon <[email protected]> Cc: Aaro Koskinen <[email protected]> Acked-by: Catalin Marinas <[email protected]> Cc: FUJITA Tomonori <[email protected]> Cc: Tejun Heo <[email protected]> Cc: "James E.J. Bottomley" <[email protected]> Cc: Jens Axboe <[email protected]> Cc: <[email protected]> [3.2+] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
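The guard described in this message can be illustrated with a small user-space model. This is only a sketch, not the kernel code: `struct model_page`, `model_page_is_slab()`, `model_flush_kernel_dcache_page()` and `model_miter_stop()` are hypothetical stand-ins for `struct page`, `PageSlab()`, `flush_kernel_dcache_page()` and the sg mapping iterator teardown, and the `wrote_to_page` flag is an assumption standing in for "these pages are written to".

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct page: only the slab flag is modeled,
 * plus a counter so the effect of the guard can be observed. */
struct model_page {
    bool is_slab;
    int flush_count;
};

/* Stand-in for PageSlab(): true if the page belongs to the slab allocator. */
static bool model_page_is_slab(const struct model_page *page)
{
    return page->is_slab;
}

/* Stand-in for flush_kernel_dcache_page(): here it only counts calls. */
static void model_flush_kernel_dcache_page(struct model_page *page)
{
    page->flush_count++;
}

/* Flush only if the page was written to AND it is not a slab page,
 * mirroring the '!PageSlab(miter->page)' test named in the message. */
static void model_miter_stop(struct model_page *page, bool wrote_to_page)
{
    if (wrote_to_page && !model_page_is_slab(page))
        model_flush_kernel_dcache_page(page);
}
```

With this guard, a kmalloc()-backed (slab) page never reaches the flush routine, so an architecture-specific implementation that inspects page mapping information is never handed a slab page.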
2013-10-31mm: memcg: fix test for child groupsJohannes Weiner1-24/+11
When memcg code needs to know whether any given memcg has children, it uses the cgroup child iteration primitives and returns true/false depending on whether the iteration loop is executed at least once or not. Because a cgroup's list of children is RCU protected, these primitives require the RCU read-lock to be held, which is not the case for all memcg callers. This results in the following splat when e.g. enabling hierarchy mode: WARNING: CPU: 3 PID: 1 at kernel/cgroup.c:3043 css_next_child+0xa3/0x160() CPU: 3 PID: 1 Comm: systemd Not tainted 3.12.0-rc5-00117-g83f11a9-dirty #18 Hardware name: LENOVO 3680B56/3680B56, BIOS 6QET69WW (1.39 ) 04/26/2012 Call Trace: dump_stack+0x54/0x74 warn_slowpath_common+0x78/0xa0 warn_slowpath_null+0x1a/0x20 css_next_child+0xa3/0x160 mem_cgroup_hierarchy_write+0x5b/0xa0 cgroup_file_write+0x108/0x2a0 vfs_write+0xbd/0x1e0 SyS_write+0x4c/0xa0 system_call_fastpath+0x16/0x1b In the memcg case, we only care about children when we are attempting to modify inheritable attributes interactively. Racing with deletion could mean a spurious -EBUSY, no problem. Racing with addition is handled just fine as well through the memcg_create_mutex: if the child group is not on the list after the mutex is acquired, it won't be initialized from the parent's attributes until after the unlock. Signed-off-by: Johannes Weiner <[email protected]> Acked-by: Michal Hocko <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
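The message does not spell out the replacement test, so the following is only a sketch of one possible shape for it, written as a user-space model: instead of walking the children with an iteration primitive, check whether the child list is empty. All names here (`model_list_head`, `model_group`, `model_has_children()`) are hypothetical, and the assumption that a plain emptiness check suffices rests on the message's argument that races with deletion and addition are harmless for these callers.

```c
#include <stdbool.h>

/* Hypothetical circular doubly linked list, modeled on the common
 * kernel idiom where an empty list head points at itself. */
struct model_list_head {
    struct model_list_head *next;
    struct model_list_head *prev;
};

static void model_list_init(struct model_list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert entry right after head. */
static void model_list_add(struct model_list_head *entry,
                           struct model_list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Unlink entry and leave it as an empty list of its own. */
static void model_list_del(struct model_list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    model_list_init(entry);
}

/* Hypothetical group: a list of children plus a link into the parent's list. */
struct model_group {
    struct model_list_head children;
    struct model_list_head sibling;
};

/* "Has children" reduces to a single pointer comparison; no iteration
 * over the list is needed to answer a yes/no question. */
static bool model_has_children(const struct model_group *group)
{
    return group->children.next != &group->children;
}
```

A stale answer from such a check corresponds to the cases the message calls acceptable: a child removed concurrently yields a spurious -EBUSY, and a child added concurrently is serialized by memcg_create_mutex.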
2013-10-31mm: memcg: lockdep annotation for memcg OOM lockJohannes Weiner1-1/+10
The memcg OOM lock is a mutex-type lock that is open-coded due to memcg's special needs. Add annotations for lockdep coverage. Signed-off-by: Johannes Weiner <[email protected]> Cc: Michal Hocko <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>