aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2011-02-25mm: vmscan: stop reclaim/compaction earlier due to insufficient progress if ↵Mel Gorman1-10/+22
!__GFP_REPEAT should_continue_reclaim() for reclaim/compaction allows scanning to continue even if pages are not being reclaimed until the full list is scanned. In terms of allocation success, this makes sense but potentially it introduces unwanted latency for high-order allocations such as transparent hugepages and network jumbo frames that would prefer to fail the allocation attempt and fallback to order-0 pages. Worse, there is a potential that the full LRU scan will clear all the young bits, distort page aging information and potentially push pages into swap that would have otherwise remained resident. This patch will stop reclaim/compaction if no pages were reclaimed in the last SWAP_CLUSTER_MAX pages that were considered. For allocations such as hugetlbfs that use __GFP_REPEAT and have fewer fallback options, the full LRU list may still be scanned. Order-0 allocation should not be affected because RECLAIM_MODE_COMPACTION is not set so the following avoids the gfp_mask being examined: if (!(sc->reclaim_mode & RECLAIM_MODE_COMPACTION)) return false; A tool was developed based on ftrace that tracked the latency of high-order allocations while transparent hugepage support was enabled and three benchmarks were run. The "fix-infinite" figures are 2.6.38-rc4 with Johannes's patch "vmscan: fix zone shrinking exit when scan work is done" applied. STREAM Highorder Allocation Latency Statistics fix-infinite break-early 1 :: Count 10298 10229 1 :: Min 0.4560 0.4640 1 :: Mean 1.0589 1.0183 1 :: Max 14.5990 11.7510 1 :: Stddev 0.5208 0.4719 2 :: Count 2 1 2 :: Min 1.8610 3.7240 2 :: Mean 3.4325 3.7240 2 :: Max 5.0040 3.7240 2 :: Stddev 1.5715 0.0000 9 :: Count 111696 111694 9 :: Min 0.5230 0.4110 9 :: Mean 10.5831 10.5718 9 :: Max 38.4480 43.2900 9 :: Stddev 1.1147 1.1325 Mean time for order-1 allocations is reduced. order-2 looks increased but with so few allocations, it's not particularly significant. THP mean allocation latency is also reduced. That said, allocation time varies so significantly that the reductions are within noise. Max allocation time is reduced by a significant amount for low-order allocations but reduced for THP allocations which presumably are now breaking before reclaim has done enough work. SysBench Highorder Allocation Latency Statistics fix-infinite break-early 1 :: Count 15745 15677 1 :: Min 0.4250 0.4550 1 :: Mean 1.1023 1.0810 1 :: Max 14.4590 10.8220 1 :: Stddev 0.5117 0.5100 2 :: Count 1 1 2 :: Min 3.0040 2.1530 2 :: Mean 3.0040 2.1530 2 :: Max 3.0040 2.1530 2 :: Stddev 0.0000 0.0000 9 :: Count 2017 1931 9 :: Min 0.4980 0.7480 9 :: Mean 10.4717 10.3840 9 :: Max 24.9460 26.2500 9 :: Stddev 1.1726 1.1966 Again, mean time for order-1 allocations is reduced while order-2 allocations are too few to draw conclusions from. The mean time for THP allocations is also slightly reduced albeit the reductions are within varianes. Once again, our maximum allocation time is significantly reduced for low-order allocations and slightly increased for THP allocations. Anon stream mmap reference Highorder Allocation Latency Statistics 1 :: Count 1376 1790 1 :: Min 0.4940 0.5010 1 :: Mean 1.0289 0.9732 1 :: Max 6.2670 4.2540 1 :: Stddev 0.4142 0.2785 2 :: Count 1 - 2 :: Min 1.9060 - 2 :: Mean 1.9060 - 2 :: Max 1.9060 - 2 :: Stddev 0.0000 - 9 :: Count 11266 11257 9 :: Min 0.4990 0.4940 9 :: Mean 27250.4669 24256.1919 9 :: Max 11439211.0000 6008885.0000 9 :: Stddev 226427.4624 186298.1430 This benchmark creates one thread per CPU which references an amount of anonymous memory 1.5 times the size of physical RAM. This pounds swap quite heavily and is intended to exercise THP a bit. Mean allocation time for order-1 is reduced as before. It's also reduced for THP allocations but the variations here are pretty massive due to swap. As before, maximum allocation times are significantly reduced. Overall, the patch reduces the mean and maximum allocation latencies for the smaller high-order allocations. This was with Slab configured so it would be expected to be more significant with Slub which uses these size allocations more aggressively. The mean allocation times for THP allocations are also slightly reduced. The maximum latency was slightly increased as predicted by the comments due to reclaim/compaction breaking early. However, workloads care more about the latency of lower-order allocations than THP so it's an acceptable trade-off. Signed-off-by: Mel Gorman <[email protected]> Acked-by: Andrea Arcangeli <[email protected]> Acked-by: Johannes Weiner <[email protected]> Reviewed-by: Minchan Kim <[email protected]> Acked-by: Andrea Arcangeli <[email protected]> Acked-by: Rik van Riel <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Kent Overstreet <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-02-25drivers/nfc/pn544.c: add missing regulatorMatti J. Aaltonen1-1/+3
The regulator framework is used for power management. The regulators are only named in the driver code, the actual control stuff is in the board file for each architecture or use case. The PN544 chip has three regulators that can be controlled or not - depending on the architecture where the chip is being used. So some of the regulators may not be controllable. In our current case the third regulator, which was missing from the code, went unnoticed because we didn't need to control it. To be as general as possible - in this respect - the driver needs to list all regulators. Then the board file can be used to actually set the usage. Signed-off-by: Matti J. Aaltonen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-02-25drivers/nfc/Kconfig: use full form of the NFC acronymMatti J. Aaltonen1-1/+1
Spell out the NFC acronym when it's shown for the first time. Signed-off-by: Matti J. Aaltonen <[email protected]> Acked-by: Wolfram Sang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-02-25swiotlb: fix wrong panicFUJITA Tomonori1-2/+4
swiotlb's map_page wrongly calls panic() when it can't find a buffer fit for device's dma mask. It should return an error instead. Devices with an odd dma mask (i.e. under 4G) like b44 network card hit this bug (the system crashes): http://marc.info/?l=linux-kernel&m=129648943830106&w=2 If swiotlb returns an error, b44 driver can use the own bouncing mechanism. Reported-by: Chuck Ebbert <[email protected]> Signed-off-by: FUJITA Tomonori <[email protected]> Tested-by: Arkadiusz Miskiewicz <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-02-25MAINTAINERS: add Chinese documentation maintainerHarry Wei1-0/+7
I have translated some kernel documentation so I wish to maintain the Chinese documentation in our kernel directories. Signed-off-by: Harry Wei <[email protected]> Cc: Joe Perches <[email protected]> Cc: Greg KH <[email protected]> Cc: Randy Dunlap <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-02-25mm: grab rcu read lock in move_pages()Greg Thelen1-3/+3
The move_pages() usage of find_task_by_vpid() requires rcu_read_lock() to prevent free_pid() from reclaiming the pid. Without this patch, RCU warnings are printed in v2.6.38-rc4 move_pages() with: CONFIG_LOCKUP_DETECTOR=y CONFIG_PREEMPT=y CONFIG_LOCKDEP=y CONFIG_PROVE_LOCKING=y CONFIG_PROVE_RCU=y Previously, migrate_pages() went through a similar transformation replacing usage of tasklist_lock with rcu read lock: commit 55cfaa3cbdd29c4919ecb5fb8965c310f357e48c Author: Zeng Zhaoming <[email protected]> Date: Thu Dec 2 14:31:13 2010 -0800 mm/mempolicy.c: add rcu read lock to protect pid structure commit 1e50df39f6e2c3a4a3394df62baa8a213df16c54 Author: KOSAKI Motohiro <[email protected]> Date: Thu Jan 13 15:46:14 2011 -0800 mempolicy: remove tasklist_lock from migrate_pages Signed-off-by: Greg Thelen <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Rik van Riel <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Cc: "Paul E. McKenney" <[email protected]> Cc: Tetsuo Handa <[email protected]> Cc: Sergey Senozhatsky <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Zeng Zhaoming <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-02-25epoll: prevent creating circular epoll structuresDavide Libenzi1-0/+95
In several places, an epoll fd can call another file's ->f_op->poll() method with ep->mtx held. This is in general unsafe, because that other file could itself be an epoll fd that contains the original epoll fd. The code defends against this possibility in its own ->poll() method using ep_call_nested, but there are several other unsafe calls to ->poll elsewhere that can be made to deadlock. For example, the following simple program causes the call in ep_insert recursively call the original fd's ->poll, leading to deadlock: #include <unistd.h> #include <sys/epoll.h> int main(void) { int e1, e2, p[2]; struct epoll_event evt = { .events = EPOLLIN }; e1 = epoll_create(1); e2 = epoll_create(2); pipe(p); epoll_ctl(e2, EPOLL_CTL_ADD, e1, &evt); epoll_ctl(e1, EPOLL_CTL_ADD, p[0], &evt); write(p[1], p, sizeof p); epoll_ctl(e1, EPOLL_CTL_ADD, e2, &evt); return 0; } On insertion, check whether the inserted file is itself a struct epoll, and if so, do a recursive walk to detect whether inserting this file would create a loop of epoll structures, which could lead to deadlock. [[email protected]: Use epmutex to serialize concurrent inserts] Signed-off-by: Davide Libenzi <[email protected]> Signed-off-by: Nelson Elhage <[email protected]> Reported-by: Nelson Elhage <[email protected]> Tested-by: Nelson Elhage <[email protected]> Cc: <[email protected]> [2.6.34+, possibly earlier] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-02-25Merge branch 'for-linus' of ↵Linus Torvalds2-1/+2
git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6: regulator, mc13xxx: Remove pointless test for unsigned less than zero regulator: Fix warning with CONFIG_BUG disabled
2011-02-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstableLinus Torvalds10-57/+282
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: Btrfs: fix fiemap bugs with delalloc Btrfs: set FMODE_EXCL in btrfs_device->mode Btrfs: make btrfs_rm_device() fail gracefully Btrfs: Avoid accessing unmapped kernel address Btrfs: Fix BTRFS_IOC_SUBVOL_SETFLAGS ioctl Btrfs: allow balance to explicitly allocate chunks as it relocates Btrfs: put ENOSPC debugging under a mount option
2011-02-25Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds5-14/+27
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86 quirk: Fix polarity for IRQ0 pin2 override on SB800 systems x86/mrst: Fix apb timer rating when lapic timer is used x86: Fix reboot problem on VersaLogic Menlow boards
2011-02-25RTC: fix typo in drivers/rtc/rtc-at91sam9.cJelle Martijn Kok1-1/+1
The member of the rtc_class_ops struct is called alarm_irq_enable and not alarm_irq_enabled CC: Thomas Gleixner <[email protected]> Signed-off-by: Jelle Martijn Kok <[email protected]> Signed-off-by: John Stultz <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-02-25Merge branch 'patches_for_2.6.38rc' of git://git.pwsan.com/linux-2.6 into ↵Tony Lindgren974-8137/+12677
devel-fixes
2011-02-25omap4: prcm: Fix the CPUx clockdomain offsetsSantosh Shilimkar1-2/+2
CPU0 and CPU1 clockdomain is at the offset of 0x18 from the LPRM base. The header file has set it wrongly to 0x0. Offset 0x0 is for CPUx power domain control register Fix the same. The autogen scripts is fixed thanks to Benoit Cousson With the old value, the clockdomain code would access the *_PWRSTCTRL.POWERSTATE field when it thought it was accessing the *_CLKSTCTRL.CLKTRCTRL field. In the worst case, this could cause system power management to behave incorrectly. Signed-off-by: Santosh Shilimkar <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Rajendra Nayak <[email protected]> Cc: Benoit Cousson <[email protected]> [[email protected]: added second paragraph to commit message] Signed-off-by: Paul Walmsley <[email protected]>
2011-02-25Merge branch 'usb-linus' of ↵Linus Torvalds7-45/+49
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6 * 'usb-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: usb: musb: core: set has_tt flag USB: xhci: mark local functions as static USB: xhci: fix couple sparse annotations USB: xhci: rework xhci_print_ir_set() to get ir set from xhci itself USB: Reset USB 3.0 devices on (re)discovery xhci: Fix an error in count_sg_trbs_needed() xhci: Fix errors in the running total calculations in the TRB math xhci: Clarify some expressions in the TRB math xhci: Avoid BUG() in interrupt context
2011-02-25Merge branch 'for-linus' of git://neil.brown.name/mdLinus Torvalds14-24/+56
* 'for-linus' of git://neil.brown.name/md: md: Fix - again - partition detection when array becomes active Fix over-zealous flush_disk when changing device size. md: avoid spinlock problem in blk_throtl_exit md: correctly handle probe of an 'mdp' device. md: don't set_capacity before array is active. md: Fix raid1->raid0 takeover
2011-02-25RxRPC: Allocate tokens with kzalloc to avoid oops in rxrpc_destroyAnton Blanchard1-4/+4
With slab poisoning enabled, I see the following oops: Unable to handle kernel paging request for data at address 0x6b6b6b6b6b6b6b73 ... NIP [c0000000006bc61c] .rxrpc_destroy+0x44/0x104 LR [c0000000006bc618] .rxrpc_destroy+0x40/0x104 Call Trace: [c0000000feb2bc00] [c0000000006bc618] .rxrpc_destroy+0x40/0x104 (unreliable) [c0000000feb2bc90] [c000000000349b2c] .key_cleanup+0x1a8/0x20c [c0000000feb2bd40] [c0000000000a2920] .process_one_work+0x2f4/0x4d0 [c0000000feb2be00] [c0000000000a2d50] .worker_thread+0x254/0x468 [c0000000feb2bec0] [c0000000000a868c] .kthread+0xbc/0xc8 [c0000000feb2bf90] [c000000000020e00] .kernel_thread+0x54/0x70 We aren't initialising token->next, but the code in destroy_context relies on the list being NULL terminated. Use kzalloc to zero out all the fields. Signed-off-by: Anton Blanchard <[email protected]> Signed-off-by: David Howells <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-02-25afs: Fix oops in afs_unlink_writebackAnton Blanchard1-0/+1
I'm seeing the following oops when testing afs: Unable to handle kernel paging request for data at address 0x00000008 ... NIP [c0000000003393b0] .afs_unlink_writeback+0x38/0xc0 LR [c00000000033987c] .afs_put_writeback+0x98/0xec Call Trace: [c00000000345f600] [c00000000033987c] .afs_put_writeback+0x98/0xec [c00000000345f690] [c00000000033ae80] .afs_write_begin+0x6a4/0x75c [c00000000345f790] [c00000000012b77c] .generic_file_buffered_write+0x148/0x320 [c00000000345f8d0] [c00000000012e1b8] .__generic_file_aio_write+0x37c/0x3e4 [c00000000345f9d0] [c00000000012e2a8] .generic_file_aio_write+0x88/0xfc [c00000000345fa90] [c0000000003390a8] .afs_file_write+0x10c/0x178 [c00000000345fb40] [c000000000188788] .do_sync_write+0xc4/0x128 [c00000000345fcc0] [c000000000189658] .vfs_write+0xe8/0x1d8 [c00000000345fd70] [c000000000189884] .SyS_write+0x68/0xb0 [c00000000345fe30] [c000000000008564] syscall_exit+0x0/0x40 afs_write_begin hits an error and calls afs_unlink_writeback. In there we do list_del_init on an uninitialised list. The patch below initialises ->link when creating the afs_writeback struct. Signed-off-by: Anton Blanchard <[email protected]> Signed-off-by: David Howells <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-02-25fuse: fix truncate after openMiklos Szeredi1-2/+5
Commit e1181ee6 "vfs: pass struct file to do_truncate on O_TRUNC opens" broke the behavior of open(O_TRUNC|O_RDONLY) in fuse. Fuse assumed that when called from open, a truncate() will be done, not an ftruncate(). Fix by restoring the old behavior, based on the ATTR_OPEN flag. Signed-off-by: Miklos Szeredi <[email protected]>
2011-02-25fuse: fix hang of single threaded fuseblk filesystemMiklos Szeredi2-8/+50
Single threaded NTFS-3G could get stuck if a delayed RELEASE reply triggered a DESTROY request via path_put(). Fix this by a) making RELEASE requests synchronous, whenever possible, on fuseblk filesystems b) if not possible (triggered by an asynchronous read/write) then do the path_put() in a separate thread with schedule_work(). Reported-by: Oliver Neukum <[email protected]> Cc: [email protected] Signed-off-by: Miklos Szeredi <[email protected]>
2011-02-25eukrea-tlv320: fix platform_nameEric Bénard1-1/+1
commit f0fba2ad1b6b53d5360125c41953b7afcd6deff0 included a mistake on the name of the platform in the snd_soc_dai_link structure. Signed-off-by: Eric Bénard <[email protected]> Acked-by: Liam Girdwood <[email protected]> Signed-off-by: Mark Brown <[email protected]> Cc: [email protected]
2011-02-25ASoC: correct pxa AC97 DAI namesDmitry Eremin-Solenikov8-16/+16
Correct names for pxa AC97 DAI are pxa2xx-ac97 and pxa2xx-ac97-aux. Fix that for all PXA platforms. Signed-off-by: Dmitry Eremin-Solenikov <[email protected]> Acked-by: Liam Girdwood <[email protected]> Signed-off-by: Mark Brown <[email protected]> Cc: [email protected]
2011-02-25perf hists: Print number of samples, not the period sumArnaldo Carvalho de Melo1-2/+5
So that we match the header where we state the number of events with the "Samples" column when using 'perf report -n/--show-nr-samples': [root@emilia ~]# perf record -a sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.111 MB perf.data (~4860 samples) ] [root@emilia ~]# perf report --stdio --show-nr-samples # Events: 11 cycles # # Overhead Samples Command Shared Object Symbol # ........ .......... ........... .................. ............................ # 16.65% 1 sleep [kernel.kallsyms] [k] unmap_vmas 16.10% 1 perf libpthread-2.12.so [.] __pthread_cleanup_push_defer 15.79% 2 perf [kernel.kallsyms] [k] format_decode 12.88% 1 kworker/1:2 [kernel.kallsyms] [k] cache_reap 10.69% 1 swapper [kernel.kallsyms] [k] _raw_spin_lock 7.55% 1 sleep [kernel.kallsyms] [k] prepare_exec_creds 6.00% 1 perf [jbd2] [k] start_this_handle 5.29% 1 perf [kernel.kallsyms] [k] seq_read 4.75% 1 perf [kernel.kallsyms] [k] get_pid_task 4.30% 1 perf [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore # # (For a higher level overview, try: perf report --sort comm,dso) # [root@emilia ~]# Reported-by: Stephane Eranian <[email protected]> Reported-by: Cliff Wickman <[email protected]> Acked-by: Stephane Eranian <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Tom Zanussi <[email protected]> Cc: <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> [ cherry-picked it from perf/core, as it has been reported by others as well. ] Signed-off-by: Ingo Molnar <[email protected]>
2011-02-25regulator, mc13xxx: Remove pointless test for unsigned less than zeroJesper Juhl1-1/+1
The variable 'val' is a 'unsigned int', so it can never be less than zero. This fact makes the "val < 0" part of the test done in BUG_ON() in mc13xxx_regulator_get_voltage() rather pointles since it can never have any effect. This patch removes the pointless test. Signed-off-by: Jesper Juhl <[email protected]> Acked-by: Alberto Panizzo <[email protected]> Acked-by: Mark Brown <[email protected]> Signed-off-by: Liam Girdwood <[email protected]>
2011-02-25regulator: Fix warning with CONFIG_BUG disabledMark Brown1-0/+1
Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Liam Girdwood <[email protected]>
2011-02-24Merge branch 'drm-fixes' of ↵Linus Torvalds5-47/+127
git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: drm/i915: Fix unintended recursion in ironlake_disable_rc6 drm/i915: fix corruptions on i8xx due to relaxed fencing drm/i915: skip FDI & PCH enabling for DP_A agp/intel: Experiment with a 855GM GWB bit drm/i915: don't enable FDI & transcoder interrupts after all drm/i915: Ignore a hung GPU when flushing the framebuffer prior to a switch
2011-02-25Merge branch 'drm-intel-fixes' of ↵Dave Airlie1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/ickle/drm-intel into drm-fixes * 'drm-intel-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ickle/drm-intel: drm/i915: Fix unintended recursion in ironlake_disable_rc6
2011-02-24Merge branch 'kvm-updates/2.6.38' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-0/+2
* 'kvm-updates/2.6.38' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: SVM: Advance instruction pointer in dr_intercept
2011-02-24OMAP2+: clocksource: fix crash on boot when !CONFIG_OMAP_32K_TIMERPaul Walmsley1-0/+13
OMAP2+ kernels built without CONFIG_OMAP_32K_TIMER crash on boot after the 2.6.38 sched_clock changes: [ 0.000000] OMAP clockevent source: GPTIMER1 at 13000000 Hz [ 0.000000] Unable to handle kernel NULL pointer dereference at virtual address 00000000 [ 0.000000] pgd = c0004000 [ 0.000000] [00000000] *pgd=00000000 [ 0.000000] Internal error: Oops: 80000005 [#1] SMP [ 0.000000] last sysfs file: [ 0.000000] Modules linked in: [ 0.000000] CPU: 0 Not tainted (2.6.38-rc5-00057-g04aa67d #152) [ 0.000000] PC is at 0x0 [ 0.000000] LR is at sched_clock_poll+0x2c/0x3c Without CONFIG_OMAP_32K_TIMER, the kernel has an clockevent and clocksource resolution about three orders of magnitude higher than with CONFIG_OMAP_32K_TIMER set. The tradeoff is that the lowest power consumption states are not available. Fix by calling init_sched_clock() from the GPTIMER clocksource init code. Signed-off-by: Paul Walmsley <[email protected]> Signed-off-by: Tony Lindgren <[email protected]>
2011-02-24MAINTAINERS: Update email addressHerton Ronaldo Krzesinski1-2/+2
Signed-off-by: Herton Ronaldo Krzesinski <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-02-24x86 quirk: Fix polarity for IRQ0 pin2 override on SB800 systemsAndreas Herrmann3-13/+18
On some SB800 systems polarity for IOAPIC pin2 is wrongly specified as low active by BIOS. This caused system hangs after resume from S3 when HPET was used in one-shot mode on such systems because a timer interrupt was missed (HPET signal is high active). For more details see: http://marc.info/?l=linux-kernel&m=129623757413868 Tested-by: Manoj Iyer <[email protected]> Tested-by: Andre Przywara <[email protected]> Signed-off-by: Andreas Herrmann <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: [email protected] # 37.x, 32.x LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2011-02-24usb: musb: core: set has_tt flagFelipe Balbi1-0/+1
MUSB is a non-standard host implementation which can handle all speeds with the same core. We need to set has_tt flag after commit d199c96d41d80a567493e12b8e96ea056a1350c1 (USB: prevent buggy hubs from crashing the USB stack) in order for MUSB HCD to continue working. Signed-off-by: Felipe Balbi <[email protected]> Cc: stable <[email protected]> Cc: Alan Stern <[email protected]> Tested-by: Michael Jones <[email protected]> Tested-by: Alexander Holler <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2011-02-24PM: Make ACPI wakeup from S5 work again when CONFIG_PM_SLEEP is unsetRafael J. Wysocki2-11/+16
Commit 074037e (PM / Wakeup: Introduce wakeup source objects and event statistics (v3)) caused ACPI wakeup to only work if CONFIG_PM_SLEEP is set, but it also worked for CONFIG_PM_SLEEP unset before. This can be fixed by making device_set_wakeup_enable(), device_init_wakeup() and device_may_wakeup() work in the same way as before commit 074037e when CONFIG_PM_SLEEP is unset. Reported-and-tested-by: Justin Maggard <[email protected]> Cc: [email protected] Signed-off-by: Rafael J. Wysocki <[email protected]>
2011-02-24drm/i915: Fix unintended recursion in ironlake_disable_rc6Chris Wilson1-1/+1
After disabling, we're meant to teardown the bo used for the contexts, not recurse into ourselves again and preventing module unload. Reported-and-tested-by: Ben Widawsky <[email protected]> Signed-off-by: Chris Wilson <[email protected]>
2011-02-24ALSA: hda - Add support for new IDT 92HD98 and 92HD99 codecsVitaliy Kulikov1-3/+12
Also fix number of 92HD87 pins to exclude invalid pins. Signed-off-by: Vitaliy Kulikov <[email protected]> Signed-off-by: Takashi Iwai <[email protected]>
2011-02-24block: bd_link_disk_holder() should hold on to holder_dirTejun Heo1-0/+6
The new implementation of bd_link_disk_holder() added by 49731baa41d (block: restore multiple bd_link_disk_holder() support) didn't get an extra reference for the holder_dir kobject of the slave bdev; however, bdev kills holder_dir on removal, not release, so if the slave bdev is removed while there are holder links, the holder_dir will be destroyed while there still are holder links, which leads to oops later when bd_unlink_disk_order() tries to remove those links. Make bd_link_disk_holder() grab an extra reference for the slave's holder_dir and put it in bd_unlink_disk_holder(). Signed-off-by: Tejun Heo <[email protected]> Reported-by: "Hawrylewicz Czarnowski, Przemyslaw" <[email protected]> Tested-by: "Hawrylewicz Czarnowski, Przemyslaw" <[email protected]> Cc: Neil Brown <[email protected]> Cc: Jens Axboe <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-02-24mm: fix refcounting in swaponMiklos Szeredi1-1/+1
Grab a reference to bdev before calling blkdev_get(), which expects the refcount to be already incremented and either returns success or decrements the refcount and returns an error. The bug was introduced by e525fd89 (block: make blkdev_get/put() handle exclusive access), which didn't take into account this behavior of blkdev_get(). Acked-by: Tejun Heo <[email protected]> Signed-off-by: Miklos Szeredi <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-02-24block: fix refcounting in BLKBSZSETMiklos Szeredi1-3/+5
Adam Kovari and others reported that disconnecting an USB drive with an ntfs-3g filesystem would cause "kernel BUG at fs/inode.c:1421!" to be triggered. The BUG could be traced back to ioctl(BLKBSZSET), which would erroneously decrement the refcount on the bdev. This is because blkdev_get() expects the refcount to be already incremented and either returns success or decrements the refcount and returns an error. The bug was introduced by e525fd89 (block: make blkdev_get/put() handle exclusive access), which didn't take into account this behavior of blkdev_get(). This fixes https://bugzilla.kernel.org/show_bug.cgi?id=29202 (and likely 29792 too) Reported-by: Adam Kovari <[email protected]> Acked-by: Tejun Heo <[email protected]> Signed-off-by: Miklos Szeredi <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-02-24Merge branch 'for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: Unlock vfsmount_lock in do_umount
2011-02-24x86/mrst: Fix apb timer rating when lapic timer is usedJacob Pan1-1/+1
Need to adjust the clockevent device rating for the structure that will be registered with clockevent system instead of the temporary structure. Without this fix, APB timer rating will be higher than LAPIC timer such that it can not be released later to be used as the broadcast timer. Signed-off-by: Jacob Pan <[email protected]> Cc: Arjan van de Ven <[email protected]> Cc: Alan Cox <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: John Stultz <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2011-02-24Unlock vfsmount_lock in do_umountJ. R. Okajima1-1/+1
By the commit b3e19d9 2011-01-07 fs: scale mntget/mntput vfsmount_lock was introduced around testing mnt_count. Fix the mis-typed 'unlock' Signed-off-by: J. R. Okajima <[email protected]> Acked-by: Al Viro <[email protected]> Signed-off-by: Al Viro <[email protected]>
2011-02-24md: Fix - again - partition detection when array becomes activeNeilBrown2-1/+23
Revert b821eaa572fd737faaf6928ba046e571526c36c6 and f3b99be19ded511a1bf05a148276239d9f13eefa When I wrote the first of these I had a wrong idea about the lifetime of 'struct block_device'. It can disappear at any time that the block device is not open if it falls out of the inode cache. So relying on the 'size' recorded with it to detect when the device size has changed and so we need to revalidate, is wrong. Rather, we really do need the 'changed' attribute stored directly in the mddev and set/tested as appropriate. Without this patch, a sequence of: mknod / open / close / unlink (which can cause a block_device to be created and then destroyed) will result in a rescan of the partition table and consequence removal and addition of partitions. Several of these in a row can get udev racing to create and unlink and other code can get confused. With the patch, the rescan is only performed when needed and so there are no races. This is suitable for any stable kernel from 2.6.35. Reported-by: "Wojcik, Krzysztof" <[email protected]> Signed-off-by: NeilBrown <[email protected]> Cc: [email protected]
2011-02-24Fix over-zealous flush_disk when changing device size.NeilBrown6-11/+18
There are two cases when we call flush_disk. In one, the device has disappeared (check_disk_change) so any data will hold becomes irrelevant. In the oter, the device has changed size (check_disk_size_change) so data we hold may be irrelevant. In both cases it makes sense to discard any 'clean' buffers, so they will be read back from the device if needed. In the former case it makes sense to discard 'dirty' buffers as there will never be anywhere safe to write the data. In the second case it *does*not* make sense to discard dirty buffers as that will lead to file system corruption when you simply enlarge the containing devices. flush_disk calls __invalidate_devices. __invalidate_device calls both invalidate_inodes and invalidate_bdev. invalidate_inodes *does* discard I_DIRTY inodes and this does lead to fs corruption. invalidate_bev *does*not* discard dirty pages, but I don't really care about that at present. So this patch adds a flag to __invalidate_device (calling it __invalidate_device2) to indicate whether dirty buffers should be killed, and this is passed to invalidate_inodes which can choose to skip dirty inodes. flusk_disk then passes true from check_disk_change and false from check_disk_size_change. dm avoids tripping over this problem by calling i_size_write directly rathher than using check_disk_size_change. md does use check_disk_size_change and so is affected. This regression was introduced by commit 608aeef17a which causes check_disk_size_change to call flush_disk, so it is suitable for any kernel since 2.6.27. Cc: [email protected] Acked-by: Jeff Moyer <[email protected]> Cc: Andrew Patterson <[email protected]> Cc: Jens Axboe <[email protected]> Signed-off-by: NeilBrown <[email protected]>
2011-02-23mm: fix possible cause of a page_mapped BUGHugh Dickins1-3/+1
Robert Swiecki reported a BUG_ON(page_mapped) from a fuzzer, punching a hole with madvise(,, MADV_REMOVE). That path is under mutex, and cannot be explained by lack of serialization in unmap_mapping_range(). Reviewing the code, I found one place where vm_truncate_count handling should have been updated, when I switched at the last minute from one way of managing the restart_addr to another: mremap move changes the virtual addresses, so it ought to adjust the restart_addr. But rather than exporting the notion of restart_addr from memory.c, or converting to restart_pgoff throughout, simply reset vm_truncate_count to 0 to force a rescan if mremap move races with preempted truncation. We have no confirmation that this fixes Robert's BUG, but it is a fix that's worth making anyway. Signed-off-by: Hugh Dickins <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-02-23mm: prevent concurrent unmap_mapping_range() on the same inodeMiklos Szeredi10-38/+23
Michael Leun reported that running parallel opens on a fuse filesystem can trigger a "kernel BUG at mm/truncate.c:475" Gurudas Pai reported the same bug on NFS. The reason is, unmap_mapping_range() is not prepared for more than one concurrent invocation per inode. For example: thread1: going through a big range, stops in the middle of a vma and stores the restart address in vm_truncate_count. thread2: comes in with a small (e.g. single page) unmap request on the same vma, somewhere before restart_address, finds that the vma was already unmapped up to the restart address and happily returns without doing anything. Another scenario would be two big unmap requests, both having to restart the unmapping and each one setting vm_truncate_count to its own value. This could go on forever without any of them being able to finish. Truncate and hole punching already serialize with i_mutex. Other callers of unmap_mapping_range() do not, and it's difficult to get i_mutex protection for all callers. In particular ->d_revalidate(), which calls invalidate_inode_pages2_range() in fuse, may be called with or without i_mutex. This patch adds a new mutex to 'struct address_space' to prevent running multiple concurrent unmap_mapping_range() on the same mapping. [ We'll hopefully get rid of all this with the upcoming mm preemptibility series by Peter Zijlstra, the "mm: Remove i_mmap_mutex lockbreak" patch in particular. But that is for 2.6.39 ] Signed-off-by: Miklos Szeredi <[email protected]> Reported-by: Michael Leun <[email protected]> Reported-by: Gurudas Pai <[email protected]> Tested-by: Gurudas Pai <[email protected]> Acked-by: Hugh Dickins <[email protected]> Cc: [email protected] Signed-off-by: Linus Torvalds <[email protected]>
2011-02-23Revert "Bluetooth: Enable USB autosuspend by default on btusb"Linus Torvalds1-2/+0
This reverts commit 556ea928f78a390fe16ae584e6433dff304d3014. Jeff Chua reports that it can cause some bluetooth devices (he mentions an Bluetooth Intermec scanner) to just stop responding after a while with messages like [ 4533.361959] btusb 8-1:1.0: no reset_resume for driver btusb? [ 4533.361964] btusb 8-1:1.1: no reset_resume for driver btusb? from the kernel. See also https://bugzilla.kernel.org/show_bug.cgi?id=26182 for other reports. Reported-by: Jeff Chua <[email protected]> Reported-by: Andrew Meakovski <[email protected]> Reported-by: Jim Faulkner <[email protected]> Acked-by: Greg KH <[email protected]> Acked-by: Matthew Garrett <[email protected]> Acked-by: Gustavo F. Padovan <[email protected]> Cc: [email protected] (for 2.6.37) Signed-off-by: Linus Torvalds <[email protected]>
2011-02-24Merge branch 'drm-intel-fixes' of ↵Dave Airlie5-46/+126
git://git.kernel.org/pub/scm/linux/kernel/git/ickle/drm-intel into drm-fixes * 'drm-intel-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ickle/drm-intel: drm/i915: fix corruptions on i8xx due to relaxed fencing drm/i915: skip FDI & PCH enabling for DP_A agp/intel: Experiment with a 855GM GWB bit drm/i915: don't enable FDI & transcoder interrupts after all drm/i915: Ignore a hung GPU when flushing the framebuffer prior to a switch
2011-02-24drm/i915: fix corruptions on i8xx due to relaxed fencingDaniel Vetter1-1/+15
It looks like gen2 has a peculiar interleaved 2-row inter-tile layout. Probably inherited from i81x which had 2kb tiles (which naturally fit an even-number-of-tile-rows scheme to fit onto 4kb pages). There is no other mention of this in any docs (also not in the Intel internal documention according to Chris Wilson). Problem manifests itself in corruptions in the second half of the last tile row (if the bo has an odd number of tiles). Which can only happen with relaxed tiling (introduced in a00b10c360b35d6431a9). So reject set_tiling calls that don't satisfy this constrain to prevent broken userspace from causing havoc. While at it, also check the size for newer chipsets. LKML: https://lkml.org/lkml/2011/2/19/5 Reported-by: Indan Zupancic <[email protected]> Tested-by: Indan Zupancic <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Signed-off-by: Chris Wilson <[email protected]>
2011-02-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds31-170/+258
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (33 commits) Added support for usb ethernet (0x0fe6, 0x9700) r8169: fix RTL8168DP power off issue. r8169: correct settings of rtl8102e. r8169: fix incorrect args to oob notify. DM9000B: Fix PHY power for network down/up DM9000B: Fix reg_save after spin_lock in dm9000_timeout net_sched: long word align struct qdisc_skb_cb data sfc: lower stack usage in efx_ethtool_self_test bridge: Use IPv6 link-local address for multicast listener queries bridge: Fix MLD queries' ethernet source address bridge: Allow mcast snooping for transient link local addresses too ipv6: Add IPv6 multicast address flag defines bridge: Add missing ntohs()s for MLDv2 report parsing bridge: Fix IPv6 multicast snooping by correcting offset in MLDv2 report bridge: Fix IPv6 multicast snooping by storing correct protocol type p54pci: update receive dma buffers before and after processing fix cfg80211_wext_siwfreq lock ordering... rt2x00: Fix WPA TKIP Michael MIC failures. ath5k: Fix fast channel switching tcp: undo_retrans counter fixes ...
2011-02-23Merge branch 'drm-fixes' of ↵Linus Torvalds5-19/+27
git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: amd64-agp: fix crash at second module load drm/radeon: fix regression with AA resolve checking drm: drop commented out code and preceding comment drm/vblank: Enable precise vblank timestamps for interlaced and doublescan modes. drm/vblank: Use memory barriers optimized for atomic_t instead of generics. drm/vblank: Use abs64(diff_ns) for s64 diff_ns instead of abs(diff_ns) drm/radeon/kms: align height of fb allocation. Revert "drm/radeon/kms: switch back to min->max pll post divider iteration"
2011-02-23Merge branch 'r8169-davem' of ↵David S. Miller1-19/+23
git://git.kernel.org/pub/scm/linux/kernel/git/romieu/netdev-2.6