Age  Commit message  Author  Files  Lines
2012-03-06genirq: Clear action->thread_mask if IRQ_ONESHOT is not setThomas Gleixner1-6/+38
Commit ac5637611 (genirq: Unmask oneshot irqs when thread was not woken) fails to unmask when a !IRQ_ONESHOT threaded handler is handled by handle_level_irq. This happens because thread_mask is or'ed unconditionally in irq_wake_thread(), but is never cleared for !IRQ_ONESHOT interrupts. So the check for !desc->thread_active fails and keeps the interrupt disabled. Keep the thread_mask zero for !IRQ_ONESHOT interrupts. Document the thread_mask magic while at it. Reported-and-tested-by: Sven Joachim <[email protected]> Reported-and-tested-by: Stefan Lippers-Hollmann <[email protected]> Cc: [email protected] Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
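The effect of a stale thread_mask bit can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct irq_model`, the `threads_pending` field and all `model_*` functions are hypothetical names invented for illustration and are not the kernel's data structures.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one level-triggered interrupt line. */
struct irq_model {
    unsigned long threads_pending; /* one bit per handler thread still running */
    bool masked;                   /* line is masked at the controller */
};

/* The fix in miniature: only oneshot handlers get a nonzero mask. */
static unsigned long model_thread_mask(bool oneshot, unsigned int slot)
{
    assert(slot < 32);
    return oneshot ? (1UL << slot) : 0UL;
}

/* Waking a handler thread ORs its mask into the pending set. */
static void model_wake_thread(struct irq_model *d, unsigned long mask)
{
    d->threads_pending |= mask;
}

/* A oneshot thread clears its own bit when it finishes. */
static void model_thread_done(struct irq_model *d, unsigned long mask)
{
    d->threads_pending &= ~mask;
}

/* The flow handler may unmask only if no oneshot thread is pending. */
static void model_cond_unmask(struct irq_model *d)
{
    if (d->threads_pending == 0)
        d->masked = false;
}
```

With a zero mask for a !IRQ_ONESHOT handler, `model_wake_thread()` leaves `threads_pending` untouched, so `model_cond_unmask()` can unmask the line right away. A nonzero mask that nothing ever clears would keep the line masked forever, which is the failure the commit describes.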
2012-03-06openvswitch: Honor dp_ifindex, when specified, for vport lookup by name.Ben Pfaff1-0/+3
When OVS_VPORT_ATTR_NAME is specified and dp_ifindex is nonzero, the logical behavior would be for the vport name lookup scope to be limited to the specified datapath, but in fact the dp_ifindex value was ignored. This commit causes the search scope to be honored. Signed-off-by: Ben Pfaff <[email protected]> Signed-off-by: Jesse Gross <[email protected]>
2012-03-06IPv6: Fix not join all-router mcast group when forwarding set.Li Wei1-0/+4
When forwarding is set and a new net device is registered, we need to add this device to the all-router mcast group. Signed-off-by: Li Wei <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2012-03-06mmap: EINVAL not ENOMEM when rejecting VM_GROWSHugh Dickins1-1/+2
Currently error is -ENOMEM when rejecting VM_GROWSDOWN|VM_GROWSUP from shared anonymous: hoist the file case's -EINVAL up for both. Signed-off-by: Hugh Dickins <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-03-06caif-hsi: Set default MTU to 4096Sjur Brændeland1-1/+1
Default MTU for CAIF HSI was wrongly set to 15 * 4092 bytes. The patch sets default MTU size to 4096. Signed-off-by: Sjur Brændeland <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2012-03-06cxgb4vf: Add support for Chelsio's T480-CR and T440-LP-CR adaptersVipul Pandya1-0/+2
This patch adds PCI device ids for Chelsio's T480-CR and T440-LP-CR adapters. Signed-off-by: Vipul Pandya <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2012-03-06cxgb4: Add support for Chelsio's T480-CR and T440-LP-CR adaptersVipul Pandya1-0/+2
This patch adds PCI device ids for Chelsio's T480-CR and T440-LP-CR adapters. Signed-off-by: Vipul Pandya <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2012-03-06mlx4_core: remove buggy sched_queue maskingYevgeny Petrilin1-5/+0
Fixes a bug introduced by commit fe9a2603c, where the priority bits in the schedule queue field were masked out. Signed-off-by: Amir Vadai <[email protected]> Signed-off-by: Yevgeny Petrilin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2012-03-06netfilter: nf_conntrack: fix early_drop with reliable event deliveryPablo Neira Ayuso1-2/+6
If reliable event delivery is enabled and ctnetlink fails to deliver the destroy event in early_drop, the conntrack subsystem cannot drop the candidate flow that was planned to be evicted. Reported-by: Kerin Millar <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2012-03-06bridge: netfilter: don't call iptables on vlan packets if sysctl is offFlorian Westphal1-14/+18
When net.bridge.bridge-nf-filter-vlan-tagged is 0 (default), vlan packets arriving should not be sent to ip(6)tables by bridge netfilter. However, it turns out that we currently always send VLAN packets to netfilter, if .. a), CONFIG_VLAN_8021Q is enabled ; or b), CONFIG_VLAN_8021Q is not set but rx vlan offload is enabled on the bridge port. This is because bridge netfilter treats skb with skb->protocol == ETH_P_IP{V6} as "non-vlan packet". With rx vlan offload on or CONFIG_VLAN_8021Q=y, the vlan header has already been removed here, and we cannot rely on skb->protocol alone. Fix this by only using skb->protocol if the skb has no vlan tag, or if a vlan tag is present and filter-vlan-tagged bridge netfilter sysctl is enabled. We cannot remove the skb->protocol == htons(ETH_P_8021Q) test because the vlan tag is still around in the CONFIG_VLAN_8021Q=n && "ethtool -K $itf rxvlan off" case. reproducer: iptables -t raw -I PREROUTING -i br0 iptables -t raw -I PREROUTING -i br0.1 Then send packets to an ip address configured on br0.1 interface. Even with net.bridge.bridge-nf-filter-vlan-tagged=0, the 1st rule will match instead of the 2nd one. With this patch applied, the 2nd rule will match instead. In the non-local address case, netfilter won't be consulted after this patch unless the sysctl is switched on. Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: David S. Miller <[email protected]>
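The rule stated in the fix ("only using skb->protocol if the skb has no vlan tag, or if a vlan tag is present and filter-vlan-tagged bridge netfilter sysctl is enabled") reduces to a single boolean condition. The sketch below is a hypothetical userspace restatement of that sentence; the function and parameter names are assumptions and do not come from the bridge netfilter source.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Decide whether skb->protocol alone may be trusted to classify a packet
 * as IPv4/IPv6 for ip(6)tables. Hypothetical helper, named for illustration.
 */
static bool may_use_skb_protocol(bool has_vlan_tag, bool filter_vlan_tagged)
{
    return !has_vlan_tag || filter_vlan_tagged;
}

/* Usage: the three cases that matter for the reproducer in the text. */
static void demo_vlan_rule(void)
{
    assert(may_use_skb_protocol(false, false)); /* untagged: always allowed */
    assert(!may_use_skb_protocol(true, false)); /* tagged, sysctl off: skip */
    assert(may_use_skb_protocol(true, true));   /* tagged, sysctl on: allowed */
}
```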
2012-03-06netfilter: bridge: fix wrong pointer dereferencePablo Neira Ayuso1-1/+1
In adf7ff8, an invalid dereference was added in ebt_make_names. CC [M] net/bridge/netfilter/ebtables.o net/bridge/netfilter/ebtables.c: In function `ebt_make_names': net/bridge/netfilter/ebtables.c:1371:20: warning: `t' may be used uninitialized in this function [-Wuninitialized] Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2012-03-06netfilter: ctnetlink: remove incorrect spin_[un]lock_bh on NAT module autoloadPablo Neira Ayuso1-3/+0
Since 7d367e0, ctnetlink_new_conntrack is called without holding the nf_conntrack_lock spinlock. Thus, ctnetlink_parse_nat_setup does not require to release that spinlock anymore in the NAT module autoload case. Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2012-03-06netfilter: ebtables: fix wrong name length while copying to user-spaceSantosh Nayak1-3/+13
user-space ebtables expects 32 bytes-long names, but xt_match names use 29 bytes. We have to copy only 29 bytes and then make sure we fill the remaining bytes with zeroes. Signed-off-by: Santosh Nayak <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: David S. Miller <[email protected]>
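The copy-then-zero-fill idea can be sketched in plain userspace C. The two length constants come from the commit text. The macro names and `pad_name()` are hypothetical, and the real patch works on kernel structures and copies to user space rather than using `memcpy()` as shown here.

```c
#include <assert.h>
#include <string.h>

#define USER_NAME_LEN 32 /* length user space expects, per the commit text */
#define KERN_NAME_LEN 29 /* length of the in-kernel name, per the commit text */

/* Copy a 29-byte name into a 32-byte buffer whose tail is guaranteed zero. */
static void pad_name(char dst[USER_NAME_LEN], const char src[KERN_NAME_LEN])
{
    memset(dst, 0, USER_NAME_LEN);   /* zero everything first */
    memcpy(dst, src, KERN_NAME_LEN); /* then copy only the valid bytes */
}

/* Usage: stale bytes in the destination must not survive the copy. */
static void demo_pad_name(void)
{
    char src[KERN_NAME_LEN] = "limit";
    char dst[USER_NAME_LEN];

    memset(dst, 0x5a, sizeof(dst)); /* simulate leftover data */
    pad_name(dst, src);
    assert(strcmp(dst, "limit") == 0);
    assert(dst[29] == 0 && dst[30] == 0 && dst[31] == 0);
}
```

Zeroing the whole destination first keeps the helper simple and avoids leaking whatever was previously in the last three bytes.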
2012-03-06r8169: runtime resume before shutdown.françois romieu1-0/+5
With runtime PM, if the ethernet cable is disconnected, the device is transitioned to D3 state to conserve energy. If the system is shutdown in this state, any register accesses in rtl_shutdown are dropped on the floor. As the device was programmed by .runtime_suspend() to wake on link changes, it is thus brought back up as soon as the link recovers. Resuming every suspended device through the driver core would slow things down and it is not clear how many devices really need it now. Original report and D0 transition patch by Sameer Nanda. Patch has been changed to comply with advices by Rafael J. Wysocki and the PM folks. Reported-by: Sameer Nanda <[email protected]> Signed-off-by: Francois Romieu <[email protected]> Cc: Rafael J. Wysocki <[email protected]> Cc: Hayes Wang <[email protected]> Cc: Alan Stern <[email protected]> Acked-by: Rafael J. Wysocki <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2012-03-06tcp: fix tcp_shift_skb_data() to not shift SACKed data below snd_unaNeal Cardwell1-0/+4
This commit fixes tcp_shift_skb_data() so that it does not shift SACKed data below snd_una. This fixes an issue whose symptoms exactly match reports showing tp->sacked_out going negative since 3.3.0-rc4 (see "WARNING: at net/ipv4/tcp_input.c:3418" thread on netdev). Since 2008 (832d11c5cd076abc0aa1eaf7be96c81d1a59ce41) tcp_shift_skb_data() had been shifting SACKed ranges that were below snd_una. It checked that the *end* of the skb it was about to shift from was above snd_una, but did not check that the end of the actual shifted range was above snd_una; this commit adds that check. Shifting SACKed ranges below snd_una is problematic because for such ranges tcp_sacktag_one() short-circuits: it does not declare anything as SACKed and does not increase sacked_out. Before the fixes in commits cc9a672ee522d4805495b98680f4a3db5d0a0af9 and daef52bab1fd26e24e8e9578f8fb33ba1d0cb412, shifting SACKed ranges below snd_una happened to work because tcp_shifted_skb() was always (incorrectly) passing in to tcp_sacktag_one() an skb whose end_seq tcp_shift_skb_data() had already guaranteed was beyond snd_una. Hence tcp_sacktag_one() never short-circuited and always increased tp->sacked_out in this case. After those two fixes, my testing has verified that shifting SACKed ranges below snd_una could cause tp->sacked_out to go negative with the following sequence of events: (1) tcp_shift_skb_data() sees an skb whose end_seq is beyond snd_una, then shifts a prefix of that skb that is below snd_una (2) tcp_shifted_skb() increments the packet count of the already-SACKed prev sk_buff (3) tcp_sacktag_one() sees the end of the new SACKed range is below snd_una, so it short-circuits and doesn't increase tp->sacked_out (5) tcp_clean_rtx_queue() sees the SACKed skb has been ACKed, decrements tp->sacked_out by this "inflated" pcount that was missing a matching increase in tp->sacked_out, and hence tp->sacked_out underflows to a u32 like 0xFFFFFFFF, which casted to s32 is negative. 
(6) this leads to the warnings seen in the recent "WARNING: at net/ipv4/tcp_input.c:3418" thread on the netdev list; e.g.: tcp_input.c:3418 WARN_ON((int)tp->sacked_out < 0); More generally, I think this bug can be tickled in some cases where two or more ACKs from the receiver are lost and then a DSACK arrives that is immediately above an existing SACKed skb in the write queue. This fix changes tcp_shift_skb_data() to abort this sequence at step (1) in the scenario above by noticing that the bytes are below snd_una and not shifting them. Signed-off-by: Neal Cardwell <[email protected]> Signed-off-by: David S. Miller <[email protected]>
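The underflow in step (5) and the check quoted in step (6) can be reproduced with plain fixed-width arithmetic. This is a minimal sketch: the function names are hypothetical, and the conversion of an out-of-range unsigned value to `int32_t` is implementation-defined in C, although common compilers wrap it to a negative number as assumed here.

```c
#include <assert.h>
#include <stdint.h>

/* Decrement a u32 counter that never saw the matching increment. */
static uint32_t sacked_out_after_dec(uint32_t sacked_out, uint32_t pcount)
{
    return sacked_out - pcount; /* unsigned arithmetic wraps modulo 2^32 */
}

/* Same shape as the quoted check: WARN_ON((int)tp->sacked_out < 0). */
static int would_warn(uint32_t sacked_out)
{
    return (int32_t)sacked_out < 0;
}

/* Usage: an "inflated" pcount of 1 against an empty counter. */
static void demo_sacked_out(void)
{
    uint32_t after = sacked_out_after_dec(0, 1);

    assert(after == 0xFFFFFFFFu);
    assert(would_warn(after));
    assert(!would_warn(3));
}
```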
2012-03-06Merge branch 'master' of ↵John W. Linville3-4/+8
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
2012-03-06Merge tag 'fixes-3.3-rc7' of ↵Linus Torvalds24-46/+47
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull arm-soc bug fixes from Arnd Bergmann: "Here are all the fixes I got after sending the last pull request. These fix mostly regressions on exynos, at91, pxa and ep93xx. Signed-off-by: Arnd Bergmann <[email protected]>" * tag 'fixes-3.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: ARM: ep93xx: convert vision_ep9307 to MULTI_IRQ_HANDLER ARM: EXYNOS: fix touchscreen IRQ setup on Universal C210 board ARM: pxa: fix invalid mfp pin issue ARM: pxa: remove duplicated registeration on pxa-gpio ARM: pxa: add dummy clock for pxa25x and pxa27x ARM: S3C24XX: DMA resume regression fix ARM: S3C24XX: Fix restart on S3C2442 ARM: SAMSUNG: Fix memory size for hsotg ARM: at91/dma: DMA controller registering with DT support ARM: at91/dma: remove platform data from DMA controller
2012-03-06Merge branch 'for-linus' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 regression fix from Martin Schwidefsky: "It is a fix for a regression that has been introduced with git commit 25f269f17316 - "[S390] qdio: EQBS retry after CCQ 96" - and if possible we would like to have working code for the fcp data router in 3.3." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: [S390] qdio: fix handler function arguments for zfcp data router
2012-03-06Merge tag 'for-linus' of ↵Linus Torvalds1-4/+4
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator updates from Mark Brown: "A simple fix that's obvious from inspection. There's no mainline users of this driver yet (there's some i.MX platforms which will use it)." * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: Fix mask parameter in da9052_reg_update calls
2012-03-06vsprintf: make %pV handling compatible with kasprintf()Jan Beulich1-3/+9
kasprintf() (and potentially other functions that I didn't run across so far) want to evaluate argument lists twice. Caring to do so for the primary list is obviously their job, but they can't reasonably be expected to check the format string for instances of %pV, which however need special handling too: On architectures like x86-64 (as opposed to e.g. ix86), using the same argument list twice doesn't produce the expected results, as an internally managed cursor gets updated during the first run. Fix the problem by always acting on a copy of the original list when handling %pV. Signed-off-by: Jan Beulich <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
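The "evaluate the argument list twice" pattern, and why the second pass needs its own copy, can be shown with standard C `va_copy()`. This is a userspace sketch in the spirit of kasprintf(), not the kernel implementation; `format_twice()` is a hypothetical name.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Format once to measure, once to fill; each pass gets its own va_list. */
static char *format_twice(const char *fmt, ...)
{
    va_list ap, aq;
    char *buf = NULL;
    int len;

    va_start(ap, fmt);
    va_copy(aq, ap);                   /* first pass consumes the copy */
    len = vsnprintf(NULL, 0, fmt, aq);
    va_end(aq);

    if (len >= 0)
        buf = malloc((size_t)len + 1);
    if (buf)
        vsnprintf(buf, (size_t)len + 1, fmt, ap); /* second pass: original */
    va_end(ap);
    return buf;                        /* caller frees */
}

/* Usage example. */
static void demo_format_twice(void)
{
    char *s = format_twice("%s-%d", "irq", 42);

    assert(s != NULL && strcmp(s, "irq-42") == 0);
    free(s);
}
```

Reusing `ap` for both passes without `va_copy()` is undefined behavior in standard C. On ABIs where `va_list` holds an internal cursor, the second pass would start where the first one stopped, which matches the symptom the commit describes for x86-64.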
2012-03-06page_cgroup: fix horrid swap accounting regressionHugh Dickins1-1/+3
Why is memcg's swap accounting so broken? Insane counts, wrong ownership, unfreeable structures, which later get freed and then accessed after free. Turns out to be a tiny little 3.3-rc1 regression in 9fb4b7cc0724 "page_cgroup: add helper function to get swap_cgroup": the helper function (actually named lookup_swap_cgroup()) returns an address using void* arithmetic, but the structure in question is a short. Signed-off-by: Hugh Dickins <[email protected]> Reviewed-by: Bob Liu <[email protected]> Cc: Michal Hocko <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Cc: Johannes Weiner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
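The difference between byte-granular and element-granular pointer arithmetic is easy to see in isolation. In GNU C, arithmetic on `void *` advances one byte at a time. The sketch below uses `char *` to get the same effect portably, and its function names are hypothetical rather than taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Byte offset reached when the index is applied to a byte-sized pointer. */
static size_t offset_bytewise(unsigned short *base, size_t idx)
{
    char *p = (char *)base + idx; /* what void* arithmetic does in GNU C */

    return (size_t)(p - (char *)base);
}

/* Byte offset reached when the index is applied to a typed pointer. */
static size_t offset_typed(unsigned short *base, size_t idx)
{
    unsigned short *p = base + idx; /* scaled by sizeof(unsigned short) */

    return (size_t)((char *)p - (char *)base);
}

/* Usage: entry 3 of a table of shorts. */
static void demo_offsets(void)
{
    unsigned short table[8] = { 0 };

    assert(offset_bytewise(table, 3) == 3);
    assert(offset_typed(table, 3) == 3 * sizeof(unsigned short));
}
```

With a byte-sized stride, every index beyond zero lands at the wrong address. The wrong ownership and corrupted counts described above are consistent with that.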
2012-03-06Merge branch 'fixes' of git://github.com/hzhuang1/linux into fixesArnd Bergmann171-574/+1139
* 'fixes' of git://github.com/hzhuang1/linux: (3 commits) ARM: pxa: fix invalid mfp pin issue ARM: pxa: remove duplicated registeration on pxa-gpio ARM: pxa: add dummy clock for pxa25x and pxa27x Includes an update to v3.3-rc6
2012-03-06Merge branch 'v3.3-samsung-fixes-4' of ↵Arnd Bergmann15-25/+25
git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung into fixes * 'v3.3-samsung-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung: ARM: EXYNOS: fix touchscreen IRQ setup on Universal C210 board ARM: S3C24XX: DMA resume regression fix ARM: S3C24XX: Fix restart on S3C2442 ARM: SAMSUNG: Fix memory size for hsotg
2012-03-06ARM: ep93xx: convert vision_ep9307 to MULTI_IRQ_HANDLERH Hartley Sweeten1-0/+2
As done for the other ep93xx machines in: commit 9a6879bd902e2ec605fff4d9fb3247b440a1f66a ARM: ep93xx: convert to MULTI_IRQ_HANDLER Now that there is a generic IRQ handler for multiple VIC devices use it for vision_ep9307 to help building multi platform kernels. Signed-off-by: Hartley Sweeten <[email protected]> Acked-by: Ryan Mallon <[email protected]> Reviewed-by: Jamie Iles <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]>
2012-03-06ARM: EXYNOS: fix touchscreen IRQ setup on Universal C210 boardBartlomiej Zolnierkiewicz1-0/+2
Fixes atmel_mxt_ts freeze on Universal C210. Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]> Signed-off-by: Kyungmin Park <[email protected]> Signed-off-by: Marek Szyprowski <[email protected]> Signed-off-by: Kukjin Kim <[email protected]>
2012-03-06ALSA: hda - add quirk to detect CD input on Gigabyte EP45-DS3Marton Balint1-0/+9
My CD input got lost in commit 68ef0561efe494143516df38c03a16b837b8e79c. Raymond helped me to add the necessary pin fixup to make it appear again. In fact, this is basically his patch. It fixes alsa bug #5541. Signed-off-by: Marton Balint <[email protected]> Signed-off-by: Takashi Iwai <[email protected]>
2012-03-06ARM: pxa: fix invalid mfp pin issueHaojian Zhuang1-0/+7
Failure is reported on hx4700 with kernel v3.3-rc1. __mfp_validate: GPIO20 is invalid pin __mfp_validate: GPIO21 is invalid pin __mfp_validate: GPIO15 is invalid pin __mfp_validate: GPIO78 is invalid pin __mfp_validate: GPIO79 is invalid pin __mfp_validate: GPIO80 is invalid pin __mfp_validate: GPIO33 is invalid pin __mfp_validate: GPIO48 is invalid pin __mfp_validate: GPIO49 is invalid pin __mfp_validate: GPIO50 is invalid pin pxa_last_gpio is used in the mfp-pxa2xx driver, but it is only updated in the pxa-gpio driver, which runs after the mfp-pxa2xx driver. So update pxa_last_gpio first in the mfp-pxa2xx driver. Reported-by: Paul Parsons <[email protected]> Signed-off-by: Haojian Zhuang <[email protected]>
2012-03-06ARM: pxa: remove duplicated registeration on pxa-gpioHaojian Zhuang5-5/+0
Both reboot (via reboot(RB_AUTOBOOT)) and suspend freeze on hx4700. Registration of pxa_gpio_syscore_ops was moved into the pxa-gpio driver, but it still exists in the arch-pxa directory. This results in failure on reboot and suspend. Now remove the registration code in arch-pxa. Reported-by: Paul Parsons <[email protected]> Signed-off-by: Haojian Zhuang <[email protected]>
2012-03-06ARM: pxa: add dummy clock for pxa25x and pxa27xHaojian Zhuang2-0/+2
gpio-pxa driver is shared among arch-pxa and arch-mmp. Clock is the essential component on pxa3xx/pxa95x and arch-mmp. So we need to define dummy clock in pxa25x/pxa27x instead. This regression was introduced by the commit "ARM: pxa: add dummy clock for sa1100-rtc", id a55b5adaf403c4d032e0871ad4ee3367782f4db6. Reported-by: Jonathan Cameron <[email protected]> Signed-off-by: Paul Parsons <[email protected]> Tested-by: Robert Jarzmik <[email protected]> Signed-off-by: Haojian Zhuang <[email protected]>
2012-03-06tg3: Fix to use multi queue BQL interfacesTom Herbert1-3/+3
Fix tg3 to use BQL multi queue related netdev interfaces since the device supports multi queue. Signed-off-by: Tom Herbert <[email protected]> Reported-by: Christoph Lameter <[email protected]> Acked-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2012-03-05Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds5-18/+42
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "It contains three cherry-picked fixes from perf/core, which turned out to be more urgent than we originally thought." * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf tools: Handle kernels that don't support attr.exclude_{guest,host} perf tools: Change perf_guest default back to false perf record: No build id option fails
2012-03-05Merge tag 'usb-3.3-rc6' of ↵Linus Torvalds2-10/+2
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb USB: revert a powerpc EHCI patch There is just one patch in here, a revert of a powerpc EHCI driver patch that was reported to cause problems. Signed-off-by: Greg Kroah-Hartman <[email protected]> * tag 'usb-3.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: Revert "powerpc/usb: fix issue of CPU halt when missing USB PHY clock"
2012-03-05Merge tag 'tty-3.3-rc6' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty tty: build fix for 3.3-rc6 This contains one build fix for the powerpc udbg driver that was reported. Signed-off-by: Greg Kroah-Hartman <[email protected]> * tag 'tty-3.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: tty/powerpc: early udbg consoles can't be modules
2012-03-05Merge tag 'md-3.3-fixes' of git://neil.brown.name/mdLinus Torvalds2-12/+28
Pull md fixes from Neil Brown: "Three fixes for md in 3.3-rc: Two relate to the recently added drive replacement. One fixes the problem where a read error in RAID10 would sometimes be retried indefinitely." * tag 'md-3.3-fixes' of git://neil.brown.name/md: md/raid10: fix assembling of arrays with replacement devices. md/raid10: fix handling of error on last working device in array. md/raid1: fix buglet in md_raid1_contested.
2012-03-05Merge branch 'akpm' (Andrew's patch bomb)Linus Torvalds24-164/+156
Merge the emailed series of 19 patches from Andrew Morton * akpm: rapidio/tsi721: fix queue wrapping bug in inbound doorbell handler memcg: fix mapcount check in move charge code for anonymous page mm: thp: fix BUG on mm->nr_ptes alpha: fix 32/64-bit bug in futex support memcg: fix GPF when cgroup removal races with last exit debugobjects: Fix selftest for static warnings floppy/scsi: fix setting of BIO flags memcg: fix deadlock by inverting lrucare nesting drivers/rtc/rtc-r9701.c: fix crash in r9701_remove() c2port: class_create() returns an ERR_PTR pps: class_create() returns an ERR_PTR, not NULL hung_task: fix the broken rcu_lock_break() logic vfork: kill PF_STARTING coredump_wait: don't call complete_vfork_done() vfork: make it killable vfork: introduce complete_vfork_done() aio: wake up waiters when freeing unused kiocbs kprobes: return proper error code from register_kprobe() kmsg_dump: don't run on non-error paths by default
2012-03-05rapidio/tsi721: fix queue wrapping bug in inbound doorbell handlerAlexandre Bounine1-2/+3
Fix a bug that causes a kernel panic when the number of received doorbells is larger than number of entries in the inbound doorbell queue (current default value = 512). Another possible indication for this bug is large number of spurious doorbells reported by tsi721 driver after reaching the queue size maximum. Signed-off-by: Alexandre Bounine <[email protected]> Cc: Chul Kim <[email protected]> Cc: Matt Porter <[email protected]> Cc: <[email protected]> [3.2.x+] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-03-05memcg: fix mapcount check in move charge code for anonymous pageNaoya Horiguchi1-1/+1
Currently the charge on shared anonymous pages is not supposed to be moved in task migration. To implement this, we need to check that mapcount > 1, instead of > 2. So this patch fixes it. Signed-off-by: Naoya Horiguchi <[email protected]> Reviewed-by: Daisuke Nishimura <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Cc: Hillf Danton <[email protected]> Cc: Johannes Weiner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-03-05mm: thp: fix BUG on mm->nr_ptesAndrea Arcangeli1-3/+3
Dave Jones reports a few Fedora users hitting the BUG_ON(mm->nr_ptes...) in exit_mmap() recently. Quoting Hugh's discovery and explanation of the SMP race condition: "mm->nr_ptes had unusual locking: down_read mmap_sem plus page_table_lock when incrementing, down_write mmap_sem (or mm_users 0) when decrementing; whereas THP is careful to increment and decrement it under page_table_lock. Now most of those paths in THP also hold mmap_sem for read or write (with appropriate checks on mm_users), but two do not: when split_huge_page() is called by hwpoison_user_mappings(), and when called by add_to_swap(). It's conceivable that the latter case is responsible for the exit_mmap() BUG_ON mm->nr_ptes that has been reported on Fedora." The simplest way to fix it without having to alter the locking is to make split_huge_page() a noop in nr_ptes terms, so by counting the preallocated pagetables that exists for every mapped hugepage. It was an arbitrary choice not to count them and either way is not wrong or right, because they are not used but they're still allocated. Reported-by: Dave Jones <[email protected]> Reported-by: Hugh Dickins <[email protected]> Signed-off-by: Andrea Arcangeli <[email protected]> Acked-by: Hugh Dickins <[email protected]> Cc: David Rientjes <[email protected]> Cc: Josh Boyer <[email protected]> Cc: <[email protected]> [3.0.x, 3.1.x, 3.2.x] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-03-05alpha: fix 32/64-bit bug in futex supportAndrew Morton1-1/+1
Michael Cree said: : : I have noticed some user space problems (pulseaudio crashes in pthread : : code, glibc/nptl test suite failures, java compiler freezes on SMP alpha : : systems) that arise when using a 2.6.39 or later kernel on Alpha. : : Bisecting between 2.6.38 and 2.6.39 (using glibc/nptl test suite as : : criterion for good/bad kernel) eventually leads to: : : : : 8d7718aa082aaf30a0b4989e1f04858952f941bc is the first bad commit : : commit 8d7718aa082aaf30a0b4989e1f04858952f941bc : : Author: Michel Lespinasse <[email protected]> : : Date: Thu Mar 10 18:50:58 2011 -0800 : : : : futex: Sanitize futex ops argument types : : : : Change futex_atomic_op_inuser and futex_atomic_cmpxchg_inatomic : : prototypes to use u32 types for the futex as this is the data type the : : futex core code uses all over the place. : : : : Looking at the commit I see there is a change of the uaddr argument in : : the Alpha architecture specific code for futexes from int to u32, but I : : don't see why this should cause a problem. Richard Henderson said: : futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr, : u32 oldval, u32 newval) : ... : : "r"(uaddr), "r"((long)oldval), "r"(newval) : : : There is no 32-bit compare instruction. These are implemented by : consistently extending the values to a 64-bit type. Since the : load instruction sign-extends, we want to sign-extend the other : quantity as well (despite the fact it's logically unsigned). : : So: : : - : "r"(uaddr), "r"((long)oldval), "r"(newval) : + : "r"(uaddr), "r"((long)(int)oldval), "r"(newval) : : should do the trick. Michael said: : This fixes the glibc test suite failures and the pulseaudio related : crashes, but it does not fix the java compiler lockups that I was (and : am still) observing. That is some other problem. 
Reported-by: Michael Cree <[email protected]> Tested-by: Michael Cree <[email protected]> Acked-by: Phil Carmody <[email protected]> Cc: Richard Henderson <[email protected]> Cc: Michel Lespinasse <[email protected]> Cc: Ivan Kokshaysky <[email protected]> Reviewed-by: Matt Turner <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
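The mismatch Richard Henderson describes, a sign-extended load compared against a zero-extended operand, can be modelled with fixed-width integers. This is a sketch only: the function names are hypothetical, `int64_t` stands in for Alpha's 64-bit `long`, and the conversion of an out-of-range value to `int32_t` is implementation-defined in C, although common compilers wrap it as assumed here.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the load: a 32-bit memory value sign-extended to 64 bits. */
static int64_t loaded_value(uint32_t mem)
{
    return (int64_t)(int32_t)mem;
}

/* Like "(long)oldval" with a u32 oldval: zero extension. */
static int64_t zero_extended(uint32_t v)
{
    return (int64_t)v;
}

/* Like "(long)(int)oldval": sign extension, matching the load. */
static int64_t sign_extended(uint32_t v)
{
    return (int64_t)(int32_t)v;
}

/* Usage: the 64-bit compare only matches when both sides extend alike. */
static void demo_extension(void)
{
    uint32_t futex_val = 0x80000000u; /* top bit set */

    assert(loaded_value(futex_val) != zero_extended(futex_val));
    assert(loaded_value(futex_val) == sign_extended(futex_val));
    assert(loaded_value(1) == zero_extended(1)); /* small values hide the bug */
}
```

In this model only values with bit 31 set make the comparison fail, which would explain why the regression showed up in some workloads and not others.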
2012-03-05memcg: fix GPF when cgroup removal races with last exitHugh Dickins6-48/+18
When moving tasks from old memcg (with move_charge_at_immigrate on new memcg), followed by removal of old memcg, hit General Protection Fault in mem_cgroup_lru_del_list() (called from release_pages called from free_pages_and_swap_cache from tlb_flush_mmu from tlb_finish_mmu from exit_mmap from mmput from exit_mm from do_exit). Somewhat reproducible, takes a few hours: the old struct mem_cgroup has been freed and poisoned by SLAB_DEBUG, but mem_cgroup_lru_del_list() is still trying to update its stats, and take page off lru before freeing. A task, or a charge, or a page on lru: each secures a memcg against removal. In this case, the last task has been moved out of the old memcg, and it is exiting: anonymous pages are uncharged one by one from the memcg, as they are zapped from its pagetables, so the charge gets down to 0; but the pages themselves are queued in an mmu_gather for freeing. Most of those pages will be on lru (and force_empty is careful to lru_add_drain_all, to add pages from pagevec to lru first), but not necessarily all: perhaps some have been isolated for page reclaim, perhaps some isolated for other reasons. So, force_empty may find no task, no charge and no page on lru, and let the removal proceed. There would still be no problem if these pages were immediately freed; but typically (and the put_page_testzero protocol demands it) they have to be added back to lru before they are found freeable, then removed from lru and freed. We don't see the issue when adding, because the mem_cgroup_iter() loops keep their own reference to the memcg being scanned; but when it comes to mem_cgroup_lru_del_list(). I believe this was not an issue in v3.2: there, PageCgroupAcctLRU and PageCgroupUsed flags were used (like a trick with mirrors) to deflect view of pc->mem_cgroup to the stable root_mem_cgroup when neither set. 38c5d72f3ebe ("memcg: simplify LRU handling by new rule") mercifully removed those convolutions, but left this General Protection Fault. 
But it's surprisingly easy to restore the old behaviour: just check PageCgroupUsed in mem_cgroup_lru_add_list() (which decides on which lruvec to add), and reset pc to root_mem_cgroup if page is uncharged. A risky change? just going back to how it worked before; testing, and an audit of uses of pc->mem_cgroup, show no problem. And there's a nice bonus: with mem_cgroup_lru_add_list() itself making sure that an uncharged page goes to root lru, mem_cgroup_reset_owner() no longer has any purpose, and we can safely revert 4e5f01c2b9b9 ("memcg: clear pc->mem_cgroup if necessary"). Calling update_page_reclaim_stat() after add_page_to_lru_list() in swap.c is not strictly necessary: the lru_lock there, with RCU before memcg structures are freed, makes mem_cgroup_get_reclaim_stat_from_page safe without that; but it seems cleaner to rely on one dependency less. Signed-off-by: Hugh Dickins <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Konstantin Khlebnikov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-03-05debugobjects: Fix selftest for static warningsStephen Boyd1-11/+3
debugobjects is now printing a warning when a fixup for a NOTAVAILABLE object is run. This causes the selftest to fail like: ODEBUG: selftest warnings failed 4 != 5 We could just increase the number of warnings that the selftest is expecting to see because that is actually what has changed. But, it turns out that fixup_activate() was written with inverted logic and thus a fixup for a static object returned 1 indicating the object had been fixed, and 0 otherwise. Fix the logic to be correct and update the counts to reflect that nothing needed fixing for a static object. Signed-off-by: Stephen Boyd <[email protected]> Reported-by: Thomas Gleixner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-03-05floppy/scsi: fix setting of BIO flagsMuthu Kumar2-2/+2
Fix setting bio flags in drivers (sd_dif/floppy). Signed-off-by: Muthukumar R <[email protected]> Cc: Jens Axboe <[email protected]> Cc: James Bottomley <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-03-05memcg: fix deadlock by inverting lrucare nestingHugh Dickins1-35/+37
We have forgotten the rules of lock nesting: the irq-safe ones must be taken inside the non-irq-safe ones, otherwise we are open to deadlock: CPU0 CPU1 ---- ---- lock(&(&pc->lock)->rlock); local_irq_disable(); lock(&(&zone->lru_lock)->rlock); lock(&(&pc->lock)->rlock); <Interrupt> lock(&(&zone->lru_lock)->rlock); To check a different locking issue, I happened to add a spin_lock to memcg's bit_spin_lock in lock_page_cgroup(), and lockdep very quickly complained about __mem_cgroup_commit_charge_lrucare() (on CPU1 above). So delete __mem_cgroup_commit_charge_lrucare(), passing a bool lrucare to __mem_cgroup_commit_charge() instead, taking zone->lru_lock under lock_page_cgroup() in the lrucare case. The original was using spin_lock_irqsave, but we'd be in more trouble if it were ever called at interrupt time: unconditional _irq is enough. And ClearPageLRU before del from lru, SetPageLRU before add to lru: no strong reason, but that is the ordering used consistently elsewhere. Fixes 36b62ad539498d00c2d280a151a ("memcg: simplify corner case handling of LRU"). Signed-off-by: Hugh Dickins <[email protected]> Acked-by: Johannes Weiner <[email protected]> Cc: Konstantin Khlebnikov <[email protected]> Acked-by: KAMEZAWA Hiroyuki <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-03-05drivers/rtc/rtc-r9701.c: fix crash in r9701_remove()Anatolij Gustschin1-7/+7
If probing the RTC didn't succeed due to failed RTC register access, the RTC device will be unregistered. Then, when removing the module r9701_remove() causes a kernel crash while trying to unregister a not registered RTC device. Fix this by doing RTC register access test before RTC device registration. Signed-off-by: Anatolij Gustschin <[email protected]> Cc: Alessandro Zummo <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-03-05c2port: class_create() returns an ERR_PTRDan Carpenter1-2/+2
class_create() doesn't return NULL; it only returns ERR_PTRs. Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-03-05pps: class_create() returns an ERR_PTR, not NULLDan Carpenter1-2/+2
class_create() never returns NULL, only ERR_PTRs. Signed-off-by: Dan Carpenter <[email protected]> Cc: Rodolfo Giometti <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
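Why a NULL check misses these failures can be shown with a userspace re-creation of the error-pointer convention. The sketch below is an assumption-laden model: `ERR_PTR()`, `PTR_ERR()` and `IS_ERR()` are re-implemented here to mirror the kernel helpers, and `fake_class_create()` and the two `init_*` functions are hypothetical stand-ins, not code from either driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace models of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error)
{
	assert(error < 0 && error >= -MAX_ERRNO);
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline bool IS_ERR(const void *ptr)
{
	/* Error pointers occupy the top MAX_ERRNO addresses. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for class_create(): never returns NULL. */
static void *fake_class_create(bool fail)
{
	static int dummy_class;

	return fail ? ERR_PTR(-ENOMEM) : (void *)&dummy_class;
}

/* Buggy pattern: an error pointer is non-NULL, so the failure is missed. */
static int init_with_null_check(bool fail)
{
	void *cls = fake_class_create(fail);

	if (!cls)
		return -ENOMEM;
	return 0;
}

/* Fixed pattern: test with IS_ERR() and propagate the encoded errno. */
static int init_with_is_err(bool fail)
{
	void *cls = fake_class_create(fail);

	if (IS_ERR(cls))
		return (int)PTR_ERR(cls);
	return 0;
}
```

In this model `init_with_null_check(true)` reports success even though creation failed, while `init_with_is_err(true)` returns `-ENOMEM`.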
2012-03-05hung_task: fix the broken rcu_lock_break() logicOleg Nesterov1-4/+7
check_hung_uninterruptible_tasks()->rcu_lock_break() introduced by "softlockup: check all tasks in hung_task" commit ce9dbe24 looks absolutely wrong.

- rcu_lock_break() does put_task_struct(). If the task has exited it is not safe to even read its ->state, nothing protects this task_struct.

- The TASK_DEAD checks are wrong too. Contrary to the comment, we can't use it to check if the task was unhashed. It can be unhashed without TASK_DEAD, or it can be valid with TASK_DEAD.

  For example, an autoreaping task can do release_task(current) long before it sets TASK_DEAD in do_exit().

  Or, a zombie task can have ->state == TASK_DEAD but release_task() was not called, and in this case we must not break the loop.

Change this code to check pid_alive() instead, and do this before we drop the reference to the task_struct.

Note: while_each_thread() under rcu_read_lock() is not really safe, it can livelock. This will be fixed later, but fortunately in this case the "max_count" logic saves us anyway.

Signed-off-by: Oleg Nesterov <[email protected]> Acked-by: Frederic Weisbecker <[email protected]> Acked-by: Mandeep Singh Baines <[email protected]> Acked-by: Paul E. McKenney <[email protected]> Cc: Tetsuo Handa <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
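The ordering the fix relies on (check liveness while the references are still held, then drop them) can be sketched in a userspace model. Everything here is a stand-in: `struct task`, `get_task()`, `put_task()` and `task_alive()` are hypothetical models of task_struct, get_task_struct(), put_task_struct() and pid_alive(), and `exits_during_break` is a test knob that simulates a task being unhashed while the RCU read lock is dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal model of a task: a reference count and a "still hashed" flag. */
struct task {
	int  usage;
	bool alive;
};

static void get_task(struct task *t) { t->usage++; }

static void put_task(struct task *t)
{
	assert(t->usage > 0);
	t->usage--;
}

/* Models pid_alive(): true while the task has not been unhashed. */
static bool task_alive(const struct task *t) { return t->alive; }

/* Test knob: a task that gets unhashed while the lock is dropped. */
static struct task *exits_during_break;

/* Models rcu_read_unlock(); cond_resched(); rcu_read_lock(). */
static void drop_lock_and_reschedule(void)
{
	if (exits_during_break)
		exits_during_break->alive = false;
}

/*
 * Model of the lock break: pin both tasks, let others run, then decide
 * whether the walk may continue BEFORE the references are dropped, so
 * the task structures are still safe to inspect.
 */
static bool lock_break(struct task *g, struct task *t)
{
	bool can_cont;

	get_task(g);
	get_task(t);
	drop_lock_and_reschedule();
	can_cont = task_alive(g) && task_alive(t);
	put_task(t);
	put_task(g);

	return can_cont;
}
```

If either task is unhashed during the break, `lock_break()` returns false and the caller is expected to stop walking the thread list; the reference counts are back to their starting values in both cases.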
2012-03-05vfork: kill PF_STARTINGOleg Nesterov2-10/+0
Previously it was (ab)used by utrace. Then it was wrongly used by the scheduler code. Currently it is not used; kill it before it finds a new erroneous user. Signed-off-by: Oleg Nesterov <[email protected]> Acked-by: Tejun Heo <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-03-05coredump_wait: don't call complete_vfork_done()Oleg Nesterov3-14/+3
Now that CLONE_VFORK is killable, coredump_wait() no longer needs complete_vfork_done(). zap_threads() should find and kill all tasks with the same ->mm; this includes our parent if ->vfork_done is set. mm_release() becomes the only caller, so unexport complete_vfork_done(). Signed-off-by: Oleg Nesterov <[email protected]> Acked-by: Tejun Heo <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-03-05vfork: make it killableOleg Nesterov2-9/+33
Make vfork() killable.

Change do_fork(CLONE_VFORK) to do wait_for_completion_killable(). If it fails we do not return to user mode and never touch the memory shared with our child.

However, in this case we should clear child->vfork_done before returning; we use task_lock() in do_fork()->wait_for_vfork_done() and complete_vfork_done() to serialize with each other.

Note: now that we use task_lock() we don't really need completion, we could turn task->vfork_done into "task_struct *wake_up_me" but this needs some complications.

NOTE: this and the next patches do not affect in-kernel users of CLONE_VFORK, kernel threads run with all signals ignored including SIGKILL/SIGSTOP.

However, this is obviously a user-visible change. Not only can a fatal signal kill the vforking parent; a sub-thread can also do execve() or exit_group() and kill the thread sleeping in vfork().

Signed-off-by: Oleg Nesterov <[email protected]> Acked-by: Tejun Heo <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
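The serialization between the killed parent and the child can be sketched as a single-threaded userspace model. This is an illustration under stated assumptions, not the patch itself: `struct task`, `struct completion`, `task_lock()` and `wait_killable()` are simplified stand-ins, the `fatal_signal` parameter is a hypothetical knob replacing real signal delivery, and `-EINTR` stands in for the kernel-internal error a killable wait returns.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct completion { bool done; };

struct task {
	struct completion *vfork_done; /* parent's completion, or NULL */
	int lock_depth;                /* models task_lock() nesting */
};

static void task_lock(struct task *t) { t->lock_depth++; }

static void task_unlock(struct task *t)
{
	assert(t->lock_depth > 0);
	t->lock_depth--;
}

/*
 * Models wait_for_completion_killable(): returns 0 once the completion
 * is done, or a negative error if a fatal signal arrives first.
 */
static int wait_killable(struct completion *c, bool fatal_signal)
{
	if (!c->done && fatal_signal)
		return -EINTR; /* stand-in error code (assumption) */
	c->done = true;        /* model: otherwise the wait finishes */
	return 0;
}

/* Parent side: if killed, detach the completion under the task lock. */
static int wait_for_vfork_done(struct task *child, struct completion *vfork,
			       bool fatal_signal)
{
	int killed = wait_killable(vfork, fatal_signal);

	if (killed) {
		task_lock(child);
		child->vfork_done = NULL;
		task_unlock(child);
	}
	return killed;
}

/* Child side: complete only if the parent is still waiting. */
static void complete_vfork_done(struct task *tsk)
{
	struct completion *vfork;

	task_lock(tsk);
	vfork = tsk->vfork_done;
	if (vfork) {
		tsk->vfork_done = NULL;
		vfork->done = true;
	}
	task_unlock(tsk);
}
```

Because both sides read and clear `vfork_done` under the same lock, a child that finishes after its parent was killed sees NULL and never touches the parent's completion.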