Age | Commit message | Author | Files | Lines
2011-01-26 | mm: clear pages_scanned only if draining a pcp adds pages to the buddy allocator | David Rientjes | 1 | -2/+4
Commit 0e093d99763e ("writeback: do not sleep on the congestion queue if there are no congested BDIs or if significant congestion is not being encountered in the current zone") uncovered a livelock in the page allocator that resulted in tasks infinitely looping trying to find memory and kswapd running at 100% cpu. The issue occurs because drain_all_pages() is called immediately following direct reclaim when no memory is freed and try_to_free_pages() returns non-zero because all zones in the zonelist do not have their all_unreclaimable flag set. When draining the per-cpu pagesets back to the buddy allocator for each zone, the zone->pages_scanned counter is cleared to avoid erroneously setting zone->all_unreclaimable later. The problem is that no pages may actually be drained and, thus, the unreclaimable logic never fails direct reclaim so the oom killer may be invoked. This apparently only manifested after wait_iff_congested() was introduced and the zone was full of anonymous memory that would not congest the backing store. The page allocator would infinitely loop if there were no other tasks waiting to be scheduled and clear zone->pages_scanned because of drain_all_pages() as the result of this change before kswapd could scan enough pages to trigger the reclaim logic. Additionally, with every loop of the page allocator and in the reclaim path, kswapd would be kicked and would end up running at 100% cpu. In this scenario, current and kswapd are all running continuously with kswapd incrementing zone->pages_scanned and current clearing it. The problem is even more pronounced when current swaps some of its memory to swap cache and the reclaimable logic then considers all active anonymous memory in the all_unreclaimable logic, which requires a much higher zone->pages_scanned value for try_to_free_pages() to return zero that is never attainable in this scenario. 
Before wait_iff_congested(), the page allocator would incur an unconditional timeout and allow kswapd to elevate zone->pages_scanned to a level that the oom killer would be called the next time it loops. The fix is to only attempt to drain pcp pages if there is actually a quantity to be drained. The unconditional clearing of zone->pages_scanned in free_pcppages_bulk() need not be changed since other callers already ensure that draining will occur. This patch ensures that free_pcppages_bulk() will actually free memory before calling into it from drain_all_pages() so zone->pages_scanned is only cleared if appropriate. Signed-off-by: David Rientjes <[email protected]> Cc: Mel Gorman <[email protected]> Reviewed-by: Johannes Weiner <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Wu Fengguang <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Reviewed-by: Rik van Riel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
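The guard described above can be sketched as a small user-space model. This is an illustration only, not the kernel code: `zone_model`, `pcp_model`, `free_bulk_model()` and `drain_pages_model()` are hypothetical stand-ins, and the model assumes that the bulk-free helper always clears `pages_scanned`, as the message states.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-ins for a zone and a per-cpu pageset. */
struct zone_model {
    unsigned long pages_scanned; /* reclaim progress counter */
    unsigned long free_pages;    /* pages held by the buddy allocator */
};

struct pcp_model {
    int count;                   /* pages held on the per-cpu list */
};

/* Models the bulk free: it clears pages_scanned unconditionally. */
static void free_bulk_model(struct zone_model *zone, int count,
                            struct pcp_model *pcp)
{
    assert(count >= 0 && count <= pcp->count);
    zone->pages_scanned = 0;
    zone->free_pages += (unsigned long)count;
    pcp->count -= count;
}

/*
 * Models the fixed drain path: skip the bulk free when the per-cpu list
 * is empty, so pages_scanned is left alone if nothing would be freed.
 * Returns true if any pages were drained.
 */
static bool drain_pages_model(struct zone_model *zone, struct pcp_model *pcp)
{
    if (pcp->count == 0)
        return false;
    free_bulk_model(zone, pcp->count, pcp);
    return true;
}
```

With an empty per-cpu list the counter keeps whatever value the scanner accumulated, which is what lets the unreclaimable logic eventually trigger.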
2011-01-26 | mm: fix deferred congestion timeout if preferred zone is not allowed | David Rientjes | 2 | -2/+13
Before 0e093d99763e ("writeback: do not sleep on the congestion queue if there are no congested BDIs or if significant congestion is not being encountered in the current zone"), preferred_zone was only used for NUMA statistics, to determine the zoneidx from which to allocate from given the type requested, and whether to utilize memory compaction. wait_iff_congested(), though, uses preferred_zone to determine if the congestion wait should be deferred because its dirty pages are backed by a congested bdi. This incorrectly defers the timeout and busy loops in the page allocator with various cond_resched() calls if preferred_zone is not allowed in the current context, usually consuming 100% of a cpu. This patch ensures preferred_zone is an allowed zone in the fastpath depending on whether current is constrained by its cpuset or nodes in its mempolicy (when the nodemask passed is non-NULL). This is correct since the fastpath allocation always passes ALLOC_CPUSET when trying to allocate memory. In the slowpath, this patch resets preferred_zone to the first zone of the allowed type when the allocation is not constrained by current's cpuset, i.e. it does not pass ALLOC_CPUSET. This patch also ensures preferred_zone is from the set of allowed nodes when called from within direct reclaim since allocations are always constrained by cpusets in this context (it is blockable). Both of these uses of cpuset_current_mems_allowed are protected by get_mems_allowed(). Signed-off-by: David Rientjes <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Wu Fengguang <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Acked-by: Rik van Riel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-01-26 | pps: claim parallel port exclusively | Alexander Gordeev | 2 | -2/+2
Both pps_parport and pps_gen_parport are written in a way that they can't share a port with any other driver. This can result in locking up the process that loads modules or even the whole kernel if the modules are compiled in. Use PARPORT_FLAG_EXCL to indicate this. Signed-off-by: Alexander Gordeev <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-01-26 | pps ktimer: remove noisy message | Rodolfo Giometti | 1 | -2/+0
Signed-off-by: Rodolfo Giometti <[email protected]> Reported-by: Ingo Molnar <[email protected]> Cc: Alexander Gordeev <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-01-26 | parport: make lockdep happy with waitlist_lock | Alexander Gordeev | 1 | -2/+2
parport_unregister_device() should never be used when interrupts are enabled in hardware and an irq handler is registered, so there is no need to disable interrupts when using waitlist_lock. But there is no way to explain these subtle semantics to the lockdep analyzer, so disable interrupts here too to simplify things. The price is negligible. Signed-off-by: Alexander Gordeev <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-01-26 | langwell_gpio: modify EOI handling following change of kernel irq subsystem | Feng Tang | 1 | -2/+7
The latest kernel has many changes in the IRQ subsystem and its interfaces, such as adding "irq_eoi" to struct irq_chip; this patch is a follow-up change for that. Also remove the unnecessary cast for a "void *". Signed-off-by: Feng Tang <[email protected]> Cc: Alek Du <[email protected]> Cc: Alan Cox <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-01-26 | leds: leds-pwm: return proper error if pwm_request failed | Axel Lin | 1 | -0/+1
Return PTR_ERR(led_dat->pwm) instead of 0 if pwm_request failed. Signed-off-by: Axel Lin <[email protected]> Cc: Richard Purdie <[email protected]> Cc: Luotao Fu <[email protected]> Reviewed-by: Dmitry Torokhov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
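The error-pointer convention behind this fix can be modeled in user space as follows. This is a sketch under stated assumptions: the `*_model` helpers imitate the idea of the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() but are not the kernel's definitions, and `request_model()` and `probe_model()` are hypothetical names invented for the illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO_MODEL 4095

/* Encode a negative errno value in a pointer (illustrative model). */
static void *err_ptr_model(long error)
{
    assert(error < 0 && error >= -MAX_ERRNO_MODEL);
    return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static long ptr_err_model(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* True when the pointer lies in the top MAX_ERRNO_MODEL addresses. */
static int is_err_model(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO_MODEL;
}

static int dummy_device;

/* Hypothetical request function: fails with -EBUSY for a negative id. */
static void *request_model(int id)
{
    if (id < 0)
        return err_ptr_model(-EBUSY);
    return &dummy_device;
}

/* Propagate the encoded error instead of returning 0 on failure. */
static int probe_model(int id)
{
    void *pwm = request_model(id);

    if (is_err_model(pwm))
        return (int)ptr_err_model(pwm);
    return 0;
}
```

A caller that returned 0 in the failure branch would report success for a device it never obtained, which is the behavior the patch removes.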
2011-01-26 | mm/pgtable-generic.c: fix CONFIG_SWAP=n build | Andrew Morton | 1 | -0/+1
mips (and sparc32):

In file included from arch/mips/include/asm/tlb.h:21,
                 from mm/pgtable-generic.c:9:
include/asm-generic/tlb.h: In function `tlb_flush_mmu':
include/asm-generic/tlb.h:76: error: implicit declaration of function `release_pages'
include/asm-generic/tlb.h: In function `tlb_remove_page':
include/asm-generic/tlb.h:105: error: implicit declaration of function `page_cache_release'

free_pages_and_swap_cache() and free_page_and_swap_cache() are macros which call release_pages() and page_cache_release(). The obvious fix is to include pagemap.h in swap.h, where those macros are defined. But that breaks sparc for weird reasons. So fix it within mm/pgtable-generic.c instead.

Reported-by: Yoichi Yuasa <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Acked-by: Sam Ravnborg <[email protected]>
Cc: Sergei Shtylyov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
2011-01-26 | thp: fix PARAVIRT x86 32bit noPAE | Andrea Arcangeli | 1 | -3/+2
This fixes TRANSPARENT_HUGEPAGE=y with PARAVIRT=y and HIGHMEM64=n. The #ifdef that this patch removes was erratically introduced to fix a build error for noPAE (where pmd.pmd doesn't exist). So then the kernel built but it failed at runtime because set_pmd_at was a noop. This will correct it by enabling set_pmd_at for noPAE mode too. Signed-off-by: Andrea Arcangeli <[email protected]> Reported-by: werner <[email protected]> Reported-by: Minchan Kim <[email protected]> Tested-by: Minchan Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-01-26 | Merge branch 'fixes' of master.kernel.org:/home/rmk/linux-2.6-arm | Linus Torvalds | 12 | -138/+107
* 'fixes' of master.kernel.org:/home/rmk/linux-2.6-arm:
  ALSA: AACI: fix timeout duration
  ALSA: AACI: fix timeout condition checking
  ARM: 6636/1: ep93xx: default multiplexed gpio ports to gpio mode
  ARM: 6637/1: Make the argument to virt_to_phys() "const volatile"
  ARM: twd: ensure timer reload is reprogrammed on entry to periodic mode
  ARM: 6635/2: Configure reference clock for Versatile Express timers
  ARM: versatile: name configuration options after actual board names
  ARM: realview: name configuration options after actual board names
  ARM: realview,vexpress: fix section mismatch warning for pen_release
  ARM: 6632/3: mmci: stop using the blockend interrupts
2011-01-26 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2 | Linus Torvalds | 1 | -1/+2
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2:
  nilfs2: fix crash after one superblock became unavailable
2011-01-26 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k | Linus Torvalds | 6 | -32/+18
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
  m68k/amiga: Fix "debug=mem"
  m68k/atari: Rename "scc" to "atari_scc"
  m68k: Uninline strchr()
2011-01-26 | Merge branch 'rmobile-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6 | Linus Torvalds | 9 | -19/+222
* 'rmobile-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
  ARM: mach-shmobile: AG5EVM LCDC / MIPI-DSI platform data
  ARM: mach-shmobile: sh73a0 CPGA fix for PLL CFG bit
  ARM: mach-shmobile: mackerel: clarify shdi/mmcif switch settings
  ARM: mach-shmobile: sh73a0 CPGA fix for IrDA MSTP
  ARM: mach-shmobile: sh73a0 CPGA fix for FRQCRA M3
  ARM: mach-shmobile: remove sh7367 on-chip set_irq_type()
  ARM: mach-shmobile: sh7372 INTCS MFIS2 interrupt update
  ARM: mach-shmobile: ag5evm: Add IrDA support
  ARM: mach-shmobile: clock-sh7372: fixup pllc2 set_rate
  mmc: sh_mmcif: Convert to __raw_xxx() I/O accessors.
  ARM: mach-shmobile: ag5evm requires GPIOLIB
  ARM: mach-shmobile: fix cpu_base of gic_init() on sh73a0
2011-01-26 | Merge branch 'fbdev-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/fbdev-2.6 | Linus Torvalds | 7 | -40/+44
* 'fbdev-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/fbdev-2.6:
  mailmap: Add an entry for Axel Lin.
  video: fix some comments in drivers/video/console/vgacon.c
  drivers/video/bf537-lq035.c: Add missing IS_ERR test
  video: pxa168fb: remove a redundant pxa168fb_check_var call
  video: da8xx-fb: fix fb_probe error path
  video: pxa3xx-gcu: Return -EFAULT when copy_from_user() fails
  video: nuc900fb: properly free resources in nuc900fb_remove
  video: nuc900fb: fix compile error
2011-01-26 | Merge branch 'sh-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6 | Linus Torvalds | 10 | -9/+30
* 'sh-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
  sh: Fix build of sh7750 base boards
  sh: update INTC to clear IRQ sense valid flag
  sh: Fix sh build failure when CONFIG_SFC=m
  sh: fix MSIOF0 SPI on ecovec: it conflicts with VOU
  sh: support XZ-compressed kernel.
  sh: Fix up breakage from asm-generic/pgtable.h changes.
2011-01-26 | KEYS: Fix __key_link_end() quota fixup on error | David Howells | 4 | -20/+27
Fix __key_link_end()'s attempt to fix up the quota if an error occurs. There are two erroneous cases: Firstly, we always decrease the quota if the preallocated replacement keyring needs cleaning up, irrespective of whether or not we should (we may have replaced a pointer rather than adding another pointer). Secondly, we never clean up the quota if we added a pointer without the keyring storage being extended (we allocate multiple pointers at a time, even if we're not going to use them all immediately). We handle this by setting the bottom bit of the preallocation pointer in __key_link_begin() to indicate that the quota needs fixing up, which is then passed to __key_link() (which clears the whole thing) and __key_link_end(). Signed-off-by: David Howells <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
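Carrying a flag in the bottom bit of a pointer, as the message describes, can be sketched like this. The sketch only illustrates the general tagging technique: `prealloc_pack()`, `prealloc_ptr()`, `prealloc_needs_fixup()` and `FIXUP_FLAG` are hypothetical names, and it assumes the pointed-to object is at least 2-byte aligned so the bottom bit is otherwise always zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FIXUP_FLAG ((uintptr_t)1)

/* Pack a pointer and a one-bit flag into a single word. */
static uintptr_t prealloc_pack(void *ptr, bool needs_fixup)
{
    uintptr_t raw = (uintptr_t)ptr;

    /* The bottom bit must be free, i.e. the object is suitably aligned. */
    assert((raw & FIXUP_FLAG) == 0);
    return raw | (needs_fixup ? FIXUP_FLAG : 0);
}

/* Strip the flag to recover the original pointer. */
static void *prealloc_ptr(uintptr_t packed)
{
    return (void *)(packed & ~FIXUP_FLAG);
}

/* Read the flag without disturbing the pointer bits. */
static bool prealloc_needs_fixup(uintptr_t packed)
{
    return (packed & FIXUP_FLAG) != 0;
}
```

The benefit of this design is that the flag travels with the value through every function that already receives it, so no extra parameter is needed.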
2011-01-26 | intel_scu_ipcutils: Fix the license tag | Alan Cox | 1 | -1/+1
GPL V2 should be GPL v2 Signed-off-by: Alan Cox <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-01-26 | Documentation: Fix kernel parameter ordering | Alan Cox | 1 | -2/+2
A B C D E ... Signed-off-by: Alan Cox <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-01-26 | intel_scu_ipc: fix signedness bug | Axel Lin | 1 | -4/+3
busy_loop() returns a negative error code, so change the err variable from u32 to int to propagate the error code correctly. Also remove unneeded initialization for the err and i variables. Signed-off-by: Axel Lin <[email protected]> Signed-off-by: Alan Cox <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
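Why an unsigned holder is the wrong type for a negative error code can be shown with a short user-space sketch. `busy_loop_model()` is a hypothetical stand-in that returns 0 or `-ETIMEDOUT`; the two helpers widen the stored value to make the difference visible.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in: returns 0 on success, -ETIMEDOUT on timeout. */
static int busy_loop_model(bool timed_out)
{
    return timed_out ? -ETIMEDOUT : 0;
}

/* Error held in a 32-bit unsigned variable: the sign is lost. */
static long long widen_via_u32(bool timed_out)
{
    uint32_t err = (uint32_t)busy_loop_model(timed_out);

    return (long long)err; /* large positive value on failure */
}

/* Error held in an int: the negative value survives intact. */
static long long widen_via_int(bool timed_out)
{
    int err = busy_loop_model(timed_out);

    assert(err <= 0); /* the model only returns 0 or a negative errno */
    return (long long)err;
}
```

Any caller that tests the unsigned value with `< 0` would never see the failure, since the stored value wraps to 2^32 minus the errno number.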
2011-01-25 | bonding: update documentation - alternate configuration. | Nicolas de Pesloüan | 1 | -12/+71
The bonding documentation used to provide configuration details and examples for initscripts and sysconfig only. This patch describes the third possible configuration: /etc/network/interfaces. Signed-off-by: Nicolas de Pesloüan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-01-25 | TCP: fix a bug that triggers large number of TCP RST by mistake | Jerry Chu | 1 | -1/+1
This patch fixes a bug that causes TCP RST packets to be generated on otherwise correctly behaved applications, e.g., no unread data on close,..., etc. To trigger the bug, at least two conditions must be met: 1. The FIN flag is set on the last data packet, i.e., it's not on a separate, FIN only packet. 2. The size of the last data chunk on the receive side matches exactly with the size of buffer posted by the receiver, and the receiver closes the socket without any further read attempt. This bug was first noticed on our netperf based testbed for our IW10 proposal to IETF where a large number of RST packets were observed. netperf's read side code meets the condition 2 above 100%. Before the fix, tcp_data_queue() will queue the last skb that meets condition 1 to sk_receive_queue even though it has fully copied out (skb_copy_datagram_iovec()) the data. Then if condition 2 is also met, tcp_recvmsg() often returns all the copied out data successfully without actually consuming the skb, due to a check "if ((chunk = len - tp->ucopy.len) != 0) {" and "len -= chunk;" after tcp_prequeue_process() that causes "len" to become 0 and an early exit from the big while loop. I don't see any reason not to free the skb whose data have been fully consumed in tcp_data_queue(), regardless of the FIN flag. We won't get there if MSG_PEEK is on. Am I missing some arcane cases related to urgent data? Signed-off-by: H.K. Jerry Chu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-01-25 | MAINTAINERS: remove Reinette Chatre as iwlwifi maintainer | Reinette Chatre | 1 | -1/+0
Signed-off-by: Reinette Chatre <[email protected]> Signed-off-by: Wey-Yi Guy <[email protected]> Signed-off-by: John W. Linville <[email protected]>
2011-01-25 | rt2x00: add device id for windy31 usb device | Greg Kroah-Hartman | 1 | -0/+1
This patch adds the device id for the windy31 USB device to the rt73usb driver. Thanks to Ralf Flaxa for reporting this and providing testing and a sample device. Reported-by: Ralf Flaxa <[email protected]> Tested-by: Ralf Flaxa <[email protected]> Cc: stable <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Acked-by: Ivo van Doorn <[email protected]> Signed-off-by: John W. Linville <[email protected]>
2011-01-25 | mac80211: fix a crash in ieee80211_beacon_get_tim on change_interface | Felix Fietkau | 1 | -0/+3
Some drivers (e.g. ath9k) do not always disable beacons when they're supposed to. When an interface is changed using the change_interface op, the mode specific sdata part is in an undefined state and trying to get a beacon at this point can produce weird crashes. To fix this, add a check for ieee80211_sdata_running before using anything from the sdata. Signed-off-by: Felix Fietkau <[email protected]> Cc: [email protected] Signed-off-by: John W. Linville <[email protected]>
2011-01-25 | Merge branch 'mmci' into fixes | Russell King | 2 | -82/+21
2011-01-25 | ALSA: AACI: fix timeout duration | Russell King | 1 | -19/+23
Relying on the access time of peripherals is unreliable - it depends on the speed of the CPU and the bus. On Versatile Express, these timeouts were expiring, causing the driver to fail. Add udelay(1) to ensure that they don't expire early, and adjust timeouts to give a reasonable margin over the response times. Signed-off-by: Russell King <[email protected]>
2011-01-25 | ALSA: AACI: fix timeout condition checking | Russell King | 1 | -3/+3
Ensure that a timeout coincident with the condition being waited for results in success rather than failure. This helps avoid timeout conditions being inappropriately flagged. Signed-off-by: Russell King <[email protected]>
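The principle stated above, deciding success from the awaited condition rather than from the timeout counter, can be sketched with a hypothetical polled device. Nothing here is taken from the AACI driver: `device_model`, `device_ready()`, `wait_fragile()` and `wait_robust()` are invented for the illustration, and the fragile variant is just one possible way such a bug can be written.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device that becomes ready after a fixed number of polls. */
struct device_model {
    int polls_until_ready;
};

static bool device_ready(struct device_model *dev)
{
    if (dev->polls_until_ready > 0)
        dev->polls_until_ready--;
    return dev->polls_until_ready == 0;
}

/*
 * Fragile: judges the outcome from the counter. If the device becomes
 * ready on the very last permitted poll, the counter is already 0 and
 * the wait is reported as a failure.
 */
static bool wait_fragile(struct device_model *dev, int timeout)
{
    while (timeout-- > 0) {
        if (device_ready(dev))
            break;
    }
    return timeout > 0;
}

/* Robust: judges the outcome from the condition that was waited for. */
static bool wait_robust(struct device_model *dev, int timeout)
{
    bool ready = false;

    assert(timeout > 0);
    while (!ready && timeout-- > 0)
        ready = device_ready(dev);
    return ready;
}
```

Both variants poll the same number of times; they differ only in how the final state is interpreted.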
2011-01-25 | ARM: 6636/1: ep93xx: default multiplexed gpio ports to gpio mode | Hartley Sweeten | 1 | -0/+7
The EP93xx C and D GPIO ports are multiplexed with the Keypad Interface peripheral.  At power-up they default into non-GPIO mode with the Key Matrix controller enabled so these ports are unusable for GPIO.  Note that the Keypad Interface peripheral is only available in the EP9307, EP9312, and EP9315 processor variants. The keypad support will clear the DeviceConfig bits appropriately to enable the Keypad Interface when the driver is loaded.  And, when the driver is unloaded it will set the bits to return the ports to GPIO mode. To make these ports available for GPIO after power-up on all EP93xx processor variants, set the KEYS and GONK bits in the DeviceConfig register. Similarly, the E, G, and H ports are multiplexed with the IDE Interface peripheral.  At power-up these also default into non-GPIO mode.  Note that the IDE peripheral is only available in the EP9312 and EP9315 processor variants. Since an IDE driver is not even available in mainline, set the EONIDE, GONIDE, and HONIDE bits in the DeviceConfig register so that these ports will be available for GPIO use after power-up. Signed-off-by: H Hartley Sweeten <[email protected]> Acked-by: Ryan Mallon <[email protected]> Signed-off-by: Russell King <[email protected]>
2011-01-25 | ARM: 6637/1: Make the argument to virt_to_phys() "const volatile" | Catalin Marinas | 1 | -1/+1
Changing the virt_to_phys() argument to "const volatile void *" avoids compiler warnings in some situations where this function is used. Signed-off-by: Catalin Marinas <[email protected]> Acked-by: Stephen Boyd <[email protected]> Acked-by: Arnd Bergmann <[email protected]> Signed-off-by: Russell King <[email protected]>
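The effect of the wider parameter type can be illustrated with a small sketch. `virt_to_phys_model()` is a hypothetical identity-mapped stand-in, not the ARM implementation; the only point is that a `const volatile void *` parameter accepts plain, const, volatile, and const volatile pointers without a cast, so no qualifier-discarding warning arises at the call site.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical identity-mapped translation. Only the parameter type
 * matters here: qualifiers may be added implicitly, never dropped.
 */
static uintptr_t virt_to_phys_model(const volatile void *addr)
{
    assert(addr != 0);
    return (uintptr_t)addr;
}
```

A parameter declared as plain `void *` would make a compiler warn when a caller passes the address of a `const` or `volatile` object.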
2011-01-25 | ARM: twd: ensure timer reload is reprogrammed on entry to periodic mode | Russell King | 1 | -5/+2
Ensure that the twd timer reload value is reprogrammed each time we enter periodic mode. This ensures that the reload value is always reset correctly. Tested-by: Santosh Shilimkar <[email protected]> Acked-by: Colin Cross <[email protected]> Signed-off-by: Russell King <[email protected]>
2011-01-25 | ipv6: Revert 'administrative down' address handling changes. | David S. Miller | 1 | -48/+33
This reverts the following set of commits: d1ed113f1669390da9898da3beddcc058d938587 ("ipv6: remove duplicate neigh_ifdown") 29ba5fed1bbd09c2cba890798c8f9eaab251401d ("ipv6: don't flush routes when setting loopback down") 9d82ca98f71fd686ef2f3017c5e3e6a4871b6e46 ("ipv6: fix missing in6_ifa_put in addrconf") 2de795707294972f6c34bae9de713e502c431296 ("ipv6: addrconf: don't remove address state on ifdown if the address is being kept") 8595805aafc8b077e01804c9a3668e9aa3510e89 ("IPv6: only notify protocols if address is compeletely gone") 27bdb2abcc5edb3526e25407b74bf17d1872c329 ("IPv6: keep tentative addresses in hash table") 93fa159abe50d3c55c7f83622d3f5c09b6e06f4b ("IPv6: keep route for tentative address") 8f37ada5b5f6bfb4d251a7f510f249cb855b77b3 ("IPv6: fix race between cleanup and add/delete address") 84e8b803f1e16f3a2b8b80f80a63fa2f2f8a9be6 ("IPv6: addrconf notify when address is unavailable") dc2b99f71ef477a31020511876ab4403fb7c4420 ("IPv6: keep permanent addresses on admin down") because the core semantic change to ipv6 address handling on ifdown has broken some things, in particular "disable_ipv6" sysctl handling. Stephen has made several attempts to get things back in working order, but nothing has restored disable_ipv6 fully yet. Reported-by: Eric W. Biederman <[email protected]> Tested-by: Eric W. Biederman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-01-25 | NFS: nfs_wcc_update_inode() should set nfsi->attr_gencount | Trond Myklebust | 1 | -9/+17
If the call to nfs_wcc_update_inode() results in an attribute update, we need to ensure that the inode's attr_gencount gets bumped too, otherwise we are not protected against races with other GETATTR calls. Signed-off-by: Trond Myklebust <[email protected]>
2011-01-25 | NFS improve pnfs_put_deviceid_cache debug print | Andy Adamson | 1 | -1/+1
What we really want to know is the ref count. Signed-off-by: Andy Adamson <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2011-01-25 | NFS fix cb_sequence error processing | Andy Adamson | 1 | -1/+1
Always assign the cb_process_state nfs_client pointer so a processing error in cb_sequence after the nfs_client is found and referenced returns a non-NULL cb_process_state nfs_client and the matching nfs_put_client in nfs4_callback_compound dereferences the client. Signed-off-by: Andy Adamson <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2011-01-25 | NFS do not find client in NFSv4 pg_authenticate | Andy Adamson | 10 | -128/+42
The information required to find the nfs_client corresponding to the incoming back channel request is contained in the NFS layer. Perform minimal checking in the RPC layer pg_authenticate method, and push more detailed checking into the NFS layer where the nfs_client can be found. Signed-off-by: Andy Adamson <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2011-01-25 | NLM: Fix "kernel BUG at fs/lockd/host.c:417!" or ".../host.c:283!" | Chuck Lever | 1 | -4/+5
Nick Bowler <[email protected]> reports:

> We were just having some NFS server troubles, and my client machine
> running 2.6.38-rc1+ (specifically, commit 2b1caf6ed7b888c95) crashed
> hard (syslog output appended to this mail).
>
> I'm not sure what the exact timeline was or how to reproduce this,
> but the server was rebooted during all this. Since I've never seen
> this happen before, it is possibly a regression from previous kernel
> releases. However, I recently updated my nfs-utils (on the client) to
> version 1.2.3, so that might be related as well.

[ BUG output redacted ]

When done searching, the for_each_host loop in next_host_state() falls through and returns the final host on the host chain without bumping its reference count.

Since the host's ref count is only one at that point, releasing the host in nlm_host_rebooted() attempts to destroy the host prematurely, and therefore hits a BUG().

Likely, the original intent of the for_each_host behavior in next_host_state() was to handle the case when the host chain is empty. Searching the chain and finding no suitable host to return needs to be handled as well.

Defensively restructure next_host_state() always to return NULL when the loop falls through.

Introduced by commit b10e30f6 "lockd: reorganize nlm_host_rebooted".

Cc: J. Bruce Fields <[email protected]>
Signed-off-by: Chuck Lever <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
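The fall-through problem can be modeled with a plain linked list. This is a simplified sketch, not the lockd code: `host_model`, `next_host_buggy()` and `next_host_fixed()` are hypothetical, and the buggy variant merely imitates a loop whose cursor still points at the last entry once the search ends without a match.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host entry with a reference count. */
struct host_model {
    int wants_update;
    int refcount;
    struct host_model *next;
};

/*
 * Buggy shape: the cursor is assigned before the test, so after an
 * unsuccessful search it still names the last entry, which is then
 * returned without taking a reference.
 */
static struct host_model *next_host_buggy(struct host_model *head)
{
    struct host_model *host = NULL;
    struct host_model *pos;

    for (pos = head; pos != NULL; pos = pos->next) {
        host = pos;
        if (host->wants_update) {
            host->refcount++;
            return host;
        }
    }
    return host;
}

/* Fixed shape: a search that finds nothing always returns NULL. */
static struct host_model *next_host_fixed(struct host_model *head)
{
    struct host_model *pos;

    for (pos = head; pos != NULL; pos = pos->next) {
        assert(pos->refcount > 0); /* live entries hold a reference */
        if (pos->wants_update) {
            pos->refcount++;
            return pos;
        }
    }
    return NULL;
}
```

A caller that drops a reference on whatever the buggy variant returns would release an entry it never acquired.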
2011-01-25 | NFS: Prevent memory allocation failure in nfsacl_encode() | Chuck Lever | 4 | -13/+31
nfsacl_encode() allocates memory in certain cases. This of course is not guaranteed to work. Since commit 9f06c719 "SUNRPC: New xdr_streams XDR encoder API", the kernel's XDR encoders can't return a result indicating a possible failure, so a memory allocation failure in nfsacl_encode() has become fatal (i.e., the XDR code Oopses) in some cases. However, the allocated memory is a tiny fixed amount, on the order of 40-50 bytes. We can easily use a stack-allocated buffer for this, with only a wee bit of nose-holding. Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2011-01-25 | NFS: nfsacl_{encode,decode} should return signed integer | Chuck Lever | 2 | -8/+28
Clean up. The nfsacl_encode() and nfsacl_decode() functions return negative errno values, and each call site verifies that the returned value is not negative. Change the synopsis of both of these functions to reflect this usage. Document the synopsis and return values. Reported-by: Trond Myklebust <[email protected]> Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2011-01-25 | NFS: Fix "kernel BUG at fs/nfs/nfs3xdr.c:1338!" | Chuck Lever | 1 | -1/+4
Milan Broz <[email protected]> reports:

> on today Linus' tree I get OOps if using nfs.
>
> server (2.6.36) exports dir:
> /dir 172.16.1.0/24(rw,async,all_squash,no_subtree_check,anonuid=500,anongid=500)
>
> on client it is mounted in fstab
> server:/dir /mnt/tst nfs rw,soft 0 0
>
> and these commands OOpses it (simplified from a configure script):
>
> cd /dir
> touch x
> install x y
>
> [ 105.327701] ------------[ cut here ]------------
> [ 105.327979] kernel BUG at fs/nfs/nfs3xdr.c:1338!
> [ 105.328075] invalid opcode: 0000 [#1] PREEMPT SMP
> [ 105.328223] last sysfs file: /sys/devices/virtual/bdi/0:16/uevent
> [ 105.328349] Modules linked in: usbcore dm_mod
> [ 105.328553]
> [ 105.328678] Pid: 3710, comm: install Not tainted 2.6.37+ #423 440BX Desktop Reference Platform/VMware Virtual Platform
> [ 105.328853] EIP: 0060:[<c116c06c>] EFLAGS: 00010282 CPU: 0
> [ 105.329152] EIP is at nfs3_xdr_enc_setacl3args+0x61/0x98
> [ 105.329249] EAX: ffffffea EBX: ce941d98 ECX: 00000000 EDX: 00000004
> [ 105.329340] ESI: ce941cd0 EDI: 000000a4 EBP: ce941cc0 ESP: ce941cb4
> [ 105.329431] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
> [ 105.329525] Process install (pid: 3710, ti=ce940000 task=ced36f20 task.ti=ce940000)
> [ 105.336600] Stack:
> [ 105.336693] ce941cd0 ce9dc000 00000000 ce941cf8 c12ecd02 c12f43e0 c116c00b cf754158
> [ 105.336982] ce9dc004 cf754284 ce9dc004 cf7ffee8 ceff9978 ce9dc000 cf7ffee8 ce9dc000
> [ 105.337182] ce9dc000 ce941d14 c12e698d cf75412c ce941d98 cf7ffee8 cf7fff20 00000000
> [ 105.337405] Call Trace:
> [ 105.337695] [<c12ecd02>] rpcauth_wrap_req+0x75/0x7f
> [ 105.337806] [<c12f43e0>] ? xdr_encode_opaque+0x12/0x15
> [ 105.337898] [<c116c00b>] ? nfs3_xdr_enc_setacl3args+0x0/0x98
> [ 105.337988] [<c12e698d>] call_transmit+0x17e/0x1e8
> [ 105.338072] [<c12ec307>] __rpc_execute+0x6d/0x1a6
> [ 105.338155] [<c12ec474>] rpc_execute+0x34/0x37
> [ 105.338235] [<c12e738d>] rpc_run_task+0xb5/0xbd
> [ 105.338316] [<c12e7474>] rpc_call_sync+0x3d/0x58
> [ 105.338402] [<c116d0c6>] nfs3_proc_setacls+0x18e/0x24f
> [ 105.338493] [<c10b3f76>] ? __kmalloc+0x148/0x1c4
> [ 105.338579] [<c10ecd01>] ? posix_acl_alloc+0x12/0x22
> [ 105.338665] [<c116d5c8>] nfs3_proc_setacl+0xa0/0xca
> [ 105.338748] [<c116d69c>] nfs3_setxattr+0x62/0x88
> [ 105.338834] [<c1317042>] ? sub_preempt_count+0x7c/0x89
> [ 105.338926] [<c116d63a>] ? nfs3_setxattr+0x0/0x88
> [ 105.339026] [<c10cfa79>] __vfs_setxattr_noperm+0x26/0x95
> [ 105.339114] [<c10cfb43>] vfs_setxattr+0x5b/0x76
> [ 105.339211] [<c10cfbfb>] setxattr+0x9d/0xc3
> [ 105.339298] [<c10a2ea8>] ? handle_pte_fault+0x258/0x5cb
> [ 105.339428] [<c1091ff6>] ? __free_pages+0x1a/0x23
> [ 105.339517] [<c10498ea>] ? up_read+0x16/0x2c
> [ 105.339599] [<c10b8365>] ? fget+0x0/0xa3
> [ 105.339677] [<c10b8365>] ? fget+0x0/0xa3
> [ 105.339760] [<c1025d23>] ? get_parent_ip+0xb/0x31
> [ 105.339843] [<c1317042>] ? sub_preempt_count+0x7c/0x89
> [ 105.339931] [<c10cfc72>] sys_fsetxattr+0x51/0x79
> [ 105.340014] [<c1002853>] sysenter_do_call+0x12/0x32
> [ 105.340133] Code: 2e 76 18 00 58 31 d2 8b 7f 28 f6 43 04 01 74 03 8b 53 08 6a 00 8b 46 04 6a 01 8b 0b 52 89 fa e8 85 10 f8 ff 83 c4 0c 85 c0 79 04 <0f> 0b eb fe 31 c9 f6 43 04 04 74 03 8b 4b 0c 68 00 10 00 00 8d
> [ 105.350321] EIP: [<c116c06c>] nfs3_xdr_enc_setacl3args+0x61/0x98 SS:ESP 0068:ce941cb4
> [ 105.364385] ---[ end trace 01fcfe7f0f7f6e4a ]---

nfs3_xdr_enc_setacl3args() is not properly setting up the target buffer before nfsacl_encode() attempts to encode the ACL.

Introduced by commit d9c407b1 "NFS: Introduce new-style XDR encoding functions for NFSv3."

Signed-off-by: Chuck Lever <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
2011-01-25 | NFS: Fix "kernel BUG at fs/aio.c:554!" | Chuck Lever | 1 | -14/+20
Nick Piggin reports:

> I'm getting use after frees in aio code in NFS
>
> [ 2703.396766] Call Trace:
> [ 2703.396858] [<ffffffff8100b057>] ? native_sched_clock+0x27/0x80
> [ 2703.396959] [<ffffffff8108509e>] ? put_lock_stats+0xe/0x40
> [ 2703.397058] [<ffffffff81088348>] ? lock_release_holdtime+0xa8/0x140
> [ 2703.397159] [<ffffffff8108a2a5>] lock_acquire+0x95/0x1b0
> [ 2703.397260] [<ffffffff811627db>] ? aio_put_req+0x2b/0x60
> [ 2703.397361] [<ffffffff81039701>] ? get_parent_ip+0x11/0x50
> [ 2703.397464] [<ffffffff81612a31>] _raw_spin_lock_irq+0x41/0x80
> [ 2703.397564] [<ffffffff811627db>] ? aio_put_req+0x2b/0x60
> [ 2703.397662] [<ffffffff811627db>] aio_put_req+0x2b/0x60
> [ 2703.397761] [<ffffffff811647fe>] do_io_submit+0x2be/0x7c0
> [ 2703.397895] [<ffffffff81164d0b>] sys_io_submit+0xb/0x10
> [ 2703.397995] [<ffffffff8100307b>] system_call_fastpath+0x16/0x1b
>
> Adding some tracing, it is due to nfs completing the request then
> returning something other than -EIOCBQUEUED, so aio.c
> also completes the request.

To address this, prevent the NFS direct I/O engine from completing async iocbs when the forward path returns an error without starting any I/O.

This fix appears to survive ^C during both "xfstest no. 208" and "fsx -Z."

It's likely this bug has existed for a very long while, as we are seeing very similar symptoms in OEL 5. Copying stable.

Cc: Stable <[email protected]>
Signed-off-by: Chuck Lever <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
2011-01-25 | NFS4: Avoid potential NULL pointer dereference in decode_and_add_ds(). | Jesper Juhl | 1 | -2/+7
On Mon, 17 Jan 2011, Mi Jinlong wrote:

>
>
> Jesper Juhl:
> > strrchr() can return NULL if nothing is found. If this happens we'll
> > dereference a NULL pointer in
> > fs/nfs/nfs4filelayoutdev.c::decode_and_add_ds().
> >
> > I tried to find some other code that guarantees that this can never
> > happen but I was unsuccessful. So, unless someone else can point to some
> > code that ensures this can never be a problem, I believe this patch is
> > needed.
> >
> > While I was changing this code I also noticed that all the dprintk()
> > statements, except one, start with "%s:". The one missing the ":" I added
> > it to.
>
> Maybe another one also should be changed at decode_and_add_ds() at line 243:
>
> 243 printk("%s Decoded address and port %s\n", __func__, buf);
>

Missed that one. Thanks.

Signed-off-by: Jesper Juhl <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
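The defensive check discussed in this thread can be sketched in user space. `split_at_last_dot()` is a hypothetical helper, not the function in fs/nfs/nfs4filelayoutdev.c; it only shows the pattern of testing the result of strrchr(), a standard C library function that returns NULL when the character does not occur, before dereferencing it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: cut buf at its last '.' and point *tail at the
 * text after it. Returns 0 on success, or -1 if buf contains no '.',
 * in which case buf and *tail are left untouched.
 */
static int split_at_last_dot(char *buf, const char **tail)
{
    char *sep;

    assert(buf != NULL && tail != NULL);

    sep = strrchr(buf, '.');
    if (sep == NULL)
        return -1; /* nothing found: do not dereference */

    *sep = '\0';
    *tail = sep + 1;
    return 0;
}
```

Without the NULL test, the write through `sep` would be a NULL pointer dereference for any input that lacks the separator.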
2011-01-25PM / Runtime: Don't enable interrupts while running in_interruptAlan Stern1-3/+6
This patch (as1445) fixes a bug in the runtime PM core left over from the addition of the no_callbacks flag. If this flag is set then it is possible for rpm_suspend() to be called in_interrupt, so when releasing spinlocks it's important not to re-enable interrupts. To avoid an unnecessary save-and-restore of the interrupt flag, the patch also inlines a pm_request_idle() call. This fixes Bugzilla #27482. (The offending code was added in 2.6.37, so it's not necessary to apply this to any earlier stable kernels.) Signed-off-by: Alan Stern <[email protected]> Reported-by: tim blechmann <[email protected]> CC: <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
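The hazard described above can be illustrated with a small userspace model. This is a sketch only: a boolean stands in for the CPU interrupt flag, the `sketch_` functions merely mimic the semantics of spin_lock_irq()/spin_unlock_irq() and spin_lock_irqsave()/spin_unlock_irqrestore(), and none of it is the runtime PM code. The commit message says the patch avoids a save-and-restore, so the second variant is shown only as the common general-purpose idiom, not as what the patch does.

```c
#include <stdbool.h>

/* Models the CPU interrupt-enable flag (assumption for illustration). */
static bool sketch_irqs_enabled = true;

/* Mimic spin_lock_irq()/spin_unlock_irq(): unconditional disable/enable. */
static void sketch_lock_irq(void)   { sketch_irqs_enabled = false; }
static void sketch_unlock_irq(void) { sketch_irqs_enabled = true; }

/* Mimic spin_lock_irqsave()/spin_unlock_irqrestore(): remember the
 * previous state and put it back on unlock. */
static bool sketch_lock_irqsave(void)
{
    bool flags = sketch_irqs_enabled;

    sketch_irqs_enabled = false;
    return flags;
}

static void sketch_unlock_irqrestore(bool flags)
{
    sketch_irqs_enabled = flags;
}

/* Enter a critical section from a context that already has interrupts
 * off, then report the interrupt state afterwards. */
static bool sketch_after_plain_unlock(void)
{
    sketch_irqs_enabled = false;    /* caller runs with interrupts off */
    sketch_lock_irq();
    sketch_unlock_irq();
    return sketch_irqs_enabled;     /* interrupts were wrongly re-enabled */
}

static bool sketch_after_restore(void)
{
    bool flags;

    sketch_irqs_enabled = false;    /* caller runs with interrupts off */
    flags = sketch_lock_irqsave();
    sketch_unlock_irqrestore(flags);
    return sketch_irqs_enabled;     /* interrupts stay off */
}
```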
2011-01-25CIFS: Add strictcache mount optionPavel Shilovsky2-0/+10
Used to switch on strict cache mode. In this mode the client reads from the cache whenever it holds a Level II oplock; otherwise it reads from the server. For writes, the client stores data in the cache if it holds an Exclusive oplock; otherwise it writes directly to the server. Signed-off-by: Pavel Shilovsky <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Steve French <[email protected]>
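The read/write rules of strict cache mode can be written down as two predicates. This is a minimal sketch, not the CIFS implementation: the enum and function names are hypothetical, and it assumes that an Exclusive oplock also allows cached reads, which the message above does not state explicitly.

```c
#include <stdbool.h>

/* Hypothetical oplock levels, ordered from weakest to strongest. */
enum sketch_oplock {
    SKETCH_OPLOCK_NONE,
    SKETCH_OPLOCK_LEVEL_II,
    SKETCH_OPLOCK_EXCLUSIVE
};

/* Reads may be served from the cache with Level II or, by assumption,
 * anything stronger; otherwise they go to the server. */
static bool sketch_read_from_cache(enum sketch_oplock level)
{
    return level >= SKETCH_OPLOCK_LEVEL_II;
}

/* Writes may stay in the cache only with an Exclusive oplock;
 * otherwise they go directly to the server. */
static bool sketch_write_to_cache(enum sketch_oplock level)
{
    return level == SKETCH_OPLOCK_EXCLUSIVE;
}
```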
2011-01-25CIFS: Implement cifs_strict_writev (try #4)Pavel Shilovsky4-6/+217
If we don't have an Exclusive oplock, we write the data to the server. Also set the invalidate_mapping flag on the inode if we wrote something to the server. Add cifs_iovec_write to let the client write iovec buffers through CIFSSMBWrite2. Signed-off-by: Pavel Shilovsky <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Steve French <[email protected]>
2011-01-25[CIFS] Replace cifs md5 hashing functions with kernel crypto APIsSteve French6-416/+51
Replace remaining use of md5 hash functions local to cifs module with kernel crypto APIs. Remove header and source file containing those local functions. Signed-off-by: Shirish Pargaonkar <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Steve French <[email protected]>
2011-01-25ALSA: HDA: Fix dmesg output of HDMI supported bitsDavid Henningsson1-1/+1
This typo caused the dmesg output of the supported bits of HDMI to be cut off early. Cc: [email protected] Signed-off-by: David Henningsson <[email protected]> Signed-off-by: Takashi Iwai <[email protected]>
2011-01-25hwmon: (lis3) turn down the no IRQ messageKalhan Trisal1-1/+1
Turn down the no IRQ message - on some platforms that's a normal state of affairs. Signed-off-by: Kalhan Trisal <[email protected]> Signed-off-by: Alan Cox <[email protected]> Acked-by: Eric Piel <[email protected]> Signed-off-by: Guenter Roeck <[email protected]>
2011-01-25ALSA: fix invalid hardware.h include in ac97c for AVR32 architectureHans-Christian Egtvedt1-1/+4
This patch fixes the non-compiling AC97C driver for the AVR32 architecture by including mach/hardware.h only for the AT91 architecture. The AVR32 architecture does not supply the hardware.h include file. Signed-off-by: Hans-Christian Egtvedt <[email protected]> CC: [email protected] Signed-off-by: Takashi Iwai <[email protected]>
2011-01-25ARM: 6635/2: Configure reference clock for Versatile Express timersPawel Moll2-0/+15
Timers on the Versatile Express mainboard are used as system clock/event sources. The driver assumes that they are clocked with a 1MHz signal. Old V2M firmware apparently configured it by default, but on newer boards one can observe that the "sleep 1" command takes over 30 seconds to finish, as the timers are fed with 32kHz instead... This patch performs the required magic and also removes the code clearing the timer's control registers, as exactly the same operations are performed by the timer driver a few jiffies later. Signed-off-by: Pawel Moll <[email protected]> Tested-by: Will Deacon <[email protected]> Signed-off-by: Russell King <[email protected]>
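The "over 30 seconds" figure follows from the ratio of the assumed and actual clock rates. The sketch below works through that arithmetic; `sketch_real_delay` is a hypothetical helper, and treating "32kHz" as 32768 Hz is an assumption (a flat 32000 Hz gives a slightly larger result).

```c
/* Real time, in seconds, taken by a delay that was programmed as
 * `seconds` under the belief that the timer runs at assumed_hz,
 * when it actually runs at actual_hz. */
static double sketch_real_delay(double seconds, double assumed_hz,
                                double actual_hz)
{
    double ticks = seconds * assumed_hz;    /* ticks the driver programs */

    return ticks / actual_hz;               /* time those ticks really take */
}
```

Calling `sketch_real_delay(1.0, 1000000.0, 32768.0)` gives roughly 30.5 seconds, in line with the observation in the message.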
2011-01-25ceph: avoid picking MDS that is not activeSage Weil1-3/+7
Ignore replication or auth frag data if it indicates an MDS that is not active. This can happen if the MDS shuts down and the client has stale data about the namespace distribution across the MDS cluster. If that's the case, fall back to directing the request based on the auth cap (which should always be accurate). Signed-off-by: Sage Weil <[email protected]>
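The selection rule can be sketched as a small function. This is not the ceph client code: the function name, the integer MDS ranks, and the `active` array are hypothetical stand-ins for the client's view of the MDS cluster.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical selection: use the MDS suggested by replication or auth
 * frag data only if it is a valid rank and currently active; otherwise
 * fall back to the MDS named by the auth cap. */
static int sketch_choose_mds(int hinted_mds, int auth_cap_mds,
                             const bool *active, size_t nr_mds)
{
    if (hinted_mds >= 0 && (size_t)hinted_mds < nr_mds &&
        active[hinted_mds])
        return hinted_mds;

    return auth_cap_mds;    /* stale or missing hint: trust the auth cap */
}
```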