Age | Commit message | Author | Files | Lines
2010-08-24 | powerpc/pci: Fix checking for child bridges in PCI code. | Grant Likely | 1 | -1/+2
pci_device_to_OF_node() can return null, and list_for_each_entry will never enter the loop when dev is NULL, so it looks like this test is a typo. Reported-by: Julia Lawall <[email protected]> Signed-off-by: Grant Likely <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2010-08-24 | powerpc: Fix typo in uImage target | Anatolij Gustschin | 1 | -1/+1
Commit e32e78c5ee8aadef020fbaecbe6fb741ed9029fd (powerpc: fix build with make 3.82) introduced a typo in uImage target and broke building uImage: make: *** No rule to make target `uImage'. Stop. Signed-off-by: Anatolij Gustschin <[email protected]> Cc: stable <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2010-08-24 | powerpc: Initialise paca->kstack before early_setup_secondary | Matt Evans | 1 | -3/+3
As early setup calls down to slb_initialize(), we must have kstack initialised before checking "should we add a bolted SLB entry for our kstack?" Failing to do so means stack access requires an SLB miss exception to refill an entry dynamically, if the stack isn't accessible via SLB(0) (kernel text & static data). It's not always allowable to take such a miss, and intermittent crashes will result. Primary CPUs don't have this issue; an SLB entry is not bolted for their stack anyway (as that lives within SLB(0)). This patch therefore only affects the init of secondaries. Signed-off-by: Matt Evans <[email protected]> Cc: stable <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2010-08-24 | powerpc: Fix bogus it_blocksize in VIO iommu code | Anton Blanchard | 4 | -7/+8
When looking at some issues with the virtual ethernet driver I noticed that TCE allocation was following a very strange pattern:
address 00e9000 length 2048
address 0409000 length 2048 <-----
address 0429000 length 2048
address 0449000 length 2048
address 0469000 length 2048
address 0489000 length 2048
address 04a9000 length 2048
address 04c9000 length 2048
address 04e9000 length 2048
address 4009000 length 2048 <-----
address 4029000 length 2048
Huge unexplained gaps in what should be an empty TCE table. It turns out it_blocksize, the amount we want to align the next allocation to, was c0000000fe903b20. Completely bogus. Initialise it to something reasonable in the VIO IOMMU code, and use kzalloc everywhere to protect against this when we next add a non-compulsory field to iommu code and forget to initialise it. Signed-off-by: Anton Blanchard <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
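The defensive half of this fix (allocate zeroed memory so a forgotten field cannot hold garbage) can be sketched in userspace C. This is only an illustration: calloc() stands in for kzalloc(), struct demo_table and its it_size and it_hint fields are hypothetical, and the default of 16 is an illustrative value rather than a claim about the patch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an IOMMU table. Only it_blocksize is named in
 * the commit message; the other fields are invented for illustration. */
struct demo_table {
    unsigned long it_size;
    unsigned long it_blocksize; /* alignment wanted for the next allocation */
    unsigned long it_hint;      /* a later-added field nobody initialises */
};

/* calloc() plays the role of kzalloc(): every field starts out as zero,
 * so a field that is never assigned holds 0 instead of stale memory. */
struct demo_table *demo_table_alloc(unsigned long size)
{
    struct demo_table *tbl;

    assert(size > 0);
    tbl = calloc(1, sizeof(*tbl));
    if (tbl == NULL)
        return NULL;

    tbl->it_size = size;
    tbl->it_blocksize = 16; /* explicit default; the value is illustrative */
    /* it_hint is deliberately left alone and is therefore zero */
    return tbl;
}
```

With a non-zeroing allocator the untouched field would contain whatever was previously in that memory, which is how a value like c0000000fe903b20 can end up being used as an alignment.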
2010-08-24 | powerpc: Inline ppc64_runlatch_off | Anton Blanchard | 2 | -9/+14
I'm sick of seeing ppc64_runlatch_off in our profiles, so inline it into the callers. To avoid a mess of circular includes I didn't add it as an inline function. Signed-off-by: Anton Blanchard <[email protected]> Acked-by: Olof Johansson <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2010-08-24 | powerpc: Correct smt_enabled=X boot option for > 2 threads per core | Nathan Fontenot | 2 | -31/+43
The 'smt_enabled=X' boot option does not handle values of X > 2. For Power 7 processors with smt modes of 0,1,2,3, and 4 this does not work. This patch allows the smt_enabled option to be set to any value limited to a max equal to the number of threads per core. Signed-off-by: Nathan Fontenot <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
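The clamping rule described here (accept any value, limited to the number of threads per core) might look roughly like the following userspace sketch. The function name demo_parse_smt_enabled, the handling of "on" and "off", and the fallback for unparsable input are all assumptions made for illustration, not the kernel's actual parser.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parser for an smt_enabled=X style option.
 * Returns the number of threads per core to enable. */
int demo_parse_smt_enabled(const char *arg, int threads_per_core)
{
    char *end;
    long val;

    assert(arg != NULL && threads_per_core > 0);

    if (strcmp(arg, "on") == 0)
        return threads_per_core;
    if (strcmp(arg, "off") == 0)
        return 0;

    val = strtol(arg, &end, 10);
    if (end == arg || *end != '\0' || val < 0)
        return threads_per_core; /* assumption: bad input means "all on" */

    /* The point of the fix: no hard-coded limit of 2, only a cap at the
     * number of threads the core actually has. */
    if (val > threads_per_core)
        val = threads_per_core;
    return (int)val;
}
```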
2010-08-24 | powerpc: Silence xics_migrate_irqs_away() during cpu offline | Signed-off-by: Darren Hart | 1 | -2/+4
All IRQs are migrated away from a CPU that is being offlined so the following messages suggest a problem when the system is behaving as designed: IRQ 262 affinity broken off cpu 1 IRQ 17 affinity broken off cpu 0 IRQ 18 affinity broken off cpu 0 IRQ 19 affinity broken off cpu 0 IRQ 256 affinity broken off cpu 0 IRQ 261 affinity broken off cpu 0 IRQ 262 affinity broken off cpu 0 Don't print these messages when the CPU is not online. Signed-off-by: Darren Hart <[email protected]> Acked-by: Will Schmidt <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Nathan Fontenot <[email protected]> Cc: Robert Jennings <[email protected]> Cc: Brian King <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2010-08-24 | powerpc: Silence __cpu_up() under normal operation | Signed-off-by: Darren Hart | 1 | -2/+2
During CPU offline/online tests __cpu_up would flood the logs with the following message: Processor 0 found. This provides no useful information to the user as there is no context provided, and since the operation was a success (to this point) it is expected that the CPU will come back online, providing all the feedback necessary. Change the "Processor found" message to DBG() similar to other such messages in the same function. Also, add an appropriate log level for the "Processor is stuck" message. Signed-off-by: Darren Hart <[email protected]> Acked-by: Will Schmidt <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Nathan Fontenot <[email protected]> Cc: Robert Jennings <[email protected]> Cc: Brian King <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2010-08-24 | powerpc: Re-enable preemption before cpu_die() | Signed-off-by: Darren Hart | 1 | -1/+1
start_secondary() is called shortly after _start and also via cpu_idle()->cpu_die()->pseries_mach_cpu_die(). start_secondary() expects a preempt_count() of 0. pseries_mach_cpu_die() is called via the cpu_idle() routine with preemption disabled, resulting in the following repeating message during rapid cpu offline/online tests with CONFIG_PREEMPT=y:
BUG: scheduling while atomic: swapper/0/0x00000002
Modules linked in: autofs4 binfmt_misc dm_mirror dm_region_hash dm_log [last unloaded: scsi_wait_scan]
Call Trace:
[c00000010e7079c0] [c0000000000133ec] .show_stack+0xd8/0x218 (unreliable)
[c00000010e707aa0] [c0000000006a47f0] .dump_stack+0x28/0x3c
[c00000010e707b20] [c00000000006e7a4] .__schedule_bug+0x7c/0x9c
[c00000010e707bb0] [c000000000699d9c] .schedule+0x104/0x800
[c00000010e707cd0] [c000000000015b24] .cpu_idle+0x1c4/0x1d8
[c00000010e707d70] [c0000000006aa1b4] .start_secondary+0x398/0x3d4
[c00000010e707e30] [c000000000008278] .start_secondary_resume+0x10/0x14
Move the cpu_die() call inside the existing preemption enabled block of cpu_idle(). This is safe as the idle task is affined to a single CPU so the debug_smp_processor_id() tests (from cpu_should_die()) won't trigger as we are in a "migration disabled" region. Signed-off-by: Darren Hart <[email protected]> Acked-by: Will Schmidt <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Nathan Fontenot <[email protected]> Cc: Robert Jennings <[email protected]> Cc: Brian King <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2010-08-24 | powerpc/pci: Drop unnecessary null test | Julia Lawall | 1 | -2/+1
list_for_each_entry binds its first argument to a non-null value, and thus any null test on the value of that argument is superfluous. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ iterator I; expression x,E,E1,E2; statement S,S1,S2; @@ I(x,...) { <... - if (x != NULL || ...) S ...> } // </smpl> Signed-off-by: Julia Lawall <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2010-08-24 | powerpc/powermac: Drop unnecessary null test | Julia Lawall | 1 | -1/+1
for_each_node_by_name binds its first argument to a non-null value, and thus any null test on the value of that argument is superfluous. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ iterator I; expression x,E; @@ I(x,...) { <... ( - (x != NULL) && E ...> } // </smpl> Signed-off-by: Julia Lawall <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2010-08-24 | powerpc/powermac: Drop unnecessary of_node_put | Julia Lawall | 2 | -3/+0
for_each_node_by_name only exits when its first argument is NULL, and a subsequent call to of_node_put on that argument is unnecessary. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ iterator name for_each_node_by_name; expression np,E; identifier l; @@ for_each_node_by_name(np,...) { ... when != break; when != goto l; } ... when != np = E - of_node_put(np); // </smpl> Signed-off-by: Julia Lawall <[email protected]> Reviewed-by: Grant Likely <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2010-08-24 | powerpc/kdump: Stop all other CPUs before running crash handlers | Anton Blanchard | 1 | -11/+13
During kdump we run the crash handlers first then stop all other CPUs. We really want to stop all CPUs as close to the fail as possible and also have a very controlled environment for running the crash handlers, so it makes sense to reverse the order. Signed-off-by: Anton Blanchard <[email protected]> Acked-by: Matt Evans <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2010-08-24 | powerpc/mm: Fix vsid_scrample typo | Anton Blanchard | 1 | -1/+1
The code is wrapped in an #if 0, but it's wrong so we may as well fix it. Signed-off-by: Anton Blanchard <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2010-08-24 | powerpc: Use is_32bit_task() helper to test 32 bit binary | Denis Kirjanov | 1 | -3/+3
Use is_32bit_task() helper to test 32 bit binary. Signed-off-by: Denis Kirjanov <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2010-08-24 | powerpc: Export memstart_addr and kernstart_addr on ppc64 | Sonny Rao | 1 | -0/+2
Some modules (like eHCA) want to map all of kernel memory, for this to work with a relocated kernel, we need to export kernstart_addr so modules can use PHYSICAL_START and memstart_addr so they could use MEMORY_START. Note that the 32bit code already exports these symbols. Signed-off-By: Sonny Rao <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2010-08-24 | powerpc: Make rwsem use "long" type | Benjamin Herrenschmidt | 1 | -27/+37
This makes the 64-bit kernel use 64-bit signed integers for the counter (effectively supporting 32 bits of active count in the semaphore), thus avoiding things like overflow of the mmap_sem if you use a really crazy number of threads. Note: Ideally the type in the structure should be atomic_long_t rather than "long". However, there are some nasty issues with that. It needs to be initialized statically -and- lib/rwsem.c does things like sem->count = RWSEM_UNLOCKED_VALUE; Now, if you mix in the fact that atomic_* types are actually structures with one member and not typedefs of a scalar, it makes it really nasty. So I stuck to what we did before using a long and casts for now. Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2010-08-24 | Merge remote branch 'jwb/merge' into merge | Benjamin Herrenschmidt | 6 | -12/+27
2010-08-23 | ARM: imx: fix build failure concerning otg/ulpi | Uwe Kleine-König | 4 | -6/+6
The build failure was introduced by 13dd0c9 (USB: otg/ulpi: extend the generic ulpi driver.) Signed-off-by: Uwe Kleine-König <[email protected]> Acked-by: Igor Grinberg <[email protected]> Cc: Mike Rapoport <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2010-08-23 | USB: ftdi_sio: add product ID for Lenz LI-USB | Galen Seitz | 2 | -0/+4
Add ftdi product ID for Lenz LI-USB, a model train interface. This was NOT tested against 2.6.35, but a similar patch was tested with the CentOS 2.6.18-194.11.1.el5 kernel. It wasn't clear to me what ordering is being used in ftdi_sio.c, so I inserted the ID after another model train entry(SPROG_II). Signed-off-by: Galen Seitz <[email protected]> Cc: stable <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2010-08-23 | USB: adutux: fix misuse of return value of copy_to_user() | Kulikov Vasiliy | 1 | -1/+1
copy_to_user() returns number of not copied bytes, not error code. Signed-off-by: Kulikov Vasiliy <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2010-08-23 | USB: iowarrior: fix misuse of return value of copy_to_user() | Kulikov Vasiliy | 1 | -2/+2
copy_to_user() returns number of not copied bytes, not error code. Signed-off-by: Kulikov Vasiliy <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
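The distinction these two fixes rely on can be shown with a userspace mock. mock_copy_to_user() below is a hypothetical stand-in that mimics only the return convention (the number of bytes that could not be copied); its extra writable parameter is invented to simulate a partially accessible destination.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Mock with the same return convention as copy_to_user(): it returns the
 * number of bytes NOT copied, so 0 means complete success. `writable` is a
 * hypothetical knob saying how many bytes the destination can accept. */
static unsigned long mock_copy_to_user(void *to, const void *from,
                                       unsigned long n, unsigned long writable)
{
    unsigned long done = n < writable ? n : writable;

    memcpy(to, from, done);
    return n - done;
}

/* Correct pattern: any non-zero remainder is translated into -EFAULT
 * instead of being passed up as if it were an error code. */
long demo_read(char *ubuf, unsigned long ubuf_ok,
               const char *kbuf, unsigned long n)
{
    assert(ubuf != NULL && kbuf != NULL);

    if (mock_copy_to_user(ubuf, kbuf, n, ubuf_ok))
        return -EFAULT;
    return (long)n;
}
```

Returning the raw remainder instead would hand the caller a positive number, which a read-style interface interprets as a byte count rather than as a failure.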
2010-08-23 | USB: xHCI: update ring dequeue pointer when process missed tds | Andiry Xu | 1 | -0/+4
This patch fixes an isoc transfer bug reported by Sander Eikelenboom. When ep->skip is set, the endpoint ring dequeue pointer should be updated each time a missed td is processed. Although the ring dequeue pointer will also be updated when ep->skip is clear, leaving it intact during missed td processing may cause two issues: 1) If the very next valid transfer following the missed tds is a short transfer, its actual_length will be miscalculated; 2) If there are too many missed tds during transfer, newly inserted tds may find the transfer ring full and the urb enqueue fails. Reported-by: Sander Eikelenboom <[email protected]> Tested-by: Sander Eikelenboom <[email protected]> Signed-off-by: Andiry Xu <[email protected]> Signed-off-by: Sarah Sharp <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2010-08-23 | USB: xhci: Remove buggy assignment in next_trb() | John Youn | 1 | -1/+1
The code to increment the TRB pointer has a slight ambiguity that could lead to a bug on different compilers. The ANSI C specification does not specify the precedence of the assignment operator over the postfix operator. gcc 4.4 produced the correct code (increment the pointer and assign the value), but a MIPS compiler that one of John's clients used assigned the old (unincremented) value. Remove the unnecessary assignment to make all compilers produce the correct assembly. Signed-off-by: John Youn <[email protected]> Signed-off-by: Sarah Sharp <[email protected]> Cc: stable <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
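A small sketch of the pattern at issue, assuming the removed statement had the shape `*trb = (*trb)++;` (the listing does not show the diff, so that shape is an assumption). In C, a statement that modifies the same object twice without an intervening sequence point has undefined behaviour, which would explain why two compilers disagreed. struct demo_trb and demo_next_trb() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a transfer request block. */
struct demo_trb {
    unsigned int field[4];
};

/* Advance *trb to the next element of a contiguous array of TRBs. */
void demo_next_trb(struct demo_trb **trb)
{
    assert(trb != NULL && *trb != NULL);

    /* Avoided form (assumed shape of the old code):
     *     *trb = (*trb)++;
     * It writes *trb twice in one expression, so the result is undefined. */
    (*trb)++; /* a single, well-defined modification */
}
```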
2010-08-23 | USB: ftdi_sio: Add ID for Ionics PlugComputer | Martin Michlmayr | 2 | -0/+8
Add the ID for the Ionics PlugComputer (<http://ionicsplug.com/>). Signed-off-by: Martin Michlmayr <[email protected]> Cc: stable <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2010-08-23 | USB: serial: io_ti.c: don't return 0 if writing the download record failed | Roel Kluin | 1 | -1/+1
If the write download record failed we shouldn't return 0. Signed-off-by: Roel Kluin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2010-08-23 | USB: otg: twl4030: fix wrong assumption of starting state | Felipe Balbi | 1 | -1/+5
The reset state of twl4030-usb is not sleeping: it starts up awake, and we need to disable it if we have booted with a disconnected cable to avoid excess power consumption in the default state. To avoid problems later, we read the current state of the transceiver from the PHY_PWR_CTRL register. The bootloader can, in any case, put the device to sleep before us. Tested on a custom OMAP board. Signed-off-by: Felipe Balbi <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2010-08-23 | USB: gadget: Return -ENOMEM on memory allocation failure | Julia Lawall | 1 | -0/+1
In this code, 0 is returned on memory allocation failure, even though other failures return -ENOMEM or other similar values. A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression ret; expression x,e1,e2,e3; @@ ret = 0 ... when != ret = e1 *x = \(kmalloc\|kcalloc\|kzalloc\)(...) ... when != ret = e2 if (x == NULL) { ... when != ret = e3 return ret; } // </smpl> Signed-off-by: Julia Lawall <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
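The bug class that the semantic match looks for (ret still 0 when an allocation fails) can be illustrated in plain C. Everything named demo_* is hypothetical; the allocator is passed in as a function pointer only so that the failure path can be exercised deterministically.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

typedef void *(*demo_alloc_fn)(size_t);

/* Hypothetical allocator that always fails, used to reach the error path. */
void *demo_failing_alloc(size_t n)
{
    (void)n;
    return NULL;
}

/* Allocate a buffer and hand it back through *out.
 * Returns 0 on success or -ENOMEM if the allocation fails. */
int demo_setup(demo_alloc_fn alloc, size_t nbytes, void **out)
{
    int ret = 0;
    void *buf;

    assert(alloc != NULL && out != NULL);

    buf = alloc(nbytes);
    if (buf == NULL) {
        ret = -ENOMEM; /* without this line the function would return 0 */
        goto fail;
    }
    *out = buf;
    return 0;

fail:
    *out = NULL;
    return ret;
}
```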
2010-08-23 | USB: gadget: fix composite kernel-doc warnings | Randy Dunlap | 2 | -2/+3
Warning(include/linux/usb/composite.h:284): No description found for parameter 'disconnect' Warning(drivers/usb/gadget/composite.c:744): No description found for parameter 'c' Warning(drivers/usb/gadget/composite.c:744): Excess function parameter 'cdev' description in 'usb_string_ids_n' Signed-off-by: Randy Dunlap <[email protected]> Cc: David Brownell <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2010-08-23 | USB: ssu100: set tty_flags in ssu100_process_packet | Bill Pemberton | 1 | -9/+29
flag was never set in ssu100_process_packet. Add logic to set it before calling tty_insert_flip_* Signed-off-by: Bill Pemberton <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2010-08-23 | USB: ssu100: add disconnect function for ssu100 | Bill Pemberton | 1 | -1/+1
Add a disconnect function to the functions of this device. The disconnect is a call to usb_serial_generic_disconnect() so it requires that symbol to be exported from generic.c. Signed-off-by: Bill Pemberton <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2010-08-23 | USB: serial: export symbol usb_serial_generic_disconnect | Bill Pemberton | 1 | -0/+1
This is needed by the ssu100 driver to use this function. Signed-off-by: Bill Pemberton <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2010-08-23 | USB: ssu100: rework logic for TIOCMIWAIT | Bill Pemberton | 1 | -35/+111
Rework the logic for TIOCMIWAIT to use wait_event_interruptible. This also adds support for TIOCGICOUNT. Signed-off-by: Bill Pemberton <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2010-08-23 | USB: ssu100: add register parameter to ssu100_setregister | Bill Pemberton | 1 | -3/+4
The function ssu100_setregister was hard coded to only set the MCR register. Add a register parameter so that other registers can be set. Signed-off-by: Bill Pemberton <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2010-08-23 | USB: ssu100: remove duplicate #defines in ssu100 | Bill Pemberton | 1 | -55/+31
The ssu100 uses a TI16C550C UART so the SERIAL_ defines in this code are duplicates of those found in serial_reg.h. Remove the defines in ssu100.c and use the ones in the header file. Signed-off-by: Bill Pemberton <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2010-08-23 | USB: ssu100: refine process_packet in ssu100 | Bill Pemberton | 1 | -6/+2
The status information does not appear at the start of each incoming packet so the check for len < 4 at the start of ssu100_process_packet is wrong. Remove it. Signed-off-by: Bill Pemberton <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2010-08-23 | USB: ssu100: add locking for port private data in ssu100 | Bill Pemberton | 1 | -1/+8
Signed-off-by: Bill Pemberton <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2010-08-23 | USB: r8a66597-udc: return -ENOMEM if kzalloc() fails | Axel Lin | 1 | -0/+1
Signed-off-by: Axel Lin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2010-08-23 | USB: io_ti: check firmware version before updating | Greg Kroah-Hartman | 1 | -1/+1
If we can't read the firmware for a device from the disk, and yet the device already has a valid firmware image in it, we don't want to replace the firmware with something invalid. So check the version number to be less than the current one to verify this is the correct thing to do. Reported-by: Chris Beauchamp <[email protected]> Tested-by: Chris Beauchamp <[email protected]> Cc: Alan Stern <[email protected]> Cc: stable <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2010-08-23 | USB: ftdi_sio: fix endianess of max packet size | Michael Wileczka | 1 | -1/+1
The USB max packet size (always little-endian) was not being byte swapped on big-endian systems. Applicable since [USB: ftdi_sio: fix hi-speed device packet size calculation] approx 2.6.31 Signed-off-by: Michael Wileczka <[email protected]> Cc: stable <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
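In the kernel this kind of conversion is normally done with le16_to_cpu(). The portable userspace sketch below shows the same idea by assembling the value from its two bytes, which gives the same answer on little-endian and big-endian hosts. demo_le16_to_cpu() is a hypothetical helper name, and the byte values in the usage note are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a 16-bit little-endian field (such as a max packet size taken
 * from a USB descriptor) from its raw bytes. Building the value with
 * shifts avoids depending on the byte order of the host CPU. */
uint16_t demo_le16_to_cpu(const uint8_t bytes[2])
{
    assert(bytes != NULL);
    return (uint16_t)(bytes[0] | (bytes[1] << 8));
}
```

For example, the wire bytes 0x00 0x02 decode to 512. Reading the same two bytes as a native 16-bit integer on a big-endian machine would give 2 instead, which is the sort of mismatch the fix addresses.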
2010-08-23 | USB: CP210x Fix Break On/Off | Craig Shelley | 1 | -2/+2
The definitions for BREAK_ON and BREAK_OFF are inverted, causing break requests to fail. This patch sets BREAK_ON and BREAK_OFF to the correct values. Signed-off-by: Craig Shelley <[email protected]> Cc: stable <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2010-08-23 | USB: pl2303: New vendor and product id | Jef Driesen | 2 | -0/+5
Add support for the Zeagle N2iTiON3 dive computer interface. Since Zeagle devices are actually manufactured by Seiko, this patch will support other Seiko based models as well. Signed-off-by: Jef Driesen <[email protected]> Cc: stable <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2010-08-23 | USB: serial: fix leak of usb serial module refrence count | Ming Lei | 1 | -16/+7
The patch titled "USB-BKL: Remove BKL use for usb serial driver probing" makes the reference count of the usb serial module always more than one after a driver is bound. In fact, that patch only replaces lock_kernel() with try_module_get() and does not use module_put() to do what unlock_kernel() did, so it leaks a reference on the usb serial module, and the module cannot be unloaded after a serial driver is bound to a device. This patch fixes the issue and also simplifies things:
- only call try_module_get() once on entry to usb_serial_probe()
- only call module_put() once on exit from usb_serial_probe()
Signed-off-by: Ming Lei <[email protected]> Cc: Johan Hovold <[email protected]> Cc: Andi Kleen <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
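The "one get on entry, one put on exit" structure can be sketched with a plain counter. struct demo_module, demo_try_get(), demo_put() and the fail_step parameter are all hypothetical; the real code uses try_module_get() and module_put() on a struct module.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a module and its reference count. */
struct demo_module {
    int refcnt;
};

static int demo_try_get(struct demo_module *m)
{
    m->refcnt++;
    return 1; /* this mock always succeeds */
}

static void demo_put(struct demo_module *m)
{
    m->refcnt--;
}

/* Probe with a single get on entry and a single put on exit. Every error
 * path funnels through `out`, so no path can leak the reference.
 * `fail_step` is a hypothetical knob that simulates a failure. */
int demo_probe(struct demo_module *m, int fail_step)
{
    int ret = 0;

    assert(m != NULL);

    if (!demo_try_get(m))
        return -EIO;

    if (fail_step == 1) {
        ret = -ENODEV;
        goto out;
    }
    if (fail_step == 2) {
        ret = -EIO;
        goto out;
    }
    /* ... the remaining probe work would go here ... */
out:
    demo_put(m);
    return ret;
}
```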
2010-08-23 | USB: add device IDs for igotu to navman | Ross Burton | 1 | -0/+1
I recently bought a i-gotU USB GPS, and whilst hunting around for linux support discovered this post by you back in 2009: http://kerneltrap.org/mailarchive/linux-usb/2009/3/12/5148644 >Try the navman driver instead. You can either add the device id to the > driver and rebuild it, or do this before you plug the device in: > modprobe navman > echo -n "0x0df7 0x0900" > /sys/bus/usb-serial/drivers/navman/new_id > > and then plug your device in and see if that works. I can confirm that the navman driver works with the right device IDs on my i-gotU GT-600, which has the same device IDs. Attached is a patch adding the IDs. From: Ross Burton <[email protected]> Cc: stable <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2010-08-23 | USB: isp1760: use a write barrier to ensure proper ndelay timing | Michael Hennerich | 1 | -0/+2
The ISP1760 has some timing requirements where it has to delay a short period after a write to a register has started. However, this delay is from the time the write hits the USB chip (the ISP1760), not from the time where the processor started processing the write. So on a quick enough processor, it is sometimes possible for the write to not hit the device before we start delaying, and we then violate the part's timing requirements, so things stop working. To avoid all this, insert a write barrier after the register write and before the timing delay/register read so we can guarantee we only start counting time after the write has hit the device. Signed-off-by: Michael Hennerich <[email protected]> Signed-off-by: Mike Frysinger <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2010-08-23 | USB: option: add Celot CT-650 | Michael Tokarev | 1 | -2/+5
Signed-off-by: Michael Tokarev <[email protected]> Cc: stable <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2010-08-23 | USB: uvc_v4l2: cleanup test for end of loop | Dan Carpenter | 1 | -1/+1
We're trying to test for the end of the loop here. "format" is never NULL. We don't know what "format->fcc" is because we're past the end of the loop and I think "fmt->fmt.pix.pixelformat" comes from the user so we don't know what that is either. It works, but it's cleaner to just test to see if (i == ARRAY_SIZE(uvc_formats)). Signed-off-by: Dan Carpenter <[email protected]> Acked-by: Laurent Pinchart <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
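The idiom the message recommends, sketched on a hypothetical table (demo_formats and its contents are invented for illustration): after a search loop that may break early, compare the index against ARRAY_SIZE() rather than inspecting an element that may lie past the end of the array.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical format table; the fcc values are placeholders. */
struct demo_format {
    unsigned int fcc;
    const char *name;
};

static const struct demo_format demo_formats[] = {
    { 1, "format-a" },
    { 2, "format-b" },
    { 3, "format-c" },
};

/* Returns the matching entry, or NULL if fcc is not in the table. */
const struct demo_format *demo_find_format(unsigned int fcc)
{
    size_t i;

    for (i = 0; i < ARRAY_SIZE(demo_formats); i++) {
        if (demo_formats[i].fcc == fcc)
            break;
    }

    /* The loop index alone says whether the search ran off the end, so
     * there is no need to dereference a possibly out-of-range element. */
    if (i == ARRAY_SIZE(demo_formats))
        return NULL;

    return &demo_formats[i];
}
```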
2010-08-24 | xfs: do not discard page cache data on EAGAIN | Christoph Hellwig | 1 | -3/+6
If xfs_map_blocks returns EAGAIN because of lock contention we must redirty the page and not discard the pagecache content and return an error from writepage. We used to do this correctly, but the logic got lost during the recent reshuffle of the writepage code. Signed-off-by: Christoph Hellwig <[email protected]> Reported-by: Mike Gao <[email protected]> Tested-by: Mike Gao <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
2010-08-24 | xfs: don't do memory allocation under the CIL context lock | Dave Chinner | 1 | -8/+26
Formatting items requires memory allocation when using delayed logging. Currently that memory allocation is done while holding the CIL context lock in read mode. This means that if memory allocation takes some time (e.g. enters reclaim), we cannot push on the CIL until the allocation(s) required by formatting complete. This can stall CIL pushes for some time, and once a push is stalled so are all new transaction commits. Fix this by splitting the item formatting into two steps. The first step, which does the allocation and memcpy() into the allocated buffer, is now done outside the CIL context lock, and only the CIL insert is done inside the CIL context lock. This avoids the stall issue. Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2010-08-24 | xfs: Reduce log force overhead for delayed logging | Dave Chinner | 3 | -118/+147
Delayed logging adds some serialisation to the log force process to ensure that it does not dereference a bad commit context structure when determining if a CIL push is necessary or not. It does this by grabbing the CIL context lock exclusively, then dropping it before pushing the CIL if necessary. This causes serialisation of all log forces and pushes regardless of whether a force is necessary or not. As a result fsync heavy workloads (like dbench) can be significantly slower with delayed logging than without. To avoid this penalty, copy the current sequence from the context to the CIL structure when they are swapped. This allows us to do unlocked checks on the current sequence without having to worry about dereferencing context structures that may have already been freed. Hence we can remove the CIL context locking in the forcing code and only call into the push code if the current context matches the sequence we need to force. By passing the sequence into the push code, we can check the sequence again once we have the CIL lock held exclusive and abort if the sequence has already been pushed. This avoids a lock round-trip and unnecessary CIL pushes when we have racing push calls. The result is that the regression in dbench performance goes away - this change improves dbench performance on a ramdisk from ~2100MB/s to ~2500MB/s. This compares favourably to not using delayed logging which returns ~2500MB/s for the same workload. Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>