aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2011-05-25lib: consolidate DEBUG_STACK_USAGE optionStephen Boyd13-105/+9
Most arches define CONFIG_DEBUG_STACK_USAGE exactly the same way. Move it to lib/Kconfig.debug so each arch doesn't have to define it. This obviously makes the option generic, but that's fine because the config is already used in generic code. It's not obvious to me that sysrq-P actually does anything caution by keeping the most inclusive wording. Signed-off-by: Stephen Boyd <[email protected]> Cc: Chris Metcalf <[email protected]> Acked-by: David S. Miller <[email protected]> Acked-by: Richard Weinberger <[email protected]> Acked-by: Mike Frysinger <[email protected]> Cc: Russell King <[email protected]> Cc: Hirokazu Takata <[email protected]> Acked-by: Ralf Baechle <[email protected]> Cc: Paul Mackerras <[email protected]> Acked-by: Benjamin Herrenschmidt <[email protected]> Cc: Chen Liqin <[email protected]> Cc: Lennox Wu <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25percpu_counter: change return value and add commentsShaohua Li1-1/+5
The percpu_counter_*_positive() API in UP case doesn't check if return value is positive. Add comments to explain why we don't. Also if count < 0, returns 0 instead of 1 for *read_positive(). [[email protected]: tweak comment] Signed-off-by: Shaohua Li <[email protected]> Acked-by: Eric Dumazet <[email protected]> Cc: Tejun Heo <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25lib/genalloc.c: add support for specifying the physical addressJean-Christophe PLAGNIOL-VILLARD2-9/+58
So we can specify the virtual address as the base of the pool chunk and then get physical addresses for hardware IP. For example on at91 we will use this on spi, uart or macb Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD <[email protected]> Cc: Nicolas Ferre <[email protected]> Cc: Patrice VILCHEZ <[email protected]> Cc: Jes Sorensen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25include/linux/genalloc.h: add multiple-inclusion guardsJean-Christophe PLAGNIOL-VILLARD1-0/+3
Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD <[email protected]> Cc: Nicolas Ferre <[email protected]> Cc: Patrice VILCHEZ <[email protected]> Cc: Jes Sorensen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25lib: consolidate DEBUG_PER_CPU_MAPSStephen Boyd3-23/+11
DEBUG_PER_CPU_MAPS is used in lib/cpumask.c as well as in inlcude/linux/cpumask.h and thus it has outgrown its use within x86 and powerpc alone. Any arch with SMP support may want to get some more debugging, so make this option generic. Signed-off-by: Stephen Boyd <[email protected]> Cc: <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Acked-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25lib: add kstrto*_from_user()Alexey Dobriyan2-0/+57
There is quite a lot of code which does copy_from_user() + strict_strto*() or simple_strto*() combo in slightly different ways. Before doing conversions all over tree, let's get final API correct. Enter kstrtoull_from_user() and friends. Typical code which uses them looks very simple: TYPE val; int rv; rv = kstrtoTYPE_from_user(buf, count, 0, &val); if (rv < 0) return rv; [use val] return count; There is a tiny semantic difference from the plain kstrto*() API -- the latter allows any amount of leading zeroes, while the former copies data into buffer on stack and thus allows leading zeroes as long as it fits into buffer. This shouldn't be a problem for typical usecase "echo 42 > /proc/x". The point is to make reading one integer from userspace _very_ simple and very bug free. Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25lru_cache: use correct type in sizeof for allocationIlia Mirkin1-1/+1
This has no actual effect, since sizeof(struct hlist_head) == sizeof(struct hlist_head *), but it's still the wrong type to use. The semantic match that finds this problem: // <smpl> @@ type T; identifier x; @@ T *x; ... * x = kzalloc(... * sizeof(T*) * ..., ...); // </smpl> [[email protected]: use kcalloc()] Signed-off-by: Ilia Mirkin <[email protected]> Acked-by: Lars Ellenberg <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25leds: provide helper to register "leds-gpio" devicesUwe Kleine-König5-1/+53
This function makes a deep copy of the platform data to allow it to live in init memory. For a kernel that supports several machines and so includes the definition for several leds-gpio devices this saves quite some memory because all but one definition can be free'd after boot. As the function is used by arch code it must be builtin and so cannot go into leds-gpio.c. [[email protected]: s/CONFIG_LED_REGISTER_GPIO/CONFIG_LEDS_REGISTER_GPIO/] Signed-off-by: Uwe Kleine-König <[email protected]> Cc: Russell King <[email protected]> Acked-by: Richard Purdie <[email protected]> Cc: Fabio Estevam <[email protected]> Cc: Sascha Hauer <[email protected]> Tested-by: H Hartley Sweeten <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25drivers/leds/leds-lm3530.c: add regulatorShreshtha Kumar Sahu1-8/+65
Add add regulator support to lm3530 driver. The lm3530 driver needs to get proper regulator during device probe and enable it before accessing the device. Also it disables the regulator in case of brightness == LED_OFF, and puts it back during driver removal. [[email protected]: coding-style fixes] Signed-off-by: Shreshtha Kumar Sahu <[email protected]> Cc: Lee Jones <[email protected]> Cc: Shreshtha Kumar Sahu <[email protected]> Cc: Richard Purdie <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25leds: remove the leds-h1940 driverVasily Khoruzhick3-178/+0
The H1940 machine now uses leds-gpio and leds-h1940 has no users anymore. Signed-off-by: Vasily Khoruzhick <[email protected]> Cc: "Arnaud Patard (Rtp)" <[email protected]> Cc: Ben Dooks <[email protected]> Cc: Richard Purdie <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25drivers/leds/leds-pca9532.c: add support pca9530, pca9531 and pca9533Jan Weitzel1-22/+62
The pca953x family are only different in number of leds and register layout Adding chipinfo to use driver with whole pca953x family Rename driver to pca953x, but left files and platformflags named pca9532. Tested with pca9530 and pca9533 Tested-by: Juergen Kilb <[email protected]> Signed-off-by: Jan Weitzel <[email protected]> Acked-by: Joachim Eastwood <[email protected]> Tested-by: Joachim Eastwood <[email protected]> Cc: Wolfram Sang <[email protected]> Cc: H Hartley Sweeten <[email protected]> Cc: Richard Purdie <[email protected]> Cc: Grant Likely <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25drivers/leds/leds-pca9532.c: add gpio capabilityJoachim Eastwood3-4/+122
Allow unused leds on pca9532 to be used as gpio. The board I am working on now has no less than 6 pca9532 chips. One chips is used for only leds, one has 14 leds and 2 gpio and the rest of the chips are gpio only. There is also one board in mainline which could use this capabilty; arch/arm/mach-iop32x/n2100.c 232 { .type = PCA9532_TYPE_NONE }, /* power OFF gpio */ 233 { .type = PCA9532_TYPE_NONE }, /* reset gpio */ This patch defines a new pin type, PCA9532_TYPE_GPIO, and registers a gpiochip if any pin has this type set. The gpio will registers all chip pins but will filter on gpio_request. [[email protected]: fix build when GPIOLIB is not enabled] Signed-off-by: Joachim Eastwood <[email protected]> Reviewed-by: Wolfram Sang <[email protected]> Reviewed-by: H Hartley Sweeten <[email protected]> Cc: Richard Purdie <[email protected]> Cc: Grant Likely <[email protected]> Signed-off-by: Randy Dunlap <[email protected]> Cc: Jan Weitzel <[email protected]> Cc: Juergen Kilb <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25leds: support automatic start of blinking with ledtrig-timerEsben Haabendal3-1/+12
By setting initial values blink_delay_on and blink_delay_off in a led_classdev struct, this change starts the blinking when the led is initialized. With this patch, you can initialize blink_delay_on and blink_delay_off in led_classdev with default_trigger set to "timer", and the led will start up blinking. The current ledtrig-timer implementation ignores any initial blink_delay_on/blink_delay_off settings, and requires setting blink_delay_on/blink_delay_off (typically from userspace) before the led blinks. Signed-off-by: Esben Haabendal <[email protected]> Cc: Richard Purdie <[email protected]> Cc: Randy Dunlap <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25MAINTAINERS: orphan DMFE, move Tobias Ringstrom to CREDITSJoe Perches2-2/+5
Tobias's email bounces and he hasn't submitted or acked a patch in git history. Signed-off-by: Joe Perches <[email protected]> Cc: David Miller <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25MAINTAINERS: remove stale reference to Chris Wright's LSM treeLucian Adrian Grijincu1-1/+0
This tree hasn't been updated since June 2008. Signed-off-by: Lucian Adrian Grijincu <[email protected]> Acked-by: Chris Wright <[email protected]> Cc: James Morris <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25printk: allocate kernel log buffer earlierMike Travis4-29/+68
On larger systems, because of the numerous ACPI, Bootmem and EFI messages, the static log buffer overflows before the larger one specified by the log_buf_len param is allocated. Minimize the overflow by allocating the new log buffer as soon as possible. On kernels without memblock, a later call to setup_log_buf from kernel/init.c is the fallback. [[email protected]: coding-style fixes] [[email protected]: fix CONFIG_PRINTK=n build] Signed-off-by: Mike Travis <[email protected]> Cc: Yinghai Lu <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Jack Steiner <[email protected]> Cc: Thomas Gleixner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25memblock: add error return when CONFIG_HAVE_MEMBLOCK is not setYinghai Lu1-1/+8
On larger systems, information in the kernel log is lost because there is so much early text printed, that it overflows the static log buffer before the log_buf_len kernel parameter can be processed, and a bigger log buffer allocated. Distros are relunctant to increase memory usage by increasing the size of the static log buffer, so minimize the problem by allocating the new log buffer as early as possible. This patch: Add an error return if CONFIG_HAVE_MEMBLOCK is not set instead of having to add #ifdef CONFIG_HAVE_MEMBLOCK around blocks of code calling that function. Signed-off-by: Mike Travis <[email protected]> Cc: Yinghai Lu <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Jack Steiner <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25lib/vsprintf.c: fix interaction of kasprintf() and vsnprintf() when using %pVJan Beulich1-1/+1
Otherwise, the warning at the top of vsnprintf() gets triggered by kvasprintf()'s first invocation (with NULL buffer and zero size) of vsnprintf(). Signed-off-by: Jan Beulich <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25sparse: Undef __compiletime_{warning,error} if __CHECKER__ is definedKOSAKI Motohiro1-1/+1
sparse can't parse warning and error attribute. then they should be hidden from sparse. Signed-off-by: KOSAKI Motohiro <[email protected]> Cc: Arjan van de Ven <[email protected]> Cc: Dave Hansen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25sparse: define __must_be_array() for __CHECKER__KOSAKI Motohiro1-0/+4
commit c5e631cf65f ("ARRAY_SIZE: check for type") added __must_be_array(). But sparse can't parse this gcc extention. Now make C=2 makes following sparse errors a lot. kernel/futex.c:2699:25: error: No right hand side of '+'-expression Because __must_be_array() is used for ARRAY_SIZE() macro and it is used very widely. This patch fixes it. Signed-off-by: KOSAKI Motohiro <[email protected]> Cc: Dave Hansen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25sparse: define dummy BUILD_BUG_ON definition for sparseKOSAKI Motohiro1-0/+8
BUILD_BUG_ON() causes a syntax error to detect coding errors. So it causes sparse to detect an error too. This reduces sparse's usefulness. This patch makes a dummy BUILD_BUG_ON() definition for sparse. Signed-off-by: KOSAKI Motohiro <[email protected]> Cc: Dave Hansen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25init/calibrate.c: fix for critical bogoMIPS intermittent calculation failureAndrew Worsley1-6/+69
A fix to the TSC (Time Stamp Counter) based bogoMIPS calculation used on secondary CPUs which has two faults: 1: Not handling wrapping of the lower 32 bits of the TSC counter on 32bit kernel - perhaps TSC is not reset by a warm reset? 2: TSC and Jiffies are no incrementing together properly. Either jiffies increment too quickly or Time Stamp Counter isn't incremented in during an SMI but the real time clock is and jiffies are incremented. Case 1 can result in a factor of 16 too large a value which makes udelay() values too small and can cause mysterious driver errors. Case 2 appears to give smaller 10-15% errors after averaging but enough to cause occasional failures on my own board I have tested this code on my own branch and attach patch suitable for current kernel code. See below for examples of the failures and how the fix handles these situations now. I reported this issue earlier here: Intermittent problem with BogoMIPs calculation on Intel AP CPUs - http://marc.info/?l=linux-kernel&m=129947246316875&w=4 I suspect this issue has been seen by others but as it is intermittent and bogoMIPS for secondary CPUs are no longer printed out it might have been difficult to identify this as the cause. Perhaps these unresolved issues, although quite old, might be relevant as possibly this fault has been around for a while. In particular Case 1 may only be relevant to 32bit kernels on newer HW (most people run 64bit kernels?). Case 2 is less dramatic since the earlier fix in this area and also intermittent. Re: bogomips discrepancy on Intel Core2 Quad CPU - http://marc.info/?l=linux-kernel&m=118929277524298&w=4 slow system and bogus bogomips - http://marc.info/?l=linux-kernel&m=116791286716107&w=4 Re: Re: [RFC-PATCH] clocksource: update lpj if clocksource has - http://marc.info/?l=linux-kernel&m=128952775819467&w=4 This issue is masked a little by commit feae3203d711db0a ("timers, init: Limit the number of per cpu calibration bootup messages") which only prints out the first bogoMIPS value making it much harder to notice other values differing. Perhaps it should be changed to only suppress them when they are similar values? Here are some outputs showing faults occurring and the new code handling them properly. See my earlier message for examples of the original failure. Case 1: A Time Stamp Counter wrap: ... Calibrating delay loop (skipped), value calculated using timer frequency.. 6332.70 BogoMIPS (lpj=31663540) .... calibrate_delay_direct() timer_rate_max=31666493 timer_rate_min=31666151 pre_start=4170369255 pre_end=4202035539 calibrate_delay_direct() timer_rate_max=2425955274 timer_rate_min=2425954941 pre_start=4265368533 pre_end=2396356387 calibrate_delay_direct() ignoring timer_rate as we had a TSC wrap around start=4265368581 >=post_end=2396356511 calibrate_delay_direct() timer_rate_max=31666274 timer_rate_min=31665942 pre_start=2440373374 pre_end=2472039515 calibrate_delay_direct() timer_rate_max=31666492 timer_rate_min=31666160 pre_start=2535372139 pre_end=2567038422 calibrate_delay_direct() timer_rate_max=31666455 timer_rate_min=31666207 pre_start=2630371084 pre_end=2662037415 Calibrating delay using timer specific routine.. 6333.28 BogoMIPS (lpj=31666428) Total of 2 processors activated (12665.99 BogoMIPS). .... Case 2: Some thing (presumably the SMM interrupt?) causing the very low increase in TSC counter for the DELAY_CALIBRATION_TICKS increase in jiffies ... Calibrating delay loop (skipped), value calculated using timer frequency.. 6333.25 BogoMIPS (lpj=31666270) ... calibrate_delay_direct() timer_rate_max=31666483 timer_rate_min=31666074 pre_start=4199536526 pre_end=4231202809 calibrate_delay_direct() timer_rate_max=864348 timer_rate_min=864016 pre_start=2405343672 pre_end=2406207897 calibrate_delay_direct() timer_rate_max=31666483 timer_rate_min=31666179 pre_start=2469540464 pre_end=2501206823 calibrate_delay_direct() timer_rate_max=31666511 timer_rate_min=31666122 pre_start=2564539400 pre_end=2596205712 calibrate_delay_direct() timer_rate_max=31666084 timer_rate_min=31665685 pre_start=2659538782 pre_end=2691204657 calibrate_delay_direct() dropping min bogoMips estimate 1 = 864348 Calibrating delay using timer specific routine.. 6333.27 BogoMIPS (lpj=31666390) Total of 2 processors activated (12666.53 BogoMIPS). ... After 70 boots I saw 2 variations <1% slip through [[email protected]: coding-style fixes] [[email protected]: fix straggly printk mess] Signed-off-by: Andrew Worsley <[email protected]> Reviewed-by: Phil Carmody <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25xattr.h: expose string defines to userspaceEric Paris1-4/+4
af4f136056c9 ("security: move LSM xattrnames to xattr.h") moved the XATTR_CAPS_SUFFIX define from capability.h to xattr.h. This makes sense except it was previously exports to userspace but xattr.h does not export it to userspace. This patch exports these headers to userspace to fix the ABI regression. There is some slight possibility that this will cause problems in other applications which used these #defines differently (wrongly) and I could JUST export the capabilities xattr name that we broke. Does anyonehave an idea how exposing these headers could cause a problem? Below is what is being exposed to userspace, included here since it isn't clear exactly what is going to be made available from the patch. /* Namespaces */ #define XATTR_OS2_PREFIX "os2." #define XATTR_OS2_PREFIX_LEN (sizeof (XATTR_OS2_PREFIX) - 1) #define XATTR_SECURITY_PREFIX "security." #define XATTR_SECURITY_PREFIX_LEN (sizeof (XATTR_SECURITY_PREFIX) - 1) #define XATTR_SYSTEM_PREFIX "system." #define XATTR_SYSTEM_PREFIX_LEN (sizeof (XATTR_SYSTEM_PREFIX) - 1) #define XATTR_TRUSTED_PREFIX "trusted." #define XATTR_TRUSTED_PREFIX_LEN (sizeof (XATTR_TRUSTED_PREFIX) - 1) #define XATTR_USER_PREFIX "user." #define XATTR_USER_PREFIX_LEN (sizeof (XATTR_USER_PREFIX) - 1) /* Security namespace */ #define XATTR_SELINUX_SUFFIX "selinux" #define XATTR_NAME_SELINUX XATTR_SECURITY_PREFIX XATTR_SELINUX_SUFFIX #define XATTR_SMACK_SUFFIX "SMACK64" #define XATTR_SMACK_IPIN "SMACK64IPIN" #define XATTR_SMACK_IPOUT "SMACK64IPOUT" #define XATTR_NAME_SMACK XATTR_SECURITY_PREFIX XATTR_SMACK_SUFFIX #define XATTR_NAME_SMACKIPIN XATTR_SECURITY_PREFIX XATTR_SMACK_IPIN #define XATTR_NAME_SMACKIPOUT XATTR_SECURITY_PREFIX XATTR_SMACK_IPOUT #define XATTR_CAPS_SUFFIX "capability" #define XATTR_NAME_CAPS XATTR_SECURITY_PREFIX XATTR_CAPS_SUFFIX Reported-by: Ozan Çaglayan <[email protected]> Signed-off-by: Eric Paris <[email protected]> Cc: Mimi Zohar <[email protected]> Cc: Serge Hallyn <[email protected]> Cc: James Morris <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25bitmap, irq: add smp_affinity_list interface to /proc/irqMike Travis6-23/+188
Manually adjusting the smp_affinity for IRQ's becomes unwieldy when the cpu count is large. Setting smp affinity to cpus 256 to 263 would be: echo 000000ff,00000000,00000000,00000000,00000000,00000000,00000000,00000000 > smp_affinity instead of: echo 256-263 > smp_affinity_list Think about what it looks like for cpus around say, 4088 to 4095. We already have many alternate "list" interfaces: /sys/devices/system/cpu/cpuX/indexY/shared_cpu_list /sys/devices/system/cpu/cpuX/topology/thread_siblings_list /sys/devices/system/cpu/cpuX/topology/core_siblings_list /sys/devices/system/node/nodeX/cpulist /sys/devices/pci***/***/local_cpulist Add a companion interface, smp_affinity_list to use cpu lists instead of cpu maps. This conforms to other companion interfaces where both a map and a list interface exists. This required adding a bitmap_parselist_user() function in a manner similar to the bitmap_parse_user() function. [[email protected]: make __bitmap_parselist() static] Signed-off-by: Mike Travis <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Jack Steiner <[email protected]> Cc: Lee Schermerhorn <[email protected]> Cc: Andy Shevchenko <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25fscache: remove dead code under CONFIG_WORKQUEUE_DEBUGFSAmerigo Wang3-35/+0
There is no CONFIG_WORKQUEUE_DEBUGFS any more, so this code is dead. Signed-off-by: WANG Cong <[email protected]> Cc: David Howells <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25x86: remove 32-bit versions of readq()/writeq()Roland Dreier8-24/+74
The presense of a writeq() implementation on 32-bit x86 that splits the 64-bit write into two 32-bit writes turns out to break the mpt2sas driver (and in general is risky for drivers as was discussed in <http://lkml.kernel.org/r/[email protected]>). To fix this, revert 2c5643b1c5c7 ("x86: provide readq()/writeq() on 32-bit too") and follow-on cleanups. This unfortunately leads to pushing non-atomic definitions of readq() and write() to various x86-only drivers that in the meantime started using the definitions in the x86 version of <asm/io.h>. However as discussed exhaustively, this is actually the right thing to do, because the right way to split a 64-bit transaction is hardware dependent and therefore belongs in the hardware driver (eg mpt2sas needs a spinlock to make sure no other accesses occur in between the two halves of the access). Build tested on 32- and 64-bit x86 allmodconfig. Link: http://lkml.kernel.org/r/[email protected] Acked-by: Hitoshi Mitake <[email protected]> Cc: Kashyap Desai <[email protected]> Cc: Len Brown <[email protected]> Cc: Ravi Anand <[email protected]> Cc: Vikas Chaudhary <[email protected]> Cc: Matthew Garrett <[email protected]> Cc: Jason Uhlenkott <[email protected]> Acked-by: James Bottomley <[email protected]> Acked-by: Ingo Molnar <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Signed-off-by: Roland Dreier <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25Remove unused PROC_CHANGE_PENALTY constantStephen Boyd5-20/+0
This constant hasn't been used since before the git era (2.6.12) and thus can be dropped. Signed-off-by: Stephen Boyd <[email protected]> Cc: Russell King <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Hirokazu Takata <[email protected]> Cc: Kyle McMartin <[email protected]> Cc: Richard Henderson <[email protected]> Cc: Ivan Kokshaysky <[email protected]> Cc: Matt Turner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25include/linux/c2port.h: remove wrong and never used macrosWanlong Gao1-3/+0
The macro to_class_dev() uses the deprecated structure class_device, and the c2port_device has no member named class in the definition of the macro to_c2port_device. Signed-off-by: Wanlong Gao <[email protected]> Cc: Rodolfo Giometti <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25ulimit: raise default hard ulimit on number of files to 4096Tim Gardner2-2/+3
Apps are increasingly using more than 1024 file descriptors. See discussion in several distro bug trackers, e.g. BugLink: http://bugs.launchpad.net/bugs/663090 https://issues.rpath.com/browse/RPL-2054 You don't want to raise the default soft limit, since that might break apps that use select(), but it's safe to raise the default hard limit; that way, apps that know they need lots of file descriptors can raise their soft limit without needing root, and without user intervention. Ubuntu is doing this with a kernel change because they have a policy of not changing kernel defaults in userland. While 4096 might not be enough for *all* apps, it seems to be plenty for the apps I've seen lately that are unhappy with 1024. Signed-off-by: Tim Gardner <[email protected]> Cc: Dan Kegel <[email protected]> Cc: Al Viro <[email protected]> Cc: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25um: fix crash while os_dump_core()Richard Weinberger1-0/+1
os_dump_core() emits SIGTERM to terminate all UML processes. Kernel threads have to exit on SIGTERM instead of calling last_ditch_exit(). Multiple calls to last_ditch_exit() can cause a crash. Signed-off-by: Richard Weinberger <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25um: include linux/prefetch.hRichard Weinberger1-0/+2
Fix build failures on UML. Signed-off-by: Richard Weinberger <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25um: print info about fatal segfaultsRichard Weinberger1-0/+24
Print a short info about fatal segfaults like other archs do. Signed-off-by: Richard Weinberger <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25um: add ucast ethernet transportNolan Leake8-311/+413
The ucast transport is similar to the mcast transport (and, in fact, shares most of its code), only it uses UDP unicast to move packets. Obviously this is only useful for point-to-point connections between virtual ethernet devices. Signed-off-by: Nolan Leake <[email protected]> Signed-off-by: Richard Weinberger <[email protected]> Cc: David Miller <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25um: add earlyprintk supportRichard Weinberger5-0/+50
User Mode Linux can also benefit from earlyprintk. UML's earlyprintk writes kernel messages directly to stdout. Signed-off-by: Richard Weinberger <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25um: remove SIGHUP handlerRichard Weinberger1-1/+0
The UML kernel ignores SIGHUP anyway. This handler is in vain. Signed-off-by: Richard Weinberger <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25um: fix UML_LIB_PATHRichard Weinberger3-2/+8
UML_LIB_PATH is hardcoded to /usr/lib/uml/, on 64bit systems UML_LIB_PATH needs to be /usr/lib64/uml/. Signed-off-by: Richard Weinberger <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25cris: convert old cpumask API into new oneKOSAKI Motohiro2-17/+20
Adapt to the new API. [[email protected]: coding-style fixes] Signed-off-by: KOSAKI Motohiro <[email protected]> Cc: Mikael Starvik <[email protected]> Cc: Jesper Nilsson <[email protected]> Cc: Thiago Farina <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25mn10300: convert old cpumask API into new oneKOSAKI Motohiro4-63/+68
Adapt to the new API. We plan to remove old cpumask APIs later. Thus this patch converts them into the new one. Signed-off-by: KOSAKI Motohiro <[email protected]> Cc: David Howells <[email protected]> Cc: Koichi Yasutake <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Chris Metcalf <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25alpha: hook up gpiolib supportMark Brown2-0/+59
Allow people to use gpiolib on Alpha if they want to, mostly for build coverage. The header is a stright copy of that for Microblaze, which in turn was taken from PowerPC. [[email protected]: define GENERIC_GPIO] Signed-off-by: Mark Brown <[email protected]> Cc: Richard Henderson <[email protected]> Cc: Ivan Kokshaysky <[email protected]> Cc: Matt Turner <[email protected]> Acked-by: Grant Likely <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25alpha: replace with new cpumask APIsKOSAKI Motohiro5-12/+14
We plan to remove cpu_xx() old APIs. Thus convert them. This patch has no functional change. Signed-off-by: KOSAKI Motohiro <[email protected]> Cc: Richard Henderson <[email protected]> Cc: Ivan Kokshaysky <[email protected]> Cc: Matt Turner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25nommu: add page alignment to mmapBob Liu1-9/+14
Currently on nommu arch mmap(),mremap() and munmap() doesn't do page_align() which isn't consist with mmu arch and cause some issues. First, some drivers' mmap() function depends on vma->vm_end - vma->start is page aligned which is true on mmu arch but not on nommu. eg: uvc camera driver. Second munmap() may return -EINVAL[split file] error in cases when end is not page aligned(passed into from userspace) but vma->vm_end is aligned dure to split or driver's mmap() ops. Add page alignment to fix those issues. [[email protected]: coding-style fixes] Signed-off-by: Bob Liu <[email protected]> Cc: David Howells <[email protected]> Cc: Paul Mundt <[email protected]> Cc: Greg Ungerer <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25mm: batch activate_page() to reduce lock contentionShaohua Li1-5/+40
The zone->lru_lock is heavily contented in workload where activate_page() is frequently used. We could do batch activate_page() to reduce the lock contention. The batched pages will be added into zone list when the pool is full or page reclaim is trying to drain them. For example, in a 4 socket 64 CPU system, create a sparse file and 64 processes, processes shared map to the file. Each process read access the whole file and then exit. The process exit will do unmap_vmas() and cause a lot of activate_page() call. In such workload, we saw about 58% total time reduction with below patch. Other workloads with a lot of activate_page also benefits a lot too. Andrew Morton suggested activate_page() and putback_lru_pages() should follow the same path to active pages, but this is hard to implement (see commit 7a608572a282a ("Revert "mm: batch activate_page() to reduce lock contention")). On the other hand, do we really need putback_lru_pages() to follow the same path? I tested several FIO/FFSB benchmark (about 20 scripts for each benchmark) in 3 machines here from 2 sockets to 4 sockets. My test doesn't show anything significant with/without below patch (there is slight difference but mostly some noise which we found even without below patch before). Below patch basically returns to the same as my first post. I tested some microbenchmarks: case-anon-cow-rand-mt 0.58% case-anon-cow-rand -3.30% case-anon-cow-seq-mt -0.51% case-anon-cow-seq -5.68% case-anon-r-rand-mt 0.23% case-anon-r-rand 0.81% case-anon-r-seq-mt -0.71% case-anon-r-seq -1.99% case-anon-rx-rand-mt 2.11% case-anon-rx-seq-mt 3.46% case-anon-w-rand-mt -0.03% case-anon-w-rand -0.50% case-anon-w-seq-mt -1.08% case-anon-w-seq -0.12% case-anon-wx-rand-mt -5.02% case-anon-wx-seq-mt -1.43% case-fork 1.65% case-fork-sleep -0.07% case-fork-withmem 1.39% case-hugetlb -0.59% case-lru-file-mmap-read-mt -0.54% case-lru-file-mmap-read 0.61% case-lru-file-mmap-read-rand -2.24% case-lru-file-readonce -0.64% case-lru-file-readtwice -11.69% case-lru-memcg -1.35% case-mmap-pread-rand-mt 1.88% case-mmap-pread-rand -15.26% case-mmap-pread-seq-mt 0.89% case-mmap-pread-seq -69.72% case-mmap-xread-rand-mt 0.71% case-mmap-xread-seq-mt 0.38% The most significent are: case-lru-file-readtwice -11.69% case-mmap-pread-rand -15.26% case-mmap-pread-seq -69.72% which use activate_page a lot. others are basically variations because each run has slightly difference. In UP case, 'size mm/swap.o' before the two patches: text data bss dec hex filename 6466 896 4 7366 1cc6 mm/swap.o after the two patches: text data bss dec hex filename 6343 896 4 7243 1c4b mm/swap.o Signed-off-by: Shaohua Li <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Cc: Hiroyuki Kamezawa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Hugh Dickins <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25asm-generic/cacheflush.h: flush icache when copying to user pagesMike Frysinger1-1/+4
The copy_to_user_page() function is supposed to flush the icache on the memory that was written, but the current asm-generic version lacks that logic. While normally it isn't a big deal as the asm-generic version of icache flushing is a stub, it is a deal for ports that want to use the asm-generic version as a baseline and then overlay its own specific parts (like icache flushing). Signed-off-by: Mike Frysinger <[email protected]> Cc: Arnd Bergmann <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25mm/page_alloc.c: prevent unending loop in __alloc_pages_slowpath()Andrew Barry1-1/+1
I believe I found a problem in __alloc_pages_slowpath, which allows a process to get stuck endlessly looping, even when lots of memory is available. Running an I/O and memory intensive stress-test I see a 0-order page allocation with __GFP_IO and __GFP_WAIT, running on a system with very little free memory. Right about the same time that the stress-test gets killed by the OOM-killer, the utility trying to allocate memory gets stuck in __alloc_pages_slowpath even though most of the systems memory was freed by the oom-kill of the stress-test. The utility ends up looping from the rebalance label down through the wait_iff_congested continiously. Because order=0, __alloc_pages_direct_compact skips the call to get_page_from_freelist. Because all of the reclaimable memory on the system has already been reclaimed, __alloc_pages_direct_reclaim skips the call to get_page_from_freelist. Since there is no __GFP_FS flag, the block with __alloc_pages_may_oom is skipped. The loop hits the wait_iff_congested, then jumps back to rebalance without ever trying to get_page_from_freelist. This loop repeats infinitely. The test case is pretty pathological. Running a mix of I/O stress-tests that do a lot of fork() and consume all of the system memory, I can pretty reliably hit this on 600 nodes, in about 12 hours. 32GB/node. Signed-off-by: Andrew Barry <[email protected]> Signed-off-by: Minchan Kim <[email protected]> Reviewed-by: Rik van Riel<[email protected]> Acked-by: Mel Gorman <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25mm: add SECTION_ALIGN_UP() and SECTION_ALIGN_DOWN() macroDaniel Kiper1-0/+3
Add SECTION_ALIGN_UP() and SECTION_ALIGN_DOWN() macro which aligns given pfn to upper section and lower section boundary accordingly. Required for the latest memory hotplug support for the Xen balloon driver. Signed-off-by: Daniel Kiper <[email protected]> Reviewed-by: Konrad Rzeszutek Wilk <[email protected]> David Rientjes <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25memsw: remove noswapaccount kernel parameterMichal Hocko2-13/+3
The noswapaccount parameter has been deprecated since 2.6.38 without any complaints from users so we can remove it. swapaccount=0|1 can be used instead. As we are removing the parameter we can also clean up swapaccount because it doesn't have to accept an empty string anymore (to match noswapaccount) and so we can push = into __setup macro rather than checking "=1" resp. "=0" strings Signed-off-by: Michal Hocko <[email protected]> Cc: Hiroyuki Kamezawa <[email protected]> Cc: Daisuke Nishimura <[email protected]> Cc: Balbir Singh <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25proc: allocate storage for numa_maps statistics onceStephen Wilson1-9/+27
In show_numa_map() we collect statistics into a numa_maps structure. Since the number of NUMA nodes can be very large, this structure is not a candidate for stack allocation. Instead of going thru a kmalloc()+kfree() cycle each time show_numa_map() is invoked, perform the allocation just once when /proc/pid/numa_maps is opened. Performing the allocation when numa_maps is opened, and thus before a reference to the target tasks mm is taken, eliminates a potential stalemate condition in the oom-killer as originally described by Hugh Dickins: ... imagine what happens if the system is out of memory, and the mm we're looking at is selected for killing by the OOM killer: while we wait in __get_free_page for more memory, no memory is freed from the selected mm because it cannot reach exit_mmap while we hold that reference. Signed-off-by: Stephen Wilson <[email protected]> Reviewed-by: KOSAKI Motohiro <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: David Rientjes <[email protected]> Cc: Lee Schermerhorn <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Christoph Lameter <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25proc: make struct proc_maps_private truly privateStephen Wilson2-8/+8
Now that mm/mempolicy.c is no longer implementing /proc/pid/numa_maps there is no need to export struct proc_maps_private to the world. Move it to fs/proc/internal.h instead. Signed-off-by: Stephen Wilson <[email protected]> Reviewed-by: KOSAKI Motohiro <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: David Rientjes <[email protected]> Cc: Lee Schermerhorn <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Christoph Lameter <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25mm: proc: move show_numa_map() to fs/proc/task_mmu.cStephen Wilson2-185/+182
Moving show_numa_map() from mempolicy.c to task_mmu.c solves several issues. - Having the show() operation "miles away" from the corresponding seq_file iteration operations is a maintenance burden. - The need to export ad hoc info like struct proc_maps_private is eliminated. - The implementation of show_numa_map() can be improved in a simple manner by cooperating with the other seq_file operations (start, stop, etc) -- something that would be messy to do without this change. Signed-off-by: Stephen Wilson <[email protected]> Reviewed-by: KOSAKI Motohiro <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: David Rientjes <[email protected]> Cc: Lee Schermerhorn <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Christoph Lameter <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25mm: declare mpol_to_str() when CONFIG_TMPFS=nStephen Wilson1-2/+2
When CONFIG_TMPFS=n mpol_to_str() is not declared in mempolicy.h. However, in the NUMA case, the definition is always compiled. Since it is not strictly true that tmpfs is the only client, and since the symbol was always lurking around anyways, export mpol_to_str() unconditionally. Furthermore, this will allow us to move show_numa_map() out of mempolicy.c and into the procfs subsystem. Signed-off-by: Stephen Wilson <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: David Rientjes <[email protected]> Cc: Lee Schermerhorn <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Randy Dunlap <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>