aboutsummaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)AuthorFilesLines
2014-06-09Merge tag 'trace-3.16' of ↵Linus Torvalds5-15/+34
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "Lots of tweaks, small fixes, optimizations, and some helper functions to help out the rest of the kernel to ease their use of trace events. The big change for this release is the allowing of other tracers, such as the latency tracers, to be used in the trace instances and allow for function or function graph tracing to be in the top level simultaneously" * tag 'trace-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (44 commits) tracing: Fix memory leak on instance deletion tracing: Fix leak of ring buffer data when new instances creation fails tracing/kprobes: Avoid self tests if tracing is disabled on boot up tracing: Return error if ftrace_trace_arrays list is empty tracing: Only calculate stats of tracepoint benchmarks for 2^32 times tracing: Convert stddev into u64 in tracepoint benchmark tracing: Introduce saved_cmdlines_size file tracing: Add __get_dynamic_array_len() macro for trace events tracing: Remove unused variable in trace_benchmark tracing: Eliminate double free on failure of allocation on boot up ftrace/x86: Call text_ip_addr() instead of the duplicated code tracing: Print max callstack on stacktrace bug tracing: Move locking of trace_cmdline_lock into start/stop seq calls tracing: Try again for saved cmdline if failed due to locking tracing: Have saved_cmdlines use the seq_read infrastructure tracing: Add tracepoint benchmark tracepoint tracing: Print nasty banner when trace_printk() is in use tracing: Add funcgraph_tail option to print function name after closing braces tracing: Eliminate duplicate TRACE_GRAPH_PRINT_xx defines tracing: Add __bitmask() macro to trace events to cpumasks and other bitmasks ...
2014-06-09Merge branch 'for-3.16' of ↵Linus Torvalds2-107/+176
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: "A lot of activities on cgroup side. Heavy restructuring including locking simplification took place to improve the code base and enable implementation of the unified hierarchy, which currently exists behind a __DEVEL__ mount option. The core support is mostly complete but individual controllers need further work. To explain the design and rationales of the the unified hierarchy Documentation/cgroups/unified-hierarchy.txt is added. Another notable change is css (cgroup_subsys_state - what each controller uses to identify and interact with a cgroup) iteration update. This is part of continuing updates on css object lifetime and visibility. cgroup started with reference count draining on removal way back and is now reaching a point where csses behave and are iterated like normal refcnted objects albeit with some complexities to allow distinguishing the state where they're being deleted. The css iteration update isn't taken advantage of yet but is planned to be used to simplify memcg significantly" * 'for-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (77 commits) cgroup: disallow disabled controllers on the default hierarchy cgroup: don't destroy the default root cgroup: disallow debug controller on the default hierarchy cgroup: clean up MAINTAINERS entries cgroup: implement css_tryget() device_cgroup: use css_has_online_children() instead of has_children() cgroup: convert cgroup_has_live_children() into css_has_online_children() cgroup: use CSS_ONLINE instead of CGRP_DEAD cgroup: iterate cgroup_subsys_states directly cgroup: introduce CSS_RELEASED and reduce css iteration fallback window cgroup: move cgroup->serial_nr into cgroup_subsys_state cgroup: link all cgroup_subsys_states in their sibling lists cgroup: move cgroup->sibling and ->children into cgroup_subsys_state cgroup: remove cgroup->parent device_cgroup: remove direct access to cgroup->children memcg: update memcg_has_children() to use css_next_child() memcg: remove tasks/children test from mem_cgroup_force_empty() cgroup: remove css_parent() cgroup: skip refcnting on normal root csses and cgrp_dfl_root self css cgroup: use cgroup->self.refcnt for cgroup refcnting ...
2014-06-09Merge branch 'for-3.16' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata Pull libata updates from Tejun Heo: "Nothing too interesting - another ahci platform driver variant, additional controller support, minor fixes and cleanups" * 'for-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: ahci: Add Device ID for HighPoint RocketRaid 642L ata: ep93xx: use dmaengine_prep_slave_sg api instead of internal callback ahci: add PCI ID for Marvell 88SE91A0 SATA Controller sata_fsl: remove check for CONFIG_MPC8315_DS ahci: add support for Hisilicon sata libahci_platform: add host_flags parameter in ahci_platform_init_host() ata: ahci: append new hflag AHCI_HFLAG_NO_FBS ata: use CONFIG_PM_SLEEP instead of CONFIG_PM where applicable in host drivers ata: ahci_mvebu: new driver for Marvell Armada 380 AHCI interfaces Documentation: dt-bindings: reformat and order list of ahci-platform compatibles libata-sff: remove dead code ata: SATL compliance for Inquiry Product Revision pata_octeon_cf: use devm_kzalloc() to allocate cf_port
2014-06-09Merge branch 'for-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wqLinus Torvalds1-35/+5
Pull workqueue updates from Tejun Heo: "Lai simplified worker destruction path and internal workqueue locking and there are some other minor changes. Except for the removal of some long-deprecated interfaces which haven't had any in-kernel user for quite a while, there shouldn't be any difference to workqueue users" * 'for-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: kernel/workqueue.c: pr_warning/pr_warn & printk/pr_info workqueue: remove the confusing POOL_FREEZING workqueue: rename first_worker() to first_idle_worker() workqueue: remove unused work_clear_pending() workqueue: remove unused WORK_CPU_END workqueue: declare system_highpri_wq workqueue: use generic attach/detach routine for rescuers workqueue: separate pool-attaching code out from create_worker() workqueue: rename manager_mutex to attach_mutex workqueue: narrow the protection range of manager_mutex workqueue: convert worker_idr to worker_ida workqueue: separate iteration role from worker_idr workqueue: destroy worker directly in the idle timeout handler workqueue: async worker destruction workqueue: destroy_worker() should destroy idle workers only workqueue: use manager lock only to protect worker_idr workqueue: Remove deprecated system_nrt[_freezable]_wq workqueue: Remove deprecated flush[_delayed]_work_sync() kernel/workqueue.c: pr_warning/pr_warn & printk/pr_info workqueue: simplify wq_update_unbound_numa() by jumping to use_dfl_pwq if the target cpumask equals wq's
2014-06-09Merge branch 'for-3.16' of ↵Linus Torvalds3-3/+35
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu Pull percpu updates from Tejun Heo: "Nothing too exciting. percpu_ref is going through some interface changes and getting new features with more changes in the pipeline but given its young age and few users, it's very low impact" * 'for-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: percpu-refcount: implement percpu_ref_tryget() percpu-refcount: rename percpu_ref_tryget() to percpu_ref_tryget_live() percpu: Replace __get_cpu_var with this_cpu_ptr
2014-06-09Merge branch 'topic/xilinx' into for-linusVinod Koul1-0/+47
2014-06-09Merge branch 'topic/dw' into for-linusVinod Koul1-1/+1
2014-06-08Merge tag 'ext4_for_linus' of ↵Linus Torvalds1-1/+11
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: "Clean ups and miscellaneous bug fixes, in particular for the new collapse_range and zero_range fallocate functions. In addition, improve the scalability of adding and remove inodes from the orphan list" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (25 commits) ext4: handle symlink properly with inline_data ext4: fix wrong assert in ext4_mb_normalize_request() ext4: fix zeroing of page during writeback ext4: remove unused local variable "stored" from ext4_readdir(...) ext4: fix ZERO_RANGE test failure in data journalling ext4: reduce contention on s_orphan_lock ext4: use sbi in ext4_orphan_{add|del}() ext4: use EXT_MAX_BLOCKS in ext4_es_can_be_merged() ext4: add missing BUFFER_TRACE before ext4_journal_get_write_access ext4: remove unnecessary double parentheses ext4: do not destroy ext4_groupinfo_caches if ext4_mb_init() fails ext4: make local functions static ext4: fix block bitmap validation when bigalloc, ^flex_bg ext4: fix block bitmap initialization under sparse_super2 ext4: find the group descriptors on a 1k-block bigalloc,meta_bg filesystem ext4: avoid unneeded lookup when xattr name is invalid ext4: fix data integrity sync in ordered mode ext4: remove obsoleted check ext4: add a new spinlock i_raw_lock to protect the ext4's raw inode ext4: fix locking for O_APPEND writes ...
2014-06-08Merge branch 'next' (accumulated 3.16 merge window patches) into masterLinus Torvalds182-2404/+5189
Now that 3.15 is released, this merges the 'next' branch into 'master', bringing us to the normal situation where my 'master' branch is the merge window. * accumulated work in next: (6809 commits) ufs: sb mutex merge + mutex_destroy powerpc: update comments for generic idle conversion cris: update comments for generic idle conversion idle: remove cpu_idle() forward declarations nbd: zero from and len fields in NBD_CMD_DISCONNECT. mm: convert some level-less printks to pr_* MAINTAINERS: adi-buildroot-devel is moderated MAINTAINERS: add linux-api for review of API/ABI changes mm/kmemleak-test.c: use pr_fmt for logging fs/dlm/debug_fs.c: replace seq_printf by seq_puts fs/dlm/lockspace.c: convert simple_str to kstr fs/dlm/config.c: convert simple_str to kstr mm: mark remap_file_pages() syscall as deprecated mm: memcontrol: remove unnecessary memcg argument from soft limit functions mm: memcontrol: clean up memcg zoneinfo lookup mm/memblock.c: call kmemleak directly from memblock_(alloc|free) mm/mempool.c: update the kmemleak stack trace for mempool allocations lib/radix-tree.c: update the kmemleak stack trace for radix tree allocations mm: introduce kmemleak_update_trace() mm/kmemleak.c: use %u to print ->checksum ...
2014-06-07Merge tag 'clk-for-linus-3.16' of ↵Linus Torvalds3-57/+95
git://git.linaro.org/people/mike.turquette/linux into next Pull clock framework updates from Mike Turquette: "The clock framework changes for 3.16 are pretty typical: mostly clock driver additions and fixes. There are additions to the clock core code for some of the basic types (e.g. the common divider type has some fixes and featured added to it). One minor annoyance is a last-minute dependency that wasn't handled quite right. Commit ba0fae3b06a6 ("clk: berlin: add core clock driver for BG2/BG2CD") in this pull request depends on include/dt-bindings/clock/berlin2.h, which is already in your tree via the arm-soc pull request. Building for the berlin platform will break when the clk tree is built on it's own, but merged into your master branch everything should be fine" * tag 'clk-for-linus-3.16' of git://git.linaro.org/people/mike.turquette/linux: (75 commits) mmc: sunxi: Add driver for SD/MMC hosts found on Allwinner sunxi SoCs clk: export __clk_round_rate for providers clk: versatile: free icst on error return clk: qcom: Return error pointers for unimplemented clocks clk: qcom: Support msm8974pro global clock control hardware clk: qcom: Properly support display clocks on msm8974 clk: qcom: Support display RCG clocks clk: qcom: Return highest rate when round_rate() exceeds plan clk: qcom: Fix mmcc-8974's PLL configurations clk: qcom: Fix clk_rcg2_is_enabled() check clk: berlin: add core clock driver for BG2Q clk: berlin: add core clock driver for BG2/BG2CD clk: berlin: add driver for BG2x complex divider cells clk: berlin: add driver for BG2x simple PLLs clk: berlin: add driver for BG2x audio/video PLL clk: st: Terminate of match table clk/exynos4: Fix compilation warning ARM: shmobile: r8a7779: Add clock index macros for DT sources clk: divider: Fix overflow in clk_divider_bestdiv clk: u300: Terminate of match table ...
2014-06-07Merge tag 'vfio-v3.16-rc1' of git://github.com/awilliam/linux-vfio into nextLinus Torvalds1-3/+2
Pull VFIO updates from Alex Williamson: "A handful of VFIO bug fixes for v3.16" * tag 'vfio-v3.16-rc1' of git://github.com/awilliam/linux-vfio: drivers/vfio/pci: Fix wrong MSI interrupt count drivers/vfio: Rework offsetofend() vfio/iommu_type1: Avoid overflow vfio/pci: Fix unchecked return value vfio/pci: Fix sizing of DPA and THP express capabilities
2014-06-06Merge branch 'akpm' (patches from Andrew Morton) into nextLinus Torvalds10-19/+66
Merge more updates from Andrew Morton: - Most of the rest of MM. This includes "mark remap_file_pages syscall as deprecated" but the actual "replace remap_file_pages syscall with emulation" is held back. I guess we'll need to work out when to pull the trigger on that one. - various minor cleanups to obscure filesystems - the drivers/rtc queue - hfsplus updates - ufs, hpfs, fatfs, affs, reiserfs - Documentation/ - signals - procfs - cpu hotplug - lib/idr.c - rapidio - sysctl - ipc updates * emailed patches from Andrew Morton <[email protected]>: (171 commits) ufs: sb mutex merge + mutex_destroy powerpc: update comments for generic idle conversion cris: update comments for generic idle conversion idle: remove cpu_idle() forward declarations nbd: zero from and len fields in NBD_CMD_DISCONNECT. mm: convert some level-less printks to pr_* MAINTAINERS: adi-buildroot-devel is moderated MAINTAINERS: add linux-api for review of API/ABI changes mm/kmemleak-test.c: use pr_fmt for logging fs/dlm/debug_fs.c: replace seq_printf by seq_puts fs/dlm/lockspace.c: convert simple_str to kstr fs/dlm/config.c: convert simple_str to kstr mm: mark remap_file_pages() syscall as deprecated mm: memcontrol: remove unnecessary memcg argument from soft limit functions mm: memcontrol: clean up memcg zoneinfo lookup mm/memblock.c: call kmemleak directly from memblock_(alloc|free) mm/mempool.c: update the kmemleak stack trace for mempool allocations lib/radix-tree.c: update the kmemleak stack trace for radix tree allocations mm: introduce kmemleak_update_trace() mm/kmemleak.c: use %u to print ->checksum ...
2014-06-06svcrdma: refactor marshalling logicSteve Wise1-2/+1
This patch refactors the NFSRDMA server marshalling logic to remove the intermediary map structures. It also fixes an existing bug where the NFSRDMA server was not minding the device fast register page list length limitations. Signed-off-by: Tom Tucker <[email protected]> Signed-off-by: Steve Wise <[email protected]>
2014-06-06nfs4: remove unused CHANGE_SECURITY_LABELJ. Bruce Fields1-2/+0
This constant has the wrong value. And we don't use it. And it's been removed from the 4.2 spec anyway. Signed-off-by: J. Bruce Fields <[email protected]>
2014-06-06idle: remove cpu_idle() forward declarationsGeert Uytterhoeven2-3/+0
After all architectures were converted to the generic idle framework, commit d190e8195b90 ("idle: Remove GENERIC_IDLE_LOOP config switch") removed the last caller of cpu_idle(). The forward declarations in header files were forgotten. Signed-off-by: Geert Uytterhoeven <[email protected]> Cc: Thomas Gleixner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-06-06mm: introduce kmemleak_update_trace()Catalin Marinas1-0/+4
The memory allocation stack trace is not always useful for debugging a memory leak (e.g. radix_tree_preload). This function, when called, updates the stack trace for an already allocated object. Signed-off-by: Catalin Marinas <[email protected]> Cc: Johannes Weiner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-06-06key: convert use of typedef ctl_table to struct ctl_tableJoe Perches1-1/+1
This typedef is unnecessary and should just be removed. Signed-off-by: Joe Perches <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-06-06ipc/shm.c: increase the defaults for SHMALL, SHMMAXManfred Spraul1-2/+1
System V shared memory a) can be abused to trigger out-of-memory conditions and the standard measures against out-of-memory do not work: - it is not possible to use setrlimit to limit the size of shm segments. - segments can exist without association with any processes, thus the oom-killer is unable to free that memory. b) is typically used for shared information - today often multiple GB. (e.g. database shared buffers) The current default is a maximum segment size of 32 MB and a maximum total size of 8 GB. This is often too much for a) and not enough for b), which means that lots of users must change the defaults. This patch increases the default limits (nearly) to the maximum, which is perfect for case b). The defaults are used after boot and as the initial value for each new namespace. Admins/distros that need a protection against a) should reduce the limits and/or enable shm_rmid_forced. Unix has historically required setting these limits for shared memory, and Linux inherited such behavior. The consequence of this is added complexity for users and administrators. One very common example are Database setup/installation documents and scripts, where users must manually calculate the values for these limits. This also requires (some) knowledge of how the underlying memory management works, thus causing, in many occasions, the limits to just be flat out wrong. Disabling these limits sooner could have saved companies a lot of time, headaches and money for support. But it's never too late, simplify users life now. Further notes: - The patch only changes default, overrides behave as before: # sysctl kernel.shmall=33554432 would recreate the previous limit for SHMMAX (for the current namespace). - Disabling sysv shm allocation is possible with: # sysctl kernel.shmall=0 (not a new feature, also per-namespace) - The limits are intentionally set to a value slightly less than ULONG_MAX, to avoid triggering overflows in user space apps. [not unreasonable, see http://marc.info/?l=linux-mm&m=139638334330127] Signed-off-by: Manfred Spraul <[email protected]> Signed-off-by: Davidlohr Bueso <[email protected]> Reported-by: Davidlohr Bueso <[email protected]> Acked-by: Michael Kerrisk <[email protected]> Acked-by: KOSAKI Motohiro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-06-06idr: reorder the fieldsLai Jiangshan1-5/+8
idr_layer->layer is always accessed in read path, move it in the front. idr_layer->bitmap is moved on the bottom. And rcu_head shares with bitmap due to they do not be accessed at the same time. idr->id_free/id_free_cnt/lock are free list fields, and moved to the bottom. They will be removed in near future. Signed-off-by: Lai Jiangshan <[email protected]> Cc: Tejun Heo <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-06-06signals: introduce kernel_sigaction()Oleg Nesterov1-2/+16
Now that allow_signal() is really trivial we can unify it with disallow_signal(). Add the new helper, kernel_sigaction(), and reimplement allow_signal/disallow_signal as a trivial wrappers. This saves one EXPORT_SYMBOL() and the new helper can have more users. Signed-off-by: Oleg Nesterov <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Al Viro <[email protected]> Cc: David Woodhouse <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Tejun Heo <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-06-06signals: mv {dis,}allow_signal() from sched.h/exit.c to signal.[ch]Oleg Nesterov2-3/+2
Move the declaration/definition of allow_signal/disallow_signal to signal.h/signal.c. The new place is more logical and allows to use the static helpers in signal.c (see the next changes). While at it, make them return void and remove the valid_signal() check. Nobody checks the returned value, and in-kernel users must not pass the wrong signal number. Signed-off-by: Oleg Nesterov <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Al Viro <[email protected]> Cc: David Woodhouse <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Tejun Heo <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-06-06signals: kill sigfindinword()Oleg Nesterov1-5/+0
It has no users and it doesn't look useful. I do not know why/when it was introduced, I can't even find any user in the git history. Signed-off-by: Oleg Nesterov <[email protected]> Acked-by: Geert Uytterhoeven <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Al Viro <[email protected]> Cc: David Woodhouse <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Tejun Heo <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-06-06ptrace: fix fork event messages across pid namespacesMatthew Dempsky1-0/+32
When tracing a process in another pid namespace, it's important for fork event messages to contain the child's pid as seen from the tracer's pid namespace, not the parent's. Otherwise, the tracer won't be able to correlate the fork event with later SIGTRAP signals it receives from the child. We still risk a race condition if a ptracer from a different pid namespace attaches after we compute the pid_t value. However, sending a bogus fork event message in this unlikely scenario is still a vast improvement over the status quo where we always send bogus fork event messages to debuggers in a different pid namespace than the forking process. Signed-off-by: Matthew Dempsky <[email protected]> Acked-by: Oleg Nesterov <[email protected]> Cc: Kees Cook <[email protected]> Cc: Julien Tinnes <[email protected]> Cc: Roland McGrath <[email protected]> Cc: Jan Kratochvil <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-06-06drivers/rtc/rtc-cmos.c: drivers/char/rtc.c features for DECstation supportMaciej W. Rozycki1-0/+4
This brings in drivers/char/rtc.c functionality required for DECstation and, should the maintainers decide to switch, Alpha systems to use rtc-cmos. Specifically these features are made available: * RTC iomem rather than x86/PCI port I/O mapping, controlled with the RTC_IOMAPPED macro as with the original driver. The DS1287A chip in all DECstation systems is mapped in the host bus address space as a contiguous block of 64 32-bit words of which the least significant byte accesses the RTC chip for both reads and writes. All the address and data window register accesses are made transparently by the chipset glue logic so that the device appears directly mapped on the host bus. * A way to set the size of the address space explicitly with the newly-added `address_space' member of the platform part of the RTC device structure. This avoids the unreliable heuristics that does not work in a setup where the RTC is not explicitly accessed with the usual address and data window register pair. * The ability to use the RTC periodic interrupt as a system clock device, which is implemented by arch/mips/kernel/cevt-ds1287.c for DECstation systems and takes the RTC interrupt away from the RTC driver. Eventually hooking back to the clock device's interrupt handler should be possible for the purpose of the alarm clock and possibly also update-in-progress interrupt, but this is not done by this change. o To avoid interfering with the clock interrupt all the places where the RTC interrupt mask is fiddled with are only executed if and IRQ has been assigned to the RTC driver. o To avoid changing the clock setup Register A is not fiddled with if CMOS_RTC_FLAGS_NOFREQ is set in the newly-added `flags' member of the platform part of the RTC device structure. Originally, in drivers/char/rtc.c, this was keyed with the absence of the RTC interrupt, just like the interrupt mask, but there only the periodic interrupt frequency is set, whereas rtc-cmos also sets the divider bits. Therefore a new flag is introduced so that systems where the RTC interrupt is not usable rather than used as a system clock device can fully initialise the RTC. * A small clean-up is made to the IRQ assignment code that makes the IRQ number hardcoded to -1 rather than arbitrary -ENXIO (or whatever error happens to be returned by platform_get_irq) where no IRQ has been assigned to the RTC driver (NO_IRQ might be another candidate, but it looks like this macro has inconsistent or missing definitions and limited use and might therefore be unsafe). Verified to work correctly with a DECstation 5000/240 system. [[email protected]: fix weird code layout] Signed-off-by: Maciej W. Rozycki <[email protected]> Cc: Alessandro Zummo <[email protected]> Cc: Ralf Baechle <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-06-06Merge tag 'mfd-for-linus-3.16' of ↵Linus Torvalds16-1288/+2522
git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd into next Pull MFD updates from Lee Jones: "Changes to existing drivers: - increase DT coverage: arizona, mc13xxx, stmpe-i2c, syscon, sun6i-prcm - regmap use of and/or clean-up: tps65090, twl6040 - basic renaming: max14577 - use new cpufreq helpers: db8500-prcmu - increase regulator support: stmpe, arizona, wm5102 - reduce legacy GPIO overhead: stmpe - provide necessary remove path: bcm590xx - expand sysfs presence: kempld - move driver specific code out to drivers: rtc-s5m, arizona - clk handling: twl6040 - use managed (devm_*) resources: ipaq-micro - clean-up/remove unused/duplicated code: tps65218, sec, pm8921, abx500-core, db8500-prcmu, menelaus - build/boot/sematic bug fixes: rtsx_usb, stmpe, bcm590xx, abx500, mc13xxx, rdc321x-southbridge, mfd-core, sec, max14577, syscon, cros_ec_spi - constify stuff: sm501, tps65910, tps6507x, tps6586x, max77686, max8997, kempld, max77693, max8907, rtsx_usb, db8500-prcmu, max8998, wm8400, sec, lp3943, max14577, as3711, omap-usb-host, ipaq-micro Support for new devices: - add support for max77836 into max14577 - add support for tps658640 into tps6586x - add support for cros-ec-i2c-tunnel into cros_ec - add new driver for rtsx_usb_sdmmc and rtsx_usb_ms - add new driver for axp20x - add new driver for sun6i-prcm - add new driver for ipaq-micro" * tag 'mfd-for-linus-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (77 commits) mfd: wm5102: Correct default for LDO Control 2 register mfd: menelaus: Use module_i2c_driver mfd: tps65218: Terminate of match table mfd: db8500-prcmu: Remove check for CONFIG_DBX500_PRCMU_DEBUG mfd: ti-keystone-devctrl: Add bindings for device state control mfd: palmas: Format the header file mfd: abx500-core: Remove unused function abx500_dump_all_banks() mfd: arizona: Correct addresses of always-on trigger registers mfd: max14577: Cast to architecture agnostic data type i2c: ChromeOS EC tunnel driver mfd: cros_ec: Sync to the latest cros_ec_commands.h from EC sources mfd: cros_ec: spi: Increase cros_ec_spi deadline from 5ms to 100ms mfd: cros_ec: spi: Make the cros_ec_spi timeout more reliable mfd: cros_ec: spi: Add mutex to cros_ec_spi mfd: cros_ec: spi: Calculate delay between transfers correctly mfd: arizona: Correct error message for addition of main IRQ chip mfd: wm8997: Add registers for high power mode mfd: arizona: Add MICVDD to mapped regulators mfd: ipaq-micro: Make mfd_cell array const mfd: ipaq-micro: Use devm_ioremap_resource() ...
2014-06-06Merge tag 'iommu-updates-v3.16' of ↵Linus Torvalds1-0/+24
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu into next Pull IOMMU updates from Joerg Roedel: "The changes include: - a new IOMMU driver for ARM Renesas SOCs - updates and fixes for the ARM Exynos driver to bring it closer to a usable state again - convert the AMD IOMMUv2 driver to use the mmu_notifier->release call-back instead of the task_exit notifier - random other fixes and minor improvements to a number of other IOMMU drivers" * tag 'iommu-updates-v3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (54 commits) iommu/msm: Use devm_ioremap_resource to simplify code iommu/amd: Fix recently introduced compile warnings arm/ipmmu-vmsa: Fix compile error iommu/exynos: Fix checkpatch warning iommu/exynos: Fix trivial typo iommu/exynos: Remove invalid symbol dependency iommu: fsl_pamu.c: Fix for possible null pointer dereference iommu/amd: Remove duplicate checking code iommu/amd: Handle parallel invalidate_range_start/end calls correctly iommu/amd: Remove IOMMUv2 pasid_state_list iommu/amd: Implement mmu_notifier_release call-back iommu/amd: Convert IOMMUv2 state_table into state_list iommu/amd: Don't access IOMMUv2 state_table directly iommu/ipmmu-vmsa: Support clearing mappings iommu/ipmmu-vmsa: Remove stage 2 PTE bits definitions iommu/ipmmu-vmsa: Support 2MB mappings iommu/ipmmu-vmsa: Rewrite page table management iommu/ipmmu-vmsa: PMD is never folded, PUD always is iommu/ipmmu-vmsa: Set the PTE contiguous hint bit when possible iommu/ipmmu-vmsa: Define driver-specific page directory sizes ...
2014-06-06Merge tag 'arm64-upstream' of ↵Linus Torvalds1-16/+18
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux into next Pull arm64 updates from Catalin Marinas: - Optimised assembly string/memory routines (based on the AArch64 Cortex Strings library contributed to glibc but re-licensed under GPLv2) - Optimised crypto algorithms making use of the ARMv8 crypto extensions (together with kernel API for using FPSIMD instructions in interrupt context) - Ftrace support - CPU topology parsing from DT - ESR_EL1 (Exception Syndrome Register) exposed to user space signal handlers for SIGSEGV/SIGBUS (useful to emulation tools like Qemu) - 1GB section linear mapping if applicable - Barriers usage clean-up - Default pgprot clean-up Conflicts as per Catalin. * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (57 commits) arm64: kernel: initialize broadcast hrtimer based clock event device arm64: ftrace: Add system call tracepoint arm64: ftrace: Add CALLER_ADDRx macros arm64: ftrace: Add dynamic ftrace support arm64: Add ftrace support ftrace: Add arm64 support to recordmcount arm64: Add 'notrace' attribute to unwind_frame() for ftrace arm64: add __ASSEMBLY__ in asm/insn.h arm64: Fix linker script entry point arm64: lib: Implement optimized string length routines arm64: lib: Implement optimized string compare routines arm64: lib: Implement optimized memcmp routine arm64: lib: Implement optimized memset routine arm64: lib: Implement optimized memmove routine arm64: lib: Implement optimized memcpy routine arm64: defconfig: enable a few more common/useful options in defconfig ftrace: Make CALLER_ADDRx macros more generic arm64: Fix deadlock scenario with smp_send_stop() arm64: Fix machine_shutdown() definition arm64: Support arch_irq_work_raise() via self IPIs ...
2014-06-06blk-mq: bump max tag depth to 10K tagsJens Axboe1-1/+1
For some scsi-mq cases, the tag map can be huge. So increase the max number of tags we support. Additionally, don't fail with EINVAL if a user requests too many tags. Warn that the tag depth has been adjusted down, and store the new value inside the tag_set passed in. Signed-off-by: Jens Axboe <[email protected]>
2014-06-06block: add blk_rq_set_block_pc()Jens Axboe1-0/+1
With the optimizations around not clearing the full request at alloc time, we are leaving some of the needed init for REQ_TYPE_BLOCK_PC up to the user allocating the request. Add a blk_rq_set_block_pc() that sets the command type to REQ_TYPE_BLOCK_PC, and properly initializes the members associated with this type of request. Update callers to use this function instead of manipulating rq->cmd_type directly. Includes fixes from Christoph Hellwig <[email protected]> for my half-assed attempt. Signed-off-by: Jens Axboe <[email protected]>
2014-06-06perf: Differentiate exec() and non-exec() comm eventsAdrian Hunter2-3/+7
perf tools like 'perf report' can aggregate samples by comm strings, which generally works. However, there are other potential use-cases. For example, to pair up 'calls' with 'returns' accurately (from branch events like Intel BTS) it is necessary to identify whether the process has exec'd. Although a comm event is generated when an 'exec' happens it is also generated whenever the comm string is changed on a whim (e.g. by prctl PR_SET_NAME). This patch adds a flag to the comm event to differentiate one case from the other. In order to determine whether the kernel supports the new flag, a selection bit named 'exec' is added to struct perf_event_attr. The bit does nothing but will cause perf_event_open() to fail if the bit is set on kernels that do not have it defined. Signed-off-by: Adrian Hunter <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Cc: Paul Mackerras <[email protected]> Cc: Dave Jones <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Alexander Viro <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Ingo Molnar <[email protected]>
2014-06-06Merge branch 'perf/urgent' into perf/core, to resolve conflict and to ↵Ingo Molnar19-36/+123
prepare for new patches Conflicts: arch/x86/kernel/traps.c Signed-off-by: Ingo Molnar <[email protected]>
2014-06-06perf: Fix perf_event_comm() vs. exec() assumptionPeter Zijlstra1-1/+3
perf_event_comm() assumes that set_task_comm() is only called on exec(), and in particular that its only called on current. Neither are true, as Dave reported a WARN triggered by set_task_comm() being called on !current. Separate the exec() hook from the comm hook. Reported-by: Dave Jones <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Cc: Alexander Viro <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] [ Build fix. ] Signed-off-by: Ingo Molnar <[email protected]>
2014-06-05Merge branch 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm into nextLinus Torvalds5-0/+28
Pull ARM updates from Russell King: - Major clean-up of the L2 cache support code. The existing mess was becoming rather unmaintainable through all the additions that others have done over time. This turns it into a much nicer structure, and implements a few performance improvements as well. - Clean up some of the CP15 control register tweaks for alignment support, moving some code and data into alignment.c - DMA properties for ARM, from Santosh and reviewed by DT people. This adds DT properties to specify bus translations we can't discover automatically, and to indicate whether devices are coherent. - Hibernation support for ARM - Make ftrace work with read-only text in modules - add suspend support for PJ4B CPUs - rework interrupt masking for undefined instruction handling, which allows us to enable interrupts earlier in the handling of these exceptions. - support for big endian page tables - fix stacktrace support to exclude stacktrace functions from the trace, and add save_stack_trace_regs() implementation so that kprobes can record stack traces. - Add support for the Cortex-A17 CPU. - Remove last vestiges of ARM710 support. - Removal of ARM "meminfo" structure, finally converting us solely to memblock to handle the early memory initialisation. * 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: (142 commits) ARM: ensure C page table setup code follows assembly code (part II) ARM: ensure C page table setup code follows assembly code ARM: consolidate last remaining open-coded alignment trap enable ARM: remove global cr_no_alignment ARM: remove CPU_CP15 conditional from alignment.c ARM: remove unused adjust_cr() function ARM: move "noalign" command line option to alignment.c ARM: provide common method to clear bits in CPU control register ARM: 8025/1: Get rid of meminfo ARM: 8060/1: mm: allow sub-architectures to override PCI I/O memory type ARM: 8066/1: correction for ARM patch 8031/2 ARM: 8049/1: ftrace/add save_stack_trace_regs() implementation ARM: 8065/1: remove last use of CONFIG_CPU_ARM710 ARM: 8062/1: Modify ldrt fixup handler to re-execute the userspace instruction ARM: 8047/1: rwsem: use asm-generic rwsem implementation ARM: l2c: trial at enabling some Cortex-A9 optimisations ARM: l2c: add warnings for stuff modifying aux_ctrl register values ARM: l2c: print a warning with L2C-310 caches if the cache size is modified ARM: l2c: remove old .set_debug method ARM: l2c: kill L2X0_AUX_CTRL_MASK before anyone else makes use of this ...
2014-06-05net: phy: fix sparse warning in fixed.cKonrad Zapalowicz1-0/+5
This commit fixes the following sparse warning: drivers/net/phy/fixed.c:207 - warning: symbol 'fixed_phy_del' was not declared. Should it be static? by adding symbol definition to the phy_fixed.h API file. It is ok to do because the function in question is an exported symbol. Signed-off-by: Konrad Zapalowicz <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-06-05cpufreq: add support for intermediate (stable) frequenciesViresh Kumar1-0/+25
Douglas Anderson, recently pointed out an interesting problem due to which udelay() was expiring earlier than it should. While transitioning between frequencies few platforms may temporarily switch to a stable frequency, waiting for the main PLL to stabilize. For example: When we transition between very low frequencies on exynos, like between 200MHz and 300MHz, we may temporarily switch to a PLL running at 800MHz. No CPUFREQ notification is sent for that. That means there's a period of time when we're running at 800MHz but loops_per_jiffy is calibrated at between 200MHz and 300MHz. And so udelay behaves badly. To get this fixed in a generic way, introduce another set of callbacks get_intermediate() and target_intermediate(), only for drivers with target_index() and CPUFREQ_ASYNC_NOTIFICATION unset. get_intermediate() should return a stable intermediate frequency platform wants to switch to, and target_intermediate() should set CPU to that frequency, before jumping to the frequency corresponding to 'index'. Core will take care of sending notifications and driver doesn't have to handle them in target_intermediate() or target_index(). NOTE: ->target_index() should restore to policy->restore_freq in case of failures as core would send notifications for that. Tested-by: Stephen Warren <[email protected]> Signed-off-by: Viresh Kumar <[email protected]> Reviewed-by: Doug Anderson <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2014-06-05Merge branch 'arm64-efi-for-linus' of ↵Linus Torvalds1-0/+12
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next Pull ARM64 EFI update from Peter Anvin: "By agreement with the ARM64 EFI maintainers, we have agreed to make -tip the upstream for all EFI patches. That is why this patchset comes from me :) This patchset enables EFI stub support for ARM64, like we already have on x86" * 'arm64-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: arm64: efi: only attempt efi map setup if booting via EFI efi/arm64: ignore dtb= when UEFI SecureBoot is enabled doc: arm64: add description of EFI stub support arm64: efi: add EFI stub doc: arm: add UEFI support documentation arm64: add EFI runtime services efi: Add shared FDT related functions for ARM/ARM64 arm64: Add function to create identity mappings efi: add helper function to get UEFI params from FDT doc: efi-stub.txt updates for ARM lib: add fdt_empty_tree.c
2014-06-05block: add notion of a chunk size for request mergingJens Axboe1-1/+21
Some drivers have different limits on what size a request should optimally be, depending on the offset of the request. Similar to dividing a device into chunks. Add a setting that allows the driver to inform the block layer of such a chunk size. The block layer will then prevent merging across the chunks. This is needed to optimally support NVMe with a non-zero stripe size. Signed-off-by: Jens Axboe <[email protected]>
2014-06-05Merge branch 'x86-efi-for-linus' of ↵Linus Torvalds1-2/+10
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next Pull x86 EFI updates from Peter Anvin: "A collection of EFI changes. The perhaps most important one is to fully save and restore the FPU state around each invocation of EFI runtime, and to not choke on non-ASCII characters in the boot stub" * 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efivars: Add compatibility code for compat tasks efivars: Refactor sanity checking code into separate function efivars: Stop passing a struct argument to efivar_validate() efivars: Check size of user object efivars: Use local variables instead of a pointer dereference x86/efi: Save and restore FPU context around efi_calls (i386) x86/efi: Save and restore FPU context around efi_calls (x86_64) x86/efi: Implement a __efi_call_virt macro x86, fpu: Extend the use of static_cpu_has_safe x86/efi: Delete most of the efi_call* macros efi: x86: Handle arbitrary Unicode characters efi: Add get_dram_base() helper function efi: Add shared printk wrapper for consistent prefixing efi: create memory map iteration helper efi: efi-stub-helper cleanup
2014-06-05Merge branch 'x86/vdso' of ↵Linus Torvalds2-1/+15
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next Pull x86 cdso updates from Peter Anvin: "Vdso cleanups and improvements largely from Andy Lutomirski. This makes the vdso a lot less ''special''" * 'x86/vdso' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/vdso, build: Make LE access macros clearer, host-safe x86/vdso, build: Fix cross-compilation from big-endian architectures x86/vdso, build: When vdso2c fails, unlink the output x86, vdso: Fix an OOPS accessing the HPET mapping w/o an HPET x86, mm: Replace arch_vma_name with vm_ops->name for vsyscalls x86, mm: Improve _install_special_mapping and fix x86 vdso naming mm, fs: Add vm_ops->name as an alternative to arch_vma_name x86, vdso: Fix an OOPS accessing the HPET mapping w/o an HPET x86, vdso: Remove vestiges of VDSO_PRELINK and some outdated comments x86, vdso: Move the vvar and hpet mappings next to the 64-bit vDSO x86, vdso: Move the 32-bit vdso special pages after the text x86, vdso: Reimplement vdso.so preparation in build-time C x86, vdso: Move syscall and sysenter setup into kernel/cpu/common.c x86, vdso: Clean up 32-bit vs 64-bit vdso params x86, mm: Ensure correct alignment of the fixmap
2014-06-05Merge branch 'devel-stable' into for-nextRussell King3-0/+23
2014-06-05Merge branches 'alignment', 'fixes', 'l2c' (early part) and 'misc' into for-nextRussell King3-0/+6
2014-06-05perf: Disable sampled events if no PMU interruptVince Weaver1-0/+10
Add common code to generate -ENOTSUPP at event creation time if an architecture attempts to create a sampled event and PERF_PMU_NO_INTERRUPT is set. This adds a new pmu->capabilities flag. Initially we only support PERF_PMU_NO_INTERRUPT (to indicate a PMU has no support for generating hardware interrupts) but there are other capabilities that can be added later. Signed-off-by: Vince Weaver <[email protected]> Acked-by: Will Deacon <[email protected]> [peterz: rename to PERF_PMU_CAP_* and moved the pmu::capabilities word into a hole] Signed-off-by: Peter Zijlstra <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Linus Torvalds <[email protected]> Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1405161708060.11099@vincent-weaver-1.umelst.maine.edu Signed-off-by: Ingo Molnar <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2014-06-05Merge branch 'perf/kprobes' into perf/coreIngo Molnar2-3/+20
Conflicts: arch/x86/kernel/traps.c The kprobes enhancements are fully cooked, ship them upstream. Signed-off-by: Ingo Molnar <[email protected]>
2014-06-05Merge branch 'perf/uprobes' into perf/coreIngo Molnar1-0/+4
These bits from Oleg are fully cooked, ship them to Linus. Signed-off-by: Ingo Molnar <[email protected]>
2014-06-05sched: Rename capacity related flagsNicolas Pitre1-2/+2
It is better not to think about compute capacity as being equivalent to "CPU power". The upcoming "power aware" scheduler work may create confusion with the notion of energy consumption if "power" is used too liberally. Let's rename the following feature flags since they do relate to capacity: SD_SHARE_CPUPOWER -> SD_SHARE_CPUCAPACITY ARCH_POWER -> ARCH_CAPACITY NONTASK_POWER -> NONTASK_CAPACITY Signed-off-by: Nicolas Pitre <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Cc: Vincent Guittot <[email protected]> Cc: Daniel Lezcano <[email protected]> Cc: Morten Rasmussen <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: [email protected] Cc: Andy Fleming <[email protected]> Cc: Anton Blanchard <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Grant Likely <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Paul Gortmaker <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Preeti U Murthy <[email protected]> Cc: Rob Herring <[email protected]> Cc: Srivatsa S. Bhat <[email protected]> Cc: Toshi Kani <[email protected]> Cc: Vasant Hegde <[email protected]> Cc: Vincent Guittot <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2014-06-05sched: Final power vs. capacity cleanupsNicolas Pitre1-3/+3
It is better not to think about compute capacity as being equivalent to "CPU power". The upcoming "power aware" scheduler work may create confusion with the notion of energy consumption if "power" is used too liberally. This contains the architecture visible changes. Incidentally, only ARM takes advantage of the available pow^H^H^Hcapacity scaling hooks and therefore those changes outside kernel/sched/ are confined to one ARM specific file. The default arch_scale_smt_power() hook is not overridden by anyone. Replacements are as follows: arch_scale_freq_power --> arch_scale_freq_capacity arch_scale_smt_power --> arch_scale_smt_capacity SCHED_POWER_SCALE --> SCHED_CAPACITY_SCALE SCHED_POWER_SHIFT --> SCHED_CAPACITY_SHIFT The local usage of "power" in arch/arm/kernel/topology.c is also changed to "capacity" as appropriate. Signed-off-by: Nicolas Pitre <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Cc: Vincent Guittot <[email protected]> Cc: Daniel Lezcano <[email protected]> Cc: Morten Rasmussen <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: [email protected] Cc: Arnd Bergmann <[email protected]> Cc: Dietmar Eggemann <[email protected]> Cc: Grant Likely <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mark Brown <[email protected]> Cc: Rob Herring <[email protected]> Cc: Russell King <[email protected]> Cc: Sudeep KarkadaNagesha <[email protected]> Cc: Vincent Guittot <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2014-06-05sched: Let 'struct sched_group_power' care about CPU capacityNicolas Pitre1-1/+1
It is better not to think about compute capacity as being equivalent to "CPU power". The upcoming "power aware" scheduler work may create confusion with the notion of energy consumption if "power" is used too liberally. Since struct sched_group_power is really about compute capacity of sched groups, let's rename it to struct sched_group_capacity. Similarly sgp becomes sgc. Related variables and functions dealing with groups are also adjusted accordingly. Signed-off-by: Nicolas Pitre <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Cc: Vincent Guittot <[email protected]> Cc: Daniel Lezcano <[email protected]> Cc: Morten Rasmussen <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: [email protected] Cc: Linus Torvalds <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2014-06-05sched: Fix signedness bug in yield_to()Dan Carpenter2-2/+2
yield_to() is supposed to return -ESRCH if there is no task to yield to, but because the type is bool that is the same as returning true. The only place I see which cares is kvm_vcpu_on_spin(). Signed-off-by: Dan Carpenter <[email protected]> Reviewed-by: Raghavendra <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Cc: Gleb Natapov <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/20140523102042.GA7267@mwanda Signed-off-by: Ingo Molnar <[email protected]>
2014-06-05locking/rwsem: Fix warnings for CONFIG_RWSEM_GENERIC_SPINLOCKDavidlohr Bueso1-1/+1
Optimistic spinning is only used by the xadd variant of rw-semaphores. Make sure that we use the old version of the __RWSEM_INITIALIZER macro for systems that rely on the spinlock one, otherwise warnings can be triggered, such as the following reported on an arm box: ipc/ipcns_notifier.c:22:8: warning: excess elements in struct initializer [enabled by default] ipc/ipcns_notifier.c:22:8: warning: (near initialization for 'ipcns_chain.rwsem') [enabled by default] ipc/ipcns_notifier.c:22:8: warning: excess elements in struct initializer [enabled by default] ipc/ipcns_notifier.c:22:8: warning: (near initialization for 'ipcns_chain.rwsem') [enabled by default] Signed-off-by: Davidlohr Bueso <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Cc: Tim Chen <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Paul McKenney <[email protected]> Cc: Michel Lespinasse <[email protected]> Cc: Peter Hurley <[email protected]> Cc: Alex Shi <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Jason Low <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Chris Mason <[email protected]> Cc: Josef Bacik <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2014-06-05locking/rwsem: Support optimistic spinningDavidlohr Bueso1-3/+22
We have reached the point where our mutexes are quite fine tuned for a number of situations. This includes the use of heuristics and optimistic spinning, based on MCS locking techniques. Exclusive ownership of read-write semaphores are, conceptually, just about the same as mutexes, making them close cousins. To this end we need to make them both perform similarly, and right now, rwsems are simply not up to it. This was discovered by both reverting commit 4fc3f1d6 (mm/rmap, migration: Make rmap_walk_anon() and try_to_unmap_anon() more scalable) and similarly, converting some other mutexes (ie: i_mmap_mutex) to rwsems. This creates a situation where users have to choose between a rwsem and mutex taking into account this important performance difference. Specifically, biggest difference between both locks is when we fail to acquire a mutex in the fastpath, optimistic spinning comes in to play and we can avoid a large amount of unnecessary sleeping and overhead of moving tasks in and out of wait queue. Rwsems do not have such logic. This patch, based on the work from Tim Chen and I, adds support for write-side optimistic spinning when the lock is contended. It also includes support for the recently added cancelable MCS locking for adaptive spinning. Note that is is only applicable to the xadd method, and the spinlock rwsem variant remains intact. Allowing optimistic spinning before putting the writer on the wait queue reduces wait queue contention and provided greater chance for the rwsem to get acquired. With these changes, rwsem is on par with mutex. The performance benefits can be seen on a number of workloads. For instance, on a 8 socket, 80 core 64bit Westmere box, aim7 shows the following improvements in throughput: +--------------+---------------------+-----------------+ | Workload | throughput-increase | number of users | +--------------+---------------------+-----------------+ | alltests | 20% | >1000 | | custom | 27%, 60% | 10-100, >1000 | | high_systime | 36%, 30% | >100, >1000 | | shared | 58%, 29% | 10-100, >1000 | +--------------+---------------------+-----------------+ There was also improvement on smaller systems, such as a quad-core x86-64 laptop running a 30Gb PostgreSQL (pgbench) workload for up to +60% in throughput for over 50 clients. Additionally, benefits were also noticed in exim (mail server) workloads. Furthermore, no performance regression have been seen at all. Based-on-work-from: Tim Chen <[email protected]> Signed-off-by: Davidlohr Bueso <[email protected]> [peterz: rej fixup due to comment patches, sched/rt.h header] Signed-off-by: Peter Zijlstra <[email protected]> Cc: Alex Shi <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Michel Lespinasse <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Peter Hurley <[email protected]> Cc: "Paul E.McKenney" <[email protected]> Cc: Jason Low <[email protected]> Cc: Aswin Chandramouleeswaran <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: "Scott J Norton" <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: Chris Mason <[email protected]> Cc: Josef Bacik <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>