aboutsummaryrefslogtreecommitdiff
path: root/include/linux/mutex.h
AgeCommit message (Collapse)AuthorFilesLines
2013-11-11locking/doc: Update references to kernel/mutex.cPeter Zijlstra1-1/+1
Fix this docbook error: >> docproc: kernel/mutex.c: No such file or directory by updating the stale references to kernel/mutex.c. Reported-by: [email protected] Signed-off-by: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2013-09-28mutex: replace CONFIG_HAVE_ARCH_MUTEX_CPU_RELAX with simple ifdefHeiko Carstens1-3/+3
Linus suggested to replace #ifndef CONFIG_HAVE_ARCH_MUTEX_CPU_RELAX #define arch_mutex_cpu_relax() cpu_relax() #endif with just a simple #ifndef arch_mutex_cpu_relax # define arch_mutex_cpu_relax() cpu_relax() #endif to get rid of CONFIG_HAVE_CPU_RELAX_SIMPLE. So architectures can simply define arch_mutex_cpu_relax if they want an architecture specific function instead of having to add a select statement in their Kconfig in addition. Suggested-by: Linus Torvalds <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2013-07-12mutex: Move ww_mutex definitions to ww_mutex.hMaarten Lankhorst1-358/+0
Move the definitions for wound/wait mutexes out to a separate header, ww_mutex.h. This reduces clutter in mutex.h, and increases readability. Suggested-by: Linus Torvalds <[email protected]> Signed-off-by: Maarten Lankhorst <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Acked-by: Rik van Riel <[email protected]> Acked-by: Maarten Lankhorst <[email protected]> Cc: Dave Airlie <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Tidied up the code a bit. ] Signed-off-by: Ingo Molnar <[email protected]>
2013-06-26mutex: Add w/w mutex slowpath debuggingDaniel Vetter1-0/+8
Injects EDEADLK conditions at pseudo-random interval, with exponential backoff up to UINT_MAX (to ensure that every lock operation still completes in a reasonable time). This way we can test the wound slowpath even for ww mutex users where contention is never expected, and the ww deadlock avoidance algorithm is only needed for correctness against malicious userspace. An example would be protecting kernel modesetting properties, which thanks to single-threaded X isn't really expected to contend, ever. I've looked into using the CONFIG_FAULT_INJECTION infrastructure, but decided against it for two reasons: - EDEADLK handling is mandatory for ww mutex users and should never affect the outcome of a syscall. This is in contrast to -ENOMEM injection. So fine configurability isn't required. - The fault injection framework only allows to set a simple probability for failure. Now the probability that a ww mutex acquire stage with N locks will never complete (due to too many injected EDEADLK backoffs) is zero. But the expected number of ww_mutex_lock operations for the completely uncontended case would be O(exp(N)). The per-acuiqire ctx exponential backoff solution choosen here only results in O(log N) overhead due to injection and so O(log N * N) lock operations. This way we can fail with high probability (and so have good test coverage even for fancy backoff and lock acquisition paths) without running into patalogical cases. Note that EDEADLK will only ever be injected when we managed to acquire the lock. This prevents any behaviour changes for users which rely on the EALREADY semantics. Signed-off-by: Daniel Vetter <[email protected]> Signed-off-by: Maarten Lankhorst <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: Linus Torvalds <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/20130620113117.4001.21681.stgit@patser Signed-off-by: Ingo Molnar <[email protected]>
2013-06-26mutex: Add support for wound/wait style locksMaarten Lankhorst1-1/+354
Wound/wait mutexes are used when other multiple lock acquisitions of a similar type can be done in an arbitrary order. The deadlock handling used here is called wait/wound in the RDBMS literature: The older tasks waits until it can acquire the contended lock. The younger tasks needs to back off and drop all the locks it is currently holding, i.e. the younger task is wounded. For full documentation please read Documentation/ww-mutex-design.txt. References: https://lwn.net/Articles/548909/ Signed-off-by: Maarten Lankhorst <[email protected]> Acked-by: Daniel Vetter <[email protected]> Acked-by: Rob Clark <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: Linus Torvalds <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2013-04-19mutex: Queue mutex spinners with MCS lock to reduce cacheline contentionWaiman Long1-0/+3
The current mutex spinning code (with MUTEX_SPIN_ON_OWNER option turned on) allow multiple tasks to spin on a single mutex concurrently. A potential problem with the current approach is that when the mutex becomes available, all the spinning tasks will try to acquire the mutex more or less simultaneously. As a result, there will be a lot of cacheline bouncing especially on systems with a large number of CPUs. This patch tries to reduce this kind of contention by putting the mutex spinners into a queue so that only the first one in the queue will try to acquire the mutex. This will reduce contention and allow all the tasks to move forward faster. The queuing of mutex spinners is done using an MCS lock based implementation which will further reduce contention on the mutex cacheline than a similar ticket spinlock based implementation. This patch will add a new field into the mutex data structure for holding the MCS lock. This expands the mutex size by 8 bytes for 64-bit system and 4 bytes for 32-bit system. This overhead will be avoid if the MUTEX_SPIN_ON_OWNER option is turned off. The following table shows the jobs per minute (JPM) scalability data on an 8-node 80-core Westmere box with a 3.7.10 kernel. The numactl command is used to restrict the running of the fserver workloads to 1/2/4/8 nodes with hyperthreading off. +-----------------+-----------+-----------+-------------+----------+ | Configuration | Mean JPM | Mean JPM | Mean JPM | % Change | | | w/o patch | patch 1 | patches 1&2 | 1->1&2 | +-----------------+------------------------------------------------+ | | User Range 1100 - 2000 | +-----------------+------------------------------------------------+ | 8 nodes, HT off | 227972 | 227237 | 305043 | +34.2% | | 4 nodes, HT off | 393503 | 381558 | 394650 | +3.4% | | 2 nodes, HT off | 334957 | 325240 | 338853 | +4.2% | | 1 node , HT off | 198141 | 197972 | 198075 | +0.1% | +-----------------+------------------------------------------------+ | | User Range 200 - 1000 | +-----------------+------------------------------------------------+ | 8 nodes, HT off | 282325 | 312870 | 332185 | +6.2% | | 4 nodes, HT off | 390698 | 378279 | 393419 | +4.0% | | 2 nodes, HT off | 336986 | 326543 | 340260 | +4.2% | | 1 node , HT off | 197588 | 197622 | 197582 | 0.0% | +-----------------+-----------+-----------+-------------+----------+ At low user range 10-100, the JPM differences were within +/-1%. So they are not that interesting. The fserver workload uses mutex spinning extensively. With just the mutex change in the first patch, there is no noticeable change in performance. Rather, there is a slight drop in performance. This mutex spinning patch more than recovers the lost performance and show a significant increase of +30% at high user load with the full 8 nodes. Similar improvements were also seen in a 3.8 kernel. The table below shows the %time spent by different kernel functions as reported by perf when running the fserver workload at 1500 users with all 8 nodes. +-----------------------+-----------+---------+-------------+ | Function | % time | % time | % time | | | w/o patch | patch 1 | patches 1&2 | +-----------------------+-----------+---------+-------------+ | __read_lock_failed | 34.96% | 34.91% | 29.14% | | __write_lock_failed | 10.14% | 10.68% | 7.51% | | mutex_spin_on_owner | 3.62% | 3.42% | 2.33% | | mspin_lock | N/A | N/A | 9.90% | | __mutex_lock_slowpath | 1.46% | 0.81% | 0.14% | | _raw_spin_lock | 2.25% | 2.50% | 1.10% | +-----------------------+-----------+---------+-------------+ The fserver workload for an 8-node system is dominated by the contention in the read/write lock. Mutex contention also plays a role. With the first patch only, mutex contention is down (as shown by the __mutex_lock_slowpath figure) which help a little bit. We saw only a few percents improvement with that. By applying patch 2 as well, the single mutex_spin_on_owner figure is now split out into an additional mspin_lock figure. The time increases from 3.42% to 11.23%. It shows a great reduction in contention among the spinners leading to a 30% improvement. The time ratio 9.9/2.33=4.3 indicates that there are on average 4+ spinners waiting in the spin_lock loop for each spinner in the mutex_spin_on_owner loop. Contention in other locking functions also go down by quite a lot. The table below shows the performance change of both patches 1 & 2 over patch 1 alone in other AIM7 workloads (at 8 nodes, hyperthreading off). +--------------+---------------+----------------+-----------------+ | Workload | mean % change | mean % change | mean % change | | | 10-100 users | 200-1000 users | 1100-2000 users | +--------------+---------------+----------------+-----------------+ | alltests | 0.0% | -0.8% | +0.6% | | five_sec | -0.3% | +0.8% | +0.8% | | high_systime | +0.4% | +2.4% | +2.1% | | new_fserver | +0.1% | +14.1% | +34.2% | | shared | -0.5% | -0.3% | -0.4% | | short | -1.7% | -9.8% | -8.3% | +--------------+---------------+----------------+-----------------+ The short workload is the only one that shows a decline in performance probably due to the spinner locking and queuing overhead. Signed-off-by: Waiman Long <[email protected]> Reviewed-by: Davidlohr Bueso <[email protected]> Acked-by: Rik van Riel <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Chandramouleeswaran Aswin <[email protected]> Cc: Norton Scott J <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: David Howells <[email protected]> Cc: Dave Jones <[email protected]> Cc: Clark Williams <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2011-07-26atomic: use <linux/atomic.h>Arun Sharma1-1/+1
This allows us to move duplicated code in <asm/atomic.h> (atomic_inc_not_zero() for now) to <linux/atomic.h> Signed-off-by: Arun Sharma <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: David Miller <[email protected]> Cc: Eric Dumazet <[email protected]> Acked-by: Mike Frysinger <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-07-21mutex: Make mutex_destroy() an inline functionJean Delvare1-1/+1
The non-debug variant of mutex_destroy is a no-op, currently implemented as a macro which does nothing. This approach fails to check the type of the parameter, so an error would only show when debugging gets enabled. Using an inline function instead, offers type checking for earlier bug catching. Signed-off-by: Jean Delvare <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2011-05-25lockdep, mutex: provide mutex_lock_nest_lockPeter Zijlstra1-0/+9
In order to convert i_mmap_lock to a mutex we need a mutex equivalent to spin_lock_nest_lock(), thus provide the mutex_lock_nest_lock() annotation. As with spin_lock_nest_lock(), mutex_lock_nest_lock() allows annotation of the locking pattern where an outer lock serializes the acquisition order of nested locks. That is, if every time you lock multiple locks A, say A1 and A2 you first acquire N, the order of acquiring A1 and A2 is irrelevant. Signed-off-by: Peter Zijlstra <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: David Miller <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Russell King <[email protected]> Cc: Paul Mundt <[email protected]> Cc: Jeff Dike <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Tony Luck <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Mel Gorman <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Cc: Nick Piggin <[email protected]> Cc: Namhyung Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-04-14mutex: Use p->on_cpu for the adaptive spinPeter Zijlstra1-1/+1
Since we now have p->on_cpu unconditionally available, use it to re-implement mutex_spin_on_owner. Requested-by: Thomas Gleixner <[email protected]> Reviewed-by: Frank Rowand <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Nick Piggin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Andrew Morton <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected]
2010-11-26mutexes, sched: Introduce arch_mutex_cpu_relax()Gerald Schaefer1-0/+4
The spinning mutex implementation uses cpu_relax() in busy loops as a compiler barrier. Depending on the architecture, cpu_relax() may do more than needed in this specific mutex spin loops. On System z we also give up the time slice of the virtual cpu in cpu_relax(), which prevents effective spinning on the mutex. This patch replaces cpu_relax() in the spinning mutex code with arch_mutex_cpu_relax(), which can be defined by each architecture that selects HAVE_ARCH_MUTEX_CPU_RELAX. The default is still cpu_relax(), so this patch should not affect other architectures than System z for now. Signed-off-by: Gerald Schaefer <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> LKML-Reference: <1290437256.7455.4.camel@thinkpad> Signed-off-by: Ingo Molnar <[email protected]>
2010-09-03mutex: Fix annotations to include it in kernel-locking docbookRandy Dunlap1-0/+8
Fix kernel-doc notation in linux/mutex.h and kernel/mutex.c, then add these 2 files to the kernel-locking docbook as the Mutex API reference chapter. Add one API function to mutex-design.txt and correct a typo in that file. Signed-off-by: Randy Dunlap <[email protected]> Cc: Rusty Russell <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2009-04-30mutex: add atomic_dec_and_mutex_lock(), fixAndrew Morton1-23/+1
include/linux/mutex.h:136: warning: 'mutex_lock' declared inline after being called include/linux/mutex.h:136: warning: previous declaration of 'mutex_lock' was here uninline it. [ Impact: clean up and uninline, address compiler warning ] Signed-off-by: Andrew Morton <[email protected]> Cc: Al Viro <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Eric Paris <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2009-04-29mutex: add atomic_dec_and_mutex_lock()Eric Paris1-0/+23
Much like the atomic_dec_and_lock() function in which we take an hold a spin_lock if we drop the atomic to 0 this function takes and holds the mutex if we dec the atomic to 0. Signed-off-by: Eric Paris <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Cc: Paul Mackerras <[email protected]> Orig-LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2009-01-14mutex: implement adaptive spinningPeter Zijlstra1-2/+3
Change mutex contention behaviour such that it will sometimes busy wait on acquisition - moving its behaviour closer to that of spinlocks. This concept got ported to mainline from the -rt tree, where it was originally implemented for rtmutexes by Steven Rostedt, based on work by Gregory Haskins. Testing with Ingo's test-mutex application (http://lkml.org/lkml/2006/1/8/50) gave a 345% boost for VFS scalability on my testbox: # ./test-mutex-shm V 16 10 | grep "^avg ops" avg ops/sec: 296604 # ./test-mutex-shm V 16 10 | grep "^avg ops" avg ops/sec: 85870 The key criteria for the busy wait is that the lock owner has to be running on a (different) cpu. The idea is that as long as the owner is running, there is a fair chance it'll release the lock soon, and thus we'll be better off spinning instead of blocking/scheduling. Since regular mutexes (as opposed to rtmutexes) do not atomically track the owner, we add the owner in a non-atomic fashion and deal with the races in the slowpath. Furthermore, to ease the testing of the performance impact of this new code, there is means to disable this behaviour runtime (without having to reboot the system), when scheduler debugging is enabled (CONFIG_SCHED_DEBUG=y), by issuing the following command: # echo NO_OWNER_SPIN > /debug/sched_features This command re-enables spinning again (this is also the default): # echo OWNER_SPIN > /debug/sched_features Signed-off-by: Peter Zijlstra <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2008-10-30mutex: improve header comment to be actually informative about the APIArjan van de Ven1-0/+2
Impact: improve documentation It's nice to say that mutex_trylock follows the spin_trylock convention. It's a lot nicer if the comment also says which that is... make it so. Signed-off-by: Arjan van de Ven <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2008-02-08Remove fastcall from linux/includeHarvey Harrison1-6/+6
[[email protected]: coding-style fixes] Signed-off-by: Harvey Harrison <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-12-06Add mutex_lock_killableLiam R. Howlett1-0/+5
Similar to mutex_lock_interruptible, it can be interrupted by a fatal signal only. Signed-off-by: Liam R. Howlett <[email protected]> Acked-by: Ingo Molnar <[email protected]> Signed-off-by: Matthew Wilcox <[email protected]>
2007-10-17Mutex documentation is unclear about software interrupts, tasklets and timersMatti Linnanvuori1-1/+2
Acked-by: Arjan van de Ven <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-11lockdep: fixup mutex annotationsPeter Zijlstra1-3/+6
The fancy mutex_lock fastpath has too many indirections to track the caller hence all contentions are perceived to come from mutex_lock(). Avoid this by explicitly not using the fastpath code (it was disabled already anyway). Signed-off-by: Peter Zijlstra <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2007-05-09mutex_lock_interruptible(): add __must_checkAndrew Morton1-2/+3
It's not sane to use mutex_lock_interruptible() and to then ignore the result. Ditto down_interruptible(), but I'm lazy. Cc: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-01-26[PATCH] fix various kernel-doc in header filesRobert P. J. Day1-1/+1
Fix a number of kernel-doc entries for header files in include/linux by making sure they begin with the appropriate '/**' notation and use @var notation. Signed-off-by: Robert P. J. Day <[email protected]> Signed-off-by: Randy Dunlap <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-12-08[PATCH] lockdep: avoid lockdep warning in mdNeilBrown1-0/+2
md_open takes ->reconfig_mutex which causes lockdep to complain. This (normally) doesn't have deadlock potential as the possible conflict is with a reconfig_mutex in a different device. I say "normally" because if a loop were created in the array->member hierarchy a deadlock could happen. However that causes bigger problems than a deadlock and should be fixed independently. So we flag the lock in md_open as a nested lock. This requires defining mutex_lock_interruptible_nested. Cc: Ingo Molnar <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Acked-by: Ingo Molnar <[email protected]> Signed-off-by: Neil Brown <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-12-07[PATCH] lockdep: name some old style locksPeter Zijlstra1-1/+1
Name some of the remaning 'old_style_spin_init' locks Signed-off-by: Peter Zijlstra <[email protected]> Acked-by: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-07-03[PATCH] lockdep: prove mutex locking correctnessIngo Molnar1-3/+28
Use the lock validator framework to prove mutex locking correctness. Signed-off-by: Ingo Molnar <[email protected]> Signed-off-by: Arjan van de Ven <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-07-03[PATCH] lockdep: better lock debuggingIngo Molnar1-6/+0
Generic lock debugging: - generalized lock debugging framework. For example, a bug in one lock subsystem turns off debugging in all lock subsystems. - got rid of the caller address passing (__IP__/__IP_DECL__/etc.) from the mutex/rtmutex debugging code: it caused way too much prototype hackery, and lockdep will give the same information anyway. - ability to do silent tests - check lock freeing in vfree too. - more finegrained debugging options, to allow distributions to turn off more expensive debugging features. There's no separate 'held mutexes' list anymore - but there's a 'held locks' stack within lockdep, which unifies deadlock detection across all lock classes. (this is independent of the lockdep validation stuff - lockdep first checks whether we are holding a lock already) Here are the current debugging options: CONFIG_DEBUG_MUTEXES=y CONFIG_DEBUG_LOCK_ALLOC=y which do: config DEBUG_MUTEXES bool "Mutex debugging, basic checks" config DEBUG_LOCK_ALLOC bool "Detect incorrect freeing of live mutexes" Signed-off-by: Ingo Molnar <[email protected]> Signed-off-by: Arjan van de Ven <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-01-11[PATCH] fix/simplify mutex debugging codeDavid Woodhouse1-1/+1
Let's switch mutex_debug_check_no_locks_freed() to take (addr, len) as arguments instead, since all its callers were just calculating the 'to' address for themselves anyway... (and sometimes doing so badly). Signed-off-by: David Woodhouse <[email protected]> Acked-by: Ingo Molnar <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-01-11[MUTEX]: linux/mutex.h needs linux/linkage.h tooDavid S. Miller1-0/+1
Signed-off-by: David S. Miller <[email protected]>
2006-01-09[PATCH] mutex subsystem, coreIngo Molnar1-0/+119
mutex implementation, core files: just the basic subsystem, no users of it. Signed-off-by: Ingo Molnar <[email protected]> Signed-off-by: Arjan van de Ven <[email protected]>