path: root/kernel
AgeCommit messageAuthorFilesLines
2011-02-19genirq: Use handle_irq_event() in handle_simple_irq()Thomas Gleixner1-13/+2
Signed-off-by: Thomas Gleixner <[email protected]>
2011-02-19genirq: Implement handle_irq_event()Thomas Gleixner2-8/+42
Core code replacement for the ugly camel case. It contains all the code which is shared in all handlers:

    clear status flags
    set INPROGRESS flag
    unlock
    call action chain
    note_interrupt
    lock
    clear INPROGRESS flag

Signed-off-by: Thomas Gleixner <[email protected]>
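A minimal sketch of the consolidated sequence such a helper centralizes (simplified; flag and field names approximate the 2.6.38-era core, not the verbatim patch):

    /*
     * Sketch of the shared flow handle_irq_event() factors out of the
     * individual flow handlers (illustrative, not the exact commit).
     */
    static irqreturn_t handle_irq_event_sketch(struct irq_desc *desc)
    {
        struct irqaction *action = desc->action;
        irqreturn_t ret;

        /* clear status flags, mark the handler as in progress */
        desc->status &= ~(IRQ_REPLAY | IRQ_WAITING);
        desc->status |= IRQ_INPROGRESS;
        raw_spin_unlock(&desc->lock);

        /* run the action chain without holding desc->lock */
        ret = handle_IRQ_event(desc->irq_data.irq, action);
        /* note_interrupt() for spurious-irq accounting runs here */

        raw_spin_lock(&desc->lock);
        desc->status &= ~IRQ_INPROGRESS;
        return ret;
    }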
2011-02-19genirq: Do not fiddle with IRQ_MASKED in handle_edge_irq()Thomas Gleixner1-1/+1
IRQ_MASKED is set in mask_ack_irq() anyway. Remove it from handle_edge_irq() to allow simpler ab^H^Hreuse of that function. Signed-off-by: Thomas Gleixner <[email protected]> Cc: Peter Zijlstra <[email protected]> LKML-Reference: <[email protected]>
2011-02-19genirq: Consolidate IRQ_DISABLEDThomas Gleixner4-16/+14
Handle IRQ_DISABLED consistently. Signed-off-by: Thomas Gleixner <[email protected]>
2011-02-19genirq: Remove default magicThomas Gleixner2-60/+17
Now that everything uses the wrappers, we can remove the default functions. None of those functions is performance critical. That makes the IRQ_MASKED flag tracking fully consistent. Signed-off-by: Thomas Gleixner <[email protected]>
2011-02-19genirq: Consolidate disable/enableThomas Gleixner5-8/+20
Create irq_disable/enable and use them to keep the flags consistent. Signed-off-by: Thomas Gleixner <[email protected]>
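A hedged sketch of what such a wrapper looks like (field names approximate the 2.6.38-era irq core):

    /* Sketch: one place that calls the chip and keeps the flags in sync. */
    void irq_disable(struct irq_desc *desc)
    {
        desc->status |= IRQ_DISABLED;
        if (desc->irq_data.chip->irq_disable) {
            desc->irq_data.chip->irq_disable(&desc->irq_data);
            desc->status |= IRQ_MASKED;   /* chip actually masked the line */
        }
    }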
2011-02-19genirq: Consolidate startup/shutdown of interruptsThomas Gleixner4-31/+33
Aside from duplicated code, some of the startup/shutdown sites do not handle the MASKED/DISABLED flags and the depth field at all. Move that to a helper function and take care of it there. Signed-off-by: Thomas Gleixner <[email protected]> Cc: Peter Zijlstra <[email protected]> LKML-Reference: <[email protected]>
2011-02-19genirq: Remove bogus conditionalThomas Gleixner1-4/+1
The if (chip->irq_shutdown) check will always evaluate to true, as we fill in chip->irq_shutdown with default_shutdown in irq_chip_set_defaults() if the chip does not provide its own function. Signed-off-by: Thomas Gleixner <[email protected]> Cc: Peter Zijlstra <[email protected]> LKML-Reference: <[email protected]>
2011-02-19genirq: Move irq thread flags to coreThomas Gleixner1-0/+14
Solely used in core code. Signed-off-by: Thomas Gleixner <[email protected]>
2011-02-19genirq: Mark polled irqs and defer the real handlerThomas Gleixner3-28/+60
With the chip.end() function gone we might run into a situation where a poll call runs and the real interrupt comes in, sees IRQ_INPROGRESS and disables the line. That might be a perfectly working line, which will then be masked forever. So mark interrupts polled while the poll runs. When the real handler sees IRQ_INPROGRESS it checks the poll flag and waits for the polling to complete. Add the necessary amount of sanity checks to avoid deadlocks. Signed-off-by: Thomas Gleixner <[email protected]>
2011-02-19genirq: spurious: Run only one poller at a timeThomas Gleixner1-1/+15
No point in running concurrent pollers which confuse each other by setting PENDING. Signed-off-by: Thomas Gleixner <[email protected]>
2011-02-19genirq: Do not poll disabled, percpu and timer interruptsThomas Gleixner1-14/+26
There is no point in polling disabled lines. percpu does not make sense at all because we only poll on the cpu we're currently running on. Also, polling per_cpu interrupts is racy as hell: the handler runs without locking, so we might get a huge surprise. If the timer interrupt needs polling, then we won't get there anyway. Signed-off-by: Thomas Gleixner <[email protected]>
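A condensed sketch of the resulting filter (approximate; the real checks live in try_one_irq() and differ in detail):

    /* Sketch: skip lines that must never be polled (condensed). */
    static int poll_candidate_sketch(struct irq_desc *desc, bool force)
    {
        /* Per-cpu handlers run unlocked on their own CPU: never poll. */
        if (desc->status & IRQ_PER_CPU)
            return 0;

        /* Polling a disabled line is pointless unless explicitly forced. */
        if ((desc->status & IRQ_DISABLED) && !force)
            return 0;

        /* If the timer interrupt needed polling, we would not be here. */
        if (desc->action && (desc->action->flags & __IRQF_TIMER))
            return 0;

        return 1;   /* worth polling */
    }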
2011-02-19genirq: Fixup poll handlingThomas Gleixner1-31/+19
try_one_irq() contains redundant code and lots of useless checks for shared interrupts. Check for shared before setting IRQ_INPROGRESS and then call handle_IRQ_event() while pending. Shorter version with the same functionality. Signed-off-by: Thomas Gleixner <[email protected]>
2011-02-19genirq: Warn when handler enables interruptsThomas Gleixner1-1/+3
We run all handlers with interrupts disabled and expect them not to enable them. Warn when we catch one who does. Signed-off-by: Thomas Gleixner <[email protected]>
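Roughly, the check amounts to something like this after each handler invocation (sketch of the pattern, placed inside the action loop):

    /* Sketch: handlers run with interrupts off and must keep them off. */
    res = action->handler(irq, action->dev_id);
    if (WARN_ONCE(!irqs_disabled(),
                  "irq %u handler %pF enabled interrupts\n",
                  irq, action->handler))
        local_irq_disable();   /* repair the damage and keep going */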
2011-02-19genirq: Plug race in report_bad_irq()Thomas Gleixner1-3/+9
We cannot walk the action chain unlocked. Even if IRQ_INPROGRESS is set, an action can be removed and we follow a null pointer. It's safe to take the lock there, because the code which removes the action will call synchronize_irq(), which waits unlocked for IRQ_INPROGRESS to go away. Signed-off-by: Thomas Gleixner <[email protected]>
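In sketch form, the fix brackets the walk with the descriptor lock (function name hypothetical, approximate):

    /* Sketch: the action chain may only be walked under desc->lock. */
    static void report_bad_irq_sketch(unsigned int irq, struct irq_desc *desc)
    {
        struct irqaction *action;
        unsigned long flags;

        raw_spin_lock_irqsave(&desc->lock, flags);
        for (action = desc->action; action; action = action->next)
            printk(KERN_ERR "[<%p>] %pf\n", action->handler, action->handler);
        raw_spin_unlock_irqrestore(&desc->lock, flags);
    }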
2011-02-19genirq: Remove redundant thread affinity settingThomas Gleixner1-2/+0
Thread affinity is already set by setup_affinity(). Signed-off-by: Thomas Gleixner <[email protected]>
2011-02-19genirq: Do not copy affinity before setThomas Gleixner3-15/+36
While rummaging through arch code I found that there are a few workarounds which deal with the fact that the initial affinity setting from request_irq() copies the mask into irq_data->affinity before the chip code is called. In the normal path we unconditionally copy the mask when the chip code returns 0. Copy after the chip code is called instead, and add a return code IRQ_SET_MASK_OK_NOCOPY for the chip functions, which prevents the copy. That way we see the real mask when the chip function decided to truncate it further, as some arches do. IRQ_SET_MASK_OK is 0, which is the current behaviour. Signed-off-by: Thomas Gleixner <[email protected]>
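A sketch of the reworked core-side path (simplified fragment from the affinity-setting code, approximate):

    /* Sketch: copy the mask only when the chip asks the core to. */
    ret = chip->irq_set_affinity(&desc->irq_data, mask, false);
    switch (ret) {
    case IRQ_SET_MASK_OK:           /* == 0: core copies, old behaviour */
        cpumask_copy(desc->irq_data.affinity, mask);
        /* fall through */
    case IRQ_SET_MASK_OK_NOCOPY:    /* chip stored a (maybe truncated) mask */
        irq_set_thread_affinity(desc);
        ret = 0;
    }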
2011-02-19genirq: Always apply cpu online maskThomas Gleixner1-6/+6
If the affinity had been set by the user, then a later request_irq() will honour that setting. But the set of online cpus can have changed since then. So apply the online mask for this case as well. Signed-off-by: Thomas Gleixner <[email protected]>
2011-02-19genirq: Remove redundant checkThomas Gleixner1-1/+2
IRQ_NO_BALANCING is already checked in irq_can_set_affinity() above, no need to check it again. Signed-off-by: Thomas Gleixner <[email protected]>
2011-02-19genirq: Simplify affinity related codeThomas Gleixner1-23/+41
There is a lot of #ifdef CONFIG_GENERIC_PENDING_IRQ along with duplicated code in the irq core. Move the #ifdeffery into one place and clean up the code so it's readable. No functional change. Signed-off-by: Thomas Gleixner <[email protected]>
2011-02-19genirq: Namespace cleanupThomas Gleixner2-17/+17
The irq namespace has become quite convoluted. My bad. Clean it up and deprecate the old functions. All new functions follow the scheme:

    irq number based:  irq_set/get/xxx/_xxx(unsigned int irq, ...)
    irq_data based:    irq_data_set/get/xxx/_xxx(struct irq_data *d, ...)
    irq_desc based:    irq_desc_get_xxx(struct irq_desc *desc)

Signed-off-by: Thomas Gleixner <[email protected]>
2011-02-19genirq: Add missing buslock to set_irq_type(), set_irq_wake()Thomas Gleixner2-0/+4
Chips behind a slow bus cannot be updated under desc->lock, but we miss the chip_bus_lock()/chip_bus_sync_unlock() calls around the set type and set wake functions. Signed-off-by: Thomas Gleixner <[email protected]>
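The missing bracketing looks roughly like this (sketch of the set-type path; approximate):

    /* Sketch: take the bus lock before, sync after, around the update. */
    chip_bus_lock(desc);
    raw_spin_lock_irqsave(&desc->lock, flags);
    ret = __irq_set_trigger(desc, irq, type);
    raw_spin_unlock_irqrestore(&desc->lock, flags);
    chip_bus_sync_unlock(desc);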
2011-02-19genirq: Make nr_irqs runtime expandableThomas Gleixner1-3/+19
We face more and more the requirement to expand nr_irqs at runtime. The reason is IRQ expanders which cannot be detected in the early boot stage, so we speculatively size nr_irqs to have enough room. Further, Xen needs extra irq numbers and we really want to avoid adding more "detection" code into the early boot. There is no real good reason why we need to limit nr_irqs at early boot. Allow the allocation code to expand nr_irqs. We already have 8k of extra number space in the allocation bitmap, so let's use it. Signed-off-by: Thomas Gleixner <[email protected]>
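A sketch of the expansion hook in the descriptor allocator (approximate; IRQ_BITMAP_BITS is the oversized bitmap length from the allocated_irqs entry further down):

    /* Sketch: grow nr_irqs into the bitmap's spare room on demand. */
    static int irq_expand_nr_irqs(unsigned int nr)
    {
        if (nr > IRQ_BITMAP_BITS)   /* still bounded by the bitmap size */
            return -ENOMEM;
        nr_irqs = nr;
        return 0;
    }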
2011-02-19Merge branch 'irq/urgent' into irq/coreThomas Gleixner15-55/+105
Reason: Further patches are conflicting with mainline fixes Signed-off-by: Thomas Gleixner <[email protected]>
2011-02-19genirq: Disable the SHIRQ_DEBUG call in request_threaded_irq for nowThomas Gleixner1-1/+1
With CONFIG_SHIRQ_DEBUG=y we call a newly installed interrupt handler in request_threaded_irq(). The original implementation (commit a304e1b8) called the handler _BEFORE_ it was installed, but that caused problems with handlers calling disable_irq_nosync(). See commit 377bf1e4. It's braindead in the first place to call disable_irq_nosync() in shared handlers, but .... Moving this call after we installed the handler looks innocent, but it is subtly broken on SMP. Interrupt handlers rely on the fact that the irq core prevents reentrancy. Now this debug call violates that promise because we run the handler w/o the IRQ_INPROGRESS protection - which we cannot apply here because that would result in a possibly forever masked interrupt line. A concurrent real hardware interrupt on a different CPU results in handler reentrancy and can lead to complete wreckage, which was unfortunately observed in reality and took a fricking long time to debug. Leave the code here for now. We want this debug feature, but it's not easy to fix. We really should get rid of those disable_irq_nosync() abusers and remove that function completely. Signed-off-by: Thomas Gleixner <[email protected]> Cc: Anton Vorontsov <[email protected]> Cc: David Woodhouse <[email protected]> Cc: Arjan van de Ven <[email protected]> Cc: [email protected] # .28 -> .37
2011-02-19genirq: Prevent access beyond allocated_irqs bitmapThomas Gleixner3-2/+17
Lars-Peter Clausen pointed out: I stumbled upon this while looking through the existing archs using SPARSE_IRQ. Even with SPARSE_IRQ, NR_IRQS is still the upper limit for the number of IRQs. Both PXA and MMP set NR_IRQS to IRQ_BOARD_START, with IRQ_BOARD_START being the number of IRQs used by the core. In various machine files the nr_irqs field of the ARM machine definition struct is then set to "IRQ_BOARD_START + NR_BOARD_IRQS". As a result "nr_irqs" will be greater than NR_IRQS, which in turn causes the "allocated_irqs" bitmap in the core irq code to be accessed beyond its size, overwriting unrelated data. The core code really misses a sanity check there. This went unnoticed so far as by chance the compiler/linker places data behind that bitmap which gets initialized later on those affected platforms. So the obvious fix would be to add a sanity check in early_irq_init() and break all affected platforms. Though that check wants to be backported to stable as well, which will require fixing all known problematic platforms and probably some more yet unknown ones as well. Lots of churn. A way simpler solution is to allocate a slightly larger bitmap and avoid the whole churn w/o breaking anything. Add a few warnings when an arch returns utter crap. Reported-by: Lars-Peter Clausen <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] # .37 Cc: Haojian Zhuang <[email protected]> Cc: Eric Miao <[email protected]> Cc: Peter Zijlstra <[email protected]>
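In sketch form, the oversized bitmap plus the sanity warning (approximate):

    /* Sketch: slack beyond NR_IRQS absorbs an over-large arch nr_irqs. */
    #define IRQ_BITMAP_BITS  (NR_IRQS + 8196)
    static DECLARE_BITMAP(allocated_irqs, IRQ_BITMAP_BITS);

    /* in early_irq_init(): warn and clamp when an arch returns utter crap */
    if (WARN_ON(nr_irqs > IRQ_BITMAP_BITS))
        nr_irqs = IRQ_BITMAP_BITS;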
2011-02-18Merge branch 'fixes-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wqLinus Torvalds3-17/+28
* 'fixes-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: make sure MAYDAY_INITIAL_TIMEOUT is at least 2 jiffies long
  workqueue, freezer: unify spelling of 'freeze' + 'able' to 'freezable'
  workqueue: wake up a worker when a rescuer is leaving a gcwq
2011-02-18ntp: Remove redundant and incorrect parameter checkRichard Cochran1-3/+3
The ADJ_SETOFFSET code redundantly checks the range of the nanoseconds field of the time value. This field is checked again in the subsequent call to timekeeping_inject_offset(). Also, as is, the check will not detect whether the number of microseconds is out of range. Let timekeeping_inject_offset() do the error checking. Signed-off-by: Richard Cochran <[email protected]> Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]>
2011-02-18Revert "tracing: Add unstable sched clock note to the warning"Ingo Molnar1-6/+2
This reverts commit 5e38ca8f3ea423442eaafe1b7e206084aa38120a. Breaks the build of several !CONFIG_HAVE_UNSTABLE_SCHED_CLOCK architectures. Cc: Jiri Olsa <[email protected]> Cc: Steven Rostedt <[email protected]> Message-ID: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2011-02-17Merge branch 'tip/perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/coreIngo Molnar4-62/+40
2011-02-16PM / Hibernate: Return error code when alloc_image_page() failsStanislaw Gruszka1-5/+2
Currently we return 0 in swsusp_alloc() when alloc_image_page() fails. Fix that. Also remove unneeded "error" variable since the only useful value of error is -ENOMEM. [rjw: Fixed up the changelog and changed subject.] Signed-off-by: Stanislaw Gruszka <[email protected]> Cc: [email protected] Signed-off-by: Rafael J. Wysocki <[email protected]>
2011-02-16workqueue: make sure MAYDAY_INITIAL_TIMEOUT is at least 2 jiffies longTejun Heo1-1/+3
MAYDAY_INITIAL_TIMEOUT is defined as HZ / 100 and depending on configuration may end up 0 or 1. Even when it's 1, depending on when the mayday timer is added in the current jiffy interval, it may expire way before a jiffy has passed. Make sure MAYDAY_INITIAL_TIMEOUT is at least two to guarantee that at least a full jiffy has passed before calling rescuers. Signed-off-by: Tejun Heo <[email protected]> Reported-by: Ray Jui <[email protected]> Cc: [email protected]
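The resulting definition is essentially a clamp (sketch, close to the actual change):

    enum {
        /* HZ/100 can round down to 0 or 1; force a floor of 2 so at
         * least one full jiffy passes before rescuers are called. */
        MAYDAY_INITIAL_TIMEOUT = HZ / 100 >= 2 ? HZ / 100 : 2,
    };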
2011-02-16workqueue, freezer: unify spelling of 'freeze' + 'able' to 'freezable'Tejun Heo3-16/+16
There are two spellings in use for 'freeze' + 'able' - 'freezable' and 'freezeable'. The former is the more prominent one. The latter is mostly used by workqueue and in a few other odd places. Unify the spelling to 'freezable'. Signed-off-by: Tejun Heo <[email protected]> Reported-by: Alan Stern <[email protected]> Acked-by: "Rafael J. Wysocki" <[email protected]> Acked-by: Greg Kroah-Hartman <[email protected]> Acked-by: Dmitry Torokhov <[email protected]> Cc: David Woodhouse <[email protected]> Cc: Alex Dubov <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Steven Whitehouse <[email protected]>
2011-02-16lockdep/timers: Explain in detail the locking problems del_timer_sync() may causeSteven Rostedt1-0/+23
Twice I had to explain the output about why lockdep gives an error with locks in IRQ context and with del_timer_sync(). Might as well write it up and place it in the comments above the code in del_timer_sync(). Perhaps the next time this lockdep dump triggers, people will understand the issues. It is a tricky issue and very subtle; explaining it in detail in the code may help others understand it when they stumble upon the bug again. Signed-off-by: Steven Rostedt <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
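The scenario the comment documents boils down to a cross-CPU wait cycle; a sketch (not the verbatim comment, lock and timer names hypothetical):

    /*
     * Sketch of the deadlock: mytimer's callback takes somelock, and
     * somelock is also held around del_timer_sync() on a path the
     * timer can interrupt:
     *
     *   CPU0                         CPU1
     *   ----                         ----
     *                                spin_lock_irq(&somelock);
     *   <timer fires>
     *   callback():
     *     spin_lock(&somelock);      del_timer_sync(&mytimer);
     *     <waits for somelock>       <spins: callback still running>
     */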
2011-02-16Merge commit 'v2.6.38-rc5' into core/lockingIngo Molnar18-117/+155
Merge reason: pick up upstream fixes. Signed-off-by: Ingo Molnar <[email protected]>
2011-02-16sched: Wholesale removal of sd_idle logicVenkatesh Pallipadi1-42/+11
sd_idle logic was introduced way back in 2005 (commit 5969fe06), as an HT optimization. As per the discussion in the thread here: lkml - sched: Resolve sd_idle and first_idle_cpu Catch-22 - v1 https://patchwork.kernel.org/patch/532501/ The capacity based logic in the load balancer right now handles this in a much cleaner way, handling more than 2 SMT siblings etc, and sd_idle does not seem to bring any additional benefits. sd_idle logic also has some bugs that have performance impact. Here is the patch that removes the sd_idle logic altogether. Also, there was a dependency of sched_mc_power_savings == 2 on the sd_idle logic. Signed-off-by: Venkatesh Pallipadi <[email protected]> Acked-by: Vaidyanathan Srinivasan <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2011-02-16Merge commit 'v2.6.38-rc5' into sched/coreIngo Molnar18-156/+206
Merge reason: Pick up upstream fixes. Signed-off-by: Ingo Molnar <[email protected]>
2011-02-16perf: Optimize hrtimer eventsPeter Zijlstra1-3/+32
There is no need to re-initialize the hrtimer every time we start it, so don't do that (shaves a few cycles). Also, since we know hrtimers run at a fixed rate (nanoseconds) we can pre-compute the desired frequency at which they tick. This avoids us having to go through the whole adaptive frequency feedback logic (shaves another few cycles). Signed-off-by: Peter Zijlstra <[email protected]> LKML-Reference: <1297448589.5226.47.camel@laptop> Signed-off-by: Ingo Molnar <[email protected]>
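A condensed sketch of the init-time precomputation (names follow the perf core of that era; approximate, not the verbatim patch):

    /* Sketch: set up the hrtimer once, and turn a frequency into a
     * fixed nanosecond period, bypassing the adaptive-freq feedback. */
    static void perf_swevent_init_hrtimer_sketch(struct perf_event *event)
    {
        struct hw_perf_event *hwc = &event->hw;

        hrtimer_init(&hwc->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
        hwc->hrtimer.function = perf_swevent_hrtimer;

        if (event->attr.freq) {
            long freq = event->attr.sample_freq;

            event->attr.sample_period = NSEC_PER_SEC / freq;
            hwc->sample_period = event->attr.sample_period;
            local64_set(&hwc->period_left, hwc->sample_period);
            event->attr.freq = 0;   /* fixed rate: no feedback needed */
        }
    }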
2011-02-16perf: Optimize throttling codePeter Zijlstra2-20/+25
By pre-computing the maximum number of samples per tick we can avoid a multiplication and a conditional since MAX_INTERRUPTS > max_samples_per_tick. Signed-off-by: Peter Zijlstra <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <[email protected]>
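In sketch form (approximate; the hot-path fragment is condensed from the interrupt accounting code):

    /* Sketch: precompute the per-tick sample budget once at boot. */
    static int max_samples_per_tick __read_mostly =
        DIV_ROUND_UP(DEFAULT_MAX_SAMPLE_RATE, HZ);

    /* hot path: a compare replaces the old multiply-and-test */
    if (++hwc->interrupts >= max_samples_per_tick) {
        hwc->interrupts = MAX_INTERRUPTS;   /* throttle the event */
        perf_log_throttle(event, 0);
        ret = 1;
    }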
2011-02-16perf: Add cgroup supportStephane Eranian2-35/+626
This kernel patch adds the ability to filter monitoring based on container groups (cgroups). This is for use in per-cpu mode only. The cgroup to monitor is passed as a file descriptor in the pid argument to the syscall. The file descriptor must be opened to the cgroup name in the cgroup filesystem. For instance, if the cgroup name is foo and cgroupfs is mounted in /cgroup, then the file descriptor is opened to /cgroup/foo. Cgroup mode is activated by passing PERF_FLAG_PID_CGROUP in the flags argument to the syscall. For instance, to measure in cgroup foo on CPU1, assuming cgroupfs is mounted under /cgroup:

    struct perf_event_attr attr;
    int cgroup_fd, fd;

    cgroup_fd = open("/cgroup/foo", O_RDONLY);
    fd = perf_event_open(&attr, cgroup_fd, 1, -1, PERF_FLAG_PID_CGROUP);
    close(cgroup_fd);

Signed-off-by: Stephane Eranian <[email protected]> [ added perf_cgroup_{exit,attach} ] Signed-off-by: Peter Zijlstra <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2011-02-16cgroup: Fix cgroup_subsys::exit callbackPeter Zijlstra2-17/+20
Make the ::exit method act like ::attach, it is after all very nearly the same thing. The bug had no effect on correctness - fixing it is an optimization for the scheduler. Also, later perf-cgroups patches rely on it. Signed-off-by: Peter Zijlstra <[email protected]> Acked-by: Paul Menage <[email protected]> LKML-Reference: <1297160655.13327.92.camel@laptop> Signed-off-by: Ingo Molnar <[email protected]>
2011-02-16Merge branch 'perf/urgent' into perf/coreIngo Molnar7-32/+64
Merge reason: we need to queue up dependent patch Signed-off-by: Ingo Molnar <[email protected]>
2011-02-16perf: Fix throttle logicPeter Zijlstra1-4/+15
It was possible to call pmu::start() on an already running event. In particular this led to some wreckage, as the hrtimer events would re-initialize active timers. This was due to throttled events being activated again by scheduling. Scheduling in a context would add and force start events, resulting in running events with a possible throttle status. The next tick to hit that task will then try to unthrottle the event and call ->start() on an already running event. Reported-by: Jeff Moyer <[email protected]> Cc: <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <[email protected]>
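The tick-time unthrottle path then only restarts events it knows are throttled (sketch of the pattern, approximate):

    /* Sketch: only ->start() an event we actually throttled, instead
     * of blindly stopping and starting a possibly-running event. */
    if (hwc->interrupts == MAX_INTERRUPTS) {
        hwc->interrupts = 0;
        perf_log_throttle(event, 1);
        event->pmu->start(event, 0);
    }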
2011-02-15Merge branches 'core-fixes-for-linus' and 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tipLinus Torvalds2-7/+5
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  Revert "lockdep, timer: Fix del_timer_sync() annotation"
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  timer debug: Hide kernel addresses via %pK in /proc/timer_list
2011-02-15Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tipLinus Torvalds1-2/+8
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: Fix text_poke_smp_batch() deadlock
  perf tools: Fix thread_map event synthesizing in top and record
  watchdog, nmi: Lower the severity of error messages
  ARM: oprofile: Fix backtraces in timer mode
  oprofile: Fix usage of CONFIG_HW_PERF_EVENTS for oprofile_perf_init and friends
2011-02-15Merge branch 'master' into for-nextJiri Kosina42-626/+681
2011-02-15sh: Enable CONFIG_GCOV_PROFILE_ALL for shChris Smith1-1/+1
This patch enables gcov kernel profiling over the whole kernel for sh. Profiling of specific files individually already worked. A handful of files have to be explicitly excluded from the profiling to avoid breaking things, notably pmb.c. Signed-off-by: Chris Smith <[email protected]> Signed-off-by: Stuart Menefy <[email protected]> Signed-off-by: Paul Mundt <[email protected]>
2011-02-14Merge branch 'tip/perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/coreIngo Molnar2-8/+46
2011-02-14workqueue: wake up a worker when a rescuer is leaving a gcwqTejun Heo1-0/+9
After executing the matching works, a rescuer leaves the gcwq whether there are more pending works or not. This may decrease the concurrency level to zero and stall execution until a new work item is queued on the gcwq. Make the rescuer wake up a regular worker when it leaves a gcwq if there is more work to execute, so that execution isn't stalled. Signed-off-by: Tejun Heo <[email protected]> Reported-by: Ray Jui <[email protected]> Cc: [email protected]
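The fix is a small nudge on the rescuer's way out (sketch, close to the actual change):

    /* Sketch: before the rescuer leaves the gcwq, wake a regular
     * worker if work remains, so concurrency doesn't drop to zero. */
    if (keep_working(gcwq))
        wake_up_worker(gcwq);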
2011-02-14tracing/kprobe: Fix NULL pointer deref checkMasami Hiramatsu1-1/+1
Add a NULL check to avoid a NULL pointer dereference. This bug was introduced by commit 1ff511e35ed8 ("tracing/kprobes: Add bitfield type"), which causes a NULL pointer dereference when kprobe-tracer parses an argument without a type. Reported-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Masami Hiramatsu <[email protected]> Cc: [email protected] Cc: Peter Zijlstra <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>