aboutsummaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)AuthorFilesLines
2015-06-10time: Refactor usecs_to_jiffiesNicholas Mc Guire1-10/+3
Refactor the usecs_to_jiffies conditional code part in time.c and jiffies.h putting it into conditional functions rather than #ifdefs to improve readability. This is analogous to the msecs_to_jiffies() cleanup in commit ca42aaf0c861 ("time: Refactor msecs_to_jiffies") Signed-off-by: Nicholas Mc Guire <[email protected]> Cc: Masahiro Yamada <[email protected]> Cc: Sam Ravnborg <[email protected]> Cc: Joe Perches <[email protected]> Cc: John Stultz <[email protected]> Cc: Andrew Hunter <[email protected]> Cc: Paul Turner <[email protected]> Cc: Michal Marek <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-06-02clockevents: Rename state to state_use_accessorsThomas Gleixner1-2/+2
The only sensible way to make abuse of core internal fields obvious and easy to grep for. Signed-off-by: Thomas Gleixner <[email protected]> Cc: Viresh Kumar <[email protected]> Cc: Peter Zijlstra <[email protected]>
2015-06-02clockevents: Use set/get state helper functionsThomas Gleixner2-6/+7
Signed-off-by: Thomas Gleixner <[email protected]> Cc: Viresh Kumar <[email protected]> Cc: Peter Zijlstra <[email protected]>
2015-06-02clockevents: Provide functions to set and get the stateThomas Gleixner5-24/+35
We want to rename dev->state, so provide proper get and set functions. Rename clockevents_set_state() to clockevents_switch_state() to avoid confusion. Signed-off-by: Thomas Gleixner <[email protected]> Cc: Viresh Kumar <[email protected]> Cc: Peter Zijlstra <[email protected]>
2015-06-02clockevents: Use helpers to check the state of a clockevent deviceViresh Kumar4-17/+17
Use accessor functions to check the state of clockevent devices in core code. Signed-off-by: Viresh Kumar <[email protected]> Cc: [email protected] Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/fa2b9869fd17f210eaa156ec2b594efd0230b6c7.1432192527.git.viresh.kumar@linaro.org Signed-off-by: Thomas Gleixner <[email protected]>
2015-05-27clockevents: Do not suspend/resume if unusedAlexandre Belloni1-2/+2
There is no point in calling suspend/resume for unused clockevents as they are already stopped and disabled. This is really important for AT91 as the hardware is a trainwreck and takes ages to synchronize. Reported-by: Sylvain Rochet <[email protected]> Signed-off-by: Alexandre Belloni <[email protected]> Cc: Daniel Lezcano <[email protected]> Cc: Nicolas Ferre <[email protected]> Cc: Boris Brezillon <[email protected]> Cc: Maxime Ripard <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/1421399151-26800-1-git-send-email-alexandre.belloni@free-electrons.com Signed-off-by: Thomas Gleixner <[email protected]>
2015-05-22time: Remove read_boot_clock()Xunlei Pang1-11/+3
Now that we have a read_boot_clock64() function available on every architecture, and converted all the users to it, it's time to remove the (now unused) read_boot_clock() completely from the kernel. Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Xunlei Pang <[email protected]> [jstultz: Minor commit message tweak suggested by Ingo] Signed-off-by: John Stultz <[email protected]>
2015-05-22tracing: timer: Add deferrable flag to timer_startBadhri Jagan Sridharan1-1/+1
The timer_start event now shows whether the timer is deferrable in case of a low-res timer. The debug_activate function now includes a deferrable flag while calling the trace_timer_start event. Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Acked-by: Steven Rostedt <[email protected]> Signed-off-by: Badhri Jagan Sridharan <[email protected]> [jstultz: Fixed minor whitespace and grammer tweaks pointed out by Ingo] Signed-off-by: John Stultz <[email protected]>
2015-05-22time: Rework debugging variables so they aren't globalJohn Stultz1-22/+11
Ingo suggested that the timekeeping debugging variables recently added should not be global, and should be tied to the timekeeper's read_base. Thus this patch implements that suggestion. This version is different from the earlier versions as it keeps the variables in the timekeeper structure rather then in the tkr. Cc: Ingo Molnar <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Prarit Bhargava <[email protected]> Cc: Richard Cochran <[email protected]> Signed-off-by: John Stultz <[email protected]>
2015-05-22timekeeping: Provide new API to get the current time resolutionHarald Geyer1-0/+17
This patch series introduces a new function u32 ktime_get_resolution_ns(void) which allows to clean up some driver code. In particular the IIO subsystem has a function to provide timestamps for events but no means to get their resolution. So currently the dht11 driver tries to guess the resolution in a rather messy and convoluted way. We can do much better with the new code. This API is not designed to be exposed to user space. This has been tested on i386, sunxi and mxs. Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Harald Geyer <[email protected]> [jstultz: Tweaked to make it build after upstream changes] Signed-off-by: John Stultz <[email protected]>
2015-05-22time: Make sure tz_minuteswest is set to a valid value when setting timeSasha Levin1-0/+4
Invalid values may overflow later, leading to undefined behaviour when multiplied by 60 to get the amount of seconds. Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: John Stultz <[email protected]>
2015-05-19clockevents: Stop unused clockevent devicesViresh Kumar3-4/+22
To avoid getting spurious interrupts on a tickless CPU, clockevent device can now be stopped by switching to ONESHOT_STOPPED state. The natural place for handling this transition is tick_program_event(). On 'expires == KTIME_MAX', we skip programming the event and so we need to fix such call sites as well, to always call tick_program_event() irrespective of the expires value. Once the clockevent device is required again, check if it was earlier put into ONESHOT_STOPPED state. If yes, switch its state to ONESHOT before programming its event. To make sure we haven't missed any corner case, add a WARN() for the case where we try to reprogram clockevent device while we aren't configured in ONESHOT_STOPPED state. Signed-off-by: Viresh Kumar <[email protected]> Cc: [email protected] Cc: Frederic Weisbecker <[email protected]> Cc: Kevin Hilman <[email protected]> Cc: Daniel Lezcano <[email protected]> Cc: Preeti U Murthy <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/5146b07be7f0bc497e0ebae036590ec2fa73e540.1428031396.git.viresh.kumar@linaro.org Signed-off-by: Thomas Gleixner <[email protected]>
2015-05-19clockevents: Introduce CLOCK_EVT_STATE_ONESHOT_STOPPED stateViresh Kumar2-1/+19
When no timers/hrtimers are pending, the expiry time is set to a special value: 'KTIME_MAX'. This normally happens with NO_HZ_{IDLE|FULL} in both LOWRES/HIGHRES modes. When 'expiry == KTIME_MAX', we either cancel the 'tick-sched' hrtimer (NOHZ_MODE_HIGHRES) or skip reprogramming clockevent device (NOHZ_MODE_LOWRES). But, the clockevent device is already reprogrammed from the tick-handler for next tick. As the clock event device is programmed in ONESHOT mode it will at least fire one more time (unnecessarily). Timers on few implementations (like arm_arch_timer, etc.) only support PERIODIC mode and their drivers emulate ONESHOT over that. Which means that on these platforms we will get spurious interrupts periodically (at last programmed interval rate, normally tick rate). In order to avoid spurious interrupts, the clockevent device should be stopped or its interrupts should be masked. A simple (yet hacky) solution to get this fixed could be: update hrtimer_force_reprogram() to always reprogram clockevent device and update clockevent drivers to STOP generating events (or delay it to max time) when 'expires' is set to KTIME_MAX. But the drawback here is that every clockevent driver has to be hacked for this particular case and its very easy for new ones to miss this. However, Thomas suggested to add an optional state ONESHOT_STOPPED to solve this problem: lkml.org/lkml/2014/5/9/508. This patch adds support for ONESHOT_STOPPED state in clockevents core. It will only be available to drivers that implement the state-specific callbacks instead of the legacy ->set_mode() callback. Signed-off-by: Viresh Kumar <[email protected]> Reviewed-by: Preeti U. Murthy <[email protected]> Cc: [email protected] Cc: Frederic Weisbecker <[email protected]> Cc: Kevin Hilman <[email protected]> Cc: Daniel Lezcano <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/b8b383a03ac07b13312c16850b5106b82e4245b5.1428031396.git.viresh.kumar@linaro.org Signed-off-by: Thomas Gleixner <[email protected]>
2015-05-19Merge branch 'linus' into timers/coreThomas Gleixner25-215/+265
Make sure the upstream fixes are applied before adding further modifications.
2015-05-19time: Refactor msecs_to_jiffiesNicholas Mc Guire1-40/+19
Refactor the msecs_to_jiffies conditional code part in time.c and jiffies.h putting it into conditional functions rather than #ifdefs to improve readability. [ tglx: Verified that there is no binary code change ] Signed-off-by: Nicholas Mc Guire <[email protected]> Cc: Masahiro Yamada <[email protected]> Cc: Sam Ravnborg <[email protected]> Cc: Joe Perches <[email protected]> Cc: John Stultz <[email protected]> Cc: Andrew Hunter <[email protected]> Cc: Paul Turner <[email protected]> Cc: Michal Marek <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-05-19time: Move timeconst.h into include/generatedNicholas Mc Guire3-18/+4
kernel/time/timeconst.h is moved to include/generated/ and generated by the top level Kbuild. This allows using timeconst.h in an earlier build stage. Signed-off-by: Nicholas Mc Guire <[email protected]> Cc: Masahiro Yamada <[email protected]> Cc: Sam Ravnborg <[email protected]> Cc: Joe Perches <[email protected]> Cc: John Stultz <[email protected]> Cc: Andrew Hunter <[email protected]> Cc: Paul Turner <[email protected]> Cc: Michal Marek <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-05-18watchdog: Fix merge 'conflict'Peter Zijlstra1-5/+15
Two watchdog changes that came through different trees had a non conflicting conflict, that is, one changed the semantics of a variable but no actual code conflict happened. So the merge appeared fine, but the resulting code did not behave as expected. Commit 195daf665a62 ("watchdog: enable the new user interface of the watchdog mechanism") changes the semantics of watchdog_user_enabled, which thereafter is only used by the functions introduced by b3738d293233 ("watchdog: Add watchdog enable/disable all functions"). There further appears to be a distinct lack of serialization between setting and using watchdog_enabled, so perhaps we should wrap the {en,dis}able_all() things in watchdog_proc_mutex. This patch fixes a s2r failure reported by Michal; which I cannot readily explain. But this does make the code internally consistent again. Reported-and-tested-by: Michal Hocko <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-05-18sched,perf: Fix periodic timersPeter Zijlstra5-33/+38
In the below two commits (see Fixes) we have periodic timers that can stop themselves when they're no longer required, but need to be (re)-started when their idle condition changes. Further complications is that we want the timer handler to always do the forward such that it will always correctly deal with the overruns, and we do not want to race such that the handler has already decided to stop, but the (external) restart sees the timer still active and we end up with a 'lost' timer. The problem with the current code is that the re-start can come before the callback does the forward, at which point the forward from the callback will WARN about forwarding an enqueued timer. Now, conceptually its easy to detect if you're before or after the fwd by comparing the expiration time against the current time. Of course, that's expensive (and racy) because we don't have the current time. Alternatively one could cache this state inside the timer, but then everybody pays the overhead of maintaining this extra state, and that is undesired. The only other option that I could see is the external timer_active variable, which I tried to kill before. I would love a nicer interface for this seemingly simple 'problem' but alas. Fixes: 272325c4821f ("perf: Fix mux_interval hrtimer wreckage") Fixes: 77a4d1a1b9a1 ("sched: Cleanup bandwidth timers") Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: Sasha Levin <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: http://lkml.kernel.org/r/[email protected]
2015-05-15Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds2-33/+33
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "Two fixes: a suspend/resume related regression fix, and an RT priority boosting fix" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/core: Fix regression in cpuset_cpu_inactive() for suspend sched: Handle priority boosted tasks proper in setscheduler()
2015-05-15Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds1-8/+33
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Mostly tooling fixes, but also a lockdep annotation fix, a PMU event list fix and a new model addition" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tools/liblockdep: Fix compilation error tools/liblockdep: Fix linker error in case of cross compile perf tools: Use getconf to determine number of online CPUs tools: Fix tools/vm build perf/x86/rapl: Enable Broadwell-U RAPL support perf/x86/intel: Fix SLM cache event list perf: Annotate inherited event ctx->mutex recursion
2015-05-09Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: "Two patches from the irq departement: - a simple fix to make dummy_irq_chip usable for wakeup scenarios - removal of the gic arch_extn hackery. Now that all users are converted we really want to get rid of the interface so people wont come up with new use cases" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip: gic: Drop support for gic_arch_extn genirq: Set IRQCHIP_SKIP_SET_WAKE flag for dummy_irq_chip
2015-05-09Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds1-5/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Thomas Gleixner: "A simple fix to actually shut down a detached device instead of keeping it active" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: clockevents: Shutdown detached clockevent device
2015-05-08Merge tag 'trace-fixes-v4.1-rc2' of ↵Linus Torvalds1-1/+2
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fix from Steven Rostedt: "The newly added ftrace_print_array_seq() function had a bug in it. Luckily, the only user of it didn't make the 4.1 merge window. But the helper function should be fixed before 4.2 when the users start coming in" * tag 'trace-fixes-v4.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Make ftrace_print_array_seq compute buf_len
2015-05-08perf: Annotate inherited event ctx->mutex recursionPeter Zijlstra1-8/+33
While fuzzing Sasha tripped over another ctx->mutex recursion lockdep splat. Annotate this. Reported-by: Sasha Levin <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vince Weaver <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2015-05-08sched/core: Fix regression in cpuset_cpu_inactive() for suspendOmar Sandoval1-16/+12
Commit 3c18d447b3b3 ("sched/core: Check for available DL bandwidth in cpuset_cpu_inactive()"), a SCHED_DEADLINE bugfix, had a logic error that caused a regression in setting a CPU inactive during suspend. I ran into this when a program was failing pthread_setaffinity_np() with EINVAL after a suspend+wake up. A simple reproducer: $ ./a.out sched_setaffinity: Success $ systemctl suspend $ ./a.out sched_setaffinity: Invalid argument ... where ./a.out is: #define _GNU_SOURCE #include <errno.h> #include <sched.h> #include <stdio.h> #include <stdlib.h> #include <string.h> #include <unistd.h> int main(void) { long num_cores; cpu_set_t cpu_set; int ret; num_cores = sysconf(_SC_NPROCESSORS_ONLN); CPU_ZERO(&cpu_set); CPU_SET(num_cores - 1, &cpu_set); errno = 0; ret = sched_setaffinity(getpid(), sizeof(cpu_set), &cpu_set); perror("sched_setaffinity"); return ret ? EXIT_FAILURE : EXIT_SUCCESS; } The mistake is that suspend is handled in the action == CPU_DOWN_PREPARE_FROZEN case of the switch statement in cpuset_cpu_inactive(). However, the commit in question masked out CPU_TASKS_FROZEN from the action, making this case dead. The fix is straightforward. Signed-off-by: Omar Sandoval <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Juri Lelli <[email protected]> Cc: Thomas Gleixner <[email protected]> Fixes: 3c18d447b3b3 ("sched/core: Check for available DL bandwidth in cpuset_cpu_inactive()") Link: http://lkml.kernel.org/r/1cb5ecb3d6543c38cce5790387f336f54ec8e2bc.1430733960.git.osandov@osandov.com Signed-off-by: Ingo Molnar <[email protected]>
2015-05-08sched: Handle priority boosted tasks proper in setscheduler()Thomas Gleixner2-17/+21
Ronny reported that the following scenario is not handled correctly: T1 (prio = 10) lock(rtmutex); T2 (prio = 20) lock(rtmutex) boost T1 T1 (prio = 20) sys_set_scheduler(prio = 30) T1 prio = 30 .... sys_set_scheduler(prio = 10) T1 prio = 30 The last step is wrong as T1 should now be back at prio 20. Commit c365c292d059 ("sched: Consider pi boosting in setscheduler()") only handles the case where a boosted tasks tries to lower its priority. Fix it by taking the new effective priority into account for the decision whether a change of the priority is required. Reported-by: Ronny Meeus <[email protected]> Tested-by: Steven Rostedt <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Steven Rostedt <[email protected]> Cc: <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Mike Galbraith <[email protected]> Fixes: c365c292d059 ("sched: Consider pi boosting in setscheduler()") Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1505051806060.4225@nanos Signed-off-by: Ingo Molnar <[email protected]>
2015-05-07nohz: Fix !HIGH_RES_TIMERS hangThomas Gleixner1-6/+3
Simon Horman reported this crash on a system with high-res timers disabled but nohz enabled: > ------------[ cut here ]------------ > kernel BUG at kernel/irq_work.c:135! BUG_ON(!irqs_disabled()); So something enabled interrupts in the periodic tick handling machinery, and that code path indeed has a local_irq_disable()/enable pair in tick_nohz_switch_to_nohz() which causes havoc. Fix it. This patch also fixes a +nohz -hrtimers hang reported by Ingo Molnar. Reported-by: Simon Horman <[email protected]> Reported-by: Ingo Molnar <[email protected]> Tested-by: Simon Horman <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: LAK <[email protected]> Cc: Magnus Damm <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1505071425520.4225@nanos Signed-off-by: Ingo Molnar <[email protected]>
2015-05-06tracing: Make ftrace_print_array_seq compute buf_lenAlex Bennée1-1/+2
The only caller to this function (__print_array) was getting it wrong by passing the array length instead of buffer length. As the element size was already being passed for other reasons it seems reasonable to push the calculation of buffer length into the function. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Alex Bennée <[email protected]> Signed-off-by: Steven Rostedt <[email protected]>
2015-05-06Merge branch 'core-urgent-for-linus' of ↵Linus Torvalds1-7/+9
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU fix from Ingo Molnar: "An RCU Kconfig fix that eliminates an annoying interactive kconfig question for CONFIG_RCU_TORTURE_TEST_SLOW_INIT" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: rcu: Control grace-period delays directly from value
2015-05-05tick: hrtimer-broadcast: Prevent endless restarting when broadcast device is ↵Andreas Sandberg1-3/+7
unused The hrtimer callback in the hrtimer's tick broadcast code sometimes incorrectly ends up scheduling events at the current tick causing the kernel to hang servicing the same hrtimer forever. This typically happens when a device is swapped out by tick_install_broadcast_device(), which replaces the event handler with clock_events_handle_noop() and sets the device mode to CLOCK_EVT_MODE_UNUSED. If the timer is scheduled when this happens, the next_event field will not be updated and the hrtimer ends up being restarted at the current tick. To prevent this from happening, only try to restart the hrtimer if the broadcast clock event device is in one of the active modes and try to cancel the timer when entering the CLOCK_EVT_MODE_UNUSED mode. Signed-off-by: Andreas Sandberg <[email protected]> Tested-by: Catalin Marinas <[email protected]> Acked-by: Mark Rutland <[email protected]> Reviewed-by: Preeti U Murthy <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-05-05timer: Use timer->base for flag checksJoonwoo Park1-1/+1
At present, internal_add_timer() examines flags with 'base' which doesn't contain flags. Examine with 'timer->base' to avoid unnecessary waking up of nohz CPU when timer base has TIMER_DEFERRABLE set. Signed-off-by: Joonwoo Park <[email protected]> Cc: [email protected] Cc: [email protected] Cc: John Stultz <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-05-05tick-broadcast: Fix the printing of broadcast masksPreeti U Murthy1-4/+4
Today the number of bits of the broadcast masks that is output into /proc/timer_list is sizeof(unsigned long). This means that on machines with a larger number of CPUs, the bitmasks of CPUs beyond this range do not appear. Fix this by using bitmap printing through "%*pb" instead, so as to output the broadcast masks for the range of nr_cpu_ids into /proc/timer_list. Signed-off-by: Preeti U Murthy <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-05-05tick: broadcast: Simplify oneshot logic and shorten lock regionThomas Gleixner1-25/+17
Simplify the oneshot logic by avoiding the reprogramming loops. That also allows to call the cpu local handler outside of the broadcast_lock held region. Tested-by: Borislav Petkov <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]>
2015-05-05tick: broadcast: Prevent livelock from event handlerThomas Gleixner1-28/+25
With the removal of the hrtimer softirq the switch to highres/nohz mode happens in the tick interrupt. That leads to a livelock when the per cpu event handler is directly called from the broadcast handler under broadcast lock because broadcast lock needs to be taken for the highres/nohz switch as well. Solve this by calling the cpu local handler outside the broadcast_lock held region. Fixes: c6eb3f70d448 "hrtimer: Get rid of hrtimer softirq" Reported-and-tested-by: Borislav Petkov <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]>
2015-05-04perf: Remove unused function perf_mux_hrtimer_cancel()Thomas Gleixner1-28/+0
Reported-by: Ingo Molnar <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]>
2015-05-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds1-6/+6
Pull networking fixes from David Miller: 1) Receive packet length needs to be adjust by 2 on RX to accomodate the two padding bytes in altera_tse driver. From Vlastimil Setka. 2) If rx frame is dropped due to out of memory in macb driver, we leave the receive ring descriptors in an undefined state. From Punnaiah Choudary Kalluri 3) Some netlink subsystems erroneously signal NLM_F_MULTI. That is only for dumps. Fix from Nicolas Dichtel. 4) Fix mis-use of raw rt->rt_pmtu value in ipv4, one must always go via the ipv4_mtu() helper. From Herbert Xu. 5) Fix null deref in bridge netfilter, and miscalculated lengths in jump/goto nf_tables verdicts. From Florian Westphal. 6) Unhash ping sockets properly. 7) Software implementation of BPF divide did 64/32 rather than 64/64 bit divide. The JITs got it right. Fix from Alexei Starovoitov. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (30 commits) ipv4: Missing sk_nulls_node_init() in ping_unhash(). net: fec: Fix RGMII-ID mode net/mlx4_en: Schedule napi when RX buffers allocation fails netxen_nic: use spin_[un]lock_bh around tx_clean_lock net/mlx4_core: Fix unaligned accesses mlx4_en: Use correct loop cursor in error path. cxgb4: Fix MC1 memory offset calculation bnx2x: Delay during kdump load net: Fix Kernel Panic in bonding driver debugfs file: rlb_hash_table net: dsa: Fix scope of eeprom-length property net: macb: Fix race condition in driver when Rx frame is dropped hv_netvsc: Fix a bug in netvsc_start_xmit() altera_tse: Correct rx packet length mlx4: Fix tx ring affinity_mask creation tipc: fix problem with parallel link synchronization mechanism tipc: remove wrong use of NLM_F_MULTI bridge/nl: remove wrong use of NLM_F_MULTI bridge/mdb: remove wrong use of NLM_F_MULTI net: sched: act_connmark: don't zap skb->nfct trivial: net: systemport: bcmsysport.h: fix 0x0x prefix ...
2015-04-30Merge tag 'pm+acpi-4.1-rc2' of ↵Linus Torvalds1-14/+2
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management and ACPI fixes from Rafael Wysocki: "Three regression fixes this time, one for a recent regression in the cpuidle core affecting multiple systems, one for an inadvertently added duplicate typedef in ACPICA that breaks compilation with GCC 4.5 and one for an ACPI Smart Battery Subsystem driver regression introduced during the 3.18 cycle (stable-candidate). Specifics: - Fix for a regression in the cpuidle core introduced by one of the recent commits in the clockevents_notify() removal series that put a call to a function which had to be executed with disabled interrupts into a code path running with enabled interrupts (Rafael J Wysocki) - Fix for a build problem in ACPICA (with GCC 4.5) introduced by one of the recent ACPICA tools commits that added a duplicate typedef to one of the ACPICA's header files by mistake (Olaf Hering) - Fix for a regression in the ACPI SBS (Smart Battery Subsystem) driver introduced during the 3.18 development cycle causing the smart battery manager to be marked as not present when it should be marked as present (Chris Bainbridge)" * tag 'pm+acpi-4.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpuidle: Run tick_broadcast_exit() with disabled interrupts ACPI / SBS: Enable battery manager when present ACPICA: remove duplicate u8 typedef
2015-04-30Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-15/+0
Pull kvm changes from Paolo Bonzini: "Remove from guest code the handling of task migration during a pvclock read; instead use the correct protocol in KVM. This removes the need for task migration notifiers in core scheduler code" [ The scheduler people really hated the migration notifiers, so this was kind of required - Linus ] * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: x86: pvclock: Really remove the sched notifier for cross-cpu migrations kvm: x86: fix kvmclock update protocol
2015-04-30modsign: change default key detailsDavid Howells1-3/+3
Change default key details to be more obviously unspecified. Reported-by: Linus Torvalds <[email protected]> Signed-off-by: David Howells <[email protected]> Acked-by: James Morris <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-04-29cpuidle: Run tick_broadcast_exit() with disabled interruptsRafael J. Wysocki1-14/+2
Commit 335f49196fd6 (sched/idle: Use explicit broadcast oneshot control function) replaced clockevents_notify() invocations in cpuidle_idle_call() with direct calls to tick_broadcast_enter() and tick_broadcast_exit(), but it overlooked the fact that interrupts were already enabled before calling the latter which led to functional breakage on systems using idle states with the CPUIDLE_FLAG_TIMER_STOP flag set. Fix that by moving the invocations of tick_broadcast_enter() and tick_broadcast_exit() down into cpuidle_enter_state() where interrupts are still disabled when tick_broadcast_exit() is called. Also ensure that interrupts will be disabled before running tick_broadcast_exit() even if they have been enabled by the idle state's ->enter callback. Trigger a WARN_ON_ONCE() in that case, as we generally don't want that to happen for states with CPUIDLE_FLAG_TIMER_STOP set. Fixes: 335f49196fd6 (sched/idle: Use explicit broadcast oneshot control function) Reported-and-tested-by: Linus Walleij <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Daniel Lezcano <[email protected]> Reported-and-tested-by: Sudeep Holla <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2015-04-28Merge branch 'for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Martin Schwidefsky: "One additional new feature for 4.1, a new PRNG based on SHA-512 for the zcrypt driver. Two memory management related changes, the page table reallocation for KVM is removed, and with file ptes gone the encoding of page table entries is improved. And three bug fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/zcrypt: Introduce new SHA-512 based Pseudo Random Generator. s390/mm: change swap pte encoding and pgtable cleanup s390/mm: correct transfer of dirty & young bits in __pmd_to_pte s390/bpf: add dependency to z196 features s390/3215: free memory in error path s390/kvm: remove delayed reallocation of page tables for KVM kexec: allocate the kexec control page with KEXEC_CONTROL_MEMORY_GFP
2015-04-27bpf: fix 64-bit divideAlexei Starovoitov1-6/+6
ALU64_DIV instruction should be dividing 64-bit by 64-bit, whereas do_div() does 64-bit by 32-bit divide. x64 and arm64 JITs correctly implement 64 by 64 unsigned divide. llvm BPF backend emits code assuming that ALU64_DIV does 64 by 64. Fixes: 89aa075832b0 ("net: sock: allow eBPF programs to be attached to sockets") Reported-by: Michael Holzheu <[email protected]> Acked-by: Daniel Borkmann <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-04-27x86: pvclock: Really remove the sched notifier for cross-cpu migrationsPaolo Bonzini1-15/+0
This reverts commits 0a4e6be9ca17c54817cf814b4b5aa60478c6df27 and 80f7fdb1c7f0f9266421f823964fd1962681f6ce. The task migration notifier was originally introduced in order to support the pvclock vsyscall with non-synchronized TSC, but KVM only supports it with synchronized TSC. Hence, on KVM the race condition is only needed due to a bad implementation on the host side, and even then it's so rare that it's mostly theoretical. As far as KVM is concerned it's possible to fix the host, avoiding the additional complexity in the vDSO and the (re)introduction of the task migration notifier. Xen, on the other hand, hasn't yet implemented vsyscall support at all, so we do not care about its plans for non-synchronized TSC. Reported-by: Peter Zijlstra <[email protected]> Suggested-by: Marcelo Tosatti <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2015-04-26Merge branch 'for-linus' of ↵Linus Torvalds8-19/+19
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull fourth vfs update from Al Viro: "d_inode() annotations from David Howells (sat in for-next since before the beginning of merge window) + four assorted fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: RCU pathwalk breakage when running into a symlink overmounting something fix I_DIO_WAKEUP definition direct-io: only inc/dec inode->i_dio_count for file systems fs/9p: fix readdir() VFS: assorted d_backing_inode() annotations VFS: fs/inode.c helpers: d_inode() annotations VFS: fs/cachefiles: d_backing_inode() annotations VFS: fs library helpers: d_inode() annotations VFS: assorted weird filesystems: d_inode() annotations VFS: normal filesystems (and lustre): d_inode() annotations VFS: security/: d_inode() annotations VFS: security/: d_backing_inode() annotations VFS: net/: d_inode() annotations VFS: net/unix: d_backing_inode() annotations VFS: kernel/: d_inode() annotations VFS: audit: d_backing_inode() annotations VFS: Fix up some ->d_inode accesses in the chelsio driver VFS: Cachefiles should perform fs modifications on the top layer only VFS: AF_UNIX sockets should call mknod on the top layer only
2015-04-24clockevents: Shutdown detached clockevent deviceViresh Kumar1-5/+1
A clockevent device is marked DETACHED when it is replaced by another clockevent device. The device is shutdown properly for drivers that implement legacy ->set_mode() callback, as we call ->set_mode() for CLOCK_EVT_MODE_UNUSED as well. But for the new per-state callback interface, we skip shutting down the device, as we thought its an internal state change. That wasn't correct. The effect is that the device is left programmed in oneshot or periodic mode. Fall-back to 'case CLOCK_EVT_STATE_SHUTDOWN', to shutdown the device. Fixes: bd624d75db21 "clockevents: Introduce mode specific callbacks" Reported-by: Daniel Lezcano <[email protected]> Signed-off-by: Viresh Kumar <[email protected]> Cc: [email protected] Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/eef0a91c51b74d4e52c8e5a95eca27b5a0563f07.1428650683.git.viresh.kumar@linaro.org Signed-off-by: Thomas Gleixner <[email protected]>
2015-04-24genirq: Set IRQCHIP_SKIP_SET_WAKE flag for dummy_irq_chipRoger Quadros1-0/+1
Without this system suspend is broken on systems that have drivers calling enable/disable_irq_wake() for interrupts based off the dummy irq hook. (e.g. drivers/gpio/gpio-pcf857x.c) Signed-off-by: Roger Quadros <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: Gregory Clement <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-04-23kexec: allocate the kexec control page with KEXEC_CONTROL_MEMORY_GFPMartin Schwidefsky1-1/+1
Introduce KEXEC_CONTROL_MEMORY_GFP to allow the architecture code to override the gfp flags of the allocation for the kexec control page. The loop in kimage_alloc_normal_control_pages allocates pages with GFP_KERNEL until a page is found that happens to have an address smaller than the KEXEC_CONTROL_MEMORY_LIMIT. On systems with a large memory size but a small KEXEC_CONTROL_MEMORY_LIMIT the loop will keep allocating memory until the oom killer steps in. Signed-off-by: Martin Schwidefsky <[email protected]>
2015-04-23sched: debug: Remove the cfs bandwidth timer_active printoutThomas Gleixner1-2/+0
The struct member is gone. Reported-by: [email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Peter Zijlstra <[email protected]>
2015-04-22Merge branch 'upstream' of git://git.infradead.org/users/pcmoore/auditLinus Torvalds4-53/+94
Pull audit fixes from Paul Moore: "Seven audit patches for v4.1, all bug fixes. The largest, and perhaps most significant commit helps resolve some memory pressure issues related to the inode cache and audit, there are also a few small commits which help resolve some timing issues with the audit log queue, and the rest fall into the always popular "code clean-up" category. In general, nothing really substantial, just a nice set of maintenance patches" * 'upstream' of git://git.infradead.org/users/pcmoore/audit: audit: Remove condition which always evaluates to false audit: reduce mmap_sem hold for mm->exe_file audit: consolidate handling of mm->exe_file audit: code clean up audit: don't reset working wait time accidentally with auditd audit: don't lose set wait time on first successful call to audit_log_start() audit: move the tree pruning to a dedicated thread
2015-04-22perf: perf_mux_hrtimer_cancel() can be statickbuild test robot1-1/+1
Signed-off-by: Fengguang Wu <[email protected]> Cc: [email protected] Cc: Paul Mackerras <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/20150422200000.GA122603@lkp-sb04 Signed-off-by: Thomas Gleixner <[email protected]>