blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2017-07-12	kernel/watchdog: split up config options	Nicholas Piggin	1	-9/+20
	Split SOFTLOCKUP_DETECTOR from LOCKUP_DETECTOR, and split HARDLOCKUP_DETECTOR_PERF from HARDLOCKUP_DETECTOR. LOCKUP_DETECTOR implies the general boot, sysctl, and programming interfaces for the lockup detectors. An architecture that wants to use a hard lockup detector must define HAVE_HARDLOCKUP_DETECTOR_PERF or HAVE_HARDLOCKUP_DETECTOR_ARCH. Alternatively an arch can define HAVE_NMI_WATCHDOG, which provides the minimum arch_touch_nmi_watchdog, and it otherwise does its own thing and does not implement the LOCKUP_DETECTOR interfaces. sparc is unusual in that it has started to implement some of the interfaces, but not fully yet. It should probably be converted to a full HAVE_HARDLOCKUP_DETECTOR_ARCH. [[email protected]: fix] Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Nicholas Piggin <[email protected]> Reviewed-by: Don Zickus <[email protected]> Reviewed-by: Babu Moger <[email protected]> Tested-by: Babu Moger <[email protected]> [sparc] Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Michael Ellerman <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12	kernel/watchdog: introduce arch_touch_nmi_watchdog()	Nicholas Piggin	1	-11/+16
	For architectures that define HAVE_NMI_WATCHDOG, instead of having them provide the complete touch_nmi_watchdog() function, just have them provide arch_touch_nmi_watchdog(). This gives the generic code more flexibility in implementing this function, and arch implementations don't miss out on touching the softlockup watchdog or other generic details. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Nicholas Piggin <[email protected]> Reviewed-by: Don Zickus <[email protected]> Reviewed-by: Babu Moger <[email protected]> Tested-by: Babu Moger <[email protected]> [sparc] Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Michael Ellerman <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12	kernel/watchdog: remove unused declaration	Nicholas Piggin	1	-3/+0
	Patch series "Improve watchdog config for arch watchdogs", v4. A series to make the hardlockup watchdog more easily replaceable by arch code. The last patch provides some justification for why we want to do this (existing sparc watchdog is another that could benefit). This patch (of 5): Remove unused declaration. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Nicholas Piggin <[email protected]> Reviewed-by: Don Zickus <[email protected]> Reviewed-by: Babu Moger <[email protected]> Tested-by: Babu Moger <[email protected]> [sparc] Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Michael Ellerman <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-03-03	sched/headers: Move softlockup detector watchdog methods to <linux/nmi.h>	Ingo Molnar	1	-0/+37
	These methods don't belong into <linux/sched.h>, they are neither directly related to task_struct or are scheduler functionality. Put them next to the other watchdog methods in <linux/nmi.h>. ( Arguably that header's name is a misnomer, and this patch makes it more so - but it should be renamed in another patch. ) Acked-by: Linus Torvalds <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-01-24	kernel/watchdog: prevent false hardlockup on overloaded system	Don Zickus	1	-0/+1
	On an overloaded system, it is possible that a change in the watchdog threshold can be delayed long enough to trigger a false positive. This can easily be achieved by having a cpu spinning indefinitely on a task, while another cpu updates watchdog threshold. What happens is while trying to park the watchdog threads, the hrtimers on the other cpus trigger and reprogram themselves with the new slower watchdog threshold. Meanwhile, the nmi watchdog is still programmed with the old faster threshold. Because the one cpu is blocked, it prevents the thread parking on the other cpus from completing, which is needed to shutdown the nmi watchdog and reprogram it correctly. As a result, a false positive from the nmi watchdog is reported. Fix this by setting a park_in_progress flag to block all lockups until the parking is complete. Fix provided by Ulrich Obergfell. [[email protected]: s/park_in_progress/watchdog_park_in_progress/] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Don Zickus <[email protected]> Reviewed-by: Aaron Tomlin <[email protected]> Cc: Ulrich Obergfell <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-14	kernel/watchdog.c: move shared definitions to nmi.h	Babu Moger	1	-0/+24
	Patch series "Clean up watchdog handlers", v2. This is an attempt to cleanup watchdog handlers. Right now, kernel/watchdog.c implements both softlockup and hardlockup detectors. Softlockup code is generic. Hardlockup code is arch specific. Some architectures don't use hardlockup detectors. They use their own watchdog detectors. To make both these combination work, we have numerous #ifdefs in kernel/watchdog.c. We are trying here to make these handlers independent of each other. Also provide an interface for architectures to implement their own handlers. watchdog_nmi_enable and watchdog_nmi_disable will be defined as weak such that architectures can override its definitions. Thanks to Don Zickus for his suggestions. Here are our previous discussions http://www.spinics.net/lists/sparclinux/msg16543.html http://www.spinics.net/lists/sparclinux/msg16441.html This patch (of 3): Move shared macros and definitions to nmi.h so that watchdog.c, new file watchdog_hld.c or any other architecture specific handler can use those definitions. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Babu Moger <[email protected]> Acked-by: Don Zickus <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Kosina <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Yaowei Bai <[email protected]> Cc: Aaron Tomlin <[email protected]> Cc: Ulrich Obergfell <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Hidehiro Kawai <[email protected]> Cc: Josh Hunt <[email protected]> Cc: "David S. Miller" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-10-07	nmi_backtrace: add more trigger_*_cpu_backtrace() methods	Chris Metcalf	1	-5/+26
	Patch series "improvements to the nmi_backtrace code" v9. This patch series modifies the trigger_xxx_backtrace() NMI-based remote backtracing code to make it more flexible, and makes a few small improvements along the way. The motivation comes from the task isolation code, where there are scenarios where we want to be able to diagnose a case where some cpu is about to interrupt a task-isolated cpu. It can be helpful to see both where the interrupting cpu is, and also an approximation of where the cpu that is being interrupted is. The nmi_backtrace framework allows us to discover the stack of the interrupted cpu. I've tested that the change works as desired on tile, and build-tested x86, arm, mips, and sparc64. For x86 I confirmed that the generic cpuidle stuff as well as the architecture-specific routines are in the new cpuidle section. For arm, mips, and sparc I just build-tested it and made sure the generic cpuidle routines were in the new cpuidle section, but I didn't attempt to figure out which the platform-specific idle routines might be. That might be more usefully done by someone with platform experience in follow-up patches. This patch (of 4): Currently you can only request a backtrace of either all cpus, or all cpus but yourself. It can also be helpful to request a remote backtrace of a single cpu, and since we want that, the logical extension is to support a cpumask as the underlying primitive. This change modifies the existing lib/nmi_backtrace.c code to take a cpumask as its basic primitive, and modifies the linux/nmi.h code to use the new "cpumask" method instead. The existing clients of nmi_backtrace (arm and x86) are converted to using the new cpumask approach in this change. The other users of the backtracing API (sparc64 and mips) are converted to use the cpumask approach rather than the all/allbutself approach. The mips code ignored the "include_self" boolean but with this change it will now also dump a local backtrace if requested. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Chris Metcalf <[email protected]> Tested-by: Daniel Thompson <[email protected]> [arm] Reviewed-by: Aaron Tomlin <[email protected]> Reviewed-by: Petr Mladek <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Russell King <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ralf Baechle <[email protected]> Cc: David Miller <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-23	x86/apic: Remove declaration of unused hw_nmi_is_cpu_stuck	Yaowei Bai	1	-1/+0
	Commit 10f9014912 ("x86: Cleanup hw_nmi.c cruft") removed unused code in the hw_nmi.c file because of the redesign of the hardlockup watchdog but left declaration of hw_nmi_is_cpu_stuck in linux/nmi.h, so remvoe it. Signed-off-by: Yaowei Bai <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-11-05	kernel/watchdog.c: perform all-CPU backtrace in case of hard lockup	Jiri Kosina	1	-0/+1
	In many cases of hardlockup reports, it's actually not possible to know why it triggered, because the CPU that got stuck is usually waiting on a resource (with IRQs disabled) in posession of some other CPU is holding. IOW, we are often looking at the stacktrace of the victim and not the actual offender. Introduce sysctl / cmdline parameter that makes it possible to have hardlockup detector perform all-CPU backtrace. Signed-off-by: Jiri Kosina <[email protected]> Reviewed-by: Aaron Tomlin <[email protected]> Cc: Ulrich Obergfell <[email protected]> Acked-by: Don Zickus <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-09-08	Merge branch 'nmi' of git://ftp.arm.linux.org.uk/~rmk/linux-arm	Linus Torvalds	1	-0/+6
	Pull NMI backtrace update from Russell King: "These changes convert the x86 NMI handling to be a library implementation which other architectures can make use of. Thomas Gleixner has reviewed and tested these changes, and wishes me to send these rather than taking them through the tip tree. The final patch in the set adds an initial implementation using this infrastructure to ARM, even though it doesn't send the IPI at "NMI" level. Patches are in progress to add the ARM equivalent of NMI, but we still need the IRQ-level fallback for systems where the "NMI" isn't available due to secure firmware denying access to it" * 'nmi' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: ARM: add basic support for on-demand backtrace of other CPUs nmi: x86: convert to generic nmi handler nmi: create generic NMI backtrace implementation
2015-09-04	watchdog: rename watchdog_suspend() and watchdog_resume()	Ulrich Obergfell	1	-4/+4
	Rename watchdog_suspend() to lockup_detector_suspend() and watchdog_resume() to lockup_detector_resume() to avoid confusion with the watchdog subsystem and to be consistent with the existing name lockup_detector_init(). Also provide comment blocks to explain the watchdog_running and watchdog_suspended variables and their relationship. Signed-off-by: Ulrich Obergfell <[email protected]> Reviewed-by: Aaron Tomlin <[email protected]> Cc: Guenter Roeck <[email protected]> Cc: Don Zickus <[email protected]> Cc: Ulrich Obergfell <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Chris Metcalf <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-09-04	watchdog: use suspend/resume interface in fixup_ht_bug()	Ulrich Obergfell	1	-4/+9
	Remove watchdog_nmi_disable_all() and watchdog_nmi_enable_all() since these functions are no longer needed. If a subsystem has a need to deactivate the watchdog temporarily, it should utilize the watchdog_suspend() and watchdog_resume() functions. [[email protected]: fix build with CONFIG_LOCKUP_DETECTOR=m] Signed-off-by: Ulrich Obergfell <[email protected]> Reviewed-by: Aaron Tomlin <[email protected]> Cc: Guenter Roeck <[email protected]> Cc: Don Zickus <[email protected]> Cc: Ulrich Obergfell <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Chris Metcalf <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-09-04	watchdog: introduce watchdog_suspend() and watchdog_resume()	Ulrich Obergfell	1	-0/+2
	This interface can be utilized to deactivate the hard and soft lockup detector temporarily. Callers are expected to minimize the duration of deactivation. Multiple deactivations are allowed to occur in parallel but should be rare in practice. [[email protected]: remove unneeded static initialization] Signed-off-by: Ulrich Obergfell <[email protected]> Reviewed-by: Aaron Tomlin <[email protected]> Cc: Guenter Roeck <[email protected]> Cc: Don Zickus <[email protected]> Cc: Ulrich Obergfell <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Chris Metcalf <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-09-04	kernel/watchdog: move NMI function header declarations from watchdog.h to nmi.h	Guenter Roeck	1	-3/+5
	The kernel's NMI watchdog has nothing to do with the watchdog subsystem. Its header declarations should be in linux/nmi.h, not linux/watchdog.h. The code provided two sets of dummy functions if HARDLOCKUP_DETECTOR is not configured, one in the include file and one in kernel/watchdog.c. Remove the dummy functions from kernel/watchdog.c and use those from the include file. Signed-off-by: Guenter Roeck <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Peter Zijlstra (Intel) <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Don Zickus <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-07-17	nmi: create generic NMI backtrace implementation	Russell King	1	-0/+6
	x86s NMI backtrace implementation (for arch_trigger_all_cpu_backtrace()) is fairly generic in nature - the only architecture specific bits are the act of raising the NMI to other CPUs, and reporting the status of the NMI handler. These are fairly simple to factor out, and produce a generic implementation which can be shared between ARM and x86. Reviewed-by: Thomas Gleixner <[email protected]> Signed-off-by: Russell King <[email protected]>
2015-06-24	watchdog: add watchdog_cpumask sysctl to assist nohz	Chris Metcalf	1	-0/+3
	Change the default behavior of watchdog so it only runs on the housekeeping cores when nohz_full is enabled at build and boot time. Allow modifying the set of cores the watchdog is currently running on with a new kernel.watchdog_cpumask sysctl. In the current system, the watchdog subsystem runs a periodic timer that schedules the watchdog kthread to run. However, nohz_full cores are designed to allow userspace application code running on those cores to have 100% access to the CPU. So the watchdog system prevents the nohz_full application code from being able to run the way it wants to, thus the motivation to suppress the watchdog on nohz_full cores, which this patchset provides by default. However, if we disable the watchdog globally, then the housekeeping cores can't benefit from the watchdog functionality. So we allow disabling it only on some cores. See Documentation/lockup-watchdogs.txt for more information. [[email protected]: fix a watchdog crash in some configurations] Signed-off-by: Chris Metcalf <[email protected]> Acked-by: Don Zickus <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ulrich Obergfell <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Frederic Weisbecker <[email protected]> Signed-off-by: John Hubbard <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-04-14	watchdog: introduce the hardlockup_detector_disable() function	Ulrich Obergfell	1	-7/+2
	Have kvm_guest_init() use hardlockup_detector_disable() instead of watchdog_enable_hardlockup_detector(false). Remove the watchdog_hardlockup_detector_is_enabled() and the watchdog_enable_hardlockup_detector() function which are no longer needed. Signed-off-by: Ulrich Obergfell <[email protected]> Signed-off-by: Don Zickus <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-04-14	watchdog: enable the new user interface of the watchdog mechanism	Ulrich Obergfell	1	-2/+0
	With the current user interface of the watchdog mechanism it is only possible to disable or enable both lockup detectors at the same time. This series introduces new kernel parameters and changes the semantics of some existing kernel parameters, so that the hard lockup detector and the soft lockup detector can be disabled or enabled individually. With this series applied, the user interface is as follows. - parameters in /proc/sys/kernel . soft_watchdog This is a new parameter to control and examine the run state of the soft lockup detector. . nmi_watchdog The semantics of this parameter have changed. It can now be used to control and examine the run state of the hard lockup detector. . watchdog This parameter is still available to control the run state of both lockup detectors at the same time. If this parameter is examined, it shows the logical OR of soft_watchdog and nmi_watchdog. . watchdog_thresh The semantics of this parameter are not affected by the patch. - kernel command line parameters . nosoftlockup The semantics of this parameter have changed. It can now be used to disable the soft lockup detector at boot time. . nmi_watchdog=0 or nmi_watchdog=1 Disable or enable the hard lockup detector at boot time. The patch introduces '=1' as a new option. . nowatchdog The semantics of this parameter are not affected by the patch. It is still available to disable both lockup detectors at boot time. Also, remove the proc_dowatchdog() function which is no longer needed. [[email protected]: wrote changelog] [[email protected]: update documentation for kernel params and sysctl] Signed-off-by: Ulrich Obergfell <[email protected]> Signed-off-by: Don Zickus <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-04-14	watchdog: introduce separate handlers for parameters in /proc/sys/kernel	Ulrich Obergfell	1	-0/+8
	Separate handlers for each watchdog parameter in /proc/sys/kernel replace the proc_dowatchdog() function. Three of those handlers merely call proc_watchdog_common() with one different argument. Signed-off-by: Ulrich Obergfell <[email protected]> Signed-off-by: Don Zickus <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-04-14	watchdog: new definitions and variables, initialization	Ulrich Obergfell	1	-0/+2
	The hardlockup and softockup had always been tied together. Due to the request of KVM folks, they had a need to have one enabled but not the other. Internally rework the code to split things apart more cleanly. There is a bunch of churn here, but the end result should be code that should be easier to maintain and fix without knowing the internals of what is going on. This patch (of 9): Introduce new definitions and variables to separate the user interface in /proc/sys/kernel from the internal run state of the lockup detectors. The internal run state is represented by two bits in a new variable that is named 'watchdog_enabled'. This helps simplify the code, for example: - In order to check if any of the two lockup detectors is enabled, it is sufficient to check if 'watchdog_enabled' is not zero. - In order to enable/disable one or both lockup detectors, it is sufficient to set/clear one or both bits in 'watchdog_enabled'. - Concurrent updates of 'watchdog_enabled' need not be synchronized via a spinlock or a mutex. Updates can either be atomic or concurrency can be detected by using 'cmpxchg'. Signed-off-by: Ulrich Obergfell <[email protected]> Signed-off-by: Don Zickus <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-10-14	kernel/watchdog.c: control hard lockup detection default	Ulrich Obergfell	1	-0/+13
	In some cases we don't want hard lockup detection enabled by default. An example is when running as a guest. Introduce watchdog_enable_hardlockup_detector(bool) allowing those cases to disable hard lockup detection. This must be executed early by the boot processor from e.g. smp_prepare_boot_cpu, in order to allow kernel command line arguments to override it, as well as to avoid hard lockup detection being enabled before we've had a chance to indicate that it's unwanted. In summary, initial boot: default=enabled smp_prepare_boot_cpu watchdog_enable_hardlockup_detector(false): default=disabled cmdline has 'nmi_watchdog=1': default=enabled The running kernel still has the ability to enable/disable at any time with /proc/sys/kernel/nmi_watchdog us usual. However even when the default has been overridden /proc/sys/kernel/nmi_watchdog will initially show '1'. To truly turn it on one must disable/enable it, i.e. echo 0 > /proc/sys/kernel/nmi_watchdog echo 1 > /proc/sys/kernel/nmi_watchdog This patch will be immediately useful for KVM with the next patch of this series. Other hypervisor guest types may find it useful as well. [[email protected]: fix build] [[email protected]: fix compile issues on sparc] Signed-off-by: Ulrich Obergfell <[email protected]> Signed-off-by: Andrew Jones <[email protected]> Signed-off-by: Don Zickus <[email protected]> Signed-off-by: Don Zickus <[email protected]> Cc: Stephen Rothwell <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-07-22	acpi, apei, ghes: Make NMI error notification to be GHES architecture extension.	Tomasz Nowicki	1	-0/+4
	Currently APEI depends on x86 architecture. It is because of NMI hardware error notification of GHES which is currently supported by x86 only. However, many other APEI features can be still used perfectly by other architectures. This commit adds two symbols: 1. HAVE_ACPI_APEI for those archs which support APEI. 2. HAVE_ACPI_APEI_NMI which is used for NMI code isolation in ghes.c file. NMI related data and functions are grouped so they can be wrapped inside one #ifdef section. Appropriate function stubs are provided for !NMI case. Note there is no functional changes for x86 due to hard selected HAVE_ACPI_APEI and HAVE_ACPI_APEI_NMI symbols. Signed-off-by: Tomasz Nowicki <[email protected]> Acked-by: Borislav Petkov <[email protected]> Signed-off-by: Tony Luck <[email protected]>
2014-06-23	kernel/watchdog.c: print traces for all cpus on lockup detection	Aaron Tomlin	1	-0/+1
	A 'softlockup' is defined as a bug that causes the kernel to loop in kernel mode for more than a predefined period to time, without giving other tasks a chance to run. Currently, upon detection of this condition by the per-cpu watchdog task, debug information (including a stack trace) is sent to the system log. On some occasions, we have observed that the "victim" rather than the actual "culprit" (i.e. the owner/holder of the contended resource) is reported to the user. Often this information has proven to be insufficient to assist debugging efforts. To avoid loss of useful debug information, for architectures which support NMI, this patch makes it possible to improve soft lockup reporting. This is accomplished by issuing an NMI to each cpu to obtain a stack trace. If NMI is not supported we just revert back to the old method. A sysctl and boot-time parameter is available to toggle this feature. [[email protected]: add CONFIG_SMP in certain areas] [[email protected]: additional CONFIG_SMP=n optimisations] [[email protected]: fix warning] Signed-off-by: Aaron Tomlin <[email protected]> Signed-off-by: Don Zickus <[email protected]> Cc: David S. Miller <[email protected]> Cc: Mateusz Guzik <[email protected]> Cc: Oleg Nesterov <[email protected]> Signed-off-by: Jan Moskyto Matejka <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-06-23	nmi: provide the option to issue an NMI back trace to every cpu but current	Aaron Tomlin	1	-1/+10
	Sometimes it is preferred not to use the trigger_all_cpu_backtrace() routine when one wants to avoid capturing a back trace for current. For instance if one was previously captured recently. This patch provides a new routine namely trigger_allbutself_cpu_backtrace() which offers the flexibility to issue an NMI to every cpu but current and capture a back trace accordingly. Patch x86 and sparc to support new routine. [[email protected]: add stub in #else clause] [[email protected]: don't print message in single processor case, wrap with get/put_cpu based on Oleg's suggestion] [[email protected]: undo C99ism] Signed-off-by: Aaron Tomlin <[email protected]> Signed-off-by: Don Zickus <[email protected]> Acked-by: David S. Miller <[email protected]> Cc: Mateusz Guzik <[email protected]> Cc: Oleg Nesterov <[email protected]> Signed-off-by: Stephen Rothwell <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-06-20	watchdog: Rename confusing state variable	Frederic Weisbecker	1	-1/+1
	We have two very conflicting state variable names in the watchdog: * watchdog_enabled: This one reflects the user interface. It's set to 1 by default and can be overriden with boot options or sysctl/procfs interface. * watchdog_disabled: This is the internal toggle state that tells if watchdog threads, timers and NMI events are currently running or not. This state mostly depends on the user settings. It's a convenient state latch. Now we really need to find clearer names because those are just too confusing to encourage deep review. watchdog_enabled now becomes watchdog_user_enabled to reflect its purpose as an interface. watchdog_disabled becomes watchdog_running to suggest its role as a pure internal state. Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Srivatsa S. Bhat <[email protected]> Cc: Anish Singh <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Li Zhong <[email protected]> Cc: Don Zickus <[email protected]>
2012-03-23	nmi watchdog: do not use cpp symbol in Kconfig	Cong Wang	1	-1/+1
	ARCH_HAS_NMI_WATCHDOG is a macro defined by arch, but config HARDLOCKUP_DETECTOR depends on it. This is wrong, ARCH_HAS_NMI_WATCHDOG has to be a Kconfig config, and arch's need it should select it explicitly. Signed-off-by: WANG Cong <[email protected]> Acked-by: Don Zickus <[email protected]> Acked-by: Mike Frysinger <[email protected]> Cc: David Howells <[email protected]> Cc: David Miller <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-23	watchdog: Change the default timeout and configure nmi watchdog period based ↵	Mandeep Singh Baines	1	-1/+1
	on watchdog_thresh Before the conversion of the NMI watchdog to perf event, the watchdog timeout was 5 seconds. Now it is 60 seconds. For my particular application, netbooks, 5 seconds was a better timeout. With a short timeout, we catch faults earlier and are able to send back a panic. With a 60 second timeout, the user is unlikely to wait and will instead hit the power button, causing us to lose the panic info. This change configures the NMI period to watchdog_thresh and sets the softlockup_thresh to watchdog_thresh * 2. In addition, watchdog_thresh was reduced to 10 seconds as suggested by Ingo Molnar. Signed-off-by: Mandeep Singh Baines <[email protected]> Cc: Marcin Slusarz <[email protected]> Cc: Don Zickus <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Frederic Weisbecker <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]> LKML-Reference: <[email protected]>
2011-05-23	watchdog: Disable watchdog when thresh is zero	Mandeep Singh Baines	1	-2/+3
	This restores the previous behavior of softlock_thresh. Currently, setting watchdog_thresh to zero causes the watchdog kthreads to consume a lot of CPU. In addition, the logic of proc_dowatchdog_thresh and proc_dowatchdog_enabled has been factored into proc_dowatchdog. Signed-off-by: Mandeep Singh Baines <[email protected]> Cc: Marcin Slusarz <[email protected]> Cc: Don Zickus <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Frederic Weisbecker <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]> LKML-Reference: <[email protected]>
2010-12-22	x86, nmi_watchdog: Remove ARCH_HAS_NMI_WATCHDOG and rely on ↵	Don Zickus	1	-5/+1
	CONFIG_HARDLOCKUP_DETECTOR The x86 arch has shifted its use of the nmi_watchdog from a local implementation to the global one provide by kernel/watchdog.c. This shift has caused a whole bunch of compile problems under different config options. I attempt to simplify things with the patch below. In order to simplify things, I had to come to terms with the meaning of two terms ARCH_HAS_NMI_WATCHDOG and CONFIG_HARDLOCKUP_DETECTOR. Basically they mean the same thing, the former on a local level and the latter on a global level. With the old x86 nmi watchdog gone, there is no need to rely on defining the ARCH_HAS_NMI_WATCHDOG variable because it doesn't make sense any more. x86 will now use the global implementation. The changes below do a few things. First it changes the few places that relied on ARCH_HAS_NMI_WATCHDOG to use CONFIG_X86_LOCAL_APIC (the former was an alias for the latter anyway, so nothing unusual here). Those pieces of code were relying more on local apic functionality the nmi watchdog functionality, so the change should make sense. Second, I removed the x86 implementation of touch_nmi_watchdog(). It isn't need now, instead x86 will rely on kernel/watchdog.c's implementation. Third, I removed the #define ARCH_HAS_NMI_WATCHDOG itself from x86. And tweaked the include/linux/nmi.h file to tell users to look for an externally defined touch_nmi_watchdog in the case of ARCH_HAS_NMI_WATCHDOG _or_ CONFIG_HARDLOCKUP_DETECTOR. This changes removes some of the ugliness in that file. Finally, I added a Kconfig dependency for CONFIG_HARDLOCKUP_DETECTOR that said you can't have ARCH_HAS_NMI_WATCHDOG _and_ CONFIG_HARDLOCKUP_DETECTOR. You can only have one nmi_watchdog. Tested with ARCH=i386: allnoconfig, defconfig, allyesconfig, (various broken configs) ARCH=x86_64: allnoconfig, defconfig, allyesconfig, (various broken configs) Hopefully, after this patch I won't get any more compile broken emails. :-) v3: changed a couple of 'linux/nmi.h' -> 'asm/nmi.h' to pick-up correct function prototypes when CONFIG_HARDLOCKUP_DETECTOR is not set. Signed-off-by: Don Zickus <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-12-10	lockup detector: Compile fixes from removing the old x86 nmi watchdog	Don Zickus	1	-1/+3
	My patch that removed the old x86 nmi watchdog broke other arches. This change reverts a piece of that patch and puts the change in the correct spot. Signed-off-by: Don Zickus <[email protected]> Reviewed-by: Cyrill Gorcunov <[email protected]> Cc: [email protected] Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-11-18	x86, nmi_watchdog: Remove all stub function calls from old nmi_watchdog	Don Zickus	1	-2/+0
	Now that the bulk of the old nmi_watchdog is gone, remove all the stub variables and hooks associated with it. This touches lots of files mainly because of how the io_apic nmi_watchdog was implemented. Now that the io_apic nmi_watchdog is forever gone, remove all its fingers. Most of this code was not being exercised by virtue of nmi_watchdog != NMI_IO_APIC, so there shouldn't be anything to risky here. Signed-off-by: Don Zickus <[email protected]> Cc: [email protected] Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-11-18	x86, nmi_watchdog: Remove the old nmi_watchdog	Don Zickus	1	-5/+1
	Now that we have a new nmi_watchdog that is more generic and sits on top of the perf subsystem, we really do not need the old nmi_watchdog any more. In addition, the old nmi_watchdog doesn't really work if you are using the default clocksource, hpet. The old nmi_watchdog code relied on local apic interrupts to determine if the cpu is still alive. With hpet as the clocksource, these interrupts don't increment any more and the old nmi_watchdog triggers false postives. This piece removes the old nmi_watchdog code and stubs out any variables and functions calls. The stubs are the same ones used by the new nmi_watchdog code, so it should be well tested. Signed-off-by: Don Zickus <[email protected]> Cc: [email protected] Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-05-16	lockup_detector: Cross arch compile fixes	Don Zickus	1	-1/+1
	Combining the softlockup and hardlockup code causes watchdog.c to build even without the hardlockup detection support. So if an arch, that has the previous and the new nmi watchdog implementations cohabiting, wants to know if the generic one is in use, CONFIG_LOCKUP_DETECTOR is not a reliable check. We need to use CONFIG_HARDLOCKUP_DETECTOR instead. Fixes: kernel/built-in.o: In function `touch_nmi_watchdog': (.text+0x449bc): multiple definition of `touch_nmi_watchdog' arch/sparc/kernel/built-in.o:(.text+0x11b28): first defined here Signed-off-by: Don Zickus <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Don Zickus <[email protected]> Cc: Cyrill Gorcunov <[email protected]> LKML-Reference: <[email protected]> [ use CONFIG_HARDLOCKUP_DETECTOR instead of CONFIG_PERF_EVENTS_NMI] Signed-off-by: Frederic Weisbecker <[email protected]>
2010-05-12	lockup_detector: Combine nmi_watchdog and softlockup detector	Don Zickus	1	-4/+4
	The new nmi_watchdog (which uses the perf event subsystem) is very similar in structure to the softlockup detector. Using Ingo's suggestion, I combined the two functionalities into one file: kernel/watchdog.c. Now both the nmi_watchdog (or hardlockup detector) and softlockup detector sit on top of the perf event subsystem, which is run every 60 seconds or so to see if there are any lockups. To detect hardlockups, cpus not responding to interrupts, I implemented an hrtimer that runs 5 times for every perf event overflow event. If that stops counting on a cpu, then the cpu is most likely in trouble. To detect softlockups, tasks not yielding to the scheduler, I used the previous kthread idea that now gets kicked every time the hrtimer fires. If the kthread isn't being scheduled neither is anyone else and the warning is printed to the console. I tested this on x86_64 and both the softlockup and hardlockup paths work. V2: - cleaned up the Kconfig and softlockup combination - surrounded hardlockup cases with #ifdef CONFIG_PERF_EVENTS_NMI - seperated out the softlockup case from perf event subsystem - re-arranged the enabling/disabling nmi watchdog from proc space - added cpumasks for hardlockup failure cases - removed fallback to soft events if no PMU exists for hard events V3: - comment cleanups - drop support for older softlockup code - per_cpu cleanups - completely remove software clock base hardlockup detector - use per_cpu masking on hard/soft lockup detection - #ifdef cleanups - rename config option NMI_WATCHDOG to LOCKUP_DETECTOR - documentation additions V4: - documentation fixes - convert per_cpu to __get_cpu_var - powerpc compile fixes V5: - split apart warn flags for hard and soft lockups TODO: - figure out how to make an arch-agnostic clock2cycles call (if possible) to feed into perf events as a sample period [fweisbec: merged conflict patch] Signed-off-by: Don Zickus <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Cyrill Gorcunov <[email protected]> Cc: Eric Paris <[email protected]> Cc: Randy Dunlap <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]>
2010-02-25	nmi_watchdog: Clean up various small details	Don Zickus	1	-1/+1
	Mostly copy/paste whitespace damage with a couple of nitpicks by the checkpatch script. Fix the struct definition as requested by Ingo too. Signed-off-by: Don Zickus <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> -- arch/x86/kernel/apic/hw_nmi.c \| 14 +++++------ arch/x86/kernel/traps.c \| 6 ++-- include/linux/nmi.h \| 2 - kernel/nmi_watchdog.c \| 51 ++++++++++++++++++++---------------------- 4 files changed, 36 insertions(+), 37 deletions(-)
2010-02-14	nmi_watchdog: Compile and portability fixes	Don Zickus	1	-0/+9
	The original patch was x86_64 centric. Changed the code to make it less so. ested by building and running on a powerpc. Signed-off-by: Don Zickus <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-02-08	nmi_watchdog: Config option to enable new nmi_watchdog	Don Zickus	1	-0/+4
	These are the bits that enable the new nmi_watchdog and safely isolate the old nmi_watchdog. Only one or the other can run, not both at the same time. Signed-off-by: Don Zickus <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Andrew Morton <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2009-08-03	debug lockups: Improve lockup detection, fix generic arch fallback	Ingo Molnar	1	-2/+17
	As Andrew noted, my previous patch ("debug lockups: Improve lockup detection") broke/removed SysRq-L support from architecture that do not provide a __trigger_all_cpu_backtrace implementation. Restore a fallback path and clean up the SysRq-L machinery a bit: - Rename the arch method to arch_trigger_all_cpu_backtrace() - Simplify the define - Document the method a bit - in the hope of more architectures adding support for it. [ The patch touches Sparc code for the rename. ] Cc: Paul E. McKenney <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: "David S. Miller" <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2007-02-13	[PATCH] x86: fix laptop bootup hang in init_acpi()	Ingo Molnar	1	-1/+8
	During kernel bootup, a new T60 laptop (CoreDuo, 32-bit) hangs about 10%-20% of the time in acpi_init(): Calling initcall 0xc055ce1a: topology_init+0x0/0x2f() Calling initcall 0xc055d75e: mtrr_init_finialize+0x0/0x2c() Calling initcall 0xc05664f3: param_sysfs_init+0x0/0x175() Calling initcall 0xc014cb65: pm_sysrq_init+0x0/0x17() Calling initcall 0xc0569f99: init_bio+0x0/0xf4() Calling initcall 0xc056b865: genhd_device_init+0x0/0x50() Calling initcall 0xc056c4bd: fbmem_init+0x0/0x87() Calling initcall 0xc056dd74: acpi_init+0x0/0x1ee() It's a hard hang that not even an NMI could punch through! Frustratingly, adding printks or function tracing to the ACPI code made the hangs go away ... After some time an additional detail emerged: disabling the NMI watchdog made these occasional hangs go away. So i spent the better part of today trying to debug this and trying out various theories when i finally found the likely reason for the hang: if acpi_ns_initialize_devices() executes an _INI AML method and an NMI happens to hit that AML execution in the wrong moment, the machine would hang. (my theory is that this must be some sort of chipset setup method doing stores to chipset mmio registers?) Unfortunately given the characteristics of the hang it was sheer impossible to figure out which of the numerous AML methods is impacted by this problem. As a workaround i wrote an interface to disable chipset-based NMIs while executing _INI sections - and indeed this fixed the hang. I did a boot-loop of 100 separate reboots and none hung - while without the patch it would hang every 5-10 attempts. Out of caution i did not touch the nmi_watchdog=2 case (it's not related to the chipset anyway and didnt hang). I implemented this for both x86_64 and i686, tested the i686 laptop both with nmi_watchdog=1 [which triggered the hangs] and nmi_watchdog=2, and tested an Athlon64 box with the 64-bit kernel as well. Everything builds and works with the patch applied. Signed-off-by: Ingo Molnar <[email protected]> Signed-off-by: Andi Kleen <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Len Brown <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2006-12-07	[PATCH] x86: all cpu backtrace	Andrew Morton	1	-0/+5
	When a spinlock lockup occurs, arrange for the NMI code to emit an all-cpu backtrace, so we get to see which CPU is holding the lock, and where. Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Badari Pulavarty <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Andi Kleen <[email protected]>
2006-09-29	[PATCH] Make touch_nmi_watchdog imply touch_softlockup_watchdog on all archs	Michal Schmidt	1	-1/+2
	touch_nmi_watchdog() calls touch_softlockup_watchdog() on both architectures that implement it (i386 and x86_64). On other architectures it does nothing at all. touch_nmi_watchdog() should imply touch_softlockup_watchdog() on all architectures. Suggested by Andi Kleen. [[email protected]: s390 fix] Signed-off-by: Michal Schmidt <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Martin Schwidefsky <[email protected]> Signed-off-by: Heiko Carstens <[email protected]> Cc: Michal Schmidt <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2005-04-16	Linux-2.6.12-rc2	Linus Torvalds	1	-0/+22
	Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!