| Age | Commit message (Collapse) | Author | Files | Lines |
|
This reverts commit f03e8c1060f86c23eb49bafee99d9fcbd1c1bd77.
Let's roll back all of the serial core and printk console changes that
went into 6.10-rc1 as there still are problems with them that need to be
sorted out.
Link: https://lore.kernel.org/r/ZnpRozsdw6zbjqze@tlindgre-MOBL1
Reported-by: Petr Mladek <[email protected]>
Reported-by: Tony Lindgren <[email protected]>
Cc: Jiri Slaby <[email protected]>
Cc: John Ogness <[email protected]>
Cc: Sergey Senozhatsky <[email protected]>
Cc: Andy Shevchenko <[email protected]>
Cc: Ilpo Järvinen <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Driver subsystems may need to translate the preferred console name to the
character device name used. We already do some of this in console_setup()
with a few hardcoded names, but that does not scale well.
The console options are parsed early in console_setup(), and the consoles
are added with __add_preferred_console(). At this point we don't know much
about the character device names and device drivers getting probed.
To allow driver subsystems to set up a preferred console, let's save the
kernel command line console options. To add a preferred console from a
driver subsystem with optional character device name translation, let's
add a new function add_preferred_console_match().
This allows the serial core layer to support console=DEVNAME:0.0 style
hardware based addressing in addition to the current console=ttyS0 style
naming. And we can start moving console_setup() character device parsing
to the driver subsystem specific code.
We use a separate array from the console_cmdline array as the character
device name and index may be unknown at the console_setup() time. And
eventually there's no need to call __add_preferred_console() until the
subsystem is ready to handle the console.
Adding the console name in addition to the character device name, and a
flag for an added console, could be added to the struct console_cmdline.
And the console_cmdline array handling could be modified accordingly. But
that complicates things compared saving the console options, and then
adding the consoles when the subsystems handling the consoles are ready.
Co-developed-by: Andy Shevchenko <[email protected]>
Signed-off-by: Tony Lindgren <[email protected]>
Signed-off-by: Andy Shevchenko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
The current console/printk subsystem is protected by a Big Kernel Lock,
(aka console_lock) which has ill defined semantics and is more or less
stateless. This puts severe limitations on the console subsystem and
makes forced takeover and output in emergency and panic situations a
fragile endeavour that is based on try and pray.
The goal of non-BKL (nbcon) consoles is to break out of the console lock
jail and to provide a new infrastructure that avoids the pitfalls and
also allows console drivers to be gradually converted over.
The proposed infrastructure aims for the following properties:
- Per console locking instead of global locking
- Per console state that allows to make informed decisions
- Stateful handover and takeover
As a first step, state is added to struct console. The per console state
is an atomic_t using a 32bit bit field.
Reserve state bits, which will be populated later in the series. Wire
it up into the console register/unregister functionality.
It was decided to use a bitfield because using a plain u32 with
mask/shift operations resulted in uncomprehensible code.
Co-developed-by: John Ogness <[email protected]>
Signed-off-by: John Ogness <[email protected]>
Signed-off-by: Thomas Gleixner (Intel) <[email protected]>
Reviewed-by: Petr Mladek <[email protected]>
Signed-off-by: Petr Mladek <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
kernel/sysctl.c is a kitchen sink where everyone leaves their dirty
dishes, this makes it very difficult to maintain.
To help with this maintenance let's start by moving sysctls to places
where they actually belong. The proc sysctl maintainers do not want to
know what sysctl knobs you wish to add for your own piece of code, we
just care about the core logic.
So move printk sysctl from kernel/sysctl.c to kernel/printk/sysctl.c.
Use register_sysctl() to register the sysctl interface.
[[email protected]: fixed compile issues when PRINTK is not set, commit log update]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Xiaoming Ni <[email protected]>
Signed-off-by: Luis Chamberlain <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Amir Goldstein <[email protected]>
Cc: Andy Shevchenko <[email protected]>
Cc: Antti Palosaari <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Benjamin LaHaise <[email protected]>
Cc: Clemens Ladisch <[email protected]>
Cc: David Airlie <[email protected]>
Cc: Douglas Gilbert <[email protected]>
Cc: Eric Biederman <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Iurii Zaikin <[email protected]>
Cc: James E.J. Bottomley <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: John Ogness <[email protected]>
Cc: Joonas Lahtinen <[email protected]>
Cc: Joseph Qi <[email protected]>
Cc: Julia Lawall <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Lukas Middendorf <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Martin K. Petersen <[email protected]>
Cc: Paul Turner <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Petr Mladek <[email protected]>
Cc: Phillip Potter <[email protected]>
Cc: Qing Wang <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Cc: Sebastian Reichel <[email protected]>
Cc: Sergey Senozhatsky <[email protected]>
Cc: Stephen Kitt <[email protected]>
Cc: Steven Rostedt (VMware) <[email protected]>
Cc: Suren Baghdasaryan <[email protected]>
Cc: Tetsuo Handa <[email protected]>
Cc: "Theodore Ts'o" <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
We have a number of systems industry-wide that have a subset of their
functionality that works as follows:
1. Receive a message from local kmsg, serial console, or netconsole;
2. Apply a set of rules to classify the message;
3. Do something based on this classification (like scheduling a
remediation for the machine), rinse, and repeat.
As a couple of examples of places we have this implemented just inside
Facebook, although this isn't a Facebook-specific problem, we have this
inside our netconsole processing (for alarm classification), and as part
of our machine health checking. We use these messages to determine
fairly important metrics around production health, and it's important
that we get them right.
While for some kinds of issues we have counters, tracepoints, or metrics
with a stable interface which can reliably indicate the issue, in order
to react to production issues quickly we need to work with the interface
which most kernel developers naturally use when developing: printk.
Most production issues come from unexpected phenomena, and as such
usually the code in question doesn't have easily usable tracepoints or
other counters available for the specific problem being mitigated. We
have a number of lines of monitoring defence against problems in
production (host metrics, process metrics, service metrics, etc), and
where it's not feasible to reliably monitor at another level, this kind
of pragmatic netconsole monitoring is essential.
As one would expect, monitoring using printk is rather brittle for a
number of reasons -- most notably that the message might disappear
entirely in a new version of the kernel, or that the message may change
in some way that the regex or other classification methods start to
silently fail.
One factor that makes this even harder is that, under normal operation,
many of these messages are never expected to be hit. For example, there
may be a rare hardware bug which one wants to detect if it was to ever
happen again, but its recurrence is not likely or anticipated. This
precludes using something like checking whether the printk in question
was printed somewhere fleetwide recently to determine whether the
message in question is still present or not, since we don't anticipate
that it should be printed anywhere, but still need to monitor for its
future presence in the long-term.
This class of issue has happened on a number of occasions, causing
unhealthy machines with hardware issues to remain in production for
longer than ideal. As a recent example, some monitoring around
blk_update_request fell out of date and caused semi-broken machines to
remain in production for longer than would be desirable.
Searching through the codebase to find the message is also extremely
fragile, because many of the messages are further constructed beyond
their callsite (eg. btrfs_printk and other module-specific wrappers,
each with their own functionality). Even if they aren't, guessing the
format and formulation of the underlying message based on the aesthetics
of the message emitted is not a recipe for success at scale, and our
previous issues with fleetwide machine health checking demonstrate as
much.
This provides a solution to the issue of silently changed or deleted
printks: we record pointers to all printk format strings known at
compile time into a new .printk_index section, both in vmlinux and
modules. At runtime, this can then be iterated by looking at
<debugfs>/printk/index/<module>, which emits the following format, both
readable by humans and able to be parsed by machines:
$ head -1 vmlinux; shuf -n 5 vmlinux
# <level[,flags]> filename:line function "format"
<5> block/blk-settings.c:661 disk_stack_limits "%s: Warning: Device %s is misaligned\n"
<4> kernel/trace/trace.c:8296 trace_create_file "Could not create tracefs '%s' entry\n"
<6> arch/x86/kernel/hpet.c:144 _hpet_print_config "hpet: %s(%d):\n"
<6> init/do_mounts.c:605 prepare_namespace "Waiting for root device %s...\n"
<6> drivers/acpi/osl.c:1410 acpi_no_auto_serialize_setup "ACPI: auto-serialization disabled\n"
This mitigates the majority of cases where we have a highly-specific
printk which we want to match on, as we can now enumerate and check
whether the format changed or the printk callsite disappeared entirely
in userspace. This allows us to catch changes to printks we monitor
earlier and decide what to do about it before it becomes problematic.
There is no additional runtime cost for printk callers or printk itself,
and the assembly generated is exactly the same.
Signed-off-by: Chris Down <[email protected]>
Cc: Petr Mladek <[email protected]>
Cc: Jessica Yu <[email protected]>
Cc: Sergey Senozhatsky <[email protected]>
Cc: John Ogness <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Kees Cook <[email protected]>
Reviewed-by: Petr Mladek <[email protected]>
Tested-by: Petr Mladek <[email protected]>
Reported-by: kernel test robot <[email protected]>
Acked-by: Andy Shevchenko <[email protected]>
Acked-by: Jessica Yu <[email protected]> # for module.{c,h}
Signed-off-by: Petr Mladek <[email protected]>
Link: https://lore.kernel.org/r/e42070983637ac5e384f17fbdbe86d19c7b212a5.1623775748.git.chris@chrisdown.name
|
|
Introduce a multi-reader multi-writer lockless ringbuffer for storing
the kernel log messages. Readers and writers may use their API from
any context (including scheduler and NMI). This ringbuffer will make
it possible to decouple printk() callers from any context, locking,
or console constraints. It also makes it possible for readers to have
full access to the ringbuffer contents at any time and context (for
example from any panic situation).
The printk_ringbuffer is made up of 3 internal ringbuffers:
desc_ring:
A ring of descriptors. A descriptor contains all record meta data
(sequence number, timestamp, loglevel, etc.) as well as internal state
information about the record and logical positions specifying where in
the other ringbuffers the text and dictionary strings are located.
text_data_ring:
A ring of data blocks. A data block consists of an unsigned long
integer (ID) that maps to a desc_ring index followed by the text
string of the record.
dict_data_ring:
A ring of data blocks. A data block consists of an unsigned long
integer (ID) that maps to a desc_ring index followed by the dictionary
string of the record.
The internal state information of a descriptor is the key element to
allow readers and writers to locklessly synchronize access to the data.
Co-developed-by: Petr Mladek <[email protected]>
Signed-off-by: John Ogness <[email protected]>
Reviewed-by: Petr Mladek <[email protected]>
Reviewed-by: Paul E. McKenney <[email protected]>
Signed-off-by: Petr Mladek <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Add SPDX license identifiers to all Make/Kconfig files which:
- Have no license information of any form
These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:
GPL-2.0-only
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
This patch extends the idea of NMI per-cpu buffers to regions
that may cause recursive printk() calls and possible deadlocks.
Namely, printk() can't handle printk calls from schedule code
or printk() calls from lock debugging code (spin_dump() for instance);
because those may be called with `sem->lock' already taken or any
other `critical' locks (p->pi_lock, etc.). An example of deadlock
can be
vprintk_emit()
console_unlock()
up() << raw_spin_lock_irqsave(&sem->lock, flags);
wake_up_process()
try_to_wake_up()
ttwu_queue()
ttwu_activate()
activate_task()
enqueue_task()
enqueue_task_fair()
cfs_rq_of()
task_of()
WARN_ON_ONCE(!entity_is_task(se))
vprintk_emit()
console_trylock()
down_trylock()
raw_spin_lock_irqsave(&sem->lock, flags)
^^^^ deadlock
and some other cases.
Just like in NMI implementation, the solution uses a per-cpu
`printk_func' pointer to 'redirect' printk() calls to a 'safe'
callback, that store messages in a per-cpu buffer and flushes
them back to logbuf buffer later.
Usage example:
printk()
printk_safe_enter_irqsave(flags)
//
// any printk() call from here will endup in vprintk_safe(),
// that stores messages in a special per-CPU buffer.
//
printk_safe_exit_irqrestore(flags)
The 'redirection' mechanism, though, has been reworked, as suggested
by Petr Mladek. Instead of using a per-cpu @print_func callback we now
keep a per-cpu printk-context variable and call either default or nmi
vprintk function depending on its value. printk_nmi_entrer/exit and
printk_safe_enter/exit, thus, just set/celar corresponding bits in
printk-context functions.
The patch only adds printk_safe support, we don't use it yet.
Link: http://lkml.kernel.org/r/[email protected]
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Calvin Owens <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Peter Hurley <[email protected]>
Cc: [email protected]
Signed-off-by: Sergey Senozhatsky <[email protected]>
Signed-off-by: Petr Mladek <[email protected]>
Reviewed-by: Steven Rostedt (VMware) <[email protected]>
|
|
A preparation patch for printk_safe work. No functional change.
- rename nmi.c to print_safe.c
- add `printk_safe' prefix to some (which used both by printk-safe
and printk-nmi) of the exported functions.
Link: http://lkml.kernel.org/r/[email protected]
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Calvin Owens <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Peter Hurley <[email protected]>
Cc: [email protected]
Signed-off-by: Sergey Senozhatsky <[email protected]>
Signed-off-by: Petr Mladek <[email protected]>
|
|
printk() takes some locks and could not be used a safe way in NMI
context.
The chance of a deadlock is real especially when printing stacks from
all CPUs. This particular problem has been addressed on x86 by the
commit a9edc8809328 ("x86/nmi: Perform a safe NMI stack trace on all
CPUs").
The patchset brings two big advantages. First, it makes the NMI
backtraces safe on all architectures for free. Second, it makes all NMI
messages almost safe on all architectures (the temporary buffer is
limited. We still should keep the number of messages in NMI context at
minimum).
Note that there already are several messages printed in NMI context:
WARN_ON(in_nmi()), BUG_ON(in_nmi()), anything being printed out from MCE
handlers. These are not easy to avoid.
This patch reuses most of the code and makes it generic. It is useful
for all messages and architectures that support NMI.
The alternative printk_func is set when entering and is reseted when
leaving NMI context. It queues IRQ work to copy the messages into the
main ring buffer in a safe context.
__printk_nmi_flush() copies all available messages and reset the buffer.
Then we could use a simple cmpxchg operations to get synchronized with
writers. There is also used a spinlock to get synchronized with other
flushers.
We do not longer use seq_buf because it depends on external lock. It
would be hard to make all supported operations safe for a lockless use.
It would be confusing and error prone to make only some operations safe.
The code is put into separate printk/nmi.c as suggested by Steven
Rostedt. It needs a per-CPU buffer and is compiled only on
architectures that call nmi_enter(). This is achieved by the new
HAVE_NMI Kconfig flag.
The are MN10300 and Xtensa architectures. We need to clean up NMI
handling there first. Let's do it separately.
The patch is heavily based on the draft from Peter Zijlstra, see
https://lkml.org/lkml/2015/6/10/327
[[email protected]: printk-nmi: use %zu format string for size_t]
[[email protected]: min_t->min - all types are size_t here]
Signed-off-by: Petr Mladek <[email protected]>
Suggested-by: Peter Zijlstra <[email protected]>
Suggested-by: Steven Rostedt <[email protected]>
Cc: Jan Kara <[email protected]>
Acked-by: Russell King <[email protected]> [arm part]
Cc: Daniel Thompson <[email protected]>
Cc: Jiri Kosina <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: David Miller <[email protected]>
Cc: Daniel Thompson <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Create files with prototypes and static inlines for braille support. Make
braille_console functions return 1 on success.
Corrected CONFIG_A11Y_BRAILLE_CONSOLE=n _braille_console_setup
return value to NULL.
Signed-off-by: Joe Perches <[email protected]>
Reviewed-by: Samuel Thibault <[email protected]>
Cc: Ming Lei <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Make it easier to break up printk into bite-sized chunks.
Remove printk path/filename from comment.
Signed-off-by: Joe Perches <[email protected]>
Cc: Samuel Thibault <[email protected]>
Cc: Ming Lei <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|