Age | Commit message (Collapse) | Author | Files | Lines |
|
If a superblock write hasn't happened (i.e. we never had to go rw), then
c->sb.version will be out of date w.r.t. c->disk_sb.sb->version.
Signed-off-by: Kent Overstreet <[email protected]>
|
|
turns out iterate_iovec() mutates __iov, we need to save our own copy
Signed-off-by: Kent Overstreet <[email protected]>
Reported-by: Marcin Mirosław <[email protected]>
|
|
peek_upto() checks against the end position and bails out before
FILTER_SNAPSHOTS checks; this is because if we end up at a different
inode number than the original search key none of the keys we see might
be visibile in the current snapshot - we might be looking at inode in a
completely different subvolume.
But this is broken, because when we're iterating over extents we're
checking against the extent start position to decide when to bail out,
and the extent start position isn't monotonically increasing until after
we've run FILTER_SNAPSHOTS.
Fix this by adding a simple inode number check where the old bailout
check was, and moving the main check to the correct position.
Signed-off-by: Kent Overstreet <[email protected]>
Reported-by: "Carl E. Thompson" <[email protected]>
|
|
I am stepping down as TJA11XX C45 maintainer.
Andrei Botila will take the responsibility to maintain and improve the
support for TJA11XX C45 PHYs.
Signed-off-by: Radu Pirea (NXP OSS) <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Add support for client processors starting from Kaby Lake.
Signed-off-by: Srinivas Pandruvada <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Hans de Goede <[email protected]>
Signed-off-by: Hans de Goede <[email protected]>
|
|
It seems traffic there is quite low and changes are often not related
to PDx86 anyhow. Besides that I have a lot of other stuff to do, I'm
rearly pay attention on these emails. Doesn't seem Daren to be active
either. With this in mind, remove (stale) section.
Note, it might be make sense to actually move that folder under PDx86
umbrella (in MAINTAINERS) if people find it suitable. That will reduce
burden on arch/x86 maintenance.
Signed-off-by: Andy Shevchenko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Hans de Goede <[email protected]>
Signed-off-by: Hans de Goede <[email protected]>
|
|
Some r8168 NICs stop working upon system resume:
[ 688.051096] r8169 0000:02:00.1 enp2s0f1: rtl_ep_ocp_read_cond == 0 (loop: 10, delay: 10000).
[ 688.175131] r8169 0000:02:00.1 enp2s0f1: Link is Down
...
[ 691.534611] r8169 0000:02:00.1 enp2s0f1: PCI error (cmd = 0x0407, status_errs = 0x0000)
Not sure if it's related, but those NICs have a BMC device at function
0:
02:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. Realtek RealManage BMC [10ec:816e] (rev 1a)
Trial and error shows that increase the loop wait on
rtl_ep_ocp_read_cond to 30 can eliminate the issue, so let
rtl8168ep_driver_start() to wait a bit longer.
Fixes: e6d6ca6e1204 ("r8169: Add support for another RTL8168FP")
Signed-off-by: Kai-Heng Feng <[email protected]>
Reviewed-by: Heiner Kallweit <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The freeing and re-allocation of algorithm are protected by cpool_mutex,
so it doesn't fix an actual use-after-free, but avoids a deserved
refcount_warn_saturate() warning.
A trivial fix for the racy behavior.
Fixes: 8c73b26315aa ("net/tcp: Prepare tcp_md5sig_pool for TCP-AO")
Suggested-by: Eric Dumazet <[email protected]>
Signed-off-by: Dmitry Safonov <[email protected]>
Tested-by: Bagas Sanjaya <[email protected]>
Reported-by: syzbot <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
m->data needs to be freed when em_text_destroy is called.
Fixes: d675c989ed2d ("[PKT_SCHED]: Packet classification based on textsearch (ematch)")
Acked-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: Hangyu Hua <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
|
|
When parsing emails from .yaml files in particular, stray punctuation
such as a leading '-' can end up in the name. For example, consider a
common YAML section such as:
maintainers:
- [email protected]
This would previously be processed by get_maintainer.pl as:
- <[email protected]>
Make the logic in clean_file_emails more robust by deleting any
sub-names which consist of common single punctuation marks before
proceeding to the best-effort name extraction logic. The output is then
correct:
[email protected]
Some additional comments are added to the function to make things
clearer to future readers.
Link: https://lore.kernel.org/all/[email protected]/
Suggested-by: Joe Perches <[email protected]>
Signed-off-by: Alvin Šipraga <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
While the script correctly extracts UTF-8 encoded names from the
MAINTAINERS file, the regular expressions damage my name when parsing
from .yaml files. Fix this by replacing the Latin-1-compatible regular
expressions with the unicode property matcher \p{L}, which matches on
any letter according to the Unicode General Category of letters.
The proposed solution only works if the script uses proper string
encoding from the outset, so instruct Perl to unconditionally open all
files with UTF-8 encoding. This should be safe, as the entire source
tree is either UTF-8 or ASCII encoded anyway. See [1] for a detailed
analysis.
Furthermore, to prevent the \w expression from matching non-ASCII when
checking for whether a name should be escaped with quotes, add the /a
flag to the regular expression. The escaping logic was duplicated in
two places, so it has been factored out into its own function.
The original issue was also identified on the tools mailing list [2].
This should solve the observed side effects there as well.
Link: https://lore.kernel.org/all/dzn6uco4c45oaa3ia4u37uo5mlt33obecv7gghj2l756fr4hdh@mt3cprft3tmq/ [1]
Link: https://lore.kernel.org/tools/20230726-gush-slouching-a5cd41@meerkat/ [2]
Signed-off-by: Alvin Šipraga <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
- Fix readers that are blocked on the ring buffer when buffer_percent
is 100%. They are supposed to wake up when the buffer is full, but
because the sub-buffer that the writer is on is never considered
"dirty" in the calculation, dirty pages will never equal nr_pages.
Add +1 to the dirty count in order to count for the sub-buffer that
the writer is on.
- When a reader is blocked on the "snapshot_raw" file, it is to be
woken up when a snapshot is done and be able to read the snapshot
buffer. But because the snapshot swaps the buffers (the main one with
the snapshot one), and the snapshot reader is waiting on the old
snapshot buffer, it was not woken up (because it is now on the main
buffer after the swap). Worse yet, when it reads the buffer after a
snapshot, it's not reading the snapshot buffer, it's reading the live
active main buffer.
Fix this by forcing a wakeup of all readers on the snapshot buffer
when a new snapshot happens, and then update the buffer that the
reader is reading to be back on the snapshot buffer.
- Fix the modification of the direct_function hash. There was a race
when new functions were added to the direct_function hash as when it
moved function entries from the old hash to the new one, a direct
function trace could be hit and not see its entry.
This is fixed by allocating the new hash, copy all the old entries
onto it as well as the new entries, and then use rcu_assign_pointer()
to update the new direct_function hash with it.
This also fixes a memory leak in that code.
- Fix eventfs ownership
* tag 'trace-v6.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
ftrace: Fix modification of direct_function hash while in use
tracing: Fix blocked reader of snapshot buffer
ring-buffer: Fix wake ups when buffer_percent is set to 100
eventfs: Fix file and directory uid and gid ownership
|
|
Directly return NULL or 'next' instead of breaking out of the loop.
Signed-off-by: David Laight <[email protected]>
[ Split original patch into two independent parts - Linus ]
Link: https://lore.kernel.org/lkml/[email protected]/
Signed-off-by: Linus Torvalds <[email protected]>
|
|
osq_wait_next() is passed 'prev' from osq_lock() and NULL from
osq_unlock() but only needs the 'cpu' value to write to lock->tail.
Just pass prev->cpu or OSQ_UNLOCKED_VAL instead.
Should have no effect on the generated code since gcc manages to assume
that 'prev != NULL' due to an earlier dereference.
Signed-off-by: David Laight <[email protected]>
[ Changed 'old' to 'old_cpu' by request from Waiman Long - Linus ]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
struct optimistic_spin_node is private to the implementation.
Move it into the C file to ensure nothing is accessing it.
Signed-off-by: David Laight <[email protected]>
Acked-by: Waiman Long <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Masami Hiramatsu reported a memory leak in register_ftrace_direct() where
if the number of new entries are added is large enough to cause two
allocations in the loop:
for (i = 0; i < size; i++) {
hlist_for_each_entry(entry, &hash->buckets[i], hlist) {
new = ftrace_add_rec_direct(entry->ip, addr, &free_hash);
if (!new)
goto out_remove;
entry->direct = addr;
}
}
Where ftrace_add_rec_direct() has:
if (ftrace_hash_empty(direct_functions) ||
direct_functions->count > 2 * (1 << direct_functions->size_bits)) {
struct ftrace_hash *new_hash;
int size = ftrace_hash_empty(direct_functions) ? 0 :
direct_functions->count + 1;
if (size < 32)
size = 32;
new_hash = dup_hash(direct_functions, size);
if (!new_hash)
return NULL;
*free_hash = direct_functions;
direct_functions = new_hash;
}
The "*free_hash = direct_functions;" can happen twice, losing the previous
allocation of direct_functions.
But this also exposed a more serious bug.
The modification of direct_functions above is not safe. As
direct_functions can be referenced at any time to find what direct caller
it should call, the time between:
new_hash = dup_hash(direct_functions, size);
and
direct_functions = new_hash;
can have a race with another CPU (or even this one if it gets interrupted),
and the entries being moved to the new hash are not referenced.
That's because the "dup_hash()" is really misnamed and is really a
"move_hash()". It moves the entries from the old hash to the new one.
Now even if that was changed, this code is not proper as direct_functions
should not be updated until the end. That is the best way to handle
function reference changes, and is the way other parts of ftrace handles
this.
The following is done:
1. Change add_hash_entry() to return the entry it created and inserted
into the hash, and not just return success or not.
2. Replace ftrace_add_rec_direct() with add_hash_entry(), and remove
the former.
3. Allocate a "new_hash" at the start that is made for holding both the
new hash entries as well as the existing entries in direct_functions.
4. Copy (not move) the direct_function entries over to the new_hash.
5. Copy the entries of the added hash to the new_hash.
6. If everything succeeds, then use rcu_pointer_assign() to update the
direct_functions with the new_hash.
This simplifies the code and fixes both the memory leak as well as the
race condition mentioned above.
Link: https://lore.kernel.org/all/170368070504.42064.8960569647118388081.stgit@devnote2/
Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
Cc: [email protected]
Cc: Mark Rutland <[email protected]>
Cc: Mathieu Desnoyers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Acked-by: Masami Hiramatsu (Google) <[email protected]>
Fixes: 763e34e74bb7d ("ftrace: Add register_ftrace_direct()")
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
In
https://lore.kernel.org/r/20231206110636.GBZXBVvCWj2IDjVk4c@fat_crate.local
I wanted to adjust the alternative patching debug output to the new
changes introduced by
da0fe6e68e10 ("x86/alternative: Add indirect call patching")
but removed the '*' which denotes the ->x86_capability word. The correct
output should be, for example:
[ 0.230071] SMP alternatives: feat: 11*32+15, old: (entry_SYSCALL_64_after_hwframe+0x5a/0x77 (ffffffff81c000c2) len: 16), repl: (ffffffff89ae896a, len: 5) flags: 0x0
while the incorrect one says "... 1132+15" currently.
Add back the '*'.
Fixes: da0fe6e68e10 ("x86/alternative: Add indirect call patching")
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Link: https://lore.kernel.org/r/20231206110636.GBZXBVvCWj2IDjVk4c@fat_crate.local
|
|
The HP Pavilion 14 ec1xxx series uses the HP mainboard 8A0F with the
ALC287 codec.
The mute led can be enabled using the already existing
ALC287_FIXUP_HP_GPIO_LED quirk.
Tested on an HP Pavilion ec1003AU
Signed-off-by: Aabish Malik <[email protected]>
Cc: <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>
|
|
During our experiment with zswap, we sometimes observe swap IOs due to
occasional zswap store failures and writebacks-to-swap. These swapping
IOs prevent many users who cannot tolerate swapping from adopting zswap to
save memory and improve performance where possible.
This patch adds the option to disable this behavior entirely: do not
writeback to backing swapping device when a zswap store attempt fail, and
do not write pages in the zswap pool back to the backing swap device (both
when the pool is full, and when the new zswap shrinker is called).
This new behavior can be opted-in/out on a per-cgroup basis via a new
cgroup file. By default, writebacks to swap device is enabled, which is
the previous behavior. Initially, writeback is enabled for the root
cgroup, and a newly created cgroup will inherit the current setting of its
parent.
Note that this is subtly different from setting memory.swap.max to 0, as
it still allows for pages to be stored in the zswap pool (which itself
consumes swap space in its current form).
This patch should be applied on top of the zswap shrinker series:
https://lore.kernel.org/linux-mm/[email protected]/
as it also disables the zswap shrinker, a major source of zswap
writebacks.
For the most part, this feature is motivated by internal parties who
have already established their opinions regarding swapping - the
workloads that are highly sensitive to IO, and especially those who are
using servers with really slow disk performance (for instance, massive
but slow HDDs). For these folks, it's impossible to convince them to
even entertain zswap if swapping also comes as a packaged deal.
Writeback disabling is quite a useful feature in these situations - on
a mixed workloads deployment, they can disable writeback for the more
IO-sensitive workloads, and enable writeback for other background
workloads.
For instance, on a server with HDD, I allocate memories and populate
them with random values (so that zswap store will always fail), and
specify memory.high low enough to trigger reclaim. The time it takes
to allocate the memories and just read through it a couple of times
(doing silly things like computing the values' average etc.):
zswap.writeback disabled:
real 0m30.537s
user 0m23.687s
sys 0m6.637s
0 pages swapped in
0 pages swapped out
zswap.writeback enabled:
real 0m45.061s
user 0m24.310s
sys 0m8.892s
712686 pages swapped in
461093 pages swapped out
(the last two lines are from vmstat -s).
[[email protected]: add a comment about recurring zswap store failures leading to reclaim inefficiency]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Nhat Pham <[email protected]>
Suggested-by: Johannes Weiner <[email protected]>
Reviewed-by: Yosry Ahmed <[email protected]>
Acked-by: Chris Li <[email protected]>
Cc: Dan Streetman <[email protected]>
Cc: David Heidelberg <[email protected]>
Cc: Domenico Cerasuolo <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Mike Rapoport (IBM) <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Roman Gushchin <[email protected]>
Cc: Sergey Senozhatsky <[email protected]>
Cc: Seth Jennings <[email protected]>
Cc: Shakeel Butt <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Vitaly Wool <[email protected]>
Cc: Zefan Li <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Under heavy traffic, the BlueField Gigabit interface can
become unresponsive. This is due to a possible race condition
in the mlxbf_gige_rx_packet function, where the function exits
with producer and consumer indices equal but there are remaining
packet(s) to be processed. In order to prevent this situation,
read receive consumer index *before* the HW replenish so that
the mlxbf_gige_rx_packet function returns an accurate return
value even if a packet is received into just-replenished buffer
prior to exiting this routine. If the just-replenished buffer
is received and occupies the last RX ring entry, the interface
would not recover and instead would encounter RX packet drops
related to internal buffer shortages since the driver RX logic
is not being triggered to drain the RX ring. This patch will
address and prevent this "ring full" condition.
Fixes: f92e1869d74e ("Add Mellanox BlueField Gigabit Ethernet driver")
Reviewed-by: Asmaa Mnebhi <[email protected]>
Signed-off-by: David Thompson <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Remove duplicate definitions, no functional changes.
Link: https://lkml.kernel.org/r/MW4PR84MB3145459ADC7EB38BBB36955B8198A@MW4PR84MB3145.NAMPRD84.PROD.OUTLOOK.COM
Signed-off-by: Youling Tang <[email protected]>
Reported-by: Huacai Chen <[email protected]>
Acked-by: Baoquan He <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
When the kernel log is acquired over a serial cable it is not uncommon for
the log to contain carriage return characters, in addition to the expected
line feeds.
When this output is feed into decode_stacktrace.sh, handle_line() fails to
strip the trailing ']' off the module name, which results in find_module()
not being able to find the referred to kernel module. This is reported to
the user as:
WARNING! Modules path isn't set, but is needed to parse this symbol
The solution is to reconfigure the serial port, or to strip the carriage
returns from the log, but this isn't obvious from the error reported by
the script.
Instead, make decode_stacktrace.sh more user friendly by stripping the
trailing carriage return.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Bjorn Andersson <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
If, as part of handling a hardlockup or softlockup, we've already dumped
all CPUs and we're just about to panic, don't reenable dumping and give
some other CPU a chance to hop in there and add some confusing logs right
as the panic is happening.
Link: https://lkml.kernel.org/r/20231220131534.4.Id3a9c7ec2d7d83e4080da6f8662ba2226b40543f@changeid
Signed-off-by: Douglas Anderson <[email protected]>
Cc: John Ogness <[email protected]>
Cc: Lecopzer Chen <[email protected]>
Cc: Li Zhe <[email protected]>
Cc: Petr Mladek <[email protected]>
Cc: Pingfan Liu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
If two CPUs end up reporting a hardlockup at the same time then their logs
could get interleaved which is hard to read.
The interleaving problem was especially bad with the "perf" hardlockup
detector where the locked up CPU is always the same as the running CPU and
we end up in show_regs(). show_regs() has no inherent serialization so we
could mix together two crawls if two hardlockups happened at the same time
(and if we didn't have `sysctl_hardlockup_all_cpu_backtrace` set). With
this change we'll fully serialize hardlockups when using the "perf"
hardlockup detector.
The interleaving problem was less bad with the "buddy" hardlockup
detector. With "buddy" we always end up calling
`trigger_single_cpu_backtrace(cpu)` on some CPU other than the running
one. trigger_single_cpu_backtrace() always at least serializes the
individual stack crawls because it eventually uses
printk_cpu_sync_get_irqsave(). Unfortunately the fact that
trigger_single_cpu_backtrace() eventually calls
printk_cpu_sync_get_irqsave() (on a different CPU) means that we have to
drop the "lock" before calling it and we can't fully serialize all
printouts associated with a given hardlockup. However, we still do get
the advantage of serializing the output of print_modules() and
print_irqtrace_events().
Aside from serializing hardlockups from each other, this change also has
the advantage of serializing hardlockups and softlockups from each other
if they happen to happen at the same time since they are both using the
same "lock".
Even though nobody is expected to hang while holding the lock associated
with printk_cpu_sync_get_irqsave(), out of an abundance of caution, we
don't call printk_cpu_sync_get_irqsave() until after we print out about
the hardlockup. This makes extra sure that, even if
printk_cpu_sync_get_irqsave() somehow never runs we at least print that we
saw the hardlockup. This is different than the choice made for softlockup
because hardlockup is really our last resort.
Link: https://lkml.kernel.org/r/20231220131534.3.I6ff691b3b40f0379bc860f80c6e729a0485b5247@changeid
Signed-off-by: Douglas Anderson <[email protected]>
Reviewed-by: John Ogness <[email protected]>
Cc: Lecopzer Chen <[email protected]>
Cc: Li Zhe <[email protected]>
Cc: Petr Mladek <[email protected]>
Cc: Pingfan Liu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Instead of introducing a spinlock, use printk_cpu_sync_get_irqsave() and
printk_cpu_sync_put_irqrestore() to serialize softlockup reporting. Alone
this doesn't have any real advantage over the spinlock, but this will
allow us to use the same function in a future change to also serialize
hardlockup crawls.
NOTE: for the most part this serialization is important because we often
end up in the show_regs() path and that has no built-in serialization if
there are multiple callers at once. However, even in the case where we
end up in the dump_stack() path this still has some advantages because the
stack will be guaranteed to be together in the logs with the lockup
message with no interleaving.
NOTE: the fact that printk_cpu_sync_get_irqsave() is allowed to be called
multiple times on the same CPU is important here. Specifically we hold
the "lock" while calling dump_stack() which also gets the same "lock".
This is explicitly documented to be OK and means we don't need to
introduce a variant of dump_stack() that doesn't grab the lock.
Link: https://lkml.kernel.org/r/20231220131534.2.Ia5906525d440d8e8383cde31b7c61c2aadc8f907@changeid
Signed-off-by: Douglas Anderson <[email protected]>
Reviewed-by: John Ogness <[email protected]>
Reviewed-by: Li Zhe <[email protected]>
Cc: Lecopzer Chen <[email protected]>
Cc: Petr Mladek <[email protected]>
Cc: Pingfan Liu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Patch series "watchdog: Better handling of concurrent lockups".
When we get multiple lockups at roughly the same time, the output in the
kernel logs can be very confusing since the reports about the lockups end
up interleaved in the logs. There is some code in the kernel to try to
handle this but it wasn't that complete.
Li Zhe recently made this a bit better for softlockups (specifically for
the case where `kernel.softlockup_all_cpu_backtrace` is not set) in commit
9d02330abd3e ("softlockup: serialized softlockup's log"), but that only
handled softlockup reports. Hardlockup reports still had similar issues.
This series also has a small fix to avoid dumping all stacks a second time
in the case of a panic. This is a bit unrelated to the interleaving fixes
but it does also improve the clarity of lockup reports.
This patch (of 4):
The hardlockup detector and softlockup detector both have the ability to
dump the stack of all CPUs (`kernel.hardlockup_all_cpu_backtrace` and
`kernel.softlockup_all_cpu_backtrace`). Both detectors also have some
logic to attempt to avoid interleaving printouts if two CPUs were trying
to do dumps of all CPUs at the same time. However:
- The hardlockup detector's logic still allowed interleaving some
information. Specifically another CPU could print modules and dump
the stack of the locked CPU at the same time we were dumping all
CPUs.
- In the case where `kernel.hardlockup_panic` was set in addition to
`kernel.hardlockup_all_cpu_backtrace`, when two CPUs both detected
hardlockups at the same time the second CPU could call panic() while
the first was still dumping stacks. This was especially bad if the
locked up CPU wasn't responding to the request for a backtrace since
the function nmi_trigger_cpumask_backtrace() can wait up to 10
seconds.
Let's resolve this by adopting the softlockup logic in the hardlockup
handler.
NOTES:
- As part of this, one might think that we should make a helper
function that both the hard and softlockup detectors call. This
turns out not to be super trivial since it would have to be
parameterized quite a bit since there are separate global variables
controlling each lockup detector and they print log messages that
are just different enough that it would be a pain. We probably don't
want to change the messages that are printed without good reason to
avoid throwing log parsers for a loop.
- One might also think that it would be a good idea to have the
hardlockup and softlockup detector use the same global variable to
prevent interleaving. This would make sure that softlockups and
hardlockups can't interleave each other. That _almost_ works but has
a dangerous flaw if `kernel.hardlockup_panic` is not the same as
`kernel.softlockup_panic` because we might skip a call to panic() if
one type of lockup was detected at the same time as another.
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/20231220131534.1.I4f35a69fbb124b5f0c71f75c631e11fabbe188ff@changeid
Signed-off-by: Douglas Anderson <[email protected]>
Cc: John Ogness <[email protected]>
Cc: Lecopzer Chen <[email protected]>
Cc: Li Zhe <[email protected]>
Cc: Petr Mladek <[email protected]>
Cc: Pingfan Liu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
image->control_page represents the starting address for allocating the
next control page, while hole_end represents the address of the last valid
byte of the currently allocated control page.
This bug actually does not affect the correctness of allocating control
pages, because image->control_page is currently only used in
kimage_alloc_crash_control_pages(), and this function, when allocating
control pages, will first align image->control_page up to the nearest
`(1 << order) << PAGE_SHIFT` boundary, then use this value as the
starting address of the next control page. This ensures that the newly
allocated control page will use the correct starting address and not
overlap with previously allocated control pages.
Although it does not affect the correctness of the final result, it is
better for us to set image->control_page to the correct value, in case
it might be used elsewhere in the future, potentially causing errors.
Therefore, after successfully allocating a control page,
image->control_page should be updated to `hole_end + 1`, rather than
hole_end.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Yuntao Wang <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
kernel_ident_mapping_init() takes an exclusive memory range [pstart, pend)
where pend is not included in the range, while res represents an inclusive
memory range [start, end] where end is considered part of the range.
Passing [start, end] rather than [start, end+1) to
kernel_ident_mapping_init() may result in the identity mapping for the
end address not being set up.
For example, when res->start is equal to res->end,
kernel_ident_mapping_init() will not establish any identity mapping.
Similarly, when the value of res->end is a multiple of 2M and the page
table maps 2M pages, kernel_ident_mapping_init() will also not set up
identity mapping for res->end.
Therefore, passing res->end directly to kernel_ident_mapping_init() is
incorrect, the correct end address should be `res->end + 1`.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Yuntao Wang <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Borislav Petkov (AMD) <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Simon Horman <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
asm-generic/io.h can be replaced with linux/io.h and the file will still
build correctly. It is an asm-generic file which should be avoided if
possible.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Tanzir Hasan <[email protected]>
Suggested-by: Al Viro <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Correct the function parameter names for nilfs_cpfile_get_info():
cpfile.c:564: warning: Function parameter or member 'cnop' not described in 'nilfs_cpfile_get_cpinfo'
cpfile.c:564: warning: Function parameter or member 'mode' not described in 'nilfs_cpfile_get_cpinfo'
cpfile.c:564: warning: Function parameter or member 'buf' not described in 'nilfs_cpfile_get_cpinfo'
cpfile.c:564: warning: Function parameter or member 'cisz' not described in 'nilfs_cpfile_get_cpinfo'
cpfile.c:564: warning: Excess function parameter 'cno' description in 'nilfs_cpfile_get_cpinfo'
cpfile.c:564: warning: Excess function parameter 'ci' description in 'nilfs_cpfile_get_cpinfo'
Also add missing descriptions of the function's specification.
[ [email protected]: filled in missing descriptions ]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Randy Dunlap <[email protected]>
Signed-off-by: Ryusuke Konishi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Change @task to @tsk to prevent kernel-doc warnings:
kernel/stacktrace.c:138: warning: Excess function parameter 'task' description in 'stack_trace_save_tsk'
kernel/stacktrace.c:138: warning: Function parameter or member 'tsk' not described in 'stack_trace_save_tsk'
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Randy Dunlap <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
When I use older version aarch64 objdump (2.24) to disassemble aarch64
vmlinux, I get the result like below. There is no space between sp and
offset.
ffff800008010000 <dw_apb_ictl_handle_irq>:
ffff800008010000: d503233f hint #0x19
ffff800008010004: a9bc7bfd stp x29, x30, [sp,#-64]!
ffff800008010008: 90011e60 adrp x0, ffff80000a3dc000 <num_ictlrs>
ffff80000801000c: 910003fd mov x29, sp
ffff800008010010: a9025bf5 stp x21, x22, [sp,#32]
When I use newer version aarch64 objdump (2.35), I get
the result like below.
There is a space between sp and offset.
ffff800008010000 <dw_apb_ictl_handle_irq>:
ffff800008010000: d503233f paciasp
ffff800008010004: a9bc7bfd stp x29, x30, [sp, #-64]!
ffff800008010008: 90011e60 adrp x0, ffff80000a3dc000 <num_ictlrs>
ffff80000801000c: 910003fd mov x29, sp
ffff800008010010: a9025bf5 stp x21, x22, [sp, #32]
Add no space support of regular expression for old version objdump.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kuan-Ying Lee <[email protected]>
Cc: Casper Li <[email protected]>
Cc: AngeloGioacchino Del Regno <[email protected]>
Cc: Chinwen Chang <[email protected]>
Cc: Matthias Brugger <[email protected]>
Cc: Qun-Wei Lin <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
kexec_dprintk() expects the last argument to be kbuf.memsz, but the actual
argument being passed is kbuf.bufsz.
Although these two values are currently equal, it is better to pass the
correct one, in case these two values become different in the future.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Yuntao Wang <[email protected]>
Cc: Baoquan He <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
When detecting an error, the current code uses kexec_dprintk() to output
log message. This is not quite appropriate as kexec_dprintk() is mainly
used for outputting debugging messages, rather than error messages.
Replace kexec_dprintk() with pr_err(). This also makes the output method
for this error log align with the output method for other error logs in
this function.
Additionally, the last return statement in set_page_address() is
unnecessary, remove it.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Yuntao Wang <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Borislav Petkov (AMD) <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
The kernel thread function nilfs_segctor_thread() invokes the
try_to_freeze() in its loop. But all the kernel threads are non-freezable
by default. So if we want to make a kernel thread to be freezable, we
have to invoke set_freezable() explicitly.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kevin Hao <[email protected]>
Signed-off-by: Ryusuke Konishi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Documentation/filesystems/relay.rst says to use
return debugfs_create_file(filename, mode, parent, buf,
&relay_file_operations);
and this is the only way relay_file_operations is used.
Thus: debugfs_create_file(&relay_file_operations)
-> __debugfs_create_file(&debugfs_full_proxy_file_operations,
&relay_file_operations)
-> dentry{inode: {i_fop: &debugfs_full_proxy_file_operations},
d_fsdata: &relay_file_operations
| DEBUGFS_FSDATA_IS_REAL_FOPS_BIT}
debugfs_full_proxy_file_operations.open is full_proxy_open, which extracts
the &relay_file_operations from the dentry, and allocates via
__full_proxy_fops_init() new fops, with trivial wrappers around release,
llseek, read, write, poll, and unlocked_ioctl, then replaces the fops on
the opened file therewith.
Naturally, all thusly-created debugfs files have .splice_read = NULL.
This was introduced in commit 49d200deaa68 ("debugfs: prevent access to
removed files' private data") from 2016-03-22.
AFAICT, relay_file_operations is the only struct file_operations used for
debugfs which defines a .splice_read callback. Hooking it up with
> diff --git a/fs/debugfs/file.c b/fs/debugfs/file.c
> index 5063434be0fc..952fcf5b2afa 100644
> --- a/fs/debugfs/file.c
> +++ b/fs/debugfs/file.c
> @@ -328,6 +328,11 @@ FULL_PROXY_FUNC(write, ssize_t, filp,
> loff_t *ppos),
> ARGS(filp, buf, size, ppos));
>
> +FULL_PROXY_FUNC(splice_read, long, in,
> + PROTO(struct file *in, loff_t *ppos, struct pipe_inode_info *pipe,
> + size_t len, unsigned int flags),
> + ARGS(in, ppos, pipe, len, flags));
> +
> FULL_PROXY_FUNC(unlocked_ioctl, long, filp,
> PROTO(struct file *filp, unsigned int cmd, unsigned long arg),
> ARGS(filp, cmd, arg));
> @@ -382,6 +387,8 @@ static void __full_proxy_fops_init(struct file_operations *proxy_fops,
> proxy_fops->write = full_proxy_write;
> if (real_fops->poll)
> proxy_fops->poll = full_proxy_poll;
> + if (real_fops->splice_read)
> + proxy_fops->splice_read = full_proxy_splice_read;
> if (real_fops->unlocked_ioctl)
> proxy_fops->unlocked_ioctl = full_proxy_unlocked_ioctl;
> }
shows it just doesn't work, and splicing always instantly returns empty
(subsequent reads actually return the contents).
No-one noticed it became dead code in 2016, who knows if it worked back
then. Clearly no-one cares; just delete it.
Link: https://lkml.kernel.org/r/dtexwpw6zcdx7dkx3xj5gyjp5syxmyretdcbcdtvrnukd4vvuh@tarta.nabijaczleweli.xyz
Signed-off-by: Ahelenia Ziemiańska <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Li kunyu <[email protected]>
Cc: Mike Rapoport (IBM) <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: Suren Baghdasaryan <[email protected]>
Cc: Zhang Zhengming <[email protected]>
Cc: Zhao Lei <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
After commit 7dfbea4c468c ("scripts: remove namespace.pl"),
scripts/namespace.pl has been removed from the kernel, and "make
namespacecheck" has been removed from the English version of
submit-checklist.rst, so also remove it in the related translations.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Tiezhu Yang <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
According to Documentation/process/submit-checklist.rst, checkstack does
not point out problems explicitly, but any one function that uses more
than 512 bytes on the stack is a candidate for change, hence it is better
to omit any stack frame sizes smaller than 512 bytes, just change
min_stack to 512 by default.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Tiezhu Yang <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
For some unknown reason the regular expression for checkstack only matches
three digit numbers starting with the number "3", or any higher number.
Which means that it skips any stack sizes smaller than 304 bytes. This
makes the checkstack script a bit less useful than it could be.
Change the script to match any number. To be filtered out stack sizes can
be configured with the min_stack variable, which omits any stack frame
sizes smaller than 100 bytes by default.
This is similar with commit aab1f809d754 ("scripts/checkstack.pl: match
all stack sizes for s390").
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Tiezhu Yang <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
After commit 572220aad525 ("scripts/checkstack.pl: Add argument to print
stacks greather than value."), it is appropriate to add min_stack to the
usage comment, then the users know explicitly that "min_stack" can be
specified like "arch".
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Tiezhu Yang <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Patch series "Modify some code about checkstack".
This patch (of 5):
After commit cf8e8658100d ("arch: Remove Itanium (IA-64) architecture"),
the ia64 port has been removed from the kernel, so also remove the ia64
specific bits from the checkstack.pl script.
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Tiezhu Yang <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
crc_ccitt_false() was introduced in commit 0d85adb5fbd33 ("lib/crc-ccitt:
Add CCITT-FALSE CRC16 variant"), but it is redundant with crc_itu_t().
Since the latter is more used, it is the one being kept.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Mathis Marion <[email protected]>
Cc: Andrey Smirnov <[email protected]>
Cc: Andrey Vostrikov <[email protected]>
Cc: Jérôme Pouiller <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
DEBUG_STACK_USAGE doesn't only have an influence on the output of sysrq-T
and sysrq-P, it also enables a message at process exit. See
check_stack_usage() in kernel/exit.c where this is implemented.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Uwe Kleine-König <[email protected]>
Cc: Douglas Anderson <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Marco Elver <[email protected]>
Cc: "Paul E. McKenney" <[email protected]>
Cc: Pengutronix Kernel Team <[email protected]>
Cc: Petr Mladek <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Stephen Boyd <[email protected]>
Cc: Zhaoyang Huang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
scripts/checkstack.pl lacks support for the loongarch architecture. Add
support to detect "addi.{w,d} $sp, $sp, -FRAME_SIZE" stack frame
generation instruction.
Link: https://lkml.kernel.org/r/MW4PR84MB314514273F0B7DBCC5E35A978192A@MW4PR84MB3145.NAMPRD84.PROD.OUTLOOK.COM
Signed-off-by: Youling Tang <[email protected]>
Acked-by: Huacai Chen <[email protected]>
Cc: John Paul Adrian Glaubitz <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
An example how to invoke decodecode for loongarch64:
$ echo 'Code: 380839f6 380831f9 28412bae <24000ca6>
004081ad 0014cb50 004083e8 02bff34c 58008e91' | \
ARCH=loongarch CROSS_COMPILE=loongarch64-linux-gnu- \
./scripts/decodecode
Code: 380839f6 380831f9 28412bae <24000ca6> 004081ad 0014cb50 004083e8 02bff34c 58008e91
All code
========
0: 380839f6 ldx.w $fp, $t3, $t2
4: 380831f9 ldx.w $s2, $t3, $t0
8: 28412bae ld.h $t2, $s6, 74(0x4a)
c:* 24000ca6 ldptr.w $a2, $a1, 12(0xc) <-- trapping instruction
10: 004081ad slli.w $t1, $t1, 0x0
14: 0014cb50 and $t4, $s3, $t6
18: 004083e8 slli.w $a4, $s8, 0x0
1c: 02bff34c addi.w $t0, $s3, -4(0xffc)
20: 58008e91 beq $t8, $t5, 140(0x8c) # 0xac
Code starting with the faulting instruction
===========================================
0: 24000ca6 ldptr.w $a2, $a1, 12(0xc)
4: 004081ad slli.w $t1, $t1, 0x0
8: 0014cb50 and $t4, $s3, $t6
c: 004083e8 slli.w $a4, $s8, 0x0
10: 02bff34c addi.w $t0, $s3, -4(0xffc)
14: 58008e91 beq $t8, $t5, 140(0x8c) # 0xa0
Link: https://lkml.kernel.org/r/MW4PR84MB3145B99B9677BB7887BB26CD8192A@MW4PR84MB3145.NAMPRD84.PROD.OUTLOOK.COM
Signed-off-by: Youling Tang <[email protected]>
Acked-by: Huacai Chen <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
temp_end represents the address of the last available byte. Therefore,
the starting address of the memory segment with temp_end as its last
available byte and a size of `kbuf->memsz`, that is, the value of
temp_start, should be `temp_end - kbuf->memsz + 1` instead of `temp_end -
kbuf->memsz`.
Additionally, use the ALIGN_DOWN macro instead of open-coding it directly
in locate_mem_hole_top_down() to improve code readability.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Yuntao Wang <[email protected]>
Acked-by: Baoquan He <[email protected]>
Cc: Borislav Petkov (AMD) <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
The end parameter received by kimage_is_destination_range() should be the
last valid byte address of the target memory segment plus 1. However, in
the locate_mem_hole_bottom_up() and locate_mem_hole_top_down() functions,
the corresponding value passed to kimage_is_destination_range() is the
last valid byte address of the target memory segment, which is 1 less.
There are two ways to fix this bug. We can either correct the logic of
the locate_mem_hole_bottom_up() and locate_mem_hole_top_down() functions,
or we can fix kimage_is_destination_range() by making the end parameter
represent the last valid byte address of the target memory segment. Here,
we choose the second approach.
Due to the modification to kimage_is_destination_range(), we also need to
adjust its callers, such as kimage_alloc_normal_control_pages() and
kimage_alloc_page().
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Yuntao Wang <[email protected]>
Acked-by: Baoquan He <[email protected]>
Cc: Borislav Petkov (AMD) <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Commit 62c46d55688894 ("MAINTAINERS: Removing Ohad from remoteproc/rpmsg
maintenance") removes his MAINTAINERS entry in regards to remoteproc
subsystem due to his inactivity (the last commit with his Signed-off-by is
99c429cb4e628e ("remoteproc/wkup_m3: Use MODULE_DEVICE_TABLE to export
alias") which is authored in 2015 and his last LKML message prior to
62c46d55688894 was [1]).
Remove also his MAINTAINERS entry for hwspinlock subsystem as there is no
point of Cc'ing maintainers who never respond in a long time.
[1]: https://lore.kernel.org/r/CAK=Wgbbcyi36ef1-PV8VS=M6nFoQnFGUDWy6V7OCnkt0dDrtfg@mail.gmail.com/
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Bagas Sanjaya <[email protected]>
Acked-by: Ohad Ben Cohen <[email protected]>
Cc: Baolin Wang <[email protected]>
Cc: Bjorn Andersson <[email protected]>
Cc: Mathieu Poirier <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
asm-generic/mman-common.h can be replaced by linux/mman.h and the file
will still build correctly. It is an asm-generic file which should be
avoided if possible.
Link: https://lkml.kernel.org/r/[email protected]
Fixes: 6dea8add4d28 ("mm/damon/vaddr: support DAMON-based Operation Schemes")
Signed-off-by: Tanzir Hasan <[email protected]>
Suggested-by: Al Viro <[email protected]>
Reviewed-by: SeongJae Park <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|