|
Currently, if someone modprobes and rmmods rcuscale successfully, but
the next run errors out during the modprobe, non-NULL pointers to freed
memory will remain. If the run after that also errors out during the
modprobe, there will be double-free bugs.
This commit therefore NULLs out top-level pointers to memory that has
just been freed.
Signed-off-by: "Paul E. McKenney" <[email protected]>
Signed-off-by: Neeraj Upadhyay <[email protected]>
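
The free-then-NULL pattern described above can be sketched in user-space C. This is only an illustration under stated assumptions: the pointer names, demo_init(), and demo_cleanup() are hypothetical stand-ins, and free()/calloc() stand in for the kernel allocators; it is not the rcuscale code itself.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for a module's top-level allocations. */
static int *writer_durations;
static int *writer_done;

/* Free one top-level pointer and clear it so a later cleanup is harmless. */
static void free_and_clear(int **pp)
{
	free(*pp);	/* free(NULL) is a no-op in standard C */
	*pp = NULL;	/* no stale pointer survives into the next run */
}

/* Cleanup that is safe to call repeatedly, even after a failed init. */
static void demo_cleanup(void)
{
	free_and_clear(&writer_durations);
	free_and_clear(&writer_done);
}

/* Init that may fail partway; returns 0 on success, -1 on failure. */
static int demo_init(size_t nwriters, int fail_early)
{
	if (fail_early)
		goto fail;	/* error before any allocation happens */
	writer_durations = calloc(nwriters, sizeof(*writer_durations));
	writer_done = calloc(nwriters, sizeof(*writer_done));
	if (!writer_durations || !writer_done)
		goto fail;
	return 0;
fail:
	demo_cleanup();
	return -1;
}
```

Because every pointer is cleared right after it is freed, two consecutive failing demo_init() calls after a successful init/cleanup cycle never hand the same address to free() twice.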
|
|
The rcu_scale_writer() function needs only a fixed number of rcu_head
structures per kthread, which means that a trivial allocator suffices.
This commit therefore uses an llist-based allocator using a fixed array of
structures per kthread. This allows aggressive testing of RCU performance
without stressing the slab allocators.
Signed-off-by: "Paul E. McKenney" <[email protected]>
Signed-off-by: Neeraj Upadhyay <[email protected]>
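
A rough user-space sketch of a trivial allocator over a fixed per-thread array follows. It is an assumption-laden simplification: the kernel's llist is lock-free, whereas this version uses a plain single-threaded free list, and struct pool_node, struct writer_pool, and NODES_PER_WRITER are hypothetical names chosen for illustration.

```c
#include <stddef.h>

#define NODES_PER_WRITER 8	/* hypothetical fixed pool size */

struct pool_node {
	struct pool_node *next;	/* free-list linkage */
	int payload;		/* stand-in for the embedded rcu_head */
};

struct writer_pool {
	struct pool_node nodes[NODES_PER_WRITER];	/* fixed backing array */
	struct pool_node *free_list;			/* head of the free list */
};

/* Thread every array element onto the free list. */
static void pool_init(struct writer_pool *wp)
{
	wp->free_list = NULL;
	for (size_t i = 0; i < NODES_PER_WRITER; i++) {
		wp->nodes[i].next = wp->free_list;
		wp->free_list = &wp->nodes[i];
	}
}

/* Pop one node, or return NULL when the fixed pool is exhausted. */
static struct pool_node *pool_alloc(struct writer_pool *wp)
{
	struct pool_node *n = wp->free_list;

	if (n)
		wp->free_list = n->next;
	return n;
}

/* Push a node back; never touches the general-purpose allocator. */
static void pool_free(struct writer_pool *wp, struct pool_node *n)
{
	n->next = wp->free_list;
	wp->free_list = n;
}
```

Allocation and release are each a couple of pointer updates, so a test loop built this way exercises the code under test rather than the memory allocator.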
|
|
Under some conditions, kmalloc(GFP_KERNEL) allocations have been
observed to repeatedly fail. This situation has been observed to
cause one of the rcu_scale_writer() instances to loop indefinitely
retrying memory allocation for an asynchronous grace-period primitive.
The problem is that if memory is short, all the other instances will
allocate all available memory before the looping task is awakened from
its rcu_barrier*() call. This in turn results in hangs, so that rcuscale
fails to complete.
This commit therefore removes the tight retry loop, so that when this
condition occurs, the affected task is still passing through the full
loop with its full set of termination checks. This spreads the risk
of indefinite memory-allocation retry failures across all instances of
rcu_scale_writer() tasks, which in turn prevents the hangs.
Signed-off-by: "Paul E. McKenney" <[email protected]>
Signed-off-by: Neeraj Upadhyay <[email protected]>
|
|
This commit causes all writer tasks to provide a brief report after a
hang has been reported, spaced at one-second intervals.
Signed-off-by: "Paul E. McKenney" <[email protected]>
Signed-off-by: Neeraj Upadhyay <[email protected]>
|
|
Currently, if the rcuscale module's async module parameter is specified
for RCU implementations that do not have async primitives such as RCU
Tasks Rude (which now lacks a call_rcu_tasks_rude() function), there
will be a series of splats due to calls to a NULL pointer. This commit
therefore warns of this situation, but switches to non-async testing.
Signed-off-by: "Paul E. McKenney" <[email protected]>
Signed-off-by: Neeraj Upadhyay <[email protected]>
|
|
The commit 2d7f00b2f0130 ("rcu: Suppress smp_processor_id() complaint
in synchronize_rcu_expedited_wait()") disabled preemption around
dump_cpu_task() to suppress the warning about its use within preemptible context.
Calling dump_cpu_task() does not require a non-preemptible context,
except to suppress the smp_processor_id() warning.
Because smp_processor_id() is evaluated along with in_hardirq()
to check whether it is in interrupt context, this patch removes the need
to disable preemption by reordering the condition so that
smp_processor_id() is evaluated only when in interrupt context.
Signed-off-by: Ryo Takakura <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Signed-off-by: Neeraj Upadhyay <[email protected]>
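
The reordering relies on C's short-circuit evaluation of &&. A minimal user-space sketch, assuming a hypothetical fake_processor_id() stand-in that merely counts how often it is called (the real smp_processor_id() and in_hardirq() are not modeled here):

```c
#include <stdbool.h>

static int cpu_id_evaluations;	/* counts calls to the stand-in below */

/* Stand-in for smp_processor_id(); always reports CPU 0 in this sketch. */
static int fake_processor_id(void)
{
	cpu_id_evaluations++;
	return 0;
}

/*
 * The interrupt-context test comes first, so the CPU id is read only
 * when that test is true; otherwise && skips the right-hand operand.
 */
static bool is_current_cpu_in_irq(bool in_irq, int cpu)
{
	return in_irq && cpu == fake_processor_id();
}
```

With in_irq false, the counter stays at zero, which is the property that makes the surrounding preemption disabling unnecessary.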
|
|
During CSD-lock stalls, the additional information output by expedited
RCU CPU stall warnings is usually redundant, flooding the console for
no good reason. However, this has been the way things work for a few
years. This commit therefore uses the rcutree.csd_lock_suppress_rcu_stall
kernel boot parameter to cause expedited RCU CPU stall warnings to
be abbreviated to a single line when there is at least one CPU that has
been stuck waiting for CSD lock for more than five seconds.
Signed-off-by: Paul E. McKenney <[email protected]>
Signed-off-by: Neeraj Upadhyay <[email protected]>
|
|
synchronize_rcu_expedited_wait()
This commit extracts the RCU CPU stall-warning report code from
synchronize_rcu_expedited_wait() and places it in a new function named
synchronize_rcu_expedited_stall(). This is strictly a code-movement
commit. A later commit will use this reorganization to avoid printing
expedited RCU CPU stall warnings while there are ongoing CSD-lock stall
reports.
Signed-off-by: Paul E. McKenney <[email protected]>
Signed-off-by: Neeraj Upadhyay <[email protected]>
|
|
During CSD-lock stalls, the additional information output by RCU CPU
stall warnings is usually redundant, flooding the console for no good
reason. However, this has been the way things work for a few years.
This commit therefore adds an rcutree.csd_lock_suppress_rcu_stall kernel
boot parameter that causes RCU CPU stall warnings to be abbreviated to
a single line when there is at least one CPU that has been stuck waiting
for CSD lock for more than five seconds.
To make this abbreviated message happen with decent probability:
tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration 8 \
--configs "2*TREE01" --kconfig "CONFIG_CSD_LOCK_WAIT_DEBUG=y" \
--bootargs "csdlock_debug=1 rcutorture.stall_cpu=200 \
rcutorture.stall_cpu_holdoff=120 rcutorture.stall_cpu_irqsoff=1 \
rcutree.csd_lock_suppress_rcu_stall=1 \
rcupdate.rcu_exp_cpu_stall_timeout=5000" --trust-make
[ paulmck: Apply kernel test robot feedback. ]
Signed-off-by: Paul E. McKenney <[email protected]>
Signed-off-by: Neeraj Upadhyay <[email protected]>
|
|
About 40% of all csd_lock warnings observed in our fleet appear to
be due to sched_clock() going backward in time (usually only a little
bit), resulting in ts0 being larger than ts2.
When the local CPU is at fault, we should print out a message reflecting
that, rather than trying to get the remote CPU's stack trace.
Signed-off-by: Rik van Riel <[email protected]>
Tested-by: "Paul E. McKenney" <[email protected]>
Signed-off-by: Neeraj Upadhyay <[email protected]>
|
|
Currently, the CSD-lock diagnostics in CONFIG_CSD_LOCK_WAIT_DEBUG=y
kernels are emitted at five-second intervals. Although this has proven
to be a good time interval for the first diagnostic, if the target CPU
keeps interrupts disabled for way longer than five seconds, the ratio
of pointless repetition to useful new information increases considerably.
Therefore, back off the time period for repeated reports of the same
incident, increasing linearly with the number of reports and logarithmically
with the number of online CPUs.
[ paulmck: Apply Dan Carpenter feedback. ]
Signed-off-by: Paul E. McKenney <[email protected]>
Cc: Imran Khan <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Leonardo Bras <[email protected]>
Cc: "Peter Zijlstra (Intel)" <[email protected]>
Cc: Rik van Riel <[email protected]>
Reviewed-by: Rik van Riel <[email protected]>
Signed-off-by: Neeraj Upadhyay <[email protected]>
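
One way such a backoff could be computed is sketched below. The exact expression is an assumption made for illustration, not a quotation of the patch: next_report_threshold_ns() and floor_log2() are hypothetical helpers, and the halving of the logarithm is an arbitrary damping choice.

```c
#include <stdint.h>

/* Integer floor(log2(n)) for n >= 1; a stand-in for the kernel's ilog2(). */
static unsigned int floor_log2(unsigned int n)
{
	unsigned int r = 0;

	while (n >>= 1)
		r++;
	return r;
}

/*
 * Hypothetical threshold: time that must elapse since the previous report
 * before the next one is printed. The first report uses the base interval
 * unchanged; later ones scale linearly with the number of reports so far
 * and logarithmically with the number of online CPUs.
 */
static uint64_t next_report_threshold_ns(uint64_t base_ns,
					 unsigned int nreports,
					 unsigned int ncpus)
{
	uint64_t cpu_factor = nreports ? floor_log2(ncpus) / 2 + 1 : 1;

	return base_ns * (nreports + 1) * cpu_factor;
}
```

Under these assumptions, a five-second base on a 64-CPU system gives five seconds for the first report and forty seconds before the second.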
|
|
If a CSD-lock stall goes on long enough, it will cause an RCU CPU
stall warning. This additional warning provides much additional
console-log traffic and little additional information. Therefore,
provide a new csd_lock_is_stuck() function that returns true if there
is an ongoing CSD-lock stall. This function will be used by the RCU
CPU stall warnings to provide a one-line indication of the stall when
this function returns true.
[ neeraj.upadhyay: Apply Rik van Riel feedback. ]
[ neeraj.upadhyay: Apply kernel test robot feedback. ]
Signed-off-by: Paul E. McKenney <[email protected]>
Cc: Imran Khan <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Leonardo Bras <[email protected]>
Cc: "Peter Zijlstra (Intel)" <[email protected]>
Cc: Rik van Riel <[email protected]>
Signed-off-by: Neeraj Upadhyay <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
"VFS:
- Fix the name of file lease slab cache. When file leases were split
out of file locks the name of the file lock slab cache was used for
the file leases slab cache as well.
- Fix a typo in the take_fd() helper.
- Fix infinite directory iteration for stable offsets in tmpfs.
- When the icache is pruned all reclaimable inodes are marked with
I_FREEING and other processes that try to lookup such inodes will
block.
But some filesystems like ext4 can trigger lookups in their inode
evict callback causing deadlocks. Ext4 does such lookups if the
ea_inode feature is used whereby a separate inode may be used to
store xattrs.
Introduce I_LRU_ISOLATING which pins the inode while its pages are
reclaimed. This avoids inode deletion during inode_lru_isolate()
avoiding the deadlock and evict is made to wait until
I_LRU_ISOLATING is done.
netfs:
- Fault in smaller chunks for non-large folio mappings for
filesystems that haven't been converted to large folios yet.
- Fix the CONFIG_NETFS_DEBUG config option. The config option was
renamed a short while ago and that introduced two minor issues.
First, it depended on CONFIG_NETFS whereas it wants to depend on
CONFIG_NETFS_SUPPORT. The former doesn't exist, while the latter
does. Second, the documentation for the config option wasn't fixed
up.
- Revert the removal of the PG_private_2 writeback flag as ceph is
using it and fix how that flag is handled in netfs.
- Fix DIO reads on 9p. A program watching a file on a 9p mount
wouldn't see any changes in the size of the file being exported by
the server if the file was changed directly in the source
filesystem. Fix this by attempting to read the full size specified
when a DIO read is requested.
- Fix a NULL pointer dereference bug due to a data race where a
cachefiles cookie was retired even though it was still in use.
Check the cookie's n_accesses counter before discarding it.
nsfs:
- Fix ioctl declaration for NS_GET_MNTNS_ID from _IO() to _IOR() as
the kernel is writing to userspace.
pidfs:
- Prevent the creation of pidfds for kthreads until we have a
use-case for it and we know the semantics we want. It also confuses
userspace why they can get pidfds for kthreads.
squashfs:
- Fix an uninitialized value bug reported by KMSAN caused by a
corrupted symbolic link size read from disk. Check that the
symbolic link size is not larger than expected"
* tag 'vfs-6.11-rc4.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
Squashfs: sanity check symbolic link size
9p: Fix DIO read through netfs
vfs: Don't evict inode under the inode lru traversing context
netfs: Fix handling of USE_PGPRIV2 and WRITE_TO_CACHE flags
netfs, ceph: Revert "netfs: Remove deprecated use of PG_private_2 as a second writeback flag"
file: fix typo in take_fd() comment
pidfd: prevent creation of pidfds for kthreads
netfs: clean up after renaming FSCACHE_DEBUG config
libfs: fix infinite directory reads for offset dir
nsfs: fix ioctl declaration
fs/netfs/fscache_cookie: add missing "n_accesses" check
filelock: fix name of file_lease slab cache
netfs: Fault in smaller chunks for non-large folio mappings
|
|
This commit uses the new rcu_tasks_torture_stats_print(),
rcu_tasks_trace_torture_stats_print(), and
rcu_tasks_rude_torture_stats_print() functions in order to provide
detailed diagnostics on grace-period, callback, and barrier state when
rcu_scale_writer() hangs.
[ paulmck: Apply kernel test robot feedback. ]
Signed-off-by: "Paul E. McKenney" <[email protected]>
Signed-off-by: Neeraj Upadhyay <[email protected]>
|
|
RCU keeps a count of the number of callbacks that the current
rcu_barrier() is waiting on, but there is currently no easy way to
work out which callback is stuck. One way to do this is to mark idle
RCU-barrier callbacks by making the ->next pointer point to the callback
itself, and this commit does just that.
Later commits will use this for debug output.
Signed-off-by: "Paul E. McKenney" <[email protected]>
Signed-off-by: Neeraj Upadhyay <[email protected]>
|
|
This commit adds a .stats function pointer to the rcu_scale_ops structure,
and if this is non-NULL, it is invoked after stack traces are dumped in
response to a rcu_scale_writer() stall.
Signed-off-by: "Paul E. McKenney" <[email protected]>
Signed-off-by: Neeraj Upadhyay <[email protected]>
|
|
This commit improves debuggability by dumping the stacks of
rcu_scale_writer() instances that have not completed in a reasonable
timeframe. These stacks are dumped remotely, but they will be accurate
in the thus-far common case where the stalled rcu_scale_writer() instances
are blocked.
[ paulmck: Apply kernel test robot feedback. ]
Signed-off-by: "Paul E. McKenney" <[email protected]>
Signed-off-by: Neeraj Upadhyay <[email protected]>
|
|
This whitespace-only commit fuses a few lines of code, taking advantage
of the newish 100-character-per-line limit to save a few lines of code.
Signed-off-by: "Paul E. McKenney" <[email protected]>
Signed-off-by: Neeraj Upadhyay <[email protected]>
|
|
process_durations() is not a hot path, but there is no good reason to
iterate over and over the data already in 'buf'.
Using a seq_buf saves some useless strcat() and the need of a temp buffer.
Data is written directly at the correct place.
Signed-off-by: Christophe JAILLET <[email protected]>
Tested-by: "Paul E. McKenney" <[email protected]>
Reviewed-by: Davidlohr Bueso <[email protected]>
Signed-off-by: Neeraj Upadhyay <[email protected]>
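
The benefit of tracking the write position can be sketched in user-space C. The struct out_buf type and its helpers below are hypothetical stand-ins written for this illustration; they mimic the idea behind seq_buf but are not the kernel API.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical buffer that remembers how much has been written. */
struct out_buf {
	char *buf;	/* caller-provided storage */
	size_t size;	/* total capacity in bytes */
	size_t len;	/* bytes used so far, excluding the NUL */
};

static void out_buf_init(struct out_buf *s, char *buf, size_t size)
{
	s->buf = buf;
	s->size = size;
	s->len = 0;
	if (size)
		buf[0] = '\0';
}

/* Append formatted text at the current end; returns 0 or -1 on overflow. */
static int out_buf_printf(struct out_buf *s, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (s->len >= s->size)
		return -1;	/* already full or overflowed */
	va_start(ap, fmt);
	n = vsnprintf(s->buf + s->len, s->size - s->len, fmt, ap);
	va_end(ap);
	if (n < 0 || (size_t)n >= s->size - s->len) {
		s->len = s->size;	/* mark as overflowed */
		return -1;
	}
	s->len += (size_t)n;
	return 0;
}
```

Each append writes directly at buf + len, so there is no rescan of the existing contents as with strcat() and no temporary buffer to format into first.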
|
|
This commit adds the start time, in jiffies, of the most recently started
rcu_barrier_tasks*() operation to the diagnostic output used by rcuscale.
This information can be helpful in distinguishing a hung barrier operation
from a long series of barrier operations.
Signed-off-by: "Paul E. McKenney" <[email protected]>
Signed-off-by: Neeraj Upadhyay <[email protected]>
|
|
This commit adds rcu_tasks_torture_stats_print(),
rcu_tasks_trace_torture_stats_print(), and
rcu_tasks_rude_torture_stats_print() functions that provide detailed
diagnostics on grace-period, callback, and barrier state.
Signed-off-by: "Paul E. McKenney" <[email protected]>
Signed-off-by: Neeraj Upadhyay <[email protected]>
|
|
Each Tasks RCU flavor keeps a count of the number of callbacks that the
current rcu_barrier_tasks*() is waiting on, but there is currently no
easy way to work out which callback is stuck. One way to do this is to
mark idle RCU-barrier callbacks by making the ->next pointer point to
the callback itself, and this commit does just that.
Later commits will use this for debug output.
Signed-off-by: "Paul E. McKenney" <[email protected]>
Signed-off-by: Neeraj Upadhyay <[email protected]>
|
|
This commit provides a rcu_barrier_cb_is_done() function that returns
true if the *rcu_barrier*() callback passed in is done. This will be
used when printing grace-period debugging information.
Signed-off-by: "Paul E. McKenney" <[email protected]>
Signed-off-by: Neeraj Upadhyay <[email protected]>
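
The self-pointing ->next convention from the earlier commits makes such a predicate a single comparison. A minimal sketch, assuming a simplified callback header; struct cb_head and the helper names are hypothetical, and the real rcu_barrier_cb_is_done() may differ in detail:

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for struct rcu_head. */
struct cb_head {
	struct cb_head *next;
};

/* Mark a barrier callback as idle by pointing ->next at itself. */
static void barrier_cb_mark_idle(struct cb_head *cb)
{
	cb->next = cb;
}

/* Queue a callback; a queued entry points at its successor or at NULL. */
static void barrier_cb_enqueue(struct cb_head *cb, struct cb_head **list)
{
	cb->next = *list;
	*list = cb;
}

/* An idle (done) callback is the only one whose ->next is itself. */
static bool barrier_cb_is_done(const struct cb_head *cb)
{
	return cb->next == cb;
}
```

A queued callback can never point at itself, so debug output can scan the per-CPU callbacks and name exactly those for which the predicate is false.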
|
|
The rtp->tasks_gp_seq grace-period sequence number is not a strict count,
but rather the usual RCU sequence number with the lower few bits tracking
per-grace-period state and the upper bits the count of grace periods
since boot, give or take the initial value. This commit therefore
adjusts this comment.
Signed-off-by: "Paul E. McKenney" <[email protected]>
Signed-off-by: Neeraj Upadhyay <[email protected]>
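
The split between state bits and counter bits can be illustrated as follows. This sketch assumes two low-order state bits; SEQ_CTR_SHIFT, SEQ_STATE_MASK, and the helpers are hypothetical names for this example rather than the kernel's own definitions.

```c
/* Assumption for this sketch: the low two bits hold per-grace-period state. */
#define SEQ_CTR_SHIFT	2UL
#define SEQ_STATE_MASK	((1UL << SEQ_CTR_SHIFT) - 1)

/* Upper bits: number of grace periods, give or take the initial value. */
static unsigned long seq_ctr(unsigned long s)
{
	return s >> SEQ_CTR_SHIFT;
}

/* Lower bits: state of the current grace period. */
static unsigned long seq_state(unsigned long s)
{
	return s & SEQ_STATE_MASK;
}
```

For example, the raw value 0x16 decodes to a counter of 5 and a state of 2, which is why the raw number should not be read as a strict count.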
|
|
The current mapping of smp_processor_id() to a CPU processing Tasks-RCU
callbacks makes some assumptions about layout. This commit therefore
adds a WARN_ON() to check these assumptions.
[ neeraj.upadhyay: Replace nr_cpu_ids with rcu_task_cpu_ids. ]
Signed-off-by: "Paul E. McKenney" <[email protected]>
Signed-off-by: Neeraj Upadhyay <[email protected]>
|
|
rcu_tasks_need_gpcb()
For kernels built with CONFIG_FORCE_NR_CPUS=y, nr_cpu_ids is
defined as NR_CPUS instead of the number of possible CPUs. This
causes the following system panic:
smpboot: Allowing 4 CPUs, 0 hotplug CPUs
...
setup_percpu: NR_CPUS:512 nr_cpumask_bits:512 nr_cpu_ids:512 nr_node_ids:1
...
BUG: unable to handle page fault for address: ffffffff9911c8c8
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 0 PID: 15 Comm: rcu_tasks_trace Tainted: G W
6.6.21 #1 5dc7acf91a5e8e9ac9dcfc35bee0245691283ea6
RIP: 0010:rcu_tasks_need_gpcb+0x25d/0x2c0
RSP: 0018:ffffa371c00a3e60 EFLAGS: 00010082
CR2: ffffffff9911c8c8 CR3: 000000040fa20005 CR4: 00000000001706f0
Call Trace:
<TASK>
? __die+0x23/0x80
? page_fault_oops+0xa4/0x180
? exc_page_fault+0x152/0x180
? asm_exc_page_fault+0x26/0x40
? rcu_tasks_need_gpcb+0x25d/0x2c0
? __pfx_rcu_tasks_kthread+0x40/0x40
rcu_tasks_one_gp+0x69/0x180
rcu_tasks_kthread+0x94/0xc0
kthread+0xe8/0x140
? __pfx_kthread+0x40/0x40
ret_from_fork+0x34/0x80
? __pfx_kthread+0x40/0x40
ret_from_fork_asm+0x1b/0x80
</TASK>
Considering that there may be holes in the CPU numbers, use the
maximum possible cpu number, instead of nr_cpu_ids, for configuring
enqueue and dequeue limits.
[ neeraj.upadhyay: Fix htmldocs build error reported by Stephen Rothwell ]
Closes: https://lore.kernel.org/linux-input/CALMA0xaTSMN+p4xUXkzrtR5r6k7hgoswcaXx7baR_z9r5jjskw@mail.gmail.com/T/#u
Reported-by: Zhixu Liu <[email protected]>
Signed-off-by: Zqiang <[email protected]>
Signed-off-by: Neeraj Upadhyay <[email protected]>
|
|
The call_rcu_tasks_rude() and rcu_barrier_tasks_rude() APIs are currently
unused. This commit therefore removes their definitions and boot-time
self-tests.
Signed-off-by: Paul E. McKenney <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Signed-off-by: Neeraj Upadhyay <[email protected]>
|
|
The call_rcu_tasks_rude() and rcu_barrier_tasks_rude() APIs are currently
unused. Furthermore, the idea is to get rid of RCU Tasks Rude entirely
once all architectures have their deep-idle and entry/exit code correctly
marked as inline or noinstr. As a step towards this goal, this commit
therefore removes these two functions from rcuscale testing.
Signed-off-by: Paul E. McKenney <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Signed-off-by: Neeraj Upadhyay <[email protected]>
|
|
The call_rcu_tasks_rude() and rcu_barrier_tasks_rude() APIs are currently
unused. Furthermore, the idea is to get rid of RCU Tasks Rude entirely
once all architectures have their deep-idle and entry/exit code correctly
marked as inline or noinstr. As a first step towards this goal, this
commit therefore removes these two functions from rcutorture testing.
Signed-off-by: Paul E. McKenney <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Signed-off-by: Neeraj Upadhyay <[email protected]>
|
|
This commit adds a stall_cpu_repeat module parameter, which is also the
rcutorture.stall_cpu_repeat kernel boot parameter, to test repeated CPU stalls.
Note that only the first stall will pay attention to the stall_cpu_irqsoff
module parameter. For the second and subsequent stalls, interrupts will
be enabled. This is helpful when testing the interaction between RCU
CPU stall warnings and CSD-lock stall warnings.
Reported-by: Rik van Riel <[email protected]>
Signed-off-by: "Paul E. McKenney" <[email protected]>
Signed-off-by: Neeraj Upadhyay <[email protected]>
|
|
The two hrtimer_cpu_base_.*_expiry() functions are wrappers around the
locking functions and sparse complains about the missing counterpart.
Add sparse annotation to denote that this behaviour is expected.
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
|
|
timer_sync_wait_running() first releases two locks and then acquires
them again. This is unexpected and sparse complains about it.
Add sparse annotation for timer_sync_wait_running() to note that the
locking is expected.
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
|
|
Commit 558abc7e3f89 ("perf: Fix event_function_call() locking") lost
IRQ disabling by mistake.
Fixes: 558abc7e3f89 ("perf: Fix event_function_call() locking")
Reported-by: Pengfei Xu <[email protected]>
Reported-by: Naresh Kamboju <[email protected]>
Tested-by: Pengfei Xu <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
|
|
All failure exits prior to fdget() leave the scope, all matching fdput()
are immediately followed by leaving the scope.
Reviewed-by: Christian Brauner <[email protected]>
Signed-off-by: Al Viro <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
|
|
fdget() is the first thing done in scope, all matching fdput() are
immediately followed by leaving the scope.
Reviewed-by: Christian Brauner <[email protected]>
Signed-off-by: Al Viro <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
|
|
Calling conventions for __bpf_map_get() would be more convenient
if it left fdput() on failure to callers. This makes for simpler logic
in the callers.
Among other things, the proof of memory safety no longer has to
rely upon file->private_data never being ERR_PTR(...) for bpffs files.
Original calling conventions made it impossible for the caller to tell
whether __bpf_map_get() has returned ERR_PTR(-EINVAL) because it has found
the file not to be a bpf map one (in which case it would've done fdput())
or because it found that ERR_PTR(-EINVAL) in file->private_data of a
bpf map file (in which case fdput() would _not_ have been done).
Signed-off-by: Al Viro <[email protected]>
Reviewed-by: Christian Brauner <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
|
|
Factor out the logic to extract bpf_map instances from FD embedded in
bpf_insns, adding it to the list of used_maps (unless it's already
there, in which case we just reuse map's index). This simplifies the
logic in resolve_pseudo_ldimm64(), especially around `struct fd`
handling, as all that is now neatly contained in the helper and doesn't
leak into a dozen error handling paths.
Signed-off-by: Andrii Nakryiko <[email protected]>
|
|
Switch fdget_raw() use cases in bpf_inode_storage.c to CLASS(fd_raw).
Reviewed-by: Christian Brauner <[email protected]>
Signed-off-by: Al Viro <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
|
|
Irregularity here is fdput() not in the same scope as fdget();
just fold ____bpf_prog_get() into its (only) caller and that's
it...
Signed-off-by: Al Viro <[email protected]>
Acked-by: Andrii Nakryiko <[email protected]>
Reviewed-by: Christian Brauner <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
|
|
Merge Al Viro's struct fd refactorings.
Signed-off-by: Andrii Nakryiko <[email protected]>
|
|
consume_remote_task() and dispatch_to_local_dsq() use
move_task_to_local_dsq() to migrate the task to the target CPU. Currently,
move_task_to_local_dsq() expects the caller to lock both the source and
destination rq's. While this may save a few lock operations while the rq's
are not contended, under contention, the double locking can exacerbate the
situation significantly (refer to the linked message below).
Update the migration path so that double locking is not used.
move_task_to_local_dsq() now expects the caller to be locking the source rq,
drops it and then acquires the destination rq lock. Code is simpler this way
and, on a 2-way NUMA machine w/ Xeon Gold 6138, 'hackbench 100 thread 5000'
shows ~3% improvement with scx_simple.
Signed-off-by: Tejun Heo <[email protected]>
Suggested-by: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Acked-by: David Vernet <[email protected]>
|
|
Add an interface for a user-defined workqueue lockdep map, which is
helpful when multiple workqueues are created for the same purpose. This
also helps avoid leaking lockdep maps on each workqueue creation.
v2:
- Add alloc_workqueue_lockdep_map (Tejun)
v3:
- Drop __WQ_USER_OWNED_LOCKDEP (Tejun)
- static inline alloc_ordered_workqueue_lockdep_map (Tejun)
Cc: Tejun Heo <[email protected]>
Cc: Lai Jiangshan <[email protected]>
Signed-off-by: Matthew Brost <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
|
|
Will help enable user-defined lockdep maps for workqueues.
Cc: Tejun Heo <[email protected]>
Cc: Lai Jiangshan <[email protected]>
Signed-off-by: Matthew Brost <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
|
|
Will help enable user-defined lockdep maps for workqueues.
Cc: Tejun Heo <[email protected]>
Cc: Lai Jiangshan <[email protected]>
Signed-off-by: Matthew Brost <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
|
|
`__bpf_ops_sched_ext_ops` was missing the initialization of some struct
attributes. With
https://lore.kernel.org/all/[email protected]/
every single attribute needs to be initialized; otherwise programs (like
scx_layered) will fail to load:
05:26:48 [INFO] libbpf: struct_ops layered: member cgroup_init not found in kernel, skipping it as it's set to zero
05:26:48 [INFO] libbpf: struct_ops layered: member cgroup_exit not found in kernel, skipping it as it's set to zero
05:26:48 [INFO] libbpf: struct_ops layered: member cgroup_prep_move not found in kernel, skipping it as it's set to zero
05:26:48 [INFO] libbpf: struct_ops layered: member cgroup_move not found in kernel, skipping it as it's set to zero
05:26:48 [INFO] libbpf: struct_ops layered: member cgroup_cancel_move not found in kernel, skipping it as it's set to zero
05:26:48 [INFO] libbpf: struct_ops layered: member cgroup_set_weight not found in kernel, skipping it as it's set to zero
05:26:48 [WARN] libbpf: prog 'layered_dump': BPF program load failed: unknown error (-524)
05:26:48 [WARN] libbpf: prog 'layered_dump': -- BEGIN PROG LOAD LOG --
attach to unsupported member dump of struct sched_ext_ops
processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
-- END PROG LOAD LOG --
05:26:48 [WARN] libbpf: prog 'layered_dump': failed to load: -524
05:26:48 [WARN] libbpf: failed to load object 'bpf_bpf'
05:26:48 [WARN] libbpf: failed to load BPF skeleton 'bpf_bpf': -524
Error: Failed to load BPF program
Signed-off-by: Manu Bretelle <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
|
|
The regressing commit is new in 6.10. It assumed that anytime event->prog
is set bpf_overflow_handler() should be invoked to execute the attached bpf
program. This assumption is false for tracing events, and as a result the
regressing commit broke bpftrace by invoking the bpf handler with garbage
inputs on overflow.
Prior to the regression the overflow handlers formed a chain (of length 0,
1, or 2) and perf_event_set_bpf_handler() (the !tracing case) added
bpf_overflow_handler() to that chain, while perf_event_attach_bpf_prog()
(the tracing case) did not. Both set event->prog. The chain of overflow
handlers was replaced by a single overflow handler slot and a fixed call to
bpf_overflow_handler() when appropriate. This modifies the condition there
to check event->prog->type == BPF_PROG_TYPE_PERF_EVENT, restoring the
previous behavior and fixing bpftrace.
Signed-off-by: Kyle Huey <[email protected]>
Suggested-by: Andrii Nakryiko <[email protected]>
Reported-by: Joe Damato <[email protected]>
Closes: https://lore.kernel.org/lkml/ZpFfocvyF3KHaSzF@LQ3V64L9R2/
Fixes: f11f10bfa1ca ("perf/bpf: Call BPF handler directly, not through overflow machinery")
Cc: [email protected]
Tested-by: Joe Damato <[email protected]> # bpftrace
Acked-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Commit 779dbc2e78d7 ("printk: Avoid non-panic CPUs writing
to ringbuffer") stopped non-panic CPUs from writing further messages to
the ringbuffer after a panic.
Since that commit, non-panicked CPUs are not allowed to write to the
ring buffer after a panic, so the CPU backtrace that is triggered
after a panic to sample the non-panicked CPUs' backtraces no longer
serves its function, as it has nothing to print.
Fix the issue by allowing non-panicked CPUs to write into ringbuffer
while CPU backtrace is in flight.
Fixes: 779dbc2e78d7 ("printk: Avoid non-panic CPUs writing to ringbuffer")
Signed-off-by: Ryo Takakura <[email protected]>
Reviewed-by: Petr Mladek <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Petr Mladek <[email protected]>
|
|
When the domain suffix is not supplied alloc_fwnode_name() unconditionally
adds a separator.
Fix the format strings to get rid of the stray '-' separator.
Fixes: 1e7c05292531 ("irqdomain: Allow giving name suffix for domain")
Signed-off-by: Andy Shevchenko <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
|
|
The code uses if (bus_token) and if (bus_token == DOMAIN_BUS_ANY).
Since bus_token is an enum, the latter is more robust against changes.
Convert all !bus_token checks to explicitly check for DOMAIN_BUS_ANY.
Signed-off-by: Andy Shevchenko <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
|
|
We want the compiler to see that fdput() on empty instance
is a no-op. The emptiness check is that file reference is NULL,
while fdput() is "fput() if FDPUT_FPUT is present in flags".
The reason why fdput() on empty instance is a no-op is something
compiler can't see - it's that we never generate instances with
NULL file reference combined with non-zero flags.
It's not that hard to deal with - the real primitives behind
fdget() et al. are returning an unsigned long value, unpacked by (inlined)
__to_fd() into the current struct file * + int. The lower bits are
used to store flags, while the rest encodes the pointer. Linus suggested
that keeping this unsigned long around with the extractions done by inlined
accessors should generate a sane code and that turns out to be the case.
Namely, turning struct fd into a struct-wrapped unsigned long, with
fd_empty(f) => unlikely(f.word == 0)
fd_file(f) => (struct file *)(f.word & ~3)
fdput(f) => if (f.word & 1) fput(fd_file(f))
ends up with compiler doing the right thing. The cost is the patch
footprint, of course - we need to switch f.file to fd_file(f) all over
the tree, and it's not doable with simple search and replace; there are
false positives, etc.
Note that the sole member of that structure is an opaque
unsigned long - all accesses should be done via wrappers and I don't
want to use a name that would invite manual casts to file pointers,
etc. The value of that member is equal either to (unsigned long)p | flags,
p being an address of some struct file instance, or to 0 for an empty fd.
For now the new predicate (fd_empty(f)) has no users; all the
existing checks have form (!fd_file(f)). We will convert to fd_empty()
use later; here we only define it (and tell the compiler that it's
unlikely to return true).
This commit only deals with representation change; there will
be followups.
Reviewed-by: Christian Brauner <[email protected]>
Signed-off-by: Al Viro <[email protected]>
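
The three accessors listed above can be tried out in a user-space sketch. Everything beyond those three definitions is an assumption for illustration: the toy struct file with a plain reference count, the make_fd() constructor, and the FDPUT_FPUT value of 1 are hypothetical, and the sketch assumes a pointer fits in unsigned long as it does on Linux.

```c
/* Toy stand-in for the kernel's struct file; int alignment keeps bits 0-1 clear. */
struct file {
	int refcount;
};

#define FDPUT_FPUT 1UL	/* assumption: bit 0 means "drop a reference on fdput()" */

/* The sole member is an opaque word: (unsigned long)p | flags, or 0 if empty. */
struct fd {
	unsigned long word;
};

/* Hypothetical constructor standing in for the fdget() family. */
static struct fd make_fd(struct file *p, unsigned long flags)
{
	struct fd f = { (unsigned long)p | flags };

	return f;
}

static int fd_empty(struct fd f)
{
	return f.word == 0;
}

static struct file *fd_file(struct fd f)
{
	return (struct file *)(f.word & ~3UL);
}

/* Toy fput(): just drops one reference. */
static void fput(struct file *file)
{
	file->refcount--;
}

/* For an empty fd the word is 0, so the flag test alone makes this a no-op. */
static void fdput(struct fd f)
{
	if (f.word & FDPUT_FPUT)
		fput(fd_file(f));
}
```

Because emptiness and the flag test both look at the same word, the compiler can see that fdput() does nothing once fd_empty() has been established, with no need to know an invariant relating two separate fields.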
|