No conflicts.
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Previously, I failed to realize that Kees' patch [1] had not been merged
into the mainline yet, and dropped DEBUG_INFO=y too eagerly from the
mainline. As a result, "make debug.config" is not able to flip
DEBUG_INFO=n in an existing .config. This change closes the gap of a
few weeks until Kees' patch lands, and works regardless of its
merge status anyway.
Link: https://lore.kernel.org/all/[email protected]/ [1]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Qian Cai <[email protected]>
Reported-by: Daniel Thompson <[email protected]>
Reviewed-by: Daniel Thompson <[email protected]>
Cc: Kees Cook <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Merge misc fixes from David Howells:
"A set of patches for watch_queue filter issues noted by Jann. I've
added in a cleanup patch from Christophe Jaillet to convert to using
formal bitmap specifiers for the note allocation bitmap.
Also two filesystem fixes (afs and cachefiles)"
* emailed patches from David Howells <[email protected]>:
cachefiles: Fix volume coherency attribute
afs: Fix potential thrashing in afs writeback
watch_queue: Make comment about setting ->defunct more accurate
watch_queue: Fix lack of barrier/sync/lock between post and read
watch_queue: Free the alloc bitmap when the watch_queue is torn down
watch_queue: Fix the alloc bitmap size to reflect notes allocated
watch_queue: Use the bitmap API when applicable
watch_queue: Fix to always request a pow-of-2 pipe ring size
watch_queue: Fix to release page in ->release()
watch_queue, pipe: Free watchqueue state after clearing pipe ring
watch_queue: Fix filter limit check
|
|
watch_queue_clear() has a comment stating that setting ->defunct to true
prevents new additions as well as notifications. Whilst the latter is
true, the first bit is superfluous since, at the time this function is
called, the pipe cannot be accessed to add new event
sources.
Remove the "new additions" bit from the comment.
Fixes: c73be61cede5 ("pipe: Add general notification queue support")
Reported-by: Jann Horn <[email protected]>
Signed-off-by: David Howells <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
There's nothing to synchronise post_one_notification() versus
pipe_read(). Whilst posting is done under pipe->rd_wait.lock, the
reader only takes pipe->mutex which cannot bar notification posting as
that may need to be made from contexts that cannot sleep.
Fix this by setting pipe->head with a barrier in post_one_notification()
and reading pipe->head with a barrier in pipe_read().
If that's not sufficient, the rd_wait.lock will need to be taken,
possibly in a ->confirm() op so that it only applies to notifications.
The lock would, however, have to be dropped before copy_page_to_iter()
is invoked.
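The publish/consume ordering described above can be sketched in user space
with C11 atomics. This is only an illustration under assumptions: the ring
structure, RING_SIZE and the ring_*() names are hypothetical, and
release/acquire atomics stand in for the kernel's own barrier primitives.

```c
#include <assert.h>
#include <stdatomic.h>

#define RING_SIZE 8u /* hypothetical; must be a power of two */

struct ring {
	int slots[RING_SIZE];
	atomic_uint head; /* advanced by the poster */
	atomic_uint tail; /* advanced by the reader */
};

static void ring_init(struct ring *r)
{
	assert(r != 0);
	atomic_init(&r->head, 0);
	atomic_init(&r->tail, 0);
}

/* Poster: fill the slot first, then publish the new head with release
 * semantics so the slot contents become visible before the head does. */
static int ring_post(struct ring *r, int value)
{
	unsigned int head = atomic_load_explicit(&r->head, memory_order_relaxed);
	unsigned int tail = atomic_load_explicit(&r->tail, memory_order_acquire);

	if (head - tail >= RING_SIZE)
		return 0; /* ring full */
	r->slots[head & (RING_SIZE - 1)] = value;
	atomic_store_explicit(&r->head, head + 1, memory_order_release);
	return 1;
}

/* Reader: load head with acquire semantics before touching the slot, so
 * the slot read cannot be ordered ahead of the head read. */
static int ring_read(struct ring *r, int *out)
{
	unsigned int tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	unsigned int head = atomic_load_explicit(&r->head, memory_order_acquire);

	if (head == tail)
		return 0; /* ring empty */
	*out = r->slots[tail & (RING_SIZE - 1)];
	atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
	return 1;
}
```

The sketch assumes a single poster and a single reader; it does not model
the rd_wait.lock or pipe->mutex mentioned in the text.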
Fixes: c73be61cede5 ("pipe: Add general notification queue support")
Reported-by: Jann Horn <[email protected]>
Signed-off-by: David Howells <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Free the watch_queue note allocation bitmap when the watch_queue is
destroyed.
Fixes: c73be61cede5 ("pipe: Add general notification queue support")
Reported-by: Jann Horn <[email protected]>
Signed-off-by: David Howells <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Currently, watch_queue_set_size() sets the number of notes available in
wqueue->nr_notes according to the number of notes allocated, but sets
the size of the bitmap to the unrounded number of notes originally asked
for.
Fix this by setting the bitmap size to the number of notes we're
actually going to make available (ie. the number allocated).
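The mismatch can be illustrated with a small arithmetic sketch. The page
and note sizes below are assumptions chosen for the example, and the
function names are hypothetical rather than taken from the kernel source.

```c
#include <assert.h>

#define PAGE_SIZE_BYTES 4096u /* assumed page size */
#define NOTE_SIZE_BYTES 128u  /* assumed size of one note */
#define NOTES_PER_PAGE (PAGE_SIZE_BYTES / NOTE_SIZE_BYTES)

/* Whole pages needed to hold the requested number of notes. */
static unsigned int pages_for_notes(unsigned int requested)
{
	assert(requested >= 1);
	return (requested + NOTES_PER_PAGE - 1) / NOTES_PER_PAGE;
}

/* Notes actually available once whole pages have been allocated.
 * This, not the requested count, is what the bitmap has to cover. */
static unsigned int notes_allocated(unsigned int requested)
{
	return pages_for_notes(requested) * NOTES_PER_PAGE;
}
```

Under these assumptions a request for 33 notes allocates two pages, so 64
notes are made available, and a bitmap sized for 33 bits would not cover
the last 31 of them.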
Fixes: c73be61cede5 ("pipe: Add general notification queue support")
Reported-by: Jann Horn <[email protected]>
Signed-off-by: David Howells <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Use bitmap_alloc() to simplify code, improve the semantics and reduce
some open-coded arithmetic in allocator arguments.
Also change a memset(0xff) into an equivalent bitmap_fill() to keep
consistency.
Signed-off-by: Christophe JAILLET <[email protected]>
Signed-off-by: David Howells <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The pipe ring size must always be a power of 2 as the head and tail
pointers are masked off by AND'ing with the size of the ring - 1.
watch_queue_set_size(), however, lets you specify any number of notes
between 1 and 511. This number is passed through to pipe_resize_ring()
without checking/forcing its alignment.
Fix this by rounding the number of slots required up to the nearest
power of two. The request is meant to guarantee that at least that many
notifications can be generated before the queue is full, so rounding
down isn't an option, but, alternatively, it may be better to give an
error if we aren't allowed to allocate that much ring space.
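A minimal user-space sketch of why the mask requires a power-of-two size,
and of the rounding itself. The helper names are hypothetical; the kernel
has its own rounding helper, which is not reproduced here.

```c
#include <assert.h>

/* Round n up to the nearest power of two (n itself if already one). */
static unsigned int round_up_pow2(unsigned int n)
{
	unsigned int p = 1;

	assert(n >= 1 && n <= (1u << 31));
	while (p < n)
		p <<= 1;
	return p;
}

/* Map a free-running head/tail position onto a slot by masking.
 * Only valid when ring_size is a power of two. */
static unsigned int ring_index(unsigned int pos, unsigned int ring_size)
{
	assert(ring_size != 0 && (ring_size & (ring_size - 1)) == 0);
	return pos & (ring_size - 1);
}
```

With a ring of 6 slots the mask would be 5 (binary 101), so positions 2 and
3 would alias to slots 0 and 1 and some slots would never be used. Rounding
6 up to 8 gives a mask of 7 and every slot is reachable.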
Fixes: c73be61cede5 ("pipe: Add general notification queue support")
Reported-by: Jann Horn <[email protected]>
Signed-off-by: David Howells <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
When a pipe ring descriptor points to a notification message, the
refcount on the backing page is incremented by the generic get function,
but the release function, which marks the bitmap, doesn't drop the page
ref.
Fix this by calling generic_pipe_buf_release() at the end of
watch_queue_pipe_buf_release().
Fixes: c73be61cede5 ("pipe: Add general notification queue support")
Reported-by: Jann Horn <[email protected]>
Signed-off-by: David Howells <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
In watch_queue_set_filter(), there are a couple of places where we check
that the filter type value does not exceed what the type_filter bitmap
can hold. One place calculates the number of bits by:
if (tf[i].type >= sizeof(wfilter->type_filter) * 8)
which is fine, but the second does:
if (tf[i].type >= sizeof(wfilter->type_filter) * BITS_PER_LONG)
which is not. This can lead to a couple of out-of-bounds writes due to
a too-large type:
(1) __set_bit() on wfilter->type_filter
(2) Writing more elements in wfilter->filters[] than we allocated.
Fix this by just using the proper WATCH_TYPE__NR instead, which is the
number of types we actually know about.
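The difference between the two bounds can be worked through in a small
sketch. The structure layout, the array length and SKETCH_TYPE_NR are
assumptions made for illustration and are not copied from the kernel.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_TYPE_NR 2u /* hypothetical number of known types */

struct filter_sketch {
	unsigned long type_filter[2]; /* hypothetical bitmap storage */
};

/* Bits the bitmap can really hold: bytes times bits per byte. */
static size_t bound_bits(const struct filter_sketch *wfilter)
{
	return sizeof(wfilter->type_filter) * CHAR_BIT;
}

/* The faulty bound: bytes times bits per long, which is too large by a
 * factor of sizeof(unsigned long). */
static size_t bound_buggy(const struct filter_sketch *wfilter)
{
	return sizeof(wfilter->type_filter) * BITS_PER_LONG;
}

/* Set a type bit, guarded by the number of types actually known. */
static void set_type_bit(struct filter_sketch *wfilter, unsigned int type)
{
	assert(type < SKETCH_TYPE_NR);
	wfilter->type_filter[type / BITS_PER_LONG] |= 1UL << (type % BITS_PER_LONG);
}
```

On a platform with 8-byte longs, the faulty bound in this sketch accepts
type values up to 1023 for a bitmap that only holds 128 bits.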
The bug may cause an oops looking something like:
BUG: KASAN: slab-out-of-bounds in watch_queue_set_filter+0x659/0x740
Write of size 4 at addr ffff88800d2c66bc by task watch_queue_oob/611
...
Call Trace:
<TASK>
dump_stack_lvl+0x45/0x59
print_address_description.constprop.0+0x1f/0x150
...
kasan_report.cold+0x7f/0x11b
...
watch_queue_set_filter+0x659/0x740
...
__x64_sys_ioctl+0x127/0x190
do_syscall_64+0x43/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
Allocated by task 611:
kasan_save_stack+0x1e/0x40
__kasan_kmalloc+0x81/0xa0
watch_queue_set_filter+0x23a/0x740
__x64_sys_ioctl+0x127/0x190
do_syscall_64+0x43/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
The buggy address belongs to the object at ffff88800d2c66a0
which belongs to the cache kmalloc-32 of size 32
The buggy address is located 28 bytes inside of
32-byte region [ffff88800d2c66a0, ffff88800d2c66c0)
Fixes: c73be61cede5 ("pipe: Add general notification queue support")
Reported-by: Jann Horn <[email protected]>
Signed-off-by: David Howells <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
"Minor tracing fixes:
- Fix unregistering the same event twice. A user could disable the
same event that osnoise will disable on unregistering.
- Inform RCU of a quiescent state in the osnoise testing thread.
- Fix some kerneldoc comments"
* tag 'trace-v5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
ftrace: Fix some W=1 warnings in kernel doc comments
tracing/osnoise: Force quiescent states while tracing
tracing/osnoise: Do not unregister events twice
|
|
net/dsa/dsa2.c
commit afb3cc1a397d ("net: dsa: unlock the rtnl_mutex when dsa_master_setup() fails")
commit e83d56537859 ("net: dsa: replay master state events in dsa_tree_{setup,teardown}_master")
https://lore.kernel.org/all/[email protected]/
drivers/net/ethernet/intel/ice/ice.h
commit 97b0129146b1 ("ice: Fix error with handling of bonding MTU")
commit 43113ff73453 ("ice: add TTY for GNSS module for E810T device")
https://lore.kernel.org/all/[email protected]/
drivers/staging/gdm724x/gdm_lte.c
commit fc7f750dc9d1 ("staging: gdm724x: fix use after free in gdm_lte_rx()")
commit 4bcc4249b4cf ("staging: Use netif_rx().")
https://lore.kernel.org/all/[email protected]/
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Clean up the following clang-w1 warnings:
kernel/trace/ftrace.c:7827: warning: Function parameter or member 'ops'
not described in 'unregister_ftrace_function'.
kernel/trace/ftrace.c:7805: warning: Function parameter or member 'ops'
not described in 'register_ftrace_function'.
Link: https://lkml.kernel.org/r/[email protected]
Reported-by: Abaci Robot <[email protected]>
Signed-off-by: Jiapeng Chong <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
At the moment running osnoise on a nohz_full CPU or uncontested FIFO
priority and a PREEMPT_RCU kernel might have the side effect of
extending grace periods too much. This will entice RCU to force a
context switch on the wayward CPU to end the grace period, all while
introducing unwarranted noise into the tracer. This behaviour is
unavoidable as overly extending grace periods might exhaust the system's
memory.
This exact problem is what extended quiescent states (EQS) were
created for, and rcu_momentary_dyntick_idle() emulates them by
performing a zero duration EQS. So let's make use of it.
In the common case rcu_momentary_dyntick_idle() is fairly inexpensive:
atomically incrementing a local per-CPU counter and doing a store. So it
shouldn't affect osnoise's measurements (which have a 1us granularity),
and we'll call it unconditionally.
The uncommon cases involve calling rcu_momentary_dyntick_idle() after
having the osnoise process:
- Receive an expedited quiescent state IPI with preemption disabled or
during an RCU critical section. (activates rdp->cpu_no_qs.b.exp
code-path).
- Be preempted within an RCU critical section and have the
subsequent outermost rcu_read_unlock() called with interrupts
disabled. (t->rcu_read_unlock_special.b.blocked code-path).
Neither of those is possible at the moment, and they are unlikely to be in
the future given osnoise's loop design. On top of this, the noise
generated by the situations described above is unavoidable, and if not
exposed by rcu_momentary_dyntick_idle() will be eventually seen in
subsequent rcu_read_unlock() calls or schedule operations.
Link: https://lkml.kernel.org/r/[email protected]
Cc: [email protected]
Fixes: bce29ac9ce0b ("trace: Add osnoise tracer")
Signed-off-by: Nicolas Saenz Julienne <[email protected]>
Acked-by: Paul E. McKenney <[email protected]>
Acked-by: Daniel Bristot de Oliveira <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Nicolas reported that using:
# trace-cmd record -e all -M 10 -p osnoise --poll
Resulted in the following kernel warning:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 1217 at kernel/tracepoint.c:404 tracepoint_probe_unregister+0x280/0x370
[...]
CPU: 0 PID: 1217 Comm: trace-cmd Not tainted 5.17.0-rc6-next-20220307-nico+ #19
RIP: 0010:tracepoint_probe_unregister+0x280/0x370
[...]
CR2: 00007ff919b29497 CR3: 0000000109da4005 CR4: 0000000000170ef0
Call Trace:
<TASK>
osnoise_workload_stop+0x36/0x90
tracing_set_tracer+0x108/0x260
tracing_set_trace_write+0x94/0xd0
? __check_object_size.part.0+0x10a/0x150
? selinux_file_permission+0x104/0x150
vfs_write+0xb5/0x290
ksys_write+0x5f/0xe0
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7ff919a18127
[...]
---[ end trace 0000000000000000 ]---
The warning complains about an attempt to unregister an
unregistered tracepoint.
This happens with trace-cmd because it first stops tracing and
then switches the tracer to nop, which is equivalent to:
# cd /sys/kernel/tracing/
# echo osnoise > current_tracer
# echo 0 > tracing_on
# echo nop > current_tracer
The osnoise tracer stops the workload when no trace instance
is actually collecting data. This can be caused either by
disabling tracing or by disabling the tracer itself.
To avoid unregistering events twice, use the existing
trace_osnoise_callback_enabled variable to check if the events
(and the workload) are actually active before trying to
deactivate them.
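The guard described above can be sketched as follows. This is a simplified
user-space model: callbacks_enabled stands in for
trace_osnoise_callback_enabled, while registered_probes and the
workload_*() names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

static bool callbacks_enabled; /* stand-in for the existing flag */
static int registered_probes;  /* hypothetical count of live probes */

static void workload_start(void)
{
	if (callbacks_enabled)
		return; /* already running */
	registered_probes++;
	callbacks_enabled = true;
}

static void workload_stop(void)
{
	/* A second stop request finds the flag already cleared and returns
	 * before touching the probes again. */
	if (!callbacks_enabled)
		return;
	callbacks_enabled = false;
	assert(registered_probes > 0);
	registered_probes--;
}
```

Calling workload_stop() twice in a row, as happens when tracing is turned
off and the tracer is then switched to nop, leaves the probe count at zero
instead of attempting a second unregister.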
Link: https://lore.kernel.org/all/[email protected]/
Link: https://lkml.kernel.org/r/938765e17d5a781c2df429a98f0b2e7cc317b022.1646823913.git.bristot@kernel.org
Cc: [email protected]
Cc: Marcelo Tosatti <[email protected]>
Fixes: 2fac8d6486d5 ("tracing/osnoise: Allow multiple instances of the same tracer")
Reported-by: Nicolas Saenz Julienne <[email protected]>
Signed-off-by: Daniel Bristot de Oliveira <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 spectre fixes from Borislav Petkov:
- Mitigate Spectre v2-type Branch History Buffer attacks on machines
which support eIBRS, i.e., the hardware-assisted speculation
restriction after it has been shown that such machines are vulnerable
even with the hardware mitigation.
- Do not use the default LFENCE-based Spectre v2 mitigation on AMD as
it is insufficient to mitigate such attacks. Instead, switch to
retpolines on all AMD by default.
- Update the docs and add some warnings for the obviously vulnerable
cmdline configurations.
* tag 'x86_bugs_for_v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/speculation: Warn about eIBRS + LFENCE + Unprivileged eBPF + SMT
x86/speculation: Warn about Spectre v2 LFENCE mitigation
x86/speculation: Update link to AMD speculation whitepaper
x86/speculation: Use generic retpoline by default on AMD
x86/speculation: Include unprivileged eBPF status in Spectre v2 mitigation reporting
Documentation/hw-vuln: Update spectre doc
x86/speculation: Add eIBRS + Retpoline options
x86/speculation: Rename RETPOLINE_AMD to RETPOLINE_LFENCE
|
|
Unfortunately, we ended up merging an old version of the patch "fix info
leak with DMA_FROM_DEVICE" instead of the latest one. Christoph
(the swiotlb maintainer) asked me to create an incremental fix
(after I had pointed out the mix-up and asked him for guidance).
So here we go.
The main differences between what we got and what was agreed are:
* swiotlb_sync_single_for_device is also required to do an extra bounce
* We decided not to introduce DMA_ATTR_OVERWRITE until we have exploiters
* The implementation of DMA_ATTR_OVERWRITE is flawed: DMA_ATTR_OVERWRITE
must take precedence over DMA_ATTR_SKIP_CPU_SYNC
Thus this patch removes DMA_ATTR_OVERWRITE, and makes
swiotlb_sync_single_for_device() bounce unconditionally (that is, also
when dir == DMA_TO_DEVICE) in order to avoid synchronising back stale
data from the swiotlb buffer.
Note that if the size used with the dma_sync_* API is less than the
size used with dma_[un]map_*, under certain circumstances we may still
end up with swiotlb not being transparent. In that sense, this is no
perfect fix either.
To make this bulletproof, we would have to bounce the entire
mapping/bounce buffer. For that we would have to figure out the starting
address, and the size of the mapping in
swiotlb_sync_single_for_device(). While this does seem possible, there
seems to be no firm consensus on how things are supposed to work.
Signed-off-by: Halil Pasic <[email protected]>
Fixes: ddbd89deb7d3 ("swiotlb: fix info leak with DMA_FROM_DEVICE")
Cc: [email protected]
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
- Fix sorting on old "cpu" value in histograms
- Fix return value of __setup() boot parameter handlers
* tag 'trace-v5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Fix return value of __setup handlers
tracing/histogram: Fix sorting on old "cpu" value
|
|
Merge misc fixes from Andrew Morton:
"8 patches.
Subsystems affected by this patch series: mm (hugetlb, pagemap, and
userfaultfd), memfd, selftests, and kconfig"
* emailed patches from Andrew Morton <[email protected]>:
configs/debug: set CONFIG_DEBUG_INFO=y properly
proc: fix documentation and description of pagemap
kselftest/vm: fix tests build with old libc
memfd: fix F_SEAL_WRITE after shmem huge page allocated
mm: fix use-after-free when anon vma name is used after vma is freed
mm: prevent vm_area_struct::anon_name refcount saturation
mm: refactor vm_area_struct::anon_vma_name usage code
selftests/vm: cleanup hugetlb file after mremap test
|
|
CONFIG_DEBUG_INFO can't be set by user directly, so set
CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y instead.
Otherwise, we end up with no debuginfo in vmlinux which is a big no-no
for kernel debugging.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Qian Cai <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Avoid mixing strings and their anon_vma_name referenced pointers by
using struct anon_vma_name whenever possible. This simplifies the code
and allows easier sharing of anon_vma_name structures when they
represent the same name.
[[email protected]: fix comment]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Suren Baghdasaryan <[email protected]>
Suggested-by: Matthew Wilcox <[email protected]>
Suggested-by: Michal Hocko <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: Colin Cross <[email protected]>
Cc: Sumit Semwal <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: Christian Brauner <[email protected]>
Cc: Alexey Gladkov <[email protected]>
Cc: Sasha Levin <[email protected]>
Cc: Chris Hyser <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Peter Collingbourne <[email protected]>
Cc: Xiaofeng Cao <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Cyrill Gorcunov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Daniel Borkmann says:
====================
pull-request: bpf-next 2022-03-04
We've added 32 non-merge commits during the last 14 day(s) which contain
a total of 59 files changed, 1038 insertions(+), 473 deletions(-).
The main changes are:
1) Optimize BPF stackmap's build_id retrieval by caching last valid build_id,
as consecutive stack frames are likely to be in the same VMA and therefore
have the same build id, from Hao Luo.
2) Several improvements to arm64 BPF JIT, that is, support for JITing
the atomic[64]_fetch_add, atomic[64]_[fetch_]{and,or,xor} and lastly
atomic[64]_{xchg|cmpxchg}. Also fix the BTF line info dump for JITed
programs, from Hou Tao.
3) Optimize generic BPF map batch deletion by only enforcing synchronize_rcu()
barrier once upon return to user space, from Eric Dumazet.
4) For kernel build parse DWARF and generate BTF through pahole with enabled
multithreading, from Kui-Feng Lee.
5) BPF verifier usability improvements by making log info more concise and
replacing inv with scalar type name, from Mykola Lysenko.
6) Two follow-up fixes for BPF prog JIT pack allocator, from Song Liu.
7) Add a new Kconfig to allow for loading kernel modules with non-matching
BTF type info; their BTF info is then removed on load, from Connor O'Brien.
8) Remove reallocarray() usage from bpftool and switch to libbpf_reallocarray()
in order to fix compilation errors for older glibc, from Mauricio Vásquez.
9) Fix libbpf to error on conflicting name in BTF when type declaration
appears before the definition, from Xu Kuohai.
10) Fix issue in BPF preload for in-kernel light skeleton where loaded BPF
program fds prevent init process from setting up fd 0-2, from Yucong Sun.
11) Fix libbpf reuse of pinned perf RB map when max_entries is auto-determined
by libbpf, from Stijn Tintel.
12) Several cleanups for libbpf and a fix to enforce perf RB map #pages to be
non-zero, from Yuntao Wang.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (32 commits)
bpf: Small BPF verifier log improvements
libbpf: Add a check to ensure that page_cnt is non-zero
bpf, x86: Set header->size properly before freeing it
x86: Disable HAVE_ARCH_HUGE_VMALLOC on 32-bit x86
bpf, test_run: Fix overflow in XDP frags bpf_test_finish
selftests/bpf: Update btf_dump case for conflicting names
libbpf: Skip forward declaration when counting duplicated type names
bpf: Add some description about BPF_JIT_ALWAYS_ON in Kconfig
bpf, docs: Add a missing colon in verifier.rst
bpf: Cache the last valid build_id
libbpf: Fix BPF_MAP_TYPE_PERF_EVENT_ARRAY auto-pinning
bpf, selftests: Use raw_tp program for atomic test
bpf, arm64: Support more atomic operations
bpftool: Remove redundant slashes
bpf: Add config to allow loading modules with BTF mismatches
bpf, arm64: Feed byte-offset into bpf line info
bpf, arm64: Call build_prologue() first in first JIT pass
bpf: Fix issue with bpf preload module taking over stdout/stdin of kernel.
bpftool: Bpf skeletons assert type sizes
bpf: Cleanup comments
...
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Pull block fix from Jens Axboe:
"Just a small UAF fix for blktrace"
* tag 'block-5.17-2022-03-04' of git://git.kernel.dk/linux-block:
blktrace: fix use after free for struct blk_trace
|
|
__setup() handlers should generally return 1 to indicate that the
boot options have been handled.
Using invalid option values causes the entire kernel boot option
string to be reported as Unknown and added to init's environment
strings, polluting it.
Unknown kernel command line parameters "BOOT_IMAGE=/boot/bzImage-517rc6
kprobe_event=p,syscall_any,$arg1 trace_options=quiet
trace_clock=jiffies", will be passed to user space.
Run /sbin/init as init process
with arguments:
/sbin/init
with environment:
HOME=/
TERM=linux
BOOT_IMAGE=/boot/bzImage-517rc6
kprobe_event=p,syscall_any,$arg1
trace_options=quiet
trace_clock=jiffies
Return 1 from the __setup() handlers so that init's environment is not
polluted with kernel boot options.
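The effect of the return value can be modelled with a simplified dispatch
loop. This is a user-space sketch, not the kernel's actual option parsing
code; the table layout and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*setup_fn)(const char *value);

struct setup_entry {
	const char *prefix; /* e.g. "trace_clock=" */
	setup_fn fn;
};

/* A handler that returns 0, so its option is still reported as unknown. */
static int handler_returns_zero(const char *value)
{
	(void)value;
	return 0;
}

/* A handler that returns 1 to say the option has been handled. */
static int handler_returns_one(const char *value)
{
	(void)value;
	return 1;
}

/* Returns 1 if some handler claimed the option, 0 if it would be treated
 * as unknown and handed on to init. */
static int dispatch_option(const struct setup_entry *table, size_t n,
			   const char *opt)
{
	assert(table != NULL && opt != NULL);
	for (size_t i = 0; i < n; i++) {
		size_t len = strlen(table[i].prefix);

		if (strncmp(opt, table[i].prefix, len) == 0 &&
		    table[i].fn(opt + len))
			return 1;
	}
	return 0;
}
```

In this model an option whose handler returns 0 falls through as unknown
even though a matching handler ran, which mirrors the pollution of init's
environment shown in the log above.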
Link: lore.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Cc: [email protected]
Fixes: 7bcfaf54f591 ("tracing: Add trace_options kernel command line parameter")
Fixes: e1e232ca6b8f ("tracing: Add trace_clock=<clock> kernel parameter")
Fixes: 970988e19eb0 ("tracing/kprobe: Add kprobe_event= boot parameter")
Signed-off-by: Randy Dunlap <[email protected]>
Reported-by: Igor Zhbanov <[email protected]>
Acked-by: Masami Hiramatsu <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
net/batman-adv/hard-interface.c
commit 690bb6fb64f5 ("batman-adv: Request iflink once in batadv-on-batadv check")
commit 6ee3c393eeb7 ("batman-adv: Demote batadv-on-batadv skip error message")
https://lore.kernel.org/all/[email protected]/
net/smc/af_smc.c
commit 4d08b7b57ece ("net/smc: Fix cleanup when register ULP fails")
commit 462791bbfa35 ("net/smc: add sysctl interface for SMC")
https://lore.kernel.org/all/[email protected]/
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
In particular these include:
1) Remove output of inv for scalars in print_verifier_state
2) Replace inv with scalar in verifier error messages
3) Remove _value suffixes for umin/umax/s32_min/etc (except map_value)
4) Remove output of id=0
5) Remove output of ref_obj_id=0
Signed-off-by: Mykola Lysenko <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull ucounts fix from Eric Biederman:
"Etienne Dechamps recently found a regression caused by enforcing
RLIMIT_NPROC for root where the rlimit was not previously enforced.
Michal Koutný had previously pointed out the inconsistency in
enforcing the RLIMIT_NPROC that had been on the root owned process
after the root user creates a user namespace.
Which makes the fix for the regression simply removing the
inconsistency"
* 'ucount-rlimit-fixes-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
ucounts: Fix systemd LimitNPROC with private users regression
|
|
On do_jit failure path, the header is freed by bpf_jit_binary_pack_free.
While bpf_jit_binary_pack_free doesn't require proper ro_header->size,
bpf_prog_pack_free still uses it. Set header->size in bpf_int_jit_compile
before calling bpf_jit_binary_pack_free.
Fixes: 1022a5498f6f ("bpf, x86_64: Use bpf_jit_binary_pack_alloc")
Fixes: 33c9805860e5 ("bpf: Introduce bpf_jit_binary_pack_[alloc|finalize|free]")
Reported-by: Kui-Feng Lee <[email protected]>
Signed-off-by: Song Liu <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: Yonghong Song <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
When trying to add a histogram against an event with the "cpu" field, it
was impossible due to "cpu" being a keyword to key off of the running CPU.
So to fix this, it was changed to "common_cpu" to match the other generic
fields (like "common_pid"). But since some scripts used "cpu" for keying
off of the CPU (for events that did not have "cpu" as a field, which is
most of them), a backward compatibility trick was added such that if "cpu"
was used as a key, and the event did not have "cpu" as a field name, then
it would fall back and switch over to "common_cpu".
This fix has a couple of subtle bugs. One was that when switching over to
"common_cpu", it did not change the field name, it just set a flag. But
the code still found a "cpu" field. The "cpu" field is used for filtering
and is returned when the event does not have a "cpu" field.
This was found by:
# cd /sys/kernel/tracing
# echo hist:key=cpu,pid:sort=cpu > events/sched/sched_wakeup/trigger
# cat events/sched/sched_wakeup/hist
Which showed the histogram unsorted:
{ cpu: 19, pid: 1175 } hitcount: 1
{ cpu: 6, pid: 239 } hitcount: 2
{ cpu: 23, pid: 1186 } hitcount: 14
{ cpu: 12, pid: 249 } hitcount: 2
{ cpu: 3, pid: 994 } hitcount: 5
Instead of hard coding the "cpu" checks, take advantage of the fact that
trace_event_field_field() returns a special field for "cpu" and "CPU" if
the event does not have "cpu" as a field. This special field has the
"filter_type" of "FILTER_CPU". Check that to test if the returned field is
of the CPU type instead of doing the string compare.
Also, fix the sorting bug by testing for the hist_field flag of
HIST_FIELD_FL_CPU when setting up the sort routine. Otherwise it will use
the special CPU field to know what compare routine to use, and since that
special field does not have a size, it returns tracing_map_cmp_none.
Cc: [email protected]
Fixes: 1e3bac71c505 ("tracing/histogram: Rename "cpu" to "common_cpu"")
Reported-by: Daniel Bristot de Oliveira <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
When CONFIG_BPF_JIT_ALWAYS_ON is enabled, /proc/sys/net/core/bpf_jit_enable
is permanently set to 1 and setting any other value than that will return
failure.
Add the above description in the help text of config BPF_JIT_ALWAYS_ON, and
then we can distinguish between BPF_JIT_ALWAYS_ON and BPF_JIT_DEFAULT_ON.
Signed-off-by: Tiezhu Yang <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Song Liu <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
For binaries that are statically linked, consecutive stack frames are
likely to be in the same VMA and therefore have the same build id.
On a real-world workload, we observed that 66% of CPU cycles in
__bpf_get_stackid() were spent on build_id_parse() and find_vma().
As an optimization for this case, we can cache the previous frame's
VMA: if the new frame has the same VMA as the previous one, we reuse the
previous one's build id.
We are holding the MM locks as reader across the entire loop, so we
don't need to worry about VMA going away.
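The caching idea can be sketched in user space as below. The structure,
the linear lookup and the slow_lookups counter are hypothetical stand-ins
for find_vma() and build_id_parse(); locking is not modelled.

```c
#include <assert.h>
#include <stddef.h>

struct vma_sketch {
	unsigned long start, end; /* half-open range [start, end) */
	int build_id;             /* stand-in for a parsed build id */
};

static unsigned long slow_lookups; /* counts expensive lookups */

/* Stand-in for the expensive VMA lookup plus build id parsing. */
static const struct vma_sketch *find_vma_slow(const struct vma_sketch *vmas,
					      size_t n, unsigned long ip)
{
	slow_lookups++;
	for (size_t i = 0; i < n; i++)
		if (ip >= vmas[i].start && ip < vmas[i].end)
			return &vmas[i];
	return NULL;
}

/* Resolve a build id per frame, reusing the previous frame's VMA when the
 * new address falls inside it. Unresolved frames get -1. */
static void resolve_build_ids(const struct vma_sketch *vmas, size_t n,
			      const unsigned long *ips, int *ids, size_t nr)
{
	const struct vma_sketch *prev = NULL;

	assert(ips != NULL && ids != NULL);
	for (size_t i = 0; i < nr; i++) {
		const struct vma_sketch *vma = prev;

		if (!vma || ips[i] < vma->start || ips[i] >= vma->end)
			vma = find_vma_slow(vmas, n, ips[i]);
		ids[i] = vma ? vma->build_id : -1;
		prev = vma;
	}
}
```

For a stack whose frames mostly sit in one mapping, the slow path runs
once per run of consecutive frames in the same VMA rather than once per
frame.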
Tested through "stacktrace_build_id" and "stacktrace_build_id_nmi" in
test_progs.
Suggested-by: Greg Thelen <[email protected]>
Signed-off-by: Hao Luo <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Reviewed-by: Pasha Tatashin <[email protected]>
Acked-by: Andrii Nakryiko <[email protected]>
Acked-by: Song Liu <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
When tracing the whole disk, 'dropped' and 'msg' will be created
under 'q->debugfs_dir' and 'bt->dir' is NULL, thus blk_trace_free()
won't remove those files. What's worse, the following UAF can be
triggered because of accessing stale 'dropped' and 'msg':
==================================================================
BUG: KASAN: use-after-free in blk_dropped_read+0x89/0x100
Read of size 4 at addr ffff88816912f3d8 by task blktrace/1188
CPU: 27 PID: 1188 Comm: blktrace Not tainted 5.17.0-rc4-next-20220217+ #469
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-4
Call Trace:
<TASK>
dump_stack_lvl+0x34/0x44
print_address_description.constprop.0.cold+0xab/0x381
? blk_dropped_read+0x89/0x100
? blk_dropped_read+0x89/0x100
kasan_report.cold+0x83/0xdf
? blk_dropped_read+0x89/0x100
kasan_check_range+0x140/0x1b0
blk_dropped_read+0x89/0x100
? blk_create_buf_file_callback+0x20/0x20
? kmem_cache_free+0xa1/0x500
? do_sys_openat2+0x258/0x460
full_proxy_read+0x8f/0xc0
vfs_read+0xc6/0x260
ksys_read+0xb9/0x150
? vfs_write+0x3d0/0x3d0
? fpregs_assert_state_consistent+0x55/0x60
? exit_to_user_mode_prepare+0x39/0x1e0
do_syscall_64+0x35/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fbc080d92fd
Code: ce 20 00 00 75 10 b8 00 00 00 00 0f 05 48 3d 01 f0 ff ff 73 31 c3 48 83 1
RSP: 002b:00007fbb95ff9cb0 EFLAGS: 00000293 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 00007fbb95ff9dc0 RCX: 00007fbc080d92fd
RDX: 0000000000000100 RSI: 00007fbb95ff9cc0 RDI: 0000000000000045
RBP: 0000000000000045 R08: 0000000000406299 R09: 00000000fffffffd
R10: 000000000153afa0 R11: 0000000000000293 R12: 00007fbb780008c0
R13: 00007fbb78000938 R14: 0000000000608b30 R15: 00007fbb780029c8
</TASK>
Allocated by task 1050:
kasan_save_stack+0x1e/0x40
__kasan_kmalloc+0x81/0xa0
do_blk_trace_setup+0xcb/0x410
__blk_trace_setup+0xac/0x130
blk_trace_ioctl+0xe9/0x1c0
blkdev_ioctl+0xf1/0x390
__x64_sys_ioctl+0xa5/0xe0
do_syscall_64+0x35/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
Freed by task 1050:
kasan_save_stack+0x1e/0x40
kasan_set_track+0x21/0x30
kasan_set_free_info+0x20/0x30
__kasan_slab_free+0x103/0x180
kfree+0x9a/0x4c0
__blk_trace_remove+0x53/0x70
blk_trace_ioctl+0x199/0x1c0
blkdev_common_ioctl+0x5e9/0xb30
blkdev_ioctl+0x1a5/0x390
__x64_sys_ioctl+0xa5/0xe0
do_syscall_64+0x35/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
The buggy address belongs to the object at ffff88816912f380
which belongs to the cache kmalloc-96 of size 96
The buggy address is located 88 bytes inside of
96-byte region [ffff88816912f380, ffff88816912f3e0)
The buggy address belongs to the page:
page:000000009a1b4e7c refcount:1 mapcount:0 mapping:0000000000000000 index:0x0f
flags: 0x17ffffc0000200(slab|node=0|zone=2|lastcpupid=0x1fffff)
raw: 0017ffffc0000200 ffffea00044f1100 dead000000000002 ffff88810004c780
raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff88816912f280: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
ffff88816912f300: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
>ffff88816912f380: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
^
ffff88816912f400: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
ffff88816912f480: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
==================================================================
Fixes: c0ea57608b69 ("blktrace: remove debugfs file dentries from struct blk_trace")
Signed-off-by: Yu Kuai <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
BTF mismatch can occur for a separately-built module even when the ABI is
otherwise compatible and nothing else would prevent it from loading successfully.
Add a new Kconfig to control how mismatches are handled. By default, preserve
the current behavior of refusing to load the module. If MODULE_ALLOW_BTF_MISMATCH
is enabled, load the module but ignore its BTF information.
Suggested-by: Yonghong Song <[email protected]>
Suggested-by: Michal Suchánek <[email protected]>
Signed-off-by: Connor O'Brien <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Shung-Hsi Yu <[email protected]>
Acked-by: Song Liu <[email protected]>
Link: https://lore.kernel.org/bpf/CAADnVQJ+OVPnBz8z3vNu8gKXX42jCUqfuvhWAyCQDu8N_yqqwQ@mail.gmail.com
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Pull dma-mapping fix from Christoph Hellwig:
- fix a swiotlb info leak (Halil Pasic)
* tag 'dma-mapping-5.17-1' of git://git.infradead.org/users/hch/dma-mapping:
swiotlb: fix info leak with DMA_FROM_DEVICE
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
- rtla (Real-Time Linux Analysis tool):
- fix typo in man page
- Update API -e to -E before it is released
- Error message fix and memory leak fix
- Partially uninline trace event soft disable to shrink text
- Fix function graph start up test
- Have triggers affect the trace instance they are in and not top level
- Have osnoise sleep in the units it says it uses
- Remove unused ftrace stub function
- Remove event probe redundant info from event in the buffer
- Fix group ownership setting in tracefs
- Ensure trace buffer is minimum size to prevent crashes
* tag 'trace-v5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
rtla/osnoise: Fix error message when failing to enable trace instance
rtla/osnoise: Free params at the exit
rtla/hist: Make -E the short version of --entries
tracing: Fix selftest config check for function graph start up test
tracefs: Set the group ownership in apply_options() not parse_options()
tracing/osnoise: Make osnoise_main to sleep for microseconds
ftrace: Remove unused ftrace_startup_enable() stub
tracing: Ensure trace buffer is at least 4096 bytes large
tracing: Uninline trace_trigger_soft_disabled() partly
eprobes: Remove redundant event type information
tracing: Have traceon and traceoff trigger honor the instance
tracing: Dump stacktrace trigger to the corresponding instance
rtla: Fix systme -> system typo on man page
|
|
CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS is required to test
direct tramp.
Link: https://lkml.kernel.org/r/bdc7e594e13b0891c1d61bc8d56c94b1890eaed7.1640017960.git.christophe.leroy@csgroup.eu
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
In cb80ddc67152 ("bpf: Convert bpf_preload.ko to use light skeleton.")
BPF preload was switched from a user mode process to an in-kernel light
skeleton. However, in kernel context, early in the boot sequence, the
first available FD can be 0 instead of the 3 that a user mode process
normally sees. FDs 0 and 1 are then used for the loaded BPF programs,
which prevents the init process from setting up stdin/stdout/stderr on
FDs 0, 1, and 2 as expected.
Before the fix:
ls -lah /proc/1/fd/*
lrwx------ 1 root root 64 Feb 23 17:20 /proc/1/fd/0 -> /dev/null
lrwx------ 1 root root 64 Feb 23 17:20 /proc/1/fd/1 -> /dev/null
lrwx------ 1 root root 64 Feb 23 17:20 /proc/1/fd/2 -> /dev/console
lrwx------ 1 root root 64 Feb 23 17:20 /proc/1/fd/6 -> /dev/console
lrwx------ 1 root root 64 Feb 23 17:20 /proc/1/fd/7 -> /dev/console
After the fix:
ls -lah /proc/1/fd/*
lrwx------ 1 root root 64 Feb 24 21:23 /proc/1/fd/0 -> /dev/console
lrwx------ 1 root root 64 Feb 24 21:23 /proc/1/fd/1 -> /dev/console
lrwx------ 1 root root 64 Feb 24 21:23 /proc/1/fd/2 -> /dev/console
Fix by closing the prog FDs after initialization. The struct bpf_prog
objects themselves are kept alive through direct kernel references taken
with bpf_link_get_from_fd().
Fixes: cb80ddc67152 ("bpf: Convert bpf_preload.ko to use light skeleton.")
Signed-off-by: Yucong Sun <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
osnoise's runtime and period are in the microsecond scale, but it
currently sleeps in the millisecond scale. This behavior stems from the
use of hwlat as the skeleton for osnoise.
Make osnoise sleep in the microsecond scale. Also, move the sleep to
a specialized function.
Link: https://lkml.kernel.org/r/302aa6c7bdf2d131719b22901905e9da122a11b2.1645197336.git.bristot@kernel.org
Cc: Ingo Molnar <[email protected]>
Signed-off-by: Daniel Bristot de Oliveira <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
When building with clang + CONFIG_DYNAMIC_FTRACE=n + W=1, there is a
warning:
kernel/trace/ftrace.c:7194:20: error: unused function 'ftrace_startup_enable' [-Werror,-Wunused-function]
static inline void ftrace_startup_enable(int command) { }
^
1 error generated.
Clang warns on instances of static inline functions in .c files with W=1
after commit 6863f5643dd7 ("kbuild: allow Clang to find unused static
inline functions for W=1 build").
The ftrace_startup_enable() stub has been unused since
commit e1effa0144a1 ("ftrace: Annotate the ops operation on update"),
where its use outside of the CONFIG_DYNAMIC_FTRACE section was replaced
by ftrace_startup_all(). Remove it to resolve the warning.
Link: https://lkml.kernel.org/r/[email protected]
Reported-by: kernel test robot <[email protected]>
Signed-off-by: Nathan Chancellor <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Booting the kernel with 'trace_buf_size=1' gives a warning at
boot during the ftrace selftests:
[ 0.892809] Running postponed tracer tests:
[ 0.892893] Testing tracer function:
[ 0.901899] Callback from call_rcu_tasks_trace() invoked.
[ 0.983829] Callback from call_rcu_tasks_rude() invoked.
[ 1.072003] .. bad ring buffer .. corrupted trace buffer ..
[ 1.091944] Callback from call_rcu_tasks() invoked.
[ 1.097695] PASSED
[ 1.097701] Testing dynamic ftrace: .. filter failed count=0 ..FAILED!
[ 1.353474] ------------[ cut here ]------------
[ 1.353478] WARNING: CPU: 0 PID: 1 at kernel/trace/trace.c:1951 run_tracer_selftest+0x13c/0x1b0
Therefore enforce a minimum of 4096 bytes to make the selftest pass.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Sven Schnelle <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
On a powerpc32 build with CONFIG_CC_OPTIMIZE_FOR_SIZE, the inline
keyword is not honored and trace_trigger_soft_disabled() appears
approximately 50 times in vmlinux.
Adding -Winline to the build, the following message appears:
./include/linux/trace_events.h:712:1: error: inlining failed in call to 'trace_trigger_soft_disabled': call is unlikely and code size would grow [-Werror=inline]
That function is rather big for an inlined function:
c003df60 <trace_trigger_soft_disabled>:
c003df60: 94 21 ff f0 stwu r1,-16(r1)
c003df64: 7c 08 02 a6 mflr r0
c003df68: 90 01 00 14 stw r0,20(r1)
c003df6c: bf c1 00 08 stmw r30,8(r1)
c003df70: 83 e3 00 24 lwz r31,36(r3)
c003df74: 73 e9 01 00 andi. r9,r31,256
c003df78: 41 82 00 10 beq c003df88 <trace_trigger_soft_disabled+0x28>
c003df7c: 38 60 00 00 li r3,0
c003df80: 39 61 00 10 addi r11,r1,16
c003df84: 4b fd 60 ac b c0014030 <_rest32gpr_30_x>
c003df88: 73 e9 00 80 andi. r9,r31,128
c003df8c: 7c 7e 1b 78 mr r30,r3
c003df90: 41 a2 00 14 beq c003dfa4 <trace_trigger_soft_disabled+0x44>
c003df94: 38 c0 00 00 li r6,0
c003df98: 38 a0 00 00 li r5,0
c003df9c: 38 80 00 00 li r4,0
c003dfa0: 48 05 c5 f1 bl c009a590 <event_triggers_call>
c003dfa4: 73 e9 00 40 andi. r9,r31,64
c003dfa8: 40 82 00 28 bne c003dfd0 <trace_trigger_soft_disabled+0x70>
c003dfac: 73 ff 02 00 andi. r31,r31,512
c003dfb0: 41 82 ff cc beq c003df7c <trace_trigger_soft_disabled+0x1c>
c003dfb4: 80 01 00 14 lwz r0,20(r1)
c003dfb8: 83 e1 00 0c lwz r31,12(r1)
c003dfbc: 7f c3 f3 78 mr r3,r30
c003dfc0: 83 c1 00 08 lwz r30,8(r1)
c003dfc4: 7c 08 03 a6 mtlr r0
c003dfc8: 38 21 00 10 addi r1,r1,16
c003dfcc: 48 05 6f 6c b c0094f38 <trace_event_ignore_this_pid>
c003dfd0: 38 60 00 01 li r3,1
c003dfd4: 4b ff ff ac b c003df80 <trace_trigger_soft_disabled+0x20>
However, it is located in a hot path, so inlining it is important.
But forcing inlining of the entire function by using __always_inline
increases the text size by approximately 20 kbytes.
Instead, split the function in two parts: one part with the likely
fast path, flagged __always_inline, and a second part out of line.
With this change, on a powerpc32 with CONFIG_CC_OPTIMIZE_FOR_SIZE,
vmlinux text increases by only 1.4 kbytes, which is partly
compensated by a decrease of vmlinux data by 7 kbytes.
On ppc64_defconfig, which has CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE, this
change reduces vmlinux text by more than 30 kbytes.
Link: https://lkml.kernel.org/r/69ce0986a52d026d381d612801d978aa4f977460.1644563295.git.christophe.leroy@csgroup.eu
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Currently, the event probes save the type of the event they are attached
to when recording the event. For example:
# echo 'e:switch sched/sched_switch prev_state=$prev_state prev_prio=$prev_prio next_pid=$next_pid next_prio=$next_prio' > dynamic_events
# cat events/eprobes/switch/format
name: switch
ID: 1717
format:
field:unsigned short common_type; offset:0; size:2; signed:0;
field:unsigned char common_flags; offset:2; size:1; signed:0;
field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
field:int common_pid; offset:4; size:4; signed:1;
field:unsigned int __probe_type; offset:8; size:4; signed:0;
field:u64 prev_state; offset:12; size:8; signed:0;
field:u64 prev_prio; offset:20; size:8; signed:0;
field:u64 next_pid; offset:28; size:8; signed:0;
field:u64 next_prio; offset:36; size:8; signed:0;
print fmt: "(%u) prev_state=0x%Lx prev_prio=0x%Lx next_pid=0x%Lx next_prio=0x%Lx", REC->__probe_type, REC->prev_state, REC->prev_prio, REC->next_pid, REC->next_prio
The __probe_type field adds 4 bytes to every event.
One of the reasons for creating eprobes is to limit what is traced in an
event, and therefore what is written into the ring buffer. Adding these
redundant 4 bytes to every event works against that goal.
The event that is recorded can be retrieved from the event probe itself,
which is available while the trace is happening. User space tools can
simply read the dynamic_events file to find the event a probe is attached to.
So there is really no reason to write this information into the ring
buffer for every event.
Link: https://lkml.kernel.org/r/[email protected]
Acked-by: Masami Hiramatsu <[email protected]>
Reviewed-by: Joel Fernandes <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
If a trigger is set on an event to disable or enable tracing within an
instance, then tracing should be disabled or enabled in that instance and
not at the top level. Affecting the top level instead is confusing to users.
Link: https://lkml.kernel.org/r/[email protected]
Cc: [email protected]
Fixes: ae63b31e4d0e2 ("tracing: Separate out trace events from global variables")
Tested-by: Daniel Bristot de Oliveira <[email protected]>
Reviewed-by: Tom Zanussi <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Long story short, recursively enforcing RLIMIT_NPROC when it is not
enforced on the process that creates a new user namespace causes
currently working code to fail. There is no reason to enforce
RLIMIT_NPROC recursively when we don't enforce it normally, so update
the code to detect this case.
I would like to simply use capable(CAP_SYS_RESOURCE) to detect when
RLIMIT_NPROC is not enforced upon the caller. Unfortunately, because
RLIMIT_NPROC is charged and checked for enforcement based upon the
real uid, using capable(), which is euid based, is inconsistent with reality.
Come as close as possible to testing for capable(CAP_SYS_RESOURCE) by
testing for when the real uid would match the conditions when
CAP_SYS_RESOURCE would be present if the real uid was the effective
uid.
Reported-by: Etienne Dechamps <[email protected]>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215596
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Cc: [email protected]
Fixes: 21d1c5e386bc ("Reimplement RLIMIT_NPROC on top of ucounts")
Reviewed-by: Kees Cook <[email protected]>
Signed-off-by: "Eric W. Biederman" <[email protected]>
|
|
The stacktrace event trigger is not dumping the stacktrace to the instance
where it was enabled, but to the global "instance."
Use the private_data, pointing to the trigger file, to figure out the
corresponding trace instance, and use it in the trigger action, like
snapshot_trigger does.
Link: https://lkml.kernel.org/r/afbb0b4f18ba92c276865bc97204d438473f4ebc.1645396236.git.bristot@kernel.org
Cc: [email protected]
Fixes: ae63b31e4d0e2 ("tracing: Separate out trace events from global variables")
Reviewed-by: Tom Zanussi <[email protected]>
Tested-by: Tom Zanussi <[email protected]>
Signed-off-by: Daniel Bristot de Oliveira <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
tools/testing/selftests/net/mptcp/mptcp_join.sh
34aa6e3bccd8 ("selftests: mptcp: add ip mptcp wrappers")
857898eb4b28 ("selftests: mptcp: add missing join check")
6ef84b1517e0 ("selftests: mptcp: more robust signal race test")
https://lore.kernel.org/all/[email protected]/
drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/act.h
drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/ct.c
fb7e76ea3f3b6 ("net/mlx5e: TC, Skip redundant ct clear actions")
c63741b426e11 ("net/mlx5e: Fix MPLSoUDP encap to use MPLS action information")
09bf97923224f ("net/mlx5e: TC, Move pedit_headers_action to parse_attr")
84ba8062e383 ("net/mlx5e: Test CT and SAMPLE on flow attr")
efe6f961cd2e ("net/mlx5e: CT, Don't set flow flag CT for ct clear flow")
3b49a7edec1d ("net/mlx5e: TC, Reject rules with multiple CT actions")
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from bpf and netfilter.
Current release - regressions:
- bpf: fix crash due to out of bounds access into reg2btf_ids
- mvpp2: always set port pcs ops, avoid null-deref
- eth: marvell: fix driver load from initrd
- eth: intel: revert "Fix reset bw limit when DCB enabled with 1 TC"
Current release - new code bugs:
- mptcp: fix race in overlapping signal events
Previous releases - regressions:
- xen-netback: revert hotplug-status changes causing devices to not
be configured
- dsa:
- avoid call to __dev_set_promiscuity() while rtnl_mutex isn't
held
- fix panic when removing unoffloaded port from bridge
- dsa: microchip: fix bridging with more than two member ports
Previous releases - always broken:
- bpf:
- fix crash due to incorrect copy_map_value when both spin lock
and timer are present in a single value
- fix a bpf_timer initialization issue with clang
- do not try bpf_msg_push_data with len 0
- add schedule points in batch ops
- nf_tables:
- unregister flowtable hooks on netns exit
- correct flow offload action array size
- fix a couple of memory leaks
- vsock: don't check owner in vhost_vsock_stop() while releasing
- gso: do not skip outer ip header in case of ipip and net_failover
- smc: use a mutex for locking "struct smc_pnettable"
- openvswitch: fix setting ipv6 fields causing hw csum failure
- mptcp: fix race in incoming ADD_ADDR option processing
- sysfs: add check for netdevice being present to speed_show
- sched: act_ct: fix flow table lookup after ct clear or switching
zones
- eth: intel: fixes for SR-IOV forwarding offloads
- eth: broadcom: fixes for selftests and error recovery
- eth: mellanox: flow steering and SR-IOV forwarding fixes
Misc:
- make __pskb_pull_tail() & pskb_carve_frag_list() drop_monitor
friends not report freed skbs as drops
- force inlining of checksum functions in net/checksum.h"
* tag 'net-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (85 commits)
net: mv643xx_eth: process retval from of_get_mac_address
ping: remove pr_err from ping_lookup
Revert "i40e: Fix reset bw limit when DCB enabled with 1 TC"
openvswitch: Fix setting ipv6 fields causing hw csum failure
ipv6: prevent a possible race condition with lifetimes
net/smc: Use a mutex for locking "struct smc_pnettable"
bnx2x: fix driver load from initrd
Revert "xen-netback: Check for hotplug-status existence before watching"
Revert "xen-netback: remove 'hotplug-status' once it has served its purpose"
net/mlx5e: Fix VF min/max rate parameters interchange mistake
net/mlx5e: Add missing increment of count
net/mlx5e: MPLSoUDP decap, fix check for unsupported matches
net/mlx5e: Fix MPLSoUDP encap to use MPLS action information
net/mlx5e: Add feature check for set fec counters
net/mlx5e: TC, Skip redundant ct clear actions
net/mlx5e: TC, Reject rules with forward and drop actions
net/mlx5e: TC, Reject rules with drop and modify hdr action
net/mlx5e: kTLS, Use CHECKSUM_UNNECESSARY for device-offloaded packets
net/mlx5e: Fix wrong return value on ioctl EEPROM query failure
net/mlx5: Fix possible deadlock on rule deletion
...
|
|
Add leading space to spdx tag
Use // for spdx c file comment
Replacements
resereved to reserved
inbetween to in between
everytime to every time
intutivie to intuitive
currenct to current
encontered to encountered
referenceing to referencing
upto to up to
exectuted to executed
Signed-off-by: Tom Rix <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
Acked-by: Song Liu <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:
- Fix for a subtle bug in the recent release_agent permission check
update
- Fix for a long-standing race condition between cpuset and cpu hotplug
- Comment updates
* 'for-5.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cpuset: Fix kernel-doc
cgroup-v1: Correct privileges check in release_agent writes
cgroup: clarify cgroup_css_set_fork()
cgroup/cpuset: Fix a race between cpuset_attach() and cpu hotplug
|