Age | Commit message (Collapse) | Author | Files | Lines |
|
Use perl's m@<match>@ match and not /<match>/ comparisons to avoid
an error using c90's // comment style.
Miscellanea:
o Use normal tab indentation and alignment
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/f08eb62458407a145cfedf959d1091af151cd665.1563575364.git.joe@perches.com
Signed-off-by: Joe Perches <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Add more types of lines that appear to be stack dumps that also include
hex lines that might otherwise be interpreted as commit IDs.
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/f7dc9727795db3802809a24162abe0b67e14123b.1563575364.git.joe@perches.com
Signed-off-by: Joe Perches <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
I'm seeing a bunch of debug prints from a user of print_hex_dump_bytes()
in my kernel logs, but I don't have CONFIG_DYNAMIC_DEBUG enabled nor do I
have DEBUG defined in my build. The problem is that
print_hex_dump_bytes() calls a wrapper function in lib/hexdump.c that
calls print_hex_dump() with KERN_DEBUG level. There are three cases to
consider here
1. CONFIG_DYNAMIC_DEBUG=y --> call dynamic_hex_dum()
2. CONFIG_DYNAMIC_DEBUG=n && DEBUG --> call print_hex_dump()
3. CONFIG_DYNAMIC_DEBUG=n && !DEBUG --> stub it out
Right now, that last case isn't detected and we still call
print_hex_dump() from the stub wrapper.
Let's make print_hex_dump_bytes() only call print_hex_dump_debug() so that
it works properly in all cases.
Case #1, print_hex_dump_debug() calls dynamic_hex_dump() and we get same
behavior. Case #2, print_hex_dump_debug() calls print_hex_dump() with
KERN_DEBUG and we get the same behavior. Case #3, print_hex_dump_debug()
is a nop, changing behavior to what we want, i.e. print nothing.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Stephen Boyd <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
When building with W=1, a number of warnings are issued:
CC lib/extable.o
lib/extable.c:63:6: warning: no previous prototype for 'sort_extable' [-Wmissing-prototypes]
63 | void sort_extable(struct exception_table_entry *start,
| ^~~~~~~~~~~~
lib/extable.c:75:6: warning: no previous prototype for 'trim_init_extable' [-Wmissing-prototypes]
75 | void trim_init_extable(struct module *m)
| ^~~~~~~~~~~~~~~~~
lib/extable.c:115:1: warning: no previous prototype for 'search_extable' [-Wmissing-prototypes]
115 | search_extable(const struct exception_table_entry *base,
| ^~~~~~~~~~~~~~
Add the missing #include for the prototypes.
Link: http://lkml.kernel.org/r/45574.1565235784@turing-police
Signed-off-by: Valdis Kletnieks <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
When building with W=1, we get some warnings:
l CC lib/generic-radix-tree.o
lib/generic-radix-tree.c:39:10: warning: no previous prototype for 'genradix_root_to_depth' [-Wmissing-prototypes]
39 | unsigned genradix_root_to_depth(struct genradix_root *r)
| ^~~~~~~~~~~~~~~~~~~~~~
lib/generic-radix-tree.c:44:23: warning: no previous prototype for 'genradix_root_to_node' [-Wmissing-prototypes]
44 | struct genradix_node *genradix_root_to_node(struct genradix_root *r)
| ^~~~~~~~~~~~~~~~~~~~~
They're not used anywhere else, so make them static inline.
Link: http://lkml.kernel.org/r/46923.1565236485@turing-police
Signed-off-by: Valdis Kletnieks <[email protected]>
Cc: Kent Overstreet <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
As already done for snprintf(), add a check in strscpy() for giant (i.e.
likely negative and/or miscalculated) copy sizes, WARN, and error out.
Link: http://lkml.kernel.org/r/201907260928.23DE35406@keescook
Signed-off-by: Kees Cook <[email protected]>
Cc: Joe Perches <[email protected]>
Cc: Rasmus Villemoes <[email protected]>
Cc: Yann Droneaud <[email protected]>
Cc: David Laight <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Stephen Kitt <[email protected]>
Cc: Jann Horn <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
There are many of those warnings.
In file included from ./arch/powerpc/include/asm/paca.h:15,
from ./arch/powerpc/include/asm/current.h:13,
from ./include/linux/thread_info.h:21,
from ./include/asm-generic/preempt.h:5,
from ./arch/powerpc/include/generated/asm/preempt.h:1,
from ./include/linux/preempt.h:78,
from ./include/linux/spinlock.h:51,
from fs/fs-writeback.c:19:
In function 'strncpy',
inlined from 'perf_trace_writeback_page_template' at
./include/trace/events/writeback.h:56:1:
./include/linux/string.h:260:9: warning: '__builtin_strncpy' specified
bound 32 equals destination size [-Wstringop-truncation]
return __builtin_strncpy(p, q, size);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fix it by using the new strscpy_pad() which was introduced in "lib/string:
Add strscpy_pad() function" and will always be NUL-terminated instead of
strncpy(). Also, change strlcpy() to use strscpy_pad() in this file for
consistency.
Link: http://lkml.kernel.org/r/[email protected]
Fixes: 455b2864686d ("writeback: Initial tracing support")
Fixes: 028c2dd184c0 ("writeback: Add tracing to balance_dirty_pages")
Fixes: e84d0a4f8e39 ("writeback: trace event writeback_queue_io")
Fixes: b48c104d2211 ("writeback: trace event bdi_dirty_ratelimit")
Fixes: cc1676d917f3 ("writeback: Move requeueing when I_SYNC set to writeback_sb_inodes()")
Fixes: 9fb0a7da0c52 ("writeback: add more tracepoints")
Signed-off-by: Qian Cai <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Cc: Tobin C. Harding <[email protected]>
Cc: Steven Rostedt (VMware) <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Fengguang Wu <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Joe Perches <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Jann Horn <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Nitin Gote <[email protected]>
Cc: Rasmus Villemoes <[email protected]>
Cc: Stephen Kitt <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
core-api should show all the various string functions including the newly
added stracpy and stracpy_pad.
Miscellanea:
o Update the Returns: value for strscpy
o fix a defect with %NUL)
[[email protected]: correct return of -E2BIG descriptions]
Link: http://lkml.kernel.org/r/29f998b4c1a9d69fbeae70500ba0daa4b340c546.1563889130.git.joe@perches.com
Link: http://lkml.kernel.org/r/224a6ebf39955f4107c0c376d66155d970e46733.1563841972.git.joe@perches.com
Signed-off-by: Joe Perches <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Stephen Kitt <[email protected]>
Cc: Nitin Gote <[email protected]>
Cc: Rasmus Villemoes <[email protected]>
Cc: Jann Horn <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Change the definition of the RBCOMPUTE function. The propagate callback
repeatedly calls RBCOMPUTE as it moves from leaf to root. it wants to
stop recomputing once the augmented subtree information doesn't change.
This was previously checked using the == operator, but that only works
when the augmented subtree information is a scalar field. This commit
modifies the RBCOMPUTE function so that it now sets the augmented subtree
information instead of returning it, and returns a boolean value
indicating if the propagate callback should stop.
The motivation for this change is that I want to introduce augmented
rbtree uses where the augmented data for the subtree is a struct instead
of a scalar.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Michel Lespinasse <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Cc: David Howells <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Uladzislau Rezki <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Add RB_DECLARE_CALLBACKS_MAX, which generates augmented rbtree callbacks
for the case where the augmented value is a scalar whose definition
follows a max(f(node)) pattern. This actually covers all present uses of
RB_DECLARE_CALLBACKS, and saves some (source) code duplication in the
various RBCOMPUTE function definitions.
[[email protected]: fix mm/vmalloc.c]
Link: http://lkml.kernel.org/r/CANN689FXgK13wDYNh1zKxdipeTuALG4eKvKpsdZqKFJ-rvtGiQ@mail.gmail.com
[[email protected]: re-add check to check_augmented()]
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Michel Lespinasse <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Cc: David Howells <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Uladzislau Rezki <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Patch series "make RB_DECLARE_CALLBACKS more generic", v3.
These changes are intended to make the RB_DECLARE_CALLBACKS macro more
generic (allowing the aubmented subtree information to be a struct instead
of a scalar).
I have verified the compiled lib/interval_tree.o and mm/mmap.o files to
check that they didn't change. This held as expected for interval_tree.o;
mmap.o did have some changes which could be reverted by marking
__vma_link_rb as noinline. I did not add such a change to the patchset; I
felt it was reasonable enough to leave the inlining decision up to the
compiler.
This patch (of 3):
Add a short comment summarizing the arguments to RB_DECLARE_CALLBACKS.
The arguments are also now capitalized. This copies the style of the
INTERVAL_TREE_DEFINE macro.
No functional changes in this commit, only comments and capitalization.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Michel Lespinasse <[email protected]>
Acked-by: Davidlohr Bueso <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Cc: David Howells <[email protected]>
Cc: Uladzislau Rezki <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
As was already noted in rbtree.h, the logic to cache rb_first (or
rb_last) can easily be implemented externally to the core rbtree api.
This commit takes the changes applied to the include/linux/ and lib/
rbtree files in 9f973cb38088 ("lib/rbtree: avoid generating code twice
for the cached versions"), and applies these to the
tools/include/linux/ and tools/lib/ files as well to keep them
synchronized.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Michel Lespinasse <[email protected]>
Cc: David Howells <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
When building with W=1, gcc properly complains that there's no prototypes:
CC kernel/elfcore.o
kernel/elfcore.c:7:17: warning: no previous prototype for 'elf_core_extra_phdrs' [-Wmissing-prototypes]
7 | Elf_Half __weak elf_core_extra_phdrs(void)
| ^~~~~~~~~~~~~~~~~~~~
kernel/elfcore.c:12:12: warning: no previous prototype for 'elf_core_write_extra_phdrs' [-Wmissing-prototypes]
12 | int __weak elf_core_write_extra_phdrs(struct coredump_params *cprm, loff_t offset)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
kernel/elfcore.c:17:12: warning: no previous prototype for 'elf_core_write_extra_data' [-Wmissing-prototypes]
17 | int __weak elf_core_write_extra_data(struct coredump_params *cprm)
| ^~~~~~~~~~~~~~~~~~~~~~~~~
kernel/elfcore.c:22:15: warning: no previous prototype for 'elf_core_extra_data_size' [-Wmissing-prototypes]
22 | size_t __weak elf_core_extra_data_size(void)
| ^~~~~~~~~~~~~~~~~~~~~~~~
Provide the include file so gcc is happy, and we don't have potential code drift
Link: http://lkml.kernel.org/r/29875.1565224705@turing-police
Signed-off-by: Valdis Kletnieks <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Add a header include guard just in case.
My motivation is to allow Kbuild to detect missing include guard:
https://patchwork.kernel.org/patch/11063011/
Before I enable this checker I want to fix as many headers as possible.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Masahiro Yamada <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Thomas has noticed the following NULL ptr dereference when using cgroup
v1 kmem limit:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
PGD 0
P4D 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 3 PID: 16923 Comm: gtk-update-icon Not tainted 4.19.51 #42
Hardware name: Gigabyte Technology Co., Ltd. Z97X-Gaming G1/Z97X-Gaming G1, BIOS F9 07/31/2015
RIP: 0010:create_empty_buffers+0x24/0x100
Code: cd 0f 1f 44 00 00 0f 1f 44 00 00 41 54 49 89 d4 ba 01 00 00 00 55 53 48 89 fb e8 97 fe ff ff 48 89 c5 48 89 c2 eb 03 48 89 ca <48> 8b 4a 08 4c 09 22 48 85 c9 75 f1 48 89 6a 08 48 8b 43 18 48 8d
RSP: 0018:ffff927ac1b37bf8 EFLAGS: 00010286
RAX: 0000000000000000 RBX: fffff2d4429fd740 RCX: 0000000100097149
RDX: 0000000000000000 RSI: 0000000000000082 RDI: ffff9075a99fbe00
RBP: 0000000000000000 R08: fffff2d440949cc8 R09: 00000000000960c0
R10: 0000000000000002 R11: 0000000000000000 R12: 0000000000000000
R13: ffff907601f18360 R14: 0000000000002000 R15: 0000000000001000
FS: 00007fb55b288bc0(0000) GS:ffff90761f8c0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000008 CR3: 000000007aebc002 CR4: 00000000001606e0
Call Trace:
create_page_buffers+0x4d/0x60
__block_write_begin_int+0x8e/0x5a0
? ext4_inode_attach_jinode.part.82+0xb0/0xb0
? jbd2__journal_start+0xd7/0x1f0
ext4_da_write_begin+0x112/0x3d0
generic_perform_write+0xf1/0x1b0
? file_update_time+0x70/0x140
__generic_file_write_iter+0x141/0x1a0
ext4_file_write_iter+0xef/0x3b0
__vfs_write+0x17e/0x1e0
vfs_write+0xa5/0x1a0
ksys_write+0x57/0xd0
do_syscall_64+0x55/0x160
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Tetsuo then noticed that this is because the __memcg_kmem_charge_memcg
fails __GFP_NOFAIL charge when the kmem limit is reached. This is a wrong
behavior because nofail allocations are not allowed to fail. Normal
charge path simply forces the charge even if that means to cross the
limit. Kmem accounting should be doing the same.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Michal Hocko <[email protected]>
Reported-by: Thomas Lindroth <[email protected]>
Debugged-by: Tetsuo Handa <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Vladimir Davydov <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Thomas Lindroth <[email protected]>
Cc: Shakeel Butt <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
git://anongit.freedesktop.org/drm/drm-misc into drm-next
- Multiple panfrost fixes for regulator support and page fault handling
- Some cleanups and fixes in the self-refresh helpers
- Some cleanups and fixes in the atomic helpers
Signed-off-by: Dave Airlie <[email protected]>
From: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/20190923160946.nvaqiw5j7fpcdhc7@gilmour
|
|
Some recent changes in latest Clang started causing the following
warning when unrolling strobemeta test case main loop:
progs/strobemeta.h:416:2: warning: loop not unrolled: the optimizer was
unable to perform the requested transformation; the transformation might
be disabled or specified as part of an unsupported transformation
ordering [-Wpass-failed=transform-warning]
This patch simplifies loop's exit condition to depend only on constant
max iteration number (STROBE_MAX_MAP_ENTRIES), while moving early
termination logic inside the loop body. The changes are equivalent from
program logic standpoint, but fixes the warning. It also appears to
improve generated BPF code, as it fixes previously failing non-unrolled
strobemeta test cases.
Cc: Alexei Starovoitov <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
|
|
Remove no longer used variables and avoid compiler warnings.
Signed-off-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
|
|
Some compilers emit warning for potential uninitialized next_id usage.
The code is correct, but control flow is too complicated for some
compilers to figure this out. Re-initialize next_id to satisfy
compiler.
Signed-off-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
|
|
When kzalloc() failed, NULL was returned to the caller, which
tested the pointer with IS_ERR(), which didn't match, so the
pointer was used later, resulting in a NULL dereference.
Return ERR_PTR(-ENOMEM) instead of NULL.
Reported-by: [email protected]
Fixes: 0402acd683c6 ("xsk: remove AF_XDP socket from map when the socket is released")
Signed-off-by: Jonathan Lemon <[email protected]>
Acked-by: Björn Töpel <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
|
|
This is the same problem I found earlier in test_sockopt_inherit:
there is a race between server thread doing accept() and client
thread doing connect(). Let's explicitly synchronize them via
pthread conditional variable.
v2:
* don't exit from server_thread without signaling condvar,
fixes possible issue where main() would wait forever (Andrii Nakryiko)
Fixes: b55873984dab ("selftests/bpf: test BPF_SOCK_OPS_RTT_CB")
Signed-off-by: Stanislav Fomichev <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
|
|
In dcn*_create_resource_pool the allocated memory should be released if
construct pool fails.
Reviewed-by: Harry Wentland <[email protected]>
Signed-off-by: Navid Emamdoost <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
They go on accumulating there like the debug.h one, that was introduced
here:
f23610245c1a ("perf list: Add debug support for outputing alias string")
But then, when that need is removed via:
2073ad3326b7 ("perf tools: Factor out PMU matching in parser")
The thing stays there, so continue the house cleaning spree...
list.h not needed, no macros from there are used, and 'struct
list_head' is in linux/types.h, ditto for util.h, no need for that as
well.
Cc: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
With Java 11 there is no seperate JRE anymore.
Details:
https://coderanch.com/t/701603/java/JRE-JDK
Therefore the detection of the JRE needs to be adapted.
This change works for s390 and x86. I have not tested other platforms.
Committer testing:
Continues to work with the OpenJDK 8:
$ rm -f ~acme/lib64/libperf-jvmti.so
$ rpm -qa | grep jdk-devel
java-1.8.0-openjdk-devel-1.8.0.222.b10-0.fc30.x86_64
$ git log --oneline -1
a51937170f33 (HEAD -> perf/core) perf build: Add detection of java-11-openjdk-devel package
$ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf ; make -C tools/perf O=/tmp/build/perf install > /dev/null 2>1
$ ls -la ~acme/lib64/libperf-jvmti.so
-rwxr-xr-x. 1 acme acme 230744 Sep 24 16:46 /home/acme/lib64/libperf-jvmti.so
$
Suggested-by: Andreas Krebbel <[email protected]>
Signed-off-by: Thomas Richter <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Hendrik Brueckner <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Enable JVMTI support for s390 perf tool chain.
Signed-off-by: Thomas Richter <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Hendrik Brueckner <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This patch is to remove following hardware events
from JSON file which are not supported on POWER8.
pm_l3_p0_grp_pump
pm_l3_p0_lco_data
pm_l3_p0_lco_no_data
pm_l3_p0_lco_rty
Note: Unfortunately power8 event list is not publicly available.
Fixes: c3b4d5c4afb0 ("perf vendor events: Remove P8 HW events which are not supported")
Signed-off-by: Mamatha Inamdar <[email protected]>
Acked-by: Ravi Bangoria <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
I'm not fully sure if this is the correct fix, but without this I get
crashes on more complex perf stat metric usages. The problem is that
part of the state gets freed when a weak group fails, but then is later
still used. Just don't free the ids, we're going to reuse them anyways
on the weak group retry.
For example:
% perf stat -M IpB,IpCall,IpTB,IPC,Retiring_SMT,Frontend_Bound_SMT,Kernel_Utilization,CPU_Utilization --metric-only -a -I 1000 sleep 2
crashes and gives in valgrind:
=21527== Invalid write of size 8
==21527== at 0x4EE582: hlist_add_head (list.h:644)
==21527== by 0x4EFD3C: perf_evlist__id_hash (evlist.c:477)
==21527== by 0x4EFD99: perf_evlist__id_add (evlist.c:483)
==21527== by 0x4EFF15: perf_evlist__id_add_fd (evlist.c:524)
==21527== by 0x4FC693: store_evsel_ids (evsel.c:2969)
==21527== by 0x4FC76C: perf_evsel__store_ids (evsel.c:2986)
==21527== by 0x450DA7: __run_perf_stat (builtin-stat.c:519)
==21527== by 0x451285: run_perf_stat (builtin-stat.c:636)
==21527== by 0x454619: cmd_stat (builtin-stat.c:1966)
==21527== by 0x4D557D: run_builtin (perf.c:310)
==21527== by 0x4D57EA: handle_internal_command (perf.c:362)
==21527== by 0x4D5931: run_argv (perf.c:406)
==21527== Address 0x12e3f008 is 104 bytes inside a block of size 2,056 free'd
==21527== at 0x4839A0C: free (vg_replace_malloc.c:540)
==21527== by 0x627139: xyarray__delete (xyarray.c:32)
==21527== by 0x4F6BE4: perf_evsel__free_id (evsel.c:1253)
==21527== by 0x4FA11F: evsel__close (evsel.c:1994)
==21527== by 0x4F30A3: perf_evlist__reset_weak_group (evlist.c:1783)
==21527== by 0x450B47: __run_perf_stat (builtin-stat.c:466)
==21527== by 0x451285: run_perf_stat (builtin-stat.c:636)
==21527== by 0x454619: cmd_stat (builtin-stat.c:1966)
==21527== by 0x4D557D: run_builtin (perf.c:310)
==21527== by 0x4D57EA: handle_internal_command (perf.c:362)
==21527== by 0x4D5931: run_argv (perf.c:406)
==21527== by 0x4D5CAE: main (perf.c:531)
==21527== Block was alloc'd at
==21527== at 0x483AB1A: calloc (vg_replace_malloc.c:762)
==21527== by 0x627024: zalloc (zalloc.c:8)
==21527== by 0x627088: xyarray__new (xyarray.c:10)
==21527== by 0x4F6B20: perf_evsel__alloc_id (evsel.c:1237)
==21527== by 0x4FC74E: perf_evsel__store_ids (evsel.c:2983)
==21527== by 0x450DA7: __run_perf_stat (builtin-stat.c:519)
==21527== by 0x451285: run_perf_stat (builtin-stat.c:636)
==21527== by 0x454619: cmd_stat (builtin-stat.c:1966)
==21527== by 0x4D557D: run_builtin (perf.c:310)
==21527== by 0x4D57EA: handle_internal_command (perf.c:362)
==21527== by 0x4D5931: run_argv (perf.c:406)
==21527== by 0x4D5CAE: main (perf.c:531)
Signed-off-by: Andi Kleen <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Make sure to not free the name passed in by the caller, but free all the
allocated ids when parsing expressions.
The loop at the end knows that the first entry shouldn't be freed, so
make sure the caller name is the first entry.
Fixes
% perf stat -M IpB,IpCall,IpTB,IPC,Retiring_SMT,Frontend_Bound_SMT,Kernel_Utilization,CPU_Utilization --metric-only -a -I 1000 sleep 2
valgrind:
1.009943231 ==21527== Invalid read of size 1
==21527== at 0x483CB74: strcmp (vg_replace_strmem.c:849)
==21527== by 0x582CF8: collect_all_aliases (stat-display.c:554)
==21527== by 0x582EB3: collect_data (stat-display.c:577)
==21527== by 0x583A32: print_counter_aggr (stat-display.c:806)
==21527== by 0x584FAD: perf_evlist__print_counters (stat-display.c:1200)
==21527== by 0x45133A: print_counters (builtin-stat.c:655)
==21527== by 0x450629: process_interval (builtin-stat.c:353)
==21527== by 0x450FBD: __run_perf_stat (builtin-stat.c:564)
==21527== by 0x451285: run_perf_stat (builtin-stat.c:636)
==21527== by 0x454619: cmd_stat (builtin-stat.c:1966)
==21527== by 0x4D557D: run_builtin (perf.c:310)
==21527== by 0x4D57EA: handle_internal_command (perf.c:362)
==21527== Address 0x12826cd0 is 0 bytes inside a block of size 25 free'd
==21527== at 0x4839A0C: free (vg_replace_malloc.c:540)
==21527== by 0x627041: __zfree (zalloc.c:13)
==21527== by 0x57F66A: generic_metric (stat-shadow.c:814)
==21527== by 0x580B21: perf_stat__print_shadow_stats (stat-shadow.c:1057)
==21527== by 0x58418E: print_metric_headers (stat-display.c:943)
==21527== by 0x5844BC: print_interval (stat-display.c:1004)
==21527== by 0x584DEB: perf_evlist__print_counters (stat-display.c:1172)
==21527== by 0x45133A: print_counters (builtin-stat.c:655)
==21527== by 0x450629: process_interval (builtin-stat.c:353)
==21527== by 0x450FBD: __run_perf_stat (builtin-stat.c:564)
==21527== by 0x451285: run_perf_stat (builtin-stat.c:636)
==21527== by 0x454619: cmd_stat (builtin-stat.c:1966)
==21527== Block was alloc'd at
==21527== at 0x483880B: malloc (vg_replace_malloc.c:309)
==21527== by 0x51677DE: strdup (in /usr/lib64/libc-2.29.so)
==21527== by 0x506457: parse_events_name (parse-events.c:1754)
==21527== by 0x5550BB: parse_events_parse (parse-events.y:214)
==21527== by 0x50694D: parse_events__scanner (parse-events.c:1887)
==21527== by 0x506AEF: parse_events (parse-events.c:1927)
==21527== by 0x521D8B: metricgroup__parse_groups (metricgroup.c:527)
==21527== by 0x45156F: parse_metric_groups (builtin-stat.c:721)
==21527== by 0x6228A9: get_value (parse-options.c:243)
==21527== by 0x62363F: parse_short_opt (parse-options.c:348)
==21527== by 0x62363F: parse_options_step (parse-options.c:536)
==21527== by 0x62363F: parse_options_subcommand (parse-options.c:651)
==21527== by 0x453C1D: cmd_stat (builtin-stat.c:1718)
==21527== by 0x4D557D: run_builtin (perf.c:310)
and also a leak report.
Committer testing:
Before:
# perf stat -M IpB,IpCall,IpTB,IPC,Retiring_SMT,Frontend_Bound_SMT,Kernel_Utilization,CPU_Utilization --metric-only -a -I 1000 sleep 2
# time CPU_Utilization
1.000470810 free(): double free detected in tcache 2
Aborted (core dumped)
#
After:
# perf stat -M IpB,IpCall,IpTB,IPC,Retiring_SMT,Frontend_Bound_SMT,Kernel_Utilization,CPU_Utilization --metric-only -a -I 1000 sleep 2
# time CPU_Utilization
1.000494752 0.1
2.001105112 0.1
#
Signed-off-by: Andi Kleen <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The perf_sample struct definition and the event_attr_init() are in
util/event.h, but some places were getting it thru an otherwise needless
util/mmap.h header, fix it by including util/event.h directly.
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Further reducing the size of util/evsel.h.
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Ditch it, noone is using it, one more stdio.h include in a hot header.
Fix the fallout in parse-events.y, where we end up using a FILE pointer,
I think due to YYDEBUG being set and in some places, like Amazon Linux 1
we don't get stdio.h included by luck, like in most other places, add a
explicit stdio.h include directive.
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
We already had evsel_fprintf.c, add its counterpart, so that we can
reduce evsel.h a bit more.
We needed a new perf_event_attr_fprintf.c file so as to have a separate
object to link with the python binding in tools/perf/util/python-ext-sources
and not drag symbol_conf, etc into the python binding.
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
So that we an later link it to the python binding without having to
drag the symbol object files.
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
KVM was incorrectly checking vmcs12->host_ia32_efer even if the "load
IA32_EFER" exit control was reset. Also, some checks were not using
the new CC macro for tracing.
Cleanup everything so that the vCPU's 64-bit mode is determined
directly from EFER_LMA and the VMCS checks are based on that, which
matches section 26.2.4 of the SDM.
Cc: Sean Christopherson <[email protected]>
Cc: Krish Sadhukhan <[email protected]>
Fixes: 5845038c111db27902bc220a4f70070fe945871c
Reviewed-by: Jim Mattson <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Pull ceph updates from Ilya Dryomov:
"The highlights are:
- automatic recovery of a blacklisted filesystem session (Zheng Yan).
This is disabled by default and can be enabled by mounting with the
new "recover_session=clean" option.
- serialize buffered reads and O_DIRECT writes (Jeff Layton). Care is
taken to avoid serializing O_DIRECT reads and writes with each
other, this is based on the exclusion scheme from NFS.
- handle large osdmaps better in the face of fragmented memory
(myself)
- don't limit what security.* xattrs can be get or set (Jeff Layton).
We were overly restrictive here, unnecessarily preventing things
like file capability sets stored in security.capability from
working.
- allow copy_file_range() within the same inode and across different
filesystems within the same cluster (Luis Henriques)"
* tag 'ceph-for-5.4-rc1' of git://github.com/ceph/ceph-client: (41 commits)
ceph: call ceph_mdsc_destroy from destroy_fs_client
libceph: use ceph_kvmalloc() for osdmap arrays
libceph: avoid a __vmalloc() deadlock in ceph_kvmalloc()
ceph: allow object copies across different filesystems in the same cluster
ceph: include ceph_debug.h in cache.c
ceph: move static keyword to the front of declarations
rbd: pull rbd_img_request_create() dout out into the callers
ceph: reconnect connection if session hang in opening state
libceph: drop unused con parameter of calc_target()
ceph: use release_pages() directly
rbd: fix response length parameter for encoded strings
ceph: allow arbitrary security.* xattrs
ceph: only set CEPH_I_SEC_INITED if we got a MAC label
ceph: turn ceph_security_invalidate_secctx into static inline
ceph: add buffered/direct exclusionary locking for reads and writes
libceph: handle OSD op ceph_pagelist_append() errors
ceph: don't return a value from void function
ceph: don't freeze during write page faults
ceph: update the mtime when truncating up
ceph: fix indentation in __get_snap_name()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse updates from Miklos Szeredi:
- Continue separating the transport (user/kernel communication) and the
filesystem layers of fuse. Getting rid of most layering violations
will allow for easier cleanup and optimization later on.
- Prepare for the addition of the virtio-fs filesystem. The actual
filesystem will be introduced by a separate pull request.
- Convert to new mount API.
- Various fixes, optimizations and cleanups.
* tag 'fuse-update-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: (55 commits)
fuse: Make fuse_args_to_req static
fuse: fix memleak in cuse_channel_open
fuse: fix beyond-end-of-page access in fuse_parse_cache()
fuse: unexport fuse_put_request
fuse: kmemcg account fs data
fuse: on 64-bit store time in d_fsdata directly
fuse: fix missing unlock_page in fuse_writepage()
fuse: reserve byteswapped init opcodes
fuse: allow skipping control interface and forced unmount
fuse: dissociate DESTROY from fuseblk
fuse: delete dentry if timeout is zero
fuse: separate fuse device allocation and installation in fuse_conn
fuse: add fuse_iqueue_ops callbacks
fuse: extract fuse_fill_super_common()
fuse: export fuse_dequeue_forget() function
fuse: export fuse_get_unique()
fuse: export fuse_send_init_request()
fuse: export fuse_len_args()
fuse: export fuse_end_request()
fuse: fix request limit
...
|
|
Pull tpm fixes from Jarkko Sakkinen.
* tag 'tpmdd-next-20190925' of git://git.infradead.org/users/jjs/linux-tpmdd:
tpm: Wrap the buffer from the caller to tpm_buf in tpm_send()
MAINTAINERS: keys: Update path to trusted.h
KEYS: trusted: correctly initialize digests and fix locking issue
selftests/tpm2: Add log and *.pyc to .gitignore
selftests/tpm2: Add the missing TEST_FILES assignment
|
|
Pull iomap updates from Darrick Wong:
"After last week's failed pull request attempt, I scuttled everything
in the branch except for the directio endio api changes, which were
trivial. Everything else will simply have to wait for the next cycle.
Summary:
- Report both io errors and short io results to the directio endio
handler.
- Allow directio callers to pass an ops structure to iomap_dio_rw"
* tag 'iomap-5.4-merge-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
iomap: move the iomap_dio_rw ->end_io callback into a structure
iomap: split size and error for iomap_dio_rw ->end_io
|
|
The EAS wake-up path computes the system energy for several CPU
candidates: the CPU with maximum spare capacity in each performance
domain, and the prev_cpu. However, if prev_cpu also happens to be the
CPU with maximum spare capacity in its performance domain, the energy
calculation is still done twice, unnecessarily.
Add a condition to filter out this corner case before doing the energy
calculation.
Reported-by: Pavan Kondeti <[email protected]>
Signed-off-by: Quentin Perret <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Fixes: eb92692b2544 ("sched/fair: Speed-up energy-aware wake-ups")
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
update_max_interval() is called in both CPUHP_AP_SCHED_STARTING's startup
and teardown callbacks, but it turns out it's also called at the end of
the startup callback of CPUHP_AP_ACTIVE (which is further down the
startup sequence).
There's no point in repeating this interval update in the startup sequence
since the CPU will remain online until it goes down the teardown path.
Remove the redundant call in sched_cpu_activate() (CPUHP_AP_ACTIVE).
Signed-off-by: Valentin Schneider <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
preempt_schedule_irq() is the one that should be called on return from
interrupt, clean up the comment to avoid any ambiguity.
Signed-off-by: Valentin Schneider <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Thomas Gleixner <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Commit:
de53fd7aedb1 ("sched/fair: Fix low cpu usage with high throttling by removing expiration of cpu-local slices")
introduced a few compilation warnings:
kernel/sched/fair.c: In function '__refill_cfs_bandwidth_runtime':
kernel/sched/fair.c:4365:6: warning: variable 'now' set but not used [-Wunused-but-set-variable]
kernel/sched/fair.c: In function 'start_cfs_bandwidth':
kernel/sched/fair.c:4992:6: warning: variable 'overrun' set but not used [-Wunused-but-set-variable]
Also, __refill_cfs_bandwidth_runtime() does no longer update the
expiration time, so fix the comments accordingly.
Signed-off-by: Qian Cai <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Ben Segall <[email protected]>
Reviewed-by: Dave Chiluk <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Fixes: de53fd7aedb1 ("sched/fair: Fix low cpu usage with high throttling by removing expiration of cpu-local slices")
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
An oops can be triggered in the scheduler when running qemu on arm64:
Unable to handle kernel paging request at virtual address ffff000008effe40
Internal error: Oops: 96000007 [#1] SMP
Process migration/0 (pid: 12, stack limit = 0x00000000084e3736)
pstate: 20000085 (nzCv daIf -PAN -UAO)
pc : __ll_sc___cmpxchg_case_acq_4+0x4/0x20
lr : move_queued_task.isra.21+0x124/0x298
...
Call trace:
__ll_sc___cmpxchg_case_acq_4+0x4/0x20
__migrate_task+0xc8/0xe0
migration_cpu_stop+0x170/0x180
cpu_stopper_thread+0xec/0x178
smpboot_thread_fn+0x1ac/0x1e8
kthread+0x134/0x138
ret_from_fork+0x10/0x18
__set_cpus_allowed_ptr() will choose an active dest_cpu in affinity mask to
migrage the process if process is not currently running on any one of the
CPUs specified in affinity mask. __set_cpus_allowed_ptr() will choose an
invalid dest_cpu (dest_cpu >= nr_cpu_ids, 1024 in my virtual machine) if
CPUS in an affinity mask are deactived by cpu_down after cpumask_intersects
check. cpumask_test_cpu() of dest_cpu afterwards is overflown and may pass if
corresponding bit is coincidentally set. As a consequence, kernel will
access an invalid rq address associate with the invalid CPU in
migration_cpu_stop->__migrate_task->move_queued_task and the Oops occurs.
The reproduce the crash:
1) A process repeatedly binds itself to cpu0 and cpu1 in turn by calling
sched_setaffinity.
2) A shell script repeatedly does "echo 0 > /sys/devices/system/cpu/cpu1/online"
and "echo 1 > /sys/devices/system/cpu/cpu1/online" in turn.
3) Oops appears if the invalid CPU is set in memory after tested cpumask.
Signed-off-by: KeMeng Shi <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Valentin Schneider <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Remove the IPI fallback code from membarrier to deal with very
infrequent cpumask memory allocation failure. Use GFP_KERNEL rather
than GFP_NOWAIT, and relax the blocking guarantees for the expedited
membarrier system call commands, allowing it to block if waiting for
memory to be made available.
In addition, now -ENOMEM can be returned to user-space if the cpumask
memory allocation fails.
Signed-off-by: Mathieu Desnoyers <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Eric W. Biederman <[email protected]>
Cc: Kirill Tkhai <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Russell King - ARM Linux admin <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
If there is only a single mm_user for the mm, the private expedited
membarrier command can skip the IPIs, because only a single thread
is using the mm.
Signed-off-by: Mathieu Desnoyers <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Eric W. Biederman <[email protected]>
Cc: Kirill Tkhai <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Russell King - ARM Linux admin <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
membarrier commands cover very different code paths if they are in
a single-threaded vs multi-threaded process. Therefore, exercise both
scenarios in the kernel selftests to increase coverage of this selftest.
Signed-off-by: Mathieu Desnoyers <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Eric W. Biederman <[email protected]>
Cc: Kirill Tkhai <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Russell King - ARM Linux admin <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
The membarrier_state field is located within the mm_struct, which
is not guaranteed to exist when used from runqueue-lock-free iteration
on runqueues by the membarrier system call.
Copy the membarrier_state from the mm_struct into the scheduler runqueue
when the scheduler switches between mm.
When registering membarrier for mm, after setting the registration bit
in the mm membarrier state, issue a synchronize_rcu() to ensure the
scheduler observes the change. In order to take care of the case
where a runqueue keeps executing the target mm without swapping to
other mm, iterate over each runqueue and issue an IPI to copy the
membarrier_state from the mm_struct into each runqueue which have the
same mm which state has just been modified.
Move the mm membarrier_state field closer to pgd in mm_struct to use
a cache line already touched by the scheduler switch_mm.
The membarrier_execve() (now membarrier_exec_mmap) hook now needs to
clear the runqueue's membarrier state in addition to clear the mm
membarrier state, so move its implementation into the scheduler
membarrier code so it can access the runqueue structure.
Add memory barrier in membarrier_exec_mmap() prior to clearing
the membarrier state, ensuring memory accesses executed prior to exec
are not reordered with the stores clearing the membarrier state.
As suggested by Linus, move all membarrier.c RCU read-side locks outside
of the for each cpu loops.
Suggested-by: Linus Torvalds <[email protected]>
Signed-off-by: Mathieu Desnoyers <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Eric W. Biederman <[email protected]>
Cc: Kirill Tkhai <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Russell King - ARM Linux admin <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
When the prev and next task's mm change, switch_mm() provides the core
serializing guarantees before returning to usermode. The only case
where an explicit core serialization is needed is when the scheduler
keeps the same mm for prev and next.
Suggested-by: Oleg Nesterov <[email protected]>
Signed-off-by: Mathieu Desnoyers <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Eric W. Biederman <[email protected]>
Cc: Kirill Tkhai <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Russell King - ARM Linux admin <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Checking that the number of threads is 1 is redundant with checking
mm_users == 1.
No change in functionality intended.
Suggested-by: Oleg Nesterov <[email protected]>
Signed-off-by: Mathieu Desnoyers <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Eric W. Biederman <[email protected]>
Cc: Kirill Tkhai <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Russell King - ARM Linux admin <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Fix a logic flaw in the way membarrier_register_private_expedited()
handles ready state checks for private expedited sync core and private
expedited registrations.
If a private expedited membarrier registration is first performed, and
then a private expedited sync_core registration is performed, the ready
state check will skip the second registration when it really should not.
Signed-off-by: Mathieu Desnoyers <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Eric W. Biederman <[email protected]>
Cc: Kirill Tkhai <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Russell King - ARM Linux admin <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|