|
The core part of the selftest, i.e., the je <-> jmp cycle, mimics the
original sched-ext bpf program. The test will fail without the
previous patch.
I tried to create some cases for other potential cycles
(je <-> je, jmp <-> je and jmp <-> jmp) with a similar pattern
to the test in this patch, but failed. So this patch
only contains one test for the je <-> jmp cycle.
Signed-off-by: Yonghong Song <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Pull bpf/master to receive baebe9aaba1e ("bpf: allow passing struct
bpf_iter_<type> as kfunc arguments") and related changes in preparation for
the DSQ iterator patchset.
Signed-off-by: Tejun Heo <[email protected]>
|
|
Fix the eventfs ownership testcase to find the mount point if stat -c "%m" fails.
This can happen on systems based on busybox. In this case, fall back to the
current working directory, which should be the tracefs top directory (eventfs
is mounted as a part of tracefs). If that does not work either, the test is
skipped as UNRESOLVED because of the environmental problem.
Fixes: ee9793be08b1 ("tracing/selftests: Add ownership modification tests for eventfs")
Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
Acked-by: Steven Rostedt (Google) <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>
|
|
This patch adds the scx_flatcg example scheduler, which implements hierarchical
weight-based cgroup CPU control by flattening the cgroup hierarchy into a
single layer and compounding the active weight share at each level.
This flattening of the hierarchy can bring a substantial performance gain when
the cgroup hierarchy is nested multiple levels deep. In a simple benchmark
using wrk[8] against apache serving a CGI script calculating the sha1sum of a
small file, it outperforms CFS by ~3% with the CPU controller disabled and by
~10% with two apache instances competing with a 2:1 weight ratio nested four
levels deep.
However, the gain comes at the cost of not being able to properly handle a
thundering herd of cgroups. For example, if many cgroups which are nested
behind a low priority parent cgroup wake up around the same time, they may
be able to consume more CPU cycles than they are entitled to. In many use
cases this isn't a real concern, especially given the performance gain.
Also, there are ways to mitigate the problem further by e.g. introducing an
extra scheduling layer on cgroup delegation boundaries.
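As a rough illustration of the flattening idea described above (a toy sketch, not the actual scx_flatcg BPF code; the struct and numbers are made up), the compounded share of a leaf cgroup can be computed by walking up the hierarchy and multiplying each level's fraction of the active sibling weight:
#include <stdio.h>

/* Hypothetical, minimal cgroup node: its weight plus the sum of the
 * active weights of it and its siblings at the same level. */
struct cgrp {
	const char *name;
	unsigned int weight;		/* this cgroup's weight */
	unsigned int level_active_sum;	/* sum of active sibling weights */
	struct cgrp *parent;
};

/* Compound the weight share at each level, flattening the hierarchy
 * into a single-layer share as described in the commit message. */
static double flattened_share(const struct cgrp *cg)
{
	double share = 1.0;

	for (; cg; cg = cg->parent)
		share *= (double)cg->weight / cg->level_active_sum;
	return share;
}

int main(void)
{
	struct cgrp root = { "root", 100, 100, NULL };
	struct cgrp hi   = { "hi",   200, 300, &root };	/* 2:1 vs its sibling */
	struct cgrp lo   = { "lo",   100, 300, &root };
	struct cgrp hi_a = { "hi/a", 100, 200, &hi };	/* 1:1 split under hi */

	printf("%s -> %.3f\n", hi_a.name, flattened_share(&hi_a));
	printf("%s -> %.3f\n", lo.name,   flattened_share(&lo));
	return 0;
}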
v5: - Updated to specify SCX_OPS_HAS_CGROUP_WEIGHT instead of
SCX_OPS_KNOB_CGROUP_WEIGHT.
v4: - Revert reference counted kptr for cgv_node as the change caused easily
reproducible stalls.
v3: - Updated to reflect the core API changes including ops.init/exit_task()
and direct dispatch from ops.select_cpu(). Fixes and improvements
including additional statistics.
- Use reference counted kptr for cgv_node instead of xchg'ing against
stash location.
- Dropped '-p' option.
v2: - Use SCX_BUG[_ON]() to simplify error handling.
Signed-off-by: Tejun Heo <[email protected]>
Reviewed-by: David Vernet <[email protected]>
Acked-by: Josh Don <[email protected]>
Acked-by: Hao Luo <[email protected]>
Acked-by: Barret Rhoden <[email protected]>
|
|
Add sched_ext_ops operations to init/exit cgroups, and track task migrations
and config changes. A BPF scheduler may not implement cgroup features, or may
implement only a subset of them. The implemented features can be indicated
using the %SCX_OPS_HAS_CGROUP_* flags. If the cgroup configuration makes use
of features that are not implemented, a warning is triggered.
While a BPF scheduler is being enabled and disabled, relevant cgroup
operations are locked out using scx_cgroup_rwsem. This avoids situations
like task prep taking place while the task is being moved across cgroups,
making things easier for BPF schedulers.
v7: - cgroup interface file visibility toggling is dropped in favor of just
warning messages. Dynamically changing interface visibility caused more
confusion than it helped.
v6: - Updated to reflect the removal of SCX_KF_SLEEPABLE.
- Updated to use CONFIG_GROUP_SCHED_WEIGHT and fixes for
!CONFIG_FAIR_GROUP_SCHED && CONFIG_EXT_GROUP_SCHED.
v5: - Flipped the locking order between scx_cgroup_rwsem and
cpus_read_lock() to avoid locking order conflict w/ cpuset. Better
documentation around locking.
- sched_move_task() takes an early exit if the source and destination
are identical. This triggered the warning in scx_cgroup_can_attach()
as it left p->scx.cgrp_moving_from uncleared. Updated the cgroup
migration path so that ops.cgroup_prep_move() is skipped for identity
migrations so that its invocations always match ops.cgroup_move()
one-to-one.
v4: - Example schedulers moved into their own patches.
- Fix build failure when !CONFIG_CGROUP_SCHED, reported by Andrea Righi.
v3: - Make scx_example_pair switch all tasks by default.
- Convert to BPF inline iterators.
- scx_bpf_task_cgroup() is added to determine the current cgroup from
CPU controller's POV. This allows BPF schedulers to accurately track
CPU cgroup membership.
- scx_example_flatcg added. This demonstrates flattened hierarchy
implementation of CPU cgroup control and shows significant performance
improvement when cgroups which are nested multiple levels are under
competition.
v2: - Build fixes for different CONFIG combinations.
Signed-off-by: Tejun Heo <[email protected]>
Reviewed-by: David Vernet <[email protected]>
Acked-by: Josh Don <[email protected]>
Acked-by: Hao Luo <[email protected]>
Acked-by: Barret Rhoden <[email protected]>
Reported-by: kernel test robot <[email protected]>
Cc: Andrea Righi <[email protected]>
|
|
The ARRAY_SIZE macro is more compact and more idiomatic in the Linux source.
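For reference, a minimal userspace sketch of the idiom (the ARRAY_SIZE definition here mirrors the common kernel/libbpf form, minus the extra type checking):
#include <stdio.h>

/* Same idea as the kernel's ARRAY_SIZE(): element count of a true array. */
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

int main(void)
{
	int vals[] = { 1, 2, 3, 5, 8 };

	/* Instead of sizeof(vals) / sizeof(vals[0]) scattered at call sites. */
	printf("%zu entries\n", ARRAY_SIZE(vals));
	return 0;
}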
Signed-off-by: Feng Yang <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Add a selftest for cases where btf_name_valid_section() does not properly
check for certain types of names.
Suggested-by: Eduard Zingerman <[email protected]>
Signed-off-by: Jeongjun Park <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: Eduard Zingerman <[email protected]>
|
|
Fix two inconsistencies in feature names as discussed in [1]:
1. Rename "dwarf-unwind-support" to "dwarf-unwind"
2. The 'get_cpuid' feature and 'HAVE_AUXTRACE_SUPPORT' macro names don't
look related; change the feature name to 'auxtrace' to match the
macro name, as the 'get_cpuid' string is not used anywhere to check for
the feature's presence
[1]: https://lore.kernel.org/linux-perf-users/ZoRw5we4HLSTZND6@x1/
Suggested-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Aditya Gupta <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Disha Goel <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
In probe_vfs_getname.sh, we currently use "perf record --dry-run"
to check for libtraceevent and skip the test if perf is not
built with libtraceevent. Change the check to use the "perf check feature"
option.
Signed-off-by: Athira Rajeev <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Disha Goel <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Currently we use the output of 'perf version --build-options' to check
whether perf was built with libtraceevent support.
Instead, use 'perf check feature libtraceevent' to check for
libtraceevent support.
Reviewed-by: Athira Rajeev <[email protected]>
Signed-off-by: Aditya Gupta <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Disha Goel <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Now that the feature list has been duplicated in a global
'supported_features' array, use that array instead of manually checking
the status of built-in features.
This helps keep things consistent with commands such as 'perf check feature':
commands can use the same array, and any new feature can be added in one
place, the 'supported_features' array.
Reviewed-by: Athira Rajeev <[email protected]>
Signed-off-by: Aditya Gupta <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Disha Goel <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools fixes from Namhyung Kim:
"A number of small fixes for the late cycle:
- Two more build fixes on 32-bit archs
- Fixed a segfault during perf test
- Fixed spinlock/rwlock accounting bug in perf lock contention"
* tag 'perf-tools-fixes-for-v6.11-2024-09-04' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
perf daemon: Fix the build on more 32-bit architectures
perf python: include "util/sample.h"
perf lock contention: Fix spinlock and rwlock accounting
perf test pmu: Set uninitialized PMU alias to null
|
|
Some distros have grub2 config files with the lines
if [ x"${feature_menuentry_id}" = xy ]; then
menuentry_id_option="--id"
else
menuentry_id_option=""
fi
which match the skip regex defined for grub2 in get_grub_index():
$skip = '^\s*menuentry';
These false positives cause the grub number to be higher than it
should be, and the wrong kernel can end up booting.
Grub documents the menuentry command with whitespace between it and the
title, so make the skip regex reflect this.
Link: https://lore.kernel.org/[email protected]
Signed-off-by: Daniel Jordan <[email protected]>
Acked-by: John 'Warthog9' Hawley (Tenstorrent) <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>
|
|
If a warning happens during the build, give a warning at the end:
Build time: 1 minute 40 seconds
Install time: 17 seconds
Reboot time: 25 seconds
*** WARNING found in build: 1 ***
*******************************************
*******************************************
KTEST RESULT: TEST 1 SUCCESS!!!! **
*******************************************
*******************************************
This way, even if the test isn't made to fail on warnings during the
build, a message is still displayed that warnings were found.
Link: https://lore.kernel.org/<[email protected]>
Acked-by: John 'Warthog9' Hawley (Tenstorrent) <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>
|
|
When PROCMAP_QUERY is not defined, a compilation error occurs due to a
mismatch in procmap_query()'s parameters: procmap_query() is only called in
the file where the function is defined, so modify the parameters so they match.
We get a warning when building samples/bpf:
trace_helpers.c:252:5: warning: no previous prototype for ‘procmap_query’ [-Wmissing-prototypes]
252 | int procmap_query(int fd, const void *addr, __u32 query_flags, size_t *start, size_t *offset, int *flags)
| ^~~~~~~~~~~~~
As this function is only used in the file, mark it as 'static'.
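A minimal sketch of the resulting pattern (hypothetical code, not the actual trace_helpers.c): both the PROCMAP_QUERY-backed implementation and the fallback stub share the same signature and are static, since nothing outside this file calls them.
#include <stdio.h>
#include <stddef.h>

typedef unsigned int __u32;	/* stand-in for linux/types.h in this sketch */

#ifdef PROCMAP_QUERY
/* Real implementation would issue the PROCMAP_QUERY ioctl here. */
static int procmap_query(int fd, const void *addr, __u32 query_flags,
			 size_t *start, size_t *offset, int *flags)
{
	/* ... ioctl(fd, PROCMAP_QUERY, ...) ... */
	return 0;
}
#else
/* Fallback stub: same signature as above so callers compile either way,
 * and static so -Wmissing-prototypes stays quiet. */
static int procmap_query(int fd, const void *addr, __u32 query_flags,
			 size_t *start, size_t *offset, int *flags)
{
	return -1;	/* not supported; caller falls back to /proc/.../maps */
}
#endif

int main(void)
{
	size_t start, offset;
	int flags;

	printf("procmap_query: %d\n",
	       procmap_query(-1, (void *)0, 0, &start, &offset, &flags));
	return 0;
}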
Fixes: 4e9e07603ecd ("selftests/bpf: make use of PROCMAP_QUERY ioctl if available")
Signed-off-by: Yuan Chen <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
An unsigned int should be printed with "%u" instead of "%d".
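For illustration, a tiny example of the mismatch being fixed; "%u" is the conversion that matches an unsigned int:
#include <stdio.h>

int main(void)
{
	unsigned int val = 3000000000U;	/* does not fit in a signed int */

	printf("%%d prints %d\n", val);	/* misleading: interpreted as signed */
	printf("%%u prints %u\n", val);	/* correct: 3000000000 */
	return 0;
}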
Signed-off-by: zhang jiao <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Mark Brown <[email protected]>
|
|
When restricting the jevents-generated JSON lookup code with JEVENTS_MODEL,
a list of models must be provided. Some builds don't know model names
but do know cpuids. Add a command that can convert a cpuid to a model
using the mapfile.csv files. This can be used with JEVENTS_MODEL like:
$ make JEVENTS_MODEL=`./pmu-events/models.py x86 'GenuineIntel-6-8D-1,AuthenticAMD-26-1' pmu-events/arch/`
Committer testing:
$ tools/perf/pmu-events/models.py x86 'GenuineIntel-6-8D-1,AuthenticAMD-26-1' tools/perf/pmu-events/arch/
tigerlake,amdzen5
$ perf stat -v sleep 1 |& head -1
Using CPUID GenuineIntel-6-B7-1
$ tools/perf/pmu-events/models.py x86 'GenuineIntel-6-B7-1' tools/perf/pmu-events/arch/
alderlake
$
Signed-off-by: Ian Rogers <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Currently the presence of a feature is checked with a combination of
perf version --build-options and greps, such as:
perf version --build-options | grep " on .* HAVE_FEATURE"
Instead of this, introduce a subcommand "perf check feature", with which
scripts can test for presence of a feature, such as:
perf check feature HAVE_FEATURE
The 'perf check feature' command is expected to have an exit status of 0 if
the feature is built-in, and 1 if it's not built-in or if the feature is not
known. Multiple features can also be passed as a comma-separated list, in
which case the exit status will be 0 only if all of the passed features are
built-in. For example, the command below will have an exit status of 0 only
if both libtraceevent and bpf are enabled, and 1 in all other cases:
perf check feature libtraceevent,bpf
The arguments are case-insensitive.
An array 'supported_features' has also been introduced that can be used by
other commands like 'perf version --build-options', so that new features
can be added in one place, in that array.
Committer testing:
$ perf check feature libtraceevent,bpf
libtraceevent: [ on ] # HAVE_LIBTRACEEVENT
bpf: [ on ] # HAVE_LIBBPF_SUPPORT
$ perf check feature libtraceevent
libtraceevent: [ on ] # HAVE_LIBTRACEEVENT
$ perf check feature bpf
bpf: [ on ] # HAVE_LIBBPF_SUPPORT
$ perf check -q feature bpf && echo "BPF support is present"
BPF support is present
$ perf check -q feature Bogus && echo "Bogus support is present"
$
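A rough stand-alone sketch of the documented semantics (not the perf implementation; the table contents and names below are made up): each comma-separated name is looked up in a supported_features-style table and the exit status is 0 only when every requested feature is built in.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>

struct feature_status {
	const char *name;
	const char *macro;
	int is_builtin;
};

/* Illustrative stand-in for perf's supported_features array. */
static const struct feature_status supported_features[] = {
	{ "libtraceevent", "HAVE_LIBTRACEEVENT",  1 },
	{ "bpf",           "HAVE_LIBBPF_SUPPORT", 1 },
	{ "dwarf-unwind",  "HAVE_DWARF_UNWIND_SUPPORT", 0 },
};

static int feature_is_builtin(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(supported_features) / sizeof(supported_features[0]); i++) {
		if (!strcasecmp(name, supported_features[i].name))
			return supported_features[i].is_builtin;
	}
	return 0;	/* unknown features count as "not built in" */
}

/* Usage: ./check libtraceevent,bpf  -> exits 0 only if all are built in. */
int main(int argc, char **argv)
{
	char *list, *tok, *save = NULL;
	int all_builtin = 1;

	if (argc < 2)
		return 1;

	list = strdup(argv[1]);
	for (tok = strtok_r(list, ",", &save); tok; tok = strtok_r(NULL, ",", &save)) {
		int on = feature_is_builtin(tok);

		printf("%16s: [ %s ]\n", tok, on ? "on" : "OFF");
		if (!on)
			all_builtin = 0;
	}
	free(list);
	return all_builtin ? 0 : 1;
}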
Reviewed-by: Athira Rajeev <[email protected]>
Signed-off-by: Aditya Gupta <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Disha Goel <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Currently, commands which depend on 'parse_options_subcommand()' don't
show the usage string, and instead show '(null)'
$ ./perf sched
Usage: (null)
-D, --dump-raw-trace dump raw trace in ASCII
-f, --force don't complain, do it
-i, --input <file> input file name
-v, --verbose be more verbose (show symbol address, etc)
'parse_options_subcommand()' is generally expected to initialise the usage
string with information from the passed 'subcommands[]' array.
This behaviour was changed in:
230a7a71f92212e7 ("libsubcmd: Fix parse-options memory leak")
where the generated usage string is deallocated and usage[0] is reset to NULL.
As discussed in [1], free the allocated usage string in the main
function itself, and don't reset the usage string to NULL in
parse_options_subcommand().
With this change, the behaviour is restored.
$ ./perf sched
Usage: perf sched [<options>] {record|latency|map|replay|script|timehist}
-D, --dump-raw-trace dump raw trace in ASCII
-f, --force don't complain, do it
-i, --input <file> input file name
-v, --verbose be more verbose (show symbol address, etc)
[1]: https://lore.kernel.org/linux-perf-users/htq5vhx6piet4nuq2mmhk7fs2bhfykv52dbppwxmo3s7du2odf@styd27tioc6e/
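A toy stand-alone sketch of the ownership change (not libsubcmd itself; the helper below is hypothetical): the parser fills in usage[0] when it is NULL, and the caller, not the parser, frees it afterwards.
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>

/* Toy stand-in for parse_options_subcommand(): generate the usage line
 * from the subcommand list if the caller didn't provide one. */
static void toy_parse_subcommand(const char *name, const char *const subcommands[],
				 const char *usage[])
{
	if (!usage[0]) {
		char *buf = NULL;
		size_t len = 0;
		FILE *f = open_memstream(&buf, &len);

		fprintf(f, "%s [<options>] {", name);
		for (int i = 0; subcommands[i]; i++)
			fprintf(f, "%s%s", i ? "|" : "", subcommands[i]);
		fprintf(f, "}");
		fclose(f);
		usage[0] = buf;	/* the caller now owns this allocation */
	}
	/* ... option parsing would happen here ... */
}

int main(void)
{
	const char *const subcommands[] = { "record", "latency", "map", "replay", NULL };
	const char *usage[] = { NULL, NULL };

	toy_parse_subcommand("perf sched", subcommands, usage);
	printf("Usage: %s\n", usage[0]);

	/* Free in the caller; the parser no longer resets usage[0] to NULL. */
	free((void *)usage[0]);
	return 0;
}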
Fixes: 230a7a71f92212e7 ("libsubcmd: Fix parse-options memory leak")
Suggested-by: Namhyung Kim <[email protected]>
Signed-off-by: Aditya Gupta <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Disha Goel <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
On arm64 the breakpoint length should be 4-bytes but 8-bytes is
tolerated as perf passes that as sizeof(long). Just pass the correct
value.
On i386 the length passed must match the kernel's sizeof(long), which the
kernel checks. Check the environment (via uname) to determine whether
4 or 8 bytes needs to be passed, and cache the value in a static.
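A hedged sketch of the kind of helper described (the name and details are illustrative, not the exact perf code): hard-code 4 bytes on arm64, and on i386 consult uname once and cache the result in a static.
#include <stdio.h>
#include <string.h>
#include <sys/utsname.h>

static int default_breakpoint_len(void)
{
#if defined(__i386__)
	static int len;

	if (!len) {
		struct utsname uts;

		/* A 32-bit binary on a 64-bit kernel must pass the kernel's
		 * sizeof(long), i.e. 8, to satisfy the kernel-side check. */
		if (!uname(&uts) && !strcmp(uts.machine, "x86_64"))
			len = 8;
		else
			len = 4;
	}
	return len;
#elif defined(__aarch64__)
	return 4;	/* breakpoints are 4 bytes even though sizeof(long) is 8 */
#else
	return sizeof(long);
#endif
}

int main(void)
{
	printf("breakpoint length: %d\n", default_breakpoint_len());
	return 0;
}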
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Chaitanya S Prakash <[email protected]>
Cc: Colin Ian King <[email protected]>
Cc: Dominique Martinet <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Junhao He <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Yang Jihong <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The default breakpoint length is "sizeof(long)" however this is
incorrect on platforms like Aarch64 where sizeof(long) is 8 but the
breakpoint length is 4. Add a helper function that can be used to
determine the correct breakpoint length, in this change it just
returns the existing default sizeof(long) value.
Use the helper in the bp_account test so that, when modifying the
event from a watchpoint to a breakpoint, the breakpoint length is
appropriate for the architecture and not just sizeof(long).
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Chaitanya S Prakash <[email protected]>
Cc: Colin Ian King <[email protected]>
Cc: Dominique Martinet <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Junhao He <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Yang Jihong <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
- Standardize directory variables to support more flexible installations.
- Add copyright and licensing information to the Makefile.
- Introduce ".PHONY" declarations to ensure that specific targets are always
executed, regardless of the presence of files with matching names.
- Add a help target to provide usage instructions.
Signed-off-by: Amit Vadhavana <[email protected]>
Acked-by: Todd Brandt <[email protected]>
Link: https://patch.msgid.link/Update directory handling and installation process in Makefile
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <[email protected]>
|
|
By default, sleepgraph.py creates suspend-{date}-{time} directories
to store artifacts, or suspend-{date}-{time}-xN if the --multi option
is used.
Ignore those directories by adding a .gitignore file.
Signed-off-by: Yo-Jung (Leo) Lin <[email protected]>
Acked-by: Todd Brandt <[email protected]>
Link: https://patch.msgid.link/[email protected]
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <[email protected]>
|
|
Add the SO_PEEK_OFF selftest for UDP. In this patch, I mainly do
three things:
1. rename tcp_so_peek_off.c
2. adjust it for the UDP protocol
3. add selftests to it
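For context, a self-contained sketch of what SO_PEEK_OFF peeking looks like on UDP (assuming a kernel with UDP SO_PEEK_OFF support; error handling omitted): once the option is set, each MSG_PEEK advances the peek offset, so repeated peeks walk through the queued datagrams.
#include <stdio.h>
#include <string.h>
#include <arpa/inet.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
	int rx = socket(AF_INET, SOCK_DGRAM, 0);
	int tx = socket(AF_INET, SOCK_DGRAM, 0);
	struct sockaddr_in addr = { .sin_family = AF_INET };
	socklen_t alen = sizeof(addr);
	int off = 0;
	char buf[64];
	ssize_t n;

	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	bind(rx, (struct sockaddr *)&addr, sizeof(addr));
	getsockname(rx, (struct sockaddr *)&addr, &alen);

	/* Enable peeking with an offset; MSG_PEEK now advances it. */
	setsockopt(rx, SOL_SOCKET, SO_PEEK_OFF, &off, sizeof(off));

	sendto(tx, "first", 5, 0, (struct sockaddr *)&addr, sizeof(addr));
	sendto(tx, "second", 6, 0, (struct sockaddr *)&addr, sizeof(addr));

	n = recv(rx, buf, sizeof(buf), MSG_PEEK);	/* expect "first" */
	printf("peek 1: %.*s\n", (int)n, buf);
	n = recv(rx, buf, sizeof(buf), MSG_PEEK);	/* expect "second" */
	printf("peek 2: %.*s\n", (int)n, buf);

	n = recv(rx, buf, sizeof(buf), 0);		/* consumes "first" */
	printf("read:   %.*s\n", (int)n, buf);

	close(tx);
	close(rx);
	return 0;
}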
Suggested-by: Jon Maloy <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Signed-off-by: Jason Xing <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Ensure that we get signal context for POR_EL0 if and only if POE is present
on the system.
Copied from the TPIDR2 test.
Signed-off-by: Joey Gouly <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Mark Brown <[email protected]>
Cc: Shuah Khan <[email protected]>
Reviewed-by: Mark Brown <[email protected]>
Acked-by: Shuah Khan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
|
|
Teach the signal frame parsing about the new POE frame; this avoids a warning
when it is generated.
Signed-off-by: Joey Gouly <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Mark Brown <[email protected]>
Cc: Shuah Khan <[email protected]>
Reviewed-by: Mark Brown <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
|
|
Check that when POE is enabled, the POR_EL0 register is accessible.
Signed-off-by: Joey Gouly <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Mark Brown <[email protected]>
Cc: Shuah Khan <[email protected]>
Reviewed-by: Mark Brown <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
|
|
The encoding of the pkey register differs on arm64 from that on x86/ppc. On
those platforms, a bit set in the register disables a permission; on arm64, a
bit set in the register indicates that the permission is allowed.
This drops two asserts of the form:
assert(read_pkey_reg() <= orig_pkey_reg);
because on arm64 this doesn't hold, due to the encoding.
The pkey must be reset to both access allow and write allow in the signal
handler. pkey_access_allow() works currently for PowerPC as the
PKEY_DISABLE_ACCESS and PKEY_DISABLE_WRITE have overlapping bits set.
Access to the uc_mcontext is abstracted, as arm64 has a different structure.
Signed-off-by: Joey Gouly <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Aneesh Kumar K.V <[email protected]>
Acked-by: Dave Hansen <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
|
|
arm64's fpregs are not at a constant offset from sigcontext. Since this is
not an important part of the test, don't print the fpregs pointer on arm64.
Signed-off-by: Joey Gouly <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Aneesh Kumar K.V <[email protected]>
Acked-by: Dave Hansen <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
|
|
Put this function in the header so that it can be used by other tests, without
needing to link to testcases.c.
This will be used by selftest/mm/protection_keys.c
Signed-off-by: Joey Gouly <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Aneesh Kumar K.V <[email protected]>
Reviewed-by: Mark Brown <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
|
|
Add new system registers:
- POR_EL1
- POR_EL0
Signed-off-by: Joey Gouly <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Oliver Upton <[email protected]>
Cc: Shuah Khan <[email protected]>
Reviewed-by: Mark Brown <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
|
|
fd03c5b85855 ("sched: Rework pick_next_task()") changed the definition of
pick_next_task() from:
pick_next_task() := pick_task() + set_next_task(.first = true)
to:
pick_next_task(prev) := pick_task() + put_prev_task() + set_next_task(.first = true)
making invoking put_prev_task() pick_next_task()'s responsibility. This
reordering allows pick_task() to be shared between regular and core-sched
paths and put_prev_task() to know the next task.
sched_ext depended on put_prev_task_scx() enqueueing the current task before
pick_next_task_scx() is called. While pulling sched/core changes,
70cc76aa0d80 ("Merge branch 'tip/sched/core' into for-6.12") added an
explicit put_prev_task_scx() call for SCX tasks in pick_next_task_scx()
before picking the first task as a workaround.
Clean it up and adopt the conventions that other sched classes are
following.
The logic for keeping the current task running was spread out and required
the task to be put on the local DSQ before picking:
- balance_one() used SCX_TASK_BAL_KEEP to indicate that the task is still
runnable, hasn't exhausted its slice, and thus should keep running.
- put_prev_task_scx() enqueued the task to local DSQ if SCX_TASK_BAL_KEEP
is set. It also called do_enqueue_task() with SCX_ENQ_LAST if it is the
only runnable task. do_enqueue_task() in turn decided whether to use the
local DSQ depending on SCX_OPS_ENQ_LAST.
Consolidate the logic in balance_one() as it always knows whether it is
going to keep the current task. balance_one() now considers all conditions
where the current task should be kept and uses SCX_TASK_BAL_KEEP to tell
pick_next_task_scx() to keep the current task instead of picking one from
the local DSQ. Accordingly, SCX_ENQ_LAST handling is removed from
put_prev_task_scx() and do_enqueue_task() and pick_next_task_scx() is
updated to pick the current task if SCX_TASK_BAL_KEEP is set.
The workaround put_prev_task[_scx]() calls are replaced with
put_prev_set_next_task().
This causes two behavior changes observable from the BPF scheduler:
- When a task keeps running, it no longer goes through the enqueue/dequeue
cycle and thus ops.stopping/running() transitions. The new behavior is better
and all the existing schedulers should be able to handle it.
- The BPF scheduler cannot keep executing the current task by enqueueing
SCX_ENQ_LAST task to the local DSQ. If SCX_OPS_ENQ_LAST is specified, the
BPF scheduler is responsible for resuming execution after each
SCX_ENQ_LAST. SCX_OPS_ENQ_LAST is mostly useful for cases where scheduling
decisions are not made on the local CPU - e.g. central or userspace-driven
scheduling - and the new behavior is more logical and shouldn't pose any
problems. The SCX_OPS_ENQ_LAST demonstration in scx_qmap is dropped as it
doesn't fit that well anymore and the last task handling is moved to the
end of qmap_dispatch().
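A toy userspace illustration of the reordered composition (plain C stubs, not kernel code): pick_next_task(prev) now picks first, then puts the previous task, then sets the next one, which is what put_prev_set_next_task() bundles together.
#include <stdio.h>

struct task { const char *name; };

static struct task *pick_task(void)
{
	static struct task next = { "next" };
	return &next;	/* choose, but don't commit yet */
}

static void put_prev_task(struct task *prev, struct task *next)
{
	/* put_prev_task() now knows which task is coming next */
	printf("put_prev_task(%s) -> next is %s\n", prev->name, next->name);
}

static void set_next_task(struct task *next)
{
	printf("set_next_task(%s, first=true)\n", next->name);
}

/* pick_next_task(prev) := pick_task() + put_prev_task() + set_next_task(.first = true) */
static struct task *pick_next_task(struct task *prev)
{
	struct task *next = pick_task();

	put_prev_task(prev, next);	/* now the caller's responsibility */
	set_next_task(next);
	return next;
}

int main(void)
{
	struct task prev = { "prev" };

	pick_next_task(&prev);
	return 0;
}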
Signed-off-by: Tejun Heo <[email protected]>
Cc: David Vernet <[email protected]>
Cc: Andrea Righi <[email protected]>
Cc: Changwoo Min <[email protected]>
Cc: Daniel Hodges <[email protected]>
Cc: Dan Schatzberg <[email protected]>
|
|
Some test scripts are missing executable permissions. It causes warnings
that make the test output unnecessarily verbose. Add executable
permissions.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: SeongJae Park <[email protected]>
Cc: Brendan Higgins <[email protected]>
Cc: David Gow <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Python-based tests create a __pycache__/ directory. Remove it with 'make
clean' by defining it as EXTRA_CLEAN.
Link: https://lkml.kernel.org/r/[email protected]
Fixes: b5906f5f7359 ("selftests/damon: add a test for update_schemes_tried_regions sysfs command")
Signed-off-by: SeongJae Park <[email protected]>
Cc: Brendan Higgins <[email protected]>
Cc: David Gow <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Patch series "misc fixups for DAMON {self,kunit} tests".
This patchset is for minor fixups of DAMON selftests and kunit tests.
First three patches make DAMON selftests more cleanly maintained (patches
1 and 2) without unnecessary warnings (patch 3). Following six patches
remove unnecessary test case (patch 4), handle configs combinations that
can make tests fail (patches 5-7), reorganize the test files following the
new guideline (patch 8), and add reference kunitconfig for DAMON kunit
tests (patch 9).
This patch (of 9):
DAMON selftests build access_memory_even, but it's not on the .gitignore
list. Add it to make 'git status' output cleaner.
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Fixes: c94df805c774 ("selftests/damon: implement a program for even-numbered memory regions access")
Signed-off-by: SeongJae Park <[email protected]>
Cc: Brendan Higgins <[email protected]>
Cc: David Gow <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
In commit 714965ca8252 ("mm/mmap: start distinguishing if vma can be
removed in mergeability test") we relaxed the VMA merge rules for VMAs
possessing a vm_ops->close() hook, permitting this operation in instances
where we wouldn't delete the VMA as part of the merge operation.
This was later corrected in commit fc0c8f9089c2 ("mm, mmap: fix
vma_merge() case 7 with vma_ops->close") to account for a subtle case that
the previous commit had not taken into account.
In both instances, we first rely on is_mergeable_vma() to determine
whether we might be dealing with a VMA that might be removed, taking
advantage of the fact that a 'previous' VMA will never be deleted, only
VMAs that follow it.
The second patch corrects the instance where a merge of the previous VMA
into a subsequent one did not correctly check whether the subsequent VMA
had a vm_ops->close() handler.
Both changes prevent merge cases that are actually permissible (for
instance a merge of a VMA into a following VMA with a vm_ops->close(), but
with no previous VMA, which would result in the next VMA being extended,
not deleted).
In addition, both changes fail to consider the case where a VMA that would
otherwise be merged with the previous and next VMA might have
vm_ops->close(), on the assumption that for this to be the case, all three
would have to have the same vma->vm_file to be mergeable and thus the same
vm_ops.
And in addition both changes operate at 50,000 feet, trying to guess
whether a VMA will be deleted.
As we have majorly refactored the VMA merge operation and de-duplicated
code to the point where we know precisely where deletions will occur, this
patch removes the aforementioned checks altogether and instead explicitly
checks whether a VMA will be deleted.
In cases where a reduced merge is still possible (where we merge both
previous and next VMA but the next VMA has a vm_ops->close hook, meaning
we could just merge the previous and current VMA), we do so, otherwise the
merge is not permitted.
We take advantage of our userland testing to assert that this functions
correctly - replacing the previous limited vm_ops->close() tests with
tests for every single case where we delete a VMA.
We also update all testing for both new and modified VMAs to set
vma->vm_ops->close() in every single instance where this would not prevent
the merge, to assert that we never do so.
Link: https://lkml.kernel.org/r/9f96b8cfeef3d14afabddac3d6144afdfbef2e22.1725040657.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Cc: Liam R. Howlett <[email protected]>
Cc: Mark Brown <[email protected]>
Cc: Bert Karwatzki <[email protected]>
Cc: Jeff Xu <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Lorenzo Stoakes <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: "Paul E. McKenney" <[email protected]>
Cc: Paul Moore <[email protected]>
Cc: Sidhartha Kumar <[email protected]>
Cc: Suren Baghdasaryan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
The existing vma_merge() function is no longer required to handle what
were previously referred to as cases 1-3 (i.e. the merging of a new VMA),
as this is now handled by vma_merge_new_vma().
Additionally, simplify the convoluted control flow of the original,
maintaining identical logic only expressed more clearly and doing away
with a complicated set of cases, rather logically examining each possible
outcome - merging of both the previous and subsequent VMA, merging of the
previous VMA and merging of the subsequent VMA alone.
We now utilise the previously implemented commit_merge() function to share
logic with vma_expand() de-duplicating code and providing less surface
area for bugs and confusion. In order to do so, we adjust this function
to accept parameters specific to merging existing ranges.
Link: https://lkml.kernel.org/r/2cf6016b7bfcc4965fc3cde10827560c42e4f12c.1725040657.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <[email protected]>
Cc: Liam R. Howlett <[email protected]>
Cc: Mark Brown <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Bert Karwatzki <[email protected]>
Cc: Jeff Xu <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Lorenzo Stoakes <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: "Paul E. McKenney" <[email protected]>
Cc: Paul Moore <[email protected]>
Cc: Sidhartha Kumar <[email protected]>
Cc: Suren Baghdasaryan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Abstract vma_merge_new_vma() to use vma_merge_struct and rename the
resultant function vma_merge_new_range() to be clear what the purpose of
this function is - a new VMA is desired in the specified range, and we
wish to see if it is possible to 'merge' surrounding VMAs into this range
rather than having to allocate a new VMA.
Note that this function uses vma_expand() exclusively, so it adopts its
requirement that the iterator point at or before the gap. We add an
assert to this effect.
This is as opposed to vma_merge_existing_range(), which will be introduced
in a subsequent commit, and provide the same functionality for cases in
which we are modifying an existing VMA.
In mmap_region() and do_brk_flags() we open code scenarios where we prefer
to use vma_expand() rather than invoke a full vma_merge() operation.
Abstract this logic and eliminate all of the open-coding, and also use the
same logic for all cases where we add new VMAs to, rather than ultimately
use vma_merge(), rather use vma_expand().
Doing so removes duplication and simplifies VMA merging in all such cases,
laying the ground for us to eliminate the merging of new VMAs in
vma_merge() altogether.
Also add the ability for the vmg to track state and to report errors,
allowing us to differentiate a failed merge from an inability
to allocate memory in callers.
This makes it far easier to understand what is happening in these cases
avoiding confusion, bugs and allowing for future optimisation.
Also introduce vma_iter_next_rewind() to allow for retrieval of the next,
and (optionally) the prev VMA, rewinding to the start of the previous gap.
Introduce are_anon_vmas_compatible() to abstract individual VMA anon_vma
comparison for the case of merging on both sides where the anon_vma of the
VMA being merged maybe compatible with prev and next, but prev and next's
anon_vma's may not be compatible with each other.
Finally also introduce can_vma_merge_left() / can_vma_merge_right() to
check adjacent VMA compatibility and that they are indeed adjacent.
Link: https://lkml.kernel.org/r/49d37c0769b6b9dc03b27fe4d059173832556392.1725040657.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <[email protected]>
Tested-by: Mark Brown <[email protected]>
Cc: Liam R. Howlett <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Bert Karwatzki <[email protected]>
Cc: Jeff Xu <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Lorenzo Stoakes <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: "Paul E. McKenney" <[email protected]>
Cc: Paul Moore <[email protected]>
Cc: Sidhartha Kumar <[email protected]>
Cc: Suren Baghdasaryan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
The purpose of the vmg is to thread merge state through functions and
avoid egregious parameter lists. We expand this to vma_expand(), which is
used for a number of merge cases.
Accordingly, adjust its callers, mmap_region() and relocate_vma_down(), to
use a vmg.
An added purpose of this change is the ability in a future commit to
perform all new VMA range merging using vma_expand().
Link: https://lkml.kernel.org/r/4bc8c9dbc9ca52452ef8e587b28fe555854ceb38.1725040657.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <[email protected]>
Reviewed-by: Liam R. Howlett <[email protected]>
Cc: Mark Brown <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Bert Karwatzki <[email protected]>
Cc: Jeff Xu <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Lorenzo Stoakes <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: "Paul E. McKenney" <[email protected]>
Cc: Paul Moore <[email protected]>
Cc: Sidhartha Kumar <[email protected]>
Cc: Suren Baghdasaryan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Rather than passing around huge numbers of parameters to numerous helper
functions, abstract them into a single struct that we thread through the
operation, the vma_merge_struct ('vmg').
Adjust vma_merge() and vma_modify() to accept this parameter, as well as
predicate functions can_vma_merge_before(), can_vma_merge_after(), and the
vma_modify_...() helper functions.
Also introduce VMG_STATE() and VMG_VMA_STATE() helper macros to allow for
easy vmg declaration.
We additionally remove the requirement that vma_merge() is passed a VMA
object representing the candidate new VMA. Previously it used this to
obtain the mm_struct, file and anon_vma properties of the proposed range
(a rather confusing state of affairs), which are now provided by the vmg
directly.
We also remove the pgoff calculation previously performed in vma_modify(),
and instead calculate this in VMG_VMA_STATE() via the vma_pgoff_offset()
helper.
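A stand-alone sketch of the parameter-struct pattern (the field and macro names are illustrative, not the kernel's exact definitions): merge inputs are bundled into one struct that helpers receive, with a macro to declare a pre-populated instance.
#include <stdio.h>

/* Illustrative merge state: the real vma_merge_struct carries mm, prev,
 * next, the proposed range, pgoff, flags, file, anon_vma, policy, etc. */
struct vmg {
	unsigned long start;
	unsigned long end;
	unsigned long pgoff;
	unsigned long flags;
};

/* Shorthand to declare and initialise a vmg in one go, VMG_STATE()-style. */
#define VMG_STATE(name, start_, end_, pgoff_, flags_) \
	struct vmg name = {                            \
		.start = (start_), .end = (end_),      \
		.pgoff = (pgoff_), .flags = (flags_),  \
	}

/* Helpers take the single struct instead of a long parameter list. */
static int can_merge(const struct vmg *vmg)
{
	return vmg->end > vmg->start;	/* placeholder predicate */
}

int main(void)
{
	VMG_STATE(vmg, 0x1000, 0x3000, 0, 0);

	printf("can_merge: %d\n", can_merge(&vmg));
	return 0;
}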
Link: https://lkml.kernel.org/r/a955aad09d81329f6fbeb636b2dd10cde7b73dab.1725040657.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <[email protected]>
Reviewed-by: Liam R. Howlett <[email protected]>
Cc: Mark Brown <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Bert Karwatzki <[email protected]>
Cc: Jeff Xu <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Lorenzo Stoakes <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: "Paul E. McKenney" <[email protected]>
Cc: Paul Moore <[email protected]>
Cc: Sidhartha Kumar <[email protected]>
Cc: Suren Baghdasaryan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Add a variety of VMA merge unit tests to assert that the behaviour of VMA
merge is correct at an abstract level and VMAs are merged or not merged as
expected.
These are intentionally added _before_ we start refactoring vma_merge() in
order that we can continually assert correctness throughout the rest of
the series.
In order to reduce churn going forward, we backport the vma_merge_struct
data type to the test code which we introduce and use in a future commit,
and add wrappers around the merge new and existing VMA cases.
Link: https://lkml.kernel.org/r/1c7a0b43cfad2c511a6b1b52f3507696478ff51a.1725040657.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <[email protected]>
Reviewed-by: Liam R. Howlett <[email protected]>
Cc: Mark Brown <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Bert Karwatzki <[email protected]>
Cc: Jeff Xu <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Lorenzo Stoakes <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: "Paul E. McKenney" <[email protected]>
Cc: Paul Moore <[email protected]>
Cc: Sidhartha Kumar <[email protected]>
Cc: Suren Baghdasaryan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Patch series "mm: remove vma_merge()", v3.
The infamous vma_merge() function has been the cause of a great deal of
pain, bugs and confusion for a very long time.
It is subtle, contains many corner cases, tries to do far too much and is
as a result very fragile.
The fact that the function requires there to be a numbering system to
cover each possible eventuality with references to each in the many
branches of its implementation as to which case you are looking at speaks
to all this.
Some of this complexity is inherent - unfortunately there is no getting
away from the need to figure out precisely how to execute the merge,
whether we need to remove VMAs, whether it is safe to do so, what
constitutes a mergeable VMA and so on.
However, a lot of the complexity is not inherent but instead a product of
the function's 'organic' development.
Liam has gone to great lengths to improve the situation as a part of his
maple tree implementation, greatly improving the readability of the code,
and Vlastimil and myself have additionally gone to lengths to try to
improve things further.
However, with the availability of userland VMA testing, it now becomes
possible to perform a rather more significant refactoring while
maintaining confidence in its correct operation.
An attempt was previously made by Vlastimil [0] to eliminate vma_merge(),
however it was rather - brutal - and an astute reader might refer to the
date of that patch for insight as to its intent.
This series instead divides merge operations into two natural kinds -
merges which occur when a NEW vma is being added to the address space, and
merges which occur when a vma is being MODIFIED.
Happily, the vma_expand() function introduced by Liam, which has the
capacity for also deleting a subsequent VMA, covers each of the NEW vma
cases.
By abstracting the actual final commit of changes to a VMA to its own
function, commit_merge() and writing a wrapper around vma_expand() for new
VMA cases vma_merge_new_range(), we can avoid having to use vma_merge()
for these instances altogether.
By doing so we are also able to then de-duplicate all existing merge logic
in mmap_region() and do_brk_flags() and have everything invoke this new
function, so we universally take the same approach to merging new VMAs.
Having done so, we can then completely rework vma_merge() into
vma_merge_existing_range() and use this for the instances where a merge is
proposed for a region of an existing VMA.
This eliminates vma_merge() and its numbered cases and instead divides
things into logical cases - merge both, merge left, merge right (the
latter 2 being either partial or full merges).
The code is heavily annotated with ASCII diagrams and greatly simplified
in comparison to the existing vma_merge() function.
Having made this change, we take the opportunity to address an issue with
merging VMAs possessing a vm_ops->close() hook - commit 714965ca8252
("mm/mmap: start distinguishing if vma can be removed in mergeability
test") and commit fc0c8f9089c2 ("mm, mmap: fix vma_merge() case 7 with
vma_ops->close") make efforts to relax how we handle these, making
assumptions about which VMAs might end up deleted (and thus, if possessing
a vm_ops->close() hook, cannot be).
This refactor means we do not need to guess, so instead explicitly only
disallow merge in instances where a VMA with a vm_ops->close() hook would
be deleted (and try a smaller merge in cases where this is possible).
In addition to these changes, we introduce a new vma_merge_struct
abstraction to allow VMA merge state to be threaded through the operation
neatly.
There is heavy unit testing provided for all merge functionality, added
prior to the refactoring, allowing for before/after testing.
The vm_ops->close() change also introduces exhaustive testing to
demonstrate that this functions as expected, and in addition to this the
reproduction code from commit fc0c8f9089c2 ("mm, mmap: fix vma_merge()
case 7 with vma_ops->close") was tested and confirmed passing.
[0]: https://lore.kernel.org/linux-mm/[email protected]/
This patch (of 10):
Have vma.o depend on its source dependencies explicitly, as previously
these were simply being ignored as existing object files were up to date.
This now correctly re-triggers the build if mm/ source is changed as well
as local source code.
Also set clean as a phony rule.
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/e3ea58f08364ae5432c9a074de0195a7c7e0b04a.1725040657.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <[email protected]>
Reviewed-by: Liam R. Howlett <[email protected]>
Cc: Mark Brown <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Bert Karwatzki <[email protected]>
Cc: Jeff Xu <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Lorenzo Stoakes <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: "Paul E. McKenney" <[email protected]>
Cc: Paul Moore <[email protected]>
Cc: Sidhartha Kumar <[email protected]>
Cc: Suren Baghdasaryan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Ensure that the zswap.writeback check goes up the cgroup tree, i.e. is
hierarchical. Create a subcgroup which has zswap.writeback set to 1 and
verify that the upper hierarchy's restrictions still apply.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Mike Yuan <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Michal Koutný <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Nhat Pham <[email protected]>
Cc: Roman Gushchin <[email protected]>
Cc: Shakeel Butt <[email protected]>
Cc: Yosry Ahmed <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Currently, when running the charge_reserved_hugetlb.sh selftest, we can
sometimes observe something like:
$ ./charge_reserved_hugetlb.sh -cgroup-v2
...
write_result is 0
After write:
hugetlb_usage=0
reserved_usage=10485760
killing write_to_hugetlbfs
Received 2.
Deleting the memory
Detach failure: Invalid argument
umount: /mnt/huge: target is busy.
Both cases are issues in the test.
While the unmount error seems to be racy, it will make the test fail:
$ ./run_vmtests.sh -t hugetlb
...
# [FAIL]
not ok 10 charge_reserved_hugetlb.sh -cgroup-v2 # exit=32
The issue is that we are not waiting for the write_to_hugetlbfs process to
quit. So it might still have a hugetlbfs file open, about which umount is
not happy. Fix that by making "killall" wait for the process to quit.
The other error ("Detach failure: Invalid argument") does not seem to
result in a test error, but it is misleading. It turns out write_to_hugetlbfs.c
unconditionally tries to clean up using shmdt(), even when we only
mmap()'ed a hugetlb file. Even worse, shmaddr is never even set for the
SHM case. Fix that as well.
With this change it seems to work as expected.
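A hedged sketch of the shape of the cleanup fix (not the actual write_to_hugetlbfs.c): record which mapping method was used and only call shmdt() on an address that really came from shmat().
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/shm.h>

static void *shmaddr;	/* only set when the SHM path is used */
static void *mmap_addr;
static size_t map_len = 2 * 1024 * 1024;

static void cleanup(void)
{
	/* Only detach what was actually attached via shmat(). */
	if (shmaddr && shmaddr != (void *)-1)
		shmdt(shmaddr);
	if (mmap_addr && mmap_addr != MAP_FAILED)
		munmap(mmap_addr, map_len);
}

int main(int argc, char **argv)
{
	int use_shm = argc > 1;	/* pretend an option selects the SHM path */

	if (use_shm) {
		int id = shmget(IPC_PRIVATE, map_len, IPC_CREAT | 0600);

		if (id >= 0)
			shmaddr = shmat(id, NULL, 0);	/* record for cleanup */
	} else {
		mmap_addr = mmap(NULL, map_len, PROT_READ | PROT_WRITE,
				 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	}

	cleanup();
	return 0;
}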
Link: https://lkml.kernel.org/r/[email protected]
Fixes: 29750f71a9b4 ("hugetlb_cgroup: add hugetlb_cgroup reservation tests")
Signed-off-by: David Hildenbrand <[email protected]>
Reported-by: Mario Casquero <[email protected]>
Reviewed-by: Mina Almasry <[email protected]>
Tested-by: Mario Casquero <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Muchun Song <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Convert x86 to use PG_arch_2 instead of PG_uncached and remove
PG_uncached.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
This flag has similar constraints to PG_owner_priv_1 -- it is ignored by
core code, and is entirely for the use of the code which allocated the
folio. Since the pagecache does not use it, individual filesystems can
use it. The bufferhead code does use it, so filesystems which use the
buffer cache must not use it for another purpose.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Add more mseal traversal tests across VMAs, where we could possibly screw
up sealing checks. These add more cross-VMA traversal tests for mprotect,
munmap and madvise. In particular, we test the case where a regular
VMA is followed by a sealed VMA.
[[email protected]: remove incorrect comment, per review]
[[email protected]: remove the correct comment, per Pedro]
[[email protected]: fix mseal's length]
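A minimal sketch of the kind of case exercised (hypothetical test code; the __NR_mseal fallback number is an assumption for recent x86-64/arm64): a regular VMA followed by a sealed VMA, where an mprotect() spanning both must be rejected.
#define _GNU_SOURCE
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef __NR_mseal
#define __NR_mseal 462	/* assumption: syscall number on recent x86-64/arm64 */
#endif

int main(void)
{
	long pagesz = sysconf(_SC_PAGESIZE);
	/* Two adjacent VMAs: give them different protections so they stay split. */
	char *base = mmap(NULL, 2 * pagesz, PROT_READ | PROT_WRITE,
			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (base == MAP_FAILED)
		return 1;
	mprotect(base + pagesz, pagesz, PROT_READ);	/* regular VMA | sealed VMA */

	if (syscall(__NR_mseal, base + pagesz, pagesz, 0)) {
		printf("mseal unsupported: %s\n", strerror(errno));
		return 0;
	}

	/* A range crossing from the regular VMA into the sealed one must fail. */
	if (mprotect(base, 2 * pagesz, PROT_NONE))
		printf("mprotect across sealed VMA rejected: %s\n", strerror(errno));
	else
		printf("unexpected: mprotect succeeded\n");
	return 0;
}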
Link: https://lkml.kernel.org/r/vc4czyuemmu3kylqb4ctaga6y5yvondlyabimx6jvljlw2fkea@djawlllf45xa
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Pedro Falcato <[email protected]>
Reviewed-by: Liam R. Howlett <[email protected]>
Reviewed-by: Lorenzo Stoakes <[email protected]>
Cc: Jeff Xu <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Add shmem mTHP collapse testing. Similar to the anonymous page case, users
can use the '-s' parameter to specify the shmem mTHP size for testing.
Link: https://lkml.kernel.org/r/fa44bfa20ca5b9fd6f9163a048f3d3c1e53cd0a8.1724140601.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <[email protected]>
Cc: Barry Song <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Ryan Roberts <[email protected]>
Cc: Yang Shi <[email protected]>
Cc: Zi Yan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
IA64 is gone as of commit cf8e8658100d ("arch: Remove Itanium (IA-64)
architecture"), so remove the now-unnecessary ia64-specific mm code and
comments in selftests too.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Jinjiang Tu <[email protected]>
Cc: Kefeng Wang <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Nanyong Sun <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
'MPTCP_PM_NAME' is defined in 'linux/mptcp_pm.h', included in
'linux/mptcp.h', no need to re-define it.
'MPTCP_PM_EVENTS' is not defined in 'linux/mptcp.h', but
'MPTCP_PM_EV_GRP_NAME' is, with the same value. We can then use the
latter, and drop the other one.
Reviewed-by: Geliang Tang <[email protected]>
Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
Link: https://patch.msgid.link/20240902-net-next-mptcp-mib-mpjtx-misc-v1-11-d3e0f3773b90@kernel.org
Signed-off-by: Jakub Kicinski <[email protected]>
|