Age | Commit message | Author | Files | Lines |
|
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- fix dangling pointer to rb-tree of defragmented inodes after cleanup
- a followup fix to handle concurrent lseek on the same fd that could
leak memory under some conditions
- fix wrong root id reported in tree checker when verifying dref
* tag 'for-6.12-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix use-after-free on rbtree that tracks inodes for auto defrag
btrfs: tree-checker: fix the wrong output of data backref objectid
btrfs: fix race setting file private on concurrent lseek using same fd
|
|
Avoid "gcc" since it is not the only compiler supported by Kbuild.
Signed-off-by: Masahiro Yamada <[email protected]>
Reviewed-by: Nicolas Schier <[email protected]>
|
|
Building external modules is typically done using this command:
$ make -C <KERNEL_DIR> M=<EXTMOD_DIR>
Here, <KERNEL_DIR> refers to the output directory where the kernel was
built, not the kernel source directory.
When the kernel is built in the source tree, there is no ambiguity, as
the output directory and the source directory are the same.
If the kernel was built in a separate build directory, <KERNEL_DIR>
should be the kernel output directory. Otherwise, Kbuild cannot locate
necessary build artifacts. This has been the method for building
external modules against a pre-built kernel in a separate directory
for over 20 years. [1]
If you pass the kernel source directory to the -C option, you must also
specify the kernel build directory using the O= option. This approach
works as well, though it results in a slightly longer command:
$ make -C <KERNEL_SOURCE_DIR> O=<KERNEL_BUILD_DIR> M=<EXTMOD_DIR>
Some people mistakenly believe that O= should specify a build directory
for external modules when used together with M=. This commit adds more
clarification to Documentation/kbuild/kbuild.rst.
[1]: https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/commit/?id=e321b2ec2eb2993b3d0116e5163c78ad923e3c54
Signed-off-by: Masahiro Yamada <[email protected]>
Reviewed-by: Nicolas Schier <[email protected]>
|
|
The use of shipped files is discouraged in the upstream kernel these
days. [1]
Downstream Makefiles have the freedom to use shipped files or other
options to handle binaries, but this should not be advertised in the
upstream document.
[1]: https://lore.kernel.org/all/CAHk-=wgSEi_ZrHdqr=20xv+d6dr5G895CbOAi8ok+7-CQUN=fQ@mail.gmail.com/
Signed-off-by: Masahiro Yamada <[email protected]>
Reviewed-by: Nicolas Schier <[email protected]>
|
|
Do similar to commit 1a4c1c9df72e ("docs/kbuild/makefiles: drop section
numbering, use references").
Signed-off-by: Masahiro Yamada <[email protected]>
|
|
Do similar to commit 5e8f0ba38a4d ("docs/kbuild/makefiles: throw out the
local table of contents").
Signed-off-by: Masahiro Yamada <[email protected]>
|
|
Kbuild used to manipulate header search paths, enforcing the odd
limitation of "no space after -I".
Commit cdd750bfb1f7 ("kbuild: remove 'addtree' and 'flags' magic for
header search paths") stopped doing that. This limitation no longer
exists. Instead, you need to accurately specify the header search path.
(In this case, $(src)/include)
Signed-off-by: Masahiro Yamada <[email protected]>
Reviewed-by: Nicolas Schier <[email protected]>
|
|
This description was added 20 years ago [1]. It does not convey any
useful information except for a feeling of nostalgia.
[1]: https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/commit/?id=65e433436b5794ae056d22ddba60fe9194bba007
Signed-off-by: Masahiro Yamada <[email protected]>
Reviewed-by: Nicolas Schier <[email protected]>
|
|
The phrase "In newer versions of the kernel" was added 14 years ago, by
commit efdf02cf0651 ("Documentation/kbuild: major edit of modules.txt
sections 1-4"). This feature is no longer new, so remove it and update
the paragraph.
Example 3 was written 20 years ago [1]. There is no need to mention
backward compatibility with such an old build system. Remove Example 3
entirely.
[1]: https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/commit/?id=65e433436b5794ae056d22ddba60fe9194bba007
Signed-off-by: Masahiro Yamada <[email protected]>
Reviewed-by: Nicolas Schier <[email protected]>
|
|
If RUST_LIB_SRC is defined in the top-level Makefile (via an environment
variable or command line), it is already exported.
The only situation where it is defined but not exported is when the
top-level Makefile is wrapped by another Makefile (e.g., GNUmakefile).
I cannot think of any other use cases.
I know some people use this tip to define custom variables. However,
even in that case, you can export it directly in the wrapper Makefile.
Example GNUmakefile:
export RUST_LIB_SRC = /path/to/your/sysroot/lib/rustlib/src/rust/library
include Makefile
Signed-off-by: Masahiro Yamada <[email protected]>
Reviewed-by: Nicolas Schier <[email protected]>
Reviewed-by: Alice Ryhl <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull quota and isofs updates from Jan Kara:
"A few small cleanups in quota and isofs"
* tag 'fs_for_v6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
isofs: Annotate struct SL_component with __counted_by()
quota: remove unnecessary error code translation in dquot_quota_enable
quota: remove redundant return at end of void function
quota: remove unneeded return value of register_quota_format
quota: avoid missing put_quota_format when DQUOT_SUSPENDED is passed
|
|
Pull bcachefs updates from Kent Overstreet:
- rcu_pending, btree key cache rework: this solves lock contention in
the key cache, eliminating the biggest source of the srcu lock hold
time warnings, and drastically improving performance on some metadata
heavy workloads - on multithreaded creates we're now 3-4x faster than
xfs.
- We're now using an rhashtable instead of the system inode hash table;
this is another significant performance improvement on multithreaded
metadata workloads, eliminating more lock contention.
- for_each_btree_key_in_subvolume_upto(): new helper for iterating over
keys within a specific subvolume, eliminating a lot of open coded
"subvolume_get_snapshot()" and also fixing another source of srcu
lock time warnings, by running each loop iteration in its own
transaction (as the existing for_each_btree_key() does).
- More work on btree_trans locking asserts; we now assert that we don't
hold btree node locks when trans->locked is false, which is important
because we don't use lockdep for tracking individual btree node
locks.
- Some cleanups and improvements in the bset.c btree node lookup code,
from Alan.
- Rework of btree node pinning, which we use in backpointers fsck. The
old hacky implementation, where the shrinker just skipped over nodes
in the pinned range, was causing OOMs; instead we now use another
shrinker with a much higher seeks number for pinned nodes.
- Rebalance now uses BCH_WRITE_ONLY_SPECIFIED_DEVS; this fixes an issue
where rebalance would sometimes fall back to allocating from the full
filesystem, which is not what we want when it's trying to move data
to a specific target.
- Use __GFP_ACCOUNT, GFP_RECLAIMABLE for btree node, key cache
allocations.
- Idmap mounts are now supported (Hongbo Li)
- Rename whiteouts are now supported (Hongbo Li)
- Erasure coding can now handle devices being marked as failed, or
forcibly removed. We still need the evacuate path for erasure coding,
but it's getting very close to ready for people to start using.
* tag 'bcachefs-2024-09-21' of git://evilpiepirate.org/bcachefs: (99 commits)
bcachefs: return err ptr instead of null in read sb clean
bcachefs: Remove duplicated include in backpointers.c
bcachefs: Don't drop devices with stripe pointers
bcachefs: bch2_ec_stripe_head_get() now checks for change in rw devices
bcachefs: bch_fs.rw_devs_change_count
bcachefs: bch2_dev_remove_stripes()
bcachefs: bch2_trigger_ptr() calculates sectors even when no device
bcachefs: improve error messages in bch2_ec_read_extent()
bcachefs: improve error message on too few devices for ec
bcachefs: improve bch2_new_stripe_to_text()
bcachefs: ec_stripe_head.nr_created
bcachefs: bch_stripe.disk_label
bcachefs: stripe_to_mem()
bcachefs: EIO errcode cleanup
bcachefs: Rework btree node pinning
bcachefs: split up btree cache counters for live, freeable
bcachefs: btree cache counters should be size_t
bcachefs: Don't count "skipped access bit" as touched in btree cache scan
bcachefs: Failed devices no longer require mounting in degraded mode
bcachefs: bch2_dev_rcu_noerror()
...
|
|
As discussed during the distro-centric session within the sched_ext
Microconference at LPC 2024, introduce a sequence counter that is
incremented every time a BPF scheduler is loaded.
This feature can help distributions in diagnosing potential performance
regressions by identifying systems where users are running (or have run)
custom BPF schedulers.
Example:
arighi@virtme-ng~> cat /sys/kernel/sched_ext/enable_seq
0
arighi@virtme-ng~> sudo scx_simple
local=1 global=0
^CEXIT: unregistered from user space
arighi@virtme-ng~> cat /sys/kernel/sched_ext/enable_seq
1
In this way user-space tools (such as Ubuntu's apport and similar) are
able to gather and include this information in bug reports.
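A user-space tool could read the counter roughly as follows. This is a minimal sketch, not code from the patch: the helper names parse_enable_seq() and read_enable_seq() are hypothetical, and it assumes the file holds a single decimal number followed by a newline, as in the example above.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse the decimal counter text, e.g. "1\n".
 * Returns 0 and stores the value in *seq, or -1 on malformed input.
 * (Hypothetical helper, for illustration only.) */
int parse_enable_seq(const char *text, unsigned long *seq)
{
	char *end;
	unsigned long val;

	errno = 0;
	val = strtoul(text, &end, 10);
	if (errno || end == text)
		return -1;
	/* Accept one trailing newline, as shown in the `cat` output above. */
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -1;
	*seq = val;
	return 0;
}

/* Read the counter from a path such as /sys/kernel/sched_ext/enable_seq.
 * Returns -1 if the file cannot be opened or parsed. */
int read_enable_seq(const char *path, unsigned long *seq)
{
	char buf[32];
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return parse_enable_seq(buf, seq);
}
```

A bug-report collector would then treat any value greater than 0 as "a BPF scheduler has been loaded at least once since boot".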
Cc: Giovanni Gherdovich <[email protected]>
Cc: Kleber Sacilotto de Souza <[email protected]>
Cc: Marcelo Henrique Cerri <[email protected]>
Cc: Phil Auld <[email protected]>
Signed-off-by: Andrea Righi <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
|
|
a2f4b16e736d ("sched_ext: Build fix on !CONFIG_STACKTRACE[_SUPPORT]") tried
fixing build when !CONFIG_STACKTRACE but didn't do so fully. Also put
stack_trace_print() and stack_trace_save() inside CONFIG_STACKTRACE to fix
build when !CONFIG_STACKTRACE.
Signed-off-by: Tejun Heo <[email protected]>
Reported-by: kernel test robot <[email protected]>
Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull 'struct fd' updates from Al Viro:
"Just the 'struct fd' layout change, with conversion to accessor
helpers"
* tag 'pull-stable-struct_fd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
add struct fd constructors, get rid of __to_fd()
struct fd: representation change
introduce fd_file(), convert all accessors to it.
|
|
The merge resolution to deal with the conflict between commits
ea72ce5da228 ("x86/kaslr: Expose and use the end of the physical memory
address space") and 99185c10d5d9 ("resource, kunit: add test case for
region_intersects()") ended up being broken in configurations that didn't
define a MAX_PHYSMEM_BITS and that had a 32-bit 'phys_addr_t'.
The fallback to using all bits set (ie "(-1ULL)") ended up causing a
build error:
kernel/resource.c: In function ‘gfr_start’:
include/linux/minmax.h:93:30: error: conversion from ‘long long unsigned int’ to ‘resource_size_t’ {aka ‘unsigned int’} changes value from ‘18446744073709551615’ to ‘4294967295’ [-Werror=overflow]
this was reported by Geert for m68k, but he points out that it happens
on other 32-bit architectures too, eg mips, xtensa, parisc, and powerpc.
Limiting 'PHYSMEM_END' to a 'phys_addr_t' (which is the same as
'resource_size_t') fixes the build, but Geert points out that it will
then cause a silent overflow in mm/sparse.c:
unsigned long max_sparsemem_pfn = (PHYSMEM_END + 1) >> PAGE_SHIFT;
so we actually do want PHYSMEM_END to be defined as a 64-bit type - just
not all ones, and not larger than 'phys_addr_t'.
The proper fix is probably to not have some kind of default fallback at
all, but just make sure every architecture has a valid MAX_PHYSMEM_BITS.
But in the meantime, this just applies the rule that PHYSMEM_END is the
largest value that fits in a 'phys_addr_t', but does not have the high
bit set in 64 bits.
Ugly, ugly.
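The arithmetic behind that rule can be sketched in user space. This is an illustration under stated assumptions, not the kernel macro: uint32_t and uint64_t stand in for a 32-bit and a 64-bit 'phys_addr_t', a 12-bit page shift is assumed, and all function names are hypothetical.

```c
#include <stdint.h>

#define PAGE_SHIFT_DEMO 12	/* assumed 4 KiB pages, illustration only */

/* Largest value of a 32-bit address type, widened to 64 bits, with bit 63
 * cleared (a no-op here because only the low 32 bits are set). */
uint64_t physmem_end_32(void)
{
	return (uint64_t)(uint32_t)-1 & ~(1ULL << 63);
}

/* Same rule for a 64-bit address type: all ones except the high bit. */
uint64_t physmem_end_64(void)
{
	return (uint64_t)-1 & ~(1ULL << 63);
}

/* Same shape as the mm/sparse.c expression, evaluated in 64 bits. */
uint64_t max_pfn(uint64_t physmem_end)
{
	return (physmem_end + 1) >> PAGE_SHIFT_DEMO;
}

/* The same expression evaluated in 32 bits: end + 1 wraps to 0. */
uint32_t max_pfn_narrow(uint32_t physmem_end)
{
	return (uint32_t)(physmem_end + 1u) >> PAGE_SHIFT_DEMO;
}
```

With a 64-bit end value the "+ 1" no longer wraps in either case, whereas both an all-ones 64-bit value and a 32-bit-wide computation silently yield 0.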
Reported-by: Geert Uytterhoeven <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Huang Ying <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
This patch allows f2fs to submit bios of in-place writes on pinned file.
Reviewed-by: Daeho Jeong <[email protected]>
Reviewed-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
Disable the rq empty path when scx is enabled. SCX must consult the BPF
scheduler (via the dispatch path in balance) to determine if rq is empty.
This fixes stalls when scx is enabled.
Signed-off-by: Pat Somaru <[email protected]>
Fixes: 3dcac251b066 ("sched/core: Introduce SM_IDLE and an idle re-entry fast-path in __schedule()")
Signed-off-by: Tejun Heo <[email protected]>
|
|
When building with CONFIG_GROUP_SCHED_WEIGHT && !CONFIG_FAIR_GROUP_SCHED,
the idle member is not defined:
kernel/sched/ext.c:3701:16: error: 'struct task_group' has no member named 'idle'
3701 | if (!tg->idle)
| ^~
Fix this by putting 'idle' under new CONFIG_GROUP_SCHED_WEIGHT.
tj: Move idle field upward to avoid breaking up CONFIG_FAIR_GROUP_SCHED block.
Fixes: e179e80c5d4f ("sched: Introduce CONFIG_GROUP_SCHED_WEIGHT")
Reported-by: kernel test robot <[email protected]>
Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
Signed-off-by: Yu Liao <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
|
|
Replace a comma between expression statements by a semicolon.
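The changed code is not shown here, so the following is only a generic sketch of why the distinction can matter. Between plain statements a comma and a semicolon behave the same, but under an unbraced 'if' they do not. Function names are hypothetical.

```c
/* With a comma, both assignments form ONE expression statement, so an
 * unbraced `if` guards both of them. */
int with_comma(int cond)
{
	int a = 0, b = 0;

	if (cond)
		a = 1, b = 2;
	return a + b;
}

/* With a semicolon they are two statements; only the first is guarded. */
int with_semicolon(int cond)
{
	int a = 0, b = 0;

	if (cond)
		a = 1;
	b = 2;
	return a + b;
}
```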
Signed-off-by: Chen Ni <[email protected]>
Reviewed-by: Cristian Ciocaltea <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lee Jones <[email protected]>
|
|
Document the compatible for sa8775p SoC.
Reviewed-by: Elliot Berman <[email protected]>
Signed-off-by: Mukesh Ojha <[email protected]>
Acked-by: Rob Herring (Arm) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lee Jones <[email protected]>
|
|
Add Intel Panther Lake-H/P PCI IDs.
Signed-off-by: Ilpo Järvinen <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lee Jones <[email protected]>
|
|
Add Intel Arrow Lake-H PCI IDs.
Signed-off-by: Ilpo Järvinen <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lee Jones <[email protected]>
|
|
Document rk3576 compatible for QoS registers.
Signed-off-by: Detlev Casanova <[email protected]>
Reviewed-by: Krzysztof Kozlowski <[email protected]>
Acked-by: Heiko Stuebner <[email protected]>
Link: https://lore.kernel.org/r/01020191998a2fd4-4d7b091c-9c4c-4067-b8d9-fe7482074d6d-000000@eu-west-1.amazonses.com
Signed-off-by: Lee Jones <[email protected]>
|
|
Allow parsing GPIO controller children nodes with GPIO hogs.
Signed-off-by: Haibo Chen <[email protected]>
Reviewed-by: Laurent Pinchart <[email protected]>
Reviewed-by: Krzysztof Kozlowski <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lee Jones <[email protected]>
|
|
There's no need to list "tc3589x" in the DT match table. The I2C core
will strip any vendor prefix and match against the i2c_device_id table
which has a "tc3589x" entry.
Probably "tc3589x" and TC3589X_UNKNOWN could be removed altogether.
Use of that compatible was only on some STE platforms and was dropped
in 2013. There were ABI breaks in 2014 claiming no DTs in the wild. See
commit 1637d480f873 ("pinctrl: nomadik: force-convert to generic config
bindings").
Signed-off-by: Rob Herring (Arm) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lee Jones <[email protected]>
|
|
Avoids the need for manual cleanup of_node_put() in early exits
from the loop.
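The general idea of scope-based cleanup can be sketched in user space with the GCC/Clang 'cleanup' variable attribute. This is an illustration only: 'struct node', node_put() and find_first_match() are hypothetical stand-ins, not the device-tree API touched by this patch.

```c
/* Hypothetical stand-in for a reference-counted node. */
struct node {
	int refcount;
};

void node_put(struct node *n)
{
	if (n)
		n->refcount--;
}

/* Cleanup handler: receives a pointer to the annotated variable. */
void node_put_cleanup(struct node **np)
{
	node_put(*np);
}

/* Takes a reference and may return early. The reference is dropped
 * automatically whenever `held` goes out of scope. */
int find_first_match(struct node *candidate, int want_early_exit)
{
	__attribute__((cleanup(node_put_cleanup))) struct node *held = candidate;

	held->refcount++;
	if (want_early_exit)
		return 1;	/* no explicit node_put() needed here */
	return 0;
}
```

Because the release is tied to the variable's scope, adding another early return later cannot forget the put.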
Signed-off-by: Jinjie Ruan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lee Jones <[email protected]>
|
|
Avoids the need for manual cleanup of_node_put() in early exits
from the loop.
Signed-off-by: Jinjie Ruan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lee Jones <[email protected]>
|
|
There are 2G and 4G RAM versions of the Lenovo Yoga Tab 3 X90F and it
turns out that the 2G version has a DMI product name of
"CHERRYVIEW D1 PLATFORM" whereas the 4G version has
"CHERRYVIEW C0 PLATFORM". The sys-vendor + product-version checks are
unique enough that the product-name check is not necessary.
Drop the product-name check so that the existing DMI match for the 4G
RAM version also matches the 2G RAM version.
Signed-off-by: Hans de Goede <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lee Jones <[email protected]>
|
|
The module description can be backtracked to commit e7c256fbfb15
("platform/chrome: Add Chrome OS EC userspace device interface").
The description became out-of-date after a bunch of changes, e.g.:
- commit 5668bfdd90cd ("platform/chrome: cros_ec_dev - Register cros-ec sensors").
- commit ea01a31b9058 ("cros_ec: Split cros_ec_devs module").
- commit 5e0115581bbc ("cros_ec: Move cros_ec_dev module to drivers/mfd").
Update the description.
Signed-off-by: Tzung-Bi Shih <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lee Jones <[email protected]>
|
|
Simplify cros_ec_dev_init() by the following changes:
- Get rid of label `failed_devreg`.
- Remove a redundant space and comment.
- Use `if (ret)` instead of `if (ret < 0)`.
Signed-off-by: Tzung-Bi Shih <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lee Jones <[email protected]>
|
|
Add MODULE_DEVICE_TABLE(), so modules could be properly autoloaded
based on the alias from of_device_id table.
Signed-off-by: Liao Chen <[email protected]>
Reviewed-by: Krzysztof Kozlowski <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lee Jones <[email protected]>
|
|
The ArmSoM Sige 5 board connects the rk806 PMIC on an i2c bus.
Signed-off-by: Detlev Casanova <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lee Jones <[email protected]>
|
|
Fix the following error when building with CONFIG_GROUP_SCHED_WEIGHT &&
!CONFIG_FAIR_GROUP_SCHED:
kernel/sched/core.c:9634:15: error: implicit declaration of function
'sched_group_set_idle'; did you mean 'scx_group_set_idle'? [-Wimplicit-function-declaration]
9634 | ret = sched_group_set_idle(css_tg(css), idle);
| ^~~~~~~~~~~~~~~~~~~~
| scx_group_set_idle
Fixes: e179e80c5d4f ("sched: Introduce CONFIG_GROUP_SCHED_WEIGHT")
Reported-by: kernel test robot <[email protected]>
Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
Signed-off-by: Yu Liao <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
|
|
Hexagon images fail to build with the following error.
arch/hexagon/kernel/vdso.c:57:3: error: use of undeclared identifier 'name'
name = "[vdso]",
^
Add the missing '.' to fix the problem.
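For readers unfamiliar with the syntax, here is a stand-alone sketch of a designated initializer. The struct and variable names are hypothetical and are not the ones in arch/hexagon/kernel/vdso.c.

```c
/* Hypothetical stand-in for the mapping descriptor being initialized. */
struct special_mapping_demo {
	const char *name;
	int pages;
};

/* The leading '.' makes `name` a member designator. Without it, the
 * compiler reads `name = "[vdso]"` as an assignment to a variable called
 * `name`, which does not exist, hence "use of undeclared identifier". */
static const struct special_mapping_demo vdso_mapping_demo = {
	.name = "[vdso]",
	.pages = 1,
};

const char *mapping_name(void)
{
	return vdso_mapping_demo.name;
}
```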
Fixes: 497258dfafcc ("mm: remove legacy install_special_mapping() code")
Cc: Linus Torvalds <[email protected]>
Signed-off-by: Guenter Roeck <[email protected]>
Reviewed-by: Brian Cain <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Add SPDX identifier to the gitignore. Remove the comment and .i file
since the file it references was removed in another patch. This patch
depends on Min-Hua Chen's 'pm: cpupower: rename raw_pylibcpupower.i'.
Signed-off-by: John B. Wyatt IV <[email protected]>
Signed-off-by: John B. Wyatt IV <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>
|
|
I have been contributing to exfat for some time and I would like to help
with code reviews as well.
Signed-off-by: Yuezhang Mo <[email protected]>
Acked-by: Sungjong Seo <[email protected]>
Signed-off-by: Namjae Jeon <[email protected]>
|
|
If exfat_load_upcase_table reaches end and returns -EINVAL, the memory it
allocated is not freed, and exfat_load_default_upcase_table then
allocates more memory, leading to a memory leak.
Here's a link to the syzkaller crash report illustrating this issue:
https://syzkaller.appspot.com/text?tag=CrashReport&x=1406c201980000
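The leak pattern and one way to avoid it can be sketched in user space. This is a simplified illustration, not the exfat code: every name below is hypothetical, malloc-style allocation replaces the kernel allocator, and the failing load is simulated.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the per-superblock state. */
struct sb_info_demo {
	unsigned short *upcase_table;
};

int live_allocations;	/* demo-only counter so a leak is observable */

void free_table(struct sb_info_demo *sbi)
{
	if (sbi->upcase_table) {
		free(sbi->upcase_table);
		sbi->upcase_table = NULL;
		live_allocations--;
	}
}

/* Allocates a table, then fails as if the on-disk data were invalid.
 * The allocation is still attached to sbi when this returns. */
int load_table_from_disk(struct sb_info_demo *sbi)
{
	sbi->upcase_table = calloc(16, sizeof(*sbi->upcase_table));
	if (!sbi->upcase_table)
		return -1;
	live_allocations++;
	return -2;	/* simulated "invalid table" error */
}

int load_default_table(struct sb_info_demo *sbi)
{
	sbi->upcase_table = calloc(16, sizeof(*sbi->upcase_table));
	if (!sbi->upcase_table)
		return -1;
	live_allocations++;
	return 0;
}

/* Release whatever the failed load left behind before falling back. */
int create_table(struct sb_info_demo *sbi)
{
	if (load_table_from_disk(sbi) == 0)
		return 0;
	free_table(sbi);
	return load_default_table(sbi);
}
```

Without the free_table() call in create_table(), the fallback would overwrite the pointer and the first allocation would be lost.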
Reported-by: [email protected]
Fixes: a13d1a4de3b0 ("exfat: move freeing sbi, upcase table and dropping nls into rcu-delayed helper")
Cc: [email protected]
Signed-off-by: Daniel Yang <[email protected]>
Signed-off-by: Namjae Jeon <[email protected]>
|
|
Extending valid_size to the end of the mmap area by writing zeros in
mmap is not a good approach, because after calling mmap, no data may
be written, or only a small amount of data may be written to the head
of the mmap area.
This commit moves extending valid_size to exfat_page_mkwrite().
In exfat_page_mkwrite() only extend valid_size to the starting
position of new data writing, which reduces unnecessary writing
of zeros.
If the block is not mapped and is marked as new after being
mapped for writing, block_write_begin() will zero the page
cache corresponding to the block, so there is no need to call
zero_user_segment() in exfat_file_zeroed_range(). And after moving
extending valid_size to exfat_page_mkwrite(), the data written by
mmap will be copied to the page cache but the page cache may
not be mapped to the disk. Calling zero_user_segment() will cause
the data written by mmap to be cleared. So this commit removes
calling zero_user_segment() from exfat_file_zeroed_range() and
renames exfat_file_zeroed_range() to exfat_extend_valid_size().
Signed-off-by: Yuezhang Mo <[email protected]>
Signed-off-by: Namjae Jeon <[email protected]>
|
|
We should convert fs/fuse code to use a newly introduced
invalid_mnt_idmap instead of passing a NULL as idmap pointer.
Suggested-by: Christian Brauner <[email protected]>
Signed-off-by: Alexander Mikhalitsyn <[email protected]>
Reviewed-by: Christian Brauner <[email protected]>
Signed-off-by: Miklos Szeredi <[email protected]>
|
|
The correct macro name for creating a u32 array property entry is
PROPERTY_ENTRY_U32_ARRAY().
Reported-by: kernel test robot <[email protected]>
Fixes: 1b05a7013751 ("ARM: spitz: Use software nodes/properties for the matrix keypad")
Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
Signed-off-by: Dmitry Torokhov <[email protected]>
|
|
Link: https://lore.kernel.org/linux-fsdevel/20240904-baugrube-erhoben-b3c1c49a2645@brauner/
Suggested-by: Christian Brauner <[email protected]>
Signed-off-by: Alexander Mikhalitsyn <[email protected]>
Reviewed-by: Christian Brauner <[email protected]>
Signed-off-by: Miklos Szeredi <[email protected]>
|
|
Let's convert all existing callers properly.
No functional changes intended.
Suggested-by: Christian Brauner <[email protected]>
Signed-off-by: Alexander Mikhalitsyn <[email protected]>
Reviewed-by: Christian Brauner <[email protected]>
Signed-off-by: Miklos Szeredi <[email protected]>
|
|
It was reported [1] that on linux-next/fs-next the following crash
is reproducible:
[ 42.659136] Oops: general protection fault, probably for non-canonical address 0xdffffc000000000b: 0000 [#1] PREEMPT SMP KASAN NOPTI
[ 42.660501] fbcon: Taking over console
[ 42.660930] KASAN: null-ptr-deref in range [0x0000000000000058-0x000000000000005f]
[ 42.661752] CPU: 1 UID: 0 PID: 1589 Comm: dtprobed Not tainted 6.11.0-rc6+ #1
[ 42.662565] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.6.6 08/22/2023
[ 42.663472] RIP: 0010:fuse_get_req+0x36b/0x990 [fuse]
[ 42.664046] Code: 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 8c 05 00 00 48 b8 00 00 00 00 00 fc ff df 48 8b 6d 08 48 8d 7d 58 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 4d 05 00 00 f6 45 59 20 0f 85 06 03 00 00 48 83
[ 42.666945] RSP: 0018:ffffc900009a7730 EFLAGS: 00010212
[ 42.668837] RAX: dffffc0000000000 RBX: 1ffff92000134eed RCX: ffffffffc20dec9a
[ 42.670122] RDX: 000000000000000b RSI: 0000000000000008 RDI: 0000000000000058
[ 42.672154] RBP: 0000000000000000 R08: 0000000000000001 R09: ffffed1022110172
[ 42.672160] R10: ffff888110880b97 R11: ffffc900009a737a R12: 0000000000000001
[ 42.672179] R13: ffff888110880b60 R14: ffff888110880b90 R15: ffff888169973840
[ 42.672186] FS: 00007f28cd21d7c0(0000) GS:ffff8883ef280000(0000) knlGS:0000000000000000
[ 42.672191] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 42. CR2: 00007f3237366208 CR3: 000000012c79e001 CR4: 0000000000770ef0
[ 42.672214] PKRU: 55555554
[ 42.672218] Call Trace:
[ 42.672223] <TASK>
[ 42.672226] ? die_addr+0x41/0xa0
[ 42.672238] ? exc_general_protection+0x14c/0x230
[ 42.672250] ? asm_exc_general_protection+0x26/0x30
[ 42.672260] ? fuse_get_req+0x77a/0x990 [fuse]
[ 42.672281] ? fuse_get_req+0x36b/0x990 [fuse]
[ 42.672300] ? kasan_unpoison+0x27/0x60
[ 42.672310] ? __pfx_fuse_get_req+0x10/0x10 [fuse]
[ 42.672327] ? srso_alias_return_thunk+0x5/0xfbef5
[ 42.672333] ? alloc_pages_mpol_noprof+0x195/0x440
[ 42.672340] ? srso_alias_return_thunk+0x5/0xfbef5
[ 42.672345] ? kasan_unpoison+0x27/0x60
[ 42.672350] ? srso_alias_return_thunk+0x5/0xfbef5
[ 42.672355] ? __kasan_slab_alloc+0x4d/0x90
[ 42.672362] ? srso_alias_return_thunk+0x5/0xfbef5
[ 42.672367] ? __kmalloc_cache_noprof+0x134/0x350
[ 42.672376] fuse_simple_background+0xe7/0x180 [fuse]
[ 42.672406] cuse_channel_open+0x540/0x710 [cuse]
[ 42.672415] misc_open+0x2a7/0x3a0
[ 42.672424] chrdev_open+0x1ef/0x5f0
[ 42.672432] ? __pfx_chrdev_open+0x10/0x10
[ 42.672439] ? srso_alias_return_thunk+0x5/0xfbef5
[ 42.672443] ? security_file_open+0x3bb/0x720
[ 42.672451] do_dentry_open+0x43d/0x1200
[ 42.672459] ? __pfx_chrdev_open+0x10/0x10
[ 42.672468] vfs_open+0x79/0x340
[ 42.672475] ? srso_alias_return_thunk+0x5/0xfbef5
[ 42.672482] do_open+0x68c/0x11e0
[ 42.672489] ? srso_alias_return_thunk+0x5/0xfbef5
[ 42.672495] ? __pfx_do_open+0x10/0x10
[ 42.672501] ? srso_alias_return_thunk+0x5/0xfbef5
[ 42.672506] ? open_last_lookups+0x2a2/0x1370
[ 42.672515] path_openat+0x24f/0x640
[ 42.672522] ? __pfx_path_openat+0x10/0x10
[ 42.723972] ? stack_depot_save_flags+0x45/0x4b0
[ 42.724787] ? __fput+0x43c/0xa70
[ 42.725100] do_filp_open+0x1b3/0x3e0
[ 42.725710] ? poison_slab_object+0x10d/0x190
[ 42.726145] ? __kasan_slab_free+0x33/0x50
[ 42.726570] ? __pfx_do_filp_open+0x10/0x10
[ 42.726981] ? do_syscall_64+0x64/0x170
[ 42.727418] ? entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 42.728018] ? srso_alias_return_thunk+0x5/0xfbef5
[ 42.728505] ? do_raw_spin_lock+0x131/0x270
[ 42.728922] ? __pfx_do_raw_spin_lock+0x10/0x10
[ 42.729494] ? do_raw_spin_unlock+0x14c/0x1f0
[ 42.729992] ? srso_alias_return_thunk+0x5/0xfbef5
[ 42.730889] ? srso_alias_return_thunk+0x5/0xfbef5
[ 42.732178] ? alloc_fd+0x176/0x5e0
[ 42.732585] do_sys_openat2+0x122/0x160
[ 42.732929] ? __pfx_do_sys_openat2+0x10/0x10
[ 42.733448] ? srso_alias_return_thunk+0x5/0xfbef5
[ 42.734013] ? __pfx_map_id_up+0x10/0x10
[ 42.734482] ? srso_alias_return_thunk+0x5/0xfbef5
[ 42.735529] ? __memcg_slab_free_hook+0x292/0x500
[ 42.736131] __x64_sys_openat+0x123/0x1e0
[ 42.736526] ? __pfx___x64_sys_openat+0x10/0x10
[ 42.737369] ? __x64_sys_close+0x7c/0xd0
[ 42.737717] ? srso_alias_return_thunk+0x5/0xfbef5
[ 42.738192] ? syscall_trace_enter+0x11e/0x1b0
[ 42.738739] do_syscall_64+0x64/0x170
[ 42.739113] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 42.739638] RIP: 0033:0x7f28cd13e87b
[ 42.740038] Code: 25 00 00 41 00 3d 00 00 41 00 74 4b 64 8b 04 25 18 00 00 00 85 c0 75 67 44 89 e2 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 91 00 00 00 48 8b 54 24 28 64 48 2b 14 25
[ 42.741943] RSP: 002b:00007ffc992546c0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
[ 42.742951] RAX: ffffffffffffffda RBX: 00007f28cd44f1ee RCX: 00007f28cd13e87b
[ 42.743660] RDX: 0000000000000002 RSI: 00007f28cd44f2fa RDI: 00000000ffffff9c
[ 42.744518] RBP: 00007f28cd44f2fa R08: 0000000000000000 R09: 0000000000000001
[ 42.745211] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000002
[ 42.745920] R13: 00007f28cd44f2fa R14: 0000000000000000 R15: 0000000000000003
[ 42.746708] </TASK>
[ 42.746937] Modules linked in: cuse vfat fat ext4 mbcache jbd2 intel_rapl_msr intel_rapl_common kvm_amd ccp bochs drm_vram_helper kvm drm_ttm_helper ttm pcspkr i2c_piix4 drm_kms_helper i2c_smbus pvpanic_mmio pvpanic joydev sch_fq_codel drm fuse xfs nvme_tcp nvme_fabrics nvme_core sd_mod sg virtio_net net_failover virtio_scsi failover crct10dif_pclmul crc32_pclmul ata_generic pata_acpi ata_piix ghash_clmulni_intel virtio_pci sha512_ssse3 virtio_pci_legacy_dev sha256_ssse3 virtio_pci_modern_dev sha1_ssse3 libata serio_raw dm_multipath btrfs blake2b_generic xor zstd_compress raid6_pq sunrpc dm_mirror dm_region_hash dm_log dm_mod be2iscsi bnx2i cnic uio cxgb4i cxgb4 tls cxgb3i cxgb3 mdio libcxgbi libcxgb qla4xxx iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi qemu_fw_cfg aesni_intel crypto_simd cryptd
[ 42.754333] ---[ end trace 0000000000000000 ]---
[ 42.756899] RIP: 0010:fuse_get_req+0x36b/0x990 [fuse]
[ 42.757851] Code: 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 8c 05 00 00 48 b8 00 00 00 00 00 fc ff df 48 8b 6d 08 48 8d 7d 58 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 4d 05 00 00 f6 45 59 20 0f 85 06 03 00 00 48 83
[ 42.760334] RSP: 0018:ffffc900009a7730 EFLAGS: 00010212
[ 42.760940] RAX: dffffc0000000000 RBX: 1ffff92000134eed RCX: ffffffffc20dec9a
[ 42.761697] RDX: 000000000000000b RSI: 0000000000000008 RDI: 0000000000000058
[ 42.763009] RBP: 0000000000000000 R08: 0000000000000001 R09: ffffed1022110172
[ 42.763920] R10: ffff888110880b97 R11: ffffc900009a737a R12: 0000000000000001
[ 42.764839] R13: ffff888110880b60 R14: ffff888110880b90 R15: ffff888169973840
[ 42.765716] FS: 00007f28cd21d7c0(0000) GS:ffff8883ef280000(0000) knlGS:0000000000000000
[ 42.766890] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 42.767828] CR2: 00007f3237366208 CR3: 000000012c79e001 CR4: 0000000000770ef0
[ 42.768730] PKRU: 55555554
[ 42.769022] Kernel panic - not syncing: Fatal exception
[ 42.770758] Kernel Offset: 0x7200000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 42.771947] ---[ end Kernel panic - not syncing: Fatal exception ]---
This is obviously a CUSE-related call stack. In the CUSE case, we don't have a superblock, and
our checks for the SB_I_NOIDMAP flag do not make any sense. Let's handle this case gracefully.
Fixes: aa16880d9f13 ("fuse: add basic infrastructure to support idmappings")
Link: https://lore.kernel.org/linux-next/87v7z586py.fsf@debian-BULLSEYE-live-builder-AMD64/ [1]
Reported-by: Chandan Babu R <[email protected]>
Reported-by: [email protected]
Signed-off-by: Alexander Mikhalitsyn <[email protected]>
Reviewed-by: Christian Brauner <[email protected]>
Signed-off-by: Miklos Szeredi <[email protected]>
|
|
While using the IOMMU DMA path, the dma_addressing_limited() function
checks the ops struct, which doesn't exist in the IOMMU case. This causes
a kernel panic while loading the AMDGPU driver.
BUG: kernel NULL pointer dereference, address: 00000000000000a0
PGD 0 P4D 0
Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 10 UID: 0 PID: 611 Comm: (udev-worker) Tainted: G T 6.11.0-clang-07154-g726e2d0cf2bb #257
Tainted: [T]=RANDSTRUCT
Hardware name: ASUS System Product Name/ROG STRIX Z690-G GAMING WIFI, BIOS 3701 07/03/2024
RIP: 0010:dma_addressing_limited+0x53/0xa0
Code: 8b 93 48 02 00 00 48 39 d1 49 89 d6 4c 0f 42 f1 48 85 d2 4c 0f 44 f1 f6 83 fc 02 00 00 40 75 0a 48 89 df e8 1f 09 00 00 eb 24 <4c> 8b 1c 25 a0 00 00 00 4d 85 db 74 17 48 89 df 41 ba 8b 84 2d 55
RSP: 0018:ffffa8d2c12cf740 EFLAGS: 00010202
RAX: 00000000ffffffff RBX: ffff8948820220c8 RCX: 000000ffffffffff
RDX: 0000000000000000 RSI: ffffffffc124dc6d RDI: ffff8948820220c8
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff894883c3f040
R13: ffff89488dac8828 R14: 000000ffffffffff R15: ffff8948820220c8
FS: 00007fe6ba881900(0000) GS:ffff894fdf700000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000a0 CR3: 0000000111984000 CR4: 0000000000f50ef0
PKRU: 55555554
Call Trace:
<TASK>
? __die_body+0x65/0xc0
? page_fault_oops+0x3b9/0x450
? _prb_read_valid+0x212/0x390
? do_user_addr_fault+0x608/0x680
? exc_page_fault+0x4e/0xa0
? asm_exc_page_fault+0x26/0x30
? dma_addressing_limited+0x53/0xa0
amdgpu_ttm_init+0x56/0x4b0 [amdgpu]
gmc_v8_0_sw_init+0x561/0x670 [amdgpu]
amdgpu_device_ip_init+0xf5/0x570 [amdgpu]
amdgpu_device_init+0x1a57/0x1ea0 [amdgpu]
? _raw_spin_unlock_irqrestore+0x1a/0x40
? pci_conf1_read+0xc0/0xe0
? pci_bus_read_config_word+0x52/0xa0
amdgpu_driver_load_kms+0x15/0xa0 [amdgpu]
amdgpu_pci_probe+0x1b7/0x4c0 [amdgpu]
pci_device_probe+0x1c5/0x260
really_probe+0x130/0x470
__driver_probe_device+0x77/0x150
driver_probe_device+0x19/0x120
__driver_attach+0xb1/0x1e0
? __cfi___driver_attach+0x10/0x10
bus_for_each_dev+0x115/0x170
bus_add_driver+0x192/0x2d0
driver_register+0x5c/0xf0
? __cfi_init_module+0x10/0x10 [amdgpu]
do_one_initcall+0x128/0x380
? idr_alloc_cyclic+0x139/0x1d0
? security_kernfs_init_security+0x42/0x140
? __kernfs_new_node+0x1be/0x250
? sysvec_apic_timer_interrupt+0xb6/0xc0
? asm_sysvec_apic_timer_interrupt+0x1a/0x20
? _raw_spin_unlock+0x11/0x30
? free_unref_page+0x283/0x650
? kfree+0x274/0x3a0
? kfree+0x274/0x3a0
? kfree+0x274/0x3a0
? load_module+0xf2e/0x1130
? __kmalloc_cache_noprof+0x12a/0x2e0
do_init_module+0x7d/0x240
__se_sys_init_module+0x19e/0x220
do_syscall_64+0x8a/0x150
? __irq_exit_rcu+0x5e/0x100
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7fe6bb5980ee
Code: 48 8b 0d 3d ed 12 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 0a ed 12 00 f7 d8 64 89 01 48
RSP: 002b:00007ffd462219d8 EFLAGS: 00000206 ORIG_RAX: 00000000000000af
RAX: ffffffffffffffda RBX: 0000556caf0d0670 RCX: 00007fe6bb5980ee
RDX: 0000556caf0d3080 RSI: 0000000002893458 RDI: 00007fe6b3400010
RBP: 0000000000020000 R08: 0000000000020010 R09: 0000000000000080
R10: c26073c166186e00 R11: 0000000000000206 R12: 0000556caf0d3430
R13: 0000556caf0d0670 R14: 0000556caf0d3080 R15: 0000556caf0ce700
</TASK>
Modules linked in: amdgpu(+) i915(+) drm_suballoc_helper intel_gtt drm_exec drm_buddy iTCO_wdt i2c_algo_bit intel_pmc_bxt drm_display_helper iTCO_vendor_support gpu_sched drm_ttm_helper cec ttm amdxcp video backlight pinctrl_alderlake nct6775 hwmon_vid nct6775_core coretemp
CR2: 00000000000000a0
---[ end trace 0000000000000000 ]---
RIP: 0010:dma_addressing_limited+0x53/0xa0
Code: 8b 93 48 02 00 00 48 39 d1 49 89 d6 4c 0f 42 f1 48 85 d2 4c 0f 44 f1 f6 83 fc 02 00 00 40 75 0a 48 89 df e8 1f 09 00 00 eb 24 <4c> 8b 1c 25 a0 00 00 00 4d 85 db 74 17 48 89 df 41 ba 8b 84 2d 55
RSP: 0018:ffffa8d2c12cf740 EFLAGS: 00010202
RAX: 00000000ffffffff RBX: ffff8948820220c8 RCX: 000000ffffffffff
RDX: 0000000000000000 RSI: ffffffffc124dc6d RDI: ffff8948820220c8
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff894883c3f040
R13: ffff89488dac8828 R14: 000000ffffffffff R15: ffff8948820220c8
FS: 00007fe6ba881900(0000) GS:ffff894fdf700000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000a0 CR3: 0000000111984000 CR4: 0000000000f50ef0
PKRU: 55555554
Fixes: b5c58b2fdc42 ("dma-mapping: direct calls for dma-iommu")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219292
Reported-by: Niklāvs Koļesņikovs <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
Tested-by: Niklāvs Koļesņikovs <[email protected]>
|
|
Many architectures support load acquire which can replace a memory
barrier and save some cycles.
A typical sequence
  do {
    seq = read_seqcount_begin(&s);
    <something>
  } while (read_seqcount_retry(&s, seq));
requires 13 cycles on an N1 Neoverse arm64 core (Ampere Altra, to be
specific) for an empty loop. Two read memory barriers are needed, one
for each of the seqcount_* functions.
We can replace the first read barrier with a load acquire of the
seqcount which saves us one barrier.
On the Altra doing so reduces the cycle count from 13 to 8.
According to ARM, this is a general improvement for the ARM64
architecture and not specific to a certain processor.
See
https://developer.arm.com/documentation/102336/0100/Load-Acquire-and-Store-Release-instructions
"Weaker ordering requirements that are imposed by Load-Acquire and
Store-Release instructions allow for micro-architectural
optimizations, which could reduce some of the performance impacts that
are otherwise imposed by an explicit memory barrier.
If the ordering requirement is satisfied using either a Load-Acquire
or Store-Release, then it would be preferable to use these
instructions instead of a DMB"
[ NOTE! This is my original minimal patch that unconditionally switches
over to using smp_load_acquire(), instead of the much more involved
and subtle patch that Christoph Lameter wrote that made it
conditional.
But Christoph gets authorship credit because I had initially thought
that we needed the more complex model, and Christoph ran with it
and did the work. Only after looking at code generation for all the
relevant architectures, did I come to the conclusion that nobody
actually really needs the old "smp_rmb()" model.
Even architectures without load-acquire support generally do as well
or better with smp_load_acquire().
So credit to Christoph, but if this then causes issues on other
architectures, put the blame solidly on me.
Also note as part of the ruthless simplification, this gets rid of the
overly subtle optimization where some code uses a non-barrier version
of the sequence count (see the __read_seqcount_begin() users in
fs/namei.c). They then play games with their own barriers and/or with
nested sequence counts.
Those optimizations are literally meaningless on x86, and questionable
elsewhere. If somebody can show that they matter, we need to re-do
them more cleanly than "use an internal helper". - Linus ]
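The reader sequence discussed above can be sketched in user space with C11 atomics. This is only an illustration under stated assumptions, not the kernel implementation: `struct seq_demo` and every `demo_*` function are hypothetical names, a single writer is assumed, and `atomic_load_explicit(..., memory_order_acquire)` stands in for `smp_load_acquire()`. The point it shows is that the begin side needs no separate read barrier once the counter itself is loaded with acquire semantics, while the retry side still needs its own ordering.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical user-space sequence counter; NOT the kernel's seqcount_t. */
struct seq_demo {
    atomic_uint seq;   /* odd while a write is in progress */
    atomic_int a, b;   /* payload, accessed with relaxed atomics */
};

/* Begin a read section: one acquire load replaces "plain load + read barrier". */
static unsigned demo_read_begin(struct seq_demo *s)
{
    unsigned seq;

    while ((seq = atomic_load_explicit(&s->seq, memory_order_acquire)) & 1)
        ;   /* a writer is active, spin until the count is even */
    return seq;
}

/* End a read section: the second barrier is still required here. */
static int demo_read_retry(struct seq_demo *s, unsigned start)
{
    /* Order the payload loads before re-reading the counter. */
    atomic_thread_fence(memory_order_acquire);
    return atomic_load_explicit(&s->seq, memory_order_relaxed) != start;
}

/* Single-writer update (assumption: callers serialize writers themselves). */
static void demo_write(struct seq_demo *s, int a, int b)
{
    unsigned seq = atomic_load_explicit(&s->seq, memory_order_relaxed);

    atomic_store_explicit(&s->seq, seq + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&s->a, a, memory_order_relaxed);
    atomic_store_explicit(&s->b, b, memory_order_relaxed);
    atomic_store_explicit(&s->seq, seq + 2, memory_order_release);
}

/* The "typical sequence" from the text, reading a consistent pair. */
static int demo_read_sum(struct seq_demo *s)
{
    unsigned seq;
    int a, b;

    do {
        seq = demo_read_begin(s);
        a = atomic_load_explicit(&s->a, memory_order_relaxed);
        b = atomic_load_explicit(&s->b, memory_order_relaxed);
    } while (demo_read_retry(s, seq));
    return a + b;
}
```

On arm64 a compiler would typically emit a load-acquire instruction for the acquire load in `demo_read_begin()`, which is the instruction-level saving the message describes; the cycle counts quoted above come from the commit message, not from this sketch.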
Signed-off-by: Christoph Lameter (Ampere) <[email protected]>
Link: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Merge user access fast validation using address masking.
This allows architectures to optionally use a data dependent address
masking model instead of a conditional branch for validating user
accesses. That avoids the Spectre-v1 speculation barriers.
Right now only x86-64 takes advantage of this, and not all architectures
will be able to do it. It requires a guard region between the user and
kernel address spaces (so that you can't overflow from one to the
other), and an easy way to generate a guaranteed-to-fault address for
invalid user pointers.
Also note that this currently assumes that there is no difference
between user read and write accesses. If extended to architectures like
powerpc, we'll also need to separate out the user read-vs-write cases.
* address-masking:
x86: make the masked_user_access_begin() macro use its argument only once
x86: do the user address masking outside the user access area
x86: support user address masking instead of non-speculative conditional
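The data-dependent masking model described in this merge can be illustrated with a small user-space sketch. It assumes a 64-bit layout in which user addresses have the top bit clear and in which the all-ones address is the guaranteed-to-fault value; `demo_mask_user_address()` is a hypothetical name, and the kernel's actual macro may be formulated differently.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sketch of branch-free user address validation.
 * Assumption: bit 63 set means "not a user address".
 */
static inline uint64_t demo_mask_user_address(uint64_t addr)
{
    /* 0 for user addresses, all ones when the top bit is set. */
    uint64_t mask = (uint64_t)0 - (addr >> 63);

    /*
     * Valid pointers pass through unchanged; invalid ones collapse to
     * the all-ones address, which is assumed to always fault.
     */
    return addr | mask;
}
```

Because the result depends on the pointer value through arithmetic rather than through a conditional branch, there is no branch for the CPU to mispredict, which is why no speculation barrier is needed in this model. The guard region matters because an access at a small offset from a valid user pointer must not reach into kernel addresses.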
|
|
This doesn't actually matter for any of the current users, but before
merging it into mainline, make sure we don't have any surprising semantics.
We don't actually want to use an inline function here, because we want
to allow - but not require - const pointer arguments, and return them as
such. But we already had a local auto-type variable, so let's just use
it to avoid any possible double evaluation.
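The double-evaluation concern can be illustrated with a pair of hypothetical macros (`DEMO_MASK_TWICE`, `DEMO_MASK_ONCE`) that are not the kernel's definitions. The sketch relies on the GCC/Clang extensions `__auto_type`, `__typeof__` and statement expressions, and shows how a local auto-type variable keeps the argument's type, including `const`, while evaluating it only once.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_TOP_BIT (sizeof(uintptr_t) * 8 - 1)

/* Naive form: the argument expression is evaluated twice. */
#define DEMO_MASK_TWICE(p) \
    ((__typeof__(p))((uintptr_t)(p) | \
                     ((uintptr_t)0 - ((uintptr_t)(p) >> DEMO_TOP_BIT))))

/* Safer form: capture the argument once in an auto-typed local. */
#define DEMO_MASK_ONCE(p) ({                                            \
    __auto_type demo_p_ = (p);                                          \
    (__typeof__(demo_p_))((uintptr_t)demo_p_ |                          \
        ((uintptr_t)0 - ((uintptr_t)demo_p_ >> DEMO_TOP_BIT)));         \
})

/* Helper with a visible side effect so evaluations can be counted. */
static int demo_calls;

static const char *demo_next(const char *p)
{
    demo_calls++;
    return p;
}
```

An inline function would fix the double evaluation too, but it would have to pick one parameter type; the macro form accepts both `const` and non-`const` pointers and returns the same type it was given.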
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The direct calls from mapping.c are all guarded by use_dma_iommu(), so don't
bother to provide stubs, but instead just expose the prototypes
unconditionally.
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
|
|
Commit b5c58b2fdc42 ("dma-mapping: direct calls for dma-iommu") switched
to use direct calls to dma-iommu, but missed the dma_vmap_noncontiguous,
dma_vunmap_noncontiguous and dma_mmap_noncontiguous behavior keyed off the
presence of the alloc_noncontiguous method.
Fix this by removing the now unused alloc_noncontiguous and
free_noncontiguous methods and moving the vmapping and mmapping of the
noncontiguous allocations into the iommu code, as it is the only provider
of actually noncontiguous allocations.
Fixes: b5c58b2fdc42 ("dma-mapping: direct calls for dma-iommu")
Reported-by: Xi Ruoyao <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Tested-by: Xi Ruoyao <[email protected]>
|