Age | Commit message (Collapse) | Author | Files | Lines |
|
Reserved regions with direct mapping may contain references to other
regions. CMA region with fixed location is reserved without creating
kmemleak_object for it.
So add them as gray kmemleak objects.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Calvin Zhang <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: Frank Rowand <[email protected]>
Cc: Catalin Marinas <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
With HW tag-based kasan enable, We will get the warning when we free
object whose address starts with 0xFF.
It is because kmemleak rbtree stores tagged object and this freeing
object's tag does not match with rbtree object.
In the example below, kmemleak rbtree stores the tagged object in the
kmalloc(), and kfree() gets the pointer with 0xFF tag.
Call sequence:
ptr = kmalloc(size, GFP_KERNEL);
page = virt_to_page(ptr);
offset = offset_in_page(ptr);
kfree(page_address(page) + offset);
ptr = kmalloc(size, GFP_KERNEL);
A sequence like that may cause the warning as following:
1) Freeing unknown object:
In kfree(), we will get free unknown object warning in
kmemleak_free(). Because object(0xFx) in kmemleak rbtree and
pointer(0xFF) in kfree() have different tag.
2) Overlap existing:
When we allocate that object with the same hw-tag again, we will
find the overlap in the kmemleak rbtree and kmemleak thread will be
killed.
kmemleak: Freeing unknown object at 0xffff000003f88000
CPU: 5 PID: 177 Comm: cat Not tainted 5.16.0-rc1-dirty #21
Hardware name: linux,dummy-virt (DT)
Call trace:
dump_backtrace+0x0/0x1ac
show_stack+0x1c/0x30
dump_stack_lvl+0x68/0x84
dump_stack+0x1c/0x38
kmemleak_free+0x6c/0x70
slab_free_freelist_hook+0x104/0x200
kmem_cache_free+0xa8/0x3d4
test_version_show+0x270/0x3a0
module_attr_show+0x28/0x40
sysfs_kf_seq_show+0xb0/0x130
kernfs_seq_show+0x30/0x40
seq_read_iter+0x1bc/0x4b0
seq_read_iter+0x1bc/0x4b0
kernfs_fop_read_iter+0x144/0x1c0
generic_file_splice_read+0xd0/0x184
do_splice_to+0x90/0xe0
splice_direct_to_actor+0xb8/0x250
do_splice_direct+0x88/0xd4
do_sendfile+0x2b0/0x344
__arm64_sys_sendfile64+0x164/0x16c
invoke_syscall+0x48/0x114
el0_svc_common.constprop.0+0x44/0xec
do_el0_svc+0x74/0x90
el0_svc+0x20/0x80
el0t_64_sync_handler+0x1a8/0x1b0
el0t_64_sync+0x1ac/0x1b0
...
kmemleak: Cannot insert 0xf2ff000003f88000 into the object search tree (overlaps existing)
CPU: 5 PID: 178 Comm: cat Not tainted 5.16.0-rc1-dirty #21
Hardware name: linux,dummy-virt (DT)
Call trace:
dump_backtrace+0x0/0x1ac
show_stack+0x1c/0x30
dump_stack_lvl+0x68/0x84
dump_stack+0x1c/0x38
create_object.isra.0+0x2d8/0x2fc
kmemleak_alloc+0x34/0x40
kmem_cache_alloc+0x23c/0x2f0
test_version_show+0x1fc/0x3a0
module_attr_show+0x28/0x40
sysfs_kf_seq_show+0xb0/0x130
kernfs_seq_show+0x30/0x40
seq_read_iter+0x1bc/0x4b0
kernfs_fop_read_iter+0x144/0x1c0
generic_file_splice_read+0xd0/0x184
do_splice_to+0x90/0xe0
splice_direct_to_actor+0xb8/0x250
do_splice_direct+0x88/0xd4
do_sendfile+0x2b0/0x344
__arm64_sys_sendfile64+0x164/0x16c
invoke_syscall+0x48/0x114
el0_svc_common.constprop.0+0x44/0xec
do_el0_svc+0x74/0x90
el0_svc+0x20/0x80
el0t_64_sync_handler+0x1a8/0x1b0
el0t_64_sync+0x1ac/0x1b0
kmemleak: Kernel memory leak detector disabled
kmemleak: Object 0xf2ff000003f88000 (size 128):
kmemleak: comm "cat", pid 177, jiffies 4294921177
kmemleak: min_count = 1
kmemleak: count = 0
kmemleak: flags = 0x1
kmemleak: checksum = 0
kmemleak: backtrace:
kmem_cache_alloc+0x23c/0x2f0
test_version_show+0x1fc/0x3a0
module_attr_show+0x28/0x40
sysfs_kf_seq_show+0xb0/0x130
kernfs_seq_show+0x30/0x40
seq_read_iter+0x1bc/0x4b0
kernfs_fop_read_iter+0x144/0x1c0
generic_file_splice_read+0xd0/0x184
do_splice_to+0x90/0xe0
splice_direct_to_actor+0xb8/0x250
do_splice_direct+0x88/0xd4
do_sendfile+0x2b0/0x344
__arm64_sys_sendfile64+0x164/0x16c
invoke_syscall+0x48/0x114
el0_svc_common.constprop.0+0x44/0xec
do_el0_svc+0x74/0x90
kmemleak: Automatic memory scanning thread ended
[[email protected]: whitespace tweak]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kuan-Ying Lee <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
Cc: Doug Berger <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
There is no external users of slab_start/next/stop(), so make them
static. And the memory.kmem.slabinfo is deprecated, which outputs
nothing now, so move memcg_slab_show() into mm/memcontrol.c and rename
it to mem_cgroup_slab_show to be consistent with other function names.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Muchun Song <[email protected]>
Reviewed-by: Vlastimil Babka <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Pekka Enberg <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Calling kmem_cache_destroy() while the cache still has objects allocated
is a kernel bug, and will usually result in the entire cache being
leaked. While the message in kmem_cache_destroy() resembles a warning,
it is currently not implemented using a real WARN().
This is problematic for infrastructure testing the kernel, all of which
rely on the specific format of WARN()s to pick up on bugs.
Some 13 years ago this used to be a simple WARN_ON() in slub, but commit
d629d8195793 ("slub: improve kmem_cache_destroy() error message")
changed it into an open-coded warning to avoid confusion with a bug in
slub itself.
Instead, turn the open-coded warning into a real WARN() with the message
preserved, so that test systems can actually identify these issues, and
we get all the other benefits of using a normal WARN(). The warning
message is extended with "when called from <caller-ip>" to make it even
clearer where the fault lies.
For most configurations this is only a cosmetic change, however, note
that WARN() here will now also respect panic_on_warn.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Marco Elver <[email protected]>
Reviewed-by: Vlastimil Babka <[email protected]>
Acked-by: David Rientjes <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Pekka Enberg <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Ingo Molnar <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
__user annotations are used by the checker (e.g sparse) to mark user
pointers. However here __user is applied to a struct directly, without a
pointer being directly involved.
Although the presence of __user does not cause sparse to emit a warning,
__user should be removed for consistency with other uses of offsetof().
Note: No functional changes intended.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Amit Daniel Kachhap <[email protected]>
Cc: Vincenzo Frascino <[email protected]>
Cc: Kevin Brodsky <[email protected]>
Cc: Al Viro <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The variable 'free_space' is being initialized with a value that is not
read, it is being re-assigned later in the two paths of an if statement.
The early initialization is redundant and can be removed.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Colin Ian King <[email protected]>
Acked-by: Joseph Qi <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Changwei Ge <[email protected]>
Cc: Gang He <[email protected]>
Cc: Jun Piao <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
There are currently two ways to create a set of sysfs files for a
kobj_type, through the default_attrs field, and the default_groups
field.
Move the ocfs2 cluster sysfs code to use default_groups field which has
been the preferred way since aa30f47cf666 ("kobject: Add support for
default attribute groups to kobj_type") so that we can soon get rid of
the obsolete default_attrs field.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Reviewed-by: Joseph Qi <[email protected]>
Tested-by: Joseph Qi <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Changwei Ge <[email protected]>
Cc: Gang He <[email protected]>
Cc: Jun Piao <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The variable 'root_bh' is being initialized with a value that is not
read, it is being re-assigned later on closer to its use. The early
initialization is redundant and can be removed.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Colin Ian King <[email protected]>
Acked-by: Joseph Qi <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Changwei Ge <[email protected]>
Cc: Gang He <[email protected]>
Cc: Jun Piao <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
There are currently two ways to create a set of sysfs files for a
kobj_type, through the default_attrs field, and the default_groups
field.
Move the ocfs2 code to use default_groups field which has been the
preferred way since aa30f47cf666 ("kobject: Add support for default
attribute groups to kobj_type") so that we can soon get rid of the
obsolete default_attrs field.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Acked-by: Joseph Qi <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Changwei Ge <[email protected]>
Cc: Gang He <[email protected]>
Cc: Jun Piao <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
ocfs2_grab_pages_for_write() may return -EAGAIN if write context type is
mmap and it could not lock the target page. In this case, we exit with
no error and no target page. And then trigger the caller page_mkwrite()
to retry.
Since there are other caller types, e.g. buffer and direct io, make the
return value handling more clear.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Joseph Qi <[email protected]>
Reported-by: Dan Carpenter <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Changwei Ge <[email protected]>
Cc: Gang He <[email protected]>
Cc: Jun Piao <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
This issue was detected with the help of Coccinelle.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Zhang Mingyu <[email protected]>
Reported-by: Zeal Robot <[email protected]>
Acked-by: Joseph Qi <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Changwei Ge <[email protected]>
Cc: Gang He <[email protected]>
Cc: Jun Piao <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Commit c1f6925e1091 ("mm: put readahead pages in cache earlier") causes
the read performance of squashfs to deteriorate.Through testing, we find
that the performance will be back by closing the readahead of squashfs.
So we want to learn the way of ubifs, provides backing_dev_info and
disable read-ahead
We tested the following data by fio.
squashfs image blocksize=128K
test command:
fio --name basic --bs=? --filename="/mnt/test_file" --rw=? --iodepth=1 --ioengine=psync --runtime=200 --time_based
turn on squashfs readahead in 5.10 kernel
bs(k) read/randread MB/s
4 randread 271
128 randread 231
1024 randread 246
4 read 310
128 read 245
1024 read 247
turn off squashfs readahead in 5.10 kernel
bs(k) read/randread MB/s
4 randread 293
128 randread 330
1024 randread 363
4 read 338
128 read 360
1024 read 365
turn on squashfs readahead and revert the
commit c1f6925e1091("mm: put readahead
pages in cache earlier") in 5.10 kernel
bs(k) read/randread MB/s
4 randread 289
128 randread 306
1024 randread 335
4 read 337
128 read 336
1024 read 338
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Zheng Liang <[email protected]>
Reviewed-by: Phillip Lougher <[email protected]>
Cc: Zhang Yi <[email protected]>
Cc: Hou Tao <[email protected]>
Cc: Miao Xie <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The comments for the file should not be in kernel-doc format:
/**
* attrib.c - NTFS attribute operations. Part of the Linux-NTFS
as it causes it to be incorrectly identified for function
ntfs_map_runlist_nolock(), causing some warnings found by running
scripts/kernel-doc.:
fs/ntfs/attrib.c:25: warning: Incorrect use of kernel-doc format: * ntfs_map_runlist_nolock - map (a part of) a runlist of an ntfs inode
fs/ntfs/attrib.c:71: warning: Function parameter or member 'ni' not described in 'ntfs_map_runlist_nolock'
fs/ntfs/attrib.c:71: warning: Function parameter or member 'vcn' not described in 'ntfs_map_runlist_nolock'
fs/ntfs/attrib.c:71: warning: Function parameter or member 'ctx' not described in 'ntfs_map_runlist_nolock'
fs/ntfs/attrib.c:71: warning: expecting prototype for attrib.c - NTFS attribute operations. Part of the Linux(). Prototype was for ntfs_map_runlist_nolock() instead
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Yang Li <[email protected]>
Reported-by: Abaci Robot <[email protected]>
Acked-by: Randy Dunlap <[email protected]>
Cc: Anton Altaparmakov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Add typo "oveflow" for "overflow". This typo was found and fixed in
tools/testing/selftests/bpf/prog_tests/btf_dump.c
Link: https://lore.kernel.org/all/[email protected]/
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Drew Fustini <[email protected]>
Suggested-by: Gustavo A. R. Silva <[email protected]>
Cc: Colin Ian King <[email protected]>
Cc: Drew Fustini <[email protected]>
Cc: zuoqilin <[email protected]>
Cc: Tom Saeger <[email protected]>
Cc: Sven Eckelmann <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
There are currently two ways to create a set of sysfs files for a kobj_type,
through the default_attrs field, and the default_groups field.
Move the ia64 topology sysfs code to use default_groups field which has
been the preferred way since aa30f47cf666 ("kobject: Add support for
default attribute groups to kobj_type") so that we can soon get rid of
the obsolete default_attrs field.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: David Hildenbrand <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The double `the' in a comment is repeated, thus it should be removed.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Jason Wang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Use the macro 'swap()' defined in 'include/linux/minmax.h' to avoid
opencoding it.
Link: https://lkml.kernel.org/r/[email protected]
Reported-by: Zeal Robot <[email protected]>
Signed-off-by: Yang Guang <[email protected]>
Cc: David Yang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Use the macro 'swap()' defined in 'include/linux/minmax.h' to avoid
opencoding it.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Yang Guang <[email protected]>
Reported-by: Zeal Robot <[email protected]>
Cc: David Yang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Replace kthread_create_on_cpu/wake_up_process() with kthread_run_on_cpu()
to simplify the code.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Cai Huoqing <[email protected]>
Cc: Bernard Metzler <[email protected]>
Cc: Daniel Bristot de Oliveira <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Joel Fernandes (Google) <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Lai Jiangshan <[email protected]>
Cc: Mathieu Desnoyers <[email protected]>
Cc: "Paul E . McKenney" <[email protected]>
Cc: Steven Rostedt <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Replace kthread_create_on_cpu/wake_up_process() with kthread_run_on_cpu()
to simplify the code.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Cai Huoqing <[email protected]>
Cc: Bernard Metzler <[email protected]>
Cc: Daniel Bristot de Oliveira <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Joel Fernandes (Google) <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Lai Jiangshan <[email protected]>
Cc: Mathieu Desnoyers <[email protected]>
Cc: "Paul E . McKenney" <[email protected]>
Cc: Steven Rostedt <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Replace kthread_create_on_node/kthread_bind/wake_up_process() with
kthread_run_on_cpu() to simplify the code.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Cai Huoqing <[email protected]>
Cc: Bernard Metzler <[email protected]>
Cc: Daniel Bristot de Oliveira <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Joel Fernandes (Google) <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Lai Jiangshan <[email protected]>
Cc: Mathieu Desnoyers <[email protected]>
Cc: "Paul E . McKenney" <[email protected]>
Cc: Steven Rostedt <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Replace kthread_create/kthread_bind/wake_up_process() with
kthread_run_on_cpu() to simplify the code.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Cai Huoqing <[email protected]>
Cc: Bernard Metzler <[email protected]>
Cc: Daniel Bristot de Oliveira <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Joel Fernandes (Google) <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Lai Jiangshan <[email protected]>
Cc: Mathieu Desnoyers <[email protected]>
Cc: "Paul E . McKenney" <[email protected]>
Cc: Steven Rostedt <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Replace kthread_create/kthread_bind/wake_up_process() with
kthread_run_on_cpu() to simplify the code.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Cai Huoqing <[email protected]>
Cc: Bernard Metzler <[email protected]>
Cc: Daniel Bristot de Oliveira <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Joel Fernandes (Google) <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Lai Jiangshan <[email protected]>
Cc: Mathieu Desnoyers <[email protected]>
Cc: "Paul E . McKenney" <[email protected]>
Cc: Steven Rostedt <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Add a new helper function kthread_run_on_cpu(), which includes
kthread_create_on_cpu/wake_up_process().
In some cases, use kthread_run_on_cpu() directly instead of
kthread_create_on_node/kthread_bind/wake_up_process() or
kthread_create_on_cpu/wake_up_process() or
kthreadd_create/kthread_bind/wake_up_process() to simplify the code.
[[email protected]: export kthread_create_on_cpu to modules]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Cai Huoqing <[email protected]>
Cc: Bernard Metzler <[email protected]>
Cc: Cai Huoqing <[email protected]>
Cc: Daniel Bristot de Oliveira <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Joel Fernandes (Google) <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Lai Jiangshan <[email protected]>
Cc: Mathieu Desnoyers <[email protected]>
Cc: "Paul E . McKenney" <[email protected]>
Cc: Steven Rostedt <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
This semantic patch does not take into account the fact that of_node_put
can be safely applied to NULL. Thus it gives only false positives.
Drop it.
Reported-by: Qing Wang <[email protected]>
Signed-off-by: Julia Lawall <[email protected]>
|
|
The BUG_ON script was never safe, in that it was not able to check
whether the condition was side-effecting. At this point, BUG_ON
should be well known, so it has probably outlived its usefuless.
Signed-off-by: Julia Lawall <[email protected]>
Suggested-by: Matthew Wilcox <[email protected]>
|
|
Gilles Muller passed away on November 17, 2021. We would like
to thank him for his continued support for the development of
Coccinelle.
Signed-off-by: Julia Lawall <[email protected]>
|
|
Pull xfs fixes from Darrick Wong:
"These are the last few obvious fixes that I found while stress testing
online fsck for XFS prior to initiating a design review of the whole
giant machinery.
- Fix a minor locking inconsistency in readdir
- Fix incorrect fs feature bit validation for secondary superblocks"
* tag 'xfs-5.17-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: fix online fsck handling of v5 feature bits on secondary supers
xfs: take the ILOCK when readdir inspects directory mapping data
|
|
The fbdev layer is orphaned, but seems to need some care.
So I'd like to step up as new maintainer.
Signed-off-by: Helge Deller <[email protected]>
Acked-by: Geert Uytterhoeven <[email protected]>
|
|
Fix sparse warnings in xstate and remove inline prefix.
Fixes: 980fe2fddcff ("x86/fpu: Extend fpu_xstate_prctl() with guest permissions")
Signed-off-by: Yang Zhong <[email protected]>
Reported-by: kernel test robot <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
This selftest covers two aspects of AMX. The first is triggering #NM
exception and checking the MSR XFD_ERR value. The second case is
loading tile config and tile data into guest registers and trapping to
the host side for a complete save/load of the guest state. TMM0
is also checked against memory data after save/restore.
Signed-off-by: Yang Zhong <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Those changes can avoid dereferencing pointer compile issue
when amx_test.c reference state->xsave.
Move struct kvm_x86_state definition to processor.h.
Signed-off-by: Yang Zhong <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
For AMX support it is recommended to load XCR0 after XFD, so
that KVM does not see XFD=0, XCR=1 for a save state that will
eventually be disabled (which would lead to premature allocation
of the space required for that save state).
It is also required to load XSAVE data after XCR0 and XFD, so
that KVM can trigger allocation of the extra space required to
store AMX state.
Adjust vcpu_load_state to obey these new requirements.
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Yang Zhong <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Always intercepting IA32_XFD causes non-negligible overhead when this
register is updated frequently in the guest.
Disable r/w emulation after intercepting the first WRMSR(IA32_XFD)
with a non-zero value.
Disable WRMSR emulation implies that IA32_XFD becomes out-of-sync
with the software states in fpstate and the per-cpu xfd cache. This
leads to two additional changes accordingly:
- Call fpu_sync_guest_vmexit_xfd_state() after vm-exit to bring
software states back in-sync with the MSR, before handle_exit_irqoff()
is called.
- Always trap #NM once write interception is disabled for IA32_XFD.
The #NM exception is rare if the guest doesn't use dynamic
features. Otherwise, there is at most one exception per guest
task given a dynamic feature.
p.s. We have confirmed that SDM is being revised to say that
when setting IA32_XFD[18] the AMX register state is not guaranteed
to be preserved. This clarification avoids adding mess for a creative
guest which sets IA32_XFD[18]=1 before saving active AMX state to
its own storage.
Signed-off-by: Kevin Tian <[email protected]>
Signed-off-by: Jing Liu <[email protected]>
Signed-off-by: Yang Zhong <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
KVM can disable the write emulation for the XFD MSR when the vCPU's fpstate
is already correctly sized to reduce the overhead.
When write emulation is disabled the XFD MSR state after a VMEXIT is
unknown and therefore not in sync with the software states in fpstate and
the per CPU XFD cache.
Provide fpu_sync_guest_vmexit_xfd_state() which has to be invoked after a
VMEXIT before enabling interrupts when write emulation is disabled for the
XFD MSR.
It could be invoked unconditionally even when write emulation is enabled
for the price of a pointless MSR read.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Jing Liu <[email protected]>
Signed-off-by: Yang Zhong <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
When KVM_CAP_XSAVE2 is supported, userspace is expected to allocate
buffer for KVM_GET_XSAVE2 and KVM_SET_XSAVE using the size returned
by KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2).
Signed-off-by: Wei Wang <[email protected]>
Signed-off-by: Guang Zeng <[email protected]>
Signed-off-by: Jing Liu <[email protected]>
Signed-off-by: Yang Zhong <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
With KVM_CAP_XSAVE, userspace uses a hardcoded 4KB buffer to get/set
xstate data from/to KVM. This doesn't work when dynamic xfeatures
(e.g. AMX) are exposed to the guest as they require a larger buffer
size.
Introduce a new capability (KVM_CAP_XSAVE2). Userspace VMM gets the
required xstate buffer size via KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2).
KVM_SET_XSAVE is extended to work with both legacy and new capabilities
by doing properly-sized memdup_user() based on the guest fpu container.
KVM_GET_XSAVE is kept for backward-compatible reason. Instead,
KVM_GET_XSAVE2 is introduced under KVM_CAP_XSAVE2 as the preferred
interface for getting xstate buffer (4KB or larger size) from KVM
(Link: https://lkml.org/lkml/2021/12/15/510)
Also, update the api doc with the new KVM_GET_XSAVE2 ioctl.
Signed-off-by: Guang Zeng <[email protected]>
Signed-off-by: Wei Wang <[email protected]>
Signed-off-by: Jing Liu <[email protected]>
Signed-off-by: Kevin Tian <[email protected]>
Signed-off-by: Yang Zhong <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Userspace needs to inquire KVM about the buffer size to work
with the new KVM_SET_XSAVE and KVM_GET_XSAVE2. Add the size info
to guest_fpu for KVM to access.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Wei Wang <[email protected]>
Signed-off-by: Jing Liu <[email protected]>
Signed-off-by: Yang Zhong <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Extend CPUID emulation to support XFD, AMX_TILE, AMX_INT8 and
AMX_BF16. Adding those bits into kvm_cpu_caps finally activates all
previous logics in this series.
Hide XFD on 32bit host kernels. Otherwise it leads to a weird situation
where KVM tells userspace to migrate MSR_IA32_XFD and then rejects
attempts to read/write the MSR.
Signed-off-by: Jing Liu <[email protected]>
Signed-off-by: Yang Zhong <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Two XCR0 bits are defined for AMX to support XSAVE mechanism. Bit 17
is for tilecfg and bit 18 is for tiledata.
The value of XCR0[17:18] is always either 00b or 11b. Also, SDM
recommends that only 64-bit operating systems enable Intel AMX by
setting XCR0[18:17]. 32-bit host kernel never sets the tile bits in
vcpu->arch.guest_supported_xcr0.
Signed-off-by: Jing Liu <[email protected]>
Signed-off-by: Kevin Tian <[email protected]>
Signed-off-by: Yang Zhong <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
This saves one unnecessary VM-exit in guest #NM handler, given that the
MSR is already restored with the guest value before the guest is resumed.
Suggested-by: Paolo Bonzini <[email protected]>
Signed-off-by: Jing Liu <[email protected]>
Signed-off-by: Yang Zhong <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Emulate read/write to IA32_XFD_ERR MSR.
Only the saved value in the guest_fpu container is touched in the
emulation handler. Actual MSR update is handled right before entering
the guest (with preemption disabled)
Signed-off-by: Jing Liu <[email protected]>
Signed-off-by: Zeng Guang <[email protected]>
Signed-off-by: Wei Wang <[email protected]>
Signed-off-by: Jing Liu <[email protected]>
Signed-off-by: Yang Zhong <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Guest IA32_XFD_ERR is generally modified in two places:
- Set by CPU when #NM is triggered;
- Cleared by guest in its #NM handler;
Intercept #NM for the first case when a nonzero value is written
to IA32_XFD. Nonzero indicates that the guest is willing to do
dynamic fpstate expansion for certain xfeatures, thus KVM needs to
manage and virtualize guest XFD_ERR properly. The vcpu exception
bitmap is updated in XFD write emulation according to guest_fpu::xfd.
Save the current XFD_ERR value to the guest_fpu container in the #NM
VM-exit handler. This must be done with interrupt disabled, otherwise
the unsaved MSR value may be clobbered by host activity.
The saving operation is conducted conditionally only when guest_fpu:xfd
includes a non-zero value. Doing so also avoids misread on a platform
which doesn't support XFD but #NM is triggered due to L1 interception.
Queueing #NM to the guest is postponed to handle_exception_nmi(). This
goes through the nested_vmx check so a virtual vmexit is queued instead
when #NM is triggered in L2 but L1 wants to intercept it.
Restore the host value (always ZERO outside of the host #NM
handler) before enabling interrupt.
Restore the guest value from the guest_fpu container right before
entering the guest (with interrupt disabled).
Suggested-by: Thomas Gleixner <[email protected]>
Signed-off-by: Jing Liu <[email protected]>
Signed-off-by: Kevin Tian <[email protected]>
Signed-off-by: Yang Zhong <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
When XFD causes an instruction to generate #NM, IA32_XFD_ERR
contains information about which disabled state components are
being accessed. The #NM handler is expected to check this
information and then enable the state components by clearing
IA32_XFD for the faulting task (if having permission).
If the XFD_ERR value generated in guest is consumed/clobbered
by the host before the guest itself doing so, it may lead to
non-XFD-related #NM treated as XFD #NM in host (due to non-zero
value in XFD_ERR), or XFD-related #NM treated as non-XFD #NM in
guest (XFD_ERR cleared by the host #NM handler).
Introduce a new field in fpu_guest to save the guest xfd_err value.
KVM is expected to save guest xfd_err before interrupt is enabled
and restore it right before entering the guest (with interrupt
disabled).
Signed-off-by: Jing Liu <[email protected]>
Signed-off-by: Kevin Tian <[email protected]>
Signed-off-by: Jing Liu <[email protected]>
Signed-off-by: Yang Zhong <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Intel's eXtended Feature Disable (XFD) feature allows the software
to dynamically adjust fpstate buffer size for XSAVE features which
have large state.
Because guest fpstate has been expanded for all possible dynamic
xstates at KVM_SET_CPUID2, emulation of the IA32_XFD MSR is
straightforward. For write just call fpu_update_guest_xfd() to
update the guest fpu container once all the sanity checks are passed.
For read simply return the cached value in the container.
Signed-off-by: Jing Liu <[email protected]>
Signed-off-by: Zeng Guang <[email protected]>
Signed-off-by: Wei Wang <[email protected]>
Signed-off-by: Jing Liu <[email protected]>
Signed-off-by: Yang Zhong <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Guest XFD can be updated either in the emulation path or in the
restore path.
Provide a wrapper to update guest_fpu::fpstate::xfd. If the guest
fpstate is currently in-use, also update the per-cpu xfd cache and
the actual MSR.
Signed-off-by: Kevin Tian <[email protected]>
Signed-off-by: Jing Liu <[email protected]>
Signed-off-by: Yang Zhong <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
KVM can request fpstate expansion in two approaches:
1) When intercepting guest updates to XCR0 and XFD MSR;
2) Before vcpu runs (e.g. at KVM_SET_CPUID2);
The first option doesn't waste memory for legacy guest if it doesn't
support XFD. However doing so introduces more complexity and also
imposes an order requirement in the restoring path, i.e. XCR0/XFD
must be restored before XSTATE.
Given that the agreement is to do the static approach. This is
considered a better tradeoff though it does waste 8K memory for
legacy guest if its CPUID includes dynamically-enabled xfeatures.
Successful fpstate expansion requires userspace VMM to acquire
guest xstate permissions before calling KVM_SET_CPUID2.
Also take the chance to adjust the indent in kvm_set_cpuid().
Signed-off-by: Jing Liu <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Kevin Tian <[email protected]>
Signed-off-by: Yang Zhong <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Provide a wrapper for expanding the guest fpstate buffer according
to requested xfeatures. KVM wants to call this wrapper to manage
any dynamic xstate used by the guest.
Suggested-by: Sean Christopherson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Kevin Tian <[email protected]>
Signed-off-by: Yang Zhong <[email protected]>
Reviewed-by: Paolo Bonzini <[email protected]>
Message-Id: <[email protected]>
[Remove unnecessary 32-bit check. - Paolo]
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Guest support for dynamically enabled FPU features requires a few
modifications to the enablement function which is currently invoked from
the #NM handler:
1) Use guest permissions and sizes for the update
2) Update fpu_guest state accordingly
3) Take into account that the enabling can be triggered either from a
running guest via XSETBV and MSR_IA32_XFD write emulation or from
a guest restore. In the latter case the guests fpstate is not the
current tasks active fpstate.
Split the function and implement the guest mechanics throughout the
callchain.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Jing Liu <[email protected]>
Signed-off-by: Yang Zhong <[email protected]>
Message-Id: <[email protected]>
[Add 32-bit stub for __xfd_enable_feature. - Paolo]
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
vCPU threads are different from native tasks regarding to the initial XFD
value. While all native tasks follow a fixed value (init_fpstate::xfd)
established by the FPU core at boot, vCPU threads need to obey the reset
value (i.e. ZERO) defined by the specification, to meet the expectation of
the guest.
Let the caller supply an argument and adjust the host and guest related
invocations accordingly.
Signed-off-by: Jing Liu <[email protected]>
Signed-off-by: Yang Zhong <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Jing Liu <[email protected]>
Signed-off-by: Yang Zhong <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|