Age | Commit message (Collapse) | Author | Files | Lines |
|
a single CPU
Steven noted that when the user-provided cpumask contains a single CPU,
then the filtering function can use a scalar as input instead of a
full-fledged cpumask.
When the mask contains a single CPU, directly re-use the unsigned field
predicate functions. Transform '&' into '==' beforehand.
Link: https://lkml.kernel.org/r/[email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Daniel Bristot de Oliveira <[email protected]>
Cc: Marcelo Tosatti <[email protected]>
Cc: Leonardo Bras <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Suggested-by: Steven Rostedt <[email protected]>
Signed-off-by: Valentin Schneider <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
single CPU
Steven noted that when the user-provided cpumask contains a single CPU,
then the filtering function can use a scalar as input instead of a
full-fledged cpumask.
Reuse do_filter_scalar_cpumask() when the input mask has a weight of one.
Link: https://lkml.kernel.org/r/[email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Daniel Bristot de Oliveira <[email protected]>
Cc: Marcelo Tosatti <[email protected]>
Cc: Leonardo Bras <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Suggested-by: Steven Rostedt <[email protected]>
Signed-off-by: Valentin Schneider <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
The tracing_cpumask lets us specify which CPUs are traced in a buffer
instance, but doesn't let us do this on a per-event basis (unless one
creates an instance per event).
A previous commit added filtering scalar fields by a user-given cpumask,
make this work with the CPU common field as well.
This enables doing things like
$ trace-cmd record -e 'sched_switch' -f 'CPU & CPUS{12-52}' \
-e 'sched_wakeup' -f 'target_cpu & CPUS{12-52}'
Link: https://lkml.kernel.org/r/[email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Daniel Bristot de Oliveira <[email protected]>
Cc: Marcelo Tosatti <[email protected]>
Cc: Leonardo Bras <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Signed-off-by: Valentin Schneider <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Several events use a scalar field to denote a CPU:
o sched_wakeup.target_cpu
o sched_migrate_task.orig_cpu,dest_cpu
o sched_move_numa.src_cpu,dst_cpu
o ipi_send_cpu.cpu
o ...
Filtering these currently requires using arithmetic comparison functions,
which can be tedious when dealing with interleaved SMT or NUMA CPU ids.
Allow these to be filtered by a user-provided cpumask, which enables e.g.:
$ trace-cmd record -e 'sched_wakeup' -f 'target_cpu & CPUS{2,4,6,8-32}'
Link: https://lkml.kernel.org/r/[email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Daniel Bristot de Oliveira <[email protected]>
Cc: Marcelo Tosatti <[email protected]>
Cc: Leonardo Bras <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Signed-off-by: Valentin Schneider <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
The recently introduced ipi_send_cpumask trace event contains a cpumask
field, but it currently cannot be used in filter expressions.
Make event filtering aware of cpumask fields, and allow these to be
filtered by a user-provided cpumask.
The user-provided cpumask is to be given in cpulist format and wrapped as:
"CPUS{$cpulist}". The use of curly braces instead of parentheses is to
prevent predicate_parse() from parsing the contents of CPUS{...} as a
full-fledged predicate subexpression.
This enables e.g.:
$ trace-cmd record -e 'ipi_send_cpumask' -f 'cpumask & CPUS{2,4,6,8-32}'
Link: https://lkml.kernel.org/r/[email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Daniel Bristot de Oliveira <[email protected]>
Cc: Marcelo Tosatti <[email protected]>
Cc: Leonardo Bras <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Signed-off-by: Valentin Schneider <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Every predicate allocation includes a MAX_FILTER_STR_VAL (256) char array
in the regex field, even if the predicate function does not use the field.
A later commit will introduce a dynamically allocated cpumask to struct
filter_pred, which will require a dedicated freeing function. Bite the
bullet and make filter_pred.regex dynamically allocated.
While at it, reorder the fields of filter_pred to fill in the byte
holes. The struct now fits on a single cacheline.
No change in behaviour intended.
The kfree()'s were patched via Coccinelle:
@@
struct filter_pred *pred;
@@
-kfree(pred);
+free_predicate(pred);
Link: https://lkml.kernel.org/r/[email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Daniel Bristot de Oliveira <[email protected]>
Cc: Marcelo Tosatti <[email protected]>
Cc: Leonardo Bras <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Signed-off-by: Valentin Schneider <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
kprobe_args_char.tc, kprobe_args_string.tc has validation check
for tracefs_create_dir, for eventfs it should be eventfs_create_dir.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ajay Kaher <[email protected]>
Co-developed-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Tested-by: Ching-lin Yu <[email protected]>
Acked-by: Masami Hiramatsu (Google) <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Up until now, /sys/kernel/tracing/events was no different than any other
part of tracefs. The files and directories within the events directory was
created when the tracefs was mounted, and also created for the instances in
/sys/kernel/tracing/instances/<instance>/events. Most of these files and
directories will never be referenced. Since there are thousands of these
files and directories they spend their time wasting precious memory
resources.
Move the "events" directory to the new eventfs. The eventfs will take the
meta data of the events that they represent and store that. When the files
in the events directory are referenced, the dentry and inodes to represent
them are then created. When the files are no longer referenced, they are
freed. This saves the precious memory resources that were wasted on these
seldom referenced dentries and inodes.
Running the following:
~# cat /proc/meminfo /proc/slabinfo > before.out
~# mkdir /sys/kernel/tracing/instances/foo
~# cat /proc/meminfo /proc/slabinfo > after.out
to test the changes produces the following deltas:
Before this change:
Before after deltas for meminfo:
MemFree: -32260
MemAvailable: -21496
KReclaimable: 21528
Slab: 22440
SReclaimable: 21528
SUnreclaim: 912
VmallocUsed: 16
Before after deltas for slabinfo:
<slab>: <objects> [ * <size> = <total>]
tracefs_inode_cache: 14472 [* 1184 = 17134848]
buffer_head: 24 [* 168 = 4032]
hmem_inode_cache: 28 [* 1480 = 41440]
dentry: 14450 [* 312 = 4508400]
lsm_inode_cache: 14453 [* 32 = 462496]
vma_lock: 11 [* 152 = 1672]
vm_area_struct: 2 [* 184 = 368]
trace_event_file: 1748 [* 88 = 153824]
kmalloc-256: 1072 [* 256 = 274432]
kmalloc-64: 2842 [* 64 = 181888]
Total slab additions in size: 22,763,400 bytes
With this change:
Before after deltas for meminfo:
MemFree: -12600
MemAvailable: -12580
Cached: 24
Active: 12
Inactive: 68
Inactive(anon): 48
Active(file): 12
Inactive(file): 20
Dirty: -4
AnonPages: 68
KReclaimable: 12
Slab: 1856
SReclaimable: 12
SUnreclaim: 1844
KernelStack: 16
PageTables: 36
VmallocUsed: 16
Before after deltas for slabinfo:
<slab>: <objects> [ * <size> = <total>]
tracefs_inode_cache: 108 [* 1184 = 127872]
buffer_head: 24 [* 168 = 4032]
hmem_inode_cache: 18 [* 1480 = 26640]
dentry: 127 [* 312 = 39624]
lsm_inode_cache: 152 [* 32 = 4864]
vma_lock: 67 [* 152 = 10184]
vm_area_struct: -12 [* 184 = -2208]
trace_event_file: 1764 [* 96 = 169344]
kmalloc-96: 14322 [* 96 = 1374912]
kmalloc-64: 2814 [* 64 = 180096]
kmalloc-32: 1103 [* 32 = 35296]
kmalloc-16: 2308 [* 16 = 36928]
kmalloc-8: 12800 [* 8 = 102400]
Total slab additions in size: 2,109,984 bytes
Which is a savings of 20,653,416 bytes (20 MB) per tracing instance.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ajay Kaher <[email protected]>
Co-developed-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Tested-by: Ching-lin Yu <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
When events are removed from tracefs, the eventfs must be aware of this.
The eventfs_remove() removes the meta data from eventfs so that it will no
longer create the files associated with that event.
When an instance is removed from tracefs, eventfs_remove_events_dir() will
remove and clean up the entire "events" directory.
The helper function eventfs_remove_rec() is used to clean up and free the
associated data from eventfs for both of the added functions. SRCU is used
to protect the lists of meta data stored in the eventfs. The eventfs_mutex
is used to protect the content of the items in the list.
As lookups may be happening as deletions of events are made, the freeing
of dentry/inodes and relative information is done after the SRCU grace
period has passed.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ajay Kaher <[email protected]>
Co-developed-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Tested-by: Ching-lin Yu <[email protected]>
Reported-by: kernel test robot <[email protected]>
Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Add create_file() and create_dir() functions to create the files and
directories respectively when they are accessed. The functions will be
called from the lookup operation of the inode_operations or from the open
function of file_operations.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ajay Kaher <[email protected]>
Co-developed-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Tested-by: Ching-lin Yu <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Add the inode_operations, file_operations, and helper functions to eventfs:
dcache_dir_open_wrapper()
eventfs_root_lookup()
eventfs_release()
eventfs_set_ef_status_free()
eventfs_post_create_dir()
The inode_operations and file_operations functions will be called from the
VFS layer.
create_file() and create_dir() are added as stub functions and will be
filled in later.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ajay Kaher <[email protected]>
Co-developed-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Tested-by: Ching-lin Yu <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Add the following functions to add files to evenfs:
eventfs_add_events_file() to add the data needed to create a specific file
located at the top level events directory. The dentry/inode will be
created when the events directory is scanned.
eventfs_add_file() to add the data needed for files within the directories
below the top level events directory. The dentry/inode of the file will be
created when the directory that the file is in is scanned.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ajay Kaher <[email protected]>
Co-developed-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Tested-by: Ching-lin Yu <[email protected]>
Reported-by: kernel test robot <[email protected]>
Closes: https://lore.kernel.org/oe-lkp/[email protected]
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Add eventfs_file structure which will hold the properties of the eventfs
files and directories.
Add following functions to create the directories in eventfs:
eventfs_create_events_dir() will create the top level "events" directory
within the tracefs file system.
eventfs_add_subsystem_dir() creates an eventfs_file descriptor with the
given name of the subsystem.
eventfs_add_dir() creates an eventfs_file descriptor with the given name of
the directory and attached to a eventfs_file of a subsystem.
Add tracefs_inode structure to hold the inodes, flags and pointers to
private data used by eventfs.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ajay Kaher <[email protected]>
Co-developed-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Tested-by: Ching-lin Yu <[email protected]>
Reported-by: kernel test robot <[email protected]>
Closes: https://lore.kernel.org/oe-lkp/[email protected]
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Export a few tracefs functions that will be needed by the eventfs dynamic
file system. Rename them to start with "tracefs_" to keep with the name
space.
start_creating -> tracefs_start_creating
failed_creating -> tracefs_failed_creating
end_creating -> tracefs_end_creating
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ajay Kaher <[email protected]>
Co-developed-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Tested-by: Ching-lin Yu <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Create a kmem cache of tracefs_inodes. To be more efficient, as there are
lots of tracefs inodes, create its own cache. This also allows to see how
many tracefs inodes have been created.
Add helper functions:
tracefs_alloc_inode()
tracefs_free_inode()
get_tracefs()
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ajay Kaher <[email protected]>
Co-developed-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Tested-by: Ching-lin Yu <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
The creation of the trace event directory requires that a TRACE_SYSTEM is
defined that the trace event directory is added within the system it was
defined in.
The code handled the case where a TRACE_SYSTEM was not added, and would
then add the event at the events directory. But nothing should be doing
this. This code also prevents the implementation of creating dynamic
dentrys for the eventfs system.
As this path has never been hit on correct code, remove it. If it does get
hit, issues a WARN_ON_ONCE() and return ENODEV.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Steven Rostedt (Google) <[email protected]>
Signed-off-by: Ajay Kaher <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Currently we can resize trace ringbuffer by writing a value into file
'buffer_size_kb', then by reading the file, we get the value that is
usually what we wrote. However, this value may be not actual size of
trace ring buffer because of the round up when doing resize in kernel,
and the actual size would be more useful.
Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
Cc: <[email protected]>
Signed-off-by: Zheng Yejian <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
As the trace iterator is created and used by various interfaces, the clean
up of it needs to be consistent. Create a free_trace_iter_content() helper
function that frees the content of the iterator and use that to clean it
up in all places that it is used.
Link: https://lkml.kernel.org/r/[email protected]
Cc: Mark Rutland <[email protected]>
Cc: Andrew Morton <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
Reviewed-by: Masami Hiramatsu (Google) <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
The iterator allocated a descriptor to copy the current_trace. This was done
with the assumption that the function pointers might change. But this was a
false assuption, as it does not change. There's no reason to make a copy of the
current_trace and just use the pointer it points to. This removes needing to
manage freeing the descriptor. Worse yet, there's locations that the iterator
is used but does make a copy and just uses the pointer. This could cause the
actual pointer to the trace descriptor to be freed and not the allocated copy.
This is more of a clean up than a fix.
Link: https://lkml.kernel.org/r/[email protected]
Cc: Mark Rutland <[email protected]>
Cc: Andrew Morton <[email protected]>
Fixes: d7350c3f45694 ("tracing/core: make the read callbacks reentrants")
Reviewed-by: Masami Hiramatsu (Google) <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Use try_cmpxchg instead of cmpxchg (*ptr, old, new) == old in
ring_buffer.c. x86 CMPXCHG instruction returns success in ZF flag,
so this change saves a compare after cmpxchg (and related move
instruction in front of cmpxchg).
No functional change intended.
Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
Cc: Steven Rostedt <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Signed-off-by: Uros Bizjak <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
For backward compatibility, older tooling expects to see the kernel_stack
event with a "caller" field that is a fixed size array of 8 addresses. The
code now supports more than 8 with an added "size" field that states the
real number of entries. But the "caller" field still just looks like a
fixed size to user space.
Since the tracing macros that create the user space format files also
creates the structures that those files represent, the kernel_stack event
structure had its "caller" field a fixed size of 8, but in reality, when
it is allocated on the ring buffer, it can hold more if the stack trace is
bigger that 8 functions. The copying of these entries was simply done with
a memcpy():
size = nr_entries * sizeof(unsigned long);
memcpy(entry->caller, fstack->calls, size);
The FORTIFY_SOURCE logic noticed at runtime that when the nr_entries was
larger than 8, that the memcpy() was writing more than what the structure
stated it can hold and it complained about it. This is because the
FORTIFY_SOURCE code is unaware that the amount allocated is actually
enough to hold the size. It does not expect that a fixed size field will
hold more than the fixed size.
This was originally solved by hiding the caller assignment with some
pointer arithmetic.
ptr = ring_buffer_data();
entry = ptr;
ptr += offsetof(typeof(*entry), caller);
memcpy(ptr, fstack->calls, size);
But it is considered bad form to hide from kernel hardening. Instead, make
it work nicely with FORTIFY_SOURCE by adding a new __stack_array() macro
that is specific for this one special use case. The macro will take 4
arguments: type, item, len, field (whereas the __array() macro takes just
the first three). This macro will act just like the __array() macro when
creating the code to deal with the format file that is exposed to user
space. But for the kernel, it will turn the caller field into:
type item[] __counted_by(field);
or for this instance:
unsigned long caller[] __counted_by(size);
Now the kernel code can expose the assignment of the caller to the
FORTIFY_SOURCE and everyone is happy!
Link: https://lore.kernel.org/linux-trace-kernel/[email protected]/
Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Sven Schnelle <[email protected]>
Suggested-by: Kees Cook <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A bunch of fixes for the Qualcomm QSPI driver, fixing multiple issues
with the newly added DMA mode - it had a number of issues exposed when
tested in a wider range of use cases, both race condition style issues
and issues with different inputs to those that had been used in test"
* tag 'spi-fix-v6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: spi-qcom-qspi: Add mem_ops to avoid PIO for badly sized reads
spi: spi-qcom-qspi: Fallback to PIO for xfers that aren't multiples of 4 bytes
spi: spi-qcom-qspi: Add DMA_CHAIN_DONE to ALL_IRQS
spi: spi-qcom-qspi: Call dma_wmb() after setting up descriptors
spi: spi-qcom-qspi: Use GFP_ATOMIC flag while allocating for descriptor
spi: spi-qcom-qspi: Ignore disabled interrupts' status in isr
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fixes from Mark Brown:
"A couple of small fixes for the the mt6358 driver, fixing error
reporting and a bootstrapping issue"
* tag 'regulator-fix-v6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: mt6358: Fix incorrect VCN33 sync error message
regulator: mt6358: Sync VCN33_* enable status after checking ID
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are a set of USB driver fixes for 6.5-rc4. Include in here are:
- new USB serial device ids
- dwc3 driver fixes for reported issues
- typec driver fixes for reported problems
- gadget driver fixes
- reverts of some problematic USB changes that went into -rc1
All of these have been in linux-next with no reported problems"
* tag 'usb-6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (24 commits)
usb: misc: ehset: fix wrong if condition
usb: dwc3: pci: skip BYT GPIO lookup table for hardwired phy
usb: cdns3: fix incorrect calculation of ep_buf_size when more than one config
usb: gadget: call usb_gadget_check_config() to verify UDC capability
usb: typec: Use sysfs_emit_at when concatenating the string
usb: typec: Iterate pds array when showing the pd list
usb: typec: Set port->pd before adding device for typec_port
usb: typec: qcom: fix return value check in qcom_pmic_typec_probe()
Revert "usb: gadget: tegra-xudc: Fix error check in tegra_xudc_powerdomain_init()"
Revert "usb: xhci: tegra: Fix error check"
USB: gadget: Fix the memory leak in raw_gadget driver
usb: gadget: core: remove unbalanced mutex_unlock in usb_gadget_activate
Revert "usb: dwc3: core: Enable AutoRetry feature in the controller"
Revert "xhci: add quirk for host controllers that don't update endpoint DCS"
USB: quirks: add quirk for Focusrite Scarlett
usb: xhci-mtk: set the dma max_seg_size
MAINTAINERS: drop invalid usb/cdns3 Reviewer e-mail
usb: dwc3: don't reset device side if dwc3 was configured as host-only
usb: typec: ucsi: move typec_set_mode(TYPEC_STATE_SAFE) to ucsi_unregister_partner()
usb: ohci-at91: Fix the unhandle interrupt when resume
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial fixes from Greg KH:
"Here are some small TTY and serial driver fixes for 6.5-rc4 for some
reported problems. Included in here is:
- TIOCSTI fix for braille readers
- documentation fix for minor numbers
- MAINTAINERS update for new serial files in -rc1
- minor serial driver fixes for reported problems
All of these have been in linux-next with no reported problems"
* tag 'tty-6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
serial: 8250_dw: Preserve original value of DLF register
tty: serial: sh-sci: Fix sleeping in atomic context
serial: sifive: Fix sifive_serial_console_setup() section
Documentation: devices.txt: reconcile serial/ucc_uart minor numers
MAINTAINERS: Update TTY layer for lists and recently added files
tty: n_gsm: fix UAF in gsm_cleanup_mux
TIOCSTI: always enable for CAP_SYS_ADMIN
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging driver fixes from Greg KH:
"Here are three small staging driver fixes for 6.5-rc4 that resolve
some reported problems. These fixes are:
- fix for an old bug in the r8712 driver
- fbtft driver fix for a spi device
- potential overflow fix in the ks7010 driver
All of these have been in linux-next with no reported problems"
* tag 'staging-6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: ks7010: potential buffer overflow in ks_wlan_set_encode_ext()
staging: fbtft: ili9341: use macro FBTFT_REGISTER_SPI_DRIVER
staging: r8712: Fix memory leak in _r8712_init_xmit_priv()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char driver and Documentation fixes from Greg KH:
"Here is a char driver fix and some documentation updates for 6.5-rc4
that contain the following changes:
- sram/genalloc bugfix for reported problem
- security-bugs.rst update based on recent discussions
- embargoed-hardware-issues minor cleanups and then partial revert
for the project/company lists
All of these have been in linux-next for a while with no reported
problems, and the documentation updates have all been reviewed by the
relevant developers"
* tag 'char-misc-6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
misc/genalloc: Name subpools by of_node_full_name()
Documentation: embargoed-hardware-issues.rst: add AMD to the list
Documentation: embargoed-hardware-issues.rst: clean out empty and unused entries
Documentation: security-bugs.rst: clarify CVE handling
Documentation: security-bugs.rst: update preferences when dealing with the linux-distros group
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull probe fixes from Masami Hiramatsu:
- probe-events: add NULL check for some BTF API calls which can return
error code and NULL.
- ftrace selftests: check fprobe and kprobe event correctly. This fixes
a miss condition of the test command.
- kprobes: do not allow probing functions that start with "__cfi_" or
"__pfx_" since those are auto generated for kernel CFI and not
executed.
* tag 'probes-fixes-v6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
kprobes: Prohibit probing on CFI preamble symbol
selftests/ftrace: Fix to check fprobe event eneblement
tracing/probes: Fix to add NULL check for BTF APIs
|
|
Pull kvm fixes from Paolo Bonzini:
"x86:
- Do not register IRQ bypass consumer if posted interrupts not
supported
- Fix missed device interrupt due to non-atomic update of IRR
- Use GFP_KERNEL_ACCOUNT for pid_table in ipiv
- Make VMREAD error path play nice with noinstr
- x86: Acquire SRCU read lock when handling fastpath MSR writes
- Support linking rseq tests statically against glibc 2.35+
- Fix reference count for stats file descriptors
- Detect userspace setting invalid CR0
Non-KVM:
- Remove coccinelle script that has caused multiple confusion
("debugfs, coccinelle: check for obsolete DEFINE_SIMPLE_ATTRIBUTE()
usage", acked by Greg)"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (21 commits)
KVM: selftests: Expand x86's sregs test to cover illegal CR0 values
KVM: VMX: Don't fudge CR0 and CR4 for restricted L2 guest
KVM: x86: Disallow KVM_SET_SREGS{2} if incoming CR0 is invalid
Revert "debugfs, coccinelle: check for obsolete DEFINE_SIMPLE_ATTRIBUTE() usage"
KVM: selftests: Verify stats fd is usable after VM fd has been closed
KVM: selftests: Verify stats fd can be dup()'d and read
KVM: selftests: Verify userspace can create "redundant" binary stats files
KVM: selftests: Explicitly free vcpus array in binary stats test
KVM: selftests: Clean up stats fd in common stats_test() helper
KVM: selftests: Use pread() to read binary stats header
KVM: Grab a reference to KVM for VM and vCPU stats file descriptors
selftests/rseq: Play nice with binaries statically linked against glibc 2.35+
Revert "KVM: SVM: Skip WRMSR fastpath on VM-Exit if next RIP isn't valid"
KVM: x86: Acquire SRCU read lock when handling fastpath MSR writes
KVM: VMX: Use vmread_error() to report VM-Fail in "goto" path
KVM: VMX: Make VMREAD error path play nice with noinstr
KVM: x86/irq: Conditionally register IRQ bypass consumer again
KVM: X86: Use GFP_KERNEL_ACCOUNT for pid_table in ipiv
KVM: x86: check the kvm_cpu_get_interrupt result before using it
KVM: x86: VMX: set irr_pending in kvm_apic_update_irr
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fix from Borislav Petkov:
- Fix a rtmutex race condition resulting from sharing of the sort key
between the lock waiters and the PI chain tree (->pi_waiters) of a
task by giving each tree their own sort key
* tag 'locking_urgent_for_v6.5_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/rtmutex: Fix task->pi_waiters integrity
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
- AMD's automatic IBRS doesn't enable cross-thread branch target
injection protection (STIBP) for user processes. Enable STIBP on such
systems.
- Do not delete (but put the ref instead) of AMD MCE error thresholding
sysfs kobjects when destroying them in order not to delete the kernfs
pointer prematurely
- Restore annotation in ret_from_fork_asm() in order to fix kthread
stack unwinding from being marked as unreliable and thus breaking
livepatching
* tag 'x86_urgent_for_v6.5_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/cpu: Enable STIBP on AMD if Automatic IBRS is enabled
x86/MCE/AMD: Decrement threshold_bank refcount when removing threshold blocks
x86: Fix kthread unwind
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Borislav Petkov:
- Work around an erratum on GIC700, where a race between a CPU handling
a wake-up interrupt, a change of affinity, and another CPU going to
sleep can result in a lack of wake-up event on the next interrupt
- Fix the locking required on a VPE for GICv4
- Enable Rockchip 3588001 erratum workaround for RK3588S
- Fix the irq-bcm6345-l1 assumtions of the boot CPU always be the first
CPU in the system
* tag 'irq_urgent_for_v6.5_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/gic-v3: Workaround for GIC-700 erratum 2941627
irqchip/gic-v3: Enable Rockchip 3588001 erratum workaround for RK3588S
irqchip/gic-v4.1: Properly lock VPEs when doing a directLPI invalidation
irq-bcm6345-l1: Do not assume a fixed block to cpu mapping
|
|
Pull smb client fixes from Steve French:
"Four small SMB3 client fixes:
- two reconnect fixes (to address the case where non-default
iocharset gets incorrectly overridden at reconnect with the
default charset)
- fix for NTLMSSP_AUTH request setting a flag incorrectly)
- Add missing check for invalid tlink (tree connection) in ioctl"
* tag '6.5-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: add missing return value check for cifs_sb_tlink
smb3: do not set NTLMSSP_VERSION flag for negotiate not auth request
cifs: fix charset issue in reconnection
fs/nls: make load_nls() take a const parameter
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
- Fix to /sys/kernel/tracing/per_cpu/cpu*/stats read and entries.
If a resize shrinks the buffer it clears the read count to notify
readers that they need to reset. But the read count is also used for
accounting and this causes the numbers to be off. Instead, create a
separate variable to use to notify readers to reset.
- Fix the ref counts of the "soft disable" mode. The wrong value was
used for testing if soft disable mode should be enabled or disable,
but instead, just change the logic to do the enable and disable in
place when the SOFT_MODE is set or cleared.
- Several kernel-doc fixes
- Removal of unused external declarations
* tag 'trace-v6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: Fix warning in trace_buffered_event_disable()
ftrace: Remove unused extern declarations
tracing: Fix kernel-doc warnings in trace_seq.c
tracing: Fix kernel-doc warnings in trace_events_trigger.c
tracing/synthetic: Fix kernel-doc warnings in trace_events_synth.c
ring-buffer: Fix kernel-doc warnings in ring_buffer.c
ring-buffer: Fix wrong stat of cpu_buffer->read
|
|
Commit a2225d931f75 ("autofs: remove left-over autofs4 stubs")
promised the removal of the fs/autofs/Kconfig fragment for AUTOFS4_FS
within a couple of releases, but five years later this still has not
happened yet, and AUTOFS4_FS is still enabled in 63 defconfigs.
Get rid of it mechanically:
git grep -l CONFIG_AUTOFS4_FS -- '*defconfig' |
xargs sed -i 's/AUTOFS4_FS/AUTOFS_FS/'
Also just remove the AUTOFS4_FS config option stub. Anybody who hasn't
regenerated their config file in the last five years will need to just
get the new name right when they do.
Signed-off-by: Sven Joachim <[email protected]>
Acked-by: Ian Kent <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch fixes from Huacai Chen:
"Some bug fixes for build system, builtin cmdline handling, bpf and
{copy, clear}_user, together with a trivial cleanup"
* tag 'loongarch-fixes-6.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
LoongArch: Cleanup __builtin_constant_p() checking for cpu_has_*
LoongArch: BPF: Fix check condition to call lu32id in move_imm()
LoongArch: BPF: Enable bpf_probe_read{, str}() on LoongArch
LoongArch: Fix return value underflow in exception path
LoongArch: Fix CMDLINE_EXTEND and CMDLINE_BOOTLOADER handling
LoongArch: Fix module relocation error with binutils 2.41
LoongArch: Only fiddle with CHECKFLAGS if `need-compiler'
|
|
Add coverage to x86's set_sregs_test to verify KVM rejects vendor-agnostic
illegal CR0 values, i.e. CR0 values whose legality doesn't depend on the
current VMX mode. KVM historically has neglected to reject bad CR0s from
userspace, i.e. would happily accept a completely bogus CR0 via
KVM_SET_SREGS{2}.
Punt VMX specific subtests to future work, as they would require quite a
bit more effort, and KVM gets coverage for CR0 checks in general through
other means, e.g. KVM-Unit-Tests.
Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Stuff CR0 and/or CR4 to be compliant with a restricted guest if and only
if KVM itself is not configured to utilize unrestricted guests, i.e. don't
stuff CR0/CR4 for a restricted L2 that is running as the guest of an
unrestricted L1. Any attempt to VM-Enter a restricted guest with invalid
CR0/CR4 values should fail, i.e. in a nested scenario, KVM (as L0) should
never observe a restricted L2 with incompatible CR0/CR4, since nested
VM-Enter from L1 should have failed.
And if KVM does observe an active, restricted L2 with incompatible state,
e.g. due to a KVM bug, fudging CR0/CR4 instead of letting VM-Enter fail
does more harm than good, as KVM will often neglect to undo the side
effects, e.g. won't clear rmode.vm86_active on nested VM-Exit, and thus
the damage can easily spill over to L1. On the other hand, letting
VM-Enter fail due to bad guest state is more likely to contain the damage
to L2 as KVM relies on hardware to perform most guest state consistency
checks, i.e. KVM needs to be able to reflect a failed nested VM-Enter into
L1 irrespective of (un)restricted guest behavior.
Cc: Jim Mattson <[email protected]>
Cc: [email protected]
Fixes: bddd82d19e2e ("KVM: nVMX: KVM needs to unset "unrestricted guest" VM-execution control in vmcs02 if vmcs12 doesn't set it")
Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Reject KVM_SET_SREGS{2} with -EINVAL if the incoming CR0 is invalid,
e.g. due to setting bits 63:32, illegal combinations, or to a value that
isn't allowed in VMX (non-)root mode. The VMX checks in particular are
"fun" as failure to disallow Real Mode for an L2 that is configured with
unrestricted guest disabled, when KVM itself has unrestricted guest
enabled, will result in KVM forcing VM86 mode to virtual Real Mode for
L2, but then fail to unwind the related metadata when synthesizing a
nested VM-Exit back to L1 (which has unrestricted guest enabled).
Opportunistically fix a benign typo in the prototype for is_valid_cr4().
Cc: [email protected]
Reported-by: [email protected]
Closes: https://lore.kernel.org/all/[email protected]
Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Remove coccinelle's recommendation to use DEFINE_DEBUGFS_ATTRIBUTE()
instead of DEFINE_SIMPLE_ATTRIBUTE(). Regardless of whether or not the
"significant overhead" incurred by debugfs_create_file() is actually
meaningful, warnings from the script have led to a rash of low-quality
patches that have sowed confusion and consumed maintainer time for little
to no benefit. There have been no less than four attempts to "fix" KVM,
and a quick search on lore shows that KVM is not alone.
This reverts commit 5103068eaca290f890a30aae70085fac44cecaf6.
Link: https://lore.kernel.org/all/[email protected]
Link: https://lore.kernel.org/all/[email protected]
Link: https://lkml.kernel.org/r/20230706072954.4881-1-duminjie%40vivo.com
Link: https://lore.kernel.org/all/[email protected]
Link: https://lore.kernel.org/all/Y2ENJJ1YiSg5oHiy@orome
Link: https://lore.kernel.org/all/[email protected]
Suggested-by: Paolo Bonzini <[email protected]>
Acked-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Verify that VM and vCPU binary stats files are usable even after userspace
has put its last direct reference to the VM. This is a regression test
for a UAF bug where KVM didn't gift the stats files a reference to the VM.
Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Expand the binary stats test to verify that a stats fd can be dup()'d
and read, to (very) roughly simulate userspace passing around the file.
Adding the dup() test is primarily an intermediate step towards verifying
that userspace can read VM/vCPU stats before _and_ after userspace closes
its copy of the VM fd; the dup() test itself is only mildly interesting.
Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Verify that KVM doesn't artificially limit KVM_GET_STATS_FD to a single
file per VM/vCPU. There's no known use case for getting multiple stats
fds, but it should work, and more importantly creating multiple files will
make it easier to test that KVM correct manages VM refcounts for stats
files.
Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Explicitly free the all-encompassing vcpus array in the binary stats test
so that the test is consistent with respect to freeing all dynamically
allocated resources (versus letting them be freed on exit).
Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Move the stats fd cleanup code into stats_test() and drop the
superfluous vm_stats_test() and vcpu_stats_test() helpers in order to
decouple creation of the stats file from consuming/testing the file
(deduping code is a bonus). This will make it easier to test various
edge cases related to stats, e.g. that userspace can dup() a stats fd,
that userspace can have multiple stats files for a singleVM/vCPU, etc.
Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Use pread() with an explicit offset when reading the header and the header
name for a binary stats fd so that the common helper and the binary stats
test don't subtly rely on the file effectively being untouched, e.g. to
allow multiple reads of the header, name, etc.
Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Grab a reference to KVM prior to installing VM and vCPU stats file
descriptors to ensure the underlying VM and vCPU objects are not freed
until the last reference to any and all stats fds are dropped.
Note, the stats paths manually invoke fd_install() and so don't need to
grab a reference before creating the file.
Fixes: ce55c049459c ("KVM: stats: Support binary stats retrieval for a VCPU")
Fixes: fcfe1baeddbf ("KVM: stats: Support binary stats retrieval for a VM")
Reported-by: Zheng Zhang <[email protected]>
Closes: https://lore.kernel.org/all/CAC_GQSr3xzZaeZt85k_RCBd5kfiOve8qXo7a81Cq53LuVQ5r=Q@mail.gmail.com
Cc: [email protected]
Cc: Kees Cook <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
To allow running rseq and KVM's rseq selftests as statically linked
binaries, initialize the various "trampoline" pointers to point directly
at the expect glibc symbols, and skip the dlysm() lookups if the rseq
size is non-zero, i.e. the binary is statically linked *and* the libc
registered its own rseq.
Define weak versions of the symbols so as not to break linking against
libc versions that don't support rseq in any capacity.
The KVM selftests in particular are often statically linked so that they
can be run on targets with very limited runtime environments, i.e. test
machines.
Fixes: 233e667e1ae3 ("selftests/rseq: Uplift rseq selftests for compatibility with glibc-2.35")
Cc: Aaron Lewis <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Now that handle_fastpath_set_msr_irqoff() acquires kvm->srcu, i.e. allows
dereferencing memslots during WRMSR emulation, drop the requirement that
"next RIP" is valid. In hindsight, acquiring kvm->srcu would have been a
better fix than avoiding the pastpath, but at the time it was thought that
accessing SRCU-protected data in the fastpath was a one-off edge case.
This reverts commit 5c30e8101e8d5d020b1d7119117889756a6ed713.
Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|