Age | Commit message (Collapse) | Author | Files | Lines |
|
On short runs it is possible to get no samples on a cpu, like this:
# rtla timerlat hist -u -T50
Index IRQ-001 Thr-001 Usr-001 IRQ-002 Thr-002 Usr-002
2 1 0 0 0 0 0
33 0 1 0 0 0 0
36 0 0 1 0 0 0
49 0 0 0 1 0 0
52 0 0 0 0 1 0
over: 0 0 0 0 0 0
count: 1 1 1 1 1 0
min: 2 33 36 49 52 18446744073709551615
avg: 2 33 36 49 52 -
max: 2 33 36 49 52 0
rtla timerlat hit stop tracing
IRQ handler delay: (exit from idle) 48.21 us (91.09 %)
IRQ latency: 49.11 us
Timerlat IRQ duration: 2.17 us (4.09 %)
Blocking thread: 1.01 us (1.90 %)
swapper/2:0 1.01 us
------------------------------------------------------------------------
Thread latency: 52.93 us (100%)
Max timerlat IRQ latency from idle: 49.11 us in cpu 2
Note, the value 18446744073709551615 is the same as ~0.
Fix this by reporting no results for the min, avg and max if the count
is 0.
Link: https://lkml.kernel.org/r/[email protected]
Cc: [email protected]
Fixes: 1eeb6328e8b3 ("rtla/timerlat: Add timerlat hist mode")
Suggested-by: Daniel Bristot de Oliveria <[email protected]>
Signed-off-by: John Kacur <[email protected]>
Signed-off-by: Daniel Bristot de Oliveira <[email protected]>
|
|
Add the option allow the users to set a different buffer size for the
trace. For example, in large systems, the user might be interested on
reducing the trace buffer to avoid large tracing files.
The buffer size is specified in kB, and it is only affecting
the tracing instance.
The function trace_set_buffer_size() appears on libtracefs v1.6,
so increase the minimum required version on Makefile.config.
Link: https://lkml.kernel.org/r/e7c9ca5b3865f28e131a49ec3b984fadf2d056c6.1715860611.git.bristot@kernel.org
Cc: Jonathan Corbet <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: John Kacur <[email protected]>
Signed-off-by: Daniel Bristot de Oliveira <[email protected]>
|
|
With some compilers/configs fadump_setup_param_area() isn't inlined into
its caller (which is __init), leading to a section mismatch warning:
WARNING: modpost: vmlinux: section mismatch in reference:
fadump_setup_param_area+0x200 (section: .text.fadump_setup_param_area)
-> memblock_phys_alloc_range (section: .init.text)
Fix it by adding an __init annotation.
Fixes: 683eab94da75 ("powerpc/fadump: setup additional parameters for dump capture kernel")
Reported-by: Stephen Rothwell <[email protected]>
Closes: https://lore.kernel.org/all/[email protected]/
Reported-by: kernel test robot <[email protected]>
Closes: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/[email protected]
|
|
|
|
Add implicit fallthrough checking to the decompressor code and fix this
warning:
arch/x86/boot/printf.c: In function ‘vsprintf’:
arch/x86/boot/printf.c:248:10: warning: this statement may fall through [-Wimplicit-fallthrough=]
248 | flags |= SMALL;
| ^
arch/x86/boot/printf.c:249:3: note: here
249 | case 'X':
| ^~~~
This is a patch from three years ago which I found in my trees, thus the
SUSE authorship still.
Signed-off-by: Borislav Petkov <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Raw NAND:
Two small fixes, one in the Hynix vendor code for properly returning an
error which might have been ignored and another in the Davinci driver to
properly synchronize the controller with the gpio domain.
Signed-off-by: Miquel Raynal <[email protected]>
|
|
SPI NOR now uses div_u64() instead of div64_u64() in places where the
divisor is 32 bits. Many 32 bit architectures can optimize this variant
better than a full 64 bit divide.
Signed-off-by: Miquel Raynal <[email protected]>
|
|
There is no need to add the name to ns_list again if the netns already
recoreded.
Fixes: 25ae948b4478 ("selftests/net: add lib.sh")
Signed-off-by: Hangbin Liu <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The qrtr protocol core logic and the qrtr nameservice are combined into
a single module. Neither the core logic or nameservice provide much
functionality by themselves; combining the two into a single module also
prevents any possible issues that may stem from client modules loading
inbetween qrtr and the ns.
Creating a socket takes two references to the module that owns the
socket protocol. Since the ns needs to create the control socket, this
creates a scenario where there are always two references to the qrtr
module. This prevents the execution of 'rmmod' for qrtr.
To resolve this, forcefully put the module refcount for the socket
opened by the nameservice.
Fixes: a365023a76f2 ("net: qrtr: combine nameservice into main module")
Reported-by: Jeffrey Hugo <[email protected]>
Tested-by: Jeffrey Hugo <[email protected]>
Signed-off-by: Chris Lew <[email protected]>
Reviewed-by: Manivannan Sadhasivam <[email protected]>
Reviewed-by: Jeffrey Hugo <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
A debugfs directory entry is create early during probe(). This entry is
not removed on error path leading to some "already present" issues in
case of EPROBE_DEFER.
Create this entry later in the probe() code to avoid the need to change
many 'return' in 'goto' and add the removal in the already present error
path.
Fixes: 942814840127 ("net: lan966x: Add VCAP debugFS support")
Cc: <[email protected]>
Signed-off-by: Herve Codina <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Reviewed-by: Horatiu Vultur <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The cpudata memory from kzalloc() in amd_pstate_epp_cpu_init() is
not freed in the analogous exit function, so fix that.
Signed-off-by: Peng Ma <[email protected]>
Acked-by: Mario Limonciello <[email protected]>
Reviewed-by: Perry Yuan <[email protected]>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <[email protected]>
|
|
Since commit c98d2ecae08f ("s390/mm: Uncouple physical vs virtual address
spaces") the kernel image and module area are within the same 4GB area.
This eliminates the need of a custom insn slot allocator for kprobes within
the kernel image, since standard module_alloc() allocated pages are
sufficient for PC relative instructions with a signed 32 bit offset.
Signed-off-by: Heiko Carstens <[email protected]>
|
|
Properties with GPIOs should define number of actual GPIOs, so add
missing maxItems to ep-gpios. Otherwise multiple GPIOs could be
provided which is not a true hardware description.
Fixes: aa222f9311e1 ("dt-bindings: PCI: Convert Rockchip RK3399 PCIe to DT schema")
Link: https://lore.kernel.org/linux-pci/[email protected]
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Signed-off-by: Krzysztof Wilczyński <[email protected]>
Acked-by: Conor Dooley <[email protected]>
|
|
It is nowhere used in the decompressor, therefore remove it.
Fixes: 17e89e1340a3 ("s390/facilities: move stfl information from lowcore to global data")
Reviewed-by: Heiko Carstens <[email protected]>
Signed-off-by: Sven Schnelle <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
|
|
With the mentioned commit (see the fixes tag) on every AP bus scan an
uevent "AP bus change bindings complete" is emitted. Furthermore if an AP
device switched from one driver to another, for example by manipulating the
apmask, there was never a "bindings complete" uevent generated.
The "bindings complete" event should be sent once when all AP devices have
been bound to device drivers and again if unbind/bind actions take place
and finally all AP devices are bound again. Therefore implement this.
Fixes: 778412ab915d ("s390/ap: rearm APQNs bindings complete completion")
Reported-by: Marc Hartmayer <[email protected]>
Signed-off-by: Harald Freudenberger <[email protected]>
Reviewed-by: Holger Dengler <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
|
|
A system crash like this
Failing address: 200000cb7df6f000 TEID: 200000cb7df6f403
Fault in home space mode while using kernel ASCE.
AS:00000002d71bc007 R3:00000003fe5b8007 S:000000011a446000 P:000000015660c13d
Oops: 0038 ilc:3 [#1] PREEMPT SMP
Modules linked in: mlx5_ib ...
CPU: 8 PID: 7556 Comm: bash Not tainted 6.9.0-rc7 #8
Hardware name: IBM 3931 A01 704 (LPAR)
Krnl PSW : 0704e00180000000 0000014b75e7b606 (ap_parse_bitmap_str+0x10e/0x1f8)
R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
Krnl GPRS: 0000000000000001 ffffffffffffffc0 0000000000000001 00000048f96b75d3
000000cb00000100 ffffffffffffffff ffffffffffffffff 000000cb7df6fce0
000000cb7df6fce0 00000000ffffffff 000000000000002b 00000048ffffffff
000003ff9b2dbc80 200000cb7df6fcd8 0000014bffffffc0 000000cb7df6fbc8
Krnl Code: 0000014b75e7b5fc: a7840047 brc 8,0000014b75e7b68a
0000014b75e7b600: 18b2 lr %r11,%r2
#0000014b75e7b602: a7f4000a brc 15,0000014b75e7b616
>0000014b75e7b606: eb22d00000e6 laog %r2,%r2,0(%r13)
0000014b75e7b60c: a7680001 lhi %r6,1
0000014b75e7b610: 187b lr %r7,%r11
0000014b75e7b612: 84960021 brxh %r9,%r6,0000014b75e7b654
0000014b75e7b616: 18e9 lr %r14,%r9
Call Trace:
[<0000014b75e7b606>] ap_parse_bitmap_str+0x10e/0x1f8
([<0000014b75e7b5dc>] ap_parse_bitmap_str+0xe4/0x1f8)
[<0000014b75e7b758>] apmask_store+0x68/0x140
[<0000014b75679196>] kernfs_fop_write_iter+0x14e/0x1e8
[<0000014b75598524>] vfs_write+0x1b4/0x448
[<0000014b7559894c>] ksys_write+0x74/0x100
[<0000014b7618a440>] __do_syscall+0x268/0x328
[<0000014b761a3558>] system_call+0x70/0x98
INFO: lockdep is turned off.
Last Breaking-Event-Address:
[<0000014b75e7b636>] ap_parse_bitmap_str+0x13e/0x1f8
Kernel panic - not syncing: Fatal exception: panic_on_oops
occured when /sys/bus/ap/a[pq]mask was updated with a relative mask value
(like +0x10-0x12,+60,-90) with one of the numeric values exceeding INT_MAX.
The fix is simple: use unsigned long values for the internal variables. The
correct checks are already in place in the function but a simple int for
the internal variables was used with the possibility to overflow.
Reported-by: Marc Hartmayer <[email protected]>
Signed-off-by: Harald Freudenberger <[email protected]>
Tested-by: Marc Hartmayer <[email protected]>
Reviewed-by: Holger Dengler <[email protected]>
Cc: <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
|
|
Instead of calling BUG() at runtime introduce and use a prototype for a
non-existing function to produce a link error during compile when a not
supported opcode is used with the __cpacf_query() or __cpacf_check_opcode()
inline functions.
Suggested-by: Heiko Carstens <[email protected]>
Signed-off-by: Harald Freudenberger <[email protected]>
Reviewed-by: Holger Dengler <[email protected]>
Reviewed-by: Juergen Christ <[email protected]>
Cc: [email protected]
Signed-off-by: Heiko Carstens <[email protected]>
|
|
Rework the cpacf query functions to use the correct RRE
or RRF instruction formats and set register fields within
instructions correctly.
Fixes: 1afd43e0fbba ("s390/crypto: allow to query all known cpacf functions")
Reported-by: Nina Schoetterl-Glausch <[email protected]>
Suggested-by: Heiko Carstens <[email protected]>
Suggested-by: Juergen Christ <[email protected]>
Suggested-by: Holger Dengler <[email protected]>
Signed-off-by: Harald Freudenberger <[email protected]>
Reviewed-by: Holger Dengler <[email protected]>
Reviewed-by: Juergen Christ <[email protected]>
Cc: <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
|
|
Allocate cleared blocks in the bias range when the DRM
buddy's clear avail is zero. This will validate the bias
range allocation in scenarios like system boot when no
cleared blocks are available and exercise the fallback
path too. The resulting blocks should always be dirty.
v1:(Matthew)
- move the size to the variable declaration section.
- move the mm.clear_avail init to allocator init.
Signed-off-by: Arunpravin Paneer Selvam <[email protected]>
Reviewed-by: Matthew Auld <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Problem statement: During the system boot time, an application request
for the bulk volume of cleared range bias memory when the clear_avail
is zero, we dont fallback into normal allocation method as we had an
unnecessary clear_avail check which prevents the fallback method leads
to fb allocation failure following system goes into unresponsive state.
Solution: Remove the unnecessary clear_avail check in the range bias
allocation function.
v2: add a kunit for this corner case (Daniel Vetter)
Signed-off-by: Arunpravin Paneer Selvam <[email protected]>
Fixes: 96950929eb23 ("drm/buddy: Implement tracking clear page feature")
Reviewed-by: Matthew Auld <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
New since 2024.04.08:
Len Brown (6):
tools/power turbostat: Add "snapshot:" Makefile target
tools/power turbostat: Harden probe_intel_uncore_frequency()
tools/power turbostat: Remember global max_die_id
tools/power turbostat: Survive sparse die_id
tools/power turbostat: Add columns for clustered uncore frequency
tools/power turbostat: version 2024.05.10
Patryk Wlazlyn (7):
tools/power turbostat: Replace _Static_assert with BUILD_BUG_ON
tools/power turbostat: Enable non-privileged users to read sysfs counters
tools/power turbostat: Avoid possible memory corruption due to sparse topology IDs
tools/power turbostat: Read Core-cstates via perf
tools/power turbostat: Read Package-cstates via perf
tools/power turbostat: Fix order of strings in pkg_cstate_limit_strings
tools/power turbostat: Ignore pkg_cstate_limit when it is not available
Zhang Rui (2):
tools/power turbostat: Enhance ARL/LNL support
tools/power turbostat: Add ARL-H support
Signed-off-by: Len Brown <[email protected]>
|
|
When running in no-msr mode, the pkg_cstate_limit is not populated, thus
we use perf to determine if given pcstate counter is present on the
platform.
Signed-off-by: Patryk Wlazlyn <[email protected]>
Signed-off-by: Len Brown <[email protected]>
|
|
Change the order so that it matches the indexes defined in:
Signed-off-by: Patryk Wlazlyn <[email protected]>
Signed-off-by: Len Brown <[email protected]>
|
|
Reading the counters via perf can be done in bulk with a single syscall,
making the counter values more accurate with respect to one another by
minimizing the time gap between individual counter reads.
Signed-off-by: Patryk Wlazlyn <[email protected]>
Signed-off-by: Len Brown <[email protected]>
|
|
Reading the counters via perf can be done in bulk with a single syscall,
making the counter values more accurate with respect to one another by
minimizing the time gap between individual counter reads.
Signed-off-by: Patryk Wlazlyn <[email protected]>
Signed-off-by: Len Brown <[email protected]>
|
|
topology IDs
Save the highest core and package id when parsing topology to
allocate enough memory when get_rapl_counters() is called with a core or
a package id as a domain.
Note that RAPL domains are per-package on Intel, but per-core on AMD.
Thus, the RAPL code effectively runs in different modes on those two
product lines.
Signed-off-by: Patryk Wlazlyn <[email protected]>
Signed-off-by: Len Brown <[email protected]>
|
|
New machines have multiple uncore frequencies per package,
visible in /sys/devices/system/cpu/intel_uncore_frequency/uncore##/
turbostat now samples these frequencies each measurement interval.
For each package, turbostat now prints "UMHzX.Y" columns,
where X = domain_id, and Y = fabric_cluster_id.
The system summary for each UMHzX.Y column is the average value
for across all of the packages in the system.
Signed-off-by: Len Brown <[email protected]>
|
|
Pull workqueue updates from Tejun Heo:
- Work items can now be disabled and enabled, and cancel_work_sync()
and disable_work() can be called form atomic contexts for BH work
items.
This closes feature gap with tasklet and should allow converting all
existing tasklet users to BH workqueues.
- Improve pool sharing for unbound workqueues with strict affinity.
- Misc changes including doc updates, improved debug annotations and
cleanups.
* tag 'wq-for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: Use "@..." in function comment to describe variable length argument
workqueue: Add destroy_work_on_stack() in workqueue_softirq_dead()
workqueue: remove unnecessary import and function in wq_monitor.py
workqueue: Introduce enable_and_queue_work() convenience function
workqueue: add function in event of workqueue_activate_work
workqueue: Cleanup subsys attribute registration
workqueue: Use list_last_entry() to get the last idle worker
workqueue: Move attrs->cpumask out of worker_pool's properties when attrs->affn_strict
workqueue: Use INIT_WORK_ONSTACK in workqueue_softirq_dead()
workqueue: Allow cancel_work_sync() and disable_work() from atomic contexts on BH work items
workqueue: Remember whether a work item was on a BH workqueue
workqueue: Remove WORK_OFFQ_CANCELING
workqueue: Implement disable/enable for (delayed) work items
workqueue: Preserve OFFQ bits in cancel[_sync] paths
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:
- The locking around cpuset hotplug processing has always been a bit of
mess which was worked around by making hotplug processing
asynchronous. The asynchronity isn't great and led to other issues.
We tried to make the behavior synchronous a while ago but that led to
lockdep splats. Waiman took another stab at cleaning up and making it
synchronous. The patch has been in -next for well over a month and
there haven't been any complaints, so fingers crossed.
- Tracepoints added to help understanding rstat lock contentions.
- A bunch of minor changes - doc updates, code cleanups and selftests.
* tag 'cgroup-for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (24 commits)
cgroup/rstat: add cgroup_rstat_cpu_lock helpers and tracepoints
selftests/cgroup: Drop define _GNU_SOURCE
docs: cgroup-v1: Update page cache removal functions
selftests/cgroup: fix uninitialized variables in test_zswap.c
selftests/cgroup: cpu_hogger init: use {} instead of {NULL}
selftests/cgroup: fix clang warnings: uninitialized fd variable
selftests/cgroup: fix clang build failures for abs() calls
cgroup/cpuset: Remove outdated comment in sched_partition_write()
cgroup/cpuset: Fix incorrect top_cpuset flags
cgroup/cpuset: Avoid clearing CS_SCHED_LOAD_BALANCE twice
cgroup/cpuset: Statically initialize more members of top_cpuset
cgroup: Avoid unnecessary looping in cgroup_no_v1()
cgroup, legacy_freezer: update comment for freezer_css_offline()
docs, cgroup: add entries for pids to cgroup-v2.rst
cgroup: don't call cgroup1_pidlist_destroy_all() for v2
cgroup_freezer: update comment for freezer_css_online()
cgroup/rstat: desc member cgrp in cgroup_rstat_flush_release
cgroup/rstat: add cgroup_rstat_lock helpers and tracepoints
cgroup/pids: Remove superfluous zeroing
docs: cgroup-v1: Fix description for css_online
...
|
|
If an error happens in ftrace, ftrace_kill() will prevent disarming
kprobes. Eventually, the ftrace_ops associated with the kprobes will be
freed, yet the kprobes will still be active, and when triggered, they
will use the freed memory, likely resulting in a page fault and panic.
This behavior can be reproduced quite easily, by creating a kprobe and
then triggering a ftrace_kill(). For simplicity, we can simulate an
ftrace error with a kernel module like [1]:
[1]: https://github.com/brenns10/kernel_stuff/tree/master/ftrace_killer
sudo perf probe --add commit_creds
sudo perf trace -e probe:commit_creds
# In another terminal
make
sudo insmod ftrace_killer.ko # calls ftrace_kill(), simulating bug
# Back to perf terminal
# ctrl-c
sudo perf probe --del commit_creds
After a short period, a page fault and panic would occur as the kprobe
continues to execute and uses the freed ftrace_ops. While ftrace_kill()
is supposed to be used only in extreme circumstances, it is invoked in
FTRACE_WARN_ON() and so there are many places where an unexpected bug
could be triggered, yet the system may continue operating, possibly
without the administrator noticing. If ftrace_kill() does not panic the
system, then we should do everything we can to continue operating,
rather than leave a ticking time bomb.
Link: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Stephen Brennan <[email protected]>
Acked-by: Masami Hiramatsu (Google) <[email protected]>
Acked-by: Guo Ren <[email protected]>
Reviewed-by: Steven Rostedt (Google) <[email protected]>
Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
|
|
When invalidating a file as part of breaking a lease, the folios holding
the file data are disposed of, and truncate calls ->invalidate_folio()
to get rid of them rather than calling ->release_folio(). This means
that the netfs_inode::zero_point value didn't get updated in current
upstream code to reflect the point after which we can assume that the
server will only return zeroes, and future reads will then return blocks
of zeroes if the file got extended for any region beyond the old zero
point.
Fix this by updating zero_point before invalidating the inode in
cifs_revalidate_mapping().
Suggested-by: David Howells <[email protected]>
Fixes: 3ee1a1fc3981 ("cifs: Cut over to using netfslib")
Reviewed-by: David Howells <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
This reverts commit e23d4192bf9b612bce5b24f22719fd3cc6edaa69.
IMS (Interrupt Message Store) support appeared in v6.2, but there are no
users yet.
Remove it for now. We can add it back when a user comes along.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Kevin Tian <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
|
|
This reverts commit 6e24c887732901140f4e82ba2315c2e15f06f1d6.
IMS (Interrupt Message Store) support appeared in v6.2, but there are no
users yet.
Remove it for now. We can add it back when a user comes along.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Kevin Tian <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
|
|
This reverts commit 810531a1af5393f010d6508b1cb48e6650fc5e8f.
IMS (Interrupt Message Store) support appeared in v6.2, but there are no
users yet.
Remove it for now. We can add it back when a user comes along.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Kevin Tian <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
|
|
This reverts commit fa5745aca1dc819aee6463a2475b5c277f7cf8f6.
IMS (Interrupt Message Store) support appeared in v6.2, but there are no
users yet.
Remove it for now. We can add it back when a user comes along.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Kevin Tian <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
|
|
This reverts commit 0194425af0c87acaad457989a2c6d90dba58e776.
IMS (Interrupt Message Store) support appeared in v6.2, but there are no
users yet.
Remove it for now. We can add it back when a user comes along. If this is
re-added later, the relevant part of 41efa431244f ("PCI/MSI: Provide stubs
for IMS functions") should be squashed into it.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Kevin Tian <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
|
|
This reverts commit c9e5bea273834a63b5e9ba90ad94b305ba50704e.
IMS (Interrupt Message Store) support appeared in v6.2, but there are no
users yet.
Remove it for now. We can add it back when a user comes along. If this is
re-added later, the relevant part of 41efa431244f ("PCI/MSI: Provide stubs
for IMS functions") should be squashed into it.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Kevin Tian <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
|
|
This reverts commit 41efa431244f6498833ff8ee8dde28c4924c5479.
IMS (Interrupt Message Store) support appeared in v6.2, but there are no
users yet.
Remove it for now. We can add it back when a user comes along. If this is
re-added later, this could be squashed with these commits:
0194425af0c8 ("PCI/MSI: Provide IMS (Interrupt Message Store) support")
c9e5bea27383 ("PCI/MSI: Provide pci_ims_alloc/free_irq()")
which added the non-stub implementations.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Kevin Tian <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
|
|
Pull KVM updates from Paolo Bonzini:
"ARM:
- Move a lot of state that was previously stored on a per vcpu basis
into a per-CPU area, because it is only pertinent to the host while
the vcpu is loaded. This results in better state tracking, and a
smaller vcpu structure.
- Add full handling of the ERET/ERETAA/ERETAB instructions in nested
virtualisation. The last two instructions also require emulating
part of the pointer authentication extension. As a result, the trap
handling of pointer authentication has been greatly simplified.
- Turn the global (and not very scalable) LPI translation cache into
a per-ITS, scalable cache, making non directly injected LPIs much
cheaper to make visible to the vcpu.
- A batch of pKVM patches, mostly fixes and cleanups, as the
upstreaming process seems to be resuming. Fingers crossed!
- Allocate PPIs and SGIs outside of the vcpu structure, allowing for
smaller EL2 mapping and some flexibility in implementing more or
less than 32 private IRQs.
- Purge stale mpidr_data if a vcpu is created after the MPIDR map has
been created.
- Preserve vcpu-specific ID registers across a vcpu reset.
- Various minor cleanups and improvements.
LoongArch:
- Add ParaVirt IPI support
- Add software breakpoint support
- Add mmio trace events support
RISC-V:
- Support guest breakpoints using ebreak
- Introduce per-VCPU mp_state_lock and reset_cntx_lock
- Virtualize SBI PMU snapshot and counter overflow interrupts
- New selftests for SBI PMU and Guest ebreak
- Some preparatory work for both TDX and SNP page fault handling.
This also cleans up the page fault path, so that the priorities of
various kinds of fauls (private page, no memory, write to read-only
slot, etc.) are easier to follow.
x86:
- Minimize amount of time that shadow PTEs remain in the special
REMOVED_SPTE state.
This is a state where the mmu_lock is held for reading but
concurrent accesses to the PTE have to spin; shortening its use
allows other vCPUs to repopulate the zapped region while the zapper
finishes tearing down the old, defunct page tables.
- Advertise the max mappable GPA in the "guest MAXPHYADDR" CPUID
field, which is defined by hardware but left for software use.
This lets KVM communicate its inability to map GPAs that set bits
51:48 on hosts without 5-level nested page tables. Guest firmware
is expected to use the information when mapping BARs; this avoids
that they end up at a legal, but unmappable, GPA.
- Fixed a bug where KVM would not reject accesses to MSR that aren't
supposed to exist given the vCPU model and/or KVM configuration.
- As usual, a bunch of code cleanups.
x86 (AMD):
- Implement a new and improved API to initialize SEV and SEV-ES VMs,
which will also be extendable to SEV-SNP.
The new API specifies the desired encryption in KVM_CREATE_VM and
then separately initializes the VM. The new API also allows
customizing the desired set of VMSA features; the features affect
the measurement of the VM's initial state, and therefore enabling
them cannot be done tout court by the hypervisor.
While at it, the new API includes two bugfixes that couldn't be
applied to the old one without a flag day in userspace or without
affecting the initial measurement. When a SEV-ES VM is created with
the new VM type, KVM_GET_REGS/KVM_SET_REGS and friends are rejected
once the VMSA has been encrypted. Also, the FPU and AVX state will
be synchronized and encrypted too.
- Support for GHCB version 2 as applicable to SEV-ES guests.
This, once more, is only accessible when using the new
KVM_SEV_INIT2 flow for initialization of SEV-ES VMs.
x86 (Intel):
- An initial bunch of prerequisite patches for Intel TDX were merged.
They generally don't do anything interesting. The only somewhat
user visible change is a new debugging mode that checks that KVM's
MMU never triggers a #VE virtualization exception in the guest.
- Clear vmcs.EXIT_QUALIFICATION when synthesizing an EPT Misconfig
VM-Exit to L1, as per the SDM.
Generic:
- Use vfree() instead of kvfree() for allocations that always use
vcalloc() or __vcalloc().
- Remove .change_pte() MMU notifier - the changes to non-KVM code are
small and Andrew Morton asked that I also take those through the
KVM tree.
The callback was only ever implemented by KVM (which was also the
original user of MMU notifiers) but it had been nonfunctional ever
since calls to set_pte_at_notify were wrapped with
invalidate_range_start and invalidate_range_end... in 2012.
Selftests:
- Enhance the demand paging test to allow for better reporting and
stressing of UFFD performance.
- Convert the steal time test to generate TAP-friendly output.
- Fix a flaky false positive in the xen_shinfo_test due to comparing
elapsed time across two different clock domains.
- Skip the MONITOR/MWAIT test if the host doesn't actually support
MWAIT.
- Avoid unnecessary use of "sudo" in the NX hugepage test wrapper
shell script, to play nice with running in a minimal userspace
environment.
- Allow skipping the RSEQ test's sanity check that the vCPU was able
to complete a reasonable number of KVM_RUNs, as the assert can fail
on a completely valid setup.
If the test is run on a large-ish system that is otherwise idle,
and the test isn't affined to a low-ish number of CPUs, the vCPU
task can be repeatedly migrated to CPUs that are in deep sleep
states, which results in the vCPU having very little net runtime
before the next migration due to high wakeup latencies.
- Define _GNU_SOURCE for all selftests to fix a warning that was
introduced by a change to kselftest_harness.h late in the 6.9
cycle, and because forcing every test to #define _GNU_SOURCE is
painful.
- Provide a global pseudo-RNG instance for all tests, so that library
code can generate random, but determinstic numbers.
- Use the global pRNG to randomly force emulation of select writes
from guest code on x86, e.g. to help validate KVM's emulation of
locked accesses.
- Allocate and initialize x86's GDT, IDT, TSS, segments, and default
exception handlers at VM creation, instead of forcing tests to
manually trigger the related setup.
Documentation:
- Fix a goof in the KVM_CREATE_GUEST_MEMFD documentation"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (225 commits)
selftests/kvm: remove dead file
KVM: selftests: arm64: Test vCPU-scoped feature ID registers
KVM: selftests: arm64: Test that feature ID regs survive a reset
KVM: selftests: arm64: Store expected register value in set_id_regs
KVM: selftests: arm64: Rename helper in set_id_regs to imply VM scope
KVM: arm64: Only reset vCPU-scoped feature ID regs once
KVM: arm64: Reset VM feature ID regs from kvm_reset_sys_regs()
KVM: arm64: Rename is_id_reg() to imply VM scope
KVM: arm64: Destroy mpidr_data for 'late' vCPU creation
KVM: arm64: Use hVHE in pKVM by default on CPUs with VHE support
KVM: arm64: Fix hvhe/nvhe early alias parsing
KVM: SEV: Allow per-guest configuration of GHCB protocol version
KVM: SEV: Add GHCB handling for termination requests
KVM: SEV: Add GHCB handling for Hypervisor Feature Support requests
KVM: SEV: Add support to handle AP reset MSR protocol
KVM: x86: Explicitly zero kvm_caps during vendor module load
KVM: x86: Fully re-initialize supported_mce_cap on vendor module load
KVM: x86: Fully re-initialize supported_vm_types on vendor module load
KVM: x86/mmu: Sanity check that __kvm_faultin_pfn() doesn't create noslot pfns
KVM: x86/mmu: Initialize kvm_page_fault's pfn and hva to error values
...
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull CXL updates from Dave Jiang:
- Three CXL mailbox passthrough commands are added to support the
populating and clearing of vendor debug logs:
- Get Log Capabilities
- Get Supported Log Sub-List Commands
- Clear Log
- Add support of Device Phyiscal Address (DPA) to Host Physical Address
(HPA) translation for CXL events of cxl_dram and cxl_general media.
This allows user space to figure out which CXL region the event
occured via trace event.
- Connect CXL to CPER reporting.
If a device is configured for firmware first, CXL event records are
not sent directly to the host. Those records are reported through EFI
Common Platform Error Records (CPER). Add support to route the CPER
records through the CXL sub-system in order to provide DPA to HPA
translation and also event decoding and tracing. This is useful for
users to determine which system issues may correspond to specific
hardware events.
- A number of misc cleanups and fixes:
- Fix for compile warning of cxl_security_ops
- Add debug message for invalid interleave granularity
- Enhancement to cxl-test event testing
- Add dev_warn() on unsupported mixed mode decoder
- Fix use of phys_to_target_node() for x86
- Use helper function for decoder enum instead of open coding
- Include missing headers for cxl-event
- Fix MAINTAINERS file entry
- Fix cxlr_pmem memory leak
- Cleanup __cxl_parse_cfmws via scope-based resource menagement
- Convert cxl_pmem_region_alloc() to scope-based resource management
* tag 'cxl-for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (21 commits)
cxl/cper: Remove duplicated GUID defines
cxl/cper: Fix non-ACPI-APEI-GHES build
cxl/pci: Process CPER events
acpi/ghes: Process CXL Component Events
cxl/region: Convert cxl_pmem_region_alloc to scope-based resource management
cxl/acpi: Cleanup __cxl_parse_cfmws()
cxl/region: Fix cxlr_pmem leaks
cxl/core: Add region info to cxl_general_media and cxl_dram events
cxl/region: Move cxl_trace_hpa() work to the region driver
cxl/region: Move cxl_dpa_to_region() work to the region driver
cxl/trace: Correct DPA field masks for general_media & dram events
MAINTAINERS: repair file entry in COMPUTE EXPRESS LINK
cxl/cxl-event: include missing <linux/types.h> and <linux/uuid.h>
cxl/hdm: Debug, use decoder name function
cxl: Fix use of phys_to_target_node() for x86
cxl/hdm: dev_warn() on unsupported mixed mode decoder
cxl/test: Enhance event testing
cxl/hdm: Add debug message for invalid interleave granularity
cxl: Fix compile warning for cxl_security_ops extern
cxl/mbox: Add Clear Log mailbox command
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull nvdimm updates from Ira Weiny:
"The changes include removing duplicate code and updating the nvdimm
tree to the current kernel interfaces such as using const for struct
device_type and changing the platform remove callback signature"
* tag 'libnvdimm-for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
dax: remove redundant assignment to variable rc
ndtest: Convert to platform remove callback returning void
nvdimm/btt: always set max_integrity_segments
nvdimm: remove nd_integrity_init
dax: constify the struct device_type usage
powerpc/papr_scm: Move duplicate definitions to common header files
|
|
Remove wrong mask on subsys_vendor_id. Both the Vendor ID and Subsystem
Vendor ID are u16 variables and are written to a u32 register of the
controller. The Subsystem Vendor ID was always 0 because the u16 value
was masked incorrectly with GENMASK(31,16) resulting in all lower 16
bits being set to 0 prior to the shift.
Remove both masks as they are unnecessary and set the register correctly
i.e., the lower 16-bits are the Vendor ID and the upper 16-bits are the
Subsystem Vendor ID.
This is documented in the RK3399 TRM section 17.6.7.1.17
[kwilczynski: removed unnecesary newline]
Fixes: cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
Link: https://lore.kernel.org/linux-pci/[email protected]
Signed-off-by: Rick Wertenbroek <[email protected]>
Signed-off-by: Krzysztof Wilczyński <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Damien Le Moal <[email protected]>
Cc: [email protected]
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux
Pull modules updates from Luis Chamberlain:
"Finally something fun. Mike Rapoport does some cleanup to allow us to
take out module_alloc() out of modules into a new paint shedded
execmem_alloc() and execmem_free() so to make emphasis these helpers
are actually used outside of modules.
It starts with a non-functional changes API rename / placeholders to
then allow architectures to define their requirements into a new shiny
struct execmem_info with ranges, and requirements for those ranges.
Archs now can intitialize this execmem_info as the last part of
mm_core_init() if they have to diverge from the norm. Each range is a
known type clearly articulated and spelled out in enum execmem_type.
Although a lot of this is major cleanup and prep work for future
enhancements an immediate clear gain is we get to enable KPROBES
without MODULES now. That is ultimately what motiviated to pick this
work up again, now with smaller goal as concrete stepping stone"
* tag 'modules-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
bpf: remove CONFIG_BPF_JIT dependency on CONFIG_MODULES of
kprobes: remove dependency on CONFIG_MODULES
powerpc: use CONFIG_EXECMEM instead of CONFIG_MODULES where appropriate
x86/ftrace: enable dynamic ftrace without CONFIG_MODULES
arch: make execmem setup available regardless of CONFIG_MODULES
powerpc: extend execmem_params for kprobes allocations
arm64: extend execmem_info for generated code allocations
riscv: extend execmem_params for generated code allocations
mm/execmem, arch: convert remaining overrides of module_alloc to execmem
mm/execmem, arch: convert simple overrides of module_alloc to execmem
mm: introduce execmem_alloc() and execmem_free()
module: make module_memory_{alloc,free} more self-contained
sparc: simplify module_alloc()
nios2: define virtual address space for modules
mips: module: rename MODULE_START to MODULES_VADDR
arm64: module: remove unneeded call to kasan_alloc_module_shadow()
kallsyms: replace deprecated strncpy with strscpy
module: allow UNUSED_KSYMS_WHITELIST to be relative against objtree.
|
|
Booting an LPAE-enabled kernel built with CONFIG_CC_OPTIMIZE_FOR_SIZE=y
fails when starting userspace:
Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004
CPU: 1 PID: 1 Comm: init Tainted: G W N 6.9.0-rc1-koelsch-00004-g7af5b901e847 #1930
Hardware name: Generic R-Car Gen2 (Flattened Device Tree)
Call trace:
unwind_backtrace from show_stack+0x10/0x14
show_stack from dump_stack_lvl+0x78/0xa8
dump_stack_lvl from panic+0x118/0x398
panic from do_exit+0x1ec/0x938
do_exit from sys_exit_group+0x0/0x10
---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004 ]---
Add the missing memory clobber to cpu_set_ttbcr(), as suggested by
Russell King.
Force inlining of uaccess_save_and_enable(), as suggested by Ard
Biesheuvel.
The latter fixes booting on Koelsch.
Closes: https://lore.kernel.org/r/CAMuHMdWTAJcZ9BReWNhpmsgkOzQxLNb5OhNYxzxv6D5TSh2fwQ@mail.gmail.com/
Fixes: 7af5b901e84743c6 ("ARM: 9358/2: Implement PAN for LPAE by TTBR0 page table walks disablement")
Acked-by: Ard Biesheuvel <[email protected]>
Reviewed-by: Linus Walleij <[email protected]>
Tested-by: Florian Fainelli <[email protected]>
Signed-off-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Russell King (Oracle) <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching
Pull livepatching update from Petr Mladek:
- Use more informative names for the livepatch transition states
* tag 'livepatching-for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching:
livepatch: Rename KLP_* to KLP_TRANSITION_*
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux
Pull printk updates from Petr Mladek:
- Use no_printk() instead of "if (0) printk()" constructs to avoid
generating printk index for messages disabled at compile time
- Remove deprecated strncpy/strcpy from printk.c
- Remove redundant CONFIG_BASE_FULL in favor of CONFIG_BASE_SMALL
* tag 'printk-for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
printk: cleanup deprecated uses of strncpy/strcpy
printk: Remove redundant CONFIG_BASE_FULL
printk: Change type of CONFIG_BASE_SMALL to bool
printk: Fix LOG_CPU_MAX_BUF_SHIFT when BASE_SMALL is enabled
ceph: Use no_printk() helper
dyndbg: Use *no_printk() helpers
dev_printk: Add and use dev_no_printk()
printk: Let no_printk() use _printk()
|
|
Pull smb client updates from Steve French:
- three important fixes to recent netfs conversion to fix various
xfstest failures, and rmmod oops
- cleanup patch to fix various GCC-14 warnings
* tag '6.10-rc-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
smb3: fix perf regression with cached writes with netfs conversion
cifs: Fix locking in cifs_strict_readv()
cifs: Change from mempool_destroy to mempool_exit for request pools
smb: smb2pdu.h: Avoid -Wflex-array-member-not-at-end warnings
|
|
Choices and their members are associated via the P_CHOICE property.
Currently, prop_get_symbol(sym_get_choice_prop()) is used to obtain
the choice of the given choice member.
Replace it with sym_get_choice_menu(), which retrieves the choice
without relying on P_CHOICE.
Signed-off-by: Masahiro Yamada <[email protected]>
|
|
This file was supposed to be removed in commit 2b7deea3ec7c ("Revert
"kvm: selftests: move base kvm_util.h declarations to kvm_util_base.h""),
but it survived. Remove it now.
Signed-off-by: Paolo Bonzini <[email protected]>
|