| Age | Commit message | Author | Files | Lines |
|
Move the fiddly bits of the efivar layer into its only remaining user,
efivarfs, and confine its use to that particular module. All other uses
of the EFI variable store have no need for this additional layer of
complexity, given that they either only read variables, or read and
write variables into a separate GUIDed namespace, and cannot be used to
manipulate EFI variables that are covered by the EFI spec and/or affect
the boot flow.
Signed-off-by: Ard Biesheuvel <[email protected]>
|
|
__efivar_entry_iter() uses a list iterator in a dubious way, i.e., it
assumes that the iteration variable always points to an object of the
appropriate type, even if the list traversal exhausts the list
completely, in which case it will point somewhere in the vicinity of the
list's anchor instead.
Fortunately, we no longer use this function so we can just get rid of it
entirely.
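
The hazard described above can be avoided by converting a list node to its containing entry only while the node is known not to be the anchor. What follows is a minimal user-space sketch of that pattern, not the kernel's code: struct demo_node, struct demo_entry, demo_entry_of() and demo_find() are hypothetical names that merely mimic the shape of the kernel's circular lists.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical circular singly linked list with an anchor node. */
struct demo_node {
	struct demo_node *next;
};

struct demo_entry {
	int id;
	struct demo_node link;
};

/* Recover the entry that embeds a given node (container_of-style). */
#define demo_entry_of(ptr) \
	((struct demo_entry *)((char *)(ptr) - offsetof(struct demo_entry, link)))

static void demo_list_init(struct demo_node *head)
{
	head->next = head;
}

static void demo_list_add(struct demo_node *head, struct demo_node *node)
{
	node->next = head->next;
	head->next = node;
}

/*
 * Walk the nodes, not the entries: the cursor is only converted to an
 * entry inside the loop body, where it is known not to be the anchor.
 * The result is NULL when the traversal exhausts the list.
 */
static struct demo_entry *demo_find(struct demo_node *head, int id)
{
	struct demo_node *pos;

	for (pos = head->next; pos != head; pos = pos->next) {
		struct demo_entry *e = demo_entry_of(pos);

		if (e->id == id)
			return e;
	}
	return NULL;
}
```

A caller then tests the returned pointer against NULL instead of inspecting the loop cursor after the loop has ended.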
Signed-off-by: Ard Biesheuvel <[email protected]>
|
|
Both efivars and efivarfs use __efivar_entry_iter() to go over the
linked list that shadows the list of EFI variables held by the firmware,
but fail to call the begin/end helpers that are documented as a
prerequisite.
So switch to the proper version, which is efivar_entry_iter(). Given
that in both cases, efivar_entry_remove() is invoked with the lock held
already, don't take the lock there anymore.
Signed-off-by: Ard Biesheuvel <[email protected]>
|
|
Commit 5d9db883761a ("efi: Add support for a UEFI variable filesystem")
dated Oct 5, 2012, introduced a new efivarfs pseudo-filesystem to
replace the efivars sysfs interface that was used up to that point to
expose EFI variables to user space.
The main problem with the sysfs interface was that it only supported up
to 1024 bytes of payload per file, whereas the underlying variables
themselves are only bounded by a platform specific per-variable and
global limit that is typically much higher than 1024 bytes.
The deprecated sysfs interface is only enabled on x86 and Itanium; other
EFI enabled architectures only support the efivarfs pseudo-filesystem.
So let's finally rip off the band aid, and drop the old interface
entirely. This will make it easier to refactor and clean up the
underlying infrastructure that is shared between efivars, efivarfs and
efi-pstore, and is long overdue for a makeover.
Signed-off-by: Ard Biesheuvel <[email protected]>
|
|
Avoid the efivars layer and simply call the newly introduced EFI
varstore helpers instead. This simplifies the code substantially, and
also allows us to remove some hacks in the shared efivars layer that
were added for efi-pstore specifically.
In order to be able to delete the EFI variable associated with a record,
store the UTF-16 name of the variable in the pstore record's priv field.
That way, we don't have to make guesses regarding which variable the
record may have been loaded from.
Signed-off-by: Ard Biesheuvel <[email protected]>
|
|
The current efivars layer is a jumble of list iterators, shadow data
structures and safe variable manipulation helpers that really belong in
the efivarfs pseudo file system once the obsolete sysfs access method to
EFI variables is removed.
So split off a minimal efivar get/set variable API that reuses the
existing efivars_lock semaphore to mediate access to the various runtime
services, primarily to ensure that performing a SetVariable() on one CPU
while another is calling GetNextVariable() in a loop to enumerate the
contents of the EFI variable store does not result in surprises.
Signed-off-by: Ard Biesheuvel <[email protected]>
|
|
Even though the efivars_lock lock is documented as protecting the
efivars->ops pointer (among other things), efivar_init() happily
releases and reacquires the lock for every EFI variable that it
enumerates. This used to be needed because the lock was originally a
spinlock, which prevented the callback that is invoked for every
variable from being able to sleep. However, releasing the lock could
potentially invalidate the ops pointer, but more importantly, it might
allow a SetVariable() runtime service call to take place concurrently,
and the UEFI spec does not define how this affects an enumeration that
is running in parallel using the GetNextVariable() runtime service,
which is what efivar_init() uses.
In the meantime, the lock has been converted into a semaphore, and the
only reason we need to drop the lock is because the efivarfs pseudo
filesystem driver will otherwise deadlock when it invokes the efivars
API from the callback to create the efivar_entry items and insert them
into the linked list. (EFI pstore is affected in a similar way.)
So let's switch to helpers that can be used while the lock is already
taken. This way, we can hold on to the lock throughout the enumeration.
Signed-off-by: Ard Biesheuvel <[email protected]>
|
|
The EFI pstore backend will need to store per-record variable name data
when we switch away from the efivars layer. Add a priv field to struct
pstore_record, and document it as holding a backend specific pointer
that is assumed to be a kmalloc()d buffer, and will be kfree()d when the
entire record is freed.
Acked-by: Kees Cook <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
|
|
Pull block fixes from Jens Axboe:
- Series fixing issues with sysfs locking and name reuse (Christoph)
- NVMe pull request via Christoph:
- Fix the mixed up CRIMS/CRWMS constants (Joel Granados)
- Add another broken identifier quirk (Leo Savernik)
- Fix up a quirk because Samsung reuses PCI IDs over different
products (Christoph Hellwig)
- Remove old WARN_ON() that doesn't apply anymore (Li)
- Fix for using a stale cached request value for rq-qos throttling
mechanisms that may schedule(), like iocost (me)
- Remove unused parameter to blk_independent_access_range() (Damien)
* tag 'block-5.19-2022-06-24' of git://git.kernel.dk/linux-block:
block: remove WARN_ON() from bd_link_disk_holder
nvme: move the Samsung X5 quirk entry to the core quirks
nvme: fix the CRIMS and CRWMS definitions to match the spec
nvme: add a bogus subsystem NQN quirk for Micron MTFDKBA2T0TFH
block: pop cached rq before potentially blocking rq_qos_throttle()
block: remove queue from struct blk_independent_access_range
block: freeze the queue earlier in del_gendisk
block: remove per-disk debugfs files in blk_unregister_queue
block: serialize all debugfs operations using q->debugfs_mutex
block: disable the elevator int del_gendisk
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux
Pull printk kernel thread revert from Petr Mladek:
"Revert printk console kthreads.
The testing of 5.19 release candidates revealed issues that did not
happen when all consoles were serialized using the console semaphore.
More time is needed to check expectations of the existing console
drivers and be confident that they can be safely used in parallel"
* tag 'printk-for-5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
Revert "printk: add functions to prefer direct printing"
Revert "printk: add kthread console printers"
Revert "printk: extend console_lock for per-console locking"
Revert "printk: remove @console_locked"
Revert "printk: Block console kthreads when direct printing will be required"
Revert "printk: Wait for the global console lock when the system is going down"
|
|
Add a new debugfs file to expose the pid of each vcpu thread. This
is very helpful for userland tools to get the vcpu pids without
worrying about thread naming conventions of the VMM.
Signed-off-by: Vineeth Pillai (Google) <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Add dma_release_coherent_memory() to the DMA API to allow DMA
users to call it to release dev->dma_mem when the device is
removed.
Signed-off-by: Mark-PK Tsai <[email protected]>
Acked-by: Christoph Hellwig <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mathieu Poirier <[email protected]>
|
|
Thanks to the recent commit 0a97953fd221 ("lib: add
bitmap_{from,to}_arr64") now we can directly convert a u64 value into a
bitmap and vice versa.
However, when checking the header, there is a duplicated helper for
bitmap_to_arr64(), but no bitmap_from_arr64().
Just fix the copy-n-paste error.
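
For readers unfamiliar with the two helpers, the conversion they perform can be sketched in portable user-space C. This is an illustration only, not the kernel implementation: the demo_ names and the DEMO_ macros are hypothetical, and the sketch assumes that unsigned long is 32 or 64 bits wide.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define DEMO_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define DEMO_BITS_TO_LONGS(n)	(((n) + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

/* Copy the low nbits of an array of u64 values into a bitmap of longs. */
static void demo_bitmap_from_arr64(unsigned long *bitmap, const uint64_t *buf,
				   unsigned int nbits)
{
	unsigned int i;

	for (i = 0; i < DEMO_BITS_TO_LONGS(nbits); i++) {
		/* Index of the first bit stored in bitmap word i. */
		unsigned int bit = i * DEMO_BITS_PER_LONG;

		bitmap[i] = (unsigned long)(buf[bit / 64] >> (bit % 64));
	}

	/* Clear any bits beyond nbits in the last word. */
	if (nbits % DEMO_BITS_PER_LONG)
		bitmap[DEMO_BITS_TO_LONGS(nbits) - 1] &=
			~0UL >> (DEMO_BITS_PER_LONG - nbits % DEMO_BITS_PER_LONG);
}

/* Copy the low nbits of a bitmap of longs into an array of u64 values. */
static void demo_bitmap_to_arr64(uint64_t *buf, const unsigned long *bitmap,
				 unsigned int nbits)
{
	unsigned int i;

	for (i = 0; i < (nbits + 63) / 64; i++)
		buf[i] = 0;

	/* Bit by bit: slow, but independent of the width of long. */
	for (i = 0; i < nbits; i++) {
		if (bitmap[i / DEMO_BITS_PER_LONG] & (1UL << (i % DEMO_BITS_PER_LONG)))
			buf[i / 64] |= (uint64_t)1 << (i % 64);
	}
}
```

The two directions are deliberately symmetric, which is why a header that declares one of them twice and omits the other is easy to overlook.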
Signed-off-by: Qu Wenruo <[email protected]>
Signed-off-by: Yury Norov <[email protected]>
|
|
We no longer need to acquire mrt_lock() in mr_dump,
using rcu_read_lock() is enough.
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
We will soon use RCU instead of rwlock in ipmr & ip6mr
This preliminary patch adds proper rcu verbs to read/write
(struct vif_device)->dev
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Allow the capacity of the kvm_mmu_memory_cache struct to be chosen at
declaration time rather than being fixed for all declarations. This will
be used in a follow-up commit to declare a cache in x86 with a capacity
of 512+ objects without having to increase the capacity of all caches in
KVM.
This change requires each cache to now specify its capacity at runtime,
since the cache struct itself no longer has a fixed capacity known at
compile time. To protect against someone accidentally defining a
kvm_mmu_memory_cache struct directly (without the extra storage), this
commit includes a WARN_ON() in kvm_mmu_topup_memory_cache().
In order to support different capacities, this commit changes the
objects pointer array to be dynamically allocated the first time the
cache is topped-up.
While here, opportunistically clean up the stack-allocated
kvm_mmu_memory_cache structs in riscv and arm64 to use designated
initializers.
No functional change intended.
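
The idea of a per-cache capacity with a lazily allocated pointer array can be sketched in user space as follows. This is only an illustration under stated assumptions: struct demo_cache and the demo_cache_*() helpers are hypothetical, calloc() stands in for the kernel allocators, and a plain error return stands in for the WARN_ON().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a memory cache with a per-declaration capacity. */
struct demo_cache {
	int nobjs;
	int capacity;	/* chosen by whoever declares the cache */
	void **objects;	/* allocated the first time the cache is topped up */
};

/* Fill the cache so that at least min objects are available. */
static int demo_cache_topup(struct demo_cache *mc, int min, size_t obj_size)
{
	if (mc->nobjs >= min)
		return 0;

	/* Stand-in for the WARN_ON(): the capacity must have been set. */
	if (mc->capacity <= 0 || min > mc->capacity)
		return -1;

	if (!mc->objects) {
		mc->objects = calloc((size_t)mc->capacity, sizeof(void *));
		if (!mc->objects)
			return -1;
	}

	while (mc->nobjs < mc->capacity) {
		void *obj = calloc(1, obj_size);

		if (!obj)
			return mc->nobjs >= min ? 0 : -1;
		mc->objects[mc->nobjs++] = obj;
	}
	return 0;
}

/* Take one preallocated object out of the cache, or NULL if it is empty. */
static void *demo_cache_alloc(struct demo_cache *mc)
{
	return mc->nobjs ? mc->objects[--mc->nobjs] : NULL;
}

/* Release the remaining objects and the pointer array itself. */
static void demo_cache_free(struct demo_cache *mc)
{
	while (mc->nobjs)
		free(mc->objects[--mc->nobjs]);
	free(mc->objects);
	mc->objects = NULL;
}
```

A declaration such as `struct demo_cache mc = { .capacity = 4 };` mirrors the designated-initializer style mentioned above: only the capacity is chosen up front, and the storage follows on first use.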
Reviewed-by: Marc Zyngier <[email protected]>
Signed-off-by: David Matlack <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Commit dfd5e3f5fe27 ("locking/lockdep: Mark local_lock_t") added yet
another lockdep_init_map_*() variant, but forgot to update all the
existing users of the most complicated version.
This could lead to a loss of lock_type and hence an incorrect report.
Given the relative rarity of both local_lock and these annotations,
this is unlikely to happen in practice; still, it is best to fix things.
Fixes: dfd5e3f5fe27 ("locking/lockdep: Mark local_lock_t")
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Instead of defaulting to patching NOP opcodes at init time, and leaving
it to the architectures to override this if this is not needed, switch
to a model where doing nothing is the default. This is the common case
by far, as only MIPS requires NOP patching at init time. On all other
architectures, the correct encodings are emitted by the compiler and so
no initial patching is needed.
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Mark Rutland <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
MIPS is the only remaining architecture that needs to patch jump label
NOP encodings to initialize them at load time. So let's move the module
patching part of that from generic code into arch/mips, and drop it from
the others.
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
We tried to harden virtio device notifications in 8b4ec69d7e09 ("virtio:
harden vring IRQ"). It works on the assumption that the driver or
core can properly call virtio_device_ready() at the right
place. Unfortunately, this turns out not to be true, and the hardening
uncovers various bugs in the existing drivers, mainly incorrect use of
virtio_device_ready().
So let's add a Kconfig option for the hardening and disable it by
default. It gives us time to fix the drivers and then we can consider
re-enabling it.
Signed-off-by: Jason Wang <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
|
|
At the moment the FPGA manager core loads the entire image provided to
fpga_mgr_load() to the device. But the whole FPGA image buffer is not
always meant to be written to the device. In particular, a .dat formatted
image for Microchip MPF contains meta info in the header that is not
meant to be written to the device. This is an issue for those low level
drivers that load data to the device with the write() fpga_manager_ops
callback, since write() can be called while iterating over a
scatter-gather table, not only on a linear image buffer. On the other
hand, the write_sg() callback is provided with the whole image in
scatter-gather form and can decide for itself which part should be sent
to the device.
Add header_size and data_size to the fpga_image_info struct, add
skip_header to the fpga_manager_ops struct and adjust fpga_mgr_write()
callers with respect to them.
* info->header_size indicates part at the beginning of image buffer
that contains some meta info. It is optional and can be 0,
initialized with mops->initial_header_size.
* mops->skip_header tells fpga-mgr core whether write should start
from the beginning of image buffer or at the offset of header_size.
* info->data_size is the size of bitstream data that is meant to be
written to the device. It is also optional and can be 0, which
means bitstream data is up to the end of image buffer.
Also add a parse_header() callback to fpga_manager_ops, whose purpose is
to set info->header_size and info->data_size. At least
initial_header_size bytes of the image buffer will be passed into
parse_header() the first time. If that is not enough, parse_header()
should set the desired size in info->header_size and return -EAGAIN; it
will then be called again with a greater part of the image buffer as input.
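
The -EAGAIN handshake described above can be sketched in user-space C. Everything here is hypothetical and only illustrates the protocol: struct demo_image_info, the demo_*() functions and the one-byte length header format are assumptions made for the example, not the FPGA manager's actual code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the image info structure. */
struct demo_image_info {
	size_t header_size;	/* meta info at the start of the image */
	size_t data_size;	/* bitstream size; 0 means "up to the end" */
};

/*
 * Hypothetical low-level parser. Assumed header format: byte 0 holds the
 * total header length in bytes. Asks for more input via -EAGAIN.
 */
static int demo_parse_header(struct demo_image_info *info,
			     const uint8_t *buf, size_t count)
{
	size_t need;

	if (count < 1 || buf[0] == 0)
		return -EINVAL;

	need = buf[0];
	info->header_size = need;
	if (need > count)
		return -EAGAIN;	/* call again with header_size bytes */

	info->data_size = 0;	/* data runs to the end of the image */
	return 0;
}

/* Core-side loop: feed a growing prefix of the image to the parser. */
static int demo_parse_header_loop(struct demo_image_info *info,
				  const uint8_t *buf, size_t size,
				  size_t initial_header_size)
{
	size_t chunk = initial_header_size;
	int ret;

	info->header_size = initial_header_size;
	for (;;) {
		if (chunk > size)
			return -EINVAL;	/* image is too short */

		ret = demo_parse_header(info, buf, chunk);
		if (ret != -EAGAIN)
			return ret;

		/* The parser must ask for strictly more than it was given. */
		if (info->header_size <= chunk)
			return -EINVAL;
		chunk = info->header_size;
	}
}

/* Offset at which writing starts, depending on skip_header. */
static size_t demo_write_offset(const struct demo_image_info *info,
				int skip_header)
{
	return skip_header ? info->header_size : 0;
}
```

With a 6-byte image whose first byte is 4 and an initial header size of 1, the parser in this sketch is called twice: once with 1 byte, which yields -EAGAIN, and once with 4 bytes, which succeeds.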
Suggested-by: Xu Yilun <[email protected]>
Signed-off-by: Ivan Bornyakov <[email protected]>
Acked-by: Xu Yilun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Xu Yilun <[email protected]>
|
|
In the CONFIG_MEMREGION=n case, memregion_free() is meant to be a static
inline. 0day reports:
In file included from drivers/cxl/core/port.c:4:
include/linux/memregion.h:19:6: warning: no previous prototype for
function 'memregion_free' [-Wmissing-prototypes]
Mark memregion_free() static.
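
The general shape of such a header is sketched below. This is not the contents of include/linux/memregion.h: the demo_ names and the DEMO_CONFIG_MEMREGION macro are hypothetical, and the -ENOMEM return value of the allocation stub is an assumption made for the example.

```c
#include <assert.h>
#include <errno.h>

#ifdef DEMO_CONFIG_MEMREGION
/* Feature enabled: real implementations live in a .c file. */
int demo_memregion_alloc(void);
void demo_memregion_free(int id);
#else
/*
 * Feature disabled: stubs defined in the header. Without "static", every
 * file that includes the header would emit an external definition, which
 * is what triggers -Wmissing-prototypes.
 */
static inline int demo_memregion_alloc(void)
{
	return -ENOMEM;
}

static inline void demo_memregion_free(int id)
{
	(void)id;	/* nothing to release in the stub */
}
#endif
```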
Fixes: 33dd70752cd7 ("lib: Uplevel the pmem "region" ida to a global allocator")
Reported-by: kernel test robot <[email protected]>
Reviewed-by: Alison Schofield <[email protected]>
Link: https://lore.kernel.org/r/165601455171.4042645.3350844271068713515.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <[email protected]>
|
|
Commit 48ec13d36d3f ("gpio: Properly document parent data union")
is supposed to have fixed a warning from "make htmldocs" regarding
kernel-doc comments to union members. However, the same warning
still remains [1].
Fix the issue by following the example found in section "Nested
structs/unions" of Documentation/doc-guide/kernel-doc.rst.
Signed-off-by: Akira Yokosawa <[email protected]>
Reported-by: Stephen Rothwell <[email protected]>
Fixes: 48ec13d36d3f ("gpio: Properly document parent data union")
Link: https://lore.kernel.org/r/[email protected]/ [1]
Cc: Linus Walleij <[email protected]>
Cc: Bartosz Golaszewski <[email protected]>
Cc: Joey Gouly <[email protected]>
Cc: Marc Zyngier <[email protected]>
Tested-by: Stephen Rothwell <[email protected]>
Reviewed-by: Mauro Carvalho Chehab <[email protected]>
Signed-off-by: Bartosz Golaszewski <[email protected]>
|
|
Add a "flags" field to the "struct dw_edma_chip" so that the controller
drivers can pass flags that are relevant to the platform.
DW_EDMA_CHIP_LOCAL - Used by the controller drivers accessing eDMA
locally. Local eDMA access doesn't require generating MSIs to the remote.
Link: https://lore.kernel.org/r/[email protected]
Tested-by: Serge Semin <[email protected]>
Tested-by: Manivannan Sadhasivam <[email protected]>
Signed-off-by: Frank Li <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Serge Semin <[email protected]>
Reviewed-by: Manivannan Sadhasivam <[email protected]>
Acked-By: Vinod Koul <[email protected]>
|
|
The struct dw_edma contains wr(rd)_ch_cnt fields. The EDMA driver gets
the write(read) channel count from a register, then saves it into dw_edma.
The wr(rd)_ch_cnt in dw_edma_chip actually means how many linked-list
memory regions are available in ll_region_wr(rd)[EDMA_MAX_WR_CH]. Rename
it to ll_wr(rd)_cnt to indicate actual usage.
Link: https://lore.kernel.org/r/[email protected]
Tested-by: Serge Semin <[email protected]>
Tested-by: Manivannan Sadhasivam <[email protected]>
Signed-off-by: Frank Li <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Serge Semin <[email protected]>
Reviewed-by: Manivannan Sadhasivam <[email protected]>
Acked-By: Vinod Koul <[email protected]>
|
|
struct dw_edma_region rg_region included virtual address, physical address
and size information. But only the virtual address is used by the EDMA driver.
Change it to void __iomem *reg_base to clean up code.
Link: https://lore.kernel.org/r/[email protected]
Tested-by: Serge Semin <[email protected]>
Tested-by: Manivannan Sadhasivam <[email protected]>
Signed-off-by: Frank Li <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Serge Semin <[email protected]>
Reviewed-by: Manivannan Sadhasivam <[email protected]>
Acked-By: Vinod Koul <[email protected]>
|
|
"struct dw_edma_chip" contains an internal structure "struct dw_edma" that
is used by the eDMA core internally and should not be touched by the eDMA
controller drivers themselves. But currently, the eDMA controller drivers
like "dw-edma-pci" allocate and populate this internal structure before
passing it on to the eDMA core. The eDMA core further populates the
structure and uses it. This is wrong!
Hence, move all the "struct dw_edma" specifics from controller drivers to
the eDMA core.
Link: https://lore.kernel.org/r/[email protected]
Tested-by: Serge Semin <[email protected]>
Tested-by: Manivannan Sadhasivam <[email protected]>
Signed-off-by: Frank Li <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Serge Semin <[email protected]>
Reviewed-by: Manivannan Sadhasivam <[email protected]>
Acked-By: Vinod Koul <[email protected]>
|
|
No conflicts.
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
|
|
This reverts commit 2bb2b7b57f81255c13f4395ea911d6bdc70c9fe2.
The testing of 5.19 release candidates revealed missing synchronization
between early and regular console functionality.
It would be possible to start the console kthreads later as a workaround.
But it is clear that console lock serialized console drivers between
each other. It opens a big area of possible problems that were not
considered by people involved in the development and review.
printk() is crucial for debugging kernel issues and console output is
a very important part of it. The number of consoles is huge and a proper
review would take some time. As a result it needs to be reverted for 5.19.
Link: https://lore.kernel.org/r/YrBdjVwBOVgLfHyb@alley
Signed-off-by: Petr Mladek <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
This reverts commit 09c5ba0aa2fcfdadb17d045c3ee6f86d69270df7.
This reverts commit b87f02307d3cfbda768520f0687c51ca77e14fc3.
The testing of 5.19 release candidates revealed missing synchronization
between early and regular console functionality.
It would be possible to start the console kthreads later as a workaround.
But it is clear that console lock serialized console drivers between
each other. It opens a big area of possible problems that were not
considered by people involved in the development and review.
printk() is crucial for debugging kernel issues and console output is
a very important part of it. The number of consoles is huge and a proper
review would take some time. As a result it needs to be reverted for 5.19.
Link: https://lore.kernel.org/r/YrBdjVwBOVgLfHyb@alley
Signed-off-by: Petr Mladek <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
This reverts commit 8e274732115f63c1d09136284431b3555bd5cc56.
The testing of 5.19 release candidates revealed missing synchronization
between early and regular console functionality.
It would be possible to start the console kthreads later as a workaround.
But it is clear that console lock serialized console drivers between
each other. It opens a big area of possible problems that were not
considered by people involved in the development and review.
printk() is crucial for debugging kernel issues and console output is
a very important part of it. The number of consoles is huge and a proper
review would take some time. As a result it needs to be reverted for 5.19.
Link: https://lore.kernel.org/r/YrBdjVwBOVgLfHyb@alley
Signed-off-by: Petr Mladek <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
This reverts commit b87f02307d3cfbda768520f0687c51ca77e14fc3.
The testing of 5.19 release candidates revealed missing synchronization
between early and regular console functionality.
It would be possible to start the console kthreads later as a workaround.
But it is clear that console lock serialized console drivers between
each other. It opens a big area of possible problems that were not
considered by people involved in the development and review.
printk() is crucial for debugging kernel issues and console output is
a very important part of it. The number of consoles is huge and a proper
review would take some time. As a result it needs to be reverted for 5.19.
Link: https://lore.kernel.org/r/YrBdjVwBOVgLfHyb@alley
Signed-off-by: Petr Mladek <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Adjust the values of NVME_CAP_CRMS_CRIMS and NVME_CAP_CRMS_CRWMS masks as
they are different from the ones in TP4084 - Time-to-ready.
Fixes: 354201c53e61 ("nvme: add support for TP4084 - Time-to-Ready Enhancements").
Signed-off-by: Joel Granados <[email protected]>
Reviewed-by: Keith Busch <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Reviewed-by: Chaitanya Kulkarni <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
|
|
Over the entire history of devm_clk_*unregister(), they were
used only once (*), in 2015. Remove them.
*) The commit 264e3b75de4e ("clk: s2mps11: Simplify s2mps11_clk_probe unwind
paths") exactly supports the point of the change proposed here.
Signed-off-by: Andy Shevchenko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Acked-by: Uwe Kleine-König <[email protected]>
Signed-off-by: Stephen Boyd <[email protected]>
|
|
The declaration was necessary until commit cc2d22477779 ("pwm: Drop
per-chip dbg_show callback").
Signed-off-by: Uwe Kleine-König <[email protected]>
Signed-off-by: Thierry Reding <[email protected]>
|
|
There is no cyclic dependency, so by reordering, the forward declaration
can be dropped.
Signed-off-by: Uwe Kleine-König <[email protected]>
Signed-off-by: Thierry Reding <[email protected]>
|
|
There are no drivers left providing the legacy callbacks. So drop
support for these.
If this commit breaks your out-of-tree pwm driver, look at e.g. commit
ec00cd5e63f0 ("pwm: renesas-tpu: Implement .apply() callback") for an
example of the needed conversion for your driver.
Signed-off-by: Uwe Kleine-König <[email protected]>
Signed-off-by: Thierry Reding <[email protected]>
|
|
'swiotlb_force' was removed by commit c6af2aa9ffc9 ("swiotlb: make
the swiotlb_init interface more useful").
Signed-off-by: Dongli Zhang <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
|
|
With the introduction of the Surface Laptop Studio, more event- and
target categories have been added. Therefore, increase the number of
reserved events and extend the enum of known target categories to
accommodate this.
Signed-off-by: Maximilian Luz <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Hans de Goede <[email protected]>
Signed-off-by: Hans de Goede <[email protected]>
|
|
__ct_user_enter/exit()
The context tracking namespace is going to expand and some new functions
will require even longer names. Start shrinking the context_tracking
prefix to "ct" as is already the case for some existing macros, this
will make the introduction of new functions easier.
Acked-by: Paul E. McKenney <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Neeraj Upadhyay <[email protected]>
Cc: Uladzislau Rezki <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Boqun Feng <[email protected]>
Cc: Nicolas Saenz Julienne <[email protected]>
Cc: Marcelo Tosatti <[email protected]>
Cc: Xiongfeng Wang <[email protected]>
Cc: Yu Liao <[email protected]>
Cc: Phil Auld <[email protected]>
Cc: Paul Gortmaker <[email protected]>
Cc: Alex Belits <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Reviewed-by: Nicolas Saenz Julienne <[email protected]>
Tested-by: Nicolas Saenz Julienne <[email protected]>
|
|
Currently, the RCU Tasks Trace grace-period kthread IPIs each online CPU
using smp_call_function_single() in order to track any tasks currently in
RCU Tasks Trace read-side critical sections during which the corresponding
task has neither blocked nor been preempted. These IPIs are annoying
and are also not strictly necessary because any task that blocks or is
preempted within its current RCU Tasks Trace read-side critical section
will be tracked on one of the per-CPU rcu_tasks_percpu structure's
->rtp_blkd_tasks list. So the only time that this is a problem is if
one of the CPUs runs through a long-duration RCU Tasks Trace read-side
critical section without a context switch.
Note that the task_call_func() function cannot help here because there is
no safe way to identify the target task. Of course, the task_call_func()
function will be very useful later, when processing the list of tasks,
but it needs to know the task.
This commit therefore creates a cpu_curr_snapshot() function that returns
a pointer to the task_struct structure of some task that happened to be
running on the specified CPU more or less during the time that the
cpu_curr_snapshot() function was executing. If there was no context
switch during this time, this function will return a pointer to the
task_struct structure of the task that was running throughout. If there
was a context switch, then the outgoing task will be taken care of by
RCU's context-switch hook, and the incoming task was either already taken
care of during some previous context switch, or it is not currently within an
RCU Tasks Trace read-side critical section. And in this latter case, the
grace period already started, so there is no need to wait on this task.
This new cpu_curr_snapshot() function is invoked on each CPU early in
the RCU Tasks Trace grace-period processing, and the resulting tasks
are queued for later quiescent-state inspection.
Signed-off-by: Paul E. McKenney <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Neeraj Upadhyay <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Martin KaFai Lau <[email protected]>
Cc: KP Singh <[email protected]>
|
|
dev_coredumpm
The dev_coredumpv() and dev_coredumpm() could not be used in atomic
context, because they call kvasprintf_const() and kstrdup() with
GFP_KERNEL parameter. The process is shown below:
dev_coredumpv(.., gfp_t gfp)
dev_coredumpm(.., gfp_t gfp)
dev_set_name
kobject_set_name_vargs
kvasprintf_const(GFP_KERNEL, ...); //may sleep
kstrdup(s, GFP_KERNEL); //may sleep
This patch removes gfp_t parameter of dev_coredumpv() and dev_coredumpm()
and changes the gfp_t parameter of kzalloc() in dev_coredumpm() to
GFP_KERNEL in order to show they could not be used in atomic context.
Fixes: 833c95456a70 ("device coredump: add new device coredump class")
Reviewed-by: Brian Norris <[email protected]>
Reviewed-by: Johannes Berg <[email protected]>
Signed-off-by: Duoming Zhou <[email protected]>
Link: https://lore.kernel.org/r/df72af3b1862bac7d8e793d1f3931857d3779dfd.1654569290.git.duoming@zju.edu.cn
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
The BPF core/verifier is hard-coded to permit mixing bpf2bpf and tail
calls for only x86-64. Change the logic to instead rely on a new weak
function 'bool bpf_jit_supports_subprog_tailcalls(void)', which a capable
JIT backend can override.
Update the x86-64 eBPF JIT to reflect this.
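
The weak-function pattern used here can be illustrated with a small stand-alone sketch. The demo_ names below are hypothetical, the -1 return value is a placeholder for a rejection, and `__attribute__((weak))` assumes a GCC or Clang toolchain.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Generic default. A capable backend would provide a non-weak definition
 * of the same function in its own file, and the linker would pick that
 * one instead of this fallback.
 */
__attribute__((weak)) bool demo_jit_supports_subprog_tailcalls(void)
{
	return false;
}

/*
 * Hypothetical verifier-side check: mixing subprograms and tail calls is
 * only accepted when the JIT backend says it can handle the combination.
 */
static int demo_check_mixing(bool has_subprogs, bool has_tail_calls)
{
	if (has_subprogs && has_tail_calls &&
	    !demo_jit_supports_subprog_tailcalls())
		return -1;	/* reject the program */

	return 0;
}
```

The benefit of this arrangement is that the generic code no longer needs to name any architecture: adding support for another JIT only requires that backend to override the function.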
Signed-off-by: Tony Ambardar <[email protected]>
[jakub: drop MIPS bits and tweak patch subject]
Signed-off-by: Jakub Sitnicki <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
There are some drivers that can use the Type C mux API, but don't have
to. Introduce CONFIG guards for the mux functions so that drivers can
include the header file and not run into compilation errors on systems
which don't have CONFIG_TYPEC enabled. When CONFIG_TYPEC is not enabled,
the Type C mux functions will be stub versions of the original calls.
Reported-by: kernel test robot <[email protected]>
Reviewed-by: Nícolas F. R. A. Prado <[email protected]>
Reviewed-by: Heikki Krogerus <[email protected]>
Reviewed-by: AngeloGioacchino Del Regno <[email protected]>
Tested-by: Nícolas F. R. A. Prado <[email protected]>
Signed-off-by: Prashant Malani <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Make sure skb_mac_header(), skb_mac_offset() and skb_mac_header_len() uses
are not fooled if the mac header has not been set.
These checks are enabled if CONFIG_DEBUG_NET=y
This commit will likely expose existing bugs in linux networking stacks.
Signed-off-by: Eric Dumazet <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Calls to `bpf_loop` are replaced with direct loops to avoid
indirection. E.g. the following:
bpf_loop(10, foo, NULL, 0);
is replaced by the equivalent of the following:
for (int i = 0; i < 10; ++i)
foo(i, NULL);
This transformation could be applied when:
- callback is known and does not change during program execution;
- flags passed to `bpf_loop` are always zero.
Inlining logic works as follows:
- During execution simulation, the function `update_loop_inline_state`
tracks the following information for each `bpf_loop` call
instruction:
- is callback known and constant?
- are flags constant and zero?
- Function `optimize_bpf_loop` increases stack depth for functions
where `bpf_loop` calls can be inlined and invokes `inline_bpf_loop`
to apply the inlining. The additional stack space is used to spill
registers R6, R7 and R8. These registers are used as loop counter,
loop maximal bound and callback context parameter;
Measurements using `benchs/run_bench_bpf_loop.sh` inside QEMU / KVM on
i7-4710HQ CPU show a drop in latency from 14 ns/op to 2 ns/op.
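
The equivalence between the helper call and the direct loop can be sketched in plain C. This is a user-space illustration, not verifier code: the demo_ names are hypothetical, and the early exit on a non-zero callback return as well as the returned iteration count are assumptions about the helper's semantics that the message above does not spell out.

```c
#include <assert.h>
#include <errno.h>

typedef long (*demo_loop_cb)(unsigned int index, void *ctx);

/* Stand-in for the helper: one indirect call per iteration. */
static long demo_loop(unsigned int nr_loops, demo_loop_cb cb, void *ctx,
		      unsigned long long flags)
{
	unsigned int i;

	if (flags)
		return -EINVAL;

	for (i = 0; i < nr_loops; i++) {
		if (cb(i, ctx))
			return (long)i + 1;	/* callback asked to stop */
	}
	return (long)i;
}

/* Example callback: accumulate the loop index into *ctx. */
static long demo_sum_cb(unsigned int index, void *ctx)
{
	*(long *)ctx += index;
	return 0;	/* keep going */
}

/* Version 1: go through the helper. */
static long demo_sum_via_helper(unsigned int n)
{
	long total = 0;

	demo_loop(n, demo_sum_cb, &total, 0);
	return total;
}

/* Version 2: what the inlined form amounts to, a direct loop. */
static long demo_sum_inlined(unsigned int n)
{
	long total = 0;
	unsigned int i;

	for (i = 0; i < n; i++) {
		if (demo_sum_cb(i, &total))
			break;
	}
	return total;
}
```

Both versions compute the same result; the second avoids the indirect call because the callback is known and the flags are zero, which are the two conditions listed above.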
Signed-off-by: Eduard Zingerman <[email protected]>
Acked-by: Song Liu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
This function is not used and CT_WARN_ON() coupled with ct_state() is
the preferred way to assert context tracking state values.
Reported-by: Nicolas Saenz Julienne <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Neeraj Upadhyay <[email protected]>
Cc: Uladzislau Rezki <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Boqun Feng <[email protected]>
Cc: Nicolas Saenz Julienne <[email protected]>
Cc: Marcelo Tosatti <[email protected]>
Cc: Xiongfeng Wang <[email protected]>
Cc: Yu Liao <[email protected]>
Cc: Phil Auld <[email protected]>
Cc: Paul Gortmaker <[email protected]>
Cc: Alex Belits <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
|
|
This commit places any task that has ever blocked within its current
RCU Tasks Trace read-side critical section on a per-CPU list within the
rcu_tasks_percpu structure. Tasks are removed from this list when they
exit by the exit_tasks_rcu_finish_trace() function. The purpose of this
commit is to provide the information needed to eliminate the current
scan of the full task list.
This commit offsets the INT_MIN value for ->trc_reader_nesting with the
new nesting level in order to avoid queueing tasks that are exiting
their read-side critical sections.
[ paulmck: Apply kernel test robot feedback. ]
[ paulmck: Apply feedback from [email protected] ]
Signed-off-by: Paul E. McKenney <[email protected]>
Tested-by: syzbot <[email protected]>
Tested-by: "Zhang, Qiang1" <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Neeraj Upadhyay <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Martin KaFai Lau <[email protected]>
Cc: KP Singh <[email protected]>
|
|
This commit adds fields to task_struct and to rcu_tasks_percpu that will
be used to avoid the task-list scan for RCU Tasks Trace grace periods,
and also initializes these fields.
Signed-off-by: Paul E. McKenney <[email protected]>
Cc: Neeraj Upadhyay <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Martin KaFai Lau <[email protected]>
Cc: KP Singh <[email protected]>
|