| Age | Commit message (Collapse) | Author | Files | Lines |
|
FGP_ENTRY is unused now, so remove it.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Christoph Hellwig <[email protected]>
Cc: Andreas Gruenbacher <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Ryusuke Konishi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
mapping_get_entry is useful for page cache API users that need to know
about xa_value internals. Rename it and make it available in pagemap.h.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Matthew Wilcox (Oracle) <[email protected]>
Cc: Andreas Gruenbacher <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Ryusuke Konishi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
This patch adds support for platform-specific reset logic in the
stmmac driver. Some SoCs require a different reset mechanism than
the standard dwmac IP reset. To support these platforms, a new function
pointer 'fix_soc_reset' is added to the plat_stmmacenet_data structure.
The stmmac_reset in hwif.h is modified to call the 'fix_soc_reset'
function if it exists. This enables the driver to use the platform-specific
reset logic when necessary.
Signed-off-by: Shenwei Wang <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Use the maple tree in RCU mode for VMA tracking.
The maple tree tracks the stack and is able to update the pivot
(lower/upper boundary) in-place to allow the page fault handler to write
to the tree while holding just the mmap read lock. This is safe as the
writes to the stack have a guard VMA which ensures there will always be a
NULL in the direction of the growth and thus will only update a pivot.
It is possible, but not recommended, to have VMAs that grow up/down
without guard VMAs. syzbot has constructed a testcase which sets up a VMA
to grow and consume the empty space. Overwriting the entire NULL entry
causes the tree to be altered in a way that is not safe for concurrent
readers; the readers may see a node being rewritten or one that does not
match the maple state they are using.
Enabling RCU mode allows the concurrent readers to see a stable node and
will return the expected result.
[[email protected]: we don't need to free the nodes with RCU[
Link: https://lore.kernel.org/linux-mm/[email protected]/
Link: https://lkml.kernel.org/r/[email protected]
Fixes: d4af56c5c7c6 ("mm: start tracking VMAs with maple tree")
Signed-off-by: Liam R. Howlett <[email protected]>
Signed-off-by: Suren Baghdasaryan <[email protected]>
Reported-by: [email protected]
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Merge series from William Breathitt Gray <[email protected]>:
The regmap API supports IO port accessors so we can take advantage of
regmap abstractions rather than handling access to the device registers
directly in the driver.
A patch to pass irq_drv_data as a parameter for struct regmap_irq_chip
set_type_config() is included. This is needed by the
idio_24_set_type_config() and ws16c48_set_type_config() callbacks in
order to update the type configuration on their respective devices.
|
|
Refactor pci_bus_for_each_resource() in the same way as
pci_dev_for_each_resource(). This allows the index to be hidden inside the
implementation so the caller can omit it when it's not used otherwise.
No functional changes intended.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Andy Shevchenko <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Krzysztof Wilczyński <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
|
|
All users of these fields have been removed.
They are now computed when needed with [mn]shift and [mn]width.
This shrinks the size of struct clk_fractional_divider from 72 to 56 bytes.
Signed-off-by: Christophe JAILLET <[email protected]>
Link: https://lore.kernel.org/r/680357e5acb338433bfc94114b65b4a4ce2c99e2.1680423909.git.christophe.jaillet@wanadoo.fr
Reviewed-by: Heiko Stuebner <[email protected]>
Signed-off-by: Stephen Boyd <[email protected]>
|
|
Provide a module_nvmem_layout_driver() macro at the end of the
nvmem-provider.h header to reduce the boilerplate when registering nvmem
layout drivers.
Suggested-by: Srinivas Kandagatla <[email protected]>
Signed-off-by: Miquel Raynal <[email protected]>
Acked-by: Rafał Miłecki <[email protected]>
Signed-off-by: Srinivas Kandagatla <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Callback .read_post_process() is designed to modify raw cell content
before providing it to the consumer. So far we were dealing with
modifications that didn't affect cell size (length). In some cases
however cell content needs to be reformatted and resized.
It's required e.g. to provide properly formatted MAC address in case
it's stored in a non-binary format (e.g. using ASCII).
There were few discussions how to optimally handle that. Following
possible solutions were considered:
1. Allow .read_post_process() to realloc (resize) content buffer
2. Allow .read_post_process() to adjust (decrease) just buffer length
3. Register NVMEM cells using post-read sizes
The preferred solution was the last one. The problem is that simply
adjusting "bytes" in NVMEM providers would result in core code NOT
passing whole raw data to .read_post_process() callbacks. It means
callback functions couldn't do their job without somehow manually
reading original cell content on their own.
This patch deals with that by registering NVMEM cells with both lengths:
raw content one and post read one. It allows:
1. Core code to read whole raw cell content
2. Callbacks to return content they want
Signed-off-by: Rafał Miłecki <[email protected]>
Signed-off-by: Srinivas Kandagatla <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
It doesn't make any more sense to have a opaque pointer set up by the
nvmem device. Usually, the layout isn't associated with a particular
nvmem device. Instead, let the caller who set the post process callback
provide the priv pointer.
Signed-off-by: Michael Walle <[email protected]>
Signed-off-by: Miquel Raynal <[email protected]>
Signed-off-by: Srinivas Kandagatla <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
There are no users anymore for the global cell_post_process callback
anymore. New users should use proper nvmem layouts.
Signed-off-by: Michael Walle <[email protected]>
Signed-off-by: Miquel Raynal <[email protected]>
Signed-off-by: Srinivas Kandagatla <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Provide a way to modify a cell before it will get added. This is useful
to attach a custom post processing hook via a layout.
Signed-off-by: Michael Walle <[email protected]>
Signed-off-by: Miquel Raynal <[email protected]>
Signed-off-by: Srinivas Kandagatla <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Instead of relying on the name the consumer is using for the cell, like
it is done for the nvmem .cell_post_process configuration parameter,
provide a per-cell post processing hook. This can then be populated by
the NVMEM provider (or the NVMEM layout) when adding the cell.
Signed-off-by: Michael Walle <[email protected]>
Signed-off-by: Miquel Raynal <[email protected]>
Signed-off-by: Srinivas Kandagatla <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
NVMEM layouts are used to generate NVMEM cells during runtime. Think of
an EEPROM with a well-defined conent. For now, the content can be
described by a device tree or a board file. But this only works if the
offsets and lengths are static and don't change. One could also argue
that putting the layout of the EEPROM in the device tree is the wrong
place. Instead, the device tree should just have a specific compatible
string.
Right now there are two use cases:
(1) The NVMEM cell needs special processing. E.g. if it only specifies
a base MAC address offset and you need to add an offset, or it
needs to parse a MAC from ASCII format or some proprietary format.
(Post processing of cells is added in a later commit).
(2) u-boot environment parsing. The cells don't have a particular
offset but it needs parsing the content to determine the offsets
and length.
Co-developed-by: Miquel Raynal <[email protected]>
Signed-off-by: Miquel Raynal <[email protected]>
Signed-off-by: Michael Walle <[email protected]>
Signed-off-by: Srinivas Kandagatla <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
A new helper has been introduced, of_request_module(). Users have been
converted, this helper can now be deleted.
Signed-off-by: Miquel Raynal <[email protected]>
Reviewed-by: Rob Herring <[email protected]>
Signed-off-by: Srinivas Kandagatla <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Depending on device.c for pure OF handling is considered
backwards. Let's extract the content of of_device_request_module() to
have the real logic under module.c.
The next step will be to convert users of of_device_request_module() to
use the new helper.
Signed-off-by: Miquel Raynal <[email protected]>
Reviewed-by: Rob Herring <[email protected]>
Signed-off-by: Srinivas Kandagatla <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Create a specific .c file for OF related module handling.
Move of_modalias() inside as a first step.
The helper is exposed through of.h even though it is only used by core
files because the users from device.c will soon be split into an OF-only
helper in module.c as well as a device-oriented inline helper in
of_device.h. Putting this helper in of_private.h would require to
include of_private.h from of_device.h, which is not acceptable.
Suggested-by: Rob Herring <[email protected]>
Signed-off-by: Miquel Raynal <[email protected]>
Reviewed-by: Rob Herring <[email protected]>
Signed-off-by: Srinivas Kandagatla <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
This helper does not produce a real modalias, but tries to get the
"product" compatible part of the "vendor,product" compatibles only. It
is far from creating a purely useful modalias string and does not seem
to be used like that directly anyway, so let's try to give this helper a
more meaningful name before moving there a real modalias helper (already
existing under of/device.c).
Also update the various documentations to refer to the strings as
"aliases" rather than "modaliases" which has a real meaning in the Linux
kernel.
There is no functional change.
Cc: Rafael J. Wysocki <[email protected]>
Cc: Len Brown <[email protected]>
Cc: Maarten Lankhorst <[email protected]>
Cc: Maxime Ripard <[email protected]>
Cc: Thomas Zimmermann <[email protected]>
Cc: Sebastian Reichel <[email protected]>
Cc: Wolfram Sang <[email protected]>
Cc: Mark Brown <[email protected]>
Signed-off-by: Miquel Raynal <[email protected]>
Reviewed-by: Rob Herring <[email protected]>
Acked-by: Mark Brown <[email protected]>
Signed-off-by: Srinivas Kandagatla <[email protected]>
Acked-by: Sebastian Reichel <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Allow the struct regmap_irq_chip set_type_config() callback to access
irq_drv_data by passing it as a parameter.
Signed-off-by: William Breathitt Gray <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Link: https://lore.kernel.org/r/20e15cd3afae80922b7e0577c7741df86b3390c5.1680708357.git.william.gray@linaro.org
Signed-off-by: Mark Brown <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
- Fix timerlat notification, as it was not triggering the notify to
users when a new max latency was hit.
- Do not trigger max latency if the tracing is off.
When tracing is off, the ring buffer is not updated, it does not make
sense to notify when there's a new max latency detected by the
tracer, as why that latency happened is not available. The tracing
logic still runs when the ring buffer is disabled, but it should not
be triggering notifications.
- Fix race on freeing the synthetic event "last_cmd" variable by adding
a mutex around it.
- Fix race between reader and writer of the ring buffer by adding
memory barriers. When the writer is still on the reader page it must
have its content visible on the buffer before it moves the commit
index that the reader uses to know how much content is on the page.
- Make get_lock_parent_ip() always inlined, as it uses _THIS_IP_ and
_RET_IP_, which gets broken if it is not inlined.
- Make __field(int, arr[5]) in a TRACE_EVENT() macro fail to build.
The field formats of trace events are calculated by using
sizeof(type) and other means by what is passed into the structure
macros like __field(). The __field() macro is only meant for atom
types like int, long, short, pointer, etc. It is not meant for
arrays.
The code will currently compile with arrays, but then the format
produced will be inaccurate, and user space parsing tools will break.
Two bugs have already been fixed, now add code that will make the
kernel fail to build if another trace event includes this buggy field
format.
- Fix boot up snapshot code:
Boot snapshots were triggering when not even asked for on the kernel
command line. This was caused by two bugs:
1) It would trigger a snapshot on any instance if one was created
from the kernel command line.
2) The error handling would only affect the top level instance.
So the fact that a snapshot was done on a instance that didn't
allocate a buffer triggered a warning written into the top level
buffer, and worse yet, disabled the top level buffer.
- Fix memory leak that was caused when an error was logged in a trace
buffer instance, and then the buffer instance was removed.
The allocated error log messages still needed to be freed.
* tag 'trace-v6.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: Free error logs of tracing instances
tracing: Fix ftrace_boot_snapshot command line logic
tracing: Have tracing_snapshot_instance_cond() write errors to the appropriate instance
tracing: Error if a trace event has an array for a __field()
tracing/osnoise: Fix notify new tracing_max_latency
tracing/timerlat: Notify new max thread latency
ftrace: Mark get_lock_parent_ip() __always_inline
ring-buffer: Fix race while reader and writer are on the same page
tracing/synthetic: Fix races on freeing last_cmd
|
|
'rcu/staging-kfree', remote-tracking branches 'paul/srcu-cf.2023.04.04a', 'fbq/rcu/lockdep.2023.03.27a' and 'fbq/rcu/rcutorture.2023.03.20a' into rcu/staging
|
|
For CONFIG_NO_HZ_FULL systems, the tick_do_timer_cpu cannot be offlined.
However, cpu_is_hotpluggable() still returns true for those CPUs. This causes
torture tests that do offlining to end up trying to offline this CPU causing
test failures. Such failure happens on all architectures.
Fix the repeated error messages thrown by this (even if the hotplug errors are
harmless) by asking the opinion of the nohz subsystem on whether the CPU can be
hotplugged.
[ Apply Frederic Weisbecker feedback on refactoring tick_nohz_cpu_down(). ]
For drivers/base/ portion:
Acked-by: Greg Kroah-Hartman <[email protected]>
Acked-by: Frederic Weisbecker <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: "Paul E. McKenney" <[email protected]>
Cc: Zhouyi Zhou <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: rcu <[email protected]>
Cc: [email protected]
Fixes: 2987557f52b9 ("driver-core/cpu: Expose hotpluggability to the rest of the kernel")
Signed-off-by: Paul E. McKenney <[email protected]>
Signed-off-by: Joel Fernandes (Google) <[email protected]>
|
|
The SRCU_SIZE_* names are not self-explanatory, so this commit therefore
adds comments to the definitions.
Signed-off-by: Pingfan Liu <[email protected]>
Cc: Lai Jiangshan <[email protected]>
Cc: "Paul E. McKenney" <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Mathieu Desnoyers <[email protected]>
Cc: "Zhang, Qiang1" <[email protected]>
To: [email protected]
Reviewed-by: Paul E. McKenney <[email protected]>
Reviewed-by: Joel Fernandes (Google) <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Signed-off-by: Joel Fernandes (Google) <[email protected]>
|
|
It returns following attributes:
locking range start
locking range length
read lock enabled
write lock enabled
lock state (RW, RO or LK)
It can be retrieved by user authority provided the authority
was added to locking range via prior IOC_OPAL_ADD_USR_TO_LR
ioctl command. The command was extended to add user in ACE that
allows to read attributes listed above.
Signed-off-by: Ondrej Kozina <[email protected]>
Tested-by: Luca Boccassi <[email protected]>
Tested-by: Milan Broz <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
Now that the link array is deallocated when destroying nodes and the
explicit link removal has been dropped from the exynos driver there are
no further users of and no need for the icc_link_destroy() interface.
Signed-off-by: Johan Hovold <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Georgi Djakov <[email protected]>
|
|
PSI offers 2 mechanisms to get information about a specific resource
pressure. One is reading from /proc/pressure/<resource>, which gives
average pressures aggregated every 2s. The other is creating a pollable
fd for a specific resource and cgroup.
The trigger creation requires CAP_SYS_RESOURCE, and gives the
possibility to pick specific time window and threshold, spawing an RT
thread to aggregate the data.
Systemd would like to provide containers the option to monitor pressure
on their own cgroup and sub-cgroups. For example, if systemd launches a
container that itself then launches services, the container should have
the ability to poll() for pressure in individual services. But neither
the container nor the services are privileged.
This patch implements a mechanism to allow unprivileged users to create
pressure triggers. The difference with privileged triggers creation is
that unprivileged ones must have a time window that's a multiple of 2s.
This is so that we can avoid unrestricted spawning of rt threads, and
use instead the same aggregation mechanism done for the averages, which
runs independently of any triggers.
Suggested-by: Johannes Weiner <[email protected]>
Signed-off-by: Domenico Cerasuolo <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Johannes Weiner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Renaming in PSI implementation to make a clear distinction between
privileged and unprivileged triggers code to be implemented in the
next patch.
Suggested-by: Johannes Weiner <[email protected]>
Signed-off-by: Domenico Cerasuolo <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Johannes Weiner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
btf_nested_type_is_trusted() tries to find a struct member at corresponding offset.
It works for flat structures and falls apart in more complex structs with nested structs.
The offset->member search is already performed by btf_struct_walk() including nested structs.
Reuse this work and pass {field name, field btf id} into btf_nested_type_is_trusted()
instead of offset to make BTF_TYPE_SAFE*() logic more robust.
Signed-off-by: Alexei Starovoitov <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
Acked-by: David Vernet <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Remove unused arguments from btf_struct_access() callback.
Signed-off-by: Alexei Starovoitov <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
Acked-by: David Vernet <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Pull kvm fixes from Paolo Bonzini:
"PPC:
- Hide KVM_CAP_IRQFD_RESAMPLE if XIVE is enabled
s390:
- Fix handling of external interrupts in protected guests
x86:
- Resample the pending state of IOAPIC interrupts when unmasking them
- Fix usage of Hyper-V "enlightened TLB" on AMD
- Small fixes to real mode exceptions
- Suppress pending MMIO write exits if emulator detects exception
Documentation:
- Fix rST syntax"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
docs: kvm: x86: Fix broken field list
KVM: PPC: Make KVM_CAP_IRQFD_RESAMPLE platform dependent
KVM: s390: pv: fix external interruption loop not always detected
KVM: nVMX: Do not report error code when synthesizing VM-Exit from Real Mode
KVM: x86: Clear "has_error_code", not "error_code", for RM exception injection
KVM: x86: Suppress pending MMIO write exits if emulator detects exception
KVM: x86/ioapic: Resample the pending state of an IRQ when unmasking
KVM: irqfd: Make resampler_list an RCU list
KVM: SVM: Flush Hyper-V TLB when required
|
|
There might be confusion about why pci_bus_for_each_resource() uses
Logical OR. Document the entire macro and explain how it works and why the
conditional needs to be like that.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Andy Shevchenko <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
|
|
Instead of open-coding it everywhere introduce a tiny helper that can be
used to iterate over each resource of a PCI device, and convert the most
obvious users into it.
While at it drop doubled empty line before pdev_sort_resources().
No functional changes intended.
Suggested-by: Andy Shevchenko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mika Westerberg <[email protected]>
Signed-off-by: Andy Shevchenko <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Krzysztof Wilczyński <[email protected]>
|
|
Introduce pci_resource_n() and replace open-coded implementations of it
in pci.h.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Andy Shevchenko <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
|
|
This commit moves the ->reschedule_jiffies, ->reschedule_count, and
->work fields from the srcu_struct structure to the srcu_usage structure
to reduce the size of the former in order to improve cache locality.
However, this means that the container_of() calls cannot get a pointer
to the srcu_struct because they are no longer in the srcu_struct.
This issue is addressed by adding a ->srcu_ssp field in the srcu_usage
structure that references the corresponding srcu_struct structure.
And given the presence of the sup pointer to the srcu_usage structure,
replace some ssp->srcu_usage-> instances with sup->.
[ paulmck Apply feedback from kernel test robot. ]
Link: https://lore.kernel.org/oe-kbuild-all/[email protected]/
Suggested-by: Christoph Hellwig <[email protected]>
Tested-by: Sachin Sant <[email protected]>
Tested-by: "Zhang, Qiang1" <[email protected]>
Tested-by: Joel Fernandes (Google) <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
|
|
This commit moves the ->srcu_barrier_seq, ->srcu_barrier_mutex,
->srcu_barrier_completion, and ->srcu_barrier_cpu_cnt fields from the
srcu_struct structure to the srcu_usage structure to reduce the size of
the former in order to improve cache locality.
Suggested-by: Christoph Hellwig <[email protected]>
Tested-by: Sachin Sant <[email protected]>
Tested-by: "Zhang, Qiang1" <[email protected]>
Tested-by: Joel Fernandes (Google) <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
|
|
This commit moves the ->sda_is_static field from the srcu_struct structure
to the srcu_usage structure to reduce the size of the former in order
to improve cache locality.
Suggested-by: Christoph Hellwig <[email protected]>
Tested-by: Sachin Sant <[email protected]>
Tested-by: "Zhang, Qiang1" <[email protected]>
Tested-by: Joel Fernandes (Google) <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
|
|
This commit moves the ->srcu_size_jiffies, ->srcu_n_lock_retries,
and ->srcu_n_exp_nodelay fields from the srcu_struct structure to the
srcu_usage structure to reduce the size of the former in order to improve
cache locality.
Suggested-by: Christoph Hellwig <[email protected]>
Tested-by: Sachin Sant <[email protected]>
Tested-by: "Zhang, Qiang1" <[email protected]>
Tested-by: Joel Fernandes (Google) <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
|
|
This commit moves the ->srcu_gp_seq, ->srcu_gp_seq_needed,
->srcu_gp_seq_needed_exp, ->srcu_gp_start, and ->srcu_last_gp_end fields
from the srcu_struct structure to the srcu_usage structure to reduce
the size of the former in order to improve cache locality.
Suggested-by: Christoph Hellwig <[email protected]>
Tested-by: Sachin Sant <[email protected]>
Tested-by: "Zhang, Qiang1" <[email protected]>
Tested-by: Joel Fernandes (Google) <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
|
|
This commit moves the ->srcu_gp_mutex field from the srcu_struct structure
to the srcu_usage structure to reduce the size of the former in order
to improve cache locality.
Suggested-by: Christoph Hellwig <[email protected]>
Tested-by: Sachin Sant <[email protected]>
Tested-by: "Zhang, Qiang1" <[email protected]>
Tested-by: Joel Fernandes (Google) <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
|
|
This commit moves the ->lock field from the srcu_struct structure to
the srcu_usage structure to reduce the size of the former in order to
improve cache locality.
Suggested-by: Christoph Hellwig <[email protected]>
Tested-by: Sachin Sant <[email protected]>
Tested-by: "Zhang, Qiang1" <[email protected]>
Tested-by: Joel Fernandes (Google) <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
|
|
This commit moves the ->srcu_cb_mutex field from the srcu_struct structure
to the srcu_usage structure to reduce the size of the former in order
to improve cache locality.
Suggested-by: Christoph Hellwig <[email protected]>
Tested-by: Sachin Sant <[email protected]>
Tested-by: "Zhang, Qiang1" <[email protected]>
Tested-by: Joel Fernandes (Google) <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
|
|
This commit moves the ->srcu_size_state field from the srcu_struct
structure to the srcu_usage structure to reduce the size of the former
in order to improve cache locality.
Suggested-by: Christoph Hellwig <[email protected]>
Tested-by: Sachin Sant <[email protected]>
Tested-by: "Zhang, Qiang1" <[email protected]>
Tested-by: Joel Fernandes (Google) <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
|
|
This commit moves the ->level[] array from the srcu_struct structure to
the srcu_usage structure to reduce the size of the former in order to
improve cache locality.
Suggested-by: Christoph Hellwig <[email protected]>
Tested-by: Sachin Sant <[email protected]>
Tested-by: "Zhang, Qiang1" <[email protected]>
Tested-by: Joel Fernandes (Google) <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
|
|
The current srcu_struct structure is on the order of 200 bytes in size
(depending on architecture and .config), which is much better than the
old-style 26K bytes, but still all too inconvenient when one is trying
to achieve good cache locality on a fastpath involving SRCU readers.
However, only a few fields in srcu_struct are used by SRCU readers.
The remaining fields could be offloaded to a new srcu_update
structure, thus shrinking the srcu_struct structure down to a few
tens of bytes. This commit begins this noble quest, a quest that is
complicated by open-coded initialization of the srcu_struct within the
srcu_notifier_head structure. This complication is addressed by updating
the srcu_notifier_head structure's open coding, given that there does
not appear to be a straightforward way of abstracting that initialization.
This commit moves only the ->node pointer to srcu_update. Later commits
will move additional fields.
[ paulmck: Fold in [email protected]'s memory-leak fix. ]
Link: https://lore.kernel.org/all/[email protected]/
Suggested-by: Christoph Hellwig <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: "Michał Mirosław" <[email protected]>
Cc: Dmitry Osipenko <[email protected]>
Tested-by: Sachin Sant <[email protected]>
Tested-by: "Zhang, Qiang1" <[email protected]>
Acked-by: Rafael J. Wysocki <[email protected]>
Tested-by: Joel Fernandes (Google) <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
|
|
Further shrinking the srcu_struct structure is eased by requiring
that in-module srcu_struct structures rely more heavily on static
initialization. In particular, this preserves the property that
a module-load-time srcu_struct initialization can fail only due
to memory-allocation failure of the per-CPU srcu_data structures.
It might also slightly improve robustness by keeping the number of memory
allocations that must succeed down percpu_alloc() call.
This is in preparation for splitting an srcu_usage structure out
of the srcu_struct structure.
[ paulmck: Fold in [email protected] feedback. ]
Cc: Christoph Hellwig <[email protected]>
Tested-by: Sachin Sant <[email protected]>
Tested-by: "Zhang, Qiang1" <[email protected]>
Tested-by: Joel Fernandes (Google) <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
|
|
This is a whitespace-only commit with no change in functionality.
Its purpose is to prepare for later commits that: (1) Cause statically
allocated srcu_struct structures to rely on compile-time initialization
and (2) Move fields from the srcu_struct structure to a new srcu_usage
structure.
Cc: Christoph Hellwig <[email protected]>
Tested-by: Sachin Sant <[email protected]>
Tested-by: "Zhang, Qiang1" <[email protected]>
Tested-by: Joel Fernandes (Google) <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
|
|
The number of entries in the rsrc node cache is limited to 512, which
still seems unnecessarily large. Add per cache thresholds and set to
to 32 for the rsrc node cache.
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/d0cd538b944dac0bf878e276fc0199f21e6bccea.1680576071.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>
|
|
Add allocation cache for struct io_rsrc_node, it's always allocated and
put under ->uring_lock, so it doesn't need any extra synchronisation
around caches.
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/252a9d9ef9654e6467af30fdc02f57c0118fb76e.1680576071.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>
|
|
struct delayed_work rsrc_put_work was previously used to offload node
freeing because io_rsrc_node_ref_zero() was previously called by RCU in
the IRQ context. Now, as percpu refcounting is gone, we can do it
eagerly at the spot without pushing it to a worker.
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/13fb1aac1e8d068ad8fd4a0c6d0d157ab61b90c0.1680576071.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>
|
|
We use ->rsrc_ref_lock spinlock to protect ->rsrc_ref_list in
io_rsrc_node_ref_zero(). Now we removed pcpu refcounting, which means
io_rsrc_node_ref_zero() is not executed from the irq context as an RCU
callback anymore, and we also put it under ->uring_lock.
io_rsrc_node_switch(), which queues up nodes into the list, is also
protected by ->uring_lock, so we can safely get rid of ->rsrc_ref_lock.
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/6b60af883c263551190b526a55ff2c9d5ae07141.1680576071.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>
|