Diffstat (limited to 'Documentation/core-api')
-rw-r--r--  Documentation/core-api/cleanup.rst                     8
-rw-r--r--  Documentation/core-api/cpu_hotplug.rst                10
-rw-r--r--  Documentation/core-api/folio_queue.rst               212
-rw-r--r--  Documentation/core-api/index.rst                       3
-rw-r--r--  Documentation/core-api/memory-allocation.rst           5
-rw-r--r--  Documentation/core-api/packing.rst                    71
-rw-r--r--  Documentation/core-api/printk-formats.rst              4
-rw-r--r--  Documentation/core-api/unaligned-memory-access.rst     2
-rw-r--r--  Documentation/core-api/union_find.rst                106
9 files changed, 411 insertions, 10 deletions
diff --git a/Documentation/core-api/cleanup.rst b/Documentation/core-api/cleanup.rst
new file mode 100644
index 000000000000..527eb2f8ec6e
--- /dev/null
+++ b/Documentation/core-api/cleanup.rst
@@ -0,0 +1,8 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===========================
+Scope-based Cleanup Helpers
+===========================
+
+.. kernel-doc:: include/linux/cleanup.h
+ :doc: scope-based cleanup helpers
diff --git a/Documentation/core-api/cpu_hotplug.rst b/Documentation/core-api/cpu_hotplug.rst
index dcb0e379e5e8..a21dbf261be7 100644
--- a/Documentation/core-api/cpu_hotplug.rst
+++ b/Documentation/core-api/cpu_hotplug.rst
@@ -737,8 +737,9 @@ can process the event further.
When changes to the CPUs in the system occur, the sysfs file
/sys/devices/system/cpu/crash_hotplug contains '1' if the kernel
-updates the kdump capture kernel list of CPUs itself (via elfcorehdr),
-or '0' if userspace must update the kdump capture kernel list of CPUs.
+updates the kdump capture kernel list of CPUs itself (via elfcorehdr and
+other relevant kexec segments), or '0' if userspace must update the kdump
+capture kernel list of CPUs.
The availability depends on the CONFIG_HOTPLUG_CPU kernel configuration
option.
@@ -750,8 +751,9 @@ file can be used in a udev rule as follows:
SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
For a CPU hot un/plug event, if the architecture supports kernel updates
-of the elfcorehdr (which contains the list of CPUs), then the rule skips
-the unload-then-reload of the kdump capture kernel.
+of the elfcorehdr (which contains the list of CPUs) and other relevant
+kexec segments, then the rule skips the unload-then-reload of the kdump
+capture kernel.
Kernel Inline Documentations Reference
======================================
diff --git a/Documentation/core-api/folio_queue.rst b/Documentation/core-api/folio_queue.rst
new file mode 100644
index 000000000000..1fe7a9bc4b8d
--- /dev/null
+++ b/Documentation/core-api/folio_queue.rst
@@ -0,0 +1,212 @@
+.. SPDX-License-Identifier: GPL-2.0+
+
+===========
+Folio Queue
+===========
+
+:Author: David Howells <[email protected]>
+
+.. Contents:
+
+ * Overview
+ * Initialisation
+ * Adding and removing folios
+ * Querying information about a folio
+ * Querying information about a folio_queue
+ * Folio queue iteration
+ * Folio marks
+ * Lockless simultaneous production/consumption issues
+
+
+Overview
+========
+
+The folio_queue struct forms a single segment in a segmented list of folios
+that can be used to form an I/O buffer. As such, the list can be iterated over
+using the ITER_FOLIOQ iov_iter type.
+
+The publicly accessible members of the structure are::
+
+ struct folio_queue {
+ struct folio_queue *next;
+ struct folio_queue *prev;
+ ...
+ };
+
+A pair of pointers are provided, ``next`` and ``prev``, that point to the
+segments on either side of the segment being accessed. Whilst this is a
+doubly-linked list, it is intentionally not a circular list; the outward
+sibling pointers in terminal segments should be NULL.
+
+Each segment in the list also stores:
+
+ * an ordered sequence of folio pointers,
+ * the size of each folio and
+ * three 1-bit marks per folio,
+
+but these should not be accessed directly as the underlying data structure may
+change; the access functions outlined below should be used instead.
+
+The facility can be made accessible by::
+
+ #include <linux/folio_queue.h>
+
+and to use the iterator::
+
+ #include <linux/uio.h>
+
+
+Initialisation
+==============
+
+A segment should be initialised by calling::
+
+ void folioq_init(struct folio_queue *folioq);
+
+with a pointer to the segment to be initialised. Note that this will not
+necessarily initialise all the folio pointers, so care must be taken to check
+the number of folios added.
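+
+Since the list is not circular and segments are not linked together
+automatically, a caller needing more than one segment typically initialises a
+new one and splices it onto the tail itself. A rough sketch (``tail`` being the
+caller's current tail segment; kmalloc() is an illustrative allocation choice,
+not a requirement of this API)::
+
+    struct folio_queue *seg;
+
+    seg = kmalloc(sizeof(*seg), GFP_KERNEL);
+    if (!seg)
+        return -ENOMEM;
+    folioq_init(seg);
+
+    /* Link the new segment after the current tail. */
+    seg->prev = tail;
+    tail->next = seg;
+    tail = seg;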
+
+
+Adding and removing folios
+==========================
+
+Folios can be set in the next unused slot in a segment struct by calling one
+of::
+
+ unsigned int folioq_append(struct folio_queue *folioq,
+ struct folio *folio);
+
+ unsigned int folioq_append_mark(struct folio_queue *folioq,
+ struct folio *folio);
+
+Both functions update the stored folio count, store the folio and note its
+size. The second function also sets the first mark for the folio added. Both
+functions return the number of the slot used. [!] Note that no attempt is made
+to check that the capacity wasn't overrun and the list will not be extended
+automatically.
+
+A folio can be excised by calling::
+
+ void folioq_clear(struct folio_queue *folioq, unsigned int slot);
+
+This clears the slot in the array and also clears all the marks for that folio,
+but doesn't change the folio count - so future accesses of that slot must check
+if the slot is occupied.
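+
+Assuming ``fq`` is an initialised segment (as above), filling it until it runs
+out of slots might look like the following sketch (folio_alloc() is an
+illustrative source of folios, not a requirement of the API)::
+
+    struct folio *folio;
+    unsigned int slot;
+
+    while (!folioq_full(fq)) {
+        folio = folio_alloc(GFP_KERNEL, 0);
+        if (!folio)
+            break;
+        /* The returned value is the slot the folio was stored in. */
+        slot = folioq_append(fq, folio);
+    }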
+
+
+Querying information about a folio
+==================================
+
+Information about the folio in a particular slot may be queried by the
+following function::
+
+ struct folio *folioq_folio(const struct folio_queue *folioq,
+ unsigned int slot);
+
+If a folio has not yet been set in that slot, this may yield an undefined
+pointer. The size of the folio in a slot may be queried with either of::
+
+ unsigned int folioq_folio_order(const struct folio_queue *folioq,
+ unsigned int slot);
+
+ size_t folioq_folio_size(const struct folio_queue *folioq,
+ unsigned int slot);
+
+The first function returns the size as an order and the second as a number of
+bytes.
+
+
+Querying information about a folio_queue
+========================================
+
+Information may be retrieved about a particular segment with the following
+functions::
+
+ unsigned int folioq_nr_slots(const struct folio_queue *folioq);
+
+ unsigned int folioq_count(struct folio_queue *folioq);
+
+ bool folioq_full(struct folio_queue *folioq);
+
+The first function returns the maximum capacity of a segment. It should not be
+assumed that this is the same for every segment. The second returns the number
+of folios added to a segment and the third is a shorthand to indicate if the
+segment has been filled to capacity.
+
+Note that the count and fullness are not affected by clearing folios from the
+segment. These indicate how many slots in the array have been initialised; it
+is assumed that slots won't get reused, but rather that the segment will be
+discarded as the queue is consumed.
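+
+For instance, a consumer might tally up the data held in a segment ``fq``
+(assumed to have been initialised and filled as above) like this sketch; the
+NULL check reflects the fact that slots may have been cleared::
+
+    unsigned int slot;
+    size_t total = 0;
+
+    for (slot = 0; slot < folioq_count(fq); slot++) {
+        struct folio *folio = folioq_folio(fq, slot);
+
+        if (!folio)
+            continue;
+        total += folioq_folio_size(fq, slot);
+    }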
+
+
+Folio marks
+===========
+
+Folios within a queue can also have marks assigned to them. These marks can be
+used to note information such as whether a folio needs folio_put() called upon
+it. There are three marks available to be set for each folio.
+
+The marks can be set by::
+
+ void folioq_mark(struct folio_queue *folioq, unsigned int slot);
+ void folioq_mark2(struct folio_queue *folioq, unsigned int slot);
+ void folioq_mark3(struct folio_queue *folioq, unsigned int slot);
+
+Cleared by::
+
+ void folioq_unmark(struct folio_queue *folioq, unsigned int slot);
+ void folioq_unmark2(struct folio_queue *folioq, unsigned int slot);
+ void folioq_unmark3(struct folio_queue *folioq, unsigned int slot);
+
+And the marks can be queried by::
+
+ bool folioq_is_marked(const struct folio_queue *folioq, unsigned int slot);
+ bool folioq_is_marked2(const struct folio_queue *folioq, unsigned int slot);
+ bool folioq_is_marked3(const struct folio_queue *folioq, unsigned int slot);
+
+The marks can be used for any purpose and are not interpreted by this API.
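+
+For example, if the caller uses the first mark to record that a folio reference
+must be dropped (purely a caller-side convention), tearing down a segment
+``fq`` might look like the following sketch::
+
+    unsigned int slot;
+
+    for (slot = 0; slot < folioq_count(fq); slot++) {
+        struct folio *folio = folioq_folio(fq, slot);
+
+        if (folio && folioq_is_marked(fq, slot))
+            folio_put(folio);
+        folioq_clear(fq, slot);
+    }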
+
+
+Folio queue iteration
+=====================
+
+A list of segments may be iterated over with the I/O iterator facility, using
+an ``iov_iter`` iterator of ``ITER_FOLIOQ`` type. The iterator may be
+initialised with::
+
+ void iov_iter_folio_queue(struct iov_iter *i, unsigned int direction,
+ const struct folio_queue *folioq,
+ unsigned int first_slot, unsigned int offset,
+ size_t count);
+
+The iterator can be told to start at a particular segment, slot and offset
+within a queue. The iov_iter functions will follow the next pointers when
+advancing and the prev pointers when reverting, as needed.
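+
+A sketch of wrapping a segment ``fq`` for consumption follows. ``ITER_SOURCE``
+is assumed here because data is being copied out of the buffer; ``ITER_DEST``
+would be used when copying into it::
+
+    struct iov_iter iter;
+    char scratch[256];
+    size_t bytes = 4096;    /* Total bytes available in the queue. */
+    size_t copied;
+
+    /* Iterate over the whole queue, starting at slot 0, offset 0. */
+    iov_iter_folio_queue(&iter, ITER_SOURCE, fq, 0, 0, bytes);
+
+    /* The iterator can then be used as usual, e.g.: */
+    copied = copy_from_iter(scratch, sizeof(scratch), &iter);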
+
+
+Lockless simultaneous production/consumption issues
+===================================================
+
+If properly managed, the list can be extended by the producer at the head end
+and shortened by the consumer at the tail end simultaneously without the need
+to take locks. The ITER_FOLIOQ iterator inserts appropriate barriers to aid
+with this.
+
+Care must be taken when simultaneously producing and consuming a list. If the
+last segment is reached and the folios it refers to are entirely consumed by
+the IOV iterators, an iov_iter struct will be left pointing to the last segment
+with a slot number equal to the capacity of that segment. The iterator will
+try to continue on from this if there's another segment available when it is
+used again, but care must be taken in case the segment has been removed and
+freed by the consumer before the iterator is advanced.
+
+It is recommended that the queue always contain at least one segment, even if
+that segment has never been filled or is entirely spent. This prevents the
+head and tail pointers from collapsing.
+
+
+API Function Reference
+======================
+
+.. kernel-doc:: include/linux/folio_queue.h
diff --git a/Documentation/core-api/index.rst b/Documentation/core-api/index.rst
index f147854700e4..6a875743dd4b 100644
--- a/Documentation/core-api/index.rst
+++ b/Documentation/core-api/index.rst
@@ -35,7 +35,9 @@ Library functionality that is used throughout the kernel.
kobject
kref
+ cleanup
assoc_array
+ folio_queue
xarray
maple_tree
idr
@@ -49,6 +51,7 @@ Library functionality that is used throughout the kernel.
wrappers/atomic_t
wrappers/atomic_bitops
floating-point
+ union_find
Low level entry and exit
========================
diff --git a/Documentation/core-api/memory-allocation.rst b/Documentation/core-api/memory-allocation.rst
index 8b84eb4bdae7..0f19dd524323 100644
--- a/Documentation/core-api/memory-allocation.rst
+++ b/Documentation/core-api/memory-allocation.rst
@@ -45,8 +45,9 @@ here we briefly outline their recommended usage:
* If the allocation is performed from an atomic context, e.g interrupt
handler, use ``GFP_NOWAIT``. This flag prevents direct reclaim and
IO or filesystem operations. Consequently, under memory pressure
- ``GFP_NOWAIT`` allocation is likely to fail. Allocations which
- have a reasonable fallback should be using ``GFP_NOWARN``.
+ ``GFP_NOWAIT`` allocation is likely to fail. Users of this flag need
+ to provide a suitable fallback to cope with such failures where
+ appropriate.
* If you think that accessing memory reserves is justified and the kernel
will be stressed unless allocation succeeds, you may use ``GFP_ATOMIC``.
* Untrusted allocations triggered from userspace should be a subject
diff --git a/Documentation/core-api/packing.rst b/Documentation/core-api/packing.rst
index 3ed13bc9a195..821691f23c54 100644
--- a/Documentation/core-api/packing.rst
+++ b/Documentation/core-api/packing.rst
@@ -151,6 +151,77 @@ the more significant 4-byte word.
We always think of our offsets as if there were no quirk, and we translate
them afterwards, before accessing the memory region.
+Note on buffer lengths not multiple of 4
+----------------------------------------
+
+To deal with memory layout quirks where groups of 4 bytes are laid out "little
+endian" relative to each other, but "big endian" within the group itself, the
+concept of groups of 4 bytes is intrinsic to the packing API (this is separate
+from the memory access itself, which is performed byte by byte).
+
+With a buffer length that is not a multiple of 4, one group will be incomplete.
+Depending on the quirks, this may lead to discontinuities in the bit fields
+accessible through the buffer. The packing API assumes discontinuities were not
+the intention of the memory layout, so it avoids them by effectively logically
+shortening the most significant group of 4 octets to the number of octets
+actually available.
+
+An example with a 31 byte sized buffer is given below. Physical buffer offsets
+are implicit; they increase from left to right within a group, and from top to
+bottom from one group to the next.
+
+No quirks:
+
+::
+
+ 30 29 28 | Group 7 (most significant)
+ 27 26 25 24 | Group 6
+ 23 22 21 20 | Group 5
+ 19 18 17 16 | Group 4
+ 15 14 13 12 | Group 3
+ 11 10 9 8 | Group 2
+ 7 6 5 4 | Group 1
+ 3 2 1 0 | Group 0 (least significant)
+
+QUIRK_LSW32_IS_FIRST:
+
+::
+
+ 3 2 1 0 | Group 0 (least significant)
+ 7 6 5 4 | Group 1
+ 11 10 9 8 | Group 2
+ 15 14 13 12 | Group 3
+ 19 18 17 16 | Group 4
+ 23 22 21 20 | Group 5
+ 27 26 25 24 | Group 6
+ 30 29 28 | Group 7 (most significant)
+
+QUIRK_LITTLE_ENDIAN:
+
+::
+
+ 28 29 30 | Group 7 (most significant)
+ 24 25 26 27 | Group 6
+ 20 21 22 23 | Group 5
+ 16 17 18 19 | Group 4
+ 12 13 14 15 | Group 3
+ 8 9 10 11 | Group 2
+ 4 5 6 7 | Group 1
+ 0 1 2 3 | Group 0 (least significant)
+
+QUIRK_LITTLE_ENDIAN | QUIRK_LSW32_IS_FIRST:
+
+::
+
+ 0 1 2 3 | Group 0 (least significant)
+ 4 5 6 7 | Group 1
+ 8 9 10 11 | Group 2
+ 12 13 14 15 | Group 3
+ 16 17 18 19 | Group 4
+ 20 21 22 23 | Group 5
+ 24 25 26 27 | Group 6
+ 28 29 30 | Group 7 (most significant)
+
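+As a rough illustration, tying this to the generic packing() helper from
+<linux/packing.h> (the field position chosen here is arbitrary): logical byte
+30, i.e. bits 247-240 of the 31 byte buffer, lands at physical offset 28 under
+QUIRK_LSW32_IS_FIRST, as per the diagram above::
+
+    u8 buf[31] = {};
+    u64 val = 0xAA;
+    int err;
+
+    /* Pack an 8-bit field into logical bits 247..240 (logical byte 30). */
+    err = packing(buf, &val, 247, 240, sizeof(buf),
+                  PACK, QUIRK_LSW32_IS_FIRST);
+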
Intended use
------------
diff --git a/Documentation/core-api/printk-formats.rst b/Documentation/core-api/printk-formats.rst
index 4451ef501936..14e093da3ccd 100644
--- a/Documentation/core-api/printk-formats.rst
+++ b/Documentation/core-api/printk-formats.rst
@@ -576,13 +576,12 @@ The field width is passed by value, the bitmap is passed by reference.
Helper macros cpumask_pr_args() and nodemask_pr_args() are available to ease
printing cpumask and nodemask.
-Flags bitfields such as page flags, page_type, gfp_flags
+Flags bitfields such as page flags and gfp_flags
--------------------------------------------------------
::
%pGp 0x17ffffc0002036(referenced|uptodate|lru|active|private|node=0|zone=2|lastcpupid=0x1fffff)
- %pGt 0xffffff7f(buddy)
%pGg GFP_USER|GFP_DMA32|GFP_NOWARN
%pGv read|exec|mayread|maywrite|mayexec|denywrite
@@ -591,7 +590,6 @@ would construct the value. The type of flags is given by the third
character. Currently supported are:
- p - [p]age flags, expects value of type (``unsigned long *``)
- - t - page [t]ype, expects value of type (``unsigned int *``)
- v - [v]ma_flags, expects value of type (``unsigned long *``)
- g - [g]fp_flags, expects value of type (``gfp_t *``)
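+
+A minimal usage sketch (the ``page`` and ``gfp_flags`` variables are assumed to
+be in scope; only the remaining specifiers are shown)::
+
+    printk("%pGp\n", &page->flags);   /* page flags, unsigned long * */
+    printk("%pGg\n", &gfp_flags);     /* gfp flags, gfp_t * */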
diff --git a/Documentation/core-api/unaligned-memory-access.rst b/Documentation/core-api/unaligned-memory-access.rst
index 1ee82419d8aa..5ceeb80eb539 100644
--- a/Documentation/core-api/unaligned-memory-access.rst
+++ b/Documentation/core-api/unaligned-memory-access.rst
@@ -203,7 +203,7 @@ Avoiding unaligned accesses
===========================
The easiest way to avoid unaligned access is to use the get_unaligned() and
-put_unaligned() macros provided by the <asm/unaligned.h> header file.
+put_unaligned() macros provided by the <linux/unaligned.h> header file.
Going back to an earlier example of code that potentially causes unaligned
access::
diff --git a/Documentation/core-api/union_find.rst b/Documentation/core-api/union_find.rst
new file mode 100644
index 000000000000..6df8b94fdb5a
--- /dev/null
+++ b/Documentation/core-api/union_find.rst
@@ -0,0 +1,106 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+====================
+Union-Find in Linux
+====================
+
+
+:Date: June 21, 2024
+:Author: Xavier <[email protected]>
+
+What is union-find, and what is it used for?
+------------------------------------------------
+
+Union-find is a data structure used to handle the merging and querying
+of disjoint sets. The primary operations supported by union-find are:
+
+ Initialization: Resetting each element as an individual set, with
+ each set's initial parent node pointing to itself.
+
+ Find: Determine which set a particular element belongs to, usually by
+ returning a “representative element” of that set. This operation
+ is used to check if two elements are in the same set.
+
+ Union: Merge two sets into one.
+
+As a data structure used to maintain sets (groups), union-find is commonly
+utilized to solve problems related to offline queries, dynamic connectivity,
+and graph theory. It is also a key component in Kruskal's algorithm for
+computing the minimum spanning tree, which is crucial in scenarios like
+network routing. Consequently, union-find is widely referenced. Additionally,
+union-find has applications in symbolic computation, register allocation,
+and more.
+
+Space Complexity: O(n), where n is the number of nodes.
+
+Time Complexity: Using path compression can reduce the time complexity of
+the find operation, and using union by rank can reduce the time complexity
+of the union operation. These optimizations reduce the average time
+complexity of each find and union operation to O(α(n)), where α(n) is the
+inverse Ackermann function. This can be roughly considered a constant time
+complexity for practical purposes.
+
+This document covers use of the Linux union-find implementation. For more
+information on the nature and implementation of union-find, see:
+
+ Wikipedia entry on union-find
+ https://en.wikipedia.org/wiki/Disjoint-set_data_structure
+
+Linux implementation of union-find
+-----------------------------------
+
+Linux's union-find implementation resides in the file "lib/union_find.c".
+To use it, "#include <linux/union_find.h>".
+
+The union-find data structure is defined as follows::
+
+ struct uf_node {
+ struct uf_node *parent;
+ unsigned int rank;
+ };
+
+In this structure, parent points to the parent node of the current node.
+The rank field represents the height of the current tree. During a union
+operation, the tree with the smaller rank is attached under the tree with the
+larger rank to maintain balance.
+
+Initializing union-find
+-----------------------
+
+A node can be initialized either statically or via the initialization
+interface. Either way, the parent pointer is set to point to the node itself
+and the rank to 0.
+Example::
+
+ struct uf_node my_node = UF_INIT_NODE(my_node);
+
+or
+
+ uf_node_init(&my_node);
+
+Find the Root Node of union-find
+--------------------------------
+
+This operation is mainly used to determine whether two nodes belong to the same
+set in the union-find. If they have the same root, they are in the same set.
+During the find operation, path compression is performed to improve the
+efficiency of subsequent find operations.
+Example::
+
+ int connected;
+ struct uf_node *root1 = uf_find(&node_1);
+ struct uf_node *root2 = uf_find(&node_2);
+ if (root1 == root2)
+ connected = 1;
+ else
+ connected = 0;
+
+Union Two Sets in union-find
+----------------------------
+
+To union two sets in the union-find, uf_union() first finds their respective
+root nodes and then links the root with the smaller rank under the root with
+the larger rank.
+Example::
+
+ uf_union(&node_1, &node_2);
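+
+Putting it together, a small connectivity check over three nodes might look
+like the following sketch (node names are illustrative)::
+
+    struct uf_node a = UF_INIT_NODE(a);
+    struct uf_node b = UF_INIT_NODE(b);
+    struct uf_node c = UF_INIT_NODE(c);
+    bool connected;
+
+    uf_union(&a, &b);    /* sets: {a, b} {c} */
+    uf_union(&b, &c);    /* sets: {a, b, c} */
+
+    connected = (uf_find(&a) == uf_find(&c));    /* true */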