diff options
Diffstat (limited to 'Documentation')
18 files changed, 333 insertions, 38 deletions
diff --git a/Documentation/ABI/testing/securityfs-secrets-coco b/Documentation/ABI/testing/securityfs-secrets-coco new file mode 100644 index 000000000000..f2b6909155f9 --- /dev/null +++ b/Documentation/ABI/testing/securityfs-secrets-coco @@ -0,0 +1,51 @@ +What: security/secrets/coco +Date: February 2022 +Contact: Dov Murik <dovmurik@linux.ibm.com> +Description: + Exposes confidential computing (coco) EFI secrets to + userspace via securityfs. + + EFI can declare memory area used by confidential computing + platforms (such as AMD SEV and SEV-ES) for secret injection by + the Guest Owner during VM's launch. The secrets are encrypted + by the Guest Owner and decrypted inside the trusted enclave, + and therefore are not readable by the untrusted host. + + The efi_secret module exposes the secrets to userspace. Each + secret appears as a file under <securityfs>/secrets/coco, + where the filename is the GUID of the entry in the secrets + table. This module is loaded automatically by the EFI driver + if the EFI secret area is populated. + + Two operations are supported for the files: read and unlink. + Reading the file returns the content of secret entry. + Unlinking the file overwrites the secret data with zeroes and + removes the entry from the filesystem. A secret cannot be read + after it has been unlinked. + + For example, listing the available secrets:: + + # modprobe efi_secret + # ls -l /sys/kernel/security/secrets/coco + -r--r----- 1 root root 0 Jun 28 11:54 736870e5-84f0-4973-92ec-06879ce3da0b + -r--r----- 1 root root 0 Jun 28 11:54 83c83f7f-1356-4975-8b7e-d3a0b54312c6 + -r--r----- 1 root root 0 Jun 28 11:54 9553f55d-3da2-43ee-ab5d-ff17f78864d2 + -r--r----- 1 root root 0 Jun 28 11:54 e6f5a162-d67f-4750-a67c-5d065f2a9910 + + Reading the secret data by reading a file:: + + # cat /sys/kernel/security/secrets/coco/e6f5a162-d67f-4750-a67c-5d065f2a9910 + the-content-of-the-secret-data + + Wiping a secret by unlinking a file:: + + # rm /sys/kernel/security/secrets/coco/e6f5a162-d67f-4750-a67c-5d065f2a9910 + # ls -l /sys/kernel/security/secrets/coco + -r--r----- 1 root root 0 Jun 28 11:54 736870e5-84f0-4973-92ec-06879ce3da0b + -r--r----- 1 root root 0 Jun 28 11:54 83c83f7f-1356-4975-8b7e-d3a0b54312c6 + -r--r----- 1 root root 0 Jun 28 11:54 9553f55d-3da2-43ee-ab5d-ff17f78864d2 + + Note: The binary format of the secrets table injected by the + Guest Owner is described in + drivers/virt/coco/efi_secret/efi_secret.c under "Structure of + the EFI secret area". diff --git a/Documentation/RCU/Design/Data-Structures/Data-Structures.rst b/Documentation/RCU/Design/Data-Structures/Data-Structures.rst index f4efd6897b09..b34990c7c377 100644 --- a/Documentation/RCU/Design/Data-Structures/Data-Structures.rst +++ b/Documentation/RCU/Design/Data-Structures/Data-Structures.rst @@ -973,7 +973,7 @@ The ``->dynticks`` field counts the corresponding CPU's transitions to and from either dyntick-idle or user mode, so that this counter has an even value when the CPU is in dyntick-idle mode or user mode and an odd value otherwise. The transitions to/from user mode need to be counted -for user mode adaptive-ticks support (see timers/NO_HZ.txt). +for user mode adaptive-ticks support (see Documentation/timers/no_hz.rst). The ``->rcu_need_heavy_qs`` field is used to record the fact that the RCU core code would really like to see a quiescent state from the diff --git a/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.rst b/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.rst index 6f89cf1e567d..c9c957c85bac 100644 --- a/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.rst +++ b/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.rst @@ -406,7 +406,7 @@ In earlier implementations, the task requesting the expedited grace period also drove it to completion. This straightforward approach had the disadvantage of needing to account for POSIX signals sent to user tasks, so more recent implemementations use the Linux kernel's -`workqueues <https://www.kernel.org/doc/Documentation/core-api/workqueue.rst>`__. +workqueues (see Documentation/core-api/workqueue.rst). The requesting task still does counter snapshotting and funnel-lock processing, but the task reaching the top of the funnel lock does a diff --git a/Documentation/RCU/Design/Requirements/Requirements.rst b/Documentation/RCU/Design/Requirements/Requirements.rst index 45278e2974c0..04ed8bf27a0e 100644 --- a/Documentation/RCU/Design/Requirements/Requirements.rst +++ b/Documentation/RCU/Design/Requirements/Requirements.rst @@ -370,8 +370,8 @@ pointer fetched by rcu_dereference() may not be used outside of the outermost RCU read-side critical section containing that rcu_dereference(), unless protection of the corresponding data element has been passed from RCU to some other synchronization -mechanism, most commonly locking or `reference -counting <https://www.kernel.org/doc/Documentation/RCU/rcuref.txt>`__. +mechanism, most commonly locking or reference counting +(see ../../rcuref.rst). .. |high-quality implementation of C11 memory_order_consume [PDF]| replace:: high-quality implementation of C11 ``memory_order_consume`` [PDF] .. _high-quality implementation of C11 memory_order_consume [PDF]: http://www.rdrop.com/users/paulmck/RCU/consume.2015.07.13a.pdf @@ -2654,6 +2654,38 @@ synchronize_rcu(), and rcu_barrier(), respectively. In three APIs are therefore implemented by separate functions that check for voluntary context switches. +Tasks Rude RCU +~~~~~~~~~~~~~~ + +Some forms of tracing need to wait for all preemption-disabled regions +of code running on any online CPU, including those executed when RCU is +not watching. This means that synchronize_rcu() is insufficient, and +Tasks Rude RCU must be used instead. This flavor of RCU does its work by +forcing a workqueue to be scheduled on each online CPU, hence the "Rude" +moniker. And this operation is considered to be quite rude by real-time +workloads that don't want their ``nohz_full`` CPUs receiving IPIs and +by battery-powered systems that don't want their idle CPUs to be awakened. + +The tasks-rude-RCU API is also reader-marking-free and thus quite compact, +consisting of call_rcu_tasks_rude(), synchronize_rcu_tasks_rude(), +and rcu_barrier_tasks_rude(). + +Tasks Trace RCU +~~~~~~~~~~~~~~~ + +Some forms of tracing need to sleep in readers, but cannot tolerate +SRCU's read-side overhead, which includes a full memory barrier in both +srcu_read_lock() and srcu_read_unlock(). This need is handled by a +Tasks Trace RCU that uses scheduler locking and IPIs to synchronize with +readers. Real-time systems that cannot tolerate IPIs may build their +kernels with ``CONFIG_TASKS_TRACE_RCU_READ_MB=y``, which avoids the IPIs at +the expense of adding full memory barriers to the read-side primitives. + +The tasks-trace-RCU API is also reasonably compact, +consisting of rcu_read_lock_trace(), rcu_read_unlock_trace(), +rcu_read_lock_trace_held(), call_rcu_tasks_trace(), +synchronize_rcu_tasks_trace(), and rcu_barrier_tasks_trace(). + Possible Future Changes ----------------------- diff --git a/Documentation/RCU/arrayRCU.rst b/Documentation/RCU/arrayRCU.rst index 4051ea3871ef..a5f2ff8fc54c 100644 --- a/Documentation/RCU/arrayRCU.rst +++ b/Documentation/RCU/arrayRCU.rst @@ -33,8 +33,8 @@ Situation 1: Hash Tables Hash tables are often implemented as an array, where each array entry has a linked-list hash chain. Each hash chain can be protected by RCU -as described in the listRCU.txt document. This approach also applies -to other array-of-list situations, such as radix trees. +as described in listRCU.rst. This approach also applies to other +array-of-list situations, such as radix trees. .. _static_arrays: diff --git a/Documentation/RCU/checklist.rst b/Documentation/RCU/checklist.rst index f4545b7c9a63..42cc5d891bd2 100644 --- a/Documentation/RCU/checklist.rst +++ b/Documentation/RCU/checklist.rst @@ -140,8 +140,7 @@ over a rather long period of time, but improvements are always welcome! prevents destructive compiler optimizations. However, with a bit of devious creativity, it is possible to mishandle the return value from rcu_dereference(). - Please see rcu_dereference.txt in this directory for - more information. + Please see rcu_dereference.rst for more information. The rcu_dereference() primitive is used by the various "_rcu()" list-traversal primitives, such @@ -151,7 +150,7 @@ over a rather long period of time, but improvements are always welcome! primitives. This is particularly useful in code that is common to readers and updaters. However, lockdep will complain if you access rcu_dereference() outside - of an RCU read-side critical section. See lockdep.txt + of an RCU read-side critical section. See lockdep.rst to learn what to do about this. Of course, neither rcu_dereference() nor the "_rcu()" @@ -323,7 +322,7 @@ over a rather long period of time, but improvements are always welcome! primitives when the update-side lock is held is that doing so can be quite helpful in reducing code bloat when common code is shared between readers and updaters. Additional primitives - are provided for this case, as discussed in lockdep.txt. + are provided for this case, as discussed in lockdep.rst. One exception to this rule is when data is only ever added to the linked data structure, and is never removed during any @@ -480,4 +479,4 @@ over a rather long period of time, but improvements are always welcome! both rcu_barrier() and synchronize_rcu(), if necessary, using something like workqueues to to execute them concurrently. - See rcubarrier.txt for more information. + See rcubarrier.rst for more information. diff --git a/Documentation/RCU/rcu.rst b/Documentation/RCU/rcu.rst index 0e03c6ef3147..3cfe01ba9a49 100644 --- a/Documentation/RCU/rcu.rst +++ b/Documentation/RCU/rcu.rst @@ -10,9 +10,8 @@ A "grace period" must elapse between the two parts, and this grace period must be long enough that any readers accessing the item being deleted have since dropped their references. For example, an RCU-protected deletion from a linked list would first remove the item from the list, wait for -a grace period to elapse, then free the element. See the -:ref:`Documentation/RCU/listRCU.rst <list_rcu_doc>` for more information on -using RCU with linked lists. +a grace period to elapse, then free the element. See listRCU.rst for more +information on using RCU with linked lists. Frequently Asked Questions -------------------------- @@ -50,7 +49,7 @@ Frequently Asked Questions - If I am running on a uniprocessor kernel, which can only do one thing at a time, why should I wait for a grace period? - See :ref:`Documentation/RCU/UP.rst <up_doc>` for more information. + See UP.rst for more information. - How can I see where RCU is currently used in the Linux kernel? @@ -64,13 +63,13 @@ Frequently Asked Questions - What guidelines should I follow when writing code that uses RCU? - See the checklist.txt file in this directory. + See checklist.rst. - Why the name "RCU"? "RCU" stands for "read-copy update". - :ref:`Documentation/RCU/listRCU.rst <list_rcu_doc>` has more information on where - this name came from, search for "read-copy update" to find it. + listRCU.rst has more information on where this name came from, search + for "read-copy update" to find it. - I hear that RCU is patented? What is with that? diff --git a/Documentation/RCU/rculist_nulls.rst b/Documentation/RCU/rculist_nulls.rst index a9fc774bc400..ca4692775ad4 100644 --- a/Documentation/RCU/rculist_nulls.rst +++ b/Documentation/RCU/rculist_nulls.rst @@ -8,7 +8,7 @@ This section describes how to use hlist_nulls to protect read-mostly linked lists and objects using SLAB_TYPESAFE_BY_RCU allocations. -Please read the basics in Documentation/RCU/listRCU.rst +Please read the basics in listRCU.rst. Using 'nulls' ============= diff --git a/Documentation/RCU/stallwarn.rst b/Documentation/RCU/stallwarn.rst index 78404625bad2..794837eb519b 100644 --- a/Documentation/RCU/stallwarn.rst +++ b/Documentation/RCU/stallwarn.rst @@ -162,6 +162,26 @@ CONFIG_RCU_CPU_STALL_TIMEOUT Stall-warning messages may be enabled and disabled completely via /sys/module/rcupdate/parameters/rcu_cpu_stall_suppress. +CONFIG_RCU_EXP_CPU_STALL_TIMEOUT +-------------------------------- + + Same as the CONFIG_RCU_CPU_STALL_TIMEOUT parameter but only for + the expedited grace period. This parameter defines the period + of time that RCU will wait from the beginning of an expedited + grace period until it issues an RCU CPU stall warning. This time + period is normally 20 milliseconds on Android devices. A zero + value causes the CONFIG_RCU_CPU_STALL_TIMEOUT value to be used, + after conversion to milliseconds. + + This configuration parameter may be changed at runtime via the + /sys/module/rcupdate/parameters/rcu_exp_cpu_stall_timeout, however + this parameter is checked only at the beginning of a cycle. If you + are in a current stall cycle, setting it to a new value will change + the timeout for the -next- stall. + + Stall-warning messages may be enabled and disabled completely via + /sys/module/rcupdate/parameters/rcu_cpu_stall_suppress. + RCU_STALL_DELAY_DELTA --------------------- diff --git a/Documentation/RCU/whatisRCU.rst b/Documentation/RCU/whatisRCU.rst index c34d2212eaca..77ea260efd12 100644 --- a/Documentation/RCU/whatisRCU.rst +++ b/Documentation/RCU/whatisRCU.rst @@ -224,7 +224,7 @@ synchronize_rcu() be delayed. This property results in system resilience in face of denial-of-service attacks. Code using call_rcu() should limit update rate in order to gain this same sort of resilience. See - checklist.txt for some approaches to limiting the update rate. + checklist.rst for some approaches to limiting the update rate. rcu_assign_pointer() ^^^^^^^^^^^^^^^^^^^^ @@ -318,7 +318,7 @@ rcu_dereference() must prohibit. The rcu_dereference_protected() variant takes a lockdep expression to indicate which locks must be acquired by the caller. If the indicated protection is not provided, - a lockdep splat is emitted. See Documentation/RCU/Design/Requirements/Requirements.rst + a lockdep splat is emitted. See Design/Requirements/Requirements.rst and the API's code comments for more details and example usage. .. [2] If the list_for_each_entry_rcu() instance might be used by @@ -399,8 +399,7 @@ for specialized uses, but are relatively uncommon. This section shows a simple use of the core RCU API to protect a global pointer to a dynamically allocated structure. More-typical -uses of RCU may be found in :ref:`listRCU.rst <list_rcu_doc>`, -:ref:`arrayRCU.rst <array_rcu_doc>`, and :ref:`NMI-RCU.rst <NMI_rcu_doc>`. +uses of RCU may be found in listRCU.rst, arrayRCU.rst, and NMI-RCU.rst. :: struct foo { @@ -482,10 +481,9 @@ So, to sum up: RCU read-side critical sections that might be referencing that data item. -See checklist.txt for additional rules to follow when using RCU. -And again, more-typical uses of RCU may be found in :ref:`listRCU.rst -<list_rcu_doc>`, :ref:`arrayRCU.rst <array_rcu_doc>`, and :ref:`NMI-RCU.rst -<NMI_rcu_doc>`. +See checklist.rst for additional rules to follow when using RCU. +And again, more-typical uses of RCU may be found in listRCU.rst, +arrayRCU.rst, and NMI-RCU.rst. .. _4_whatisRCU: @@ -579,7 +577,7 @@ to avoid having to write your own callback:: kfree_rcu(old_fp, rcu); -Again, see checklist.txt for additional rules governing the use of RCU. +Again, see checklist.rst for additional rules governing the use of RCU. .. _5_whatisRCU: @@ -663,7 +661,7 @@ been able to write-acquire the lock otherwise. The smp_mb__after_spinlock() promotes synchronize_rcu() to a full memory barrier in compliance with the "Memory-Barrier Guarantees" listed in: - Documentation/RCU/Design/Requirements/Requirements.rst + Design/Requirements/Requirements.rst It is possible to nest rcu_read_lock(), since reader-writer locks may be recursively acquired. Note also that rcu_read_lock() is immune diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 3f1cc5e317ed..f8d9af5d51e5 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -4893,6 +4893,18 @@ rcupdate.rcu_cpu_stall_timeout= [KNL] Set timeout for RCU CPU stall warning messages. + The value is in seconds and the maximum allowed + value is 300 seconds. + + rcupdate.rcu_exp_cpu_stall_timeout= [KNL] + Set timeout for expedited RCU CPU stall warning + messages. The value is in milliseconds + and the maximum allowed value is 21000 + milliseconds. Please note that this value is + adjusted to an arch timer tick resolution. + Setting this to zero causes the value from + rcupdate.rcu_cpu_stall_timeout to be used (after + conversion from seconds to milliseconds). rcupdate.rcu_expedited= [KNL] Use expedited grace-period primitives, for @@ -4955,10 +4967,34 @@ number avoids disturbing real-time workloads, but lengthens grace periods. + rcupdate.rcu_task_stall_info= [KNL] + Set initial timeout in jiffies for RCU task stall + informational messages, which give some indication + of the problem for those not patient enough to + wait for ten minutes. Informational messages are + only printed prior to the stall-warning message + for a given grace period. Disable with a value + less than or equal to zero. Defaults to ten + seconds. A change in value does not take effect + until the beginning of the next grace period. + + rcupdate.rcu_task_stall_info_mult= [KNL] + Multiplier for time interval between successive + RCU task stall informational messages for a given + RCU tasks grace period. This value is clamped + to one through ten, inclusive. It defaults to + the value three, so that the first informational + message is printed 10 seconds into the grace + period, the second at 40 seconds, the third at + 160 seconds, and then the stall warning at 600 + seconds would prevent a fourth at 640 seconds. + rcupdate.rcu_task_stall_timeout= [KNL] - Set timeout in jiffies for RCU task stall warning - messages. Disable with a value less than or equal - to zero. + Set timeout in jiffies for RCU task stall + warning messages. Disable with a value less + than or equal to zero. Defaults to ten minutes. + A change in value does not take effect until + the beginning of the next grace period. rcupdate.rcu_self_test= [KNL] Run the RCU early boot self tests @@ -5377,6 +5413,17 @@ smart2= [HW] Format: <io1>[,<io2>[,...,<io8>]] + smp.csd_lock_timeout= [KNL] + Specify the period of time in milliseconds + that smp_call_function() and friends will wait + for a CPU to release the CSD lock. This is + useful when diagnosing bugs involving CPUs + disabling interrupts for extended periods + of time. Defaults to 5,000 milliseconds, and + setting a value of zero disables this feature. + This feature may be more efficiently disabled + using the csdlock_debug- kernel parameter. + smsc-ircc2.nopnp [HW] Don't use PNP to discover SMC devices smsc-ircc2.ircc_cfg= [HW] Device configuration I/O port smsc-ircc2.ircc_sir= [HW] SIR base I/O port @@ -5608,6 +5655,30 @@ off: Disable mitigation and remove performance impact to RDRAND and RDSEED + srcutree.big_cpu_lim [KNL] + Specifies the number of CPUs constituting a + large system, such that srcu_struct structures + should immediately allocate an srcu_node array. + This kernel-boot parameter defaults to 128, + but takes effect only when the low-order four + bits of srcutree.convert_to_big is equal to 3 + (decide at boot). + + srcutree.convert_to_big [KNL] + Specifies under what conditions an SRCU tree + srcu_struct structure will be converted to big + form, that is, with an rcu_node tree: + + 0: Never. + 1: At init_srcu_struct() time. + 2: When rcutorture decides to. + 3: Decide at boot time (default). + 0x1X: Above plus if high contention. + + Either way, the srcu_node tree will be sized based + on the actual runtime number of CPUs (nr_cpu_ids) + instead of the compile-time CONFIG_NR_CPUS. + srcutree.counter_wrap_check [KNL] Specifies how frequently to check for grace-period sequence counter wrap for the @@ -5625,6 +5696,14 @@ expediting. Set to zero to disable automatic expediting. + srcutree.small_contention_lim [KNL] + Specifies the number of update-side contention + events per jiffy will be tolerated before + initiating a conversion of an srcu_struct + structure to big form. Note that the value of + srcutree.convert_to_big must have the 0x10 bit + set for contention-based conversions to occur. + ssbd= [ARM64,HW] Speculative Store Bypass Disable control diff --git a/Documentation/arm64/silicon-errata.rst b/Documentation/arm64/silicon-errata.rst index 466cb9e89047..d27db84d585e 100644 --- a/Documentation/arm64/silicon-errata.rst +++ b/Documentation/arm64/silicon-errata.rst @@ -189,6 +189,9 @@ stable kernels. +----------------+-----------------+-----------------+-----------------------------+ | Qualcomm Tech. | Kryo4xx Silver | N/A | ARM64_ERRATUM_1024718 | +----------------+-----------------+-----------------+-----------------------------+ +| Qualcomm Tech. | Kryo4xx Gold | N/A | ARM64_ERRATUM_1286807 | ++----------------+-----------------+-----------------+-----------------------------+ + +----------------+-----------------+-----------------+-----------------------------+ | Fujitsu | A64FX | E#010001 | FUJITSU_ERRATUM_010001 | +----------------+-----------------+-----------------+-----------------------------+ diff --git a/Documentation/devicetree/bindings/input/mediatek,mt6779-keypad.yaml b/Documentation/devicetree/bindings/input/mediatek,mt6779-keypad.yaml index b1770640f94b..03ebd2665d07 100644 --- a/Documentation/devicetree/bindings/input/mediatek,mt6779-keypad.yaml +++ b/Documentation/devicetree/bindings/input/mediatek,mt6779-keypad.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Mediatek's Keypad Controller device tree bindings maintainers: - - Fengping Yu <fengping.yu@mediatek.com> + - Mattijs Korpershoek <mkorpershoek@baylibre.com> allOf: - $ref: "/schemas/input/matrix-keymap.yaml#" diff --git a/Documentation/devicetree/bindings/pinctrl/aspeed,ast2600-pinctrl.yaml b/Documentation/devicetree/bindings/pinctrl/aspeed,ast2600-pinctrl.yaml index 57b68d6c7c70..3666ac5b6518 100644 --- a/Documentation/devicetree/bindings/pinctrl/aspeed,ast2600-pinctrl.yaml +++ b/Documentation/devicetree/bindings/pinctrl/aspeed,ast2600-pinctrl.yaml @@ -33,7 +33,7 @@ patternProperties: $ref: "/schemas/types.yaml#/definitions/string" enum: [ ADC0, ADC1, ADC10, ADC11, ADC12, ADC13, ADC14, ADC15, ADC2, ADC3, ADC4, ADC5, ADC6, ADC7, ADC8, ADC9, BMCINT, EMMC, ESPI, ESPIALT, - FSI1, FSI2, FWSPIABR, FWSPID, FWSPIWP, GPIT0, GPIT1, GPIT2, GPIT3, + FSI1, FSI2, FWQSPI, FWSPIABR, FWSPID, FWSPIWP, GPIT0, GPIT1, GPIT2, GPIT3, GPIT4, GPIT5, GPIT6, GPIT7, GPIU0, GPIU1, GPIU2, GPIU3, GPIU4, GPIU5, GPIU6, GPIU7, I2C1, I2C10, I2C11, I2C12, I2C13, I2C14, I2C15, I2C16, I2C2, I2C3, I2C4, I2C5, I2C6, I2C7, I2C8, I2C9, I3C3, I3C4, I3C5, @@ -58,7 +58,7 @@ patternProperties: $ref: "/schemas/types.yaml#/definitions/string" enum: [ ADC0, ADC1, ADC10, ADC11, ADC12, ADC13, ADC14, ADC15, ADC2, ADC3, ADC4, ADC5, ADC6, ADC7, ADC8, ADC9, BMCINT, EMMCG1, EMMCG4, - EMMCG8, ESPI, ESPIALT, FSI1, FSI2, FWSPIABR, FWSPID, FWQSPID, FWSPIWP, + EMMCG8, ESPI, ESPIALT, FSI1, FSI2, FWQSPI, FWSPIABR, FWSPID, FWSPIWP, GPIT0, GPIT1, GPIT2, GPIT3, GPIT4, GPIT5, GPIT6, GPIT7, GPIU0, GPIU1, GPIU2, GPIU3, GPIU4, GPIU5, GPIU6, GPIU7, HVI3C3, HVI3C4, I2C1, I2C10, I2C11, I2C12, I2C13, I2C14, I2C15, I2C16, I2C2, I2C3, I2C4, I2C5, diff --git a/Documentation/process/embargoed-hardware-issues.rst b/Documentation/process/embargoed-hardware-issues.rst index 6f8f36e10e8b..95999302d279 100644 --- a/Documentation/process/embargoed-hardware-issues.rst +++ b/Documentation/process/embargoed-hardware-issues.rst @@ -244,10 +244,11 @@ disclosure of a particular issue, unless requested by a response team or by an involved disclosed party. The current ambassadors list: ============= ======================================================== - ARM Grant Likely <grant.likely@arm.com> AMD Tom Lendacky <tom.lendacky@amd.com> - IBM Z Christian Borntraeger <borntraeger@de.ibm.com> - IBM Power Anton Blanchard <anton@linux.ibm.com> + Ampere Darren Hart <darren@os.amperecomputing.com> + ARM Catalin Marinas <catalin.marinas@arm.com> + IBM Power Anton Blanchard <anton@linux.ibm.com> + IBM Z Christian Borntraeger <borntraeger@de.ibm.com> Intel Tony Luck <tony.luck@intel.com> Qualcomm Trilok Soni <tsoni@codeaurora.org> diff --git a/Documentation/security/index.rst b/Documentation/security/index.rst index 16335de04e8c..6ed8d2fa6f9e 100644 --- a/Documentation/security/index.rst +++ b/Documentation/security/index.rst @@ -17,3 +17,4 @@ Security Documentation tpm/index digsig landlock + secrets/index diff --git a/Documentation/security/secrets/coco.rst b/Documentation/security/secrets/coco.rst new file mode 100644 index 000000000000..262e7abb1b24 --- /dev/null +++ b/Documentation/security/secrets/coco.rst @@ -0,0 +1,103 @@ +.. SPDX-License-Identifier: GPL-2.0 + +============================== +Confidential Computing secrets +============================== + +This document describes how Confidential Computing secret injection is handled +from the firmware to the operating system, in the EFI driver and the efi_secret +kernel module. + + +Introduction +============ + +Confidential Computing (coco) hardware such as AMD SEV (Secure Encrypted +Virtualization) allows guest owners to inject secrets into the VMs +memory without the host/hypervisor being able to read them. In SEV, +secret injection is performed early in the VM launch process, before the +guest starts running. + +The efi_secret kernel module allows userspace applications to access these +secrets via securityfs. + + +Secret data flow +================ + +The guest firmware may reserve a designated memory area for secret injection, +and publish its location (base GPA and length) in the EFI configuration table +under a ``LINUX_EFI_COCO_SECRET_AREA_GUID`` entry +(``adf956ad-e98c-484c-ae11-b51c7d336447``). This memory area should be marked +by the firmware as ``EFI_RESERVED_TYPE``, and therefore the kernel should not +be use it for its own purposes. + +During the VM's launch, the virtual machine manager may inject a secret to that +area. In AMD SEV and SEV-ES this is performed using the +``KVM_SEV_LAUNCH_SECRET`` command (see [sev]_). The strucutre of the injected +Guest Owner secret data should be a GUIDed table of secret values; the binary +format is described in ``drivers/virt/coco/efi_secret/efi_secret.c`` under +"Structure of the EFI secret area". + +On kernel start, the kernel's EFI driver saves the location of the secret area +(taken from the EFI configuration table) in the ``efi.coco_secret`` field. +Later it checks if the secret area is populated: it maps the area and checks +whether its content begins with ``EFI_SECRET_TABLE_HEADER_GUID`` +(``1e74f542-71dd-4d66-963e-ef4287ff173b``). If the secret area is populated, +the EFI driver will autoload the efi_secret kernel module, which exposes the +secrets to userspace applications via securityfs. The details of the +efi_secret filesystem interface are in [secrets-coco-abi]_. + + +Application usage example +========================= + +Consider a guest performing computations on encrypted files. The Guest Owner +provides the decryption key (= secret) using the secret injection mechanism. +The guest application reads the secret from the efi_secret filesystem and +proceeds to decrypt the files into memory and then performs the needed +computations on the content. + +In this example, the host can't read the files from the disk image +because they are encrypted. Host can't read the decryption key because +it is passed using the secret injection mechanism (= secure channel). +Host can't read the decrypted content from memory because it's a +confidential (memory-encrypted) guest. + +Here is a simple example for usage of the efi_secret module in a guest +to which an EFI secret area with 4 secrets was injected during launch:: + + # ls -la /sys/kernel/security/secrets/coco + total 0 + drwxr-xr-x 2 root root 0 Jun 28 11:54 . + drwxr-xr-x 3 root root 0 Jun 28 11:54 .. + -r--r----- 1 root root 0 Jun 28 11:54 736870e5-84f0-4973-92ec-06879ce3da0b + -r--r----- 1 root root 0 Jun 28 11:54 83c83f7f-1356-4975-8b7e-d3a0b54312c6 + -r--r----- 1 root root 0 Jun 28 11:54 9553f55d-3da2-43ee-ab5d-ff17f78864d2 + -r--r----- 1 root root 0 Jun 28 11:54 e6f5a162-d67f-4750-a67c-5d065f2a9910 + + # hd /sys/kernel/security/secrets/coco/e6f5a162-d67f-4750-a67c-5d065f2a9910 + 00000000 74 68 65 73 65 2d 61 72 65 2d 74 68 65 2d 6b 61 |these-are-the-ka| + 00000010 74 61 2d 73 65 63 72 65 74 73 00 01 02 03 04 05 |ta-secrets......| + 00000020 06 07 |..| + 00000022 + + # rm /sys/kernel/security/secrets/coco/e6f5a162-d67f-4750-a67c-5d065f2a9910 + + # ls -la /sys/kernel/security/secrets/coco + total 0 + drwxr-xr-x 2 root root 0 Jun 28 11:55 . + drwxr-xr-x 3 root root 0 Jun 28 11:54 .. + -r--r----- 1 root root 0 Jun 28 11:54 736870e5-84f0-4973-92ec-06879ce3da0b + -r--r----- 1 root root 0 Jun 28 11:54 83c83f7f-1356-4975-8b7e-d3a0b54312c6 + -r--r----- 1 root root 0 Jun 28 11:54 9553f55d-3da2-43ee-ab5d-ff17f78864d2 + + +References +========== + +See [sev-api-spec]_ for more info regarding SEV ``LAUNCH_SECRET`` operation. + +.. [sev] Documentation/virt/kvm/amd-memory-encryption.rst +.. [secrets-coco-abi] Documentation/ABI/testing/securityfs-secrets-coco +.. [sev-api-spec] https://www.amd.com/system/files/TechDocs/55766_SEV-KM_API_Specification.pdf diff --git a/Documentation/security/secrets/index.rst b/Documentation/security/secrets/index.rst new file mode 100644 index 000000000000..ced34e9c43bd --- /dev/null +++ b/Documentation/security/secrets/index.rst @@ -0,0 +1,9 @@ +.. SPDX-License-Identifier: GPL-2.0 + +===================== +Secrets documentation +===================== + +.. toctree:: + + coco |