diff options
author | Nathan Lynch <nathanl@linux.ibm.com> | 2023-02-10 12:42:00 -0600 |
---|---|---|
committer | Michael Ellerman <mpe@ellerman.id.au> | 2023-02-13 22:35:02 +1100 |
commit | 43033bc62d349d8d852855a336c91d046de819bd (patch) | |
tree | 9f7ce98c09f955129e605446b45fedf442da0e20 /arch/powerpc/include | |
parent | 24098f580e2b5ceb2cec4f02833e0a0bb5d46d2e (diff) |
powerpc/pseries: add RTAS work area allocator
Various pseries-specific RTAS functions take a temporary "work area"
parameter - a buffer in memory accessible to RTAS. Typically such
functions are passed the statically allocated rtas_data_buf buffer as
the argument. This buffer is protected by a global spinlock. So users
of rtas_data_buf cannot perform sleeping operations while accessing
the buffer.
Most RTAS functions that have a work area parameter can return a
status (-2/990x) that indicates that the caller should retry. Before
retrying, the caller may need to reschedule or sleep (see
rtas_busy_delay() for details). This combination of factors
leads to uncomfortable constructions like this:
do {
spin_lock(&rtas_data_buf_lock);
rc = rtas_call(token, __pa(rtas_data_buf, ...);
if (rc == 0) {
/* parse or copy out rtas_data_buf contents */
}
spin_unlock(&rtas_data_buf_lock);
} while (rtas_busy_delay(rc));
Another unfortunately common way of handling this is for callers to
blithely ignore the possibility of a -2/990x status and hope for the
best.
If users were allowed to perform blocking operations while owning a
work area, the programming model would become less tedious and
error-prone. Users could schedule away, sleep, or perform other
blocking operations without having to release and re-acquire
resources.
We could continue to use a single work area buffer, and convert
rtas_data_buf_lock to a mutex. But that would impose an unnecessarily
coarse serialization on all users. As awkward as the current design
is, it prevents longer running operations that need to repeatedly use
rtas_data_buf from blocking the progress of others.
There are more considerations. One is that while 4KB is fine for all
current in-kernel uses, some RTAS calls can take much smaller buffers,
and some (VPD, platform dumps) would likely benefit from larger
ones. Another is that at least one RTAS function (ibm,get-vpd)
has *two* work area parameters. And finally, we should expect the
number of work area users in the kernel to increase over time as we
introduce lockdown-compatible ABIs to replace less safe use cases
based on sys_rtas/librtas.
So a special-purpose allocator for RTAS work area buffers seems worth
trying.
Properties:
* The backing memory for the allocator is reserved early in boot in
order to satisfy RTAS addressing requirements, and then managed with
genalloc.
* Allocations can block, but they never fail (mempool-like).
* Prioritizes first-come, first-serve fairness over throughput.
* Early boot allocations before the allocator has been initialized are
served via an internal static buffer.
Intended to replace rtas_data_buf. New code that needs RTAS work area
buffers should prefer this API.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-12-26929c8cce78@linux.ibm.com
Diffstat (limited to 'arch/powerpc/include')
-rw-r--r-- | arch/powerpc/include/asm/rtas-work-area.h | 96 |
1 files changed, 96 insertions, 0 deletions
diff --git a/arch/powerpc/include/asm/rtas-work-area.h b/arch/powerpc/include/asm/rtas-work-area.h new file mode 100644 index 000000000000..251a395dbd2e --- /dev/null +++ b/arch/powerpc/include/asm/rtas-work-area.h @@ -0,0 +1,96 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +#ifndef _ASM_POWERPC_RTAS_WORK_AREA_H +#define _ASM_POWERPC_RTAS_WORK_AREA_H + +#include <linux/build_bug.h> +#include <linux/sizes.h> +#include <linux/types.h> + +#include <asm/page.h> + +/** + * struct rtas_work_area - RTAS work area descriptor. + * + * Descriptor for a "work area" in PAPR terminology that satisfies + * RTAS addressing requirements. + */ +struct rtas_work_area { + /* private: Use the APIs provided below. */ + char *buf; + size_t size; +}; + +enum { + /* Maximum allocation size, enforced at build time. */ + RTAS_WORK_AREA_MAX_ALLOC_SZ = SZ_128K, +}; + +/** + * rtas_work_area_alloc() - Acquire a work area of the requested size. + * @size_: Allocation size. Must be compile-time constant and not more + * than %RTAS_WORK_AREA_MAX_ALLOC_SZ. + * + * Allocate a buffer suitable for passing to RTAS functions that have + * a memory address parameter, often (but not always) referred to as a + * "work area" in PAPR. Although callers are allowed to block while + * holding a work area, the amount of memory reserved for this purpose + * is limited, and allocations should be short-lived. A good guideline + * is to release any allocated work area before returning from a + * system call. + * + * This function does not fail. It blocks until the allocation + * succeeds. To prevent deadlocks, callers are discouraged from + * allocating more than one work area simultaneously in a single task + * context. + * + * Context: This function may sleep. + * Return: A &struct rtas_work_area descriptor for the allocated work area. + */ +#define rtas_work_area_alloc(size_) ({ \ + static_assert(__builtin_constant_p(size_)); \ + static_assert((size_) > 0); \ + static_assert((size_) <= RTAS_WORK_AREA_MAX_ALLOC_SZ); \ + __rtas_work_area_alloc(size_); \ +}) + +/* + * Do not call __rtas_work_area_alloc() directly. Use + * rtas_work_area_alloc(). + */ +struct rtas_work_area *__rtas_work_area_alloc(size_t size); + +/** + * rtas_work_area_free() - Release a work area. + * @area: Work area descriptor as returned from rtas_work_area_alloc(). + * + * Return a work area buffer to the pool. + */ +void rtas_work_area_free(struct rtas_work_area *area); + +static inline char *rtas_work_area_raw_buf(const struct rtas_work_area *area) +{ + return area->buf; +} + +static inline size_t rtas_work_area_size(const struct rtas_work_area *area) +{ + return area->size; +} + +static inline phys_addr_t rtas_work_area_phys(const struct rtas_work_area *area) +{ + return __pa(area->buf); +} + +/* + * Early setup for the work area allocator. Call from + * rtas_initialize() only. + */ + +#ifdef CONFIG_PPC_PSERIES +void rtas_work_area_reserve_arena(phys_addr_t limit); +#else /* CONFIG_PPC_PSERIES */ +static inline void rtas_work_area_reserve_arena(phys_addr_t limit) {} +#endif /* CONFIG_PPC_PSERIES */ + +#endif /* _ASM_POWERPC_RTAS_WORK_AREA_H */ |