aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2023-02-13powerpc/rtas: handle extended delays safely in early bootNathan Lynch1-1/+48
Some code that runs early in boot calls RTAS functions that can return -2 or 990x statuses, which mean the caller should retry. An example is pSeries_cmo_feature_init(), which invokes ibm,get-system-parameter but treats these benign statuses as errors instead of retrying. pSeries_cmo_feature_init() and similar code should be made to retry until they succeed or receive a real error, using the usual pattern: do { rc = rtas_call(token, etc...); } while (rtas_busy_delay(rc)); But rtas_busy_delay() will perform a timed sleep on any 990x status. This isn't safe so early in boot, before the CPU scheduler and timer subsystem have initialized. The -2 RTAS status is much more likely to occur during single-threaded boot than 990x in practice, at least on PowerVM. This is because -2 usually means that RTAS made progress but exhausted its self-imposed timeslice, while 990x is associated with concurrent requests from the OS causing internal contention. Regardless, according to the language in PAPR, the OS should be prepared to handle either type of status at any time. Add a fallback path to rtas_busy_delay() to handle this as safely as possible, performing a small delay on 990x. Include a counter to detect retry loops that aren't making progress and bail out. Add __ref to rtas_busy_delay() since it now conditionally calls an __init function. This was found by inspection and I'm not aware of any real failures. However, the implementation of rtas_busy_delay() before commit 38f7b7067dae ("powerpc/rtas: rtas_busy_delay() improvements") was not susceptible to this problem, so let's treat this as a regression. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Fixes: 38f7b7067dae ("powerpc/rtas: rtas_busy_delay() improvements") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-1-26929c8cce78@linux.ibm.com
2023-02-13integrity/powerpc: Support loading keys from PLPKSRussell Currey1-7/+10
Add support for loading keys from the PLPKS on pseries machines, with the "ibm,plpks-sb-v1" format. The object format is expected to be the same, so there shouldn't be any functional differences between objects retrieved on powernv or pseries. Unlike on powernv, on pseries the format string isn't contained in the device tree. Use secvar_ops->format() to fetch the format string in a generic manner, rather than searching the device tree ourselves. (The current code searches the device tree for a node compatible with "ibm,edk2-compat-v1". This patch switches to calling secvar_ops->format(), which in the case of OPAL/powernv means opal_secvar_format(), which searches the device tree for a node compatible with "ibm,secvar-backend" and checks its "format" property. These are equivalent, as skiboot creates a node with both "ibm,edk2-compat-v1" and "ibm,secvar-backend" as compatible strings.) Signed-off-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230210080401.345462-27-ajd@linux.ibm.com
2023-02-13integrity/powerpc: Improve error handling & reporting when loading certsRussell Currey1-6/+20
A few improvements to load_powerpc.c: - include integrity.h for the pr_fmt() - move all error reporting out of get_cert_list() - use ERR_PTR() to better preserve error detail - don't use pr_err() for missing keys Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230210080401.345462-26-ajd@linux.ibm.com
2023-02-13powerpc/pseries: Implement secvars for dynamic secure bootRussell Currey4-3/+293
The pseries platform can support dynamic secure boot (i.e. secure boot using user-defined keys) using variables contained with the PowerVM LPAR Platform KeyStore (PLPKS). Using the powerpc secvar API, expose the relevant variables for pseries dynamic secure boot through the existing secvar filesystem layout. The relevant variables for dynamic secure boot are signed in the keystore, and can only be modified using the H_PKS_SIGNED_UPDATE hcall. Object labels in the keystore are encoded using ucs2 format. With our fixed variable names we don't have to care about encoding outside of the necessary byte padding. When a user writes to a variable, the first 8 bytes of data must contain the signed update flags as defined by the hypervisor. When a user reads a variable, the first 4 bytes of data contain the policies defined for the object. Limitations exist due to the underlying implementation of sysfs binary attributes, as is the case for the OPAL secvar implementation - partial writes are unsupported and writes cannot be larger than PAGE_SIZE. (Even when using bin_attributes, which can be larger than a single page, sysfs only gives us one page's worth of write buffer at a time, and the hypervisor does not expose an interface for partial writes.) Co-developed-by: Nayna Jain <nayna@linux.ibm.com> Signed-off-by: Nayna Jain <nayna@linux.ibm.com> Co-developed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Russell Currey <ruscur@russell.cc> [mpe: Add NLS dependency to fix build errors, squash fix from ajd] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230210080401.345462-25-ajd@linux.ibm.com
2023-02-12powerpc/pseries: Pass PLPKS password on kexecRussell Currey4-5/+92
Before interacting with the PLPKS, we ask the hypervisor to generate a password for the current boot, which is then required for most further PLPKS operations. If we kexec into a new kernel, the new kernel will try and fail to generate a new password, as the password has already been set. Pass the password through to the new kernel via the device tree, in /chosen/ibm,plpks-pw. Check for the presence of this property before trying to generate a new password - if it exists, use the existing password and remove it from the device tree. This only works with the kexec_file_load() syscall, not the older kexec_load() syscall, however if you're using Secure Boot then you want to be using kexec_file_load() anyway. Signed-off-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230210080401.345462-24-ajd@linux.ibm.com
2023-02-12powerpc/pseries: Add helper to get PLPKS password lengthRussell Currey2-0/+10
Add helper function to get the PLPKS password length. This will be used in a later patch to support passing the password between kernels over kexec. Signed-off-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230210080401.345462-23-ajd@linux.ibm.com
2023-02-12powerpc/pseries: Clarify warning when PLPKS password already setAndrew Donnellan1-1/+1
When the H_PKS_GEN_PASSWORD hcall returns H_IN_USE, operations that require authentication (i.e. anything other than reading a world-readable variable) will not work. The current error message doesn't explain this clearly enough. Reword it to emphasise that authenticated operations will fail. Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230210080401.345462-22-ajd@linux.ibm.com
2023-02-12powerpc/pseries: Turn PSERIES_PLPKS into a hidden optionAndrew Donnellan2-10/+10
It seems a bit unnecessary for the PLPKS code to have a user-visible config option when it doesn't do anything on its own, and there's existing options for enabling Secure Boot-related features. It should be enabled by PPC_SECURE_BOOT, which will eventually be what uses PLPKS to populate keyrings. However, we can't get of the separate option completely, because it will also be used for SED Opal purposes. Change PSERIES_PLPKS into a hidden option, which is selected by PPC_SECURE_BOOT. Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Russell Currey <ruscur@russell.cc> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230210080401.345462-21-ajd@linux.ibm.com
2023-02-12powerpc/pseries: Make caller pass buffer to plpks_read_var()Andrew Donnellan2-7/+16
Currently, plpks_read_var() allocates a buffer to pass to the H_PKS_READ_OBJECT hcall, then allocates another buffer into which the data is copied, and returns that buffer to the caller. This is a bit over the top - while we probably still want to allocate a separate buffer to pass to the hypervisor in the hcall, we can let the caller allocate the final buffer and specify the size. Don't allocate var->data in plpks_read_var(), instead expect the caller to allocate it. If the caller needs to discover the size, it can set var->data to NULL and var->datalen will be populated. Update header file to document this. Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Russell Currey <ruscur@russell.cc> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230210080401.345462-20-ajd@linux.ibm.com
2023-02-12powerpc/pseries: Log hcall return codes for PLPKS debugRussell Currey1-0/+2
The plpks code converts hypervisor return codes into their Linux equivalents so that users can understand them. Having access to the original return codes is really useful for debugging, so add a pr_debug() so we don't lose information from the conversion. Signed-off-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230210080401.345462-19-ajd@linux.ibm.com
2023-02-12powerpc/pseries: Implement signed update for PLPKS objectsNayna Jain3-5/+75
The Platform Keystore provides a signed update interface which can be used to create, replace or append to certain variables in the PKS in a secure fashion, with the hypervisor requiring that the update be signed using the Platform Key. Implement an interface to the H_PKS_SIGNED_UPDATE hcall in the plpks driver to allow signed updates to PKS objects. (The plpks driver doesn't need to do any cryptography or otherwise handle the actual signed variable contents - that will be handled by userspace tooling.) Signed-off-by: Nayna Jain <nayna@linux.ibm.com> [ajd: split patch, add timeout handling and misc cleanups] Co-developed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Russell Currey <ruscur@russell.cc> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230210080401.345462-18-ajd@linux.ibm.com
2023-02-12powerpc/pseries: Expose PLPKS config values, support additional fieldsNayna Jain2-12/+195
The plpks driver uses the H_PKS_GET_CONFIG hcall to retrieve configuration and status information about the PKS from the hypervisor. Update _plpks_get_config() to handle some additional fields. Add getter functions to allow the PKS configuration information to be accessed from other files. Validate that the values we're getting comply with the spec. While we're here, move the config struct in _plpks_get_config() off the stack - it's getting large and we also need to make sure it doesn't cross a page boundary. Signed-off-by: Nayna Jain <nayna@linux.ibm.com> [ajd: split patch, extend to support additional v3 API fields, minor fixes] Co-developed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Russell Currey <ruscur@russell.cc> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230210080401.345462-17-ajd@linux.ibm.com
2023-02-12powerpc/pseries: Move PLPKS constants to header fileRussell Currey2-40/+53
Move the constants defined in plpks.c to plpks.h, and standardise their naming, so that PLPKS consumers can make use of them later on. Signed-off-by: Russell Currey <ruscur@russell.cc> Co-developed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230210080401.345462-16-ajd@linux.ibm.com
2023-02-12powerpc/pseries: Move plpks.h to include directoryRussell Currey2-5/+8
Move plpks.h from platforms/pseries/ to include/asm/. This is necessary for later patches to make use of the PLPKS from code in other subsystems. Signed-off-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230210080401.345462-15-ajd@linux.ibm.com
2023-02-12powerpc/secvar: Don't print error on ENOENT when reading variablesAndrew Donnellan1-3/+4
If attempting to read the size or data attributes of a non-existent variable (which will be possible after a later patch to expose the PLPKS via the secvar interface), don't spam the kernel log with error messages. Only print errors for return codes that aren't ENOENT. Reported-by: Sudhakar Kuppusamy <sudhakar@linux.ibm.com> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230210080401.345462-14-ajd@linux.ibm.com
2023-02-12powerpc/secvar: Warn when PAGE_SIZE is smaller than max object sizeAndrew Donnellan1-0/+9
Due to sysfs constraints, when writing to a variable, we can only handle writes of up to PAGE_SIZE. It's possible that the maximum object size is larger than PAGE_SIZE, in which case, print a warning on boot so that the user is aware. Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Russell Currey <ruscur@russell.cc> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230210080401.345462-13-ajd@linux.ibm.com
2023-02-12powerpc/secvar: Allow backend to populate static list of variable namesAndrew Donnellan2-21/+52
Currently, the list of variables is populated by calling secvar_ops->get_next() repeatedly, which is explicitly modelled on the OPAL API (including the keylen parameter). For the upcoming PLPKS backend, we have a static list of variable names. It is messy to fit that into get_next(), so instead, let the backend put a NULL-terminated array of variable names into secvar_ops->var_names, which will be used if get_next() is undefined. Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Russell Currey <ruscur@russell.cc> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230210080401.345462-12-ajd@linux.ibm.com
2023-02-12powerpc/secvar: Extend sysfs to include config varsRussell Currey2-5/+30
The forthcoming pseries consumer of the secvar API wants to expose a number of config variables. Allowing secvar implementations to provide their own sysfs attributes makes it easy for consumers to expose what they need to. This is not being used by the OPAL secvar implementation at present, and the config directory will not be created if no attributes are set. Signed-off-by: Russell Currey <ruscur@russell.cc> Co-developed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230210080401.345462-11-ajd@linux.ibm.com
2023-02-12powerpc/secvar: Clean up init error messagesAndrew Donnellan1-3/+3
Remove unnecessary prefixes from error messages in secvar_sysfs_init() (the file defines pr_fmt, so putting "secvar:" in every message is unnecessary). Make capitalisation and punctuation more consistent. Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Russell Currey <ruscur@russell.cc> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230210080401.345462-10-ajd@linux.ibm.com
2023-02-12powerpc/secvar: Handle max object size in the consumerRussell Currey3-14/+26
Currently the max object size is handled in the core secvar code with an entirely OPAL-specific implementation, so create a new max_size() op and move the existing implementation into the powernv platform. Should be no functional change. Signed-off-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230210080401.345462-9-ajd@linux.ibm.com
2023-02-12powerpc/secvar: Handle format string in the consumerRussell Currey3-18/+35
The code that handles the format string in secvar-sysfs.c is entirely OPAL specific, so create a new "format" op in secvar_operations to make the secvar code more generic. No functional change. Signed-off-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230210080401.345462-8-ajd@linux.ibm.com
2023-02-12powerpc/secvar: Use sysfs_emit() instead of sprintf()Russell Currey1-2/+2
The secvar format string and object size sysfs files are both ASCII text, and should use sysfs_emit(). No functional change. Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230210080401.345462-7-ajd@linux.ibm.com
2023-02-12powerpc/secvar: Warn and error if multiple secvar ops are setRussell Currey3-7/+11
The secvar code only supports one consumer at a time. Multiple consumers aren't possible at this point in time, but we'd want it to be obvious if it ever could happen. Signed-off-by: Russell Currey <ruscur@russell.cc> Co-developed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230210080401.345462-6-ajd@linux.ibm.com
2023-02-12powerpc/secvar: Use u64 in secvar_operationsMichael Ellerman4-18/+12
There's no reason for secvar_operations to use uint64_t vs the more common kernel type u64. The types are compatible, but they require different printk format strings which can lead to confusion. Change all the secvar related routines to use u64. Reviewed-by: Russell Currey <ruscur@russell.cc> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230210080401.345462-5-ajd@linux.ibm.com
2023-02-12powerpc/secvar: Fix incorrect return in secvar_sysfs_load()Russell Currey1-2/+4
secvar_ops->get_next() returns -ENOENT when there are no more variables to return, which is expected behaviour. Fix this by returning 0 if get_next() returns -ENOENT. This fixes an issue introduced in commit bd5d9c743d38 ("powerpc: expose secure variables to userspace via sysfs"), but the return code of secvar_sysfs_load() was never checked so this issue never mattered. Signed-off-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230210080401.345462-4-ajd@linux.ibm.com
2023-02-12powerpc/pseries: Fix alignment of PLPKS structures and buffersAndrew Donnellan1-3/+7
A number of structures and buffers passed to PKS hcalls have alignment requirements, which could on occasion cause problems: - Authorisation structures must be 16-byte aligned and must not cross a page boundary - Label structures must not cross page boundaries - Password output buffers must not cross page boundaries To ensure correct alignment, we adjust the allocation size of each of these structures/buffers to be the closest power of 2 that is at least the size of the structure/buffer (since kmalloc() guarantees that an allocation of a power of 2 size will be aligned to at least that size). Reported-by: Benjamin Gray <bgray@linux.ibm.com> Fixes: 2454a7af0f2a ("powerpc/pseries: define driver for Platform KeyStore") Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Reviewed-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230210080401.345462-3-ajd@linux.ibm.com
2023-02-12powerpc/pseries: Fix handling of PLPKS object flushing timeoutAndrew Donnellan1-3/+8
plpks_confirm_object_flushed() uses the H_PKS_CONFIRM_OBJECT_FLUSHED hcall to check whether changes to an object in the Platform KeyStore have been flushed to non-volatile storage. The hcall returns two output values, the return code and the flush status. plpks_confirm_object_flushed() polls the hcall until either the flush status has updated, the return code is an error, or a timeout has been exceeded. While we're still polling, the hcall is returning H_SUCCESS (0) as the return code. In the timeout case, this means that upon exiting the polling loop, rc is 0, and therefore 0 is returned to the user. Handle the timeout case separately and return ETIMEDOUT if triggered. Fixes: 2454a7af0f2a ("powerpc/pseries: define driver for Platform KeyStore") Reported-by: Benjamin Gray <bgray@linux.ibm.com> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Tested-by: Russell Currey <ruscur@russell.cc> Reviewed-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230210080401.345462-2-ajd@linux.ibm.com
2023-02-12Merge branch 'fixes' into nextMichael Ellerman15-92/+140
Merge our fixes branch to bring in some changes that conflict with upcoming next content.
2023-02-12powerpc/ps3: Refresh ps3_defconfigGeoff Levand1-22/+17
Refresh ps3_defconfig for v6.2. Signed-off-by: Geoff Levand <geoff@infradead.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/99e87549b17feca3494e9df6f4def04a9ec7c042.1672767868.git.geoff@infradead.org
2023-02-12powerpc/ps3: Change updateboltedpp() panic to infoGeoff Levand1-1/+1
Commit fdacae8a8402 ("powerpc: Activate CONFIG_STRICT_KERNEL_RWX by default") causes ps3_hpte_updateboltedpp() to be called. The correct fix would be to implement updateboltedpp() for PS3, but it's not clear if that's possible. As a stop-gap, change the panic statment in ps3_hpte_updateboltedpp() to a pr_info statement so that bootup can continue. Signed-off-by: Geoff Levand <geoff@infradead.org> [mpe: Flesh out change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/2df879d982809c05b0dfade57942fe03dbe9e7de.1672767868.git.geoff@infradead.org
2023-02-12powerpc/kcsan: Add KCSAN SupportRohan McLure1-0/+1
Enable HAVE_ARCH_KCSAN for 64-bit Book3S, permitting use of the kernel concurrency sanitiser through the CONFIG_KCSAN_* kconfig options. KCSAN requires compiler builtins __atomic_* 64-bit values, and so only report support on 64-bit. See documentation in Documentation/dev-tools/kcsan.rst for more information. Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> [mpe: Limit to Book3S to avoid build failure on Book3E] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230206021801.105268-6-rmclure@linux.ibm.com
2023-02-10powerpc/kcsan: Prevent recursive instrumentation with IRQ save/restoresRohan McLure1-4/+4
Instrumented memory accesses provided by KCSAN will access core-local memories (which will save and restore IRQs) as well as restoring IRQs directly. Avoid recursive instrumentation by applying __no_kcsan annotation to IRQ restore routines. Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> [mpe: Resolve merge conflict with IRQ replay recursion changes] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230206021801.105268-5-rmclure@linux.ibm.com
2023-02-10powerpc/kcsan: Memory barriers semanticsRohan McLure1-6/+6
Annotate memory barriers *mb() with calls to kcsan_mb(), signaling to compilers supporting KCSAN that the respective memory barrier has been issued. Rename memory barrier *mb() to __*mb() to opt in for asm-generic/barrier.h to generate the respective *mb() macro. Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230206021801.105268-4-rmclure@linux.ibm.com
2023-02-10powerpc/kcsan: Exclude udelay to prevent recursive instrumentationRohan McLure1-2/+2
In order for KCSAN to increase its likelihood of observing a data race, it sets a watchpoint on memory accesses and stalls, allowing for detection of conflicting accesses by other kernel threads or interrupts. Stalls are implemented by injecting a call to udelay in instrumented code. To prevent recursive instrumentation, exclude udelay from being instrumented. Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230206021801.105268-3-rmclure@linux.ibm.com
2023-02-10powerpc/kcsan: Add exclusions from instrumentationRohan McLure6-0/+16
Exclude various incompatible compilation units from KCSAN instrumentation. Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230206021801.105268-2-rmclure@linux.ibm.com
2023-02-10powerpc: Skip stack validation checking alternate stacks if they are not ↵Nicholas Piggin1-0/+11
allocated Stack validation in early boot can just bail out of checking alternate stacks if they are not validated yet. Checking against a NULL stack could cause NULLish pointer values to be considered valid. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221216115930.2667772-5-npiggin@gmail.com
2023-02-10powerpc/64: Move paca allocation to early_setup()Nicholas Piggin5-16/+13
The early paca and boot cpuid dance is complicated and currently does not quite work as expected for boot cpuid != 0 cases. early_init_devtree() currently allocates the paca_ptrs and boot cpuid paca, but until that returns and early_setup() calls setup_paca(), this thread is currently still executing with smp_processor_id() == 0. One problem this causes is the paca_ptrs[smp_processor_id()] pointer is poisoned, so valid_emergency_stack() (any backtrace) and any similar users will crash. Another is that the hardware id which is set here will not be returned by get_hard_smp_processor_id(smp_processor_id()), but it would work correctly for boot_cpuid == 0, which could lead to difficult to reproduce or find bugs. The hard id does not seem to be used by the rest of early_init_devtree(), it just looks like all this code might have been put here to allocate somewhere to store boot CPU hardware id while scanning the devtree. Rearrange things so the hwid is put in a global variable like boot_cpuid, and do all the paca allocation and boot paca setup in the 64-bit early_setup() after we have everything ready to go. The paca_ptrs[0] re-poisoning code in early_setup does not seem to have ever worked, because paca_ptrs[0] was never not-poisoned when boot_cpuid is not 0. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Fix build error on 32-bit] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221216115930.2667772-4-npiggin@gmail.com
2023-02-10powerpc/64: Fix task_cpu in early boot when booting non-zero cpuidNicholas Piggin1-0/+5
powerpc/64 can boot on a non-zero SMP processor id. Initially, the boot CPU is said to be "assumed to be 0" until early_init_devtree() discovers the id from the device tree. That is not a good description because the assumption can be wrong and that has to be handled, the better description is that 0 is used as a placeholder, and things are fixed after the real id is discovered. smp_processor_id() is set to the boot cpuid, but task_cpu(current) is not, which causes the smp_processor_id() == task_cpu(current) invariant to be broken until init_idle() in sched_init(). This is quite fragile and could lead to subtle bugs in future. One bug is that validate_sp_size uses task_cpu() to get the process stack, so any stack trace from the booting CPU between early_init_devtree() and sched_init() will have problems. Early on paca_ptrs[0] will be poisoned, so that can cause machine checks dereferencing that memory in real mode. Later, validating the current stack pointer against the idle task of a different secondary will probably cause no stack trace to be printed. Fix this by setting thread_info->cpu right after smp_processor_id() is set to the boot cpuid. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Fix SMP=n build as reported by sfr] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221216115930.2667772-3-npiggin@gmail.com
2023-02-10powerpc/64s: Fix stress_hpt memblock alloc alignmentNicholas Piggin1-1/+2
The stress_hpt memblock allocation did not pass in an alignment, which causes a stack dump in early boot (that I missed, oops). Fixes: 6b34a099faa1 ("powerpc/64s/hash: add stress_hpt kernel boot option to increase hash faults") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221216115930.2667772-2-npiggin@gmail.com
2023-02-10powerpc/64e: Simplify address calculation in secondary hold loopNicholas Piggin1-5/+1
As the earlier comment explains, __secondary_hold_spinloop does not have to be accessed at its virtual address, slightly simplifying code. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230203113858.1152093-4-npiggin@gmail.com
2023-02-10powerpc/64s: Refactor initialisation after promNicholas Piggin1-19/+25
Move some basic Book3S initialisation after prom to a function similar to what Book3E looks like. Book3E returns from this function at the virtual address mapping, and Book3S will do the same in a later change, so making them look similar helps with that. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230203113858.1152093-3-npiggin@gmail.com
2023-02-10crypto: powerpc - Use address generation helper for asmNicholas Piggin1-9/+4
Replace open-coded toc-relative address calculation with helper macros, commit dab3b8f4fd09 ("powerpc/64: asm use consistent global variable declaration and access") made similar conversions already but missed this one. This allows data addressing model to be changed more easily. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230203113858.1152093-2-npiggin@gmail.com
2023-02-10powerpc/powernv/ioda: Skip unallocated resources when mapping to PEFrederic Barrat1-1/+2
pnv_ioda_setup_pe_res() calls opal to map a resource with a PE. However, the code assumes the resource is allocated and it uses the resource address to find out the segment(s) which need to be mapped to the PE. In the unlikely case where the resource hasn't been allocated, the computation for the segment number is garbage, which can lead to invalid memory access and potentially a kernel crash, such as: [ ] pci_bus 0002:02: Configuring PE for bus [ ] pci 0002:02 : [PE# fc] Secondary bus 0x0000000000000002..0x0000000000000002 associated with PE#fc [ ] BUG: Kernel NULL pointer dereference on write at 0x00000000 [ ] Faulting instruction address: 0xc00000000005eac4 [ ] Oops: Kernel access of bad area, sig: 7 [#1] [ ] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV [ ] Modules linked in: [ ] CPU: 12 PID: 1 Comm: swapper/20 Not tainted 5.10.50-openpower1 #2 [ ] NIP: c00000000005eac4 LR: c00000000005ea44 CTR: 0000000030061b9c [ ] REGS: c000200007383650 TRAP: 0300 Not tainted (5.10.50-openpower1) [ ] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 44000224 XER: 20040000 [ ] CFAR: c00000000005eaa0 DAR: 0000000000000000 DSISR: 02080000 IRQMASK: 0 [ ] GPR00: c00000000005dd98 c0002000073838e0 c00000000185de00 c000200fff018960 [ ] GPR04: 00000000000000fc 0000000000000003 0000000000000000 0000000000000000 [ ] GPR08: 0000000000000000 0000000000000000 0000000000000000 9000000000001033 [ ] GPR12: 0000000031cb0000 c000000ffffe6a80 c000000000010a58 0000000000000000 [ ] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ ] GPR20: 0000000000000000 0000000000000000 0000000000000000 c00000000711e200 [ ] GPR24: 0000000000000100 c000200009501120 c00020000cee2800 00000000000003ff [ ] GPR28: c000200fff018960 0000000000000000 c000200ffcb7fd00 0000000000000000 [ ] NIP [c00000000005eac4] pnv_ioda_setup_pe_res+0x94/0x1a0 [ ] LR [c00000000005ea44] pnv_ioda_setup_pe_res+0x14/0x1a0 [ ] Call Trace: [ ] [c0002000073838e0] [c00000000005eb98] pnv_ioda_setup_pe_res+0x168/0x1a0 (unreliable) [ ] [c000200007383970] [c00000000005dd98] pnv_pci_ioda_dma_dev_setup+0x43c/0x970 [ ] [c000200007383a60] [c000000000032cdc] pcibios_bus_add_device+0x78/0x18c [ ] [c000200007383aa0] [c00000000028f2bc] pci_bus_add_device+0x28/0xbc [ ] [c000200007383b10] [c00000000028f3a0] pci_bus_add_devices+0x50/0x7c [ ] [c000200007383b50] [c00000000028f3c4] pci_bus_add_devices+0x74/0x7c [ ] [c000200007383b90] [c00000000028f3c4] pci_bus_add_devices+0x74/0x7c [ ] [c000200007383bd0] [c00000000069ad0c] pcibios_init+0xf0/0x104 [ ] [c000200007383c50] [c0000000000106d8] do_one_initcall+0x84/0x1c4 [ ] [c000200007383d20] [c0000000006910b8] kernel_init_freeable+0x264/0x268 [ ] [c000200007383dc0] [c000000000010a68] kernel_init+0x18/0x138 [ ] [c000200007383e20] [c00000000000cbfc] ret_from_kernel_thread+0x5c/0x80 [ ] Instruction dump: [ ] 7f89e840 409d000c 7fbbf840 409c000c 38210090 4848f448 809c002c e95e0120 [ ] 7ba91764 38a00003 57a7043e 38c00000 <7c8a492e> 5484043e e87e0018 4bff23bd Hitting the problem is not that easy. It was seen with a (semi-bogus) PCI device with a class code of 0. The generic PCI framework doesn't allocate resources in such a case. The patch is simply skipping resources which are still flagged with IORESOURCE_UNSET. We don't have the problem with 64-bit mem resources, as the address of the resource is checked to be within the range of the 64-bit mmio window. See pnv_ioda_reserve_dev_m64_pe() and pnv_pci_is_m64(). Reported-by: Andrew Jeffery <andrew@aj.id.au> Fixes: 23e79425fe7c ("powerpc/powernv: Simplify pnv_ioda_setup_pe_seg()") Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230120093215.19496-1-fbarrat@linux.ibm.com
2023-02-10powerpc/32: select HAVE_VIRT_CPU_ACCOUNTING_GENNicholas Piggin1-0/+1
cputime_t is no longer a type, so VIRT_CPU_ACCOUNTING_GEN does not have any affect on the type for 32-bit architectures, so there is no reason it can't be supported. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230121095805.2823731-4-npiggin@gmail.com
2023-02-10powerpc/32: implement HAVE_CONTEXT_TRACKING_USER supportNicholas Piggin1-1/+1
Context tracking involves tracking user, kernel, guest switches. 32-bit shares interrupt and syscall entry and exit code (and context tracking calls) with 64-bit, and KVM can not be selected if CONTEXT_TRACKING_USER is enabled, so context tracking can be enabled for 32-bit. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230121095805.2823731-3-npiggin@gmail.com
2023-02-10powerpc: Consolidate 32-bit and 64-bit interrupt_enter_prepareNicholas Piggin1-27/+8
There are two separeate implementations for 32-bit and 64-bit which mostly do the same thing. Consolidating on one implementation ends up being smaller and simpler, there is just irq soft-mask reconcile that is specific to 64-bit. There should be no real functional change with this patch, but it does make the context tracking calls necessary for 32-bit to support context tracking. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230121095805.2823731-2-npiggin@gmail.com
2023-02-10powerpc/hv-24x7: Fix pvr check when setting interface versionKajol Jain1-1/+2
Commit ec3eb9d941a9 ("powerpc/perf: Use PVR rather than oprofile field to determine CPU version") added usage of pvr value instead of oprofile field to determine the platform. In hv-24x7 pmu driver code, pvr check uses PVR_POWER8 when assigning the interface version for power8 platform. But power8 can also have other pvr values like PVR_POWER8E and PVR_POWER8NVL. Hence the interface version won't be set properly incase of PVR_POWER8E and PVR_POWER8NVL. Fix this issue by adding the checks for PVR_POWER8E and PVR_POWER8NVL as well. Fixes: ec3eb9d941a9 ("powerpc/perf: Use PVR rather than oprofile field to determine CPU version") Reported-by: Sachin Sant <sachinp@linux.ibm.com> Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230131184804.220756-1-kjain@linux.ibm.com
2023-02-10powerpc/bpf/32: perform three operands ALU operationsChristophe Leroy1-0/+10
When an ALU instruction is preceded by a MOV instruction that just moves a source register into the destination register of the ALU, replace that MOV+ALU instructions by an ALU operation taking the source of the MOV as second source instead of using its destination. Before the change, code could look like the following, with superfluous separate register move (mr) instructions. 70: 7f c6 f3 78 mr r6,r30 74: 7f a5 eb 78 mr r5,r29 78: 30 c6 ff f4 addic r6,r6,-12 7c: 7c a5 01 d4 addme r5,r5 With this commit, addition instructions take r30 and r29 directly. 70: 30 de ff f4 addic r6,r30,-12 74: 7c bd 01 d4 addme r5,r29 Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/b6719beaf01f9dcbcdbb787ef67c4a2f8e3a4cb6.1675245773.git.christophe.leroy@csgroup.eu
2023-02-10powerpc/bpf/32: introduce a second source register for ALU operationsChristophe Leroy1-167/+183
At the time being, all ALU operation are performed with same L-source and destination, requiring the L-source to be moved into destination via a separate register move, like: 70: 7f c6 f3 78 mr r6,r30 74: 7f a5 eb 78 mr r5,r29 78: 30 c6 ff f4 addic r6,r6,-12 7c: 7c a5 01 d4 addme r5,r5 Introduce a second source register to all ALU operations. For the time being that second source register is made equal to the destination register. That change will allow, via following patch, to optimise the generated code as: 70: 30 de ff f4 addic r6,r30,-12 74: 7c bd 01 d4 addme r5,r29 Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/d5aaaba50d9d6b4a0e9f0cd4a5e34101aca1e247.1675245773.git.christophe.leroy@csgroup.eu
2023-02-10powerpc/bpf/32: Optimise some particular const operationsChristophe Leroy1-3/+20
Simplify multiplications and divisions with constants when the constant is 1 or -1. When the constant is a power of 2, replace them by bit shits. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/e53b1f4a4150ec6cabcaeeef82bf9c361b5f9204.1675245773.git.christophe.leroy@csgroup.eu