aboutsummaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)AuthorFilesLines
2022-10-21ata: remove unused helper ata_id_flush_enabled()Niklas Cassel1-9/+0
Not only is this function unused, but even worse, the bit it is checking is actually used for signaling if the feature is supported, not enabled. Therefore, remove the unused helper function ata_id_flush_enabled(). ata_id_has_flush() is left unmodified, since this extra supported bit (Bit 12 of word 86) is simply a copy of the bit that ata_id_has_flush() already checks (Bit 12 of word 83), see ACS-5 r10: 7.13.6.41 Words 85..87, 120: Commands and feature sets supported or enabled Signed-off-by: Niklas Cassel <[email protected]> Signed-off-by: Damien Le Moal <[email protected]>
2022-10-21ata: remove unused helper ata_id_lba48_enabled()Niklas Cassel1-9/+0
Not only is this function unused, but even worse, the bit it is checking is actually used for signaling if the feature is supported, not enabled. Therefore, remove the unused helper function ata_id_lba48_enabled(). ata_id_has_lba48() is left unmodified, since this extra supported bit (Bit 10 of word 86) is simply a copy of the bit that ata_id_has_lba48() already checks (Bit 10 of word 83), see ACS-5 r10: 7.13.6.41 Words 85..87, 120: Commands and feature sets supported or enabled Signed-off-by: Niklas Cassel <[email protected]> Signed-off-by: Damien Le Moal <[email protected]>
2022-10-20srcu: Check for consistent per-CPU per-srcu_struct NMI safetyPaul E. McKenney2-6/+11
This commit adds runtime checks to verify that a given srcu_struct uses consistent NMI-safe (or not) read-side primitives on a per-CPU basis. Link: https://lore.kernel.org/all/[email protected]/ Signed-off-by: Paul E. McKenney <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: John Ogness <[email protected]> Cc: Petr Mladek <[email protected]>
2022-10-20srcu: Create an srcu_read_lock_nmisafe() and srcu_read_unlock_nmisafe()Paul E. McKenney1-0/+51
On strict load-store architectures, the use of this_cpu_inc() by srcu_read_lock() and srcu_read_unlock() is not NMI-safe in TREE SRCU. To see this suppose that an NMI arrives in the middle of srcu_read_lock(), just after it has read ->srcu_lock_count, but before it has written the incremented value back to memory. If that NMI handler also does srcu_read_lock() and srcu_read_lock() on that same srcu_struct structure, then upon return from that NMI handler, the interrupted srcu_read_lock() will overwrite the NMI handler's update to ->srcu_lock_count, but leave unchanged the NMI handler's update by srcu_read_unlock() to ->srcu_unlock_count. This can result in a too-short SRCU grace period, which can in turn result in arbitrary memory corruption. If the NMI handler instead interrupts the srcu_read_unlock(), this can result in eternal SRCU grace periods, which is not much better. This commit therefore creates a pair of new srcu_read_lock_nmisafe() and srcu_read_unlock_nmisafe() functions, which allow SRCU readers in both NMI handlers and in process and IRQ context. It is bad practice to mix the existing and the new _nmisafe() primitives on the same srcu_struct structure. Use one set or the other, not both. Just to underline that "bad practice" point, using srcu_read_lock() at process level and srcu_read_lock_nmisafe() in your NMI handler will not, repeat NOT, work. If you do not immediately understand why this is the case, please review the earlier paragraphs in this commit log. [ paulmck: Apply kernel test robot feedback. ] [ paulmck: Apply feedback from Randy Dunlap. ] [ paulmck: Apply feedback from John Ogness. ] [ paulmck: Apply feedback from Frederic Weisbecker. ] Link: https://lore.kernel.org/all/[email protected]/ Signed-off-by: Paul E. McKenney <[email protected]> Acked-by: Randy Dunlap <[email protected]> # build-tested Reviewed-by: Frederic Weisbecker <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: John Ogness <[email protected]> Cc: Petr Mladek <[email protected]>
2022-10-20USB: make devnode() callback in usb_class_driver take a const *Greg Kroah-Hartman1-1/+1
With the changes to the driver core to make more pointers const, the USB subsystem also needs to be modified to take a const * for the devnode callback so that the driver core's constant pointer will also be properly propagated. Cc: Benjamin Tissoires <[email protected]> Cc: Juergen Stuber <[email protected]> Reviewed-by: Johan Hovold <[email protected]> Acked-by: Pete Zaitcev <[email protected]> Reviewed-by: Jiri Kosina <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2022-10-20USB: allow some usb functions to take a const pointer.Greg Kroah-Hartman1-3/+52
The functions to_usb_interface(), to_usb_device, and interface_to_usbdev() sometimes would like to take a const * and return a const * back. As we are doing pointer math, a call to container_of() loses the const-ness of a pointer, so use a _Generic() macro to pick the proper inline function to call instead. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2022-10-20driver core: allow kobj_to_dev() to take a const pointerGreg Kroah-Hartman1-1/+17
If a const * to a kobject is passed to kobj_to_dev(), we want to return back a const * to a device as the driver core shouldn't be modifying a constant structure. But when dealing with container_of() the pointer const attribute is cast away, so we need to manually handle this by determining the type of the pointer passed in to know the type of the pointer to pass out. Luckily _Generic can do this type of magic, and as the kernel now supports C11 it is availble to us to handle this type of build-time type detection. Cc: "Rafael J. Wysocki" <[email protected]> Reviewed-by: Sakari Ailus <[email protected]> Reviewed-by: Andy Shevchenko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2022-10-20acl: remove a slew of now unused helpersChristian Brauner1-20/+0
Now that the posix acl api is active we can remove all the hacky helpers we had to keep around for all these years and also remove the set and get posix acl xattr handler methods as they aren't needed anymore. Signed-off-by: Christian Brauner (Microsoft) <[email protected]>
2022-10-20xattr: use posix acl apiChristian Brauner1-2/+8
In previous patches we built a new posix api solely around get and set inode operations. Now that we have all the pieces in place we can switch the system calls and the vfs over to only rely on this api when interacting with posix acls. This finally removes all type unsafety and type conversion issues explained in detail in [1] that we aim to get rid of. With the new posix acl api we immediately translate into an appropriate kernel internal struct posix_acl format both when getting and setting posix acls. This is a stark contrast to before were we hacked unsafe raw values into the uapi struct that was stored in a void pointer relying and having filesystems and security modules hack around in the uapi struct as well. Link: https://lore.kernel.org/all/[email protected] [1] Signed-off-by: Christian Brauner (Microsoft) <[email protected]>
2022-10-20ovl: use posix acl apiChristian Brauner1-0/+6
Now that posix acls have a proper api us it to copy them. All filesystems that can serve as lower or upper layers for overlayfs have gained support for the new posix acl api in previous patches. So switch all internal overlayfs codepaths for copying posix acls to the new posix acl api. Acked-by: Miklos Szeredi <[email protected]> Signed-off-by: Christian Brauner (Microsoft) <[email protected]>
2022-10-20acl: add vfs_remove_acl()Christian Brauner2-0/+21
In previous patches we implemented get and set inode operations for all non-stacking filesystems that support posix acls but didn't yet implement get and/or set acl inode operations. This specifically affected cifs and 9p. Now we can build a posix acl api based solely on get and set inode operations. We add a new vfs_remove_acl() api that can be used to set posix acls. This finally removes all type unsafety and type conversion issues explained in detail in [1] that we aim to get rid of. After we finished building the vfs api we can switch stacking filesystems to rely on the new posix api and then finally switch the xattr system calls themselves to rely on the posix acl api. Link: https://lore.kernel.org/all/[email protected] [1] Signed-off-by: Christian Brauner (Microsoft) <[email protected]>
2022-10-20acl: add vfs_get_acl()Christian Brauner2-0/+19
In previous patches we implemented get and set inode operations for all non-stacking filesystems that support posix acls but didn't yet implement get and/or set acl inode operations. This specifically affected cifs and 9p. Now we can build a posix acl api based solely on get and set inode operations. We add a new vfs_get_acl() api that can be used to get posix acls. This finally removes all type unsafety and type conversion issues explained in detail in [1] that we aim to get rid of. After we finished building the vfs api we can switch stacking filesystems to rely on the new posix api and then finally switch the xattr system calls themselves to rely on the posix acl api. Link: https://lore.kernel.org/all/[email protected] [1] Signed-off-by: Christian Brauner (Microsoft) <[email protected]>
2022-10-20acl: add vfs_set_acl()Christian Brauner2-0/+20
In previous patches we implemented get and set inode operations for all non-stacking filesystems that support posix acls but didn't yet implement get and/or set acl inode operations. This specifically affected cifs and 9p. Now we can build a posix acl api based solely on get and set inode operations. We add a new vfs_set_acl() api that can be used to set posix acls. This finally removes all type unsafety and type conversion issues explained in detail in [1] that we aim to get rid of. After we finished building the vfs api we can switch stacking filesystems to rely on the new posix api and then finally switch the xattr system calls themselves to rely on the posix acl api. Link: https://lore.kernel.org/all/[email protected] [1] Signed-off-by: Christian Brauner (Microsoft) <[email protected]>
2022-10-20evm: add post set acl hookChristian Brauner1-0/+13
The security_inode_post_setxattr() hook is used by security modules to update their own security.* xattrs. Consequently none of the security modules operate on posix acls. So we don't need an additional security hook when post setting posix acls. However, the integrity subsystem wants to be informed about posix acl changes in order to reset the EVM status flag. -> evm_inode_post_setxattr() -> evm_update_evmxattr() -> evm_calc_hmac() -> evm_calc_hmac_or_hash() and evm_cacl_hmac_or_hash() walks the global list of protected xattr names evm_config_xattrnames. This global list can be modified via /sys/security/integrity/evm/evm_xattrs. The write to "evm_xattrs" is restricted to security.* xattrs and the default xattrs in evm_config_xattrnames only contains security.* xattrs as well. So the actual value for posix acls is currently completely irrelevant for evm during evm_inode_post_setxattr() and frankly it should stay that way in the future to not cause the vfs any more headaches. But if the actual posix acl values matter then evm shouldn't operate on the binary void blob and try to hack around in the uapi struct anyway. Instead it should then in the future add a dedicated hook which takes a struct posix_acl argument passing the posix acls in the proper vfs format. For now it is sufficient to make evm_inode_post_set_acl() a wrapper around evm_inode_post_setxattr() not passing any actual values down. This will cause the hashes to be updated as before. Reviewed-by: Paul Moore <[email protected]> Signed-off-by: Christian Brauner (Microsoft) <[email protected]>
2022-10-20integrity: implement get and set acl hookChristian Brauner2-0/+47
The current way of setting and getting posix acls through the generic xattr interface is error prone and type unsafe. The vfs needs to interpret and fixup posix acls before storing or reporting it to userspace. Various hacks exist to make this work. The code is hard to understand and difficult to maintain in it's current form. Instead of making this work by hacking posix acls through xattr handlers we are building a dedicated posix acl api around the get and set inode operations. This removes a lot of hackiness and makes the codepaths easier to maintain. A lot of background can be found in [1]. So far posix acls were passed as a void blob to the security and integrity modules. Some of them like evm then proceed to interpret the void pointer and convert it into the kernel internal struct posix acl representation to perform their integrity checking magic. This is obviously pretty problematic as that requires knowledge that only the vfs is guaranteed to have and has lead to various bugs. Add a proper security hook for setting posix acls and pass down the posix acls in their appropriate vfs format instead of hacking it through a void pointer stored in the uapi format. I spent considerate time in the security module and integrity infrastructure and audited all codepaths. EVM is the only part that really has restrictions based on the actual posix acl values passed through it (e.g., i_mode). Before this dedicated hook EVM used to translate from the uapi posix acl format sent to it in the form of a void pointer into the vfs format. This is not a good thing. Instead of hacking around in the uapi struct give EVM the posix acls in the appropriate vfs format and perform sane permissions checks that mirror what it used to to in the generic xattr hook. IMA doesn't have any restrictions on posix acls. When posix acls are changed it just wants to update its appraisal status to trigger an EVM revalidation. The removal of posix acls is equivalent to passing NULL to the posix set acl hooks. This is the same as before through the generic xattr api. Link: https://lore.kernel.org/all/[email protected] [1] Acked-by: Paul Moore <[email protected]> (LSM) Signed-off-by: Christian Brauner (Microsoft) <[email protected]>
2022-10-20security: add get, remove and set acl hookChristian Brauner3-0/+47
The current way of setting and getting posix acls through the generic xattr interface is error prone and type unsafe. The vfs needs to interpret and fixup posix acls before storing or reporting it to userspace. Various hacks exist to make this work. The code is hard to understand and difficult to maintain in it's current form. Instead of making this work by hacking posix acls through xattr handlers we are building a dedicated posix acl api around the get and set inode operations. This removes a lot of hackiness and makes the codepaths easier to maintain. A lot of background can be found in [1]. So far posix acls were passed as a void blob to the security and integrity modules. Some of them like evm then proceed to interpret the void pointer and convert it into the kernel internal struct posix acl representation to perform their integrity checking magic. This is obviously pretty problematic as that requires knowledge that only the vfs is guaranteed to have and has lead to various bugs. Add a proper security hook for setting posix acls and pass down the posix acls in their appropriate vfs format instead of hacking it through a void pointer stored in the uapi format. In the next patches we implement the hooks for the few security modules that do actually have restrictions on posix acls. Link: https://lore.kernel.org/all/[email protected] [1] Acked-by: Paul Moore <[email protected]> Signed-off-by: Christian Brauner (Microsoft) <[email protected]>
2022-10-209p: implement get acl methodChristian Brauner1-0/+11
The current way of setting and getting posix acls through the generic xattr interface is error prone and type unsafe. The vfs needs to interpret and fixup posix acls before storing or reporting it to userspace. Various hacks exist to make this work. The code is hard to understand and difficult to maintain in it's current form. Instead of making this work by hacking posix acls through xattr handlers we are building a dedicated posix acl api around the get and set inode operations. This removes a lot of hackiness and makes the codepaths easier to maintain. A lot of background can be found in [1]. In order to build a type safe posix api around get and set acl we need all filesystem to implement get and set acl. So far 9p implemented a ->get_inode_acl() operation that didn't require access to the dentry in order to allow (limited) permission checking via posix acls in the vfs. Now that we have get and set acl inode operations that take a dentry argument we can give 9p get and set acl inode operations. This is mostly a refactoring of the codepaths currently used in 9p posix acl xattr handler. After we have fully implemented the posix acl api and switched the vfs over to it, the 9p specific posix acl xattr handler and associated code will be removed. Note, until the vfs has been switched to the new posix acl api this patch is a non-functional change. Link: https://lore.kernel.org/all/[email protected] [1] Signed-off-by: Christian Brauner (Microsoft) <[email protected]>
2022-10-20fs: add new get acl methodChristian Brauner1-0/+2
The current way of setting and getting posix acls through the generic xattr interface is error prone and type unsafe. The vfs needs to interpret and fixup posix acls before storing or reporting it to userspace. Various hacks exist to make this work. The code is hard to understand and difficult to maintain in it's current form. Instead of making this work by hacking posix acls through xattr handlers we are building a dedicated posix acl api around the get and set inode operations. This removes a lot of hackiness and makes the codepaths easier to maintain. A lot of background can be found in [1]. Since some filesystem rely on the dentry being available to them when setting posix acls (e.g., 9p and cifs) they cannot rely on the old get acl inode operation to retrieve posix acl and need to implement their own custom handlers because of that. In a previous patch we renamed the old get acl inode operation to ->get_inode_acl(). We decided to rename it and implement a new one since ->get_inode_acl() is called generic_permission() and inode_permission() both of which can be called during an filesystem's ->permission() handler. So simply passing a dentry argument to ->get_acl() would have amounted to also having to pass a dentry argument to ->permission(). We avoided that change. This adds a new ->get_acl() inode operations which takes a dentry argument which filesystems such as 9p, cifs, and overlayfs can implement to get posix acls. Link: https://lore.kernel.org/all/[email protected] [1] Signed-off-by: Christian Brauner (Microsoft) <[email protected]>
2022-10-20fs: rename current get acl methodChristian Brauner2-4/+4
The current way of setting and getting posix acls through the generic xattr interface is error prone and type unsafe. The vfs needs to interpret and fixup posix acls before storing or reporting it to userspace. Various hacks exist to make this work. The code is hard to understand and difficult to maintain in it's current form. Instead of making this work by hacking posix acls through xattr handlers we are building a dedicated posix acl api around the get and set inode operations. This removes a lot of hackiness and makes the codepaths easier to maintain. A lot of background can be found in [1]. The current inode operation for getting posix acls takes an inode argument but various filesystems (e.g., 9p, cifs, overlayfs) need access to the dentry. In contrast to the ->set_acl() inode operation we cannot simply extend ->get_acl() to take a dentry argument. The ->get_acl() inode operation is called from: acl_permission_check() -> check_acl() -> get_acl() which is part of generic_permission() which in turn is part of inode_permission(). Both generic_permission() and inode_permission() are called in the ->permission() handler of various filesystems (e.g., overlayfs). So simply passing a dentry argument to ->get_acl() would amount to also having to pass a dentry argument to ->permission(). We should avoid this unnecessary change. So instead of extending the existing inode operation rename it from ->get_acl() to ->get_inode_acl() and add a ->get_acl() method later that passes a dentry argument and which filesystems that need access to the dentry can implement instead of ->get_inode_acl(). Filesystems like cifs which allow setting and getting posix acls but not using them for permission checking during lookup can simply not implement ->get_inode_acl(). This is intended to be a non-functional change. Link: https://lore.kernel.org/all/[email protected] [1] Suggested-by/Inspired-by: Christoph Hellwig <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Christian Brauner (Microsoft) <[email protected]>
2022-10-19fscrypt: fix keyring memory leak on mount failureEric Biggers1-2/+2
Commit d7e7b9af104c ("fscrypt: stop using keyrings subsystem for fscrypt_master_key") moved the keyring destruction from __put_super() to generic_shutdown_super() so that the filesystem's block device(s) are still available. Unfortunately, this causes a memory leak in the case where a mount is attempted with the test_dummy_encryption mount option, but the mount fails after the option has already been processed. To fix this, attempt the keyring destruction in both places. Reported-by: [email protected] Fixes: d7e7b9af104c ("fscrypt: stop using keyrings subsystem for fscrypt_master_key") Signed-off-by: Eric Biggers <[email protected]> Reviewed-by: Christian Brauner (Microsoft) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-10-19netlink: add support for formatted extack messagesEdward Cree1-2/+27
Include an 80-byte buffer in struct netlink_ext_ack that can be used for scnprintf()ed messages. This does mean that the resulting string can't be enumerated, translated etc. in the way NL_SET_ERR_MSG() was designed to allow. Signed-off-by: Edward Cree <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-10-19gpio: aspeed: Add missing header(s)Andy Shevchenko1-0/+4
Do not imply that some of the generic headers may be always included. Instead, include explicitly what we are direct user of. While at it, sort headers alphabetically. Signed-off-by: Andy Shevchenko <[email protected]>
2022-10-19firmware: xilinx: Add qspi firmware interfaceRajan Vaja1-0/+19
Add support for QSPI ioctl functions and enums. Signed-off-by: Rajan Vaja <[email protected]> Signed-off-by: Michal Simek <[email protected]> Signed-off-by: Amit Kumar Mahapatra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]>
2022-10-19net: phylink: provide phylink_validate_mask_caps() helperRussell King (Oracle)1-0/+3
Provide a helper that restricts the link modes according to the phylink capabilities. Signed-off-by: Russell King (Oracle) <[email protected]> [rebased on net-next/master and added documentation] Signed-off-by: Sean Anderson <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-10-19fs: pass dentry to set acl methodChristian Brauner2-7/+7
The current way of setting and getting posix acls through the generic xattr interface is error prone and type unsafe. The vfs needs to interpret and fixup posix acls before storing or reporting it to userspace. Various hacks exist to make this work. The code is hard to understand and difficult to maintain in it's current form. Instead of making this work by hacking posix acls through xattr handlers we are building a dedicated posix acl api around the get and set inode operations. This removes a lot of hackiness and makes the codepaths easier to maintain. A lot of background can be found in [1]. Since some filesystem rely on the dentry being available to them when setting posix acls (e.g., 9p and cifs) they cannot rely on set acl inode operation. But since ->set_acl() is required in order to use the generic posix acl xattr handlers filesystems that do not implement this inode operation cannot use the handler and need to implement their own dedicated posix acl handlers. Update the ->set_acl() inode method to take a dentry argument. This allows all filesystems to rely on ->set_acl(). As far as I can tell all codepaths can be switched to rely on the dentry instead of just the inode. Note that the original motivation for passing the dentry separate from the inode instead of just the dentry in the xattr handlers was because of security modules that call security_d_instantiate(). This hook is called during d_instantiate_new(), d_add(), __d_instantiate_anon(), and d_splice_alias() to initialize the inode's security context and possibly to set security.* xattrs. Since this only affects security.* xattrs this is completely irrelevant for posix acls. Link: https://lore.kernel.org/all/[email protected] [1] Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Christian Brauner (Microsoft) <[email protected]>
2022-10-19x86/signal/32: Merge native and compat 32-bit signal codeBrian Gerst1-0/+2
There are significant differences between signal handling on 32-bit vs. 64-bit, like different structure layouts and legacy syscalls. Instead of duplicating that code for native and compat, merge both versions into one file. Signed-off-by: Brian Gerst <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Acked-by: "Eric W. Biederman" <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Borislav Petkov <[email protected]>
2022-10-19signal/compat: Remove compat_sigset_t overrideBrian Gerst1-2/+0
x86 no longer uses compat_sigset_t when CONFIG_COMPAT isn't enabled, so remove the override define. Signed-off-by: Brian Gerst <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Acked-by: "Eric W. Biederman" <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Borislav Petkov <[email protected]>
2022-10-19security: Create file_truncate hook from path_truncate hookGünther Noack3-1/+16
Like path_truncate, the file_truncate hook also restricts file truncation, but is called in the cases where truncation is attempted on an already-opened file. This is required in a subsequent commit to handle ftruncate() operations differently to truncate() operations. Acked-by: Tetsuo Handa <[email protected]> Acked-by: John Johansen <[email protected]> Acked-by: Paul Moore <[email protected]> Signed-off-by: Günther Noack <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mickaël Salaün <[email protected]>
2022-10-18net: remove smc911x driverArnd Bergmann1-14/+0
This driver was used on Arm and SH machines until 2009, when the last platforms moved to the smsc911x driver for the same hardware. Time to retire this version. Link: https://lore.kernel.org/netdev/[email protected]/ Signed-off-by: Arnd Bergmann <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-10-18Merge tag 'for-netdev' of ↵Jakub Kicinski1-0/+12
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2022-10-18 We've added 33 non-merge commits during the last 14 day(s) which contain a total of 31 files changed, 874 insertions(+), 538 deletions(-). The main changes are: 1) Add RCU grace period chaining to BPF to wait for the completion of access from both sleepable and non-sleepable BPF programs, from Hou Tao & Paul E. McKenney. 2) Improve helper UAPI by explicitly defining BPF_FUNC_xxx integer values. In the wild we have seen OS vendors doing buggy backports where helper call numbers mismatched. This is an attempt to make backports more foolproof, from Andrii Nakryiko. 3) Add libbpf *_opts API-variants for bpf_*_get_fd_by_id() functions, from Roberto Sassu. 4) Fix libbpf's BTF dumper for structs with padding-only fields, from Eduard Zingerman. 5) Fix various libbpf bugs which have been found from fuzzing with malformed BPF object files, from Shung-Hsi Yu. 6) Clean up an unneeded check on existence of SSE2 in BPF x86-64 JIT, from Jie Meng. 7) Fix various ASAN bugs in both libbpf and selftests when running the BPF selftest suite on arm64, from Xu Kuohai. 8) Fix missing bpf_iter_vma_offset__destroy() call in BPF iter selftest and use in-skeleton link pointer to remove an explicit bpf_link__destroy(), from Jiri Olsa. 9) Fix BPF CI breakage by pointing to iptables-legacy instead of relying on symlinked iptables which got upgraded to iptables-nft, from Martin KaFai Lau. 10) Minor BPF selftest improvements all over the place, from various others. * tag 'for-netdev' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (33 commits) bpf/docs: Update README for most recent vmtest.sh bpf: Use rcu_trace_implies_rcu_gp() for program array freeing bpf: Use rcu_trace_implies_rcu_gp() in local storage map bpf: Use rcu_trace_implies_rcu_gp() in bpf memory allocator rcu-tasks: Provide rcu_trace_implies_rcu_gp() selftests/bpf: Use sys_pidfd_open() helper when possible libbpf: Fix null-pointer dereference in find_prog_by_sec_insn() libbpf: Deal with section with no data gracefully libbpf: Use elf_getshdrnum() instead of e_shnum selftest/bpf: Fix error usage of ASSERT_OK in xdp_adjust_tail.c selftests/bpf: Fix error failure of case test_xdp_adjust_tail_grow selftest/bpf: Fix memory leak in kprobe_multi_test selftests/bpf: Fix memory leak caused by not destroying skeleton libbpf: Fix memory leak in parse_usdt_arg() libbpf: Fix use-after-free in btf_dump_name_dups selftests/bpf: S/iptables/iptables-legacy/ in the bpf_nf and xdp_synproxy test selftests/bpf: Alphabetize DENYLISTs selftests/bpf: Add tests for _opts variants of bpf_*_get_fd_by_id() libbpf: Introduce bpf_link_get_fd_by_id_opts() libbpf: Introduce bpf_btf_get_fd_by_id_opts() ... ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-10-18rcu: Remove unused 'cpu' in rcu_virt_note_context_switch()Zeng Heng3-3/+3
This commit removes the unused function argument 'cpu'. This does not change functionality, but might save a cycle or two. Signed-off-by: Zeng Heng <[email protected]> Acked-by: Mukesh Ojha <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2022-10-18srcu: Convert ->srcu_lock_count and ->srcu_unlock_count to atomicPaul E. McKenney1-2/+2
NMI-safe variants of srcu_read_lock() and srcu_read_unlock() are needed by printk(), which on many architectures entails read-modify-write atomic operations. This commit prepares Tree SRCU for this change by making both ->srcu_lock_count and ->srcu_unlock_count by atomic_long_t. [ paulmck: Apply feedback from John Ogness. ] Link: https://lore.kernel.org/all/[email protected]/ Signed-off-by: Paul E. McKenney <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: John Ogness <[email protected]> Cc: Petr Mladek <[email protected]>
2022-10-18rcu-tasks: Provide rcu_trace_implies_rcu_gp()Paul E. McKenney1-0/+12
As an accident of implementation, an RCU Tasks Trace grace period also acts as an RCU grace period. However, this could change at any time. This commit therefore creates an rcu_trace_implies_rcu_gp() that currently returns true to codify this accident. Code relying on this accident must call this function to verify that this accident is still happening. Reported-by: Hou Tao <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2022-10-18Merge drm/drm-next into drm-misc-nextMaxime Ripard337-2961/+7661
Let's kick-off this release cycle. Signed-off-by: Maxime Ripard <[email protected]>
2022-10-18attr: use consistent sgid stripping checksChristian Brauner1-1/+1
Currently setgid stripping in file_remove_privs()'s should_remove_suid() helper is inconsistent with other parts of the vfs. Specifically, it only raises ATTR_KILL_SGID if the inode is S_ISGID and S_IXGRP but not if the inode isn't in the caller's groups and the caller isn't privileged over the inode although we require this already in setattr_prepare() and setattr_copy() and so all filesystem implement this requirement implicitly because they have to use setattr_{prepare,copy}() anyway. But the inconsistency shows up in setgid stripping bugs for overlayfs in xfstests (e.g., generic/673, generic/683, generic/685, generic/686, generic/687). For example, we test whether suid and setgid stripping works correctly when performing various write-like operations as an unprivileged user (fallocate, reflink, write, etc.): echo "Test 1 - qa_user, non-exec file $verb" setup_testfile chmod a+rws $junk_file commit_and_check "$qa_user" "$verb" 64k 64k The test basically creates a file with 6666 permissions. While the file has the S_ISUID and S_ISGID bits set it does not have the S_IXGRP set. On a regular filesystem like xfs what will happen is: sys_fallocate() -> vfs_fallocate() -> xfs_file_fallocate() -> file_modified() -> __file_remove_privs() -> dentry_needs_remove_privs() -> should_remove_suid() -> __remove_privs() newattrs.ia_valid = ATTR_FORCE | kill; -> notify_change() -> setattr_copy() In should_remove_suid() we can see that ATTR_KILL_SUID is raised unconditionally because the file in the test has S_ISUID set. But we also see that ATTR_KILL_SGID won't be set because while the file is S_ISGID it is not S_IXGRP (see above) which is a condition for ATTR_KILL_SGID being raised. So by the time we call notify_change() we have attr->ia_valid set to ATTR_KILL_SUID | ATTR_FORCE. Now notify_change() sees that ATTR_KILL_SUID is set and does: ia_valid = attr->ia_valid |= ATTR_MODE attr->ia_mode = (inode->i_mode & ~S_ISUID); which means that when we call setattr_copy() later we will definitely update inode->i_mode. Note that attr->ia_mode still contains S_ISGID. Now we call into the filesystem's ->setattr() inode operation which will end up calling setattr_copy(). Since ATTR_MODE is set we will hit: if (ia_valid & ATTR_MODE) { umode_t mode = attr->ia_mode; vfsgid_t vfsgid = i_gid_into_vfsgid(mnt_userns, inode); if (!vfsgid_in_group_p(vfsgid) && !capable_wrt_inode_uidgid(mnt_userns, inode, CAP_FSETID)) mode &= ~S_ISGID; inode->i_mode = mode; } and since the caller in the test is neither capable nor in the group of the inode the S_ISGID bit is stripped. But assume the file isn't suid then ATTR_KILL_SUID won't be raised which has the consequence that neither the setgid nor the suid bits are stripped even though it should be stripped because the inode isn't in the caller's groups and the caller isn't privileged over the inode. If overlayfs is in the mix things become a bit more complicated and the bug shows up more clearly. When e.g., ovl_setattr() is hit from ovl_fallocate()'s call to file_remove_privs() then ATTR_KILL_SUID and ATTR_KILL_SGID might be raised but because the check in notify_change() is questioning the ATTR_KILL_SGID flag again by requiring S_IXGRP for it to be stripped the S_ISGID bit isn't removed even though it should be stripped: sys_fallocate() -> vfs_fallocate() -> ovl_fallocate() -> file_remove_privs() -> dentry_needs_remove_privs() -> should_remove_suid() -> __remove_privs() newattrs.ia_valid = ATTR_FORCE | kill; -> notify_change() -> ovl_setattr() // TAKE ON MOUNTER'S CREDS -> ovl_do_notify_change() -> notify_change() // GIVE UP MOUNTER'S CREDS // TAKE ON MOUNTER'S CREDS -> vfs_fallocate() -> xfs_file_fallocate() -> file_modified() -> __file_remove_privs() -> dentry_needs_remove_privs() -> should_remove_suid() -> __remove_privs() newattrs.ia_valid = attr_force | kill; -> notify_change() The fix for all of this is to make file_remove_privs()'s should_remove_suid() helper to perform the same checks as we already require in setattr_prepare() and setattr_copy() and have notify_change() not pointlessly requiring S_IXGRP again. It doesn't make any sense in the first place because the caller must calculate the flags via should_remove_suid() anyway which would raise ATTR_KILL_SGID. While we're at it we move should_remove_suid() from inode.c to attr.c where it belongs with the rest of the iattr helpers. Especially since it returns ATTR_KILL_S{G,U}ID flags. We also rename it to setattr_should_drop_suidgid() to better reflect that it indicates both setuid and setgid bit removal and also that it returns attr flags. Running xfstests with this doesn't report any regressions. We should really try and use consistent checks. Reviewed-by: Amir Goldstein <[email protected]> Signed-off-by: Christian Brauner (Microsoft) <[email protected]>
2022-10-18ata: add ata_port_is_frozen() helperNiklas Cassel1-0/+5
At the request of the libata maintainer, introduce a ata_port_is_frozen() helper function. Signed-off-by: Niklas Cassel <[email protected]> Signed-off-by: Damien Le Moal <[email protected]>
2022-10-17Merge tag 'cgroup-for-6.1-rc1-fixes' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fixes from Tejun Heo: - Fix a recent regression where a sleeping kernfs function is called with css_set_lock (spinlock) held - Revert the commit to enable cgroup1 support for cgroup_get_from_fd/file() Multiple users assume that the lookup only works for cgroup2 and breaks when fed a cgroup1 file. Instead, introduce a separate set of functions to lookup both v1 and v2 and use them where the user explicitly wants to support both versions. - Compat update for tools/perf/util/bpf_skel/bperf_cgroup.bpf.c. - Add Josef Bacik as a blkcg maintainer. * tag 'cgroup-for-6.1-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: blkcg: Update MAINTAINERS entry mm: cgroup: fix comments for get from fd/file helpers perf stat: Support old kernels for bperf cgroup counting bpf: cgroup_iter: support cgroup1 using cgroup fd cgroup: add cgroup_v1v2_get_from_[fd/file]() Revert "cgroup: enable cgroup_get_from_file() on cgroup1" cgroup: Reorganize css_set_lock and kernfs path processing
2022-10-18dma-buf: Remove obsoleted internal lockDmitry Osipenko1-9/+0
The internal dma-buf lock isn't needed anymore because the updated locking specification claims that dma-buf reservation must be locked by importers, and thus, the internal data is already protected by the reservation lock. Remove the obsoleted internal lock. Acked-by: Sumit Semwal <[email protected]> Acked-by: Christian König <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Dmitry Osipenko <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2022-10-18dma-buf: Add unlocked variant of attachment-mapping functionsDmitry Osipenko1-0/+6
Add unlocked variant of dma_buf_map/unmap_attachment() that will be used by drivers that don't take the reservation lock explicitly. Acked-by: Sumit Semwal <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Dmitry Osipenko <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2022-10-18dma-buf: Add unlocked variant of vmapping functionsDmitry Osipenko1-0/+2
Add unlocked variant of dma_buf_vmap/vunmap() that will be utilized by drivers that don't take the reservation lock explicitly. Acked-by: Sumit Semwal <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Dmitry Osipenko <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2022-10-17pstore/ram: Move internal definitions out of kernel-wide includeKees Cook1-99/+0
Most of the details of the ram backend are entirely internal to the backend itself. Leave only what is needed to instantiate a ram backend in the kernel-wide header. Cc: Anton Vorontsov <[email protected]> Cc: Colin Cross <[email protected]> Cc: Tony Luck <[email protected]> Signed-off-by: Kees Cook <[email protected]> Reviewed-and-tested-by: Guilherme G. Piccoli <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-10-17fs: edit a comment made in bad tastePaul Moore1-1/+1
I know nobody likes a buzzkill, but I figure it's best to keep the bad jokes appropriate for small children. Signed-off-by: Paul Moore <[email protected]>
2022-10-17static_call: Add call depth tracking supportPeter Zijlstra1-0/+2
When indirect calls are switched to direct calls then it has to be ensured that the call target is not the function, but the call thunk when call depth tracking is enabled. But static calls are available before call thunks have been set up. Ensure a second run through the static call patching code after call thunks have been created. When call thunks are not enabled this has no side effects. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-10-17x86/Kconfig: Introduce function paddingThomas Gleixner1-0/+4
Now that all functions are 16 byte aligned, add 16 bytes of NOP padding in front of each function. This prepares things for software call stack tracking and kCFI/FineIBT. This significantly increases kernel .text size, around 5.1% on a x86_64-defconfig-ish build. However, per the random access argument used for alignment, these 16 extra bytes are code that wouldn't be used. Performance measurements back this up by showing no significant performance regressions. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-10-17arch: Introduce CONFIG_FUNCTION_ALIGNMENTPeter Zijlstra1-2/+2
Generic function-alignment infrastructure. Architectures can select FUNCTION_ALIGNMENT_xxB symbols; the FUNCTION_ALIGNMENT symbol is then set to the largest such selected size, 0 otherwise. From this the -falign-functions compiler argument and __ALIGN macro are set. This incorporates the DEBUG_FORCE_FUNCTION_ALIGN_64B knob and future alignment requirements for x86_64 (later in this series) into a single place. NOTE: also removes the 0x90 filler byte from the generic __ALIGN primitive, that value makes no sense outside of x86. NOTE: .balign 0 reverts to a no-op. Requested-by: Linus Torvalds <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-10-17perf: Fix missing SIGTRAPsPeter Zijlstra1-4/+15
Marco reported: Due to the implementation of how SIGTRAP are delivered if perf_event_attr::sigtrap is set, we've noticed 3 issues: 1. Missing SIGTRAP due to a race with event_sched_out() (more details below). 2. Hardware PMU events being disabled due to returning 1 from perf_event_overflow(). The only way to re-enable the event is for user space to first "properly" disable the event and then re-enable it. 3. The inability to automatically disable an event after a specified number of overflows via PERF_EVENT_IOC_REFRESH. The worst of the 3 issues is problem (1), which occurs when a pending_disable is "consumed" by a racing event_sched_out(), observed as follows: CPU0 | CPU1 --------------------------------+--------------------------- __perf_event_overflow() | perf_event_disable_inatomic() | pending_disable = CPU0 | ... | _perf_event_enable() | event_function_call() | task_function_call() | /* sends IPI to CPU0 */ <IPI> | ... __perf_event_enable() +--------------------------- ctx_resched() task_ctx_sched_out() ctx_sched_out() group_sched_out() event_sched_out() pending_disable = -1 </IPI> <IRQ-work> perf_pending_event() perf_pending_event_disable() /* Fails to send SIGTRAP because no pending_disable! */ </IRQ-work> In the above case, not only is that particular SIGTRAP missed, but also all future SIGTRAPs because 'event_limit' is not reset back to 1. To fix, rework pending delivery of SIGTRAP via IRQ-work by introduction of a separate 'pending_sigtrap', no longer using 'event_limit' and 'pending_disable' for its delivery. Additionally; and different to Marco's proposed patch: - recognise that pending_disable effectively duplicates oncpu for the case where it is set. As such, change the irq_work handler to use ->oncpu to target the event and use pending_* as boolean toggles. - observe that SIGTRAP targets the ctx->task, so the context switch optimization that carries contexts between tasks is invalid. If the irq_work were delayed enough to hit after a context switch the SIGTRAP would be delivered to the wrong task. - observe that if the event gets scheduled out (rotation/migration/context-switch/...) the irq-work would be insufficient to deliver the SIGTRAP when the event gets scheduled back in (the irq-work might still be pending on the old CPU). Therefore have event_sched_out() convert the pending sigtrap into a task_work which will deliver the signal at return_to_user. Fixes: 97ba62b27867 ("perf: Add support for SIGTRAP on perf events") Reported-by: Dmitry Vyukov <[email protected]> Debugged-by: Dmitry Vyukov <[email protected]> Reported-by: Marco Elver <[email protected]> Debugged-by: Marco Elver <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Marco Elver <[email protected]> Tested-by: Marco Elver <[email protected]>
2022-10-17counter: Reduce DEFINE_COUNTER_ARRAY_POLARITY() to defining counter_arrayWilliam Breathitt Gray1-3/+2
A spare warning was reported for drivers/counter/ti-ecap-capture.c:: sparse warnings: (new ones prefixed by >>) >> drivers/counter/ti-ecap-capture.c:380:8: sparse: sparse: symbol 'ecap_cnt_pol_array' was not declared. Should it be static? vim +/ecap_cnt_pol_array +380 drivers/counter/ti-ecap-capture.c 379 > 380 static DEFINE_COUNTER_ARRAY_POLARITY(ecap_cnt_pol_array, ecap_cnt_pol_avail, ECAP_NB_CEVT); 381 The first argument to the DEFINE_COUNTER_ARRAY_POLARITY() macro is a token serving as the symbol name in the definition of a new struct counter_array structure. However, this macro actually expands to two statements:: #define DEFINE_COUNTER_ARRAY_POLARITY(_name, _enums, _length) \ DEFINE_COUNTER_AVAILABLE(_name##_available, _enums); \ struct counter_array _name = { \ .type = COUNTER_COMP_SIGNAL_POLARITY, \ .avail = &(_name##_available), \ .length = (_length), \ } Because of this, the "static" on line 380 only applies to the first statement. This patch splits out the DEFINE_COUNTER_AVAILABLE() line and leaves DEFINE_COUNTER_ARRAY_POLARITY() as a simple structure definition to avoid issues like this. Reported-by: kernel test robot <[email protected]> Link: https://lore.kernel.org/all/[email protected]/ Cc: Julien Panis <[email protected]> Signed-off-by: William Breathitt Gray <[email protected]>
2022-10-17fbdev: MIPS supports iomem addressesKees Cook1-1/+1
Add MIPS to fb_* helpers list for iomem addresses. This silences Sparse warnings about lacking __iomem address space casts: drivers/video/fbdev/pvr2fb.c:800:9: sparse: sparse: incorrect type in argument 1 (different address spaces) drivers/video/fbdev/pvr2fb.c:800:9: sparse: expected void const * drivers/video/fbdev/pvr2fb.c:800:9: sparse: got char [noderef] __iomem *screen_base Reported-by: kernel test robot <[email protected]> Link: https://lore.kernel.org/lkml/[email protected]/ Cc: Helge Deller <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Kees Cook <[email protected]> Signed-off-by: Helge Deller <[email protected]>
2022-10-17Merge existing fixes from spi/for-6.1 into new branchMark Brown1-1/+1
2022-10-16of: declare string literals constChristian Göttsche1-2/+2
of_overlay_action_name() returns a string literal from a function local array. Modifying string literals is undefined behavior which usage of const pointer can avoid. of_overlay_action_name() is currently only used once in overlay_notify() to print the returned value. While on it declare the data array const as well. Reported by Clang: In file included from arch/x86/kernel/asm-offsets.c:22: In file included from arch/x86/kernel/../kvm/vmx/vmx.h:5: In file included from ./include/linux/kvm_host.h:19: In file included from ./include/linux/msi.h:23: In file included from ./arch/x86/include/asm/msi.h:5: In file included from ./arch/x86/include/asm/irqdomain.h:5: In file included from ./include/linux/irqdomain.h:35: ./include/linux/of.h:1555:3: error: initializing 'char *' with an expression of type 'const char[5]' discards qualifiers [-Werror,-Wincompatible-pointer-types-discards-qualifiers] "init", ^~~~~~ ./include/linux/of.h:1556:3: error: initializing 'char *' with an expression of type 'const char[10]' discards qualifiers [-Werror,-Wincompatible-pointer-types-discards-qualifiers] "pre-apply", ^~~~~~~~~~~ ./include/linux/of.h:1557:3: error: initializing 'char *' with an expression of type 'const char[11]' discards qualifiers [-Werror,-Wincompatible-pointer-types-discards-qualifiers] "post-apply", ^~~~~~~~~~~~ ./include/linux/of.h:1558:3: error: initializing 'char *' with an expression of type 'const char[11]' discards qualifiers [-Werror,-Wincompatible-pointer-types-discards-qualifiers] "pre-remove", ^~~~~~~~~~~~ ./include/linux/of.h:1559:3: error: initializing 'char *' with an expression of type 'const char[12]' discards qualifiers [-Werror,-Wincompatible-pointer-types-discards-qualifiers] "post-remove", ^~~~~~~~~~~~~ Signed-off-by: Christian Göttsche <[email protected]> Reviewed-by: Frank Rowand <[email protected]> Reviewed-by: Nick Desaulniers <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Rob Herring <[email protected]>