aboutsummaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)AuthorFilesLines
2023-10-30Merge tag 'irq-core-2023-10-29-v2' of ↵Linus Torvalds2-9/+27
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: "Core: - Exclude managed interrupts in the calculation of interrupts which are targeted to a CPU which is about to be offlined to ensure that there are enough free vectors on the still online CPUs to migrate them over. Managed interrupts do not need to be accounted because they are either shut down on offline or migrated to an already reserved and guaranteed slot on a still online CPU in the interrupts affinity mask. Including managed interrupts is overaccounting and can result in needlessly aborting hibernation on large server machines. - The usual set of small improvements Drivers: - Make the generic interrupt chip implementation handle interrupt domains correctly and initialize the name pointers correctly - Add interrupt affinity setting support to the Renesas RZG2L chip driver. - Prevent registering syscore operations multiple times in the SiFive PLIC chip driver. - Update device tree handling in the NXP Layerscape MSI chip driver" * tag 'irq-core-2023-10-29-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/sifive-plic: Fix syscore registration for multi-socket systems irqchip/ls-scfg-msi: Use device_get_match_data() genirq/generic_chip: Make irq_remove_generic_chip() irqdomain aware genirq/matrix: Exclude managed interrupts in irq_matrix_allocated() PCI/MSI: Provide stubs for IMS functions irqchip/renesas-rzg2l: Enhance driver to support interrupt affinity setting genirq/generic-chip: Fix the irq_chip name for /proc/interrupts irqdomain: Annotate struct irq_domain with __counted_by
2023-10-30closures: Fix race in closure_sync()Kent Overstreet1-1/+9
As pointed out by Linus, closure_sync() was racy; we could skip blocking immediately after a get() and a put(), but then that would skip any barrier corresponding to the other thread's put() barrier. To fix this, always do the full __closure_sync() sequence whenever any get() has happened and the closure might have been used by other threads. Signed-off-by: Kent Overstreet <[email protected]>
2023-10-30closures: Better memory barriersKent Overstreet1-2/+0
atomic_(dec|sub)_return_release() are a thing now - use them. Also, delete the useless barrier in set_closure_fn(): it's redundant with the memory barrier in closure_put(0. Since closure_put() would now otherwise just have a release barrier, we also need a new barrier when the ref hits 0 - smp_acquire__after_ctrl_dep(). Signed-off-by: Kent Overstreet <[email protected]>
2023-10-30Merge tag 'x86-mm-2023-10-28' of ↵Linus Torvalds1-0/+7
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 mm handling updates from Ingo Molnar: - Add new NX-stack self-test - Improve NUMA partial-CFMWS handling - Fix #VC handler bugs resulting in SEV-SNP boot failures - Drop the 4MB memory size restriction on minimal NUMA nodes - Reorganize headers a bit, in preparation to header dependency reduction efforts - Misc cleanups & fixes * tag 'x86-mm-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm: Drop the 4 MB restriction on minimal NUMA node memory size selftests/x86/lam: Zero out buffer for readlink() x86/sev: Drop unneeded #include x86/sev: Move sev_setup_arch() to mem_encrypt.c x86/tdx: Replace deprecated strncpy() with strtomem_pad() selftests/x86/mm: Add new test that userspace stack is in fact NX x86/sev: Make boot_ghcb_page[] static x86/boot: Move x86_cache_alignment initialization to correct spot x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach x86/sev-es: Allow copy_from_kernel_nofault() in earlier boot x86_64: Show CR4.PSE on auxiliaries like on BSP x86/iommu/docs: Update AMD IOMMU specification document URL x86/sev/docs: Update document URL in amd-memory-encryption.rst x86/mm: Move arch_memory_failure() and arch_is_platform_page() definitions from <asm/processor.h> to <asm/pgtable.h> ACPI/NUMA: Apply SRAT proximity domain to entire CFMWS window x86/numa: Introduce numa_fill_memblks()
2023-10-30Merge tag 'perf-core-2023-10-28' of ↵Linus Torvalds1-1/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull performance event updates from Ingo Molnar: - Add AMD Unified Memory Controller (UMC) events introduced with Zen 4 - Simplify & clean up the uncore management code - Fall back from RDPMC to RDMSR on certain uncore PMUs - Improve per-package and cstate event reading - Extend the Intel ref-cycles event to GP counters - Fix Intel MTL event constraints - Improve the Intel hybrid CPU handling code - Micro-optimize the RAPL code - Optimize perf_cgroup_switch() - Improve large AUX area error handling - Misc fixes and cleanups * tag 'perf-core-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits) perf/x86/amd/uncore: Pass through error code for initialization failures, instead of -ENODEV perf/x86/amd/uncore: Fix uninitialized return value in amd_uncore_init() x86/cpu: Fix the AMD Fam 17h, Fam 19h, Zen2 and Zen4 MSR enumerations perf: Optimize perf_cgroup_switch() perf/x86/amd/uncore: Add memory controller support perf/x86/amd/uncore: Add group exclusivity perf/x86/amd/uncore: Use rdmsr if rdpmc is unavailable perf/x86/amd/uncore: Move discovery and registration perf/x86/amd/uncore: Refactor uncore management perf/core: Allow reading package events from perf_event_read_local perf/x86/cstate: Allow reading the package statistics from local CPU perf/x86/intel/pt: Fix kernel-doc comments perf/x86/rapl: Annotate 'struct rapl_pmus' with __counted_by perf/core: Rename perf_proc_update_handler() -> perf_event_max_sample_rate_handler(), for readability perf/x86/rapl: Fix "Using plain integer as NULL pointer" Sparse warning perf/x86/rapl: Use local64_try_cmpxchg in rapl_event_update() perf/x86/rapl: Stop doing cpu_relax() in the local64_cmpxchg() loop in rapl_event_update() perf/core: Bail out early if the request AUX area is out of bound perf/x86/intel: Extend the ref-cycles event to GP counters perf/x86/intel: Fix broken fixed event constraints extension ...
2023-10-30Merge tag 'objtool-core-2023-10-28' of ↵Linus Torvalds1-5/+5
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool updates from Ingo Molnar: "Misc fixes and cleanups: - Fix potential MAX_NAME_LEN limit related build failures - Fix scripts/faddr2line symbol filtering bug - Fix scripts/faddr2line on LLVM=1 - Fix scripts/faddr2line to accept readelf output with mapping symbols - Minor cleanups" * tag 'objtool-core-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: scripts/faddr2line: Skip over mapping symbols in output from readelf scripts/faddr2line: Use LLVM addr2line and readelf if LLVM=1 scripts/faddr2line: Don't filter out non-function symbols from readelf objtool: Remove max symbol name length limitation objtool: Propagate early errors objtool: Use 'the fallthrough' pseudo-keyword x86/speculation, objtool: Use absolute relocations for annotations x86/unwind/orc: Remove redundant initialization of 'mid' pointer in __orc_find()
2023-10-30Merge tag 'sched-core-2023-10-28' of ↵Linus Torvalds16-22/+99
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "Fair scheduler (SCHED_OTHER) improvements: - Remove the old and now unused SIS_PROP code & option - Scan cluster before LLC in the wake-up path - Use candidate prev/recent_used CPU if scanning failed for cluster wakeup NUMA scheduling improvements: - Improve the VMA access-PID code to better skip/scan VMAs - Extend tracing to cover VMA-skipping decisions - Improve/fix the recently introduced sched_numa_find_nth_cpu() code - Generalize numa_map_to_online_node() Energy scheduling improvements: - Remove the EM_MAX_COMPLEXITY limit - Add tracepoints to track energy computation - Make the behavior of the 'sched_energy_aware' sysctl more consistent - Consolidate and clean up access to a CPU's max compute capacity - Fix uclamp code corner cases RT scheduling improvements: - Drive dl_rq->overloaded with dl_rq->pushable_dl_tasks updates - Drive the ->rto_mask with rt_rq->pushable_tasks updates Scheduler scalability improvements: - Rate-limit updates to tg->load_avg - On x86 disable IBRS when CPU is offline to improve single-threaded performance - Micro-optimize in_task() and in_interrupt() - Micro-optimize the PSI code - Avoid updating PSI triggers and ->rtpoll_total when there are no state changes Core scheduler infrastructure improvements: - Use saved_state to reduce some spurious freezer wakeups - Bring in a handful of fast-headers improvements to scheduler headers - Make the scheduler UAPI headers more widely usable by user-space - Simplify the control flow of scheduler syscalls by using lock guards - Fix sched_setaffinity() vs. CPU hotplug race Scheduler debuggability improvements: - Disallow writing invalid values to sched_rt_period_us - Fix a race in the rq-clock debugging code triggering warnings - Fix a warning in the bandwidth distribution code - Micro-optimize in_atomic_preempt_off() checks - Enforce that the tasklist_lock is held in for_each_thread() - Print the TGID in sched_show_task() - Remove the /proc/sys/kernel/sched_child_runs_first sysctl ... and misc cleanups & fixes" * tag 'sched-core-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (82 commits) sched/fair: Remove SIS_PROP sched/fair: Use candidate prev/recent_used CPU if scanning failed for cluster wakeup sched/fair: Scan cluster before scanning LLC in wake-up path sched: Add cpus_share_resources API sched/core: Fix RQCF_ACT_SKIP leak sched/fair: Remove unused 'curr' argument from pick_next_entity() sched/nohz: Update comments about NEWILB_KICK sched/fair: Remove duplicate #include sched/psi: Update poll => rtpoll in relevant comments sched: Make PELT acronym definition searchable sched: Fix stop_one_cpu_nowait() vs hotplug sched/psi: Bail out early from irq time accounting sched/topology: Rename 'DIE' domain to 'PKG' sched/psi: Delete the 'update_total' function parameter from update_triggers() sched/psi: Avoid updating PSI triggers and ->rtpoll_total when there are no state changes sched/headers: Remove comment referring to rq::cpu_load, since this has been removed sched/numa: Complete scanning of inactive VMAs when there is no alternative sched/numa: Complete scanning of partial VMAs regardless of PID activity sched/numa: Move up the access pid reset logic sched/numa: Trace decisions related to skipping VMAs ...
2023-10-30Merge tag 'locking-core-2023-10-28' of ↵Linus Torvalds7-26/+107
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Info Molnar: "Futex improvements: - Add the 'futex2' syscall ABI, which is an attempt to get away from the multiplex syscall and adds a little room for extentions, while lifting some limitations. - Fix futex PI recursive rt_mutex waiter state bug - Fix inter-process shared futexes on no-MMU systems - Use folios instead of pages Micro-optimizations of locking primitives: - Improve arch_spin_value_unlocked() on asm-generic ticket spinlock architectures, to improve lockref code generation - Improve the x86-32 lockref_get_not_zero() main loop by adding build-time CMPXCHG8B support detection for the relevant lockref code, and by better interfacing the CMPXCHG8B assembly code with the compiler - Introduce arch_sync_try_cmpxchg() on x86 to improve sync_try_cmpxchg() code generation. Convert some sync_cmpxchg() users to sync_try_cmpxchg(). - Micro-optimize rcuref_put_slowpath() Locking debuggability improvements: - Improve CONFIG_DEBUG_RT_MUTEXES=y to have a fast-path as well - Enforce atomicity of sched_submit_work(), which is de-facto atomic but was un-enforced previously. - Extend <linux/cleanup.h>'s no_free_ptr() with __must_check semantics - Fix ww_mutex self-tests - Clean up const-propagation in <linux/seqlock.h> and simplify the API-instantiation macros a bit RT locking improvements: - Provide the rt_mutex_*_schedule() primitives/helpers and use them in the rtmutex code to avoid recursion vs. rtlock on the PI state. - Add nested blocking lockdep asserts to rt_mutex_lock(), rtlock_lock() and rwbase_read_lock() .. plus misc fixes & cleanups" * tag 'locking-core-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (39 commits) futex: Don't include process MM in futex key on no-MMU locking/seqlock: Fix grammar in comment alpha: Fix up new futex syscall numbers locking/seqlock: Propagate 'const' pointers within read-only methods, remove forced type casts locking/lockdep: Fix string sizing bug that triggers a format-truncation compiler-warning locking/seqlock: Change __seqprop() to return the function pointer locking/seqlock: Simplify SEQCOUNT_LOCKNAME() locking/atomics: Use atomic_try_cmpxchg_release() to micro-optimize rcuref_put_slowpath() locking/atomic, xen: Use sync_try_cmpxchg() instead of sync_cmpxchg() locking/atomic/x86: Introduce arch_sync_try_cmpxchg() locking/atomic: Add generic support for sync_try_cmpxchg() and its fallback locking/seqlock: Fix typo in comment futex/requeue: Remove unnecessary ‘NULL’ initialization from futex_proxy_trylock_atomic() locking/local, arch: Rewrite local_add_unless() as a static inline function locking/debug: Fix debugfs API return value checks to use IS_ERR() locking/ww_mutex/test: Make sure we bail out instead of livelock locking/ww_mutex/test: Fix potential workqueue corruption locking/ww_mutex/test: Use prng instead of rng to avoid hangs at bootup futex: Add sys_futex_requeue() futex: Add flags2 argument to futex_requeue() ...
2023-10-30Merge tag 'x86_platform_for_6.7_rc1' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 platform updates from Borislav Petkov: - Make sure PCI function 4 IDs of AMD family 0x19, models 0x60-0x7f are actually used in the amd_nb.c enumeration - Add support for extracting NUMA information from devicetree for Hyper-V usages - Add PCI device IDs for the new AMD MI300 AI accelerators - Annotate an array in struct uv_rtc_timer_head with the new __counted_by attribute - Rework UV's NMI action parameter handling * tag 'x86_platform_for_6.7_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/amd_nb: Use Family 19h Models 60h-7Fh Function 4 IDs x86/numa: Add Devicetree support x86/of: Move the x86_flattree_get_config() call out of x86_dtb_init() x86/amd_nb: Add AMD Family MI300 PCI IDs x86/platform/uv: Annotate struct uv_rtc_timer_head with __counted_by x86/platform/uv: Rework NMI "action" modparam handling
2023-10-30Merge tag 'x86_cache_for_6.7_rc1' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 resource control updates from Borislav Petkov: - Add support for non-contiguous capacity bitmasks being added to Intel's CAT implementation - Other improvements to resctrl code: better configuration, simplifications, debugging support, fixes * tag 'x86_cache_for_6.7_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/resctrl: Display RMID of resource group x86/resctrl: Add support for the files of MON groups only x86/resctrl: Display CLOSID for resource group x86/resctrl: Introduce "-o debug" mount option x86/resctrl: Move default group file creation to mount x86/resctrl: Unwind properly from rdt_enable_ctx() x86/resctrl: Rename rftype flags for consistency x86/resctrl: Simplify rftype flag definitions x86/resctrl: Add multiple tasks to the resctrl group at once Documentation/x86: Document resctrl's new sparse_masks x86/resctrl: Add sparse_masks file in info x86/resctrl: Enable non-contiguous CBMs in Intel CAT x86/resctrl: Rename arch_has_sparse_bitmaps x86/resctrl: Fix remaining kernel-doc warnings
2023-10-30Merge tag 'x86_bugs_for_6.7_rc1' of ↵Linus Torvalds1-1/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 hw mitigation updates from Borislav Petkov: - A bunch of improvements, cleanups and fixlets to the SRSO mitigation machinery and other, general cleanups to the hw mitigations code, by Josh Poimboeuf - Improve the return thunk detection by objtool as it is absolutely important that the default return thunk is not used after returns have been patched. Future work to detect and report this better is pending - Other misc cleanups and fixes * tag 'x86_bugs_for_6.7_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits) x86/retpoline: Document some thunk handling aspects x86/retpoline: Make sure there are no unconverted return thunks due to KCSAN x86/callthunks: Delete unused "struct thunk_desc" x86/vdso: Run objtool on vdso32-setup.o objtool: Fix return thunk patching in retpolines x86/srso: Remove unnecessary semicolon x86/pti: Fix kernel warnings for pti= and nopti cmdline options x86/calldepth: Rename __x86_return_skl() to call_depth_return_thunk() x86/nospec: Refactor UNTRAIN_RET[_*] x86/rethunk: Use SYM_CODE_START[_LOCAL]_NOALIGN macros x86/srso: Disentangle rethunk-dependent options x86/srso: Move retbleed IBPB check into existing 'has_microcode' code block x86/bugs: Remove default case for fully switched enums x86/srso: Remove 'pred_cmd' label x86/srso: Unexport untraining functions x86/srso: Improve i-cache locality for alias mitigation x86/srso: Fix unret validation dependencies x86/srso: Fix vulnerability reporting for missing microcode x86/srso: Print mitigation for retbleed IBPB case x86/srso: Print actual mitigation if requested mitigation isn't possible ...
2023-10-30Merge tag 'edac_updates_for_v6.7' of ↵Linus Torvalds1-0/+12
git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull EDAC updates from Borislav Petkov: - A new EDAC driver for Xilinx's Versal integrated memory controller * tag 'edac_updates_for_v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: EDAC/versal: Add a Xilinx Versal memory controller driver dt-bindings: memory-controllers: Add support for Xilinx Versal EDAC for DDRMC
2023-10-30Merge tag 'bcachefs-2023-10-30' of https://evilpiepirate.org/git/bcachefsLinus Torvalds6-3/+481
Pull initial bcachefs updates from Kent Overstreet: "Here's the bcachefs filesystem pull request. One new patch since last week: the exportfs constants ended up conflicting with other filesystems that are also getting added to the global enum, so switched to new constants picked by Amir. The only new non fs/bcachefs/ patch is the objtool patch that adds bcachefs functions to the list of noreturns. The patch that exports osq_lock() has been dropped for now, per Ingo" * tag 'bcachefs-2023-10-30' of https://evilpiepirate.org/git/bcachefs: (2781 commits) exportfs: Change bcachefs fid_type enum to avoid conflicts bcachefs: Refactor memcpy into direct assignment bcachefs: Fix drop_alloc_keys() bcachefs: snapshot_create_lock bcachefs: Fix snapshot skiplists during snapshot deletion bcachefs: bch2_sb_field_get() refactoring bcachefs: KEY_TYPE_error now counts towards i_sectors bcachefs: Fix handling of unknown bkey types bcachefs: Switch to unsafe_memcpy() in a few places bcachefs: Use struct_size() bcachefs: Correctly initialize new buckets on device resize bcachefs: Fix another smatch complaint bcachefs: Use strsep() in split_devs() bcachefs: Add iops fields to bch_member bcachefs: Rename bch_sb_field_members -> bch_sb_field_members_v1 bcachefs: New superblock section members_v2 bcachefs: Add new helper to retrieve bch_member from sb bcachefs: bucket_lock() is now a sleepable lock bcachefs: fix crc32c checksum merge byte order problem bcachefs: Fix bch2_inode_delete_keys() ...
2023-10-30Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linuxLinus Torvalds2-35/+51
Pull fscrypt updates from Eric Biggers: "This update adds support for configuring the crypto data unit size (i.e. the granularity of file contents encryption) to be less than the filesystem block size. This can allow users to use inline encryption hardware in some cases when it wouldn't otherwise be possible. In addition, there are two commits that are prerequisites for the extent-based encryption support that the btrfs folks are working on" * tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linux: fscrypt: track master key presence separately from secret fscrypt: rename fscrypt_info => fscrypt_inode_info fscrypt: support crypto data unit size less than filesystem block size fscrypt: replace get_ino_and_lblk_bits with just has_32bit_inodes fscrypt: compute max_lblk_bits from s_maxbytes and block size fscrypt: make the bounce page pool opt-in instead of opt-out fscrypt: make it clearer that key_prefix is deprecated
2023-10-30Merge tag 'nfsd-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linuxLinus Torvalds9-85/+415
Pull nfsd updates from Chuck Lever: "This release completes the SunRPC thread scheduler work that was begun in v6.6. The scheduler can now find an svc thread to wake in constant time and without a list walk. Thanks again to Neil Brown for this overhaul. Lorenzo Bianconi contributed infrastructure for a netlink-based NFSD control plane. The long-term plan is to provide the same functionality as found in /proc/fs/nfsd, plus some interesting additions, and then migrate the NFSD user space utilities to netlink. A long series to overhaul NFSD's NFSv4 operation encoding was applied in this release. The goals are to bring this family of encoding functions in line with the matching NFSv4 decoding functions and with the NFSv2 and NFSv3 XDR functions, preparing the way for better memory safety and maintainability. A further improvement to NFSD's write delegation support was contributed by Dai Ngo. This adds a CB_GETATTR callback, enabling the server to retrieve cached size and mtime data from clients holding write delegations. If the server can retrieve this information, it does not have to recall the delegation in some cases. The usual panoply of bug fixes and minor improvements round out this release. As always I am grateful to all contributors, reviewers, and testers" * tag 'nfsd-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (127 commits) svcrdma: Fix tracepoint printk format svcrdma: Drop connection after an RDMA Read error NFSD: clean up alloc_init_deleg() NFSD: Fix frame size warning in svc_export_parse() NFSD: Rewrite synopsis of nfsd_percpu_counters_init() nfsd: Clean up errors in nfs3proc.c nfsd: Clean up errors in nfs4state.c NFSD: Clean up errors in stats.c NFSD: simplify error paths in nfsd_svc() NFSD: Clean up nfsd4_encode_seek() NFSD: Clean up nfsd4_encode_offset_status() NFSD: Clean up nfsd4_encode_copy_notify() NFSD: Clean up nfsd4_encode_copy() NFSD: Clean up nfsd4_encode_test_stateid() NFSD: Clean up nfsd4_encode_exchange_id() NFSD: Clean up nfsd4_do_encode_secinfo() NFSD: Clean up nfsd4_encode_access() NFSD: Clean up nfsd4_encode_readdir() NFSD: Clean up nfsd4_encode_entry4() NFSD: Add an nfsd4_encode_nfs_cookie4() helper ...
2023-10-30Merge tag 'vfs-6.7.ctime' of ↵Linus Torvalds2-18/+77
gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs Pull vfs inode time accessor updates from Christian Brauner: "This finishes the conversion of all inode time fields to accessor functions as discussed on list. Changing timestamps manually as we used to do before is error prone. Using accessors function makes this robust. It does not contain the switch of the time fields to discrete 64 bit integers to replace struct timespec and free up space in struct inode. But after this, the switch can be trivially made and the patch should only affect the vfs if we decide to do it" * tag 'vfs-6.7.ctime' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: (86 commits) fs: rename inode i_atime and i_mtime fields security: convert to new timestamp accessors selinux: convert to new timestamp accessors apparmor: convert to new timestamp accessors sunrpc: convert to new timestamp accessors mm: convert to new timestamp accessors bpf: convert to new timestamp accessors ipc: convert to new timestamp accessors linux: convert to new timestamp accessors zonefs: convert to new timestamp accessors xfs: convert to new timestamp accessors vboxsf: convert to new timestamp accessors ufs: convert to new timestamp accessors udf: convert to new timestamp accessors ubifs: convert to new timestamp accessors tracefs: convert to new timestamp accessors sysv: convert to new timestamp accessors squashfs: convert to new timestamp accessors server: convert to new timestamp accessors client: convert to new timestamp accessors ...
2023-10-30Merge tag 'vfs-6.7.xattr' of ↵Linus Torvalds2-2/+2
gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs Pull vfs xattr updates from Christian Brauner: "The 's_xattr' field of 'struct super_block' currently requires a mutable table of 'struct xattr_handler' entries (although each handler itself is const). However, no code in vfs actually modifies the tables. This changes the type of 's_xattr' to allow const tables, and modifies existing file systems to move their tables to .rodata. This is desirable because these tables contain entries with function pointers in them; moving them to .rodata makes it considerably less likely to be modified accidentally or maliciously at runtime" * tag 'vfs-6.7.xattr' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: (30 commits) const_structs.checkpatch: add xattr_handler net: move sockfs_xattr_handlers to .rodata shmem: move shmem_xattr_handlers to .rodata overlayfs: move xattr tables to .rodata xfs: move xfs_xattr_handlers to .rodata ubifs: move ubifs_xattr_handlers to .rodata squashfs: move squashfs_xattr_handlers to .rodata smb: move cifs_xattr_handlers to .rodata reiserfs: move reiserfs_xattr_handlers to .rodata orangefs: move orangefs_xattr_handlers to .rodata ocfs2: move ocfs2_xattr_handlers and ocfs2_xattr_handler_map to .rodata ntfs3: move ntfs_xattr_handlers to .rodata nfs: move nfs4_xattr_handlers to .rodata kernfs: move kernfs_xattr_handlers to .rodata jfs: move jfs_xattr_handlers to .rodata jffs2: move jffs2_xattr_handlers to .rodata hfsplus: move hfsplus_xattr_handlers to .rodata hfs: move hfs_xattr_handlers to .rodata gfs2: move gfs2_xattr_handlers_max to .rodata fuse: move fuse_xattr_handlers to .rodata ...
2023-10-30Merge tag 'vfs-6.7.iov_iter' of ↵Linus Torvalds3-30/+281
gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs Pull iov_iter updates from Christian Brauner: "This contain's David's iov_iter cleanup work to convert the iov_iter iteration macros to inline functions: - Remove last_offset from iov_iter as it was only used by ITER_PIPE - Add a __user tag on copy_mc_to_user()'s dst argument on x86 to match that on powerpc and get rid of a sparse warning - Convert iter->user_backed to user_backed_iter() in the sound PCM driver - Convert iter->user_backed to user_backed_iter() in a couple of infiniband drivers - Renumber the type enum so that the ITER_* constants match the order in iterate_and_advance*() - Since the preceding patch puts UBUF and IOVEC at 0 and 1, change user_backed_iter() to just use the type value and get rid of the extra flag - Convert the iov_iter iteration macros to always-inline functions to make the code easier to follow. It uses function pointers, but they get optimised away - Move the check for ->copy_mc to _copy_from_iter() and copy_page_from_iter_atomic() rather than in memcpy_from_iter_mc() where it gets repeated for every segment. Instead, we check once and invoke a side function that can use iterate_bvec() rather than iterate_and_advance() and supply a different step function - Move the copy-and-csum code to net/ where it can be in proximity with the code that uses it - Fold memcpy_and_csum() in to its two users - Move csum_and_copy_from_iter_full() out of line and merge in csum_and_copy_from_iter() since the former is the only caller of the latter - Move hash_and_copy_to_iter() to net/ where it can be with its only caller" * tag 'vfs-6.7.iov_iter' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: iov_iter, net: Move hash_and_copy_to_iter() to net/ iov_iter, net: Merge csum_and_copy_from_iter{,_full}() together iov_iter, net: Fold in csum_and_memcpy() iov_iter, net: Move csum_and_copy_to/from_iter() to net/ iov_iter: Don't deal with iter->copy_mc in memcpy_from_iter_mc() iov_iter: Convert iterate*() to inline funcs iov_iter: Derive user-backedness from the iterator type iov_iter: Renumber ITER_* constants infiniband: Use user_backed_iter() to see if iterator is UBUF/IOVEC sound: Fix snd_pcm_readv()/writev() to use iov access functions iov_iter, x86: Be consistent about the __user tag on copy_mc_to_user() iov_iter: Remove last_offset from iov_iter as it was for ITER_PIPE
2023-10-30Merge tag 'vfs-6.7.misc' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfsLinus Torvalds7-36/+73
Pull misc vfs updates from Christian Brauner: "This contains the usual miscellaneous features, cleanups, and fixes for vfs and individual fses. Features: - Rename and export helpers that get write access to a mount. They are used in overlayfs to get write access to the upper mount. - Print the pretty name of the root device on boot failure. This helps in scenarios where we would usually only print "unknown-block(1,2)". - Add an internal SB_I_NOUMASK flag. This is another part in the endless POSIX ACL saga in a way. When POSIX ACLs are enabled via SB_POSIXACL the vfs cannot strip the umask because if the relevant inode has POSIX ACLs set it might take the umask from there. But if the inode doesn't have any POSIX ACLs set then we apply the umask in the filesytem itself. So we end up with: (1) no SB_POSIXACL -> strip umask in vfs (2) SB_POSIXACL -> strip umask in filesystem The umask semantics associated with SB_POSIXACL allowed filesystems that don't even support POSIX ACLs at all to raise SB_POSIXACL purely to avoid umask stripping. That specifically means NFS v4 and Overlayfs. NFS v4 does it because it delegates this to the server and Overlayfs because it needs to delegate umask stripping to the upper filesystem, i.e., the filesystem used as the writable layer. This went so far that SB_POSIXACL is raised eve on kernels that don't even have POSIX ACL support at all. Stop this blatant abuse and add SB_I_NOUMASK which is an internal superblock flag that filesystems can raise to opt out of umask handling. That should really only be the two mentioned above. It's not that we want any filesystems to do this. Ideally we have all umask handling always in the vfs. - Make overlayfs use SB_I_NOUMASK too. - Now that we have SB_I_NOUMASK, stop checking for SB_POSIXACL in IS_POSIXACL() if the kernel doesn't have support for it. This is a very old patch but it's only possible to do this now with the wider cleanup that was done. - Follow-up work on fake path handling from last cycle. Citing mostly from Amir: When overlayfs was first merged, overlayfs files of regular files and directories, the ones that are installed in file table, had a "fake" path, namely, f_path is the overlayfs path and f_inode is the "real" inode on the underlying filesystem. In v6.5, we took another small step by introducing of the backing_file container and the file_real_path() helper. This change allowed vfs and filesystem code to get the "real" path of an overlayfs backing file. With this change, we were able to make fsnotify work correctly and report events on the "real" filesystem objects that were accessed via overlayfs. This method works fine, but it still leaves the vfs vulnerable to new code that is not aware of files with fake path. A recent example is commit db1d1e8b9867 ("IMA: use vfs_getattr_nosec to get the i_version"). This commit uses direct referencing to f_path in IMA code that otherwise uses file_inode() and file_dentry() to reference the filesystem objects that it is measuring. This contains work to switch things around: instead of having filesystem code opt-in to get the "real" path, have generic code opt-in for the "fake" path in the few places that it is needed. Is it far more likely that new filesystems code that does not use the file_dentry() and file_real_path() helpers will end up causing crashes or averting LSM/audit rules if we keep the "fake" path exposed by default. This change already makes file_dentry() moot, but for now we did not change this helper just added a WARN_ON() in ovl_d_real() to catch if we have made any wrong assumptions. After the dust settles on this change, we can make file_dentry() a plain accessor and we can drop the inode argument to ->d_real(). - Switch struct file to SLAB_TYPESAFE_BY_RCU. This looks like a small change but it really isn't and I would like to see everyone on their tippie toes for any possible bugs from this work. Essentially we've been doing most of what SLAB_TYPESAFE_BY_RCU for files since a very long time because of the nasty interactions between the SCM_RIGHTS file descriptor garbage collection. So extending it makes a lot of sense but it is a subtle change. There are almost no places that fiddle with file rcu semantics directly and the ones that did mess around with struct file internal under rcu have been made to stop doing that because it really was always dodgy. I forgot to put in the link tag for this change and the discussion in the commit so adding it into the merge message: https://lore.kernel.org/r/[email protected] Cleanups: - Various smaller pipe cleanups including the removal of a spin lock that was only used to protect against writes without pipe_lock() from O_NOTIFICATION_PIPE aka watch queues. As that was never implemented remove the additional locking from pipe_write(). - Annotate struct watch_filter with the new __counted_by attribute. - Clarify do_unlinkat() cleanup so that it doesn't look like an extra iput() is done that would cause issues. - Simplify file cleanup when the file has never been opened. - Use module helper instead of open-coding it. - Predict error unlikely for stale retry. - Use WRITE_ONCE() for mount expiry field instead of just commenting that one hopes the compiler doesn't get smart. Fixes: - Fix readahead on block devices. - Fix writeback when layztime is enabled and inodes whose timestamp is the only thing that changed reside on wb->b_dirty_time. This caused excessively large zombie memory cgroup when lazytime was enabled as such inodes weren't handled fast enough. - Convert BUG_ON() to WARN_ON_ONCE() in open_last_lookups()" * tag 'vfs-6.7.misc' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: (26 commits) file, i915: fix file reference for mmap_singleton() vfs: Convert BUG_ON to WARN_ON_ONCE in open_last_lookups writeback, cgroup: switch inodes with dirty timestamps to release dying cgwbs chardev: Simplify usage of try_module_get() ovl: rely on SB_I_NOUMASK fs: fix umask on NFS with CONFIG_FS_POSIX_ACL=n fs: store real path instead of fake path in backing file f_path fs: create helper file_user_path() for user displayed mapped file path fs: get mnt_writers count for an open backing file's real path vfs: stop counting on gcc not messing with mnt_expiry_mark if not asked vfs: predict the error in retry_estale as unlikely backing file: free directly vfs: fix readahead(2) on block devices io_uring: use files_lookup_fd_locked() file: convert to SLAB_TYPESAFE_BY_RCU vfs: shave work on failed file open fs: simplify misleading code to remove ambiguity regarding ihold()/iput() watch_queue: Annotate struct watch_filter with __counted_by fs/pipe: use spinlock in pipe_read() only if there is a watch_queue fs/pipe: remove unnecessary spinlock from pipe_write() ...
2023-10-30Merge tag 'vfs-6.7.super' of ↵Linus Torvalds5-1/+17
gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs Pull vfs superblock updates from Christian Brauner: "This contains the work to make block device opening functions return a struct bdev_handle instead of just a struct block_device. The same struct bdev_handle is then also passed to block device closing functions. This allows us to propagate context from opening to closing a block device without having to modify all users everytime. Sidenote, in the future we might even want to try and have block device opening functions return a struct file directly but that's a series on top of this. These are further preparatory changes to be able to count writable opens and blocking writes to mounted block devices. That's a separate piece of work for next cycle and for that we absolutely need the changes to btrfs that have been quietly dropped somehow. Originally the series contained a patch that removed the old blkdev_*() helpers. But since this would've caused needles churn in -next for bcachefs we ended up delaying it. The second piece of work addresses one of the major annoyances about the work last cycle, namely that we required dropping s_umount whenever we used the superblock and fs_holder_ops for a block device. The reason for that requirement had been that in some codepaths s_umount could've been taken under disk->open_mutex (that's always been the case, at least theoretically). For example, on surprise block device removal or media change. And opening and closing block devices required grabbing disk->open_mutex as well. So we did the work and went through the block layer and fixed all those places so that s_umount is never taken under disk->open_mutex. This means no more brittle games where we yield and reacquire s_umount during block device opening and closing and no more requirements where block devices need to be closed. Filesystems don't need to care about this. There's a bunch of other follow-up work such as moving block device freezing and thawing to holder operations which makes it work for all block devices and not just the main block device just as we did for surprise removal. But that is for next cycle. Tested with fstests for all major fses, blktests, LTP" * tag 'vfs-6.7.super' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: (37 commits) porting: update locking requirements fs: assert that open_mutex isn't held over holder ops block: assert that we're not holding open_mutex over blk_report_disk_dead block: move bdev_mark_dead out of disk_check_media_change block: WARN_ON_ONCE() when we remove active partitions block: simplify bdev_del_partition() fs: Avoid grabbing sb->s_umount under bdev->bd_holder_lock jfs: fix log->bdev_handle null ptr deref in lbmStartIO bcache: Fixup error handling in register_cache() xfs: Convert to bdev_open_by_path() reiserfs: Convert to bdev_open_by_dev/path() ocfs2: Convert to use bdev_open_by_dev() nfs/blocklayout: Convert to use bdev_open_by_dev/path() jfs: Convert to bdev_open_by_dev() f2fs: Convert to bdev_open_by_dev/path() ext4: Convert to bdev_open_by_dev() erofs: Convert to use bdev_open_by_path() btrfs: Convert to bdev_open_by_path() fs: Convert to bdev_open_by_dev() mm/swap: Convert to use bdev_open_by_dev() ...
2023-10-28seq_buf: Introduce DECLARE_SEQ_BUF and seq_buf_str()Kees Cook1-4/+17
Solve two ergonomic issues with struct seq_buf; 1) Too much boilerplate is required to initialize: struct seq_buf s; char buf[32]; seq_buf_init(s, buf, sizeof(buf)); Instead, we can build this directly on the stack. Provide DECLARE_SEQ_BUF() macro to do this: DECLARE_SEQ_BUF(s, 32); 2) %NUL termination is fragile and requires 2 steps to get a valid C String (and is a layering violation exposing the "internals" of seq_buf): seq_buf_terminate(s); do_something(s->buffer); Instead, we can just return s->buffer directly after terminating it in the refactored seq_buf_terminate(), now known as seq_buf_str(): do_something(seq_buf_str(s)); Link: https://lore.kernel.org/linux-trace-kernel/[email protected] Link: https://lore.kernel.org/linux-trace-kernel/[email protected]/ Cc: Yosry Ahmed <[email protected]> Cc: "Matthew Wilcox (Oracle)" <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Justin Stitt <[email protected]> Cc: Kent Overstreet <[email protected]> Cc: Petr Mladek <[email protected]> Cc: Andy Shevchenko <[email protected]> Cc: Rasmus Villemoes <[email protected]> Cc: Sergey Senozhatsky <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Yun Zhou <[email protected]> Cc: Jacob Keller <[email protected]> Cc: Zhen Lei <[email protected]> Signed-off-by: Kees Cook <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
2023-10-28Merge branch 'pci/misc'Bjorn Helgaas2-3/+1
- Prevent xHCI driver from claiming AMD VanGogh USB3 DRD device so dwc3 can claim it instead (Vicki Pfau) - Make pci_assign_unassigned_resources() non-init because sparc uses it after init-time (Randy Dunlap) - Remove logic_outb(), _outw(), outl() duplicate declarations (John Sanpe) - Remove unnecessary UTF-8 in Kconfig help text that confuses menuconfig (Liu Song) - Fix double free in __pci_epc_create() (Dan Carpenter) - Simplify pcie_capability_clear_and_set_word() cases that could be pcie_capability_clear_word() (Ilpo Järvinen) * pci/misc: PCI: Simplify pcie_capability_clear_and_set_word() to ..._clear_word() PCI: endpoint: Fix double free in __pci_epc_create() PCI: Replace unnecessary UTF-8 in Kconfig logic_pio: Remove logic_outb(), _outw(), outl() duplicate declarations PCI: Make pci_assign_unassigned_resources() non-init PCI: Prevent xHCI driver from claiming AMD VanGogh USB3 DRD device
2023-10-28Merge branch 'pci/vga'Bjorn Helgaas1-0/+24
- Add pci_is_vga() helper, which checks for both PCI_CLASS_DISPLAY_VGA and PCI_CLASS_NOT_DEFINED_VGA (which catches ancient devices built before Class Codes were defined) (Sui Jingfeng) - Use the new pci_is_vga() to identify devices for the VGA arbiter, the sysfs "boot_vga" attribute, and the virtio and qxl drivers (SUi Jingfeng) * pci/vga: drm/qxl: Use pci_is_vga() to identify VGA devices drm/virtio: Use pci_is_vga() to identify VGA devices PCI/sysfs: Enable 'boot_vga' attribute via pci_is_vga() PCI/VGA: Select VGA devices earlier PCI/VGA: Use pci_is_vga() to identify VGA devices PCI: Add pci_is_vga() helper
2023-10-28fs: fix build error with CONFIG_EXPORTFS=m or not definedAmir Goldstein1-7/+2
Many of the filesystems that call the generic exportfs helpers do not select the EXPORTFS config. Move generic_encode_ino32_fh() to libfs.c, same as generic_fh_to_*() to avoid having to fix all those config dependencies. Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/ Fixes: dfaf653dc415 ("exportfs: make ->encode_fh() a mandatory method for NFS export") Suggested-by: Arnd Bergmann <[email protected]> Signed-off-by: Amir Goldstein <[email protected]> Link: https://lore.kernel.org/r/[email protected] Tested-by: Arnd Bergmann <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
2023-10-28exportfs: support encoding non-decodeable file handles by defaultAmir Goldstein1-3/+7
AT_HANDLE_FID was added as an API for name_to_handle_at() that request the encoding of a file id, which is not intended to be decoded. This file id is used by fanotify to describe objects in events. So far, overlayfs is the only filesystem that supports encoding non-decodeable file ids, by providing export_operations with an ->encode_fh() method and without a ->decode_fh() method. Add support for encoding non-decodeable file ids to all the filesystems that do not provide export_operations, by encoding a file id of type FILEID_INO64_GEN from { i_ino, i_generation }. A filesystem may that does not support NFS export, can opt-out of encoding non-decodeable file ids for fanotify by defining an empty export_operations struct (i.e. with a NULL ->encode_fh() method). This allows the use of fanotify events with file ids on filesystems like 9p which do not support NFS export to bring fanotify in feature parity with inotify on those filesystems. Note that fanotify also requires that the filesystems report a non-null fsid. Currently, many simple filesystems that have support for inotify (e.g. debugfs, tracefs, sysfs) report a null fsid, so can still not be used with fanotify in file id reporting mode. Reviewed-by: Jan Kara <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Amir Goldstein <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Christian Brauner <[email protected]>
2023-10-28exportfs: define FILEID_INO64_GEN* file handle typesAmir Goldstein1-0/+11
Similar to the common FILEID_INO32* file handle types, define common FILEID_INO64* file handle types. The type values of FILEID_INO64_GEN and FILEID_INO64_GEN_PARENT are the values returned by fuse and xfs for 64bit ino encoded file handle types. Note that these type value are filesystem specific and they do not define a universal file handle format, for example: fuse encodes FILEID_INO64_GEN as [ino-hi32,ino-lo32,gen] and xfs encodes FILEID_INO64_GEN as [hostr-order-ino64,gen] (a.k.a xfs_fid64). The FILEID_INO64_GEN fhandle type is going to be used for file ids for fanotify from filesystems that do not support NFS export. Reviewed-by: Jan Kara <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Amir Goldstein <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Christian Brauner <[email protected]>
2023-10-28exportfs: make ->encode_fh() a mandatory method for NFS exportAmir Goldstein1-1/+8
Rename the default helper for encoding FILEID_INO32_GEN* file handles to generic_encode_ino32_fh() and convert the filesystems that used the default implementation to use the generic helper explicitly. After this change, exportfs_encode_inode_fh() no longer has a default implementation to encode FILEID_INO32_GEN* file handles. This is a step towards allowing filesystems to encode non-decodeable file handles for fanotify without having to implement any export_operations. Reviewed-by: Jan Kara <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Acked-by: Chuck Lever <[email protected]> Signed-off-by: Amir Goldstein <[email protected]> Link: https://lore.kernel.org/r/[email protected] Acked-by: Dave Kleikamp <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
2023-10-28ACPI: Add helper acpi_use_parent_companionHeiner Kallweit1-0/+5
In several drivers devices use the ACPI companion of the parent. Add a helper for this use case to avoid code duplication. Signed-off-by: Heiner Kallweit <[email protected]> Acked-by: Rafael J. Wysocki <[email protected]> Reviewed-by: Andi Shyti <[email protected]> Signed-off-by: Wolfram Sang <[email protected]>
2023-10-28linux/init: remove __memexit* annotationsMasahiro Yamada1-3/+0
We have never used __memexit, __memexitdata, or __memexitconst. These were unneeded. Signed-off-by: Masahiro Yamada <[email protected]> Acked-by: Arnd Bergmann <[email protected]>
2023-10-28fs: Convert to bdev_open_by_dev()Jan Kara1-0/+1
Convert mount code to use bdev_open_by_dev() and propagate the handle around to bdev_release(). Acked-by: Christoph Hellwig <[email protected]> Reviewed-by: Christian Brauner <[email protected]> Signed-off-by: Jan Kara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Christian Brauner <[email protected]>
2023-10-28mm/swap: Convert to use bdev_open_by_dev()Jan Kara1-0/+1
Convert swapping code to use bdev_open_by_dev() and pass the handle around. CC: [email protected] CC: Andrew Morton <[email protected]> Acked-by: Christoph Hellwig <[email protected]> Acked-by: Christian Brauner <[email protected]> Signed-off-by: Jan Kara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Christian Brauner <[email protected]>
2023-10-28dm: Convert to bdev_open_by_dev()Jan Kara1-0/+1
Convert device mapper to use bdev_open_by_dev() and pass the handle around. CC: Alasdair Kergon <[email protected]> CC: Mike Snitzer <[email protected]> CC: [email protected] Acked-by: Christoph Hellwig <[email protected]> Acked-by: Christian Brauner <[email protected]> Signed-off-by: Jan Kara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Christian Brauner <[email protected]>
2023-10-28pktcdvd: Convert to bdev_open_by_dev()Jan Kara1-1/+3
Convert pktcdvd to use bdev_open_by_dev(). Acked-by: Christoph Hellwig <[email protected]> Acked-by: Christian Brauner <[email protected]> Signed-off-by: Jan Kara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Christian Brauner <[email protected]>
2023-10-28block: Use bdev_open_by_dev() in blkdev_open()Jan Kara1-0/+1
Convert blkdev_open() to use bdev_open_by_dev(). To be able to propagate handle from blkdev_open() to blkdev_release() we need to stop using existence of file->private_data to determine exclusive block device opens. Use bdev_handle->mode for this purpose since file->f_flags isn't usable for this (O_EXCL is cleared from the flags during open). Acked-by: Christoph Hellwig <[email protected]> Reviewed-by: Christian Brauner <[email protected]> Signed-off-by: Jan Kara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Christian Brauner <[email protected]>
2023-10-28block: Provide bdev_open_* functionsJan Kara1-0/+10
Create struct bdev_handle that contains all parameters that need to be passed to blkdev_put() and provide bdev_open_* functions that return this structure instead of plain bdev pointer. This will eventually allow us to pass one more argument to blkdev_put() (renamed to bdev_release()) without too much hassle. Acked-by: Christoph Hellwig <[email protected]> Reviewed-by: Christian Brauner <[email protected]> Signed-off-by: Jan Kara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Christian Brauner <[email protected]>
2023-10-27acpi: Move common tables helper functions to common libDave Jiang2-31/+54
Some of the routines in ACPI driver/acpi/tables.c can be shared with parsing CDAT. CDAT is a device-provided data structure that is formatted similar to a platform provided ACPI table. CDAT is used by CXL and can exist on platforms that do not use ACPI. Split out the common routine from ACPI to accommodate platforms that do not support ACPI and move that to /lib. The common routines can be built outside of ACPI if FIRMWARE_TABLES is selected. Link: https://lore.kernel.org/linux-cxl/CAJZ5v0jipbtTNnsA0-o5ozOk8ZgWnOg34m34a9pPenTyRLj=6A@mail.gmail.com/ Suggested-by: "Rafael J. Wysocki" <[email protected]> Reviewed-by: Hanjun Guo <[email protected]> Reviewed-by: Jonathan Cameron <[email protected]> Signed-off-by: Dave Jiang <[email protected]> Acked-by: "Rafael J. Wysocki" <[email protected]> Link: https://lore.kernel.org/r/169713683430.2205276.17899451119920103445.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <[email protected]>
2023-10-27PCI/AER: Refactor cper_print_aer() for use by CXL driver moduleTerry Bowman1-1/+1
The CXL driver plans to use cper_print_aer() for logging restricted CXL host (RCH) AER errors. cper_print_aer() is not currently exported and therefore not usable by the CXL drivers built as loadable modules. Export the cper_print_aer() function. Use the EXPORT_SYMBOL_NS_GPL() variant to restrict the export to CXL drivers. The CONFIG_ACPI_APEI_PCIEAER kernel config is currently used to enable cper_print_aer(). cper_print_aer() logs the AER registers and is useful in PCIE AER logging outside of APEI. Remove the CONFIG_ACPI_APEI_PCIEAER dependency to enable cper_print_aer(). The cper_print_aer() function name implies CPER specific use but is useful in non-CPER cases as well. Rename cper_print_aer() to pci_print_aer(). Also, update cxl_core to import CXL namespace imports. Co-developed-by: Robert Richter <[email protected]> Signed-off-by: Terry Bowman <[email protected]> Signed-off-by: Robert Richter <[email protected]> Cc: Mahesh J Salgaonkar <[email protected]> Cc: Oliver O'Halloran <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: [email protected] Reviewed-by: Jonathan Cameron <[email protected]> Acked-by: Bjorn Helgaas <[email protected]> Reviewed-by: Dave Jiang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Dan Williams <[email protected]>
2023-10-27cdx: add sysfs for subsystem, class and revisionAbhijit Gangurde2-2/+35
CDX controller provides subsystem vendor, subsystem device, class and revision info of the device along with vendor and device ID in native endian format. CDX Bus system uses this information to bind the cdx device to the cdx device driver. Co-developed-by: Puneet Gupta <[email protected]> Signed-off-by: Puneet Gupta <[email protected]> Co-developed-by: Nipun Gupta <[email protected]> Signed-off-by: Nipun Gupta <[email protected]> Signed-off-by: Abhijit Gangurde <[email protected]> Reviewed-by: Pieter Jansen van Vuuren <[email protected]> Tested-by: Nikhil Agarwal <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2023-10-27cdx: add support for bus enable and disableAbhijit Gangurde1-0/+10
CDX bus needs to be disabled before updating/writing devices in the FPGA. Once the devices are written, the bus shall be rescanned. This change provides sysfs entry to enable/disable the CDX bus. Co-developed-by: Nipun Gupta <[email protected]> Signed-off-by: Nipun Gupta <[email protected]> Signed-off-by: Abhijit Gangurde <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2023-10-27cdx: Register cdx bus as a device on cdx subsystemAbhijit Gangurde1-0/+2
While scanning for CDX devices, register newly discovered bus as a cdx device. CDX device attributes are visible based on device type. Signed-off-by: Abhijit Gangurde <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2023-10-27cdx: Remove cdx controller list from cdx bus systemAbhijit Gangurde1-0/+2
Remove xarray list of cdx controller. Instead, use platform bus to locate the cdx controller using compat string used by cdx controller platform driver. Also, use ida to allocate a unique id for the controller. Signed-off-by: Abhijit Gangurde <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2023-10-27Revert "nvmem: add new config option"Rafał Miłecki1-2/+0
This reverts commit 517f14d9cf3533d5ab4fded195ab6f80a92e378f. Config option "no_of_node" is no longer needed since adding a more explicit and targeted option "add_legacy_fixed_of_cells". That "no_of_node" config option was needed *earlier* to help mtd's case. DT nodes of MTD partitions (that are also NVMEM devices) may contain subnodes. Those SHOULD NOT be treated as NVMEM fixed cells. To prevent NVMEM core code from parsing subnodes a "no_of_node" option was added (and set to true in mtd) to make for_each_child_of_node() in NVMEM a no-op. That was a bit hacky because it was messing with "of_node" pointer to achieve some side-effect. With the introduction of "add_legacy_fixed_of_cells" config option things got more explicit. MTD subsystem simply tells NVMEM when to look for fixed cells and there is no need to hack "of_node" pointer anymore. Signed-off-by: Rafał Miłecki <[email protected]> Reviewed-by: Miquel Raynal <[email protected]> Signed-off-by: Srinivas Kandagatla <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2023-10-27x509: Add OIDs for FIPS 202 SHA-3 hash and signaturesDimitri John Ledkov1-0/+11
Add OID for FIPS 202 SHA-3 family of hash functions, RSA & ECDSA signatures using those. Limit to 256 or larger sizes, for interoperability reasons. 224 is too weak for any practical uses. Signed-off-by: Dimitri John Ledkov <[email protected]> Signed-off-by: Herbert Xu <[email protected]>
2023-10-27crypto: ahash - remove support for nonzero alignmaskEric Biggers1-13/+14
Currently, the ahash API checks the alignment of all key and result buffers against the algorithm's declared alignmask, and for any unaligned buffers it falls back to manually aligned temporary buffers. This is virtually useless, however. First, since it does not apply to the message, its effect is much more limited than e.g. is the case for the alignmask for "skcipher". Second, the key and result buffers are given as virtual addresses and cannot (in general) be DMA'ed into, so drivers end up having to copy to/from them in software anyway. As a result it's easy to use memcpy() or the unaligned access helpers. The crypto_hash_walk_*() helper functions do use the alignmask to align the message. But with one exception those are only used for shash algorithms being exposed via the ahash API, not for native ahashes, and aligning the message is not required in this case, especially now that alignmask support has been removed from shash. The exception is the n2_core driver, which doesn't set an alignmask. In any case, no ahash algorithms actually set a nonzero alignmask anymore. Therefore, remove support for it from ahash. The benefit is that all the code to handle "misaligned" buffers in the ahash API goes away, reducing the overhead of the ahash API. This follows the same change that was made to shash. Signed-off-by: Eric Biggers <[email protected]> Signed-off-by: Herbert Xu <[email protected]>
2023-10-27units: Add BYTES_PER_*BITDamian Muszynski1-0/+4
There is going to be a new user of the BYTES_PER_[K/M/G]BIT definition besides possibly existing ones. Add them to the header. Signed-off-by: Damian Muszynski <[email protected]> Reviewed-by: Giovanni Cabiddu <[email protected]> Reviewed-by: Tero Kristo <[email protected]> Signed-off-by: Herbert Xu <[email protected]>
2023-10-27net: Add MDB get device operationIdo Schimmel1-0/+4
Add MDB net device operation that will be invoked by rtnetlink code in response to received RTM_GETMDB messages. Subsequent patches will implement the operation in the bridge and VXLAN drivers. Signed-off-by: Ido Schimmel <[email protected]> Acked-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-10-27Merge tag 'thunderbolt-for-v6.7-rc1' of ↵Greg Kroah-Hartman1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt into usb-next Mika writes: thunderbolt: Changes for v6.7 merge window This includes following USB4/Thunderbolt changes for the v6.7 merge window: - Configure asymmetric link if the DisplayPort bandwidth requires so - Enable path power management packet support for USB4 v2 routers - Make the bandwidth reservations to follow the USB4 v2 connection manager guide suggestions - DisplayPort tunneling improvements - Small cleanups and improvements around the driver. All these have been in linux-next with no reported issues. * tag 'thunderbolt-for-v6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt: (25 commits) thunderbolt: Fix one kernel-doc comment thunderbolt: Configure asymmetric link if needed and bandwidth allows thunderbolt: Add support for asymmetric link thunderbolt: Introduce tb_switch_depth() thunderbolt: Introduce tb_for_each_upstream_port_on_path() thunderbolt: Introduce tb_port_path_direction_downstream() thunderbolt: Set path power management packet support bit for USB4 v2 routers thunderbolt: Change bandwidth reservations to comply USB4 v2 thunderbolt: Make is_gen4_link() available to the rest of the driver thunderbolt: Use weight constants in tb_usb3_consumed_bandwidth() thunderbolt: Use constants for path weight and priority thunderbolt: Add DP IN added last in the head of the list of DP resources thunderbolt: Create multiple DisplayPort tunnels if there are more DP IN/OUT pairs thunderbolt: Log NVM version of routers and retimers thunderbolt: Use tb_tunnel_xxx() log macros in tb.c thunderbolt: Expose tb_tunnel_xxx() log macros to the rest of the driver thunderbolt: Use tb_tunnel_dbg() where possible to make logging more consistent thunderbolt: Fix typo of HPD bit for Hot Plug Detect thunderbolt: Fix typo in enum tb_link_width kernel-doc thunderbolt: Fix debug log when DisplayPort adapter not available for pairing ...
2023-10-27net/tcp: Wire TCP-AO to request socketsDmitry Safonov1-0/+18
Now when the new request socket is created from the listening socket, it's recorded what MKT was used by the peer. tcp_rsk_used_ao() is a new helper for checking if TCP-AO option was used to create the request socket. tcp_ao_copy_all_matching() will copy all keys that match the peer on the request socket, as well as preparing them for the usage (creating traffic keys). Co-developed-by: Francesco Ruggeri <[email protected]> Signed-off-by: Francesco Ruggeri <[email protected]> Co-developed-by: Salam Noureddine <[email protected]> Signed-off-by: Salam Noureddine <[email protected]> Signed-off-by: Dmitry Safonov <[email protected]> Acked-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-10-27net/tcp: Add TCP-AO sign to twskDmitry Safonov1-0/+3
Add support for sockets in time-wait state. ao_info as well as all keys are inherited on transition to time-wait socket. The lifetime of ao_info is now protected by ref counter, so that tcp_ao_destroy_sock() will destruct it only when the last user is gone. Co-developed-by: Francesco Ruggeri <[email protected]> Signed-off-by: Francesco Ruggeri <[email protected]> Co-developed-by: Salam Noureddine <[email protected]> Signed-off-by: Salam Noureddine <[email protected]> Signed-off-by: Dmitry Safonov <[email protected]> Acked-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-10-27net/tcp: Introduce TCP_AO setsockopt()sDmitry Safonov1-0/+23
Add 3 setsockopt()s: 1. TCP_AO_ADD_KEY to add a new Master Key Tuple (MKT) on a socket 2. TCP_AO_DEL_KEY to delete present MKT from a socket 3. TCP_AO_INFO to change flags, Current_key/RNext_key on a TCP-AO sk Userspace has to introduce keys on every socket it wants to use TCP-AO option on, similarly to TCP_MD5SIG/TCP_MD5SIG_EXT. RFC5925 prohibits definition of MKTs that would match the same peer, so do sanity checks on the data provided by userspace. Be as conservative as possible, including refusal of defining MKT on an established connection with no AO, removing the key in-use and etc. (1) and (2) are to be used by userspace key manager to add/remove keys. (3) main purpose is to set RNext_key, which (as prescribed by RFC5925) is the KeyID that will be requested in TCP-AO header from the peer to sign their segments with. At this moment the life of ao_info ends in tcp_v4_destroy_sock(). Co-developed-by: Francesco Ruggeri <[email protected]> Signed-off-by: Francesco Ruggeri <[email protected]> Co-developed-by: Salam Noureddine <[email protected]> Signed-off-by: Salam Noureddine <[email protected]> Signed-off-by: Dmitry Safonov <[email protected]> Acked-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>