aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2023-08-28Merge remote-tracking branch 'linux-efi/urgent' into efi/nextArd Biesheuvel1-1/+1
2023-08-24x86/efistub: Fix PCI ROM preservation in mixed modeMikel Rychliski1-1/+1
preserve_pci_rom_image() was accessing the romsize field in efi_pci_io_protocol_t directly instead of using the efi_table_attr() helper. This prevents the ROM image from being saved correctly during a mixed mode boot. Fixes: 2c3625cb9fa2 ("efi/x86: Fold __setup_efi_pci32() and __setup_efi_pci64() into one function") Signed-off-by: Mikel Rychliski <[email protected]> Signed-off-by: Ard Biesheuvel <[email protected]>
2023-08-22efi/runtime-wrappers: Clean up white space and add __init annotationArd Biesheuvel1-22/+19
Some cosmetic changes as well as a missing __init annotation. Signed-off-by: Ard Biesheuvel <[email protected]>
2023-08-22acpi/prmt: Use EFI runtime sandbox to invoke PRM handlersArd Biesheuvel4-4/+40
Instead of bypassing the kernel's adaptation layer for performing EFI runtime calls, wire up ACPI PRM handling into it. This means these calls can no longer occur concurrently with EFI runtime calls, and will be made from the EFI runtime workqueue. It also means any page faults occurring during PRM handling will be identified correctly as originating in firmware code. Acked-by: Rafael J. Wysocki <[email protected]> Signed-off-by: Ard Biesheuvel <[email protected]>
2023-08-22efi/runtime-wrappers: Don't duplicate setup/teardown codeArd Biesheuvel2-10/+22
Avoid duplicating the EFI arch setup and teardown routine calls numerous times in efi_call_rts(). Instead, expand the efi_call_virt_pointer() macro into efi_call_rts(), taking the pre and post parts out of the switch. Signed-off-by: Ard Biesheuvel <[email protected]>
2023-08-22efi/runtime-wrappers: Remove duplicated macro for service returning voidArd Biesheuvel4-24/+14
__efi_call_virt() exists as an alternative for efi_call_virt() for the sole reason that ResetSystem() returns void, and so we cannot use a call to it in the RHS of an assignment. Given that there is only a single user, let's drop the macro, and expand it into the caller. That way, the remaining macro can be tightened somewhat in terms of type safety too. Note that the use of typeof() on the runtime service invocation does not result in an actual call being made, but it does require a few pointer types to be fixed up and converted into the proper function pointer prototypes. Signed-off-by: Ard Biesheuvel <[email protected]>
2023-08-21efi/runtime-wrapper: Move workqueue manipulation out of lineArd Biesheuvel1-28/+33
efi_queue_work() is a macro that implements the non-trivial manipulation of the EFI runtime workqueue and completion data structure, most of which is generic, and could be shared between all the users of the macro. So move it out of the macro and into a new helper function. Signed-off-by: Ard Biesheuvel <[email protected]>
2023-08-21efi/runtime-wrappers: Use type safe encapsulation of call argumentsArd Biesheuvel2-76/+138
The current code that marshalls the EFI runtime call arguments to hand them off to a async helper does so in a type unsafe and slightly messy manner - everything is cast to void* except for some integral types that are passed by reference and dereferenced on the receiver end. Let's clean this up a bit, and record the arguments of each runtime service invocation exactly as they are issued, in a manner that permits the compiler to check the types of the arguments at both ends. Signed-off-by: Ard Biesheuvel <[email protected]>
2023-08-21efi/riscv: Move EFI runtime call setup/teardown helpers out of lineArd Biesheuvel2-10/+15
Only the arch_efi_call_virt() macro that some architectures override needs to be a macro, given that it is variadic and encapsulates calls via function pointers that have different prototypes. The associated setup and teardown code are not special in this regard, and don't need to be instantiated at each call site. So turn them into ordinary C functions and move them out of line. Cc: Paul Walmsley <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Albert Ou <[email protected]> Signed-off-by: Ard Biesheuvel <[email protected]>
2023-08-21efi/arm64: Move EFI runtime call setup/teardown helpers out of lineArd Biesheuvel2-16/+18
Only the arch_efi_call_virt() macro that some architectures override needs to be a macro, given that it is variadic and encapsulates calls via function pointers that have different prototypes. The associated setup and teardown code are not special in this regard, and don't need to be instantiated at each call site. So turn them into ordinary C functions and move them out of line. Acked-by: Catalin Marinas <[email protected]> Signed-off-by: Ard Biesheuvel <[email protected]>
2023-08-03efi/riscv: libstub: Fix comment about absolute relocationXiao Wang1-1/+1
We don't want absolute symbols references in the stub, so fix the double negation in the comment. Signed-off-by: Xiao Wang <[email protected]> Signed-off-by: Ard Biesheuvel <[email protected]>
2023-08-03efi: memmap: Remove kernel-doc warningsZhu Wang1-1/+1
Remove kernel-doc warnings: arch/x86/platform/efi/memmap.c:94: warning: Function parameter or member 'data' not described in 'efi_memmap_install' arch/x86/platform/efi/memmap.c:94: warning: Excess function parameter 'ctx' description in 'efi_memmap_install' Signed-off-by: Zhu Wang <[email protected]> Signed-off-by: Ard Biesheuvel <[email protected]>
2023-08-03efi: Remove unused extern declaration efi_lookup_mapped_addr()YueHaibing1-1/+0
Since commit 50a0cb565246 ("x86/efi-bgrt: Fix kernel panic when mapping BGRT data") this extern declaration is not used anymore. Signed-off-by: YueHaibing <[email protected]> Signed-off-by: Ard Biesheuvel <[email protected]>
2023-07-09Linux 6.5-rc1Linus Torvalds1-2/+2
2023-07-09MAINTAINERS 2: Electric BoogalooLinus Torvalds1-46/+46
We just sorted the entries and fields last release, so just out of a perverse sense of curiosity, I decided to see if we can keep things ordered for even just one release. The answer is "No. No we cannot". I suggest that all kernel developers will need weekly training sessions, involving a lot of Big Bird and Sesame Street. And at the yearly maintainer summit, we will all sing the alphabet song together. I doubt I will keep doing this. At some point "perverse sense of curiosity" turns into just a cold dark place filled with sadness and despair. Repeats: 80e62bc8487b ("MAINTAINERS: re-sort all entries and fields") Signed-off-by: Linus Torvalds <[email protected]>
2023-07-09Merge tag 'dma-mapping-6.5-2023-07-09' of ↵Linus Torvalds1-11/+35
git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping fixes from Christoph Hellwig: - swiotlb area sizing fixes (Petr Tesarik) * tag 'dma-mapping-6.5-2023-07-09' of git://git.infradead.org/users/hch/dma-mapping: swiotlb: reduce the number of areas to match actual memory pool size swiotlb: always set the number of areas before allocating the pool
2023-07-09Merge tag 'irq_urgent_for_v6.5_rc1' of ↵Linus Torvalds1-3/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq update from Borislav Petkov: - Optimize IRQ domain's name assignment * tag 'irq_urgent_for_v6.5_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqdomain: Use return value of strreplace()
2023-07-09Merge tag 'x86_urgent_for_v6.5_rc1' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fpu fix from Borislav Petkov: - Do FPU AP initialization on Xen PV too which got missed by the recent boot reordering work * tag 'x86_urgent_for_v6.5_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/xen: Fix secondary processors' FPU initialization
2023-07-09Merge tag 'x86-core-2023-07-09' of ↵Linus Torvalds1-0/+8
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fix from Thomas Gleixner: "A single fix for the mechanism to park CPUs with an INIT IPI. On shutdown or kexec, the kernel tries to park the non-boot CPUs with an INIT IPI. But the same code path is also used by the crash utility. If the CPU which panics is not the boot CPU then it sends an INIT IPI to the boot CPU which resets the machine. Prevent this by validating that the CPU which runs the stop mechanism is the boot CPU. If not, leave the other CPUs in HLT" * tag 'x86-core-2023-07-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/smp: Don't send INIT to boot CPU
2023-07-09Merge tag 'mips_6.5_1' of ↵Linus Torvalds10-53/+46
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS fixes from Thomas Bogendoerfer: - fixes for KVM - fix for loongson build and cpu probing - DT fixes * tag 'mips_6.5_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: MIPS: kvm: Fix build error with KVM_MIPS_DEBUG_COP0_COUNTERS enabled MIPS: dts: add missing space before { MIPS: Loongson: Fix build error when make modules_install MIPS: KVM: Fix NULL pointer dereference MIPS: Loongson: Fix cpu_probe_loongson() again
2023-07-09Merge tag 'xfs-6.5-merge-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds1-1/+1
Pull xfs fix from Darrick Wong: "Nothing exciting here, just getting rid of a gcc warning that I got tired of seeing when I turn on gcov" * tag 'xfs-6.5-merge-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: fix uninit warning in xfs_growfs_data
2023-07-09Merge tag '6.5-rc-smb3-client-fixes-part2' of ↵Linus Torvalds4-5/+72
git://git.samba.org/sfrench/cifs-2.6 Pull more smb client updates from Steve French: - fix potential use after free in unmount - minor cleanup - add worker to cleanup stale directory leases * tag '6.5-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6: cifs: Add a laundromat thread for cached directories smb: client: remove redundant pointer 'server' cifs: fix session state transition to avoid use-after-free issue
2023-07-09Merge tag 'ntb-6.5' of https://github.com/jonmason/ntbLinus Torvalds10-33/+36
Pull NTB updates from Jon Mason: "Fixes for pci_clean_master, error handling in driver inits, and various other issues/bugs" * tag 'ntb-6.5' of https://github.com/jonmason/ntb: ntb: hw: amd: Fix debugfs_create_dir error checking ntb.rst: Fix copy and paste error ntb_netdev: Fix module_init problem ntb: intel: Remove redundant pci_clear_master ntb: epf: Remove redundant pci_clear_master ntb_hw_amd: Remove redundant pci_clear_master ntb: idt: drop redundant pci_enable_pcie_error_reporting() MAINTAINERS: git://github -> https://github.com for jonmason NTB: EPF: fix possible memory leak in pci_vntb_probe() NTB: ntb_tool: Add check for devm_kcalloc NTB: ntb_transport: fix possible memory leak while device_register() fails ntb: intel: Fix error handling in intel_ntb_pci_driver_init() NTB: amd: Fix error handling in amd_ntb_pci_driver_init() ntb: idt: Fix error handling in idt_pci_driver_init()
2023-07-08mm: lock newly mapped VMA with corrected orderingHugh Dickins1-2/+2
Lockdep is certainly right to complain about (&vma->vm_lock->lock){++++}-{3:3}, at: vma_start_write+0x2d/0x3f but task is already holding lock: (&mapping->i_mmap_rwsem){+.+.}-{3:3}, at: mmap_region+0x4dc/0x6db Invert those to the usual ordering. Fixes: 33313a747e81 ("mm: lock newly mapped VMA which can be modified after it becomes visible") Cc: [email protected] Signed-off-by: Hugh Dickins <[email protected]> Tested-by: Suren Baghdasaryan <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2023-07-08Merge tag 'mm-hotfixes-stable-2023-07-08-10-43' of ↵Linus Torvalds18-44/+99
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull hotfixes from Andrew Morton: "16 hotfixes. Six are cc:stable and the remainder address post-6.4 issues" The merge undoes the disabling of the CONFIG_PER_VMA_LOCK feature, since it was all hopefully fixed in mainline. * tag 'mm-hotfixes-stable-2023-07-08-10-43' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: lib: dhry: fix sleeping allocations inside non-preemptable section kasan, slub: fix HW_TAGS zeroing with slub_debug kasan: fix type cast in memory_is_poisoned_n mailmap: add entries for Heiko Stuebner mailmap: update manpage link bootmem: remove the vmemmap pages from kmemleak in free_bootmem_page MAINTAINERS: add linux-next info mailmap: add Markus Schneider-Pargmann writeback: account the number of pages written back mm: call arch_swap_restore() from do_swap_page() squashfs: fix cache race with migration mm/hugetlb.c: fix a bug within a BUG(): inconsistent pte comparison docs: update ocfs2-devel mailing list address MAINTAINERS: update ocfs2-devel mailing list address mm: disable CONFIG_PER_VMA_LOCK until its fixed fork: lock VMAs of the parent process when forking
2023-07-08fork: lock VMAs of the parent process when forkingSuren Baghdasaryan1-0/+1
When forking a child process, the parent write-protects anonymous pages and COW-shares them with the child being forked using copy_present_pte(). We must not take any concurrent page faults on the source vma's as they are being processed, as we expect both the vma and the pte's behind it to be stable. For example, the anon_vma_fork() expects the parents vma->anon_vma to not change during the vma copy. A concurrent page fault on a page newly marked read-only by the page copy might trigger wp_page_copy() and a anon_vma_prepare(vma) on the source vma, defeating the anon_vma_clone() that wasn't done because the parent vma originally didn't have an anon_vma, but we now might end up copying a pte entry for a page that has one. Before the per-vma lock based changes, the mmap_lock guaranteed exclusion with concurrent page faults. But now we need to do a vma_start_write() to make sure no concurrent faults happen on this vma while it is being processed. This fix can potentially regress some fork-heavy workloads. Kernel build time did not show noticeable regression on a 56-core machine while a stress test mapping 10000 VMAs and forking 5000 times in a tight loop shows ~5% regression. If such fork time regression is unacceptable, disabling CONFIG_PER_VMA_LOCK should restore its performance. Further optimizations are possible if this regression proves to be problematic. Suggested-by: David Hildenbrand <[email protected]> Reported-by: Jiri Slaby <[email protected]> Closes: https://lore.kernel.org/all/[email protected]/ Reported-by: Holger Hoffstätte <[email protected]> Closes: https://lore.kernel.org/all/[email protected]/ Reported-by: Jacob Young <[email protected]> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217624 Fixes: 0bff0aaea03e ("x86/mm: try VMA lock-based page fault handling first") Cc: [email protected] Signed-off-by: Suren Baghdasaryan <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2023-07-08mm: lock newly mapped VMA which can be modified after it becomes visibleSuren Baghdasaryan1-0/+2
mmap_region adds a newly created VMA into VMA tree and might modify it afterwards before dropping the mmap_lock. This poses a problem for page faults handled under per-VMA locks because they don't take the mmap_lock and can stumble on this VMA while it's still being modified. Currently this does not pose a problem since post-addition modifications are done only for file-backed VMAs, which are not handled under per-VMA lock. However, once support for handling file-backed page faults with per-VMA locks is added, this will become a race. Fix this by write-locking the VMA before inserting it into the VMA tree. Other places where a new VMA is added into VMA tree do not modify it after the insertion, so do not need the same locking. Cc: [email protected] Signed-off-by: Suren Baghdasaryan <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2023-07-08mm: lock a vma before stack expansionSuren Baghdasaryan1-0/+4
With recent changes necessitating mmap_lock to be held for write while expanding a stack, per-VMA locks should follow the same rules and be write-locked to prevent page faults into the VMA being expanded. Add the necessary locking. Cc: [email protected] Signed-off-by: Suren Baghdasaryan <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2023-07-08Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds12-706/+32
Pull more SCSI updates from James Bottomley: "A few late arriving patches that missed the initial pull request. It's mostly bug fixes (the dt-bindings is a fix for the initial pull)" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ufs: core: Remove unused function declaration scsi: target: docs: Remove tcm_mod_builder.py scsi: target: iblock: Quiet bool conversion warning with pr_preempt use scsi: dt-bindings: ufs: qcom: Fix ICE phandle scsi: core: Simplify scsi_cdl_check_cmd() scsi: isci: Fix comment typo scsi: smartpqi: Replace one-element arrays with flexible-array members scsi: target: tcmu: Replace strlcpy() with strscpy() scsi: ncr53c8xx: Replace strlcpy() with strscpy() scsi: lpfc: Fix lpfc_name struct packing
2023-07-08Merge tag 'i2c-for-6.5-rc1-part2' of ↵Linus Torvalds3-3/+2
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull more i2c updates from Wolfram Sang: - xiic patch should have been in the original pull but slipped through - mpc patch fixes a build regression - nomadik cleanup * tag 'i2c-for-6.5-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: mpc: Drop unused variable i2c: nomadik: Remove a useless call in the remove function i2c: xiic: Don't try to handle more interrupt events after error
2023-07-08Merge tag 'hardening-v6.5-rc1-fixes' of ↵Linus Torvalds4-16/+9
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening fixes from Kees Cook: - Check for NULL bdev in LoadPin (Matthias Kaehlcke) - Revert unwanted KUnit FORTIFY build default - Fix 1-element array causing boot warnings with xhci-hub * tag 'hardening-v6.5-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: usb: ch9: Replace bmSublinkSpeedAttr 1-element array with flexible array Revert "fortify: Allow KUnit test to build without FORTIFY" dm: verity-loadpin: Add NULL pointer check for 'bdev' parameter
2023-07-08ntb: hw: amd: Fix debugfs_create_dir error checkingAnup Sharma1-1/+1
The debugfs_create_dir function returns ERR_PTR in case of error, and the only correct way to check if an error occurred is 'IS_ERR' inline function. This patch will replace the null-comparison with IS_ERR. Signed-off-by: Anup Sharma <[email protected]> Suggested-by: Ivan Orlov <[email protected]> Signed-off-by: Jon Mason <[email protected]>
2023-07-08Merge tag 'perf-tools-for-v6.5-2-2023-07-06' of ↵Linus Torvalds95-448/+9316
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next Pull more perf tools updates from Namhyung Kim: "These are remaining changes and fixes for this cycle. Build: - Allow generating vmlinux.h from BTF using `make GEN_VMLINUX_H=1` and skip if the vmlinux has no BTF. - Replace deprecated clang -target xxx option by --target=xxx. perf record: - Print event attributes with well known type and config symbols in the debug output like below: # perf record -e cycles,cpu-clock -C0 -vv true <SNIP> ------------------------------------------------------------ perf_event_attr: type 0 (PERF_TYPE_HARDWARE) size 136 config 0 (PERF_COUNT_HW_CPU_CYCLES) { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|CPU|PERIOD|IDENTIFIER read_format ID disabled 1 inherit 1 freq 1 sample_id_all 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 5 ------------------------------------------------------------ perf_event_attr: type 1 (PERF_TYPE_SOFTWARE) size 136 config 0 (PERF_COUNT_SW_CPU_CLOCK) { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|CPU|PERIOD|IDENTIFIER read_format ID disabled 1 inherit 1 freq 1 sample_id_all 1 exclude_guest 1 - Update AMD IBS event error message since it now support per-process profiling but no priviledge filters. $ sudo perf record -e ibs_op//k -C 0 Error: AMD IBS doesn't support privilege filtering. Try again without the privilege modifiers (like 'k') at the end. perf lock contention: - Support CSV style output using -x option $ sudo perf lock con -ab -x, sleep 1 # output: contended, total wait, max wait, avg wait, type, caller 19, 194232, 21415, 10222, spinlock, process_one_work+0x1f0 15, 162748, 23843, 10849, rwsem:R, do_user_addr_fault+0x40e 4, 86740, 23415, 21685, rwlock:R, ep_poll_callback+0x2d 1, 84281, 84281, 84281, mutex, iwl_mvm_async_handlers_wk+0x135 8, 67608, 27404, 8451, spinlock, __queue_work+0x174 3, 58616, 31125, 19538, rwsem:W, do_mprotect_pkey+0xff 3, 52953, 21172, 17651, rwlock:W, do_epoll_wait+0x248 2, 30324, 19704, 15162, rwsem:R, do_madvise+0x3ad 1, 24619, 24619, 24619, spinlock, rcu_core+0xd4 - Add --output option to save the data to a file not to be interfered by other debug messages. Test: - Fix event parsing test on ARM where there's no raw PMU nor supports PERF_PMU_CAP_EXTENDED_HW_TYPE. - Update the lock contention test case for CSV output. - Fix a segfault in the daemon command test. Vendor events (JSON): - Add has_event() to check if the given event is available on system at runtime. On Intel machines, some transaction events may not be present when TSC extensions are disabled. - Update Intel event metrics. Misc: - Sort symbols by name using an external array of pointers instead of a rbtree node in the symbol. This will save 16-bytes or 24-bytes per symbol whether the sorting is actually requested or not. - Fix unwinding DWARF callstacks using libdw when --symfs option is used" * tag 'perf-tools-for-v6.5-2-2023-07-06' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next: (38 commits) perf test: Fix event parsing test when PERF_PMU_CAP_EXTENDED_HW_TYPE isn't supported. perf test: Fix event parsing test on Arm perf evsel amd: Fix IBS error message perf: unwind: Fix symfs with libdw perf symbol: Fix uninitialized return value in symbols__find_by_name() perf test: Test perf lock contention CSV output perf lock contention: Add --output option perf lock contention: Add -x option for CSV style output perf lock: Remove stale comments perf vendor events intel: Update tigerlake to 1.13 perf vendor events intel: Update skylakex to 1.31 perf vendor events intel: Update skylake to 57 perf vendor events intel: Update sapphirerapids to 1.14 perf vendor events intel: Update icelakex to 1.21 perf vendor events intel: Update icelake to 1.19 perf vendor events intel: Update cascadelakex to 1.19 perf vendor events intel: Update meteorlake to 1.03 perf vendor events intel: Add rocketlake events/metrics perf vendor metrics intel: Make transaction metrics conditional perf jevents: Support for has_event function ...
2023-07-08Merge tag 'bitmap-6.5-rc1' of https://github.com/norov/linuxLinus Torvalds6-11/+27
Pull bitmap updates from Yury Norov: "Fixes for different bitmap pieces: - lib/test_bitmap: increment failure counter properly The tests that don't use expect_eq() macro to determine that a test is failured must increment failed_tests explicitly. - lib/bitmap: drop optimization of bitmap_{from,to}_arr64 bitmap_{from,to}_arr64() optimization is overly optimistic on 32-bit LE architectures when it's wired to bitmap_copy_clear_tail(). - nodemask: Drop duplicate check in for_each_node_mask() As the return value type of first_node() became unsigned, the node >= 0 became unnecessary. - cpumask: fix function description kernel-doc notation - MAINTAINERS: Add bits.h and bitfield.h to the BITMAP API record Add linux/bits.h and linux/bitfield.h for visibility" * tag 'bitmap-6.5-rc1' of https://github.com/norov/linux: MAINTAINERS: Add bitfield.h to the BITMAP API record MAINTAINERS: Add bits.h to the BITMAP API record cpumask: fix function description kernel-doc notation nodemask: Drop duplicate check in for_each_node_mask() lib/bitmap: drop optimization of bitmap_{from,to}_arr64 lib/test_bitmap: increment failure counter properly
2023-07-08lib: dhry: fix sleeping allocations inside non-preemptable sectionGeert Uytterhoeven1-2/+9
The Smatch static checker reports the following warnings: lib/dhry_run.c:38 dhry_benchmark() warn: sleeping in atomic context lib/dhry_run.c:43 dhry_benchmark() warn: sleeping in atomic context Indeed, dhry() does sleeping allocations inside the non-preemptable section delimited by get_cpu()/put_cpu(). Fix this by using atomic allocations instead. Add error handling, as atomic these allocations may fail. Link: https://lkml.kernel.org/r/bac6d517818a7cd8efe217c1ad649fffab9cc371.1688568764.git.geert+renesas@glider.be Fixes: 13684e966d46283e ("lib: dhry: fix unstable smp_processor_id(_) usage") Reported-by: Dan Carpenter <[email protected]> Closes: https://lore.kernel.org/r/[email protected] Signed-off-by: Geert Uytterhoeven <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-07-08kasan, slub: fix HW_TAGS zeroing with slub_debugAndrey Konovalov2-14/+14
Commit 946fa0dbf2d8 ("mm/slub: extend redzone check to extra allocated kmalloc space than requested") added precise kmalloc redzone poisoning to the slub_debug functionality. However, this commit didn't account for HW_TAGS KASAN fully initializing the object via its built-in memory initialization feature. Even though HW_TAGS KASAN memory initialization contains special memory initialization handling for when slub_debug is enabled, it does not account for in-object slub_debug redzones. As a result, HW_TAGS KASAN can overwrite these redzones and cause false-positive slub_debug reports. To fix the issue, avoid HW_TAGS KASAN memory initialization when slub_debug is enabled altogether. Implement this by moving the __slub_debug_enabled check to slab_post_alloc_hook. Common slab code seems like a more appropriate place for a slub_debug check anyway. Link: https://lkml.kernel.org/r/678ac92ab790dba9198f9ca14f405651b97c8502.1688561016.git.andreyknvl@google.com Fixes: 946fa0dbf2d8 ("mm/slub: extend redzone check to extra allocated kmalloc space than requested") Signed-off-by: Andrey Konovalov <[email protected]> Reported-by: Will Deacon <[email protected]> Acked-by: Marco Elver <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: David Rientjes <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Feng Tang <[email protected]> Cc: Hyeonggon Yoo <[email protected]> Cc: Joonsoo Kim <[email protected]> Cc: [email protected] Cc: Pekka Enberg <[email protected]> Cc: Peter Collingbourne <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: Vincenzo Frascino <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-07-08kasan: fix type cast in memory_is_poisoned_nAndrey Konovalov1-1/+2
Commit bb6e04a173f0 ("kasan: use internal prototypes matching gcc-13 builtins") introduced a bug into the memory_is_poisoned_n implementation: it effectively removed the cast to a signed integer type after applying KASAN_GRANULE_MASK. As a result, KASAN started failing to properly check memset, memcpy, and other similar functions. Fix the bug by adding the cast back (through an additional signed integer variable to make the code more readable). Link: https://lkml.kernel.org/r/8c9e0251c2b8b81016255709d4ec42942dcaf018.1688431866.git.andreyknvl@google.com Fixes: bb6e04a173f0 ("kasan: use internal prototypes matching gcc-13 builtins") Signed-off-by: Andrey Konovalov <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Marco Elver <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-07-08mailmap: add entries for Heiko StuebnerHeiko Stuebner1-0/+3
I am going to lose my vrull.eu address at the end of july, and while adding it to mailmap I also realised that there are more old addresses from me dangling, so update .mailmap for all of them. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Heiko Stuebner <[email protected]> Signed-off-by: Heiko Stuebner <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-07-08mailmap: update manpage linkHeiko Stuebner1-1/+2
Patch series "Update .mailmap for my work address and fix manpage". While updating mailmap for the going-away address, I also found that on current systems the manpage linked from the header comment changed. And in fact it looks like the git mailmap feature got its own manpage. This patch (of 2): On recent systems the git-shortlog manpage only tells people to See gitmailmap(5) So instead of sending people on a scavenger hunt, put that info into the header directly. Though keep the old reference around for older systems. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Heiko Stuebner <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-07-08bootmem: remove the vmemmap pages from kmemleak in free_bootmem_pageLiu Shixin1-0/+2
commit dd0ff4d12dd2 ("bootmem: remove the vmemmap pages from kmemleak in put_page_bootmem") fix an overlaps existing problem of kmemleak. But the problem still existed when HAVE_BOOTMEM_INFO_NODE is disabled, because in this case, free_bootmem_page() will call free_reserved_page() directly. Fix the problem by adding kmemleak_free_part() in free_bootmem_page() when HAVE_BOOTMEM_INFO_NODE is disabled. Link: https://lkml.kernel.org/r/[email protected] Fixes: f41f2ed43ca5 ("mm: hugetlb: free the vmemmap pages associated with each HugeTLB page") Signed-off-by: Liu Shixin <[email protected]> Acked-by: Muchun Song <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-07-08MAINTAINERS: add linux-next infoRandy Dunlap1-0/+7
Add linux-next info to MAINTAINERS for ease of finding this data. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Randy Dunlap <[email protected]> Acked-by: Stephen Rothwell <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-07-08mailmap: add Markus Schneider-PargmannMarkus Schneider-Pargmann1-0/+1
Add my old mail address and update my name. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Markus Schneider-Pargmann <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-07-08writeback: account the number of pages written backMatthew Wilcox (Oracle)1-3/+5
nr_to_write is a count of pages, so we need to decrease it by the number of pages in the folio we just wrote, not by 1. Most callers specify either LONG_MAX or 1, so are unaffected, but writeback_sb_inodes() might end up writing 512x as many pages as it asked for. Dave added: : XFS is the only filesystem this would affect, right? AFAIA, nothing : else enables large folios and uses writeback through : write_cache_pages() at this point... : : In which case, I'd be surprised if much difference, if any, gets : noticed by anyone. Link: https://lkml.kernel.org/r/[email protected] Fixes: 793917d997df ("mm/readahead: Add large folio readahead") Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Cc: Jan Kara <[email protected]> Cc: Dave Chinner <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-07-08mm: call arch_swap_restore() from do_swap_page()Peter Collingbourne1-0/+7
Commit c145e0b47c77 ("mm: streamline COW logic in do_swap_page()") moved the call to swap_free() before the call to set_pte_at(), which meant that the MTE tags could end up being freed before set_pte_at() had a chance to restore them. Fix it by adding a call to the arch_swap_restore() hook before the call to swap_free(). Link: https://lkml.kernel.org/r/[email protected] Link: https://linux-review.googlesource.com/id/I6470efa669e8bd2f841049b8c61020c510678965 Fixes: c145e0b47c77 ("mm: streamline COW logic in do_swap_page()") Signed-off-by: Peter Collingbourne <[email protected]> Reported-by: Qun-wei Lin <[email protected]> Closes: https://lore.kernel.org/all/[email protected]/ Acked-by: David Hildenbrand <[email protected]> Acked-by: "Huang, Ying" <[email protected]> Reviewed-by: Steven Price <[email protected]> Acked-by: Catalin Marinas <[email protected]> Cc: <[email protected]> [6.1+] Signed-off-by: Andrew Morton <[email protected]>
2023-07-08squashfs: fix cache race with migrationVincent Whitchurch1-4/+23
Migration replaces the page in the mapping before copying the contents and the flags over from the old page, so check that the page in the page cache is really up to date before using it. Without this, stressing squashfs reads with parallel compaction sometimes results in squashfs reporting data corruption. Link: https://lkml.kernel.org/r/[email protected] Fixes: e994f5b677ee ("squashfs: cache partial compressed blocks") Signed-off-by: Vincent Whitchurch <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Phillip Lougher <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-07-08mm/hugetlb.c: fix a bug within a BUG(): inconsistent pte comparisonJohn Hubbard1-1/+6
The following crash happens for me when running the -mm selftests (below). Specifically, it happens while running the uffd-stress subtests: kernel BUG at mm/hugetlb.c:7249! invalid opcode: 0000 [#1] PREEMPT SMP NOPTI CPU: 0 PID: 3238 Comm: uffd-stress Not tainted 6.4.0-hubbard-github+ #109 Hardware name: ASUS X299-A/PRIME X299-A, BIOS 1503 08/03/2018 RIP: 0010:huge_pte_alloc+0x12c/0x1a0 ... Call Trace: <TASK> ? __die_body+0x63/0xb0 ? die+0x9f/0xc0 ? do_trap+0xab/0x180 ? huge_pte_alloc+0x12c/0x1a0 ? do_error_trap+0xc6/0x110 ? huge_pte_alloc+0x12c/0x1a0 ? handle_invalid_op+0x2c/0x40 ? huge_pte_alloc+0x12c/0x1a0 ? exc_invalid_op+0x33/0x50 ? asm_exc_invalid_op+0x16/0x20 ? __pfx_put_prev_task_idle+0x10/0x10 ? huge_pte_alloc+0x12c/0x1a0 hugetlb_fault+0x1a3/0x1120 ? finish_task_switch+0xb3/0x2a0 ? lock_is_held_type+0xdb/0x150 handle_mm_fault+0xb8a/0xd40 ? find_vma+0x5d/0xa0 do_user_addr_fault+0x257/0x5d0 exc_page_fault+0x7b/0x1f0 asm_exc_page_fault+0x22/0x30 That happens because a BUG() statement in huge_pte_alloc() attempts to check that a pte, if present, is a hugetlb pte, but it does so in a non-lockless-safe manner that leads to a false BUG() report. We got here due to a couple of bugs, each of which by itself was not quite enough to cause a problem: First of all, before commit c33c794828f2("mm: ptep_get() conversion"), the BUG() statement in huge_pte_alloc() was itself fragile: it relied upon compiler behavior to only read the pte once, despite using it twice in the same conditional. Next, commit c33c794828f2 ("mm: ptep_get() conversion") broke that delicate situation, by causing all direct pte reads to be done via READ_ONCE(). And so READ_ONCE() got called twice within the same BUG() conditional, leading to comparing (potentially, occasionally) different versions of the pte, and thus to false BUG() reports. Fix this by taking a single snapshot of the pte before using it in the BUG conditional. Now, that commit is only partially to blame here but, people doing bisections will invariably land there, so this will help them find a fix for a real crash. And also, the previous behavior was unlikely to ever expose this bug--it was fragile, yet not actually broken. So that's why I chose this commit for the Fixes tag, rather than the commit that created the original BUG() statement. Link: https://lkml.kernel.org/r/[email protected] Fixes: c33c794828f2 ("mm: ptep_get() conversion") Signed-off-by: John Hubbard <[email protected]> Acked-by: James Houghton <[email protected]> Acked-by: Muchun Song <[email protected]> Reviewed-by: Ryan Roberts <[email protected]> Acked-by: Mike Kravetz <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Al Viro <[email protected]> Cc: Alex Williamson <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andrey Konovalov <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Christian Brauner <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Dave Airlie <[email protected]> Cc: Dimitri Sivanich <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Kirill A. Shutemov <[email protected]> Cc: Lorenzo Stoakes <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Miaohe Lin <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Mike Rapoport (IBM) <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: Oleksandr Tyshchenko <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: SeongJae Park <[email protected]> Cc: Shakeel Butt <[email protected]> Cc: Uladzislau Rezki (Sony) <[email protected]> Cc: Vincenzo Frascino <[email protected]> Cc: Yu Zhao <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-07-08docs: update ocfs2-devel mailing list addressAnthony Iliopoulos7-17/+17
The ocfs2-devel mailing list has been migrated to the kernel.org infrastructure, update all related documentation pointers to reflect the change. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Anthony Iliopoulos <[email protected]> Acked-by: Joseph Qi <[email protected]> Acked-by: Joel Becker <[email protected]> Cc: Changwei Ge <[email protected]> Cc: Gang He <[email protected]> Cc: Jun Piao <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Mark Fasheh <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-07-08MAINTAINERS: update ocfs2-devel mailing list addressAnthony Iliopoulos1-1/+1
The ocfs2-devel mailing list has been migrated to the kernel.org infrastructure, update the related entry to reflect the change. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Anthony Iliopoulos <[email protected]> Acked-by: Joseph Qi <[email protected]> Acked-by: Joel Becker <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Changwei Ge <[email protected]> Cc: Gang He <[email protected]> Cc: Jun Piao <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-07-08mm: disable CONFIG_PER_VMA_LOCK until its fixedSuren Baghdasaryan1-1/+2
A memory corruption was reported in [1] with bisection pointing to the patch [2] enabling per-VMA locks for x86. Disable per-VMA locks config to prevent this issue until the fix is confirmed. This is expected to be a temporary measure. [1] https://bugzilla.kernel.org/show_bug.cgi?id=217624 [2] https://lore.kernel.org/all/[email protected] Link: https://lkml.kernel.org/r/[email protected] Reported-by: Jiri Slaby <[email protected]> Closes: https://lore.kernel.org/all/[email protected]/ Reported-by: Jacob Young <[email protected]> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217624 Fixes: 0bff0aaea03e ("x86/mm: try VMA lock-based page fault handling first") Signed-off-by: Suren Baghdasaryan <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Holger Hoffstätte <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-07-08fork: lock VMAs of the parent process when forkingSuren Baghdasaryan1-0/+6
Patch series "Avoid memory corruption caused by per-VMA locks", v4. A memory corruption was reported in [1] with bisection pointing to the patch [2] enabling per-VMA locks for x86. Based on the reproducer provided in [1] we suspect this is caused by the lack of VMA locking while forking a child process. Patch 1/2 in the series implements proper VMA locking during fork. I tested the fix locally using the reproducer and was unable to reproduce the memory corruption problem. This fix can potentially regress some fork-heavy workloads. Kernel build time did not show noticeable regression on a 56-core machine while a stress test mapping 10000 VMAs and forking 5000 times in a tight loop shows ~7% regression. If such fork time regression is unacceptable, disabling CONFIG_PER_VMA_LOCK should restore its performance. Further optimizations are possible if this regression proves to be problematic. Patch 2/2 disables per-VMA locks until the fix is tested and verified. This patch (of 2): When forking a child process, parent write-protects an anonymous page and COW-shares it with the child being forked using copy_present_pte(). Parent's TLB is flushed right before we drop the parent's mmap_lock in dup_mmap(). If we get a write-fault before that TLB flush in the parent, and we end up replacing that anonymous page in the parent process in do_wp_page() (because, COW-shared with the child), this might lead to some stale writable TLB entries targeting the wrong (old) page. Similar issue happened in the past with userfaultfd (see flush_tlb_page() call inside do_wp_page()). Lock VMAs of the parent process when forking a child, which prevents concurrent page faults during fork operation and avoids this issue. This fix can potentially regress some fork-heavy workloads. Kernel build time did not show noticeable regression on a 56-core machine while a stress test mapping 10000 VMAs and forking 5000 times in a tight loop shows ~7% regression. If such fork time regression is unacceptable, disabling CONFIG_PER_VMA_LOCK should restore its performance. Further optimizations are possible if this regression proves to be problematic. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Fixes: 0bff0aaea03e ("x86/mm: try VMA lock-based page fault handling first") Signed-off-by: Suren Baghdasaryan <[email protected]> Suggested-by: David Hildenbrand <[email protected]> Reported-by: Jiri Slaby <[email protected]> Closes: https://lore.kernel.org/all/[email protected]/ Reported-by: Holger Hoffstätte <[email protected]> Closes: https://lore.kernel.org/all/[email protected]/ Reported-by: Jacob Young <[email protected]> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=3D217624 Reviewed-by: Liam R. Howlett <[email protected]> Acked-by: David Hildenbrand <[email protected]> Tested-by: Holger Hoffsttte <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>