aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2023-10-05Merge tag 'leds-fixes-6.6' of ↵Linus Torvalds1-4/+0
git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds Pull LED fix from Lee Jones: "Just the one bug-fix: - Fix regression affecting LED_COLOR_ID_MULTI users" * tag 'leds-fixes-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds: leds: Drop BUG_ON check for LED_COLOR_ID_MULTI
2023-10-05Merge tag 'mfd-fixes-6.6' of ↵Linus Torvalds2-4/+1
git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd Pull MFD fixes from Lee Jones: "A couple of small fixes: - Potential build failure in CS42L43 - Device Tree bindings clean-up for a superseded patch" * tag 'mfd-fixes-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: dt-bindings: mfd: Revert "dt-bindings: mfd: maxim,max77693: Add USB connector" mfd: cs42l43: Fix MFD_CS42L43 dependency on REGMAP_IRQ
2023-10-05Merge tag 'ovl-fixes-6.6-rc5' of ↵Linus Torvalds5-30/+28
git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs Pull overlayfs fixes from Amir Goldstein: - Fix for file reference leak regression - Fix for NULL pointer deref regression - Fixes for RCU-walk race regressions: Two of the fixes were taken from Al's RCU pathwalk race fixes series with his consent [1]. Note that unlike most of Al's series, these two patches are not about racing with ->kill_sb() and they are also very recent regressions from v6.5, so I think it's worth getting them into v6.5.y. There is also a fix for an RCU pathwalk race with ->kill_sb(), which may have been solved in vfs generic code as you suggested, but it also rids overlayfs from a nasty hack, so I think it's worth anyway. Link: https://lore.kernel.org/linux-fsdevel/20231003204749.GA800259@ZenIV/ [1] * tag 'ovl-fixes-6.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs: ovl: fix NULL pointer defer when encoding non-decodable lower fid ovl: make use of ->layers safe in rcu pathwalk ovl: fetch inode once in ovl_dentry_revalidate_common() ovl: move freeing ovl_entry past rcu delay ovl: fix file reference leak when submitting aio
2023-10-04Merge tag 'rtla-v6.6-fixes' of ↵Linus Torvalds3-10/+32
git://git.kernel.org/pub/scm/linux/kernel/git/bristot/linux Pull rtla fixes from Daniel Bristot de Oliveira: "rtla (Real-Time Linux Analysis) tool fixes. Timerlat auto-analysis: - Timerlat is reporting thread interference time without thread noise events occurrence. It was caused because the thread interference variable was not reset after the analysis of a timerlat activation that did not hit the threshold. - The IRQ handler delay is estimated from the delta of the IRQ latency reported by timerlat, and the timestamp from IRQ handler start event. If the delta is near-zero, the drift from the external clock and the trace event and/or the overhead can cause the value to be negative. If the value is negative, print a zero-delay. - IRQ handlers happening after the timerlat thread event but before the stop tracing were being reported as IRQ that happened before the *current* IRQ occurrence. Ignore Previous IRQ noise in this condition because they are valid only for the *next* timerlat activation. Timerlat user-space: - Timerlat is stopping all user-space thread if a CPU becomes offline. Do not stop the entire tool if a CPU is/become offline, but only the thread of the unavailable CPU. Stop the tool only, if all threads leave because the CPUs become/are offline. man-pages: - Fix command line example in timerlat hist man page" * tag 'rtla-v6.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bristot/linux: rtla: fix a example in rtla-timerlat-hist.rst rtla/timerlat: Do not stop user-space if a cpu is offline rtla/timerlat_aa: Fix previous IRQ delay for IRQs that happens after thread sample rtla/timerlat_aa: Fix negative IRQ delay rtla/timerlat_aa: Zero thread sum after every sample analysis
2023-10-04Merge tag 'linux-kselftest-fixes-6.6-rc5' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kselftest fix from Shuah Khan: "One single fix to Makefile to fix the incorrect TARGET name for uevent test" * tag 'linux-kselftest-fixes-6.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests: Fix wrong TARGET in kselftest top level Makefile
2023-10-03Merge tag 'nfs-for-6.6-3' of git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds7-22/+63
Pull NFS client fixes from Anna Schumaker: "Stable fixes: - Revert "SUNRPC dont update timeout value on connection reset" - NFSv4: Fix a state manager thread deadlock regression Fixes: - Fix potential NULL pointer dereference in nfs_inode_remove_request() - Fix rare NULL pointer dereference in xs_tcp_tls_setup_socket() - Fix long delay before failing a TLS mount when server does not support TLS - Fix various NFS state manager issues" * tag 'nfs-for-6.6-3' of git://git.linux-nfs.org/projects/anna/linux-nfs: nfs: decrement nrequests counter before releasing the req SUNRPC/TLS: Lock the lower_xprt during the tls handshake Revert "SUNRPC dont update timeout value on connection reset" NFSv4: Fix a state manager thread deadlock regression NFSv4: Fix a nfs4_state_manager() race SUNRPC: Fail quickly when server does not recognize TLS
2023-10-03Merge tag 'regulator-fix-v6.6-rc4' of ↵Linus Torvalds2-10/+18
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fixes from Mark Brown: "Two things here, one is an improved fix for issues around freeing devices when registration fails which replaces a half baked fix with a more complete one which uses the device model release() function properly. The other fix is a device specific fix for mt6358, the driver said that the LDOs supported mode configuration but this is not actually the case and could cause issues" * tag 'regulator-fix-v6.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator/core: Revert "fix kobject release warning and memory leak in regulator_register()" regulator/core: regulator_register: set device->class earlier regulator: mt6358: split ops for buck and linear range LDO regulators
2023-10-03Merge tag 'regmap-fix-v6.6-rc4' of ↵Linus Torvalds1-1/+2
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap Pull regmap fix from Mark Brown: "A fix for a long standing issue where when we create a new node in an rbtree register cache we were failing to convert the register address of the new register into a bitmask correctly and marking the wrong register as being present in the newly created node. This would only have affected devices with a register stride other than 1 but would corrupt data on those devices" * tag 'regmap-fix-v6.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap: regmap: rbtree: Fix wrong register marked as in-cache when creating new node
2023-10-03Merge tag 'scsi-fixes' of ↵Linus Torvalds6-29/+63
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Three fixes, all in drivers. The fnic one is the most extensive because the little used user initiated device reset path never tagged the command and adding a tag is rather involved. The other two fixes are smaller and more obvious" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: zfcp: Fix a double put in zfcp_port_enqueue() scsi: fnic: Fix sg_reset success path scsi: target: core: Fix deadlock due to recursive locking
2023-10-03ovl: fix NULL pointer defer when encoding non-decodable lower fidAmir Goldstein1-1/+1
A wrong return value from ovl_check_encode_origin() would cause ovl_dentry_to_fid() to try to encode fid from NULL upper dentry. Reported-by: syzbot+2208f82282740c1c8915@syzkaller.appspotmail.com Fixes: 16aac5ad1fa9 ("ovl: support encoding non-decodable file handles") Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2023-10-02Merge tag 'ubifs-for-linus-6.6-rc5' of ↵Linus Torvalds1-0/+7
git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs Pull UBI fix from Richard Weinberger: - Don't try to attach MTDs with erase block size 0 * tag 'ubifs-for-linus-6.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs: ubi: Refuse attaching if mtd's erasesize is 0
2023-10-02Merge tag 'libnvdimm-fixes-6.6-rc5' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm fix from Dave Jiang: - Fix incorrect calculation of idt size in NFIT * tag 'libnvdimm-fixes-6.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: ACPI: NFIT: Fix incorrect calculation of idt size
2023-10-02Merge tag 'iommu-fixes-v6.6-rc4' of ↵Linus Torvalds7-32/+33
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu fixes from Joerg Roedel: - Arm SMMU fixes from Will Deacon: - Fix TLB range command encoding when TTL, Num and Scale are all zero - Fix soft lockup by limiting TLB invalidation ops issued by SVA - Fix clocks description for SDM630 platform in arm-smmu DT binding - Intel VT-d fix from Lu Baolu: - Fix a suspend/hibernation problem in iommu_suspend() - Mediatek driver: Fix page table sharing for addresses over 4GiB - Apple/Dart: DMA_FQ handling fix in attach_dev() * tag 'iommu-fixes-v6.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/vt-d: Avoid memory allocation in iommu_suspend() iommu/apple-dart: Handle DMA_FQ domains in attach_dev() iommu/mediatek: Fix share pgtable for iova over 4GB iommu/arm-smmu-v3: Fix soft lockup triggered by arm_smmu_mm_invalidate_range dt-bindings: arm-smmu: Fix SDM630 clocks description iommu/arm-smmu-v3: Avoid constructing invalid range commands
2023-10-02ovl: make use of ->layers safe in rcu pathwalkAmir Goldstein3-24/+21
ovl_permission() accesses ->layers[...].mnt; we can't have ->layers freed without an RCU delay on fs shutdown. Fortunately, kern_unmount_array() that is used to drop those mounts does include an RCU delay, so freeing is delayed; unfortunately, the array passed to kern_unmount_array() is formed by mangling ->layers contents and that happens without any delays. The ->layers[...].name string entries are used to store the strings to display in "lowerdir=..." by ovl_show_options(). Those entries are not accessed in RCU walk. Move the name strings into a separate array ofs->config.lowerdirs and reuse the ofs->config.lowerdirs array as the temporary mount array to pass to kern_unmount_array(). Reported-by: Al Viro <viro@zeniv.linux.org.uk> Link: https://lore.kernel.org/r/20231002023711.GP3389589@ZenIV/ Acked-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2023-10-02ovl: fetch inode once in ovl_dentry_revalidate_common()Al Viro1-2/+4
d_inode_rcu() is right - we might be in rcu pathwalk; however, OVL_E() hides plain d_inode() on the same dentry... Fixes: a6ff2bc0be17 ("ovl: use OVL_E() and OVL_E_FLAGS() accessors") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2023-10-02ovl: move freeing ovl_entry past rcu delayAl Viro1-1/+2
... into ->free_inode(), that is. Fixes: 0af950f57fef "ovl: move ovl_entry into ovl_inode" Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2023-10-02ovl: fix file reference leak when submitting aioAmir Goldstein1-2/+0
Commit 724768a39374 ("ovl: fix incorrect fdput() on aio completion") took a refcount on real file before submitting aio, but forgot to avoid clearing FDPUT_FPUT from real.flags stack variable. This can result in a file reference leak. Fixes: 724768a39374 ("ovl: fix incorrect fdput() on aio completion") Reported-by: Gil Lev <contact@levgil.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2023-10-01Linux 6.6-rc4Linus Torvalds1-1/+1
2023-10-01Merge tag 'kbuild-fixes-v6.6-2' of ↵Linus Torvalds7-15/+41
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - Fix the module compression with xz so the in-kernel decompressor works - Document a kconfig idiom to express an optional dependency between modules - Make modpost, when W=1 is given, detect broken drivers that reference .exit.* sections - Remove unused code * tag 'kbuild-fixes-v6.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kbuild: remove stale code for 'source' symlink in packaging scripts modpost: Don't let "driver"s reference .exit.* vmlinux.lds.h: remove unused CPU_KEEP and CPU_DISCARD macros modpost: add missing else to the "of" check Documentation: kbuild: explain handling optional dependencies kbuild: Use CRC32 and a 1MiB dictionary for XZ compressed modules
2023-10-01Merge tag 'mm-hotfixes-stable-2023-10-01-08-34' of ↵Linus Torvalds38-169/+455
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "Fourteen hotfixes, eleven of which are cc:stable. The remainder pertain to issues which were introduced after 6.5" * tag 'mm-hotfixes-stable-2023-10-01-08-34' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: Crash: add lock to serialize crash hotplug handling selftests/mm: fix awk usage in charge_reserved_hugetlb.sh and hugetlb_reparenting_test.sh that may cause error mm: mempolicy: keep VMA walk if both MPOL_MF_STRICT and MPOL_MF_MOVE are specified mm/damon/vaddr-test: fix memory leak in damon_do_test_apply_three_regions() mm, memcg: reconsider kmem.limit_in_bytes deprecation mm: zswap: fix potential memory corruption on duplicate store arm64: hugetlb: fix set_huge_pte_at() to work with all swap entries mm: hugetlb: add huge page size param to set_huge_pte_at() maple_tree: add MAS_UNDERFLOW and MAS_OVERFLOW states maple_tree: add mas_is_active() to detect in-tree walks nilfs2: fix potential use after free in nilfs_gccache_submit_read_data() mm: abstract moving to the next PFN mm: report success more often from filemap_map_folio_range() fs: binfmt_elf_efpic: fix personality for ELF-FDPIC
2023-10-01Merge tag 'char-misc-6.6-rc4' of ↵Linus Torvalds6-212/+102
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull misc driver fix from Greg KH: "Here is a single, much requested, fix for a set of misc drivers to resolve a much reported regression in the -rc series that has also propagated back to the stable releases. Sorry for the delay, lots of conference travel for a few weeks put me very far behind in patch wrangling. It has been reported by many to resolve the reported problem, and has been in linux-next with no reported issues" * tag 'char-misc-6.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: misc: rtsx: Fix some platforms can not boot and move the l1ss judgment to probe
2023-10-01Merge tag 'tty-6.6-rc4' of ↵Linus Torvalds2-4/+5
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty / serial driver fixes from Greg KH: "Here are two tty/serial driver fixes for 6.6-rc4 that resolve some reported regressions: - revert a n_gsm change that ended up causing problems - 8250_port fix for irq data both have been in linux-next for over a week with no reported problems" * tag 'tty-6.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: Revert "tty: n_gsm: fix UAF in gsm_cleanup_mux" serial: 8250_port: Check IRQ data before use
2023-10-01Merge tag 'x86-urgent-2023-10-01' of ↵Linus Torvalds3-7/+26
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Misc fixes: a kerneldoc build warning fix, add SRSO mitigation for AMD-derived Hygon processors, and fix a SGX kernel crash in the page fault handler that can trigger when ksgxd races to reclaim the SECS special page, by making the SECS page unswappable" * tag 'x86-urgent-2023-10-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/sgx: Resolves SECS reclaim vs. page fault for EAUG race x86/srso: Add SRSO mitigation for Hygon processors x86/kgdb: Fix a kerneldoc warning when build with W=1
2023-10-01Merge tag 'timers-urgent-2023-10-01' of ↵Linus Torvalds1-1/+5
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Ingo Molnar: "Fix a spurious kernel warning during CPU hotplug events that may trigger when timer/hrtimer softirqs are pending, which are otherwise hotplug-safe and don't merit a warning" * tag 'timers-urgent-2023-10-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: timers: Tag (hr)timer softirq as hotplug safe
2023-10-01Merge tag 'sched-urgent-2023-10-01' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fix from Ingo Molnar: "Fix a RT tasks related lockup/live-lock during CPU offlining" * tag 'sched-urgent-2023-10-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/rt: Fix live lock between select_fallback_rq() and RT push
2023-10-01Merge tag 'perf-urgent-2023-10-01' of ↵Linus Torvalds1-7/+17
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf event fixes from Ingo Molnar: "Misc fixes: work around an AMD microcode bug on certain models, and fix kexec kernel PMI handlers on AMD systems that get loaded on older kernels that have an unexpected register state" * tag 'perf-urgent-2023-10-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/amd: Do not WARN() on every IRQ perf/x86/amd/core: Fix overflow reset on hotplug
2023-10-01kbuild: remove stale code for 'source' symlink in packaging scriptsMasahiro Yamada2-4/+0
Since commit d8131c2965d5 ("kbuild: remove $(MODLIB)/source symlink"), modules_install does not create the 'source' symlink. Remove the stale code from builddeb and kernel.spec. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2023-10-01modpost: Don't let "driver"s reference .exit.*Uwe Kleine-König1-2/+13
Drivers must not reference functions marked with __exit as these likely are not available when the code is built-in. There are few creative offenders uncovered for example in ARCH=amd64 allmodconfig builds. So only trigger the section mismatch warning for W=1 builds. The dual rule that drivers must not reference .init.* is implemented since commit 0db252452378 ("modpost: don't allow *driver to reference .init.*") which however missed that .exit.* should be handled in the same way. Thanks to Masahiro Yamada and Arnd Bergmann who gave valuable hints to find this improvement. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2023-10-01vmlinux.lds.h: remove unused CPU_KEEP and CPU_DISCARD macrosMasahiro Yamada1-7/+0
Remove the left-over of commit e24f6628811e ("modpost: remove all traces of cpuinit/cpuexit sections"). Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Acked-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2023-10-01modpost: add missing else to the "of" checkMauricio Faria de Oliveira1-1/+1
Without this 'else' statement, an "usb" name goes into two handlers: the first/previous 'if' statement _AND_ the for-loop over 'devtable', but the latter is useless as it has no 'usb' device_id entry anyway. Tested with allmodconfig before/after patch; no changes to *.mod.c: git checkout v6.6-rc3 make -j$(nproc) allmodconfig make -j$(nproc) olddefconfig make -j$(nproc) find . -name '*.mod.c' | cpio -pd /tmp/before # apply patch make -j$(nproc) find . -name '*.mod.c' | cpio -pd /tmp/after diff -r /tmp/before/ /tmp/after/ # no difference Fixes: acbef7b76629 ("modpost: fix module autoloading for OF devices with generic compatible property") Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2023-09-30Merge tag 'soc-fixes-6.6' of ↵Linus Torvalds29-90/+179
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Arnd Bergmann: "These are the latest bug fixes that have come up in the soc tree. Most of these are fairly minor. Most notably, the majority of changes this time are not for dts files as usual. - Updates to the addresses of the broadcom and aspeed entries in the MAINTAINERS file. - Defconfig updates to address a regression on samsung and a build warning from an unknown Kconfig symbol - Build fixes for the StrongARM and Uniphier platforms - Code fixes for SCMI and FF-A firmware drivers, both of which had a simple bug that resulted in invalid data, and a lesser fix for the optee firmware driver - Multiple fixes for the recently added loongson/loongarch "guts" soc driver - Devicetree fixes for RISC-V on the startfive platform, addressing issues with NOR flash, usb and uart. - Multiple fixes for NXP i.MX8/i.MX9 dts files, fixing problems with clock, gpio, hdmi settings and the Makefile - Bug fixes for i.MX firmware code and the OCOTP soc driver - Multiple fixes for the TI sysc bus driver - Minor dts updates for TI omap dts files, to address boot time warnings and errors" * tag 'soc-fixes-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (35 commits) MAINTAINERS: Fix Florian Fainelli's email address arm64: defconfig: enable syscon-poweroff driver ARM: locomo: fix locomolcd_power declaration soc: loongson: loongson2_guts: Remove unneeded semicolon soc: loongson: loongson2_guts: Convert to devm_platform_ioremap_resource() soc: loongson: loongson_pm2: Populate children syscon nodes dt-bindings: soc: loongson,ls2k-pmc: Allow syscon-reboot/syscon-poweroff as child soc: loongson: loongson_pm2: Drop useless of_device_id compatible dt-bindings: soc: loongson,ls2k-pmc: Use fallbacks for ls2k-pmc compatible soc: loongson: loongson_pm2: Add dependency for INPUT arm64: defconfig: remove CONFIG_COMMON_CLK_NPCM8XX=y ARM: uniphier: fix cache kernel-doc warnings MAINTAINERS: aspeed: Update Andrew's email address MAINTAINERS: aspeed: Update git tree URL firmware: arm_ffa: Don't set the memory region attributes for MEM_LEND arm64: dts: imx: Add imx8mm-prt8mm.dtb to build arm64: dts: imx8mm-evk: Fix hdmi@3d node soc: imx8m: Enable OCOTP clock for imx8mm before reading registers arm64: dts: imx8mp-beacon-kit: Fix audio_pll2 clock arm64: dts: imx8mp: Fix SDMA2/3 clocks ...
2023-09-30Merge tag 'trace-v6.6-rc3' of ↵Linus Torvalds4-8/+56
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: - Make sure 32-bit applications using user events have aligned access when running on a 64-bit kernel. - Add cond_resched in the loop that handles converting enums in print_fmt string is trace events. - Fix premature wake ups of polling processes in the tracing ring buffer. When a task polls waiting for a percentage of the ring buffer to be filled, the writer still will wake it up at every event. Add the polling's percentage to the "shortest_full" list to tell the writer when to wake it up. - For eventfs dir lookups on dynamic events, an event system's only event could be removed, leaving its dentry with no children. This is totally legitimate. But in eventfs_release() it must not access the children array, as it is only allocated when the dentry has children. * tag 'trace-v6.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: eventfs: Test for dentries array allocated in eventfs_release() tracing/user_events: Align set_bit() address for all archs tracing: relax trace_event_eval_update() execution with cond_resched() ring-buffer: Update "shortest_full" in polling
2023-09-30eventfs: Test for dentries array allocated in eventfs_release()Steven Rostedt (Google)1-1/+1
The dcache_dir_open_wrapper() could be called when a dynamic event is being deleted leaving a dentry with no children. In this case the dlist->dentries array will never be allocated. This needs to be checked for in eventfs_release(), otherwise it will trigger a NULL pointer dereference. Link: https://lore.kernel.org/linux-trace-kernel/20230930090106.1c3164e9@rorschach.local.home Cc: Mark Rutland <mark.rutland@arm.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Fixes: ef36b4f92868 ("eventfs: Remember what dentries were created on dir open") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-09-30tracing/user_events: Align set_bit() address for all archsBeau Belgrave1-7/+51
All architectures should use a long aligned address passed to set_bit(). User processes can pass either a 32-bit or 64-bit sized value to be updated when tracing is enabled when on a 64-bit kernel. Both cases are ensured to be naturally aligned, however, that is not enough. The address must be long aligned without affecting checks on the value within the user process which require different adjustments for the bit for little and big endian CPUs. Add a compat flag to user_event_enabler that indicates when a 32-bit value is being used on a 64-bit kernel. Long align addresses and correct the bit to be used by set_bit() to account for this alignment. Ensure compat flags are copied during forks and used during deletion clears. Link: https://lore.kernel.org/linux-trace-kernel/20230925230829.341-2-beaub@linux.microsoft.com Link: https://lore.kernel.org/linux-trace-kernel/20230914131102.179100-1-cleger@rivosinc.com/ Cc: stable@vger.kernel.org Fixes: 7235759084a4 ("tracing/user_events: Use remote writes for event enablement") Reported-by: Clément Léger <cleger@rivosinc.com> Suggested-by: Clément Léger <cleger@rivosinc.com> Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-09-30tracing: relax trace_event_eval_update() execution with cond_resched()Clément Léger1-0/+1
When kernel is compiled without preemption, the eval_map_work_func() (which calls trace_event_eval_update()) will not be preempted up to its complete execution. This can actually cause a problem since if another CPU call stop_machine(), the call will have to wait for the eval_map_work_func() function to finish executing in the workqueue before being able to be scheduled. This problem was observe on a SMP system at boot time, when the CPU calling the initcalls executed clocksource_done_booting() which in the end calls stop_machine(). We observed a 1 second delay because one CPU was executing eval_map_work_func() and was not preempted by the stop_machine() task. Adding a call to cond_resched() in trace_event_eval_update() allows other tasks to be executed and thus continue working asynchronously like before without blocking any pending task at boot time. Link: https://lore.kernel.org/linux-trace-kernel/20230929191637.416931-1-cleger@rivosinc.com Cc: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Clément Léger <cleger@rivosinc.com> Tested-by: Atish Patra <atishp@rivosinc.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-09-30ring-buffer: Update "shortest_full" in pollingSteven Rostedt (Google)1-0/+3
It was discovered that the ring buffer polling was incorrectly stating that read would not block, but that's because polling did not take into account that reads will block if the "buffer-percent" was set. Instead, the ring buffer polling would say reads would not block if there was any data in the ring buffer. This was incorrect behavior from a user space point of view. This was fixed by commit 42fb0a1e84ff by having the polling code check if the ring buffer had more data than what the user specified "buffer percent" had. The problem now is that the polling code did not register itself to the writer that it wanted to wait for a specific "full" value of the ring buffer. The result was that the writer would wake the polling waiter whenever there was a new event. The polling waiter would then wake up, see that there's not enough data in the ring buffer to notify user space and then go back to sleep. The next event would wake it up again. Before the polling fix was added, the code would wake up around 100 times for a hackbench 30 benchmark. After the "fix", due to the constant waking of the writer, it would wake up over 11,0000 times! It would never leave the kernel, so the user space behavior was still "correct", but this definitely is not the desired effect. To fix this, have the polling code add what it's waiting for to the "shortest_full" variable, to tell the writer not to wake it up if the buffer is not as full as it expects to be. Note, after this fix, it appears that the waiter is now woken up around 2x the times it was before (~200). This is a tremendous improvement from the 11,000 times, but I will need to spend some time to see why polling is more aggressive in its wakeups than the read blocking code. Link: https://lore.kernel.org/linux-trace-kernel/20230929180113.01c2cae3@rorschach.local.home Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Fixes: 42fb0a1e84ff ("tracing/ring-buffer: Have polling block on watermark") Reported-by: Julia Lawall <julia.lawall@inria.fr> Tested-by: Julia Lawall <julia.lawall@inria.fr> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-09-30Merge tag 'dma-mapping-6.6-2023-09-30' of ↵Linus Torvalds2-16/+38
git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping fixes from Christoph Hellwig: - fix the narea calculation in swiotlb initialization (Ross Lagerwall) - fix the check whether a device has used swiotlb (Petr Tesarik) * tag 'dma-mapping-6.6-2023-09-30' of git://git.infradead.org/users/hch/dma-mapping: swiotlb: fix the check whether a device has used software IO TLB swiotlb: use the calculated number of areas
2023-09-30Merge tag 'iomap-6.6-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds2-2/+11
Pull iomap fixes from Darrick Wong: - Handle a race between writing and shrinking block devices by returning EIO - Fix a typo in a comment * tag 'iomap-6.6-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: iomap: Spelling s/preceeding/preceding/g iomap: add a workaround for racy i_size updates on block devices
2023-09-30Merge tag 'i2c-for-6.6-rc4' of ↵Linus Torvalds3-12/+12
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "Usual business: a driver fix, a DT fix, a minor core fix" * tag 'i2c-for-6.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: npcm7xx: Fix callback completion ordering i2c: mux: Avoid potential false error message in i2c_mux_add_adapter dt-bindings: i2c: mxs: Pass ref and 'unevaluatedProperties: false'
2023-09-30Merge tag 'acpi-6.6-rc4' of ↵Linus Torvalds1-2/+5
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fix from Rafael Wysocki: "Fix a possible NULL pointer dereference in the error path of acpi_video_bus_add() resulting from recent changes (Dinghao Liu)" * tag 'acpi-6.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: video: Fix NULL pointer dereference in acpi_video_bus_add()
2023-09-30Merge tag 'powerpc-6.6-3' of ↵Linus Torvalds3-31/+14
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix arch_stack_walk_reliable(), used by live patching - Fix powerpc selftests to work with run_kselftest.sh Thanks to Joe Lawrence and Petr Mladek. * tag 'powerpc-6.6-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: selftests/powerpc: Fix emit_tests to work with run_kselftest.sh powerpc/stacktrace: Fix arch_stack_walk_reliable()
2023-09-30Merge tag 'nfsd-6.6-2' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fix from Chuck Lever: - Fix NFSv4 READ corner case * tag 'nfsd-6.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: NFSD: Fix zero NFSv4 READ results when RQ_SPLICE_OK is not set
2023-09-30Merge tag '6.6-rc3-smb3-client-fix' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds1-0/+1
Pull smb client fix from Steve French: "Fix for password freeing potential oops (also for stable)" * tag '6.6-rc3-smb3-client-fix' of git://git.samba.org/sfrench/cifs-2.6: fs/smb/client: Reset password pointer to NULL
2023-09-29Crash: add lock to serialize crash hotplug handlingBaoquan He1-0/+17
Eric reported that handling corresponding crash hotplug event can be failed easily when many memory hotplug event are notified in a short period. They failed because failing to take __kexec_lock. ======= [ 78.714569] Fallback order for Node 0: 0 [ 78.714575] Built 1 zonelists, mobility grouping on. Total pages: 1817886 [ 78.717133] Policy zone: Normal [ 78.724423] crash hp: kexec_trylock() failed, elfcorehdr may be inaccurate [ 78.727207] crash hp: kexec_trylock() failed, elfcorehdr may be inaccurate [ 80.056643] PEFILE: Unsigned PE binary ======= The memory hotplug events are notified very quickly and very many, while the handling of crash hotplug is much slower relatively. So the atomic variable __kexec_lock and kexec_trylock() can't guarantee the serialization of crash hotplug handling. Here, add a new mutex lock __crash_hotplug_lock to serialize crash hotplug handling specifically. This doesn't impact the usage of __kexec_lock. Link: https://lkml.kernel.org/r/20230926120905.392903-1-bhe@redhat.com Fixes: 247262756121 ("crash: add generic infrastructure for crash hotplug support") Signed-off-by: Baoquan He <bhe@redhat.com> Tested-by: Eric DeVolder <eric.devolder@oracle.com> Reviewed-by: Eric DeVolder <eric.devolder@oracle.com> Reviewed-by: Valentin Schneider <vschneid@redhat.com> Cc: Sourabh Jain <sourabhjain@linux.ibm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-09-29selftests/mm: fix awk usage in charge_reserved_hugetlb.sh and ↵Juntong Deng2-4/+4
hugetlb_reparenting_test.sh that may cause error According to the awk manual, the -e option does not need to be specified in front of 'program' (unless you need to mix program-file). The redundant -e option can cause error when users use awk tools other than gawk (for example, mawk does not support the -e option). Error Example: awk: not an option: -e Link: https://lkml.kernel.org/r/VI1P193MB075228810591AF2FDD7D42C599C3A@VI1P193MB0752.EURP193.PROD.OUTLOOK.COM Signed-off-by: Juntong Deng <juntong.deng@outlook.com> Cc: Shuah Khan <shuah@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-09-29mm: mempolicy: keep VMA walk if both MPOL_MF_STRICT and MPOL_MF_MOVE are ↵Yang Shi1-20/+19
specified When calling mbind() with MPOL_MF_{MOVE|MOVEALL} | MPOL_MF_STRICT, kernel should attempt to migrate all existing pages, and return -EIO if there is misplaced or unmovable page. Then commit 6f4576e3687b ("mempolicy: apply page table walker on queue_pages_range()") messed up the return value and didn't break VMA scan early ianymore when MPOL_MF_STRICT alone. The return value problem was fixed by commit a7f40cfe3b7a ("mm: mempolicy: make mbind() return -EIO when MPOL_MF_STRICT is specified"), but it broke the VMA walk early if unmovable page is met, it may cause some pages are not migrated as expected. The code should conceptually do: if (MPOL_MF_MOVE|MOVEALL) scan all vmas try to migrate the existing pages return success else if (MPOL_MF_MOVE* | MPOL_MF_STRICT) scan all vmas try to migrate the existing pages return -EIO if unmovable or migration failed else /* MPOL_MF_STRICT alone */ break early if meets unmovable and don't call mbind_range() at all else /* none of those flags */ check the ranges in test_walk, EFAULT without mbind_range() if discontig. Fixed the behavior. Link: https://lkml.kernel.org/r/20230920223242.3425775-1-yang@os.amperecomputing.com Fixes: a7f40cfe3b7a ("mm: mempolicy: make mbind() return -EIO when MPOL_MF_STRICT is specified") Signed-off-by: Yang Shi <yang@os.amperecomputing.com> Cc: Hugh Dickins <hughd@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Oscar Salvador <osalvador@suse.de> Cc: Rafael Aquini <aquini@redhat.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: David Rientjes <rientjes@google.com> Cc: <stable@vger.kernel.org> [4.9+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-09-29mm/damon/vaddr-test: fix memory leak in damon_do_test_apply_three_regions()Jinjie Ruan1-0/+2
When CONFIG_DAMON_VADDR_KUNIT_TEST=y and making CONFIG_DEBUG_KMEMLEAK=y and CONFIG_DEBUG_KMEMLEAK_AUTO_SCAN=y, the below memory leak is detected. Since commit 9f86d624292c ("mm/damon/vaddr-test: remove unnecessary variables"), the damon_destroy_ctx() is removed, but still call damon_new_target() and damon_new_region(), the damon_region which is allocated by kmem_cache_alloc() in damon_new_region() and the damon_target which is allocated by kmalloc in damon_new_target() are not freed. And the damon_region which is allocated in damon_new_region() in damon_set_regions() is also not freed. So use damon_destroy_target to free all the damon_regions and damon_target. unreferenced object 0xffff888107c9a940 (size 64): comm "kunit_try_catch", pid 1069, jiffies 4294670592 (age 732.761s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 06 00 00 00 6b 6b 6b 6b ............kkkk 60 c7 9c 07 81 88 ff ff f8 cb 9c 07 81 88 ff ff `............... backtrace: [<ffffffff817e0167>] kmalloc_trace+0x27/0xa0 [<ffffffff819c11cf>] damon_new_target+0x3f/0x1b0 [<ffffffff819c7d55>] damon_do_test_apply_three_regions.constprop.0+0x95/0x3e0 [<ffffffff819c82be>] damon_test_apply_three_regions1+0x21e/0x260 [<ffffffff829fce6a>] kunit_generic_run_threadfn_adapter+0x4a/0x90 [<ffffffff81237cf6>] kthread+0x2b6/0x380 [<ffffffff81097add>] ret_from_fork+0x2d/0x70 [<ffffffff81003791>] ret_from_fork_asm+0x11/0x20 unreferenced object 0xffff8881079cc740 (size 56): comm "kunit_try_catch", pid 1069, jiffies 4294670592 (age 732.761s) hex dump (first 32 bytes): 05 00 00 00 00 00 00 00 14 00 00 00 00 00 00 00 ................ 6b 6b 6b 6b 6b 6b 6b 6b 00 00 00 00 6b 6b 6b 6b kkkkkkkk....kkkk backtrace: [<ffffffff819bc492>] damon_new_region+0x22/0x1c0 [<ffffffff819c7d91>] damon_do_test_apply_three_regions.constprop.0+0xd1/0x3e0 [<ffffffff819c82be>] damon_test_apply_three_regions1+0x21e/0x260 [<ffffffff829fce6a>] kunit_generic_run_threadfn_adapter+0x4a/0x90 [<ffffffff81237cf6>] kthread+0x2b6/0x380 [<ffffffff81097add>] ret_from_fork+0x2d/0x70 [<ffffffff81003791>] ret_from_fork_asm+0x11/0x20 unreferenced object 0xffff888107c9ac40 (size 64): comm "kunit_try_catch", pid 1071, jiffies 4294670595 (age 732.843s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 06 00 00 00 6b 6b 6b 6b ............kkkk a0 cc 9c 07 81 88 ff ff 78 a1 76 07 81 88 ff ff ........x.v..... backtrace: [<ffffffff817e0167>] kmalloc_trace+0x27/0xa0 [<ffffffff819c11cf>] damon_new_target+0x3f/0x1b0 [<ffffffff819c7d55>] damon_do_test_apply_three_regions.constprop.0+0x95/0x3e0 [<ffffffff819c851e>] damon_test_apply_three_regions2+0x21e/0x260 [<ffffffff829fce6a>] kunit_generic_run_threadfn_adapter+0x4a/0x90 [<ffffffff81237cf6>] kthread+0x2b6/0x380 [<ffffffff81097add>] ret_from_fork+0x2d/0x70 [<ffffffff81003791>] ret_from_fork_asm+0x11/0x20 unreferenced object 0xffff8881079ccc80 (size 56): comm "kunit_try_catch", pid 1071, jiffies 4294670595 (age 732.843s) hex dump (first 32 bytes): 05 00 00 00 00 00 00 00 14 00 00 00 00 00 00 00 ................ 6b 6b 6b 6b 6b 6b 6b 6b 00 00 00 00 6b 6b 6b 6b kkkkkkkk....kkkk backtrace: [<ffffffff819bc492>] damon_new_region+0x22/0x1c0 [<ffffffff819c7d91>] damon_do_test_apply_three_regions.constprop.0+0xd1/0x3e0 [<ffffffff819c851e>] damon_test_apply_three_regions2+0x21e/0x260 [<ffffffff829fce6a>] kunit_generic_run_threadfn_adapter+0x4a/0x90 [<ffffffff81237cf6>] kthread+0x2b6/0x380 [<ffffffff81097add>] ret_from_fork+0x2d/0x70 [<ffffffff81003791>] ret_from_fork_asm+0x11/0x20 unreferenced object 0xffff888107c9af40 (size 64): comm "kunit_try_catch", pid 1073, jiffies 4294670597 (age 733.011s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 06 00 00 00 6b 6b 6b 6b ............kkkk 20 a2 76 07 81 88 ff ff b8 a6 76 07 81 88 ff ff .v.......v..... backtrace: [<ffffffff817e0167>] kmalloc_trace+0x27/0xa0 [<ffffffff819c11cf>] damon_new_target+0x3f/0x1b0 [<ffffffff819c7d55>] damon_do_test_apply_three_regions.constprop.0+0x95/0x3e0 [<ffffffff819c877e>] damon_test_apply_three_regions3+0x21e/0x260 [<ffffffff829fce6a>] kunit_generic_run_threadfn_adapter+0x4a/0x90 [<ffffffff81237cf6>] kthread+0x2b6/0x380 [<ffffffff81097add>] ret_from_fork+0x2d/0x70 [<ffffffff81003791>] ret_from_fork_asm+0x11/0x20 unreferenced object 0xffff88810776a200 (size 56): comm "kunit_try_catch", pid 1073, jiffies 4294670597 (age 733.011s) hex dump (first 32 bytes): 05 00 00 00 00 00 00 00 14 00 00 00 00 00 00 00 ................ 6b 6b 6b 6b 6b 6b 6b 6b 00 00 00 00 6b 6b 6b 6b kkkkkkkk....kkkk backtrace: [<ffffffff819bc492>] damon_new_region+0x22/0x1c0 [<ffffffff819c7d91>] damon_do_test_apply_three_regions.constprop.0+0xd1/0x3e0 [<ffffffff819c877e>] damon_test_apply_three_regions3+0x21e/0x260 [<ffffffff829fce6a>] kunit_generic_run_threadfn_adapter+0x4a/0x90 [<ffffffff81237cf6>] kthread+0x2b6/0x380 [<ffffffff81097add>] ret_from_fork+0x2d/0x70 [<ffffffff81003791>] ret_from_fork_asm+0x11/0x20 unreferenced object 0xffff88810776a740 (size 56): comm "kunit_try_catch", pid 1073, jiffies 4294670597 (age 733.025s) hex dump (first 32 bytes): 3d 00 00 00 00 00 00 00 3f 00 00 00 00 00 00 00 =.......?....... 6b 6b 6b 6b 6b 6b 6b 6b 00 00 00 00 6b 6b 6b 6b kkkkkkkk....kkkk backtrace: [<ffffffff819bc492>] damon_new_region+0x22/0x1c0 [<ffffffff819bfcc2>] damon_set_regions+0x4c2/0x8e0 [<ffffffff819c7dbb>] damon_do_test_apply_three_regions.constprop.0+0xfb/0x3e0 [<ffffffff819c877e>] damon_test_apply_three_regions3+0x21e/0x260 [<ffffffff829fce6a>] kunit_generic_run_threadfn_adapter+0x4a/0x90 [<ffffffff81237cf6>] kthread+0x2b6/0x380 [<ffffffff81097add>] ret_from_fork+0x2d/0x70 [<ffffffff81003791>] ret_from_fork_asm+0x11/0x20 unreferenced object 0xffff888108038240 (size 64): comm "kunit_try_catch", pid 1075, jiffies 4294670600 (age 733.022s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 03 00 00 00 6b 6b 6b 6b ............kkkk 48 ad 76 07 81 88 ff ff 98 ae 76 07 81 88 ff ff H.v.......v..... backtrace: [<ffffffff817e0167>] kmalloc_trace+0x27/0xa0 [<ffffffff819c11cf>] damon_new_target+0x3f/0x1b0 [<ffffffff819c7d55>] damon_do_test_apply_three_regions.constprop.0+0x95/0x3e0 [<ffffffff819c898d>] damon_test_apply_three_regions4+0x1cd/0x210 [<ffffffff829fce6a>] kunit_generic_run_threadfn_adapter+0x4a/0x90 [<ffffffff81237cf6>] kthread+0x2b6/0x380 [<ffffffff81097add>] ret_from_fork+0x2d/0x70 [<ffffffff81003791>] ret_from_fork_asm+0x11/0x20 unreferenced object 0xffff88810776ad28 (size 56): comm "kunit_try_catch", pid 1075, jiffies 4294670600 (age 733.022s) hex dump (first 32 bytes): 05 00 00 00 00 00 00 00 07 00 00 00 00 00 00 00 ................ 6b 6b 6b 6b 6b 6b 6b 6b 00 00 00 00 6b 6b 6b 6b kkkkkkkk....kkkk backtrace: [<ffffffff819bc492>] damon_new_region+0x22/0x1c0 [<ffffffff819bfcc2>] damon_set_regions+0x4c2/0x8e0 [<ffffffff819c7dbb>] damon_do_test_apply_three_regions.constprop.0+0xfb/0x3e0 [<ffffffff819c898d>] damon_test_apply_three_regions4+0x1cd/0x210 [<ffffffff829fce6a>] kunit_generic_run_threadfn_adapter+0x4a/0x90 [<ffffffff81237cf6>] kthread+0x2b6/0x380 [<ffffffff81097add>] ret_from_fork+0x2d/0x70 [<ffffffff81003791>] ret_from_fork_asm+0x11/0x20 Link: https://lkml.kernel.org/r/20230925072100.3725620-1-ruanjinjie@huawei.com Fixes: 9f86d624292c ("mm/damon/vaddr-test: remove unnecessary variables") Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Reviewed-by: SeongJae Park <sj@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-09-29mm, memcg: reconsider kmem.limit_in_bytes deprecationMichal Hocko2-0/+20
This reverts commits 86327e8eb94c ("memcg: drop kmem.limit_in_bytes") and partially reverts 58056f77502f ("memcg, kmem: further deprecate kmem.limit_in_bytes") which have incrementally removed support for the kernel memory accounting hard limit. Unfortunately it has turned out that there is still userspace depending on the existence of memory.kmem.limit_in_bytes [1]. The underlying functionality is not really required but the non-existent file just confuses the userspace which fails in the result. The patch to fix this on the userspace side has been submitted but it is hard to predict how it will propagate through the maze of 3rd party consumers of the software. Now, reverting alone 86327e8eb94c is not an option because there is another set of userspace which cannot cope with ENOTSUPP returned when writing to the file. Therefore we have to go and revisit 58056f77502f as well. There are two ways to go ahead. Either we give up on the deprecation and fully revert 58056f77502f as well or we can keep kmem.limit_in_bytes but make the write a noop and warn about the fact. This should work for both known breaking workloads which depend on the existence but do not depend on the hard limit enforcement. Note to backporters to stable trees. a8c49af3be5f ("memcg: add per-memcg total kernel memory stat") introduced in 4.18 has added memcg_account_kmem so the accounting is not done by obj_cgroup_charge_pages directly for v1 anymore. Prior kernels need to add it explicitly (thanks to Johannes for pointing this out). [akpm@linux-foundation.org: fix build - remove unused local] Link: http://lkml.kernel.org/r/20230920081101.GA12096@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net [1] Link: https://lkml.kernel.org/r/ZRE5VJozPZt9bRPy@dhcp22.suse.cz Fixes: 86327e8eb94c ("memcg: drop kmem.limit_in_bytes") Fixes: 58056f77502f ("memcg, kmem: further deprecate kmem.limit_in_bytes") Signed-off-by: Michal Hocko <mhocko@suse.com> Acked-by: Shakeel Butt <shakeelb@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jeremi Piotrowski <jpiotrowski@linux.microsoft.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Tejun heo <tj@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-09-29mm: zswap: fix potential memory corruption on duplicate storeDomenico Cerasuolo1-0/+20
While stress-testing zswap a memory corruption was happening when writing back pages. __frontswap_store used to check for duplicate entries before attempting to store a page in zswap, this was because if the store fails the old entry isn't removed from the tree. This change removes duplicate entries in zswap_store before the actual attempt. [cerasuolodomenico@gmail.com: add a warning and a comment, per Johannes] Link: https://lkml.kernel.org/r/20230925130002.1929369-1-cerasuolodomenico@gmail.com Link: https://lkml.kernel.org/r/20230922172211.1704917-1-cerasuolodomenico@gmail.com Fixes: 42c06a0e8ebe ("mm: kill frontswap") Signed-off-by: Domenico Cerasuolo <cerasuolodomenico@gmail.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Nhat Pham <nphamcs@gmail.com> Cc: Dan Streetman <ddstreet@ieee.org> Cc: Domenico Cerasuolo <cerasuolodomenico@gmail.com> Cc: Seth Jennings <sjenning@redhat.com> Cc: Vitaly Wool <vitaly.wool@konsulko.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-09-29arm64: hugetlb: fix set_huge_pte_at() to work with all swap entriesRyan Roberts1-14/+3
When called with a swap entry that does not embed a PFN (e.g. PTE_MARKER_POISONED or PTE_MARKER_UFFD_WP), the previous implementation of set_huge_pte_at() would either cause a BUG() to fire (if CONFIG_DEBUG_VM is enabled) or cause a dereference of an invalid address and subsequent panic. arm64's huge pte implementation supports multiple huge page sizes, some of which are implemented in the page table with multiple contiguous entries. So set_huge_pte_at() needs to work out how big the logical pte is, so that it can also work out how many physical ptes (or pmds) need to be written. It previously did this by grabbing the folio out of the pte and querying its size. However, there are cases when the pte being set is actually a swap entry. But this also used to work fine, because for huge ptes, we only ever saw migration entries and hwpoison entries. And both of these types of swap entries have a PFN embedded, so the code would grab that and everything still worked out. But over time, more calls to set_huge_pte_at() have been added that set swap entry types that do not embed a PFN. And this causes the code to go bang. The triggering case is for the uffd poison test, commit 99aa77215ad0 ("selftests/mm: add uffd unit test for UFFDIO_POISON"), which causes a PTE_MARKER_POISONED swap entry to be set, coutesey of commit 8a13897fb0da ("mm: userfaultfd: support UFFDIO_POISON for hugetlbfs") - added in v6.5-rc7. Although review shows that there are other call sites that set PTE_MARKER_UFFD_WP (which also has no PFN), these don't trigger on arm64 because arm64 doesn't support UFFD WP. Arguably, the root cause is really due to commit 18f3962953e4 ("mm: hugetlb: kill set_huge_swap_pte_at()"), which aimed to simplify the interface to the core code by removing set_huge_swap_pte_at() (which took a page size parameter) and replacing it with calls to set_huge_pte_at() where the size was inferred from the folio, as descibed above. While that commit didn't break anything at the time, it did break the interface because it couldn't handle swap entries without PFNs. And since then new callers have come along which rely on this working. But given the brokeness is only observable after commit 8a13897fb0da ("mm: userfaultfd: support UFFDIO_POISON for hugetlbfs"), that one gets the Fixes tag. Now that we have modified the set_huge_pte_at() interface to pass the huge page size in the previous patch, we can trivially fix this issue. Link: https://lkml.kernel.org/r/20230922115804.2043771-3-ryan.roberts@arm.com Fixes: 8a13897fb0da ("mm: userfaultfd: support UFFDIO_POISON for hugetlbfs") Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Reviewed-by: Axel Rasmussen <axelrasmussen@google.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Christoph Hellwig <hch@infradead.org> Cc: David S. Miller <davem@davemloft.net> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Lorenzo Stoakes <lstoakes@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Xu <peterx@redhat.com> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: SeongJae Park <sj@kernel.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Uladzislau Rezki (Sony) <urezki@gmail.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Cc: <stable@vger.kernel.org> [6.5+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>