Age | Commit message | Author | Files | Lines
2024-06-12 | gfs2: Add some missing quota locking | Andreas Gruenbacher | 1 | -29/+53
The quota code is missing some locking between local quota changes and syncing those quota changes to the global quota file (gfs2_quota_sync); in particular, qd->qd_change needs to be kept in sync with the QDF_CHANGE change flag and the number of references held. Use the qd->qd_lockref.lock spinlock for that. With the qd->qd_lockref.lock spinlock held, we can no longer call lockref_get(), so turn qd_hold() into a variant that assumes that the lock is held. This function is really supposed to take an additional reference when one or more references are already held, so check for that instead of checking if the lockref is dead. Signed-off-by: Andreas Gruenbacher <[email protected]>
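The "hold with the lock already held" idea described above can be sketched in user-space C. This is only an illustration under stated assumptions: `struct demo_ref`, its `locked` flag, and `demo_hold_locked()` are hypothetical names, not the gfs2 or lockref API, and the spinlock is modelled by a plain boolean.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a lockref: a count guarded by an external lock. */
struct demo_ref {
	bool locked;	/* models "the caller holds the spinlock" */
	int count;	/* number of references currently held */
};

/*
 * Take an additional reference. The caller must already hold the lock,
 * and at least one reference must already exist; otherwise refuse.
 */
static bool demo_hold_locked(struct demo_ref *r)
{
	assert(r->locked);	/* this variant never takes the lock itself */
	if (r->count <= 0)
		return false;	/* nothing is held, so there is nothing to add to */
	r->count++;
	return true;
}
```

The check is on "are references already held" rather than on a dead marker, which is the condition the commit message says the function is really about.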
2024-06-08 | gfs2: Fold qd_fish into gfs2_quota_sync | Andreas Gruenbacher | 1 | -48/+29
The split between qd_fish() and gfs2_quota_sync() is rather unfortunate as qd_fish() is repeatedly called to scan sdp->sd_quota_list only to find the next object that needs syncing; if there are multiple objects on the list that need syncing, it makes more sense to grab them all in one go. This is relatively easy to do when qd_fish() is folded into gfs2_quota_sync(). Signed-off-by: Andreas Gruenbacher <[email protected]>
2024-06-08 | gfs2: quota need_sync cleanup | Andreas Gruenbacher | 1 | -14/+12
Rename variable 'value' to 'change' as it stores a change in value. Add new 'value' and 'limit' variables for the current value and limit. Only fetch the tuning parameters when we need them. Get rid of unnecessary nesting. No change in functionality. Signed-off-by: Andreas Gruenbacher <[email protected]>
2024-06-08 | gfs2: Fix and clean up function do_qc | Andreas Gruenbacher | 1 | -13/+21
Function do_qc() is supposed to be conceptually simple: it alters the current in-memory and on-disk quota change values for a given uid/gid by a given delta. If the on-disk record isn't defined yet, a new record is created. If the on-disk record exists and the resulting change value is zero, there no longer is a need for that record and so the record is deleted. On top of that, some reference counting is involved when creating and deleting records. Currently, instead of doing the above, do_qc() alters the on-disk value and then it sets the in-memory value to the on-disk value. This is incorrect when the on-disk value differs from the in-memory value. The two values are allowed to differ when quota changes are synced to the global quota file. Fix by changing both values by the same amount. In addition, do_qc() currently gets confused when the delta value is 0. It isn't supposed to be called that way, but that assumption isn't mentioned and it makes the code harder to read. Make the code more explicit. Signed-off-by: Andreas Gruenbacher <[email protected]>
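The intended behaviour described above (change both values by the same delta, create the record on first use, delete it when the on-disk change reaches zero) can be modelled in a few lines of user-space C. This is a sketch, not the kernel code: `struct demo_slot`, `struct demo_qd`, and `demo_apply_change()` are hypothetical names, and the "on-disk" record is just a struct in memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one slot in the per-node quota changes file. */
struct demo_slot {
	bool in_use;		/* does an on-disk record exist? */
	int64_t disk_change;	/* on-disk change value */
};

struct demo_qd {
	int64_t mem_change;	/* in-memory change value */
	int refs;		/* reference held while the slot is in use */
	struct demo_slot slot;
};

static void demo_apply_change(struct demo_qd *qd, int64_t delta)
{
	assert(delta != 0);	/* callers must not pass a zero delta */

	if (!qd->slot.in_use) {
		/* No record yet: create one and take a reference. */
		qd->slot.in_use = true;
		qd->slot.disk_change = 0;
		qd->refs++;
	}

	/* Change both values by the same amount; never copy one to the other. */
	qd->slot.disk_change += delta;
	qd->mem_change += delta;

	if (qd->slot.disk_change == 0) {
		/* No residual change remains: delete the record, drop the reference. */
		qd->slot.in_use = false;
		qd->refs--;
	}
}
```

Starting with an in-memory value that differs from the on-disk value shows why assigning the on-disk value to the in-memory one would lose information, while applying the same delta to both does not.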
2024-06-08 | gfs2: Revert "Add quota_change type" | Andreas Gruenbacher | 2 | -15/+10
Commit 432928c93779 ("gfs2: Add quota_change type") makes the incorrect assertion that function do_qc() should behave differently in the two contexts it is used in, but that isn't actually true. In all cases, do_qc() grabs a "reference" when it starts using a slot in the per-node quota changes file, and it releases that "reference" when no more residual changes remain. Revert that broken commit. There are some remaining issues with function do_qc() which are addressed in the next commit. This reverts commit 432928c9377959684c748a9bc6553ed2d3c2ea4f. Signed-off-by: Andreas Gruenbacher <[email protected]>
2024-06-08 | gfs2: Revert "ignore negated quota changes" | Andreas Gruenbacher | 1 | -11/+0
Commit 4c6a08125f22 ("gfs2: ignore negated quota changes") skips quota changes with qd_change == 0 instead of writing them back, which leaves behind non-zero qd_change values in the affected slots. The kernel then assumes that those slots are unused, while the qd_change values on disk indicate that they are indeed still in use. The next time the filesystem is mounted, those invalid slots are read in from disk, which will cause inconsistencies. Revert that commit to avoid filesystem corruption. This reverts commit 4c6a08125f2249531ec01783a5f4317d7342add5. Signed-off-by: Andreas Gruenbacher <[email protected]>
2024-06-08 | gfs2: qd_check_sync cleanups | Andreas Gruenbacher | 1 | -18/+22
Rename qd_check_sync() to qd_grab_sync() and make it return a bool. Turn the sync_gen pointer into a regular u64 and pass in U64_MAX instead of a NULL pointer when sync generation checking isn't needed. Introduce a new qd_ungrab_sync() helper for undoing the effects of qd_grab_sync() if the subsequent bh_get() on the qd object fails. Signed-off-by: Andreas Gruenbacher <[email protected]>
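Replacing an optional pointer with a sentinel value can be illustrated as follows. This is a sketch under assumptions: `demo_should_grab()` is a hypothetical helper, and the direction of the comparison (skip objects whose generation is already at or past the requested one) is assumed for illustration rather than taken from the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether an object should be picked up for syncing.
 *
 * Assumption: objects whose generation is already at or past sync_gen are
 * skipped. Passing UINT64_MAX (the user-space counterpart of U64_MAX) makes
 * the check pass for every realistic generation, so no NULL test is needed.
 */
static bool demo_should_grab(uint64_t obj_gen, uint64_t sync_gen)
{
	return obj_gen < sync_gen;
}
```

With a plain integer parameter the "no checking" case needs no separate branch, which is one reason a sentinel can read better than a nullable pointer.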
2024-06-08 | gfs2: Revert "introduce qd_bh_get_or_undo" | Andreas Gruenbacher | 1 | -19/+17
The qd_bh_get_or_undo() helper introduced by that commit doesn't improve the code much, so revert it and clean things up in a more useful way in the next commit. This reverts commit 7dbc6ae60dd7089d8ed42892b6a66c138f0aa7a0. Signed-off-by: Andreas Gruenbacher <[email protected]>
2024-06-07 | gfs2: Check quota consistency on mount | Andreas Gruenbacher | 1 | -6/+31
In gfs2_quota_init(), make sure that the per-node "quota_change%u" file doesn't contain duplicate uids/gids. Those duplicates would cause us to acquire the glock corresponding to those ids repeatedly, which the glock code doesn't allow. When finding inconsistencies, we wipe them out and ignore them. The resulting quotas will likely be inconsistent, and running quotacheck(1) is advised. Signed-off-by: Andreas Gruenbacher <[email protected]>
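A duplicate check of this kind can be sketched in user-space C. Everything here is hypothetical: `struct demo_rec` and `demo_wipe_duplicates()` are illustrative names, a zero `change` is assumed to mark an unused record, and keeping the first occurrence while wiping later ones is an assumption (the commit message does not say which copy survives).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory view of one quota change record. */
struct demo_rec {
	uint32_t id;	/* uid or gid */
	bool is_group;	/* false: uid, true: gid */
	int64_t change;	/* 0 is assumed to mean "record unused" */
};

/*
 * Wipe every record that repeats an earlier (id, is_group) pair.
 * Returns the number of records wiped. Quadratic, which is acceptable
 * for a short illustration.
 */
static size_t demo_wipe_duplicates(struct demo_rec *recs, size_t n)
{
	size_t wiped = 0;

	for (size_t i = 0; i < n; i++) {
		if (recs[i].change == 0)
			continue;	/* unused record */
		for (size_t j = 0; j < i; j++) {
			if (recs[j].change != 0 &&
			    recs[j].id == recs[i].id &&
			    recs[j].is_group == recs[i].is_group) {
				recs[i].change = 0;	/* wipe and ignore */
				wiped++;
				break;
			}
		}
	}
	return wiped;
}
```

Note that a uid and a gid with the same numeric value are distinct entries, so the comparison includes the type as well as the id.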
2024-06-04 | gfs2: Minor gfs2_quota_init error path cleanup | Andreas Gruenbacher | 1 | -9/+7
Add a fail_brelse label and use it where useful. Move variable bh out of the loop to extend its visibility to the new label. No functional change. Signed-off-by: Andreas Gruenbacher <[email protected]>
2024-05-29 | gfs2: Get rid of demote_ok checks | Andreas Gruenbacher | 4 | -44/+1
The demote_ok glock operation is only still used to prevent the inode glocks of the "jindex" and "rindex" directories from getting recycled while they are still referenced by sdp->sd_jindex and sdp->sd_rindex. However, the LRU walking code will no longer recycle glocks which are referenced, so the demote_ok glock operation is obsolete and can be removed. Each of a glock's holders in the gl_holders list is holding a reference on the glock, so when the list of holders isn't empty in demote_ok(), the existing reference count check will already prevent the glock from getting released. This means that demote_ok() is obsolete as well. Signed-off-by: Andreas Gruenbacher <[email protected]>
2024-05-29 | Revert "GFS2: Don't add all glocks to the lru" | Andreas Gruenbacher | 3 | -12/+4
This reverts commit e7ccaf5fe1590667b3fa2f8df5c5ec9ba0dc5b85. Before commit e7ccaf5fe159, every time a resource group glock was dequeued by gfs2_glock_dq(), it was added to the glock LRU list even though the glock was still referenced by the resource group and could never be evicted, anyway. Commit e7ccaf5fe159 added a GLOF_LRU hack to avoid that overhead for resource group glocks, and that hack was since adopted for some other types of glocks as well. We now no longer add glocks to the glock LRU list while they are still referenced. This solves the underlying problem, and obsoletes the GLOF_LRU hack. Signed-off-by: Andreas Gruenbacher <[email protected]> (cherry picked from commit 3e5257c810cba91e274d07f3db5cf013c7c830be)
2024-05-29 | gfs2: Revise glock reference counting model | Andreas Gruenbacher | 3 | -28/+30
In the current glock reference counting model, a bias of one is added to a glock's refcount when it is locked (gl->gl_state != LM_ST_UNLOCKED). A glock is removed from the lru_list when it is enqueued, and added back when it is dequeued. This isn't a very appropriate model because most glocks are held for long periods of time (for example, the inode "owns" references to its inode and iopen glocks as long as the inode is cached even when the glock state changes to LM_ST_UNLOCKED), and they can only be freed when they are no longer referenced, anyway. Fix this by getting rid of the refcount bias for locked glocks. That way, we can use lockref_put_or_lock() to efficiently drop all but the last glock reference, and put the glock onto the lru_list when the last reference is dropped. When find_insert_glock() returns a reference to a cached glock, it removes the glock from the lru_list. Dumping the "glocks" and "glstats" debugfs files also takes glock references, but instead of removing the glocks from the lru_list in that case as well, we leave them on the list. This ensures that dumping those files won't perturb the order of the glocks on the lru_list. In addition, when the last reference to an *unlocked* glock is dropped, we immediately free it; this preserves the preexisting behavior. If it later turns out that caching unlocked glocks is useful in some situations, we can change the caching strategy. It is currently unclear if a glock that has no active references can have the GLF_LFLUSH flag set. To make sure that such a glock won't accidentally be evicted due to memory pressure, we add a GLF_LFLUSH check to gfs2_dispose_glock_lru(). Signed-off-by: Andreas Gruenbacher <[email protected]>
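The revised model can be summarised with a small single-threaded simulation. This is a sketch only: `struct demo_glock` and the `demo_*` functions are hypothetical, `demo_put_or_lock()` merely mimics the documented contract of lockref_put_or_lock() (decrement unless the count is 1 or less, otherwise leave the count alone and return with the lock held), and no real locking is performed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a glock for illustration only. */
struct demo_glock {
	int refs;		/* reference count, with no bias for locked glocks */
	bool dlm_locked;	/* models gl_state != LM_ST_UNLOCKED */
	bool on_lru;		/* is the glock on the LRU list? */
	bool freed;		/* has the glock been freed? */
};

/* Fast path: drop a reference unless it is the last one. */
static bool demo_put_or_lock(struct demo_glock *gl)
{
	if (gl->refs > 1) {
		gl->refs--;
		return true;
	}
	return false;	/* the real helper would return with the spinlock held */
}

static void demo_glock_put(struct demo_glock *gl)
{
	assert(gl->refs > 0);
	if (demo_put_or_lock(gl))
		return;
	/* Slow path: this is the last reference. */
	gl->refs--;
	if (gl->dlm_locked)
		gl->on_lru = true;	/* cache the glock on the LRU list */
	else
		gl->freed = true;	/* unlocked glocks are freed immediately */
}

/* Lookup of a cached glock: take it off the LRU list and reference it. */
static void demo_glock_find(struct demo_glock *gl)
{
	gl->on_lru = false;
	gl->refs++;
}
```

In this model a glock is on the LRU list only while it has no references, so LRU walking never has to consider referenced glocks.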
2024-05-29 | gfs2: Switch to a per-filesystem glock workqueue | Andreas Gruenbacher | 3 | -15/+18
Switch to a per-filesystem glock workqueue. Additional workqueues are cheap nowadays, and keeping separate workqueues allows flushing the work of each filesystem without affecting the others. Signed-off-by: Andreas Gruenbacher <[email protected]>
2024-05-29 | gfs2: Report when glocks cannot be freed for a long time | Andreas Gruenbacher | 1 | -3/+15
When glocks cannot be freed for a long time, avoid the "task blocked for more than N seconds" messages and report how many glocks are still outstanding, instead. Signed-off-by: Andreas Gruenbacher <[email protected]>
2024-05-29 | gfs2: gfs2_glock_get cleanup | Andreas Gruenbacher | 1 | -20/+13
Clean up the messy code in gfs2_glock_get(). No change in functionality. Signed-off-by: Andreas Gruenbacher <[email protected]>
2024-05-29 | gfs2: Invert the GLF_INITIAL flag | Andreas Gruenbacher | 3 | -10/+22
Invert the meaning of the GLF_INITIAL flag: right now, when GLF_INITIAL is set, a DLM lock exists and we have a valid identifier for it; when GLF_INITIAL is cleared, no DLM lock exists (yet). This is confusing. In addition, it makes more sense to highlight the exceptional case (i.e., no DLM lock exists yet) in glock dumps and trace points than to highlight the common case. To avoid confusion between the "old" and the "new" meaning of the flag, use 'a' instead of 'I' to represent the flag. For improved code consistency, check if the GLF_INITIAL flag is cleared to determine whether a DLM lock exists instead of checking if the lock identifier is non-zero. Document what the flag is used for. Signed-off-by: Andreas Gruenbacher <[email protected]>
2024-05-29 | gfs2: Remove outdated comment in glock_work_func | Andreas Gruenbacher | 1 | -5/+1
Signed-off-by: Andreas Gruenbacher <[email protected]>
2024-05-29 | gfs2: Update glocks documentation | Andreas Gruenbacher | 1 | -26/+26
Rearrange the table of locking modes and associated caching capability to be in order of increasing caching capability. Update the description of the glock operations. Signed-off-by: Andreas Gruenbacher <[email protected]>
2024-05-28 | gfs2: Rename handle_callback to request_demote | Andreas Gruenbacher | 1 | -10/+10
Function handle_callback() is used to request a glock demote. This often happens in response to a conflicting remote locking request and subsequent bast callback from DLM, but there are other reasons for triggering a demote request as well, such as when trying to release a glock in response to memory pressure. To clarify that, rename the function to request_demote(). Signed-off-by: Andreas Gruenbacher <[email protected]>
2024-05-28 | gfs2: Rename GLF_FROZEN to GLF_HAVE_FROZEN_REPLY | Andreas Gruenbacher | 3 | -6/+6
The GLF_FROZEN flag indicates that a reply to a DLM locking request has been received, but should not be processed at this time. To clarify that meaning, rename the flag to GLF_HAVE_FROZEN_REPLY. Signed-off-by: Andreas Gruenbacher <[email protected]>
2024-05-28 | gfs2: Rename GLF_REPLY_PENDING to GLF_HAVE_REPLY | Andreas Gruenbacher | 3 | -9/+9
The GLF_REPLY_PENDING flag indicates to glock_work_func() that in response to a locking request, DLM has sent a reply that needs to be processed. A flag with that name could as well indicate that we are waiting on a reply from DLM, however. To disambiguate these two cases, rename the flag to GLF_HAVE_REPLY. Signed-off-by: Andreas Gruenbacher <[email protected]>
2024-05-28 | gfs2: Rename GLF_FREEING to GLF_UNLOCKED | Andreas Gruenbacher | 5 | -16/+16
Rename the GLF_FREEING flag to GLF_UNLOCKED, and the ->go_free glock operation to ->go_unlocked. This mechanism is used to wait for the underlying DLM lock to be unlocked; being able to free the glock is a consequence of the DLM lock being unlocked. Signed-off-by: Andreas Gruenbacher <[email protected]>
2024-05-28 | gfs2: Remove useless return statement in run_queue | Andreas Gruenbacher | 1 | -1/+0
The return statement at the end of run_queue() is useless. Signed-off-by: Andreas Gruenbacher <[email protected]>
2024-05-28 | gfs2: Remove unnecessary function prototype | Andreas Gruenbacher | 1 | -1/+0
Function __gfs2_glock_dq() gets defined before it is used, so there is no need for a separate function declaration. Signed-off-by: Andreas Gruenbacher <[email protected]>
2024-05-26 | Linux 6.10-rc1 | Linus Torvalds | 1 | -3/+3
2024-05-26 | mm: percpu: Include smp.h in alloc_tag.h | Kent Overstreet | 1 | -0/+1
percpu.h depends on smp.h, but doesn't include it directly because of circular header dependency issues; percpu.h is needed in a bunch of low level headers.

This fixes a randconfig build error on mips:

  include/linux/alloc_tag.h: In function '__alloc_tag_ref_set':
  include/asm-generic/percpu.h:31:40: error: implicit declaration of function 'raw_smp_processor_id' [-Werror=implicit-function-declaration]

Reported-by: kernel test robot <[email protected]>
Fixes: 24e44cc22aa3 ("mm: percpu: enable per-cpu allocation tagging")
Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
Signed-off-by: Kent Overstreet <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
2024-05-26 | Merge tag 'perf-tools-fixes-for-v6.10-1-2024-05-26' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools | Linus Torvalds | 4 | -103/+68
Pull perf tool fix from Arnaldo Carvalho de Melo:

 "Revert a patch causing a regression.

  This made a simple 'perf record -e cycles:pp make -j199' stop working on the Ampere ARM64 system Linus uses to test ARM64 kernels".

* tag 'perf-tools-fixes-for-v6.10-1-2024-05-26' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
  Revert "perf parse-events: Prefer sysfs/JSON hardware events over legacy"
2024-05-26 | Revert "perf parse-events: Prefer sysfs/JSON hardware events over legacy" | Arnaldo Carvalho de Melo | 4 | -103/+68
This reverts commit 617824a7f0f73e4de325cf8add58e55b28c12493. This made a simple 'perf record -e cycles:pp make -j199' stop working on the Ampere ARM64 system Linus uses to test ARM64 kernels, as discussed at length in the threads in the Link tags below. The fix provided by Ian wasn't acceptable and work to fix this will take time we don't have at this point, so let's revert this and work on it in the next devel cycle. Reported-by: Linus Torvalds <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Bhaskar Chowdhury <[email protected]> Cc: Ethan Adams <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Tycho Andersen <[email protected]> Cc: Yang Jihong <[email protected]> Link: https://lore.kernel.org/lkml/CAHk-=wi5Ri=yR2jBVk-4HzTzpoAWOgstr1LEvg_-OXtJvXXJOA@mail.gmail.com Link: https://lore.kernel.org/lkml/CAHk-=wiWvtFyedDNpoV7a8Fq_FpbB+F5KmWK2xPY3QoYseOf_A@mail.gmail.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2024-05-25 | Merge tag '6.10-rc-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6 | Linus Torvalds | 6 | -6/+34
Pull smb client fixes from Steve French:

 - two important netfs integration fixes - including for a data corruption and also fixes for multiple xfstests
 - reenable swap support over SMB3

* tag '6.10-rc-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: Fix missing set of remote_i_size
  cifs: Fix smb3_insert_range() to move the zero_point
  cifs: update internal version number
  smb3: reenable swapfiles over SMB3 mounts
2024-05-25 | Merge tag 'mm-hotfixes-stable-2024-05-25-09-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm | Linus Torvalds | 13 | -69/+187
Pull misc fixes from Andrew Morton:

 "16 hotfixes, 11 of which are cc:stable.

  A few nilfs2 fixes, the remainder are for MM: a couple of selftests fixes, various singletons fixing various issues in various parts"

* tag 'mm-hotfixes-stable-2024-05-25-09-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mm/ksm: fix possible UAF of stable_node
  mm/memory-failure: fix handling of dissolved but not taken off from buddy pages
  mm: /proc/pid/smaps_rollup: avoid skipping vma after getting mmap_lock again
  nilfs2: fix potential hang in nilfs_detach_log_writer()
  nilfs2: fix unexpected freezing of nilfs_segctor_sync()
  nilfs2: fix use-after-free of timer for log writer thread
  selftests/mm: fix build warnings on ppc64
  arm64: patching: fix handling of execmem addresses
  selftests/mm: compaction_test: fix bogus test success and reduce probability of OOM-killer invocation
  selftests/mm: compaction_test: fix incorrect write of zero to nr_hugepages
  selftests/mm: compaction_test: fix bogus test success on Aarch64
  mailmap: update email address for Satya Priya
  mm/huge_memory: don't unpoison huge_zero_folio
  kasan, fortify: properly rename memintrinsics
  lib: add version into /proc/allocinfo output
  mm/vmalloc: fix vmalloc which may return null if called with __GFP_NOFAIL
2024-05-25 | Merge tag 'irq-urgent-2024-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip | Linus Torvalds | 3 | -12/+18
Pull irq fixes from Ingo Molnar:

 - Fix x86 IRQ vector leak caused by a CPU offlining race
 - Fix build failure in the riscv-imsic irqchip driver caused by an API-change semantic conflict
 - Fix use-after-free in irq_find_at_or_after()

* tag 'irq-urgent-2024-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq/irqdesc: Prevent use-after-free in irq_find_at_or_after()
  genirq/cpuhotplug, x86/vector: Prevent vector leak during CPU offline
  irqchip/riscv-imsic: Fixup riscv_ipi_set_virq_range() conflict
2024-05-25 | Merge tag 'x86-urgent-2024-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip | Linus Torvalds | 6 | -18/+67
Pull x86 fixes from Ingo Molnar:

 - Fix regressions of the new x86 CPU VFM (vendor/family/model) enumeration/matching code
 - Fix crash kernel detection on buggy firmware with non-compliant ACPI MADT tables
 - Address Kconfig warning

* tag 'x86-urgent-2024-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/cpu: Fix x86_match_cpu() to match just X86_VENDOR_INTEL
  crypto: x86/aes-xts - switch to new Intel CPU model defines
  x86/topology: Handle bogus ACPI tables correctly
  x86/kconfig: Select ARCH_WANT_FRAME_POINTERS again when UNWINDER_FRAME_POINTER=y
2024-05-25 | Merge tag 'for-linus-6.10-1' of https://github.com/cminyard/linux-ipmi | Linus Torvalds | 10 | -45/+35
Pull ipmi updates from Corey Minyard:

 "Mostly updates for deprecated interfaces, platform.remove and converting from a tasklet to a BH workqueue.

  Also use HAS_IOPORT for disabling inb()/outb()"

* tag 'for-linus-6.10-1' of https://github.com/cminyard/linux-ipmi:
  ipmi: kcs_bmc_npcm7xx: Convert to platform remove callback returning void
  ipmi: kcs_bmc_aspeed: Convert to platform remove callback returning void
  ipmi: ipmi_ssif: Convert to platform remove callback returning void
  ipmi: ipmi_si_platform: Convert to platform remove callback returning void
  ipmi: ipmi_powernv: Convert to platform remove callback returning void
  ipmi: bt-bmc: Convert to platform remove callback returning void
  char: ipmi: handle HAS_IOPORT dependencies
  ipmi: Convert from tasklet to BH workqueue
2024-05-25 | Merge tag 'ceph-for-6.10-rc1' of https://github.com/ceph/ceph-client | Linus Torvalds | 6 | -19/+434
Pull ceph updates from Ilya Dryomov:

 "A series from Xiubo that adds support for additional access checks based on MDS auth caps which were recently made available to clients.

  This is needed to prevent scenarios where the MDS quietly discards updates that a UID-restricted client previously (wrongfully) acked to the user.

  Other than that, just a documentation fixup"

* tag 'ceph-for-6.10-rc1' of https://github.com/ceph/ceph-client:
  doc: ceph: update userspace command to get CephFS metadata
  ceph: add CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK feature bit
  ceph: check the cephx mds auth access for async dirop
  ceph: check the cephx mds auth access for open
  ceph: check the cephx mds auth access for setattr
  ceph: add ceph_mds_check_access() helper
  ceph: save cap_auths in MDS client when session is opened
2024-05-25 | Merge tag 'ntfs3_for_6.10' of https://github.com/Paragon-Software-Group/linux-ntfs3 | Linus Torvalds | 13 | -154/+98
Pull ntfs3 updates from Konstantin Komarov:

 "Fixes:
   - reusing of the file index (could cause the file to be trimmed)
   - infinite dir enumeration
   - taking DOS names into account during link counting
   - le32_to_cpu conversion, 32 bit overflow, NULL check
   - some code was refactored

  Changes:
   - removed max link count info display during driver init

  Remove:
   - atomic_open has been removed for lack of use"

* tag 'ntfs3_for_6.10' of https://github.com/Paragon-Software-Group/linux-ntfs3:
  fs/ntfs3: Break dir enumeration if directory contents error
  fs/ntfs3: Fix case when index is reused during tree transformation
  fs/ntfs3: Mark volume as dirty if xattr is broken
  fs/ntfs3: Always make file nonresident on fallocate call
  fs/ntfs3: Redesign ntfs_create_inode to return error code instead of inode
  fs/ntfs3: Use variable length array instead of fixed size
  fs/ntfs3: Use 64 bit variable to avoid 32 bit overflow
  fs/ntfs3: Check 'folio' pointer for NULL
  fs/ntfs3: Missed le32_to_cpu conversion
  fs/ntfs3: Remove max link count info display during driver init
  fs/ntfs3: Taking DOS names into account during link counting
  fs/ntfs3: remove atomic_open
  fs/ntfs3: use kcalloc() instead of kzalloc()
2024-05-25 | Merge tag '6.10-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd | Linus Torvalds | 2 | -9/+18
Pull smb server fixes from Steve French:

 "Two ksmbd server fixes, both for stable"

* tag '6.10-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
  ksmbd: ignore trailing slashes in share paths
  ksmbd: avoid to send duplicate oplock break notifications
2024-05-25 | Merge tag 'rtc-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux | Linus Torvalds | 29 | -250/+702
Pull RTC updates from Alexandre Belloni:

 "There is one new driver and then most of the changes are the device tree bindings conversions to yaml.

  New driver:
   - Epson RX8111

  Drivers:
   - Many Device Tree bindings conversions to dtschema
   - pcf8563: wakeup-source support"

* tag 'rtc-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
  pcf8563: add wakeup-source support
  rtc: rx8111: handle VLOW flag
  rtc: rx8111: demote warnings to debug level
  rtc: rx6110: Constify struct regmap_config
  dt-bindings: rtc: convert trivial devices into dtschema
  dt-bindings: rtc: stmp3xxx-rtc: convert to dtschema
  dt-bindings: rtc: pxa-rtc: convert to dtschema
  rtc: Add driver for Epson RX8111
  dt-bindings: rtc: Add Epson RX8111
  rtc: mcp795: drop unneeded MODULE_ALIAS
  rtc: nuvoton: Modify part number value
  rtc: test: Split rtc unit test into slow and normal speed test
  dt-bindings: rtc: nxp,lpc1788-rtc: convert to dtschema
  dt-bindings: rtc: digicolor-rtc: move to trivial-rtc
  dt-bindings: rtc: alphascale,asm9260-rtc: convert to dtschema
  dt-bindings: rtc: armada-380-rtc: convert to dtschema
  rtc: cros-ec: provide ID table for avoiding fallback match
2024-05-25 | Merge tag 'i3c/for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux | Linus Torvalds | 5 | -17/+80
Pull i3c updates from Alexandre Belloni:

 "Runtime PM (power management) is improved and hot-join support has been added to the dw controller driver.

  Core:
   - Allow device driver to trigger controller runtime PM

  Drivers:
   - dw: hot-join support
   - svc: better IBI handling"

* tag 'i3c/for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux:
  i3c: dw: Add hot-join support.
  i3c: master: Enable runtime PM for master controller
  i3c: master: svc: fix invalidate IBI type and miss call client IBI handler
  i3c: master: svc: change ENXIO to EAGAIN when IBI occurs during start frame
  i3c: Add comment for -EAGAIN in i3c_device_do_priv_xfers()
2024-05-25 | Merge tag 'jffs2-for-linus-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs | Linus Torvalds | 4 | -35/+26
Pull jffs2 updates from Richard Weinberger:

 - Fix illegal memory access in jffs2_free_inode()
 - Kernel-doc fixes
 - print symbolic error names

* tag 'jffs2-for-linus-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs:
  jffs2: Fix potential illegal address access in jffs2_free_inode
  jffs2: Simplify the allocation of slab caches
  jffs2: nodemgmt: fix kernel-doc comments
  jffs2: print symbolic error name instead of error code
2024-05-25 | Merge tag 'uml-for-linus-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux | Linus Torvalds | 54 | -129/+136
Pull UML updates from Richard Weinberger:

 - Fixes for -Wmissing-prototypes warnings and further cleanup
 - Remove callback returning void from rtc and virtio drivers
 - Fix bash location

* tag 'uml-for-linus-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux: (26 commits)
  um: virtio_uml: Convert to platform remove callback returning void
  um: rtc: Convert to platform remove callback returning void
  um: Remove unused do_get_thread_area function
  um: Fix -Wmissing-prototypes warnings for __vdso_*
  um: Add an internal header shared among the user code
  um: Fix the declaration of kasan_map_memory
  um: Fix the -Wmissing-prototypes warning for get_thread_reg
  um: Fix the -Wmissing-prototypes warning for __switch_mm
  um: Fix -Wmissing-prototypes warnings for (rt_)sigreturn
  um: Stop tracking host PID in cpu_tasks
  um: process: remove unused 'n' variable
  um: vector: remove unused len variable/calculation
  um: vector: fix bpfflash parameter evaluation
  um: slirp: remove set but unused variable 'pid'
  um: signal: move pid variable where needed
  um: Makefile: use bash from the environment
  um: Add winch to winch_handlers before registering winch IRQ
  um: Fix -Wmissing-prototypes warnings for __warp_* and foo
  um: Fix -Wmissing-prototypes warnings for text_poke*
  um: Move declarations to proper headers
  ...
2024-05-24 | Merge tag 'drm-next-2024-05-25' of https://gitlab.freedesktop.org/drm/kernel | Linus Torvalds | 27 | -98/+215
Pull drm fixes from Dave Airlie:

 "Some fixes for the end of the merge window, mostly amdgpu and panthor, with one nouveau uAPI change that fixes a bad decision we made a few months back.

  nouveau:
   - fix bo metadata uAPI for vm bind

  panthor:
   - Fixes for panthor's heap logical block.
   - Reset on unrecoverable fault
   - Fix VM references.
   - Reset fix.

  xlnx:
   - xlnx compile and doc fixes.

  amdgpu:
   - Handle vbios table integrated info v2.3

  amdkfd:
   - Handle duplicate BOs in reserve_bo_and_cond_vms
   - Handle memory limitations on small APUs

  dp/mst:
   - MST null deref fix.

  bridge:
   - Don't let next bridge create connector in adv7511 to make probe work"

* tag 'drm-next-2024-05-25' of https://gitlab.freedesktop.org/drm/kernel:
  drm/amdgpu/atomfirmware: add intergrated info v2.3 table
  drm/mst: Fix NULL pointer dereference at drm_dp_add_payload_part2
  drm/amdkfd: Let VRAM allocations go to GTT domain on small APUs
  drm/amdkfd: handle duplicate BOs in reserve_bo_and_cond_vms
  drm/bridge: adv7511: Attach next bridge without creating connector
  drm/buddy: Fix the warn on's during force merge
  drm/nouveau: use tile_mode and pte_kind for VM_BIND bo allocations
  drm/panthor: Call panthor_sched_post_reset() even if the reset failed
  drm/panthor: Reset the FW VM to NULL on unplug
  drm/panthor: Keep a ref to the VM at the panthor_kernel_bo level
  drm/panthor: Force an immediate reset on unrecoverable faults
  drm/panthor: Document drm_panthor_tiler_heap_destroy::handle validity constraints
  drm/panthor: Fix an off-by-one in the heap context retrieval logic
  drm/panthor: Relax the constraints on the tiler chunk size
  drm/panthor: Make sure the tiler initial/max chunks are consistent
  drm/panthor: Fix tiler OOM handling to allow incremental rendering
  drm: xlnx: zynqmp_dpsub: Fix compilation error
  drm: xlnx: zynqmp_dpsub: Fix few function comments
2024-05-24 | cifs: Fix missing set of remote_i_size | David Howells | 2 | -3/+4
Occasionally, the generic/001 xfstest will fail indicating corruption in one of the copy chains when run on cifs against a server that supports FSCTL_DUPLICATE_EXTENTS_TO_FILE (eg. Samba with a share on btrfs). The problem is that the remote_i_size value isn't updated by cifs_setsize() when called by smb2_duplicate_extents(), but i_size *is*. This may cause cifs_remap_file_range() to then skip the bit after calling ->duplicate_extents() that sets sizes. Fix this by calling netfs_resize_file() in smb2_duplicate_extents() before calling cifs_setsize() to set i_size. This means we don't then need to call netfs_resize_file() upon return from ->duplicate_extents(), but we also fix the test to compare against the pre-dup inode size. [Note that this goes back before the addition of remote_i_size with the netfs_inode struct. It should probably have been setting cifsi->server_eof previously.] Fixes: cfc63fc8126a ("smb3: fix cached file size problems in duplicate extents (reflink)") Signed-off-by: David Howells <[email protected]> cc: Steve French <[email protected]> cc: Paulo Alcantara <[email protected]> cc: Shyam Prasad N <[email protected]> cc: Rohith Surabattula <[email protected]> cc: Jeff Layton <[email protected]> cc: [email protected] cc: [email protected] Signed-off-by: Steve French <[email protected]>
2024-05-24 | cifs: Fix smb3_insert_range() to move the zero_point | David Howells | 1 | -0/+1
Fix smb3_insert_range() to move the zero_point over to the new EOF. Without this, generic/147 fails as reads of data beyond the old EOF point return zeroes. Fixes: 3ee1a1fc3981 ("cifs: Cut over to using netfslib") Signed-off-by: David Howells <[email protected]> cc: Shyam Prasad N <[email protected]> cc: Rohith Surabattula <[email protected]> cc: Jeff Layton <[email protected]> cc: [email protected] cc: [email protected] Signed-off-by: Steve French <[email protected]>
2024-05-24 | Merge tag 'mm-stable-2024-05-24-11-49' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm | Linus Torvalds | 33 | -3/+2732
Pull more mm updates from Andrew Morton:

 "Jeff Xu's implementation of the mseal() syscall"

* tag 'mm-stable-2024-05-24-11-49' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  selftest mm/mseal read-only elf memory segment
  mseal: add documentation
  selftest mm/mseal memory sealing
  mseal: add mseal syscall
  mseal: wire up mseal syscall
2024-05-24 | mm/ksm: fix possible UAF of stable_node | Chengming Zhou | 1 | -1/+2
Commit 2c653d0ee2ae ("ksm: introduce ksm_max_page_sharing per page deduplication limit") introduced a possible failure case in stable_tree_insert(), where we may free the newly allocated stable_node_dup if we fail to prepare the missing chain node. The kfolio is then returned and unlocked with a freed stable_node still set, and any MM activity can come in and access kfolio->mapping, resulting in a use-after-free (UAF). Fix it by moving folio_set_stable_node() to the end, after stable_node has been inserted successfully. Link: https://lkml.kernel.org/r/[email protected] Fixes: 2c653d0ee2ae ("ksm: introduce ksm_max_page_sharing per page deduplication limit") Signed-off-by: Chengming Zhou <[email protected]> Acked-by: David Hildenbrand <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Stefan Roesch <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
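The ordering principle behind this fix (publish a pointer only after every step that can fail has succeeded) can be shown with a user-space sketch. All names here are hypothetical: `struct demo_node`, `struct demo_folio`, and `demo_insert()` only model the pattern and are not the KSM data structures, and the `chain_ok` parameter stands in for the preparation step that may fail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct demo_node {
	int key;
};

/* Hypothetical stand-in for an object that other code can reach. */
struct demo_folio {
	struct demo_node *node;	/* visible to others once set */
};

/*
 * Allocate a node and attach it to the folio. chain_ok models a
 * preparation step that may fail after the allocation.
 */
static bool demo_insert(struct demo_folio *folio, int key, bool chain_ok)
{
	struct demo_node *n = malloc(sizeof(*n));

	if (!n)
		return false;
	n->key = key;

	if (!chain_ok) {
		/* Failure path: the node is freed and was never published. */
		free(n);
		return false;
	}

	/* Publish last, once nothing can fail any more. */
	folio->node = n;
	return true;
}
```

Had the assignment to `folio->node` come before the failure check, the failure path would leave the folio pointing at freed memory, which is the shape of the bug described above.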
2024-05-24mm/memory-failure: fix handling of dissolved but not taken off from buddy pagesMiaohe Lin1-2/+2
When I ran memory failure tests recently, the following panic occurred: page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00 flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff) raw: 06fffe0000000000 dead000000000100 dead000000000122 0000000000000000 raw: 0000000000000000 0000000000000009 00000000ffffffff 0000000000000000 page dumped because: VM_BUG_ON_PAGE(!PageBuddy(page)) ------------[ cut here ]------------ kernel BUG at include/linux/page-flags.h:1009! invalid opcode: 0000 [#1] PREEMPT SMP NOPTI RIP: 0010:__del_page_from_free_list+0x151/0x180 RSP: 0018:ffffa49c90437998 EFLAGS: 00000046 RAX: 0000000000000035 RBX: 0000000000000009 RCX: ffff8dd8dfd1c9c8 RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff8dd8dfd1c9c0 RBP: ffffd901233b8000 R08: ffffffffab5511f8 R09: 0000000000008c69 R10: 0000000000003c15 R11: ffffffffab5511f8 R12: ffff8dd8fffc0c80 R13: 0000000000000001 R14: ffff8dd8fffc0c80 R15: 0000000000000009 FS: 00007ff916304740(0000) GS:ffff8dd8dfd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055eae50124c8 CR3: 00000008479e0000 CR4: 00000000000006f0 Call Trace: <TASK> __rmqueue_pcplist+0x23b/0x520 get_page_from_freelist+0x26b/0xe40 __alloc_pages_noprof+0x113/0x1120 __folio_alloc_noprof+0x11/0xb0 alloc_buddy_hugetlb_folio.isra.0+0x5a/0x130 __alloc_fresh_hugetlb_folio+0xe7/0x140 alloc_pool_huge_folio+0x68/0x100 set_max_huge_pages+0x13d/0x340 hugetlb_sysctl_handler_common+0xe8/0x110 proc_sys_call_handler+0x194/0x280 vfs_write+0x387/0x550 ksys_write+0x64/0xe0 do_syscall_64+0xc2/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7ff916114887 RSP: 002b:00007ffec8a2fd78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 000055eae500e350 RCX: 00007ff916114887 RDX: 0000000000000004 RSI: 000055eae500e390 RDI: 0000000000000003 RBP: 000055eae50104c0 R08: 0000000000000000 R09: 000055eae50104c0 R10: 0000000000000077 R11: 0000000000000246 R12: 0000000000000004 R13: 0000000000000004 
R14: 00007ff916216b80 R15: 00007ff916216a00 </TASK> Modules linked in: mce_inject hwpoison_inject ---[ end trace 0000000000000000 ]--- And before the panic, there was a warning about bad page state: BUG: Bad page state in process page-types pfn:8cee00 page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00 flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff) page_type: 0xffffff7f(buddy) raw: 06fffe0000000000 ffffd901241c0008 ffffd901240f8008 0000000000000000 raw: 0000000000000000 0000000000000009 00000000ffffff7f 0000000000000000 page dumped because: nonzero mapcount Modules linked in: mce_inject hwpoison_inject CPU: 8 PID: 154211 Comm: page-types Not tainted 6.9.0-rc4-00499-g5544ec3178e2-dirty #22 Call Trace: <TASK> dump_stack_lvl+0x83/0xa0 bad_page+0x63/0xf0 free_unref_page+0x36e/0x5c0 unpoison_memory+0x50b/0x630 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110 debugfs_attr_write+0x42/0x60 full_proxy_write+0x5b/0x80 vfs_write+0xcd/0x550 ksys_write+0x64/0xe0 do_syscall_64+0xc2/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f189a514887 RSP: 002b:00007ffdcd899718 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f189a514887 RDX: 0000000000000009 RSI: 00007ffdcd899730 RDI: 0000000000000003 RBP: 00007ffdcd8997a0 R08: 0000000000000000 R09: 00007ffdcd8994b2 R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdcda199a8 R13: 0000000000404af1 R14: 000000000040ad78 R15: 00007f189a7a5040 </TASK> The root cause should be the below race: memory_failure try_memory_failure_hugetlb me_huge_page __page_handle_poison dissolve_free_hugetlb_folio drain_all_pages -- Buddy page can be isolated e.g. for compaction. take_page_off_buddy -- Failed as page is not in the buddy list. -- Page can be putback into buddy after compaction. page_ref_inc -- Leads to buddy page with refcnt = 1. 
Then unpoison_memory() can unpoison the page and send the buddy page back into buddy list again leading to the above bad page state warning. And bad_page() will call page_mapcount_reset() to remove PageBuddy from buddy page leading to later VM_BUG_ON_PAGE(!PageBuddy(page)) when trying to allocate this page. Fix this issue by only treating __page_handle_poison() as successful when it returns 1. Link: https://lkml.kernel.org/r/[email protected] Fixes: ceaf8fbea79a ("mm, hwpoison: skip raw hwpoison page in freeing 1GB hugepage") Signed-off-by: Miaohe Lin <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
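The difference between "did not fail" and "returned 1" can be sketched as follows. This is a simplified model, not the mm/memory-failure.c code: handle_poison_model(), finish_huge_page() and struct page_model are hypothetical, and the three-way return convention (negative when dissolving fails, 0 when the page was dissolved but not taken off the buddy list, 1 when both steps succeeded) is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a page; only the reference count matters here. */
struct page_model { int refcount; };

enum mf_result { MF_FAILED, MF_RECOVERED };

/*
 * Assumed return convention (for illustration only):
 *   < 0: dissolving the huge page failed
 *     0: dissolved, but the page could not be taken off the buddy list
 *     1: dissolved and taken off the buddy list
 */
int handle_poison_model(bool dissolved, bool taken_off_buddy)
{
	if (!dissolved)
		return -1;
	return taken_off_buddy ? 1 : 0;
}

/*
 * Take the extra reference only when the page really left the buddy list.
 * Testing "ret >= 0" instead would also pin a page that is still (or again)
 * a buddy page, which is the state the race above ends in.
 */
enum mf_result finish_huge_page(struct page_model *p, int ret)
{
	if (ret > 0) {
		p->refcount++;
		return MF_RECOVERED;
	}
	return MF_FAILED;
}
```

In this model a return value of 0 leaves the reference count untouched and reports failure, so a later unpoison attempt has no stray reference to drop on a buddy page.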
2024-05-24mm: /proc/pid/smaps_rollup: avoid skipping vma after getting mmap_lock againYuanyuan Zhong1-2/+7
After switching smaps_rollup to use the VMA iterator, searching for the next entry is part of the condition expression of the do-while loop. So the current VMA needs to be addressed before the continue statement. Otherwise, with some VMAs skipped, the memory consumption that userspace observes from /proc/pid/smaps_rollup will be smaller than the sum of the corresponding fields from /proc/pid/smaps. Link: https://lkml.kernel.org/r/[email protected] Fixes: c4c84f06285e ("fs/proc/task_mmu: stop using linked list and highest_vm_end") Signed-off-by: Yuanyuan Zhong <[email protected]> Reviewed-by: Mohamed Khalfella <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
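The control-flow issue can be reproduced in a few lines of plain C, because continue inside a do-while jumps to the loop condition, and here the condition is what advances the iterator. The sketch below is a simplified model and not the fs/proc/task_mmu.c code: struct vma_model, struct iter, iter_next() and rollup() are hypothetical names, and the "contended" flag stands in for the lock being dropped and retaken.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a mapping: its size and whether the lock is contended. */
struct vma_model { unsigned long size; bool contended; };

struct iter { const struct vma_model *v; size_t n, pos; };

static const struct vma_model *iter_next(struct iter *it)
{
	return it->pos < it->n ? &it->v[it->pos++] : NULL;
}

/* Sums all sizes; account_before_continue selects the fixed behaviour. */
unsigned long rollup(const struct vma_model *v, size_t n,
		     bool account_before_continue)
{
	struct iter it = { v, n, 0 };
	const struct vma_model *cur = iter_next(&it);
	unsigned long total = 0;

	if (!cur)
		return 0;

	do {
		total += cur->size;
		if (cur->contended) {
			/* Lock dropped and retaken: look up where to resume. */
			cur = iter_next(&it);
			if (!cur)
				break;
			/* The loop condition will move past cur, so handle it now. */
			if (account_before_continue)
				total += cur->size;
			continue;
		}
	} while ((cur = iter_next(&it)) != NULL);

	return total;
}
```

With account_before_continue set to false, the entry fetched after the relock is never counted, so the total comes out smaller than the sum of the individual entries.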
2024-05-24nilfs2: fix potential hang in nilfs_detach_log_writer()Ryusuke Konishi1-3/+18
Syzbot has reported a potential hang in nilfs_detach_log_writer() called during nilfs2 unmount. Analysis revealed that this is because nilfs_segctor_sync(), which synchronizes with the log writer thread, can be called after nilfs_segctor_destroy() terminates that thread, as shown in the call trace below: nilfs_detach_log_writer nilfs_segctor_destroy nilfs_segctor_kill_thread --> Shut down log writer thread flush_work nilfs_iput_work_func nilfs_dispose_list iput nilfs_evict_inode nilfs_transaction_commit nilfs_construct_segment (if inode needs sync) nilfs_segctor_sync --> Attempt to synchronize with log writer thread *** DEADLOCK *** Fix this issue by changing nilfs_segctor_sync() so that the log writer thread returns normally without synchronizing after it terminates, and by forcing tasks that are already waiting to complete once after the thread terminates. The skipped inode metadata flushout will then be processed together in the subsequent cleanup work in nilfs_segctor_destroy(). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ryusuke Konishi <[email protected]> Reported-by: [email protected] Closes: https://syzkaller.appspot.com/bug?extid=e3973c409251e136fdd0 Tested-by: Ryusuke Konishi <[email protected]> Cc: <[email protected]> Cc: "Bai, Shuangpeng" <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-05-24nilfs2: fix unexpected freezing of nilfs_segctor_sync()Ryusuke Konishi1-4/+13
A reproducible race has been identified where nilfs_segctor_sync() would block even after the log writer thread writes a checkpoint, unless there is an interrupt or other trigger to resume log writing. This turned out to be because, depending on the execution timing of the log writer thread running in parallel, the log writer thread may skip responding to nilfs_segctor_sync(), which causes a call to schedule() waiting for completion within nilfs_segctor_sync() to lose the opportunity to wake up. The reason why waking up the task waiting in nilfs_segctor_sync() may be skipped is that updating the request generation issued using a shared sequence counter and adding a wait queue entry to the log writer's request wait queue are not done atomically. Log writing and request completion notification by nilfs_segctor_wakeup() may occur between the two operations, and in that case, the wait queue entry is not yet visible to nilfs_segctor_wakeup() and the wake-up of nilfs_segctor_sync() will be carried over until the next request occurs. Fix this issue by performing these two operations together within the sc_state_lock critical section. Also, following the memory barrier guidelines for event waiting loops, move the call to set_current_state() in the same location into the event waiting loop to ensure that a memory barrier is inserted just before the event condition determination. Link: https://lkml.kernel.org/r/[email protected] Fixes: 9ff05123e3bf ("nilfs2: segment constructor") Signed-off-by: Ryusuke Konishi <[email protected]> Tested-by: Ryusuke Konishi <[email protected]> Cc: <[email protected]> Cc: "Bai, Shuangpeng" <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
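The lost wake-up can be shown with a deterministic, single-threaded model of the two interleavings. This is only a sketch under stated assumptions, not the nilfs2 code: struct sc_model, struct waiter, writer_pass() and request_sync() are hypothetical, and the "atomic" flag stands in for doing both steps while holding one lock, so that the writer can run only before or after the pair.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the shared state: issued and completed generations. */
struct sc_model { unsigned seq_request; unsigned seq_done; };

/* Hypothetical wait queue entry of one sync request. */
struct waiter { unsigned seq; bool queued; bool done; };

/* One log-writer pass: complete everything issued, notify queued waiters only. */
static void writer_pass(struct sc_model *sc, struct waiter *w)
{
	sc->seq_done = sc->seq_request;
	if (w->queued && w->seq <= sc->seq_done)
		w->done = true;
}

/*
 * Issues a request and returns whether the writer pass notified it.
 * atomic == false models the writer running between "bump the sequence"
 * and "queue the waiter"; atomic == true models both steps being done
 * under the same lock, so the writer runs only afterwards.
 */
bool request_sync(struct sc_model *sc, bool atomic)
{
	struct waiter w = { 0, false, false };

	w.seq = ++sc->seq_request;	/* step 1: update request generation */
	if (!atomic)
		writer_pass(sc, &w);	/* writer sneaks in: waiter not visible */
	w.queued = true;		/* step 2: add wait queue entry */
	if (atomic)
		writer_pass(sc, &w);	/* writer sees both steps at once */

	return w.done;
}
```

In the non-atomic interleaving the request is written out (seq_done catches up) but the waiter is never marked done, which corresponds to the task sleeping until some later request triggers another pass.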