aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2023-12-06xfs: document what LARP meansDarrick J. Wong1-0/+9
Christoph requested a blurb somewhere explaining exactly what LARP means. I don't know of a good place other than the source code (debug knobs aren't covered in Documentation/), so here it is. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-06xfs: fix 32-bit truncation in xfs_compute_rextslogDarrick J. Wong1-3/+5
It's quite reasonable that some customer somewhere will want to configure a realtime volume with more than 2^32 extents. If they try to do this, the highbit32() call will truncate the upper bits of the xfs_rtbxlen_t and produce the wrong value for rextslog. This in turn causes the rsumlevels to be wrong, which results in a realtime summary file that is the wrong length. Fix that. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-06xfs: make rextslog computation consistent with mkfsDarrick J. Wong4-3/+19
There's a weird discrepancy in xfsprogs dating back to the creation of the Linux port -- if there are zero rt extents, mkfs will set sb_rextents and sb_rextslog both to zero: sbp->sb_rextslog = (uint8_t)(rtextents ? libxfs_highbit32((unsigned int)rtextents) : 0); However, that's not the check that xfs_repair uses for nonzero rtblocks: if (sb->sb_rextslog != libxfs_highbit32((unsigned int)sb->sb_rextents)) The difference here is that xfs_highbit32 returns -1 if its argument is zero. Unfortunately, this means that in the weird corner case of a realtime volume shorter than 1 rt extent, xfs_repair will immediately flag a freshly formatted filesystem as corrupt. Because mkfs has been writing ondisk artifacts like this for decades, we have to accept that as "correct". TBH, zero rextslog for zero rtextents makes more sense to me anyway. Regrettably, the superblock verifier checks created in commit copied xfs_repair even though mkfs has been writing out such filesystems for ages. Fix the superblock verifier to accept what mkfs spits out; the userspace version of this patch will have to fix xfs_repair as well. Note that the new helper leaves the zeroday bug where the upper 32 bits of sb_rextents is ripped off and fed to highbit32. This leads to a seriously undersized rt summary file, which immediately breaks mkfs: $ hugedisk.sh foo /dev/sdc $(( 0x100000080 * 4096))B $ /sbin/mkfs.xfs -f /dev/sda -m rmapbt=0,reflink=0 -r rtdev=/dev/mapper/foo meta-data=/dev/sda isize=512 agcount=4, agsize=1298176 blks = sectsz=512 attr=2, projid32bit=1 = crc=1 finobt=1, sparse=1, rmapbt=0 = reflink=0 bigtime=1 inobtcount=1 nrext64=1 data = bsize=4096 blocks=5192704, imaxpct=25 = sunit=0 swidth=0 blks naming =version 2 bsize=4096 ascii-ci=0, ftype=1 log =internal log bsize=4096 blocks=16384, version=2 = sectsz=512 sunit=0 blks, lazy-count=1 realtime =/dev/mapper/foo extsz=4096 blocks=4294967424, rtextents=4294967424 Discarding blocks...Done. mkfs.xfs: Error initializing the realtime space [117 - Structure needs cleaning] The next patch will drop support for rt volumes with fewer than 1 or more than 2^32-1 rt extents, since they've clearly been broken forever. Fixes: f8e566c0f5e1f ("xfs: validate the realtime geometry in xfs_validate_sb_common") Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-06xfs: collapse the ->create_done functionsDarrick J. Wong5-109/+64
Move the meat of the ->create_done function helpers into ->create_done to reduce the amount of boilerplate. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-06xfs: hoist xfs_trans_add_item calls to defer ops functionsDarrick J. Wong6-17/+6
Remove even more repeated boilerplate. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-06xfs: clean out XFS_LI_DIRTY setting boilerplate from ->iop_relogDarrick J. Wong6-7/+11
Hoist this dirty flag setting to the ->iop_relog callsite to reduce boilerplate. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-06xfs: use xfs_defer_create_done for the relogging operationDarrick J. Wong7-26/+14
Now that we have a helper to handle creating a log intent done item and updating all the necessary state flags, use it to reduce boilerplate in the ->iop_relog implementations. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-06xfs: hoist ->create_intent boilerplate to its callsiteDarrick J. Wong6-15/+2
Hoist the dirty flag setting code out of each ->create_intent implementation up to the callsite to reduce boilerplate further. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-06xfs: collapse the ->finish_item helpersDarrick J. Wong5-146/+58
Each log item's ->finish_item function sets up a small amount of state and calls another function to do the work. Collapse that other function into ->finish_item to reduce the call stack height. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-06xfs: move ->iop_recover to xfs_defer_op_typeDarrick J. Wong10-88/+109
Finish off the series by moving the intent item recovery function pointer to the xfs_defer_op_type struct, since this is really a deferred work function now. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-06xfs: hoist intent done flag setting to ->finish_item callsiteDarrick J. Wong6-92/+34
Each log intent item's ->finish_item call chain inevitably includes some code to set the dirty flag of the transaction. If there's an associated log intent done item, it also sets the item's dirty flag and the transaction's INTENT_DONE flag. This is repeated throughout the codebase. Reduce the LOC by moving all that to xfs_defer_finish_one. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-06xfs: use xfs_defer_finish_one to finish recovered work itemsDarrick J. Wong9-157/+49
Get rid of the open-coded calls to xfs_defer_finish_one. This also means that the recovery transaction takes care of cleaning up the dfp, and we have solved (I hope) all the ownership issues in recovery. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-06xfs: don't set XFS_TRANS_HAS_INTENT_DONE when there's no ATTRD log itemDarrick J. Wong1-2/+4
XFS_TRANS_HAS_INTENT_DONE is a flag to the CIL that we've added a log intent done item to the transaction. This enables an optimization wherein we avoid writing out log intent and log intent done items if they would have ended up in the same checkpoint. This reduces writes to the ondisk log and speeds up recovery as a result. However, callers can use the defer ops machinery to modify xattrs without using the log items. In this situation, there won't be an intent done item, so we do not need to set the flag. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-06xfs: dump the recovered xattri log item if corruption happensDarrick J. Wong1-0/+4
If xfs_attri_item_recover receives a corruption error when it tries to finish a recovered log intent item, it should dump the log item for debugging, just like all the other log intent items. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-06xfs: recreate work items when recovering intent itemsDarrick J. Wong7-163/+215
Recreate work items for each xfs_defer_pending object when we are recovering intent items. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-06xfs: transfer recovered intent item ownership in ->iop_recoverDarrick J. Wong7-7/+22
Now that we pass the xfs_defer_pending object into the intent item recovery functions, we know exactly when ownership of the sole refcount passes from the recovery context to the intent done item. At that point, we need to null out dfp_intent so that the recovery mechanism won't release it. This should fix the UAF problem reported by Long Li. Note that we still want to recreate the full deferred work state. That will be addressed in the next patches. Fixes: 2e76f188fd90 ("xfs: cancel intents immediately if process_intents fails") Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-06xfs: pass the xfs_defer_pending object to iop_recoverDarrick J. Wong7-7/+14
Now that log intent item recovery recreates the xfs_defer_pending state, we should pass that into the ->iop_recover routines so that the intent item can finish the recreation work. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-06xfs: use xfs_defer_pending objects to recover intent itemsDarrick J. Wong11-116/+158
One thing I never quite got around to doing is porting the log intent item recovery code to reconstruct the deferred pending work state. As a result, each intent item open codes xfs_defer_finish_one in its recovery method, because that's what the EFI code did before xfs_defer.c even existed. This is a gross thing to have left unfixed -- if an EFI cannot proceed due to busy extents, we end up creating separate new EFIs for each unfinished work item, which is a change in behavior from what runtime would have done. Worse yet, Long Li pointed out that there's a UAF in the recovery code. The ->commit_pass2 function adds the intent item to the AIL and drops the refcount. The one remaining refcount is now owned by the recovery mechanism (aka the log intent items in the AIL) with the intent of giving the refcount to the intent done item in the ->iop_recover function. However, if something fails later in recovery, xlog_recover_finish will walk the recovered intent items in the AIL and release them. If the CIL hasn't been pushed before that point (which is possible since we don't force the log until later) then the intent done release will try to free its associated intent, which has already been freed. This patch starts to address this mess by having the ->commit_pass2 functions recreate the xfs_defer_pending state. The next few patches will fix the recovery functions. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-06xfs: don't leak recovered attri intent itemsDarrick J. Wong1-2/+7
If recovery finds an xattr log intent item calling for the removal of an attribute and the file doesn't even have an attr fork, we know that the removal is trivially complete. However, we can't just exit the recovery function without doing something about the recovered log intent item -- it's still on the AIL, and not logging an attrd item means it stays there forever. This has likely not been seen in practice because few people use LARP and the runtime code won't log the attri for a no-attrfork removexattr operation. But let's fix this anyway. Also we shouldn't really be testing the attr fork presence until we've taken the ILOCK, though this doesn't matter much in recovery, which is single threaded. Fixes: fdaf1bb3cafc ("xfs: ATTR_REPLACE algorithm with LARP enabled needs rework") Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-03Linux 6.7-rc4Linus Torvalds1-1/+1
2023-12-03Merge tag 'v6.7-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds7-34/+54
Pull smb client fixes from Steve French: - Two fallocate fixes - Fix warnings from new gcc - Two symlink fixes * tag 'v6.7-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: smb: client, common: fix fortify warnings cifs: Fix FALLOC_FL_INSERT_RANGE by setting i_size after EOF moved cifs: Fix FALLOC_FL_ZERO_RANGE by setting i_size if EOF moved smb: client: report correct st_size for SMB and NFS symlinks smb: client: fix missing mode bits for SMB symlinks
2023-12-03Merge tag 'firewire-fixes-6.7-rc4' of ↵Linus Torvalds1-7/+4
git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394 Pull firewire fix from Takashi Sakamoto: "A single patch to fix long-standing issue of memory leak at failure of device registration for fw_unit. We rarely encounter the issue, but it should be applied to stable releases, since it fixes inappropriate API usage" * tag 'firewire-fixes-6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394: firewire: core: fix possible memory leak in create_units()
2023-12-03Merge tag 'powerpc-6.7-3' of ↵Linus Torvalds3-3/+18
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix corruption of f0/vs0 during FP/Vector save, seen as userspace crashes when using io-uring workers (in particular with MariaDB) - Fix KVM_RUN potentially clobbering all host userspace FP/Vector registers Thanks to Timothy Pearson, Jens Axboe, and Nicholas Piggin. * tag 'powerpc-6.7-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: KVM: PPC: Book3S HV: Fix KVM_RUN clobbering FP/VEC user registers powerpc: Don't clobber f0/vs0 during fp|altivec register save
2023-12-03Merge tag 'vfio-v6.7-rc4' of https://github.com/awilliam/linux-vfioLinus Torvalds4-18/+26
Pull vfio fixes from Alex Williamson: - Fix the lifecycle of a mutex in the pds variant driver such that a reset prior to opening the device won't find it uninitialized. Implement the release path to symmetrically destroy the mutex. Also switch a different lock from spinlock to mutex as the code path has the potential to sleep and doesn't need the spinlock context otherwise (Brett Creeley) - Fix an issue detected via randconfig where KVM tries to symbol_get an undeclared function. The symbol is temporarily declared unconditionally here, which resolves the problem and avoids churn relative to a series pending for the next merge window which resolves some of this symbol ugliness, but also fixes Kconfig dependencies (Sean Christopherson) * tag 'vfio-v6.7-rc4' of https://github.com/awilliam/linux-vfio: vfio: Drop vfio_file_iommu_group() stub to fudge around a KVM wart vfio/pds: Fix possible sleep while in atomic context vfio/pds: Fix mutex lock->magic != lock warning
2023-12-03Merge tag 'for-linus-6.7a-rc4-tag' of ↵Linus Torvalds3-3/+9
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: - A fix for the Xen event driver setting the correct return value when experiencing an allocation failure - A fix for allocating space for a struct in the percpu area to not cross page boundaries (this one is for x86, a similar one for Arm was already in the pull request for rc3) * tag 'for-linus-6.7a-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/events: fix error code in xen_bind_pirq_msi_to_irq() x86/xen: fix percpu vcpu_info allocation
2023-12-03Merge tag 'probes-fixes-v6.7-rc3' of ↵Linus Torvalds5-21/+43
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull probes fixes from Masami Hiramatsu: - objpool: Fix objpool overrun case on memory/cache access delay especially on the big.LITTLE SoC. The objpool uses a copy of object slot index internal loop, but the slot index can be changed on another processor in parallel. In that case, the difference of 'head' local copy and the 'slot->last' index will be bigger than local slot size. In that case, we need to re-read the slot::head to update it. - kretprobe: Fix to use appropriate rcu API for kretprobe holder. Since kretprobe_holder::rp is RCU managed, it should use rcu_assign_pointer() and rcu_dereference_check() correctly. Also adding __rcu tag for finding wrong usage by sparse. - rethook: Fix to use appropriate rcu API for rethook::handler. The same as kretprobe, rethook::handler is RCU managed and it should use rcu_assign_pointer() and rcu_dereference_check(). This also adds __rcu tag for finding wrong usage by sparse. * tag 'probes-fixes-v6.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: rethook: Use __rcu pointer for rethook::handler kprobes: consistent rcu api usage for kretprobe holder lib: objpool: fix head overrun on RK3588 SBC
2023-12-02Merge tag 'pm-6.7-rc4' of ↵Linus Torvalds7-32/+136
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix issues in two cpufreq drivers, in the AMD P-state driver and in the power-capping DTPM framework. Specifics: - Fix the AMD P-state driver's EPP sysfs interface in the cases when the performance governor is in use (Ayush Jain) - Make the ->fast_switch() callback in the AMD P-state driver return the target frequency as expected (Gautham R. Shenoy) - Allow user space to control the range of frequencies to use via scaling_min_freq and scaling_max_freq when AMD P-state driver is in use (Wyes Karny) - Prevent power domains needed for wakeup signaling from being turned off during system suspend on Qualcomm systems and prevent performance states votes from runtime-suspended devices from being lost across a system suspend-resume cycle in qcom-cpufreq-nvmem (Stephan Gerhold) - Fix disabling the 792 Mhz OPP in the imx6q cpufreq driver for the i.MX6ULL types that can run at that frequency (Christoph Niedermaier) - Eliminate unnecessary and harmful conversions to uW from the DTPM (dynamic thermal and power management) framework (Lukasz Luba)" * tag 'pm-6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq/amd-pstate: Only print supported EPP values for performance governor cpufreq/amd-pstate: Fix scaling_min_freq and scaling_max_freq update powercap: DTPM: Fix unneeded conversions to micro-Watts cpufreq/amd-pstate: Fix the return value of amd_pstate_fast_switch() pmdomain: qcom: rpmpd: Set GENPD_FLAG_ACTIVE_WAKEUP cpufreq: qcom-nvmem: Preserve PM domain votes in system suspend cpufreq: qcom-nvmem: Enable virtual power domain devices cpufreq: imx6q: Don't disable 792 Mhz OPP unnecessarily
2023-12-02Merge tag 'acpi-6.7-rc4' of ↵Linus Torvalds4-24/+17
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "This fixes a recently introduced build issue on ARM32 and a NULL pointer dereference in the ACPI backlight driver due to a design issue exposed by a recent change in the ACPI bus type code. Specifics: - Fix a recently introduced build issue on ARM32 platforms caused by an inadvertent header file breakage (Dave Jiang) - Eliminate questionable usage of acpi_driver_data() in the ACPI backlight cooling device code that leads to NULL pointer dereferences after recent ACPI core changes (Hans de Goede)" * tag 'acpi-6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: video: Use acpi_video_device for cooling-dev driver data ACPI: Fix ARM32 platforms compile issue introduced by fw_table changes
2023-12-02Merge tag 'arm64-fixes' of ↵Linus Torvalds1-0/+4
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fix from Catalin Marinas: "Fix a regression where the arm64 KPTI ends up enabled even on systems that don't need it" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: Avoid enabling KPTI unnecessarily
2023-12-02Merge tag 'iommu-fixes-v6.7-rc3' of ↵Linus Torvalds9-42/+126
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu fixes from Joerg Roedel: - Fix race conditions in device probe path - Handle ERR_PTR() returns in __iommu_domain_alloc() path - Update MAINTAINERS entry for Qualcom IOMMUs - Printk argument fix in device tree specific code - Several Intel VT-d fixes from Lu Baolu: - Do not support enforcing cache coherency for non-empty domains - Avoid devTLB invalidation if iommu is off - Disable PCI ATS in legacy passthrough mode - Support non-PCI devices when clearing context - Fix incorrect cache invalidation for mm notification - Add MTL to quirk list to skip TE disabling - Set variable intel_dirty_ops to static * tag 'iommu-fixes-v6.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu: Fix printk arg in of_iommu_get_resv_regions() iommu/vt-d: Set variable intel_dirty_ops to static iommu/vt-d: Fix incorrect cache invalidation for mm notification iommu/vt-d: Add MTL to quirk list to skip TE disabling iommu/vt-d: Make context clearing consistent with context mapping iommu/vt-d: Disable PCI ATS in legacy passthrough mode iommu/vt-d: Omit devTLB invalidation requests when TES=0 iommu/vt-d: Support enforce_cache_coherency only for empty domains iommu: Avoid more races around device probe MAINTAINERS: list all Qualcomm IOMMU drivers in the QUALCOMM IOMMU entry iommu: Flow ERR_PTR out from __iommu_domain_alloc()
2023-12-02Merge tag 'sound-6.7-rc4' of ↵Linus Torvalds8-25/+76
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "No surprise here, including only a collection of HD-audio device-specific small fixes" * tag 'sound-6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda: Disable power-save on KONTRON SinglePC ALSA: hda/realtek: Add supported ALC257 for ChromeOS ALSA: hda/realtek: Headset Mic VREF to 100% ALSA: hda: intel-nhlt: Ignore vbps when looking for DMIC 32 bps format ALSA: hda: cs35l56: Enable low-power hibernation mode on SPI ALSA: cs35l41: Fix for old systems which do not support command ALSA: hda: cs35l41: Remove unnecessary boolean state variable firmware_running ALSA: hda - Fix speaker and headset mic pin config for CHUWI CoreBook XPro
2023-12-02Merge tag 'drm-fixes-2023-12-01' of git://anongit.freedesktop.org/drm/drmLinus Torvalds67-302/+537
Pull drm fixes from Dave Airlie: "Weekly fixes, mostly amdgpu fixes with a scattering of nouveau, i915, and a couple of reverts. Hopefully it will quieten down in coming weeks. drm: - Revert unexport of prime helpers for fd/handle conversion dma_resv: - Do not double add fences in dma_resv_add_fence. gpuvm: - Fix GPUVM license identifier. i915: - Mark internal GSC engine with reserved uabi class - Take VGA converters into account in eDP probe - Fix intel_pre_plane_updates() call to ensure workarounds get applied panel: - Revert panel fixes as they require exporting device_is_dependent. nouveau: - fix oversized allocations in new vm path - fix zero-length array - remove a stray lock nt36523: - Fix error check for nt36523. amdgpu: - DMUB fix - DCN 3.5 fixes - XGMI fix - DCN 3.2 fixes - Vangogh suspend fix - NBIO 7.9 fix - GFX11 golden register fix - Backlight fix - NBIO 7.11 fix - IB test overflow fix - DCN 3.1.4 fixes - fix a runtime pm ref count - Retimer fix - ABM fix - DCN 3.1.5 fix - Fix AGP addressing - Fix possible memory leak in SMU error path - Make sure PME is enabled in D3 - Fix possible NULL pointer dereference in debugfs - EEPROM fix - GC 9.4.3 fix amdkfd: - IP version check fix - Fix memory leak in pqm_uninit()" * tag 'drm-fixes-2023-12-01' of git://anongit.freedesktop.org/drm/drm: (53 commits) Revert "drm/prime: Unexport helpers for fd/handle conversion" drm/amdgpu: Use another offset for GC 9.4.3 remap drm/amd/display: Fix some HostVM parameters in DML drm/amdkfd: Free gang_ctx_bo and wptr_bo in pqm_uninit drm/amdgpu: Update EEPROM I2C address for smu v13_0_0 drm/amd/display: Allow DTBCLK disable for DCN35 drm/amdgpu: Fix cat debugfs amdgpu_regs_didt causes kernel null pointer drm/amd: Enable PCIe PME from D3 drm/amd/pm: fix a memleak in aldebaran_tables_init drm/amdgpu: fix AGP addressing when GART is not at 0 drm/amd/display: update dcn315 lpddr pstate latency drm/amd/display: fix ABM disablement drm/amd/display: Fix black screen on video playback with embedded panel drm/amd/display: Fix conversions between bytes and KB drm/amdkfd: Use common function for IP version check drm/amd/display: Remove config update drm/amd/display: Update DCN35 clock table policy drm/amd/display: force toggle rate wa for first link training for a retimer drm/amdgpu: correct the amdgpu runtime dereference usage count drm/amd/display: Update min Z8 residency time to 2100 for DCN314 ...
2023-12-02Merge tag 'io_uring-6.7-2023-11-30' of git://git.kernel.dk/linuxLinus Torvalds6-70/+224
Pull io_uring fixes from Jens Axboe: - Fix an issue with discontig page checking for IORING_SETUP_NO_MMAP - Fix an issue with not allowing IORING_SETUP_NO_MMAP also disallowing mmap'ed buffer rings - Fix an issue with deferred release of memory mapped pages - Fix a lockdep issue with IORING_SETUP_NO_MMAP - Use fget/fput consistently, even from our sync system calls. No real issue here, but if we were ever to allow closing io_uring descriptors it would be required. Let's play it safe and just use the full ref counted versions upfront. Most uses of io_uring are threaded anyway, and hence already doing the full version underneath. * tag 'io_uring-6.7-2023-11-30' of git://git.kernel.dk/linux: io_uring: use fget/fput consistently io_uring: free io_buffer_list entries via RCU io_uring/kbuf: prune deferred locked cache when tearing down io_uring/kbuf: recycle freed mapped buffer ring entries io_uring/kbuf: defer release of mapped buffer rings io_uring: enable io_mem_alloc/free to be used in other parts io_uring: don't guard IORING_OFF_PBUF_RING with SETUP_NO_MMAP io_uring: don't allow discontig pages for IORING_SETUP_NO_MMAP
2023-12-02Merge tag 'block-6.7-2023-12-01' of git://git.kernel.dk/linuxLinus Torvalds6-12/+58
Pull block fixes from Jens Axboe: - NVMe pull request via Keith: - Invalid namespace identification error handling (Marizio Ewan, Keith) - Fabrics keep-alive tuning (Mark) - Fix for a bad error check regression in bcache (Markus) - Fix for a performance regression with O_DIRECT (Ming) - Fix for a flush related deadlock (Ming) - Make the read-only warn on per-partition (Yu) * tag 'block-6.7-2023-12-01' of git://git.kernel.dk/linux: nvme-core: check for too small lba shift blk-mq: don't count completed flush data request as inflight in case of quiesce block: Document the role of the two attribute groups block: warn once for each partition in bio_check_ro() block: move .bd_inode into 1st cacheline of block_device nvme: check for valid nvme_identify_ns() before using it nvme-core: fix a memory leak in nvme_ns_info_from_identify() nvme: fine-tune sending of first keep-alive bcache: revert replacing IS_ERR_OR_NULL with IS_ERR
2023-12-02Merge tag 'dm-6.7/dm-fixes-2' of ↵Linus Torvalds4-10/+8
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mike Snitzer: - Fix DM verity target's FEC support to always initialize IO before it frees it. Also fix alignment of struct dm_verity_fec_io within the per-bio-data - Fix DM verity target to not FEC failed readahead IO - Update DM flakey target to use MAX_ORDER rather than MAX_ORDER - 1 * tag 'dm-6.7/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm-flakey: start allocating with MAX_ORDER dm-verity: align struct dm_verity_fec_io properly dm verity: don't perform FEC for failed readahead IO dm verity: initialize fec io before freeing it
2023-12-02Merge tag 'scsi-fixes' of ↵Linus Torvalds5-9/+40
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Three small fixes, one in drivers. The core changes are to the internal representation of flags in scsi_devices which removes space wasting bools in favour of single bit flags and to add a flag to force a runtime resume which is used by ATA devices" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: sd: Fix system start for ATA devices scsi: Change SCSI device boolean fields to single bit flags scsi: ufs: core: Clear cmd if abort succeeds in MCQ mode
2023-12-02Merge tag 'fs_for_v6.7-rc4' of ↵Linus Torvalds1-1/+0
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull ext2 fix from Jan Kara: "Fix an ext2 bug introduced by changes in ext2 & iomap stepping on each other toes (apparently ext2 driver does not get much testing in linux-next)" * tag 'fs_for_v6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: ext2: Fix ki_pos update for DIO buffered-io fallback case
2023-12-02Merge tag 'bcachefs-2023-11-29' of https://evilpiepirate.org/git/bcachefsLinus Torvalds45-323/+495
Pull more bcachefs bugfixes from Kent Overstreet: - bcache & bcachefs were broken with CFI enabled; patch for closures to fix type punning - mark erasure coding as extra-experimental; there are incompatible disk space accounting changes coming for erasure coding, and I'm still seeing checksum errors in some tests - several fixes for durability-related issues (durability is a device specific setting where we can tell bcachefs that data on a given device should be counted as replicated x times) - a fix for a rare livelock when a btree node merge then updates a parent node that is almost full - fix a race in the device removal path, where dropping a pointer in a btree node to a device would be clobbered by an in flight btree write updating the btree node key on completion - fix one SRCU lock hold time warning in the btree gc code - ther's still a bunch more of these to fix - fix a rare race where we'd start copygc before initializing the "are we rw" percpu refcount; copygc would think we were already ro and die immediately * tag 'bcachefs-2023-11-29' of https://evilpiepirate.org/git/bcachefs: (23 commits) bcachefs: Extra kthread_should_stop() calls for copygc bcachefs: Convert gc_alloc_start() to for_each_btree_key2() bcachefs: Fix race between btree writes and metadata drop bcachefs: move journal seq assertion bcachefs: -EROFS doesn't count as move_extent_start_fail bcachefs: trace_move_extent_start_fail() now includes errcode bcachefs: Fix split_race livelock bcachefs: Fix bucket data type for stripe buckets bcachefs: Add missing validation for jset_entry_data_usage bcachefs: Fix zstd compress workspace size bcachefs: bpos is misaligned on big endian bcachefs: Fix ec + durability calculation bcachefs: Data update path won't accidentaly grow replicas bcachefs: deallocate_extra_replicas() bcachefs: Proper refcounting for journal_keys bcachefs: preserve device path as device name bcachefs: Fix an endianness conversion bcachefs: Start gc, copygc, rebalance threads after initing writes ref bcachefs: Don't stop copygc thread on device resize bcachefs: Make sure bch2_move_ratelimit() also waits for move_ops ...
2023-12-01Merge branch 'acpi-tables'Rafael J. Wysocki3-15/+12
Merge a fix for a recently introduced build issue on ARM32 platforms caused by an inadvertent header file breakage (Dave Jiang). * acpi-tables: ACPI: Fix ARM32 platforms compile issue introduced by fw_table changes
2023-12-01Merge branch 'powercap'Rafael J. Wysocki2-13/+4
Merge a power capping fix for 6.7-rc4 which eliminates unnecessary and harmful conversions to uW from the DTPM (dynamic thermal and power management) framework (Lukasz Luba). * powercap: powercap: DTPM: Fix unneeded conversions to micro-Watts
2023-12-01Merge tag 'nvme-6.7-2023-12-01' of git://git.infradead.org/nvme into block-6.7Jens Axboe1-6/+28
Pull NVMe fixes from Keith: "nvme fixes for Linux 6.7 - Invalid namespace identification error handling (Marizio Ewan, Keith) - Fabrics keep-alive tuning (Mark)" * tag 'nvme-6.7-2023-12-01' of git://git.infradead.org/nvme: nvme-core: check for too small lba shift nvme: check for valid nvme_identify_ns() before using it nvme-core: fix a memory leak in nvme_ns_info_from_identify() nvme: fine-tune sending of first keep-alive
2023-12-01nvme-core: check for too small lba shiftKeith Busch1-2/+3
The block layer doesn't support logical block sizes smaller than 512 bytes. The nvme spec doesn't support that small either, but the driver isn't checking to make sure the device responded with usable data. Failing to catch this will result in a kernel bug, either from a division by zero when stacking, or a zero length bio. Reviewed-by: Jens Axboe <[email protected]> Signed-off-by: Keith Busch <[email protected]>
2023-12-01blk-mq: don't count completed flush data request as inflight in case of quiesceMing Lei1-1/+13
Request queue quiesce may interrupt flush sequence, and the original request may have been marked as COMPLETE, but can't get finished because of queue quiesce. This way is fine from driver viewpoint, because flush sequence is block layer concept, and it isn't related with driver. However, driver(such as dm-rq) can call blk_mq_queue_inflight() to count & drain inflight requests, then the wait & drain never gets done because the completed & not-finished flush request is counted as inflight. Fix this issue by not counting completed flush data request as inflight in case of quiesce. Cc: Mike Snitzer <[email protected]> Cc: David Jeffery <[email protected]> Cc: John Pittman <[email protected]> Signed-off-by: Ming Lei <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2023-12-01iommu: Fix printk arg in of_iommu_get_resv_regions()Daniel Mentz1-1/+1
The variable phys is defined as (struct resource *) which aligns with the printk format specifier %pr. Taking the address of it results in a value of type (struct resource **) which is incompatible with the format specifier %pr. Therefore, remove the address of operator (&). Fixes: a5bf3cfce8cb ("iommu: Implement of_iommu_get_resv_regions()") Signed-off-by: Daniel Mentz <[email protected]> Acked-by: Thierry Reding <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2023-12-01rethook: Use __rcu pointer for rethook::handlerMasami Hiramatsu (Google)3-14/+22
Since the rethook::handler is an RCU-maganged pointer so that it will notice readers the rethook is stopped (unregistered) or not, it should be an __rcu pointer and use appropriate functions to be accessed. This will use appropriate memory barrier when accessing it. OTOH, rethook::data is never changed, so we don't need to check it in get_kretprobe(). NOTE: To avoid sparse warning, rethook::handler is defined by a raw function pointer type with __rcu instead of rethook_handler_t. Link: https://lore.kernel.org/all/170126066201.398836.837498688669005979.stgit@devnote2/ Fixes: 54ecbe6f1ed5 ("rethook: Add a generic return hook") Cc: [email protected] Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/ Tested-by: JP Kobryn <[email protected]> Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
2023-12-01kprobes: consistent rcu api usage for kretprobe holderJP Kobryn2-7/+4
It seems that the pointer-to-kretprobe "rp" within the kretprobe_holder is RCU-managed, based on the (non-rethook) implementation of get_kretprobe(). The thought behind this patch is to make use of the RCU API where possible when accessing this pointer so that the needed barriers are always in place and to self-document the code. The __rcu annotation to "rp" allows for sparse RCU checking. Plain writes done to the "rp" pointer are changed to make use of the RCU macro for assignment. For the single read, the implementation of get_kretprobe() is simplified by making use of an RCU macro which accomplishes the same, but note that the log warning text will be more generic. I did find that there is a difference in assembly generated between the usage of the RCU macros vs without. For example, on arm64, when using rcu_assign_pointer(), the corresponding store instruction is a store-release (STLR) which has an implicit barrier. When normal assignment is done, a regular store (STR) is found. In the macro case, this seems to be a result of rcu_assign_pointer() using smp_store_release() when the value to write is not NULL. Link: https://lore.kernel.org/all/[email protected]/ Fixes: d741bf41d7c7 ("kprobes: Remove kretprobe hash") Cc: [email protected] Signed-off-by: JP Kobryn <[email protected]> Acked-by: Masami Hiramatsu (Google) <[email protected]> Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
2023-12-01lib: objpool: fix head overrun on RK3588 SBCwuqiang.matt1-0/+17
objpool overrun stress with test_objpool on OrangePi5+ SBC triggered the following kernel warnings: WARNING: CPU: 6 PID: 3115 at lib/objpool.c:168 objpool_push+0xc0/0x100 This message is from objpool.c:168: WARN_ON_ONCE(tail - head > pool->nr_objs); The overrun test case is to validate the case that pre-allocated objects are insufficient: 8 objects are pre-allocated for each node and consumer thread per node tries to grab 16 objects in a row. The testing system is OrangePI 5+, with RK3588, a big.LITTLE SOC with 4x A76 and 4x A55. When disabling either all 4 big or 4 little cores, the overrun tests run well, and once with big and little cores mixed together, the overrun test would always cause an overrun loop. It's likely the memory timing differences of big and little cores cause this trouble. Here are the debugging data of objpool_try_get_slot after try_cmpxchg_release: objpool_pop: cpu: 4/0 0:0 head: 278/279 tail:278 last:276/278 The local copies of 'head' and 'last' were 278 and 276, and reloading of 'slot->head' and 'slot->last' got 279 and 278. After try_cmpxchg_release 'slot->head' became 'head + 1', which is correct. But what's wrong here is the stale value of 'last', and that stale value of 'last' finally led the overrun of 'head'. Memory updating of 'last' and 'head' are performed in push() and pop() independently, which could be the culprit leading this out of order visibility of 'last' and 'head'. So for objpool_try_get_slot(), it's not enough only checking the condition of 'head != slot', the implicit condition 'last - head <= nr_objs' must also be explicitly asserted to guarantee 'last' is always behind 'head' before the object retrieving. This patch will check and try reloading of 'head' and 'last' to ensure 'last' is behind 'head' at the time of object retrieving. Performance testings show the average impact is about 0.1% for X86_64 and 1.12% for ARM64. Here are the results: OS: Debian 10 X86_64, Linux 6.6rc HW: XEON 8336C x 2, 64 cores/128 threads, DDR4 3200MT/s 1T 2T 4T 8T 16T native: 49543304 99277826 199017659 399070324 795185848 objpool: 29909085 59865637 119692073 239750369 478005250 objpool+: 29879313 59230743 119609856 239067773 478509029 32T 48T 64T 96T 128T native: 1596927073 2390099988 2929397330 3183875848 3257546602 objpool: 957553042 1435814086 1680872925 2043126796 2165424198 objpool+: 956476281 1434491297 1666055740 2041556569 2157415622 OS: Debian 11 AARCH64, Linux 6.6rc HW: Kunpeng-920 96 cores/2 sockets/4 NUMA nodes, DDR4 2933 MT/s 1T 2T 4T 8T 16T native: 30890508 60399915 123111980 242257008 494002946 objpool: 14742531 28883047 57739948 115886644 232455421 objpool+: 14107220 29032998 57286084 113730493 232232850 24T 32T 48T 64T 96T native: 746406039 1000174750 1493236240 1998318364 2942911180 objpool: 349164852 467284332 702296756 934459713 1387898285 objpool+: 348388180 462750976 696606096 927865887 1368402195 Link: https://lore.kernel.org/all/[email protected]/ Fixes: b4edb8d2d464 ("lib: objpool added: ring-array based lockless MPMC") Signed-off-by: wuqiang.matt <[email protected]> Acked-by: Masami Hiramatsu (Google) <[email protected]> Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
2023-12-01Merge tag 'hardening-v6.7-rc4' of ↵Linus Torvalds3-8/+5
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening fixes from Kees Cook: - struct_group: propagate attributes to top-level union (Dmitry Antipov) - gcc-plugins: randstruct: Update code comment in relayout_struct (Gustavo A. R. Silva) - MAINTAINERS: refresh LLVM support (Nick Desaulniers) * tag 'hardening-v6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: gcc-plugins: randstruct: Update code comment in relayout_struct() uapi: propagate __struct_group() attributes to the container union MAINTAINERS: refresh LLVM support
2023-12-01Merge tag 'linux_kselftest-kunit-fixes-6.7-rc4' of ↵Linus Torvalds2-3/+41
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull KUnit fixes from Shuah Khan: "Three fixes to warnings and run-time test behavior. With these fixes, test suite counter will be reset correctly before running tests, kunit will warn if tests are too slow, and eliminate warning when kfree() as an action" * tag 'linux_kselftest-kunit-fixes-6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: kunit: test: Avoid cast warning when adding kfree() as an action kunit: Reset suite counter right before running tests kunit: Warn if tests are slow
2023-12-01Merge tag 'amd-drm-fixes-6.7-2023-11-30' of ↵Dave Airlie54-249/+464
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.7-2023-11-30: amdgpu: - DMUB fix - DCN 3.5 fixes - XGMI fix - DCN 3.2 fixes - Vangogh suspend fix - NBIO 7.9 fix - GFX11 golden register fix - Backlight fix - NBIO 7.11 fix - IB test overflow fix - DCN 3.1.4 fixes - fix a runtime pm ref count - Retimer fix - ABM fix - DCN 3.1.5 fix - Fix AGP addressing - Fix possible memory leak in SMU error path - Make sure PME is enabled in D3 - Fix possible NULL pointer dereference in debugfs - EEPROM fix - GC 9.4.3 fix amdkfd: - IP version check fix - Fix memory leak in pqm_uninit() drm: - Revert unexport of prime helpers for fd/handle conversion Signed-off-by: Dave Airlie <[email protected]> From: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]