aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2024-03-04btrfs: handle root deletion lookup error in btrfs_del_root()David Sterba1-2/+5
We're deleting a root and looking it up by key does not succeed, this is an inconsistent state and we can't do anything. All callers handle errors and abort a transaction. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: handle block group lookup error when it's being removedDavid Sterba1-1/+3
The unlikely case of lookup error in btrfs_remove_block_group() can be handled properly, in its caller this would lead to a transaction abort. We can't do anything else, a block group must have been loaded first. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: handle invalid range and start in merge_extent_mapping()David Sterba1-4/+5
Turn a BUG_ON to a properly handled error and update the error message in the caller. It is expected that @em_in and @start passed to btrfs_add_extent_mapping() overlap. Besides tests, the only caller btrfs_get_extent() makes sure this is true. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: handle directory and dentry mismatch in btrfs_may_delete()David Sterba1-1/+3
The helper btrfs_may_delete() is a copy of generic fs/namei.c:may_delete() to verify various conditions before deletion. There's a BUG_ON added before linux.git started, we can turn it to a proper error handling at least in our local helper. A mistmatch between directory and the deleted dentry is clearly invalid. This won't be probably ever hit due to the way how the parameters are set from the caller btrfs_ioctl_snap_destroy(), using a VFS helper lookup_one(). Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: use READ/WRITE_ONCE for fs_devices->read_policyNaohiro Aota2-8/+9
Since we can read/modify the value from the sysfs interface concurrently, it would be better to protect it from compiler optimizations. Currently, there is only one read policy BTRFS_READ_POLICY_PID available, so no actual problem can happen now. This is a preparation for the future expansion. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: preallocate temporary extent buffer for inode logging when neededFilipe Manana3-36/+94
When logging an inode and we require to copy items from subvolume leaves to the log tree, we clone each subvolume leaf and than use that clone to copy items to the log tree. This is required to avoid possible deadlocks as stated in commit 796787c978ef ("btrfs: do not modify log tree while holding a leaf from fs tree locked"). The cloning requires allocating an extent buffer (struct extent_buffer) and then allocating pages (folios) to attach to the extent buffer. This may be slow in case we are under memory pressure, and since we are doing the cloning while holding a read lock on a subvolume leaf, it means we can be blocking other operations on that leaf for significant periods of time, which can increase latency on operations like creating other files, renaming files, etc. Similarly because we're under a log transaction, we may also cause extra delay on other tasks doing an fsync, because syncing the log requires waiting for tasks that joined a log transaction to exit the transaction. So to improve this, for any inode logging operation that needs to copy items from a subvolume leaf ("full sync" or "copy everything" bit set in the inode), preallocate a dummy extent buffer before locking any extent buffer from the subvolume tree, and even before joining a log transaction, add it to the log context and then use it when we need to copy items from a subvolume leaf to the log tree. This avoids making other operations get extra latency when waiting to lock a subvolume leaf that is used during inode logging and we are under heavy memory pressure. The following test script with bonnie++ was used to test this: $ cat test.sh #!/bin/bash DEV=/dev/sdh MNT=/mnt/sdh MOUNT_OPTIONS="-o ssd" MEMTOTAL_BYTES=`free -b | grep Mem: | awk '{ print $2 }'` NR_DIRECTORIES=20 NR_FILES=20480 DATASET_SIZE=$((MEMTOTAL_BYTES * 2 / 1048576)) DIRECTORY_SIZE=$((MEMTOTAL_BYTES * 2 / NR_FILES)) NR_FILES=$((NR_FILES / 1024)) echo "performance" | \ tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor umount $DEV &> /dev/null mkfs.btrfs -f $MKFS_OPTIONS $DEV mount $MOUNT_OPTIONS $DEV $MNT bonnie++ -u root -d $MNT \ -n $NR_FILES:$DIRECTORY_SIZE:$DIRECTORY_SIZE:$NR_DIRECTORIES \ -r 0 -s $DATASET_SIZE -b umount $MNT The results of this test on a 8G VM running a non-debug kernel (Debian's default kernel config), were the following. Before this change: Version 2.00a ------Sequential Output------ --Sequential Input- --Random- -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks-- Name:Size etc /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP debian0 7501M 376k 99 1.4g 96 117m 14 1510k 99 2.5g 95 +++++ +++ Latency 35068us 24976us 2944ms 30725us 71770us 26152us Version 2.00a ------Sequential Create------ --------Random Create-------- debian0 -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete-- files:max:min /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP 20:384100:384100/20 20480 32 20480 58 20480 48 20480 39 20480 56 20480 61 Latency 411ms 11914us 119ms 617ms 10296us 110ms After this change: Version 2.00a ------Sequential Output------ --Sequential Input- --Random- -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks-- Name:Size etc /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP debian0 7501M 375k 99 1.4g 97 117m 14 1546k 99 2.3g 98 +++++ +++ Latency 35975us 20945us 2144ms 10297us 2217us 6004us Version 2.00a ------Sequential Create------ --------Random Create-------- debian0 -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete-- files:max:min /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP 20:384100:384100/20 20480 35 20480 58 20480 48 20480 40 20480 57 20480 59 Latency 320ms 11237us 77779us 518ms 6470us 86389us Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: add comment about list_is_singular() use at btrfs_delete_unused_bgs()Filipe Manana1-0/+7
At btrfs_delete_unused_bgs(), the use of the list_is_singular() check on a block group may not be immediately obvious. It is there to prevent losing raid profile information for a block group type (data, metadata or system), as that information is removed from fs_info->avail_[data|metadata|system]_alloc_bits when the last block group of a given type is deleted. So deleting the block group would later result in creating block groups of that type with a single profile (because fs_info->avail_*_alloc_bits would have a value of 0). This check was added in commit aefbe9a633b5 ("btrfs: Fix lost-data-profile caused by auto removing bg"). So add a comment mentioning the need for the check. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: document what the spinlock unused_bgs_lock protectsFilipe Manana1-0/+3
Add some comments to struct btrfs_fs_info to explicitly document which members are protected by the spinlock unused_bgs_lock. It is currently used to protect two linked lists, the reclaim_bgs and unused_bgs lists. So add an explicit comment on top of each list to mention its protected by unused_bgs_lock, as well as comment on top of unused_bgs_lock to mention the lists it protects. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: make btrfs_error_unpin_extent_range() return voidDavid Sterba2-9/+7
This helper is used in transaction abort or cleanup context and the callers cannot handle all errors, only do best effort. btrfs_cleanup_one_transaction btrfs_destroy_delayed_refs btrfs_error_unpin_extent_range btrfs_destroy_pinned_extent btrfs_error_unpin_extent_range Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: return errors from unpin_extent_range()David Sterba2-5/+16
Handle the lookup failure of the block group to unpin, this is a logic error as the block group must exist at this point. If not, something else must have freed it, like clean_pinned_extents() would do without locking the unused_bg_unpin_mutex. Push the errors to the callers, proper handling will be done in followup patches. Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: handle errors returned from unpin_extent_cache()David Sterba2-3/+16
We've had numerous attempts to let function unpin_extent_cache() return void as it only returns 0. There are still error cases to handle so do that, in addition to the verbose messages. The only caller btrfs_finish_one_ordered() will now abort the transaction, previously it let it continue which could lead to further problems. Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: zlib: Fix spelling mistake "infalte" -> "inflate"Colin Ian King1-1/+1
There is a spelling mistake in a warning message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: zstd: fix and simplify the inline extent decompression (v2)Qu Wenruo2-54/+24
Note: this is a fixed version that was previously reverted as e01a83e12604 ("Revert "btrfs: zstd: fix and simplify the inline extent decompression""), with fixed parameters to memzero_page(). [BUG] If we have a filesystem with 4k sectorsize, and an inlined compressed extent created like this: item 4 key (257 INODE_ITEM 0) itemoff 15863 itemsize 160 generation 8 transid 8 size 4096 nbytes 4096 block group 0 mode 100600 links 1 uid 0 gid 0 rdev 0 sequence 1 flags 0x0(none) item 5 key (257 INODE_REF 256) itemoff 15839 itemsize 24 index 2 namelen 14 name: source_inlined item 6 key (257 EXTENT_DATA 0) itemoff 15770 itemsize 69 generation 8 type 0 (inline) inline extent data size 48 ram_bytes 4096 compression 3 (zstd) Then trying to reflink that extent in an aarch64 system with 64K page size, the reflink would just fail: # xfs_io -f -c "reflink $mnt/source_inlined 0 60k 4k" $mnt/dest XFS_IOC_CLONE_RANGE: Input/output error [CAUSE] In zstd_decompress(), we didn't treat @start_byte as just a page offset, but also use it as an indicator on whether we should error out, without any proper explanation (this is copied from other decompression code). In reality, for subpage cases, although @start_byte can be non-zero, we should never switch input/output buffer nor error out, since the whole input/output buffer should never exceed one sector, thus we should not need to do any buffer switch. Thus the current code using @start_byte as a condition to switch input/output buffer or finish the decompression is completely incorrect. [FIX] The fix involves several modification: - Rename @start_byte to @dest_pgoff to properly express its meaning - Use @sectorsize other than PAGE_SIZE to properly initialize the output buffer size - Use correct destination offset inside the destination page - Simplify the main loop Since the input/output buffer should never switch, we only need one zstd_decompress_stream() call. - Consider early end as an error After the fix, even on 64K page sized aarch64, above reflink now works as expected: # xfs_io -f -c "reflink $mnt/source_inlined 0 60k 4k" $mnt/dest linked 4096/4096 bytes at offset 61440 And results the correct file layout: item 9 key (258 INODE_ITEM 0) itemoff 15542 itemsize 160 generation 10 transid 10 size 65536 nbytes 4096 block group 0 mode 100600 links 1 uid 0 gid 0 rdev 0 sequence 1 flags 0x0(none) item 10 key (258 INODE_REF 256) itemoff 15528 itemsize 14 index 3 namelen 4 name: dest item 11 key (258 XATTR_ITEM 3817753667) itemoff 15445 itemsize 83 location key (0 UNKNOWN.0 0) type XATTR transid 10 data_len 37 name_len 16 name: security.selinux data unconfined_u:object_r:unlabeled_t:s0 item 12 key (258 EXTENT_DATA 61440) itemoff 15392 itemsize 53 generation 10 type 1 (regular) extent data disk byte 13631488 nr 4096 extent data offset 0 nr 4096 ram 4096 extent compression 0 (none) Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: remove unused included headersDavid Sterba42-65/+6
With help of neovim, LSP and clangd we can identify header files that are not actually needed to be included in the .c files. This is focused only on removal (with minor fixups), further cleanups are possible but will require doing the header files properly with forward declarations, minimized includes and include-what-you-use care. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: replace i_blocksize by fs_info::sectorsizeDavid Sterba1-2/+2
The block size calculated by i_blocksize from inode is the same as what we have in fs_info, initalized in inode_init_always(). Unify that to use the fs_info value everywhere. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: replace sb::s_blocksize by fs_info::sectorsizeDavid Sterba7-9/+11
The block size stored in the super block is used by subsystems outside of btrfs and it's a copy of fs_info::sectorsize. Unify that to always use our sectorsize, with the exception of mount where we first need to use fixed values (4K) until we read the super block and can set the sectorsize. Replace all uses, in most cases it's fewer pointer indirections. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: remove duplicate recording of physical addressJohannes Thumshirn1-2/+0
Remove the duplicate physical recording of the original write physical address in case of a single device write. This duplicated code is most likely present due to a rebase error. Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: page to folio conversion in btrfs_truncate_block()Goldwyn Rodrigues1-22/+24
Convert use of struct page to struct folio inside btrfs_truncate_block(). The only page based function is set_page_extent_mapped(). All other functions have folio equivalents. Had to use __filemap_get_folio() because filemap_grab_folio() does not allow passing allocation mask as a parameter. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: use a folio array throughout the defrag processMatthew Wilcox (Oracle)1-23/+21
Remove more hidden calls to compound_head() by using an array of folios instead of pages. Also neaten the error path in defrag_one_range() by adjusting the length of the array instead of checking for NULL. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: convert defrag_prepare_one_page() to use a folioMatthew Wilcox (Oracle)1-26/+27
Use a folio throughout defrag_prepare_one_page() to remove dozens of hidden calls to compound_head(). There is no support here for large folios; indeed, turn the existing check for PageCompound into a check for large folios. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: add set_folio_extent_mapped() helperMatthew Wilcox (Oracle)2-4/+9
Turn set_page_extent_mapped() into a wrapper around this version. Saves a call to compound_head() for callers who already have a folio and removes a couple of users of page->mapping. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: WARN_ON_ONCE() in our leak detection codeJosef Bacik3-0/+3
fstests looks for WARN_ON's in dmesg. Add WARN_ON_ONCE() to our leak detection code (enabled only in debug builds) so that fstests will fail if these things trip at all. This will allow us to easily catch problems with our reference counting that may otherwise go unnoticed. Reviewed-by: Neal Gompa <neal@gompa.dev> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: remove extent_map_tree forward declaration at extent_io.hFilipe Manana1-2/+0
There's no need to do a forward declaration of struct extent_map_tree at extent_io.h, as there are no function prototypes, inline functions or data structures that refer to struct extent_map_tree. So remove that forward declaration, which is not needed since commit 477a30ba5f8d ("btrfs: Sink extent_tree arguments in try_release_extent_mapping"). Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: cache folio size and shift in extent_bufferQu Wenruo5-28/+42
After the conversion to folio interfaces (but without the patch to enable larger folio allocation), there is an LTP report about observable performance drop on metadata heavy operations. https://lore.kernel.org/linux-btrfs/202312221750.571925bd-oliver.sang@intel.com/ This drop is caused by the extra code of calculating the folio_size()/folio_shift(), instead of the old hard coded PAGE_SIZE/PAGE_SHIFT. To slightly reduce the overhead, just cache both folio_size and folio_shift in extent_buffer. The two new members (u32 folio_size and u8 folio_shift) are stored inside the holes of extent_buffer. folio_size is shared with len, which is reduced to u32. The size of eb does not change. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: remove unused variable bio_offset from end_bbio_data_read()Qu Wenruo1-9/+0
The variable @bio_offset was introduced in commit 7ffd27e378d2 ("btrfs: pass bio_offset to check_data_csum() directly"), when we are still using the same endio function for both data and metadata. Later we had several changes to data and metadata endio functions: - Data verification is handled by btrfs bio layer - Split data and metadata endio paths Now for data path we no longer do any verification in end_bbio_data_read(), as the verification is handled by btrfs bio layer already. Thus there is no need for such bio_offset variable. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04btrfs: remove the pg_offset parameter from btrfs_get_extent()Qu Wenruo5-44/+36
The parameter @pg_offset of btrfs_get_extent() is only utilized for inlined extent, and we already have an ASSERT() and tree-checker, to make sure we can only get inline extent at file offset 0. Any invalid inline extent with non-zero file offset would be rejected by tree-checker in the first place. Thus the @pg_offset parameter is not really necessary, just remove it. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-03Linux 6.8-rc7Linus Torvalds1-1/+1
2024-03-03Merge tag 'phy-fixes2-6.8' of ↵Linus Torvalds4-111/+69
git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy Pull phy fixes from Vinod Koul: - qcom: m31 pointer err fix, eusb2 fix redundant zero-out loop and v3 offset fix on qmp-usb - freescale: fix for dphy alias * tag 'phy-fixes2-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: phy: qcom-qmp-usb: fix v3 offsets data phy: qualcomm: eusb2-repeater: Rework init to drop redundant zero-out loop phy: qcom: phy-qcom-m31: fix wrong pointer pass to PTR_ERR() phy: freescale: phy-fsl-imx8-mipi-dphy: Fix alias name to use dashes
2024-03-03Merge tag 'dmaengine-fix2-6.8' of ↵Linus Torvalds13-47/+85
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine Pull dmaengine fixes from Vinod Koul: - dw-edma fixes to improve driver and remote HDMA setup - fsl-edma fixes for SoC hange, irq init and byte calculations and sparse fixes - idxd: safe user copy of completion record fix - ptdma: consistent DMA mask fix * tag 'dmaengine-fix2-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: dmaengine: ptdma: use consistent DMA masks dmaengine: fsl-qdma: add __iomem and struct in union to fix sparse warning dmaengine: idxd: Ensure safe user copy of completion record dmaengine: fsl-edma: correct max_segment_size setting dmaengine: idxd: Remove shadow Event Log head stored in idxd dmaengine: fsl-edma: correct calculation of 'nbytes' in multi-fifo scenario dmaengine: fsl-qdma: init irq after reg initialization dmaengine: fsl-qdma: fix SoC may hang on 16 byte unaligned read dmaengine: dw-edma: eDMA: Add sync read before starting the DMA transfer in remote setup dmaengine: dw-edma: HDMA: Add sync read before starting the DMA transfer in remote setup dmaengine: dw-edma: Add HDMA remote interrupt configuration dmaengine: dw-edma: HDMA_V0_REMOTEL_STOP_INT_EN typo fix dmaengine: dw-edma: Fix wrong interrupt bit set for HDMA dmaengine: dw-edma: Fix the ch_count hdma callback
2024-03-03Merge tag 'powerpc-6.8-5' of ↵Linus Torvalds4-65/+120
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix IOMMU table initialisation when doing kdump over SR-IOV - Fix incorrect RTAS function name for resetting TCE tables - Fix fpu_signal selftest failures since a recent change Thanks to Gaurav Batra and Nathan Lynch. * tag 'powerpc-6.8-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: selftests/powerpc: Fix fpu_signal failures powerpc/rtas: use correct function name for resetting TCE tables powerpc/pseries/iommu: IOMMU table is not initialized for kdump over SR-IOV
2024-03-03Merge tag 'x86_urgent_for_v6.8_rc7' of ↵Linus Torvalds3-92/+98
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Do not reserve SETUP_RNG_SEED setup data in the e820 map as it should be used by kexec only - Make sure MKTME feature detection happens at an earlier time in the boot process so that the physical address size supported by the CPU is properly corrected and MTRR masks are programmed properly, leading to TDX systems booting without disable_mtrr_cleanup on the cmdline - Make sure the different address sizes supported by the CPU are read out as early as possible * tag 'x86_urgent_for_v6.8_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/e820: Don't reserve SETUP_RNG_SEED in e820 x86/cpu/intel: Detect TME keyid bits before setting MTRR mask registers x86/cpu: Allow reducing x86_phys_bits during early_identify_cpu()
2024-03-02Merge tag 'firewire-fixes-6.8-rc7' of ↵Linus Torvalds1-1/+13
git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394 Pull firewire fixes from Takashi Sakamoto: "A workaround to suppress the continuous bus resets in the case that older devices are connected to the modern 1394 OHCI hardware and devices In IEEE 1394 Amendment (IEEE 1394a-2000), the short bus reset is added to resolve the shortcomings of the long bus reset in IEEE 1394-1995. However, it is well-known that the solution is not necessarily effective in the mixing environment that both IEEE 1394-1995 PHY and IEEE 1394a-2000 (or later) PHY exist, as described in section 8.4.6.2 of IEEE 1394a-2000. The current implementation of firewire stack schedules the short bus reset when attempting to resolve the mismatch of gap count in the certain generation of bus topology. It can cause the continuous bus reset in the issued environment. The workaround simply uses the long bus reset instead of the short bus reset. It is desirable to detect whether the issued environment or not. However, the way to access PHY registers from remote note is firstly defined in IEEE 1394a-2000, thus it is not available in the case" * tag 'firewire-fixes-6.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394: firewire: core: use long bus reset on gap count error
2024-03-02Merge tag 'xfs-6.8-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds1-1/+0
Pull xfs fix from Chandan Babu: "Drop experimental warning message when mounting an xfs filesystem on an fsdax device. We now consider xfs on fsdax to be stable" * tag 'xfs-6.8-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: drop experimental warning for FSDAX
2024-03-02Merge tag 'gpio-fixes-for-v6.8-rc7' of ↵Linus Torvalds2-7/+7
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux Pull gpio fixes from Bartosz Golaszewski: - fix resource freeing ordering in error path when adding a GPIO chip - only set pins to output after the reset is complete in gpio-74x164 * tag 'gpio-fixes-for-v6.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: gpio: fix resource unwinding order in error path gpiolib: Fix the error path order in gpiochip_add_data_with_key() gpio: 74x164: Enable output pins after registers are reset
2024-03-02block: define bvec_iter as __packed __aligned(4)Ming Lei1-1/+1
In commit 19416123ab3e ("block: define 'struct bvec_iter' as packed"), what we need is to save the 4byte padding, and avoid `bio` to spread on one extra cache line. It is enough to define it as '__packed __aligned(4)', as '__packed' alone means byte aligned, and can cause compiler to generate horrible code on architectures that don't support unaligned access in case that bvec_iter is embedded in other structures. Cc: Mikulas Patocka <mpatocka@redhat.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Fixes: 19416123ab3e ("block: define 'struct bvec_iter' as packed") Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-03-02Merge tag 'scsi-fixes' of ↵Linus Torvalds2-2/+9
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Two small fixes, all in drivers (the more obsolete mpt3sas and the newer mpi3mr)" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: mpt3sas: Prevent sending diag_reset when the controller is ready scsi: mpi3mr: Reduce stack usage in mpi3mr_refresh_sas_ports()
2024-03-01Merge tag 'for-v6.8-rc2' of ↵Linus Torvalds2-1/+4
git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply Pull power supply fixes from Sebastian Reichel: - Kconfig dependency fix - bq27xxx-i2c: do not free non-existing IRQ * tag 'for-v6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply: power: supply: bq27xxx-i2c: Do not free non existing IRQ power: supply: mm8013: select REGMAP_I2C
2024-03-01Merge tag 'for-linus-iommufd' of ↵Linus Torvalds2-24/+54
git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd Pull iommufd fixes from Jason Gunthorpe: "Four syzkaller found bugs: - Corruption during error unwind in iommufd_access_change_ioas() - Overlapping IDs in the test suite due to out of order destruction - Missing locking for access->ioas in the test suite - False failures in the test suite validation logic with huge pages" * tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd: iommufd/selftest: Don't check map/unmap pairing with HUGE_PAGES iommufd: Fix protection fault in iommufd_test_syz_conv_iova iommufd/selftest: Fix mock_dev_num bug iommufd: Fix iopt_access_list_id overwrite bug
2024-03-01Merge tag 'devicetree-fixes-for-6.8-2' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull devicetree fix from Rob Herring: "One fix for a bug in fw_devlink handling of OF graph. This doesn't completely fix the reported problems, but it's with users adding out of tree code" * tag 'devicetree-fixes-for-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: of: property: fw_devlink: Fix stupid bug in remote-endpoint parsing
2024-03-01of: property: fw_devlink: Fix stupid bug in remote-endpoint parsingSaravana Kannan1-1/+1
Introduced a stupid bug in commit 782bfd03c3ae ("of: property: Improve finding the supplier of a remote-endpoint property") due to a last minute incorrect edit of "index !=0" into "!index". This patch fixes it to be "index > 0" to match the comment right next to it. Reported-by: Luca Ceresoli <luca.ceresoli@bootlin.com> Link: https://lore.kernel.org/lkml/20240223171849.10f9901d@booty/ Fixes: 782bfd03c3ae ("of: property: Improve finding the supplier of a remote-endpoint property") Signed-off-by: Saravana Kannan <saravanak@google.com> Reviewed-by: Herve Codina <herve.codina@bootlin.com> Reviewed-by: Luca Ceresoli <luca.ceresoli@bootlin.com> Tested-by: Luca Ceresoli <luca.ceresoli@bootlin.com> Link: https://lore.kernel.org/r/20240224052436.3552333-1-saravanak@google.com Signed-off-by: Rob Herring <robh@kernel.org>
2024-03-01Merge tag 'riscv-for-linus-6.8-rc7' of ↵Linus Torvalds21-113/+145
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - detect ".option arch" support on not-yet-released LLVM builds - fix missing TLB flush when modifying non-leaf PTEs - fixes for T-Head custom extensions - fix for systems with the legacy PMU, that manifests as a crash on kernels built without SBI PMU support - fix for systems that clear *envcfg on suspend, which manifests as cbo.zero trapping after resume - fixes for Svnapot systems, including removing Svnapot support for huge vmalloc/vmap regions * tag 'riscv-for-linus-6.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: Sparse-Memory/vmemmap out-of-bounds fix riscv: Fix pte_leaf_size() for NAPOT Revert "riscv: mm: support Svnapot in huge vmap" riscv: Save/restore envcfg CSR during CPU suspend riscv: Add a custom ISA extension for the [ms]envcfg CSR riscv: Fix enabling cbo.zero when running in M-mode perf: RISCV: Fix panic on pmu overflow handler MAINTAINERS: Update SiFive driver maintainers drivers: perf: ctr_get_width function for legacy is not defined drivers: perf: added capabilities for legacy PMU RISC-V: Ignore V from the riscv,isa DT property on older T-Head CPUs riscv: Fix build error if !CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION riscv: mm: fix NOCACHE_THEAD does not set bit[61] correctly riscv: add CALLER_ADDRx support RISC-V: Drop invalid test from CONFIG_AS_HAS_OPTION_ARCH kbuild: Add -Wa,--fatal-warnings to as-instr invocation riscv: tlb: fix __p*d_free_tlb()
2024-03-01Merge tag 'ceph-for-6.8-rc7' of https://github.com/ceph/ceph-clientLinus Torvalds2-4/+9
Pull ceph fix from Ilya Dryomov: "Catch up with mdsmap encoding rectification which ended up being necessary after all to enable cluster upgrades from problematic v18.2.0 and v18.2.1 releases" * tag 'ceph-for-6.8-rc7' of https://github.com/ceph/ceph-client: ceph: switch to corrected encoding of max_xattr_size in mdsmap
2024-03-01Merge tag 'for-6.8-rc6-tag' of ↵Linus Torvalds6-35/+139
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - fix freeing allocated id for anon dev when snapshot creation fails - fiemap fixes: - followup for a recent deadlock fix, ranges that fiemap can access can still race with ordered extent completion - make sure fiemap with SYNC flag does not race with writes * tag 'for-6.8-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: fix double free of anonymous device after snapshot creation failure btrfs: ensure fiemap doesn't race with writes when FIEMAP_FLAG_SYNC is given btrfs: fix race between ordered extent completion and fiemap
2024-03-01Merge tag 'exfat-for-6.8-rc7' of ↵Linus Torvalds1-15/+22
git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat Pull exfat fix from Namjae Jeon: - Fix ftruncate failure when allocating non-contiguous clusters * tag 'exfat-for-6.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat: exfat: fix appending discontinuous clusters to empty file
2024-03-01Merge tag 'vfs-6.8-rc7.fixes' of ↵Linus Torvalds2-17/+14
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: "Two small fixes: - Fix an endless loop during afs directory iteration caused by not skipping silly-rename files correctly. - Fix reporting of completion events for aio causing leaks in userspace. This is based on the fix last week as it's now possible to recognize aio events submitted through the old aio interface" * tag 'vfs-6.8-rc7.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: fs/aio: Make io_cancel() generate completions again afs: Fix endless loop in directory parsing
2024-03-01Merge tag 'iommu-fix-v6.8-rc6' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu fix from Joerg Roedel: - Fix SVA handle sharing in multi device case * tag 'iommu-fix-v6.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/sva: Fix SVA handle sharing in multi device case
2024-03-01Merge tag 'mmc-v6.8-rc4' of ↵Linus Torvalds3-9/+65
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC fixes from Ulf Hansson: "MMC core: - Fix eMMC initialization with 1-bit bus connection MMC host: - mmci: Fix DMA API overlapping mappings for the stm32 variant - sdhci-xenon: Fix PHY stability issues" * tag 'mmc-v6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: sdhci-xenon: add timeout for PHY init complete mmc: sdhci-xenon: fix PHY init clock stability mmc: mmci: stm32: fix DMA API overlapping mappings warning mmc: core: Fix eMMC initialization with 1-bit bus connection
2024-03-01Merge tag 'pmdomain-v6.8-rc4' of ↵Linus Torvalds2-2/+8
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm Pull pmdomain fixes from Ulf Hansson: - qcom: Fix enabled_corner aggregation for rpmhpd - arm: Fix NULL dereference on scmi_perf_domain removal * tag 'pmdomain-v6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm: pmdomain: qcom: rpmhpd: Fix enabled_corner aggregation pmdomain: arm: Fix NULL dereference on scmi_perf_domain removal
2024-03-01Merge tag 'efi-fixes-for-v6.8-2' of ↵Linus Torvalds4-18/+16
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI fixes from Ard Biesheuvel: "Only the EFI variable name size change is significant, and will be backported once it lands. The others are cleanup. - Fix phys_addr_t size confusion in 32-bit capsule loader - Reduce maximum EFI variable name size to 512 to work around buggy firmware - Drop some redundant code from efivarfs while at it" * tag 'efi-fixes-for-v6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: efivarfs: Drop 'duplicates' bool parameter on efivar_init() efivarfs: Drop redundant cleanup on fill_super() failure efivarfs: Request at most 512 bytes for variable names efi/capsule-loader: fix incorrect allocation size
2024-03-01Merge tag 'sound-6.8-rc7' of ↵Linus Torvalds15-14/+96
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "The amount of changes wasn't as small as wished, but all reasonably small fixes. There is a PCM core API change, which is for correcting the behavior change we took in 6.8. The rest are device-specific fixes for ASoC AMD, Qualcomm, Cirrus codecs, HD-audio quirks & co" * tag 'sound-6.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ASoC: amd: yc: Fix non-functional mic on Lenovo 21J2 ASoC: amd: yc: add new YC platform variant (0x63) support ALSA: hda/realtek - ALC285 reduce pop noise from Headphone port ASoC: amd: yc: Add Lenovo ThinkBook 21J0 into DMI quirk table ALSA: hda/realtek: Add special fixup for Lenovo 14IRP8 ASoC: soc-card: Fix missing locking in snd_soc_card_get_kcontrol() ALSA: hda/realtek: tas2781: enable subwoofer volume control ALSA: pcm: clarify and fix default msbits value for all formats ASoC: qcom: Fix uninitialized pointer dmactl ALSA: hda/realtek: fix mute/micmute LED For HP mt440 ALSA: Drop leftover snd-rtctimer stuff from Makefile ALSA: ump: Fix the discard error code from snd_ump_legacy_open() ALSA: hda/realtek: Enable Mute LED on HP 840 G8 (MB 8AB8) ASoC: cs35l56: Must clear HALO_STATE before issuing SYSTEM_RESET ALSA: hda/realtek: Fix top speaker connection on Dell Inspiron 16 Plus 7630 ALSA: firewire-lib: fix to check cycle continuity