blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2021-04-09	ext4: fix various seppling typos	Bhaskar Chowdhury	8	-10/+10
	Signed-off-by: Bhaskar Chowdhury <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-04-09	ext4: fix error return code in ext4_fc_perform_commit()	Xu Yihang	1	-1/+3
	In case of if not ext4_fc_add_tlv branch, an error return code is missing. Cc: [email protected] Fixes: aa75f4d3daae ("ext4: main fast-commit commit path") Reported-by: Hulk Robot <[email protected]> Signed-off-by: Xu Yihang <[email protected]> Reviewed-by: Harshad Shirwadkar <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-04-09	ext4: annotate data race in jbd2_journal_dirty_metadata()	Jan Kara	1	-4/+4
	Assertion checks in jbd2_journal_dirty_metadata() are known to be racy but we don't want to be grabbing locks just for them. We thus recheck them under b_state_lock only if it looks like they would fail. Annotate the checks with data_race(). Cc: [email protected] Reported-by: Hao Sun <[email protected]> Signed-off-by: Jan Kara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-04-09	ext4: annotate data race in start_this_handle()	Jan Kara	1	-1/+6
	Access to journal->j_running_transaction is not protected by appropriate lock and thus is racy. We are well aware of that and the code handles the race properly. Just add a comment and data_race() annotation. Cc: [email protected] Reported-by: [email protected] Signed-off-by: Jan Kara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-04-09	ext4: fix ext4_error_err save negative errno into superblock	Ye Bin	1	-1/+1
	Fix As write_mmp_block() so that it returns -EIO instead of 1, so that the correct error gets saved into the superblock. Cc: [email protected] Fixes: 54d3adbc29f0 ("ext4: save all error info in save_error_info() and drop ext4_set_errno()") Reported-by: Liu Zhi Qiang <[email protected]> Signed-off-by: Ye Bin <[email protected]> Reviewed-by: Andreas Dilger <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-04-09	ext4: fix error code in ext4_commit_super	Fengnan Chang	1	-2/+4
	We should set the error code when ext4_commit_super check argument failed. Found in code review. Fixes: c4be0c1dc4cdc ("filesystem freeze: add error handling of write_super_lockfs/unlockfs"). Cc: [email protected] Signed-off-by: Fengnan Chang <[email protected]> Reviewed-by: Andreas Dilger <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-04-09	ext4: always panic when errors=panic is specified	Ye Bin	1	-3/+4
	Before commit 014c9caa29d3 ("ext4: make ext4_abort() use __ext4_error()"), the following series of commands would trigger a panic: 1. mount /dev/sda -o ro,errors=panic test 2. mount /dev/sda -o remount,abort test After commit 014c9caa29d3, remounting a file system using the test mount option "abort" will no longer trigger a panic. This commit will restore the behaviour immediately before commit 014c9caa29d3. (However, note that the Linux kernel's behavior has not been consistent; some previous kernel versions, including 5.4 and 4.19 similarly did not panic after using the mount option "abort".) This also makes a change to long-standing behaviour; namely, the following series commands will now cause a panic, when previously it did not: 1. mount /dev/sda -o ro,errors=panic test 2. echo test > /sys/fs/ext4/sda/trigger_fs_error However, this makes ext4's behaviour much more consistent, so this is a good thing. Cc: [email protected] Fixes: 014c9caa29d3 ("ext4: make ext4_abort() use __ext4_error()") Signed-off-by: Ye Bin <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-04-09	ext4: delete redundant uptodate check for buffer	Yang Guo	1	-4/+2
	The buffer uptodate state has been checked in function set_buffer_uptodate, there is no need use buffer_uptodate before calling set_buffer_uptodate and delete it. Cc: "Theodore Ts'o" <[email protected]> Cc: Andreas Dilger <[email protected]> Signed-off-by: Yang Guo <[email protected]> Signed-off-by: Shaokun Zhang <[email protected]> Reviewed-by: Ritesh Harjani <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-04-09	ext4: do not set SB_ACTIVE in ext4_orphan_cleanup()	Zhang Yi	1	-3/+0
	When CONFIG_QUOTA is enabled, if we failed to mount the filesystem due to some error happens behind ext4_orphan_cleanup(), it will end up triggering a after free issue of super_block. The problem is that ext4_orphan_cleanup() will set SB_ACTIVE flag if CONFIG_QUOTA is enabled, after we cleanup the truncated inodes, the last iput() will put them into the lru list, and these inodes' pages may probably dirty and will be write back by the writeback thread, so it could be raced by freeing super_block in the error path of mount_bdev(). After check the setting of SB_ACTIVE flag in ext4_orphan_cleanup(), it was used to ensure updating the quota file properly, but evict inode and trash data immediately in the last iput does not affect the quotafile, so setting the SB_ACTIVE flag seems not required[1]. Fix this issue by just remove the SB_ACTIVE setting. [1] https://lore.kernel.org/linux-ext4/[email protected]/T/#m04990cfbc4f44592421736b504afcc346b2a7c00 Cc: [email protected] Signed-off-by: Zhang Yi <[email protected]> Tested-by: Jan Kara <[email protected]> Reviewed-by: Jan Kara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-04-09	ext4: make prefetch_block_bitmaps default	Harshad Shirwadkar	2	-8/+9
	Block bitmap prefetching is needed for these allocator optimization data structures to get populated and provide better group scanning order. So, turn it on bu default. prefetch_block_bitmaps mount option is now marked as removed and a new option no_prefetch_block_bitmaps is added to disable block bitmap prefetching. Signed-off-by: Harshad Shirwadkar <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-04-09	ext4: add proc files to monitor new structures	Harshad Shirwadkar	3	-0/+89
	This patch adds a new file "mb_structs_summary" which allows us to see the summary of the new allocator structures added in this series. Here's the sample output of file: optimize_scan: 1 max_free_order_lists: list_order_0_groups: 0 list_order_1_groups: 0 list_order_2_groups: 0 list_order_3_groups: 0 list_order_4_groups: 0 list_order_5_groups: 0 list_order_6_groups: 0 list_order_7_groups: 0 list_order_8_groups: 0 list_order_9_groups: 0 list_order_10_groups: 0 list_order_11_groups: 0 list_order_12_groups: 0 list_order_13_groups: 40 fragment_size_tree: tree_min: 16384 tree_max: 32768 tree_nodes: 40 Signed-off-by: Harshad Shirwadkar <[email protected]> Reviewed-by: Andreas Dilger <[email protected]> Reviewed-by: Ritesh Harjani <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-04-09	ext4: improve cr 0 / cr 1 group scanning	Harshad Shirwadkar	5	-15/+452
	Instead of traversing through groups linearly, scan groups in specific orders at cr 0 and cr 1. At cr 0, we want to find groups that have the largest free order >= the order of the request. So, with this patch, we maintain lists for each possible order and insert each group into a list based on the largest free order in its buddy bitmap. During cr 0 allocation, we traverse these lists in the increasing order of largest free orders. This allows us to find a group with the best available cr 0 match in constant time. If nothing can be found, we fallback to cr 1 immediately. At CR1, the story is slightly different. We want to traverse in the order of increasing average fragment size. For CR1, we maintain a rb tree of groupinfos which is sorted by average fragment size. Instead of traversing linearly, at CR1, we traverse in the order of increasing average fragment size, starting at the most optimal group. This brings down cr 1 search complexity to log(num groups). For cr >= 2, we just perform the linear search as before. Also, in case of lock contention, we intermittently fallback to linear search even in CR 0 and CR 1 cases. This allows us to proceed during the allocation path even in case of high contention. There is an opportunity to do optimization at CR2 too. That's because at CR2 we only consider groups where bb_free counter (number of free blocks) is greater than the request extent size. That's left as future work. All the changes introduced in this patch are protected under a new mount option "mb_optimize_scan". With this patchset, following experiment was performed: Created a highly fragmented disk of size 65TB. The disk had no contiguous 2M regions. Following command was run consecutively for 3 times: time dd if=/dev/urandom of=file bs=2M count=10 Here are the results with and without cr 0/1 optimizations introduced in this patch: \|---------+------------------------------+---------------------------\| \| \| Without CR 0/1 Optimizations \| With CR 0/1 Optimizations \| \|---------+------------------------------+---------------------------\| \| 1st run \| 5m1.871s \| 2m47.642s \| \| 2nd run \| 2m28.390s \| 0m0.611s \| \| 3rd run \| 2m26.530s \| 0m1.255s \| \|---------+------------------------------+---------------------------\| Signed-off-by: Harshad Shirwadkar <[email protected]> Reported-by: kernel test robot <[email protected]> Reported-by: Dan Carpenter <[email protected]> Reviewed-by: Andreas Dilger <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-04-09	ext4: add MB_NUM_ORDERS macro	Harshad Shirwadkar	2	-9/+15
	A few arrays in mballoc.c use the total number of valid orders as their size. Currently, this value is set as "sb->s_blocksize_bits + 2". This makes code harder to read. So, instead add a new macro MB_NUM_ORDERS(sb) to make the code more readable. Signed-off-by: Harshad Shirwadkar <[email protected]> Reviewed-by: Andreas Dilger <[email protected]> Reviewed-by: Ritesh Harjani <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-04-09	ext4: add mballoc stats proc file	Harshad Shirwadkar	3	-2/+80
	Add new stats for measuring the performance of mballoc. This patch is forked from Artem Blagodarenko's work that can be found here: https://github.com/lustre/lustre-release/blob/master/ldiskfs/kernel_patches/patches/rhel8/ext4-simple-blockalloc.patch This patch reorganizes the stats by cr level. This is how the output looks like: mballoc: reqs: 0 success: 0 groups_scanned: 0 cr0_stats: hits: 0 groups_considered: 0 useless_loops: 0 bad_suggestions: 0 cr1_stats: hits: 0 groups_considered: 0 useless_loops: 0 bad_suggestions: 0 cr2_stats: hits: 0 groups_considered: 0 useless_loops: 0 cr3_stats: hits: 0 groups_considered: 0 useless_loops: 0 extents_scanned: 0 goal_hits: 0 2^n_hits: 0 breaks: 0 lost: 0 buddies_generated: 0/40 buddies_time_used: 0 preallocated: 0 discarded: 0 Signed-off-by: Harshad Shirwadkar <[email protected]> Reviewed-by: Andreas Dilger <[email protected]> Reviewed-by: Ritesh Harjani <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-04-09	ext4: add ability to return parsed options from parse_options	Harshad Shirwadkar	1	-21/+30
	Before this patch, the function parse_options() was returning journal_devnum and journal_ioprio variables to the caller. This patch generalizes that interface to allow parse_options to return any parsed options to return back to the caller. In this patch series, it gets used to capture the value of "mb_optimize_scan=%u" mount option. Signed-off-by: Harshad Shirwadkar <[email protected]> Reviewed-by: Ritesh Harjani <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-04-09	ext4: drop s_mb_bal_lock and convert protected fields to atomic	Harshad Shirwadkar	2	-11/+7
	s_mb_buddies_generated gets used later in this patch series to determine if the cr 0 and cr 1 optimziations should be performed or not. Currently, s_mb_buddies_generated is protected under a spin_lock. In the allocation path, it is better if we don't depend on the lock and instead read the value atomically. In order to do that, we drop s_bal_lock altogether and we convert the only two protected fields by it s_mb_buddies_generated and s_mb_generation_time to atomic type. Signed-off-by: Harshad Shirwadkar <[email protected]> Reviewed-by: Andreas Dilger <[email protected]> Reviewed-by: Ritesh Harjani <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-04-09	ext4: fix check to prevent false positive report of incorrect used inodes	Zhang Yi	1	-16/+32
	Commit <50122847007> ("ext4: fix check to prevent initializing reserved inodes") check the block group zero and prevent initializing reserved inodes. But in some special cases, the reserved inode may not all belong to the group zero, it may exist into the second group if we format filesystem below. mkfs.ext4 -b 4096 -g 8192 -N 1024 -I 4096 /dev/sda So, it will end up triggering a false positive report of a corrupted file system. This patch fix it by avoid check reserved inodes if no free inode blocks will be zeroed. Cc: [email protected] Fixes: 50122847007 ("ext4: fix check to prevent initializing reserved inodes") Signed-off-by: Zhang Yi <[email protected]> Suggested-by: Jan Kara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-04-05	jbd2: avoid -Wempty-body warnings	Arnd Bergmann	1	-1/+1
	Building with 'make W=1' shows a harmless -Wempty-body warning: fs/jbd2/recovery.c: In function 'fc_do_one_pass': fs/jbd2/recovery.c:267:75: error: suggest braces around empty body in an 'if' statement [-Werror=empty-body] 267 \| jbd_debug(3, "Fast commit replay failed, err = %d\n", err); \| ^ Change the empty dprintk() macros to no_printk(), which avoids this warning and adds format string checking. Signed-off-by: Arnd Bergmann <[email protected]> Reviewed-by: Jan Kara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-04-05	ext4: optimize match for casefolded encrypted dirs	Daniel Rosenberg	2	-34/+38
	Matching names with casefolded encrypting directories requires decrypting entries to confirm case since we are case preserving. We can avoid needing to decrypt if our hash values don't match. Signed-off-by: Daniel Rosenberg <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-04-05	ext4: handle casefolding with encryption	Daniel Rosenberg	8	-89/+287
	This adds support for encryption with casefolding. Since the name on disk is case preserving, and also encrypted, we can no longer just recompute the hash on the fly. Additionally, to avoid leaking extra information from the hash of the unencrypted name, we use siphash via an fscrypt v2 policy. The hash is stored at the end of the directory entry for all entries inside of an encrypted and casefolded directory apart from those that deal with '.' and '..'. This way, the change is backwards compatible with existing ext4 filesystems. [ Changed to advertise this feature via the file: /sys/fs/ext4/features/encrypted_casefold -- TYT ] Signed-off-by: Daniel Rosenberg <[email protected]> Reviewed-by: Andreas Dilger <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-04-02	ext4: remove unnecessary braces in fs/ext4/dir.c	Milan Djurovic	1	-2/+2
	Removes braces to follow the coding style. Signed-off-by: Milan Djurovic <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-04-02	ext4: delete some unused tracepoint definitions	Eric Whitney	1	-176/+0
	A number of tracepoint instances have been removed from ext4 by past patches but the definitions of those tracepoints have not. All instances of ext4_ext_in_cache and ext4_ext_put_in_cache were removed by commit 69eb33dc24dc ("ext4: remove single extent cache"). ext4_get_reserved_cluster_alloc was removed by commit b6bf9171ef5c ("ext4: reduce reserved cluster count by number of allocated clusters"). ext4_find_delalloc_range was removed by commit 7d1b1fbc95eb ("ext4: reimplement ext4_find_delay_alloc_range on extent status tree"). All instances of ext4_direct_IO_enter and ext4_direct_IO_exit were removed by commit 378f32bab371 ("ext4: introduce direct I/O write using iomap infrastructure"). Signed-off-by: Eric Whitney <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-04-02	Updated locking documentation for transaction_t	Alexander Lochmann	1	-6/+12
	Some members of transaction_t are allowed to be read without any lock being held if accessed from the correct context. We used LockDoc's findings to determine those members. Each member of them is marked with a short comment: "no lock needed for jbd2 thread". Signed-off-by: Alexander Lochmann <[email protected]> Signed-off-by: Horst Schirmeier <[email protected]> Reviewed-by: Jan Kara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-04-02	ext4: updated locking documentation for journal_t	Alexander Lochmann	1	-5/+8
	Some members of transaction_t are allowed to be read without any lock being held if consistency doesn't matter. Based on LockDoc's findings, we extended the locking documentation of those members. Each one of them is marked with a short comment: "no lock for quick racy checks". Signed-off-by: Alexander Lochmann <[email protected]> Signed-off-by: Horst Schirmeier <[email protected]> Reviewed-by: Jan Kara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-03-25	ext4: use memcpy_to_page() in pagecache_write()	Chaitanya Kulkarni	1	-4/+1
	Signed-off-by: Chaitanya Kulkarni <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-03-25	ext4: use memcpy_from_page() in pagecache_read()	Chaitanya Kulkarni	1	-4/+1
	Signed-off-by: Chaitanya Kulkarni <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-03-21	Linux 5.12-rc4	Linus Torvalds	1	-1/+1

2021-03-21	Merge tag 'ext4_for_linus_stable' of ↵	Linus Torvalds	11	-72/+168
	git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "Miscellaneous ext4 bug fixes for v5.12" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: initialize ret to suppress smatch warning ext4: stop inode update before return ext4: fix rename whiteout with fast commit ext4: fix timer use-after-free on failed mount ext4: fix potential error in ext4_do_update_inode ext4: do not try to set xattr into ea_inode if value is empty ext4: do not iput inode under running transaction in ext4_rename() ext4: find old entry again if failed to rename whiteout ext4: fix error handling in ext4_end_enable_verity() ext4: fix bh ref count on error paths fs/ext4: fix integer overflow in s_log_groups_per_flex ext4: add reclaim checks to xattr code ext4: shrink race window in ext4_should_retry_alloc()
2021-03-21	Merge tag 'io_uring-5.12-2021-03-21' of git://git.kernel.dk/linux-block	Linus Torvalds	3	-7/+31
	Pull io_uring followup fixes from Jens Axboe: - The SIGSTOP change from Eric, so we properly ignore that for PF_IO_WORKER threads. - Disallow sending signals to PF_IO_WORKER threads in general, we're not interested in having them funnel back to the io_uring owning task. - Stable fix from Stefan, ensuring we properly break links for short send/sendmsg recv/recvmsg if MSG_WAITALL is set. - Catch and loop when needing to run task_work before a PF_IO_WORKER threads goes to sleep. * tag 'io_uring-5.12-2021-03-21' of git://git.kernel.dk/linux-block: io_uring: call req_set_fail_links() on short send[msg]()/recv[msg]() with MSG_WAITALL io-wq: ensure task is running before processing task_work signal: don't allow STOP on PF_IO_WORKER threads signal: don't allow sending any signals to PF_IO_WORKER threads
2021-03-21	Merge tag 'staging-5.12-rc4' of ↵	Linus Torvalds	14	-48/+75
	git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging and IIO driver fixes from Greg KH: "Some small staging and IIO driver fixes: - MAINTAINERS changes for the move of the staging mailing list - comedi driver fixes to get request_irq() to work correctly - counter driver fixes for reported issues with iio devices - tiny iio driver fixes for reported issues. All of these have been in linux-next with no reported problems" * tag 'staging-5.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: staging: vt665x: fix alignment constraints staging: comedi: cb_pcidas64: fix request_irq() warn staging: comedi: cb_pcidas: fix request_irq() warn MAINTAINERS: move the staging subsystem to lists.linux.dev MAINTAINERS: move some real subsystems off of the staging mailing list iio: gyro: mpu3050: Fix error handling in mpu3050_trigger_handler iio: hid-sensor-temperature: Fix issues of timestamp channel iio: hid-sensor-humidity: Fix alignment issue of timestamp channel counter: stm32-timer-cnt: fix ceiling miss-alignment with reload register counter: stm32-timer-cnt: fix ceiling write max value counter: stm32-timer-cnt: Report count function when SLAVE_MODE_DISABLED iio: adc: ab8500-gpadc: Fix off by 10 to 3 iio:adc:stm32-adc: Add HAS_IOMEM dependency iio: adis16400: Fix an error code in adis16400_initial_setup() iio: adc: adi-axi-adc: add proper Kconfig dependencies iio: adc: ad7949: fix wrong ADC result due to incorrect bit mask iio: hid-sensor-prox: Fix scale not correct issue iio:adc:qcom-spmi-vadc: add default scale to LR_MUX2_BAT_ID channel
2021-03-21	Merge tag 'usb-5.12-rc4' of ↵	Linus Torvalds	11	-25/+62
	git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB and Thunderbolt driver fixes from Greg KH: "Here are some small Thunderbolt and USB driver fixes for some reported issues: - thunderbolt fixes for minor problems - typec fixes for power issues - usb-storage quirk addition - usbip bugfix - dwc3 bugfix when stopping transfers - cdnsp bugfix for isoc transfers - gadget use-after-free fix All have been in linux-next this week with no reported issues" * tag 'usb-5.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: usb: typec: tcpm: Skip sink_cap query only when VDM sm is busy usb: dwc3: gadget: Prevent EP queuing while stopping transfers usb: typec: tcpm: Invoke power_supply_changed for tcpm-source-psy- usb: typec: Remove vdo[3] part of tps6598x_rx_identity_reg struct usb-storage: Add quirk to defeat Kindle's automatic unload usb: gadget: configfs: Fix KASAN use-after-free usbip: Fix incorrect double assignment to udc->ud.tcp_rx usb: cdnsp: Fixes incorrect value in ISOC TRB thunderbolt: Increase runtime PM reference count on DP tunnel discovery thunderbolt: Initialize HopID IDAs in tb_switch_alloc()
2021-03-21	Merge tag 'irq-urgent-2021-03-21' of ↵	Linus Torvalds	2	-2/+6
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fix from Ingo Molnar: "A change to robustify force-threaded IRQ handlers to always disable interrupts, plus a DocBook fix. The force-threaded IRQ handler change has been accelerated from the normal schedule of such a change to keep the bad pattern/workaround of spin_lock_irqsave() in handlers or IRQF_NOTHREAD as a kludge from spreading" * tag 'irq-urgent-2021-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq: Disable interrupts for force threaded handlers genirq/irq_sim: Fix typos in kernel doc (fnode -> fwnode)
2021-03-21	Merge tag 'perf-urgent-2021-03-21' of ↵	Linus Torvalds	2	-1/+4
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Boundary condition fixes for bugs unearthed by the perf fuzzer" * tag 'perf-urgent-2021-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel: Fix unchecked MSR access error caused by VLBR_EVENT perf/x86/intel: Fix a crash caused by zero PEBS status
2021-03-21	Merge tag 'locking-urgent-2021-03-21' of ↵	Linus Torvalds	4	-31/+49
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Ingo Molnar: - Get static calls & modules right. Hopefully. - WW mutex fixes * tag 'locking-urgent-2021-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: static_call: Fix static_call_update() sanity check static_call: Align static_call_is_init() patching condition static_call: Fix static_call_set_init() locking/ww_mutex: Fix acquire/release imbalance in ww_acquire_init()/ww_acquire_fini() locking/ww_mutex: Simplify use_ww_ctx & ww_ctx handling
2021-03-21	Merge tag 'efi-urgent-2021-03-21' of ↵	Linus Torvalds	3	-3/+10
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI fixes from Ingo Molnar: - another missing RT_PROP table related fix, to ensure that the efivarfs pseudo filesystem fails gracefully if variable services are unsupported - use the correct alignment for literal EFI GUIDs - fix a use after unmap issue in the memreserve code * tag 'efi-urgent-2021-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efi: use 32-bit alignment for efi_guid_t literals firmware/efi: Fix a use after bug in efi_mem_reserve_persistent efivars: respect EFI_UNSUPPORTED return from firmware
2021-03-21	Merge tag 'x86_urgent_for_v5.12-rc4' of ↵	Linus Torvalds	12	-44/+52
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: "The freshest pile of shiny x86 fixes for 5.12: - Add the arch-specific mapping between physical and logical CPUs to fix devicetree-node lookups - Restore the IRQ2 ignore logic - Fix get_nr_restart_syscall() to return the correct restart syscall number. Split in a 4-patches set to avoid kABI breakage when backporting to dead kernels" * tag 'x86_urgent_for_v5.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/apic/of: Fix CPU devicetree-node lookups x86/ioapic: Ignore IRQ2 again x86: Introduce restart_block->arch_data to remove TS_COMPAT_RESTART x86: Introduce TS_COMPAT_RESTART to fix get_nr_restart_syscall() x86: Move TS_COMPAT back to asm/thread_info.h kernel, fs: Introduce and use set_restart_fn() and arch_set_restart_data()
2021-03-21	Merge tag 'powerpc-5.12-4' of ↵	Linus Torvalds	3	-10/+19
	git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix a possible stack corruption and subsequent DLPAR failure in the rpadlpar_io PCI hotplug driver - Two build fixes for uncommon configurations Thanks to Christophe Leroy and Tyrel Datwyler. * tag 'powerpc-5.12-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: PCI: rpadlpar: Fix potential drc_name corruption in store functions powerpc: Force inlining of cpu_has_feature() to avoid build failure powerpc/vdso32: Add missing _restgpr_31_x to fix build failure
2021-03-21	io_uring: call req_set_fail_links() on short send[msg]()/recv[msg]() with ↵	Stefan Metzmacher	1	-4/+20
	MSG_WAITALL Without that it's not safe to use them in a linked combination with others. Now combinations like IORING_OP_SENDMSG followed by IORING_OP_SPLICE should be possible. We already handle short reads and writes for the following opcodes: - IORING_OP_READV - IORING_OP_READ_FIXED - IORING_OP_READ - IORING_OP_WRITEV - IORING_OP_WRITE_FIXED - IORING_OP_WRITE - IORING_OP_SPLICE - IORING_OP_TEE Now we have it for these as well: - IORING_OP_SENDMSG - IORING_OP_SEND - IORING_OP_RECVMSG - IORING_OP_RECV For IORING_OP_RECVMSG we also check for the MSG_TRUNC and MSG_CTRUNC flags in order to call req_set_fail_links(). There might be applications arround depending on the behavior that even short send[msg]()/recv[msg]() retuns continue an IOSQE_IO_LINK chain. It's very unlikely that such applications pass in MSG_WAITALL, which is only defined in 'man 2 recvmsg', but not in 'man 2 sendmsg'. It's expected that the low level sock_sendmsg() call just ignores MSG_WAITALL, as MSG_ZEROCOPY is also ignored without explicitly set SO_ZEROCOPY. We also expect the caller to know about the implicit truncation to MAX_RW_COUNT, which we don't detect. cc: [email protected] Link: https://lore.kernel.org/r/c4e1a4cc0d905314f4d5dc567e65a7b09621aab3.1615908477.git.metze@samba.org Signed-off-by: Stefan Metzmacher <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-03-21	io-wq: ensure task is running before processing task_work	Jens Axboe	1	-2/+6
	Mark the current task as running if we need to run task_work from the io-wq threads as part of work handling. If that is the case, then return as such so that the caller can appropriately loop back and reset if it was part of a going-to-sleep flush. Fixes: 3bfe6106693b ("io-wq: fork worker threads from original task") Signed-off-by: Jens Axboe <[email protected]>
2021-03-21	signal: don't allow STOP on PF_IO_WORKER threads	Eric W. Biederman	1	-1/+2
	Just like we don't allow normal signals to IO threads, don't deliver a STOP to a task that has PF_IO_WORKER set. The IO threads don't take signals in general, and have no means of flushing out a stop either. Longer term, we may want to look into allowing stop of these threads, as it relates to eg process freezing. For now, this prevents a spin issue if a SIGSTOP is delivered to the parent task. Reported-by: Stefan Metzmacher <[email protected]> Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: "Eric W. Biederman" <[email protected]>
2021-03-21	signal: don't allow sending any signals to PF_IO_WORKER threads	Jens Axboe	1	-0/+3
	They don't take signals individually, and even if they share signals with the parent task, don't allow them to be delivered through the worker thread. Linux does allow this kind of behavior for regular threads, but it's really a compatability thing that we need not care about for the IO threads. Reported-by: Stefan Metzmacher <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-03-21	ext4: initialize ret to suppress smatch warning	Theodore Ts'o	1	-1/+1
	Signed-off-by: Theodore Ts'o <[email protected]>
2021-03-21	ext4: stop inode update before return	Pan Bian	1	-1/+3
	The inode update should be stopped before returing the error code. Signed-off-by: Pan Bian <[email protected]> Link: https://lore.kernel.org/r/[email protected] Fixes: 8016e29f4362 ("ext4: fast commit recovery path") Cc: [email protected] Reviewed-by: Harshad Shirwadkar <[email protected]> Signed-off-by: Theodore Ts'o <[email protected]>
2021-03-21	ext4: fix rename whiteout with fast commit	Harshad Shirwadkar	3	-2/+12
	This patch adds rename whiteout support in fast commits. Note that the whiteout object that gets created is actually char device. Which imples, the function ext4_inode_journal_mode(struct inode *inode) would return "JOURNAL_DATA" for this inode. This has a consequence in fast commit code that it will make creation of the whiteout object a fast-commit ineligible behavior and thus will fall back to full commits. With this patch, this can be observed by running fast commits with rename whiteout and seeing the stats generated by ext4_fc_stats tracepoint as follows: ext4_fc_stats: dev 254:32 fc ineligible reasons: XATTR:0, CROSS_RENAME:0, JOURNAL_FLAG_CHANGE:0, NO_MEM:0, SWAP_BOOT:0, RESIZE:0, RENAME_DIR:0, FALLOC_RANGE:0, INODE_JOURNAL_DATA:16; num_commits:6, ineligible: 6, numblks: 3 So in short, this patch guarantees that in case of rename whiteout, we fall back to full commits. Amir mentioned that instead of creating a new whiteout object for every rename, we can create a static whiteout object with irrelevant nlink. That will make fast commits to not fall back to full commit. But until this happens, this patch will ensure correctness by falling back to full commits. Fixes: 8016e29f4362 ("ext4: fast commit recovery path") Cc: [email protected] Signed-off-by: Harshad Shirwadkar <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-03-21	ext4: fix timer use-after-free on failed mount	Jan Kara	1	-1/+1
	When filesystem mount fails because of corrupted filesystem we first cancel the s_err_report timer reminding fs errors every day and only then we flush s_error_work. However s_error_work may report another fs error and re-arm timer thus resulting in timer use-after-free. Fix the problem by first flushing the work and only after that canceling the s_err_report timer. Reported-by: [email protected] Fixes: 2d01ddc86606 ("ext4: save error info to sb through journal if available") CC: [email protected] Signed-off-by: Jan Kara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-03-21	ext4: fix potential error in ext4_do_update_inode	Shijie Luo	1	-4/+4
	If set_large_file = 1 and errors occur in ext4_handle_dirty_metadata(), the error code will be overridden, go to out_brelse to avoid this situation. Signed-off-by: Shijie Luo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Cc: [email protected] Reviewed-by: Jan Kara <[email protected]> Signed-off-by: Theodore Ts'o <[email protected]>
2021-03-21	ext4: do not try to set xattr into ea_inode if value is empty	zhangyi (F)	1	-1/+1
	Syzbot report a warning that ext4 may create an empty ea_inode if set an empty extent attribute to a file on the file system which is no free blocks left. WARNING: CPU: 6 PID: 10667 at fs/ext4/xattr.c:1640 ext4_xattr_set_entry+0x10f8/0x1114 fs/ext4/xattr.c:1640 ... Call trace: ext4_xattr_set_entry+0x10f8/0x1114 fs/ext4/xattr.c:1640 ext4_xattr_block_set+0x1d0/0x1b1c fs/ext4/xattr.c:1942 ext4_xattr_set_handle+0x8a0/0xf1c fs/ext4/xattr.c:2390 ext4_xattr_set+0x120/0x1f0 fs/ext4/xattr.c:2491 ext4_xattr_trusted_set+0x48/0x5c fs/ext4/xattr_trusted.c:37 __vfs_setxattr+0x208/0x23c fs/xattr.c:177 ... Now, ext4 try to store extent attribute into an external inode if ext4_xattr_block_set() return -ENOSPC, but for the case of store an empty extent attribute, store the extent entry into the extent attribute block is enough. A simple reproduce below. fallocate test.img -l 1M mkfs.ext4 -F -b 2048 -O ea_inode test.img mount test.img /mnt dd if=/dev/zero of=/mnt/foo bs=2048 count=500 setfattr -n "user.test" /mnt/foo Reported-by: [email protected] Fixes: 9c6e7853c531 ("ext4: reserve space for xattr entries/names") Cc: [email protected] Signed-off-by: zhangyi (F) <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-03-21	ext4: do not iput inode under running transaction in ext4_rename()	zhangyi (F)	1	-9/+9
	In ext4_rename(), when RENAME_WHITEOUT failed to add new entry into directory, it ends up dropping new created whiteout inode under the running transaction. After commit <9b88f9fb0d2> ("ext4: Do not iput inode under running transaction"), we follow the assumptions that evict() does not get called from a transaction context but in ext4_rename() it breaks this suggestion. Although it's not a real problem, better to obey it, so this patch add inode to orphan list and stop transaction before final iput(). Signed-off-by: zhangyi (F) <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-03-21	ext4: find old entry again if failed to rename whiteout	zhangyi (F)	1	-2/+27
	If we failed to add new entry on rename whiteout, we cannot reset the old->de entry directly, because the old->de could have moved from under us during make indexed dir. So find the old entry again before reset is needed, otherwise it may corrupt the filesystem as below. /dev/sda: Entry '00000001' in ??? (12) has deleted/unused inode 15. CLEARED. /dev/sda: Unattached inode 75 /dev/sda: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY. Fixes: 6b4b8e6b4ad ("ext4: fix bug for rename with RENAME_WHITEOUT") Cc: [email protected] Signed-off-by: zhangyi (F) <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-03-21	genirq: Disable interrupts for force threaded handlers	Thomas Gleixner	1	-0/+4
	With interrupt force threading all device interrupt handlers are invoked from kernel threads. Contrary to hard interrupt context the invocation only disables bottom halfs, but not interrupts. This was an oversight back then because any code like this will have an issue: thread(irq_A) irq_handler(A) spin_lock(&foo->lock); interrupt(irq_B) irq_handler(B) spin_lock(&foo->lock); This has been triggered with networking (NAPI vs. hrtimers) and console drivers where printk() happens from an interrupt which interrupted the force threaded handler. Now people noticed and started to change the spin_lock() in the handler to spin_lock_irqsave() which affects performance or add IRQF_NOTHREAD to the interrupt request which in turn breaks RT. Fix the root cause and not the symptom and disable interrupts before invoking the force threaded handler which preserves the regular semantics and the usefulness of the interrupt force threading as a general debugging tool. For not RT this is not changing much, except that during the execution of the threaded handler interrupts are delayed until the handler returns. Vs. scheduling and softirq processing there is no difference. For RT kernels there is no issue. Fixes: 8d32a307e4fa ("genirq: Provide forced interrupt threading") Reported-by: Johan Hovold <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Johan Hovold <[email protected]> Acked-by: Sebastian Andrzej Siewior <[email protected]> Link: https://lore.kernel.org/r/[email protected]