blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2021-02-03	f2fs: add ckpt_thread_ioprio sysfs node	Daeho Jeong	4	-1/+67
	Added "ckpt_thread_ioprio" sysfs node to give a way to change checkpoint merge daemon's io priority. Its default value is "be,3", which means "BE" I/O class and I/O priority "3". We can select the class between "rt" and "be", and set the I/O priority within valid range of it. "," delimiter is necessary in between I/O class and priority number. Signed-off-by: Daeho Jeong <[email protected]> Reviewed-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2021-02-03	f2fs: introduce checkpoint_merge mount option	Daeho Jeong	5	-8/+277
	We've added a new mount options, "checkpoint_merge" and "nocheckpoint_merge", which creates a kernel daemon and makes it to merge concurrent checkpoint requests as much as possible to eliminate redundant checkpoint issues. Plus, we can eliminate the sluggish issue caused by slow checkpoint operation when the checkpoint is done in a process context in a cgroup having low i/o budget and cpu shares. To make this do better, we set the default i/o priority of the kernel daemon to "3", to give one higher priority than other kernel threads. The below verification result explains this. The basic idea has come from https://opensource.samsung.com. [Verification] Android Pixel Device(ARM64, 7GB RAM, 256GB UFS) Create two I/O cgroups (fg w/ weight 100, bg w/ wight 20) Set "strict_guarantees" to "1" in BFQ tunables In "fg" cgroup, - thread A => trigger 1000 checkpoint operations "for i in `seq 1 1000`; do touch test_dir1/file; fsync test_dir1; done" - thread B => gererating async. I/O "fio --rw=write --numjobs=1 --bs=128k --runtime=3600 --time_based=1 --filename=test_img --name=test" In "bg" cgroup, - thread C => trigger repeated checkpoint operations "echo $$ > /dev/blkio/bg/tasks; while true; do touch test_dir2/file; fsync test_dir2; done" We've measured thread A's execution time. [ w/o patch ] Elapsed Time: Avg. 68 seconds [ w/ patch ] Elapsed Time: Avg. 48 seconds Reported-by: kernel test robot <[email protected]> Reported-by: Dan Carpenter <[email protected]> [Jaegeuk Kim: fix the return value in f2fs_start_ckpt_thread, reported by Dan] Signed-off-by: Daeho Jeong <[email protected]> Signed-off-by: Sungjong Seo <[email protected]> Reviewed-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2021-02-02	f2fs: relocate inline conversion from mmap() to mkwrite()	Chao Yu	1	-6/+4
	If there is page fault only for read case on inline inode, we don't need to convert inline inode, instead, let's do conversion for write case. Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2021-02-02	f2fs: fix a wrong condition in __submit_bio	Dehe Gu	1	-1/+1
	We should use !F2FS_IO_ALIGNED() to check and submit_io directly. Fixes: 8223ecc456d0 ("f2fs: fix to add missing F2FS_IO_ALIGNED() condition") Reviewed-by: Chao Yu <[email protected]> Signed-off-by: Dehe Gu <[email protected]> Signed-off-by: Ge Qiu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2021-02-01	f2fs: remove unnecessary initialization in xattr.c	Liu Song	1	-4/+4
	These variables will be explicitly assigned before use, so there is no need to initialize. Signed-off-by: Liu Song <[email protected]> Reviewed-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2021-02-01	f2fs: fix to avoid inconsistent quota data	Yi Chen	2	-0/+8
	Occasionally, quota data may be corrupted detected by fsck: Info: checkpoint state = 45 : crc compacted_summary unmount [QUOTA WARNING] Usage inconsistent for ID 0:actual (1543036928, 762) != expected (1543032832, 762) [ASSERT] (fsck_chk_quota_files:1986) --> Quota file is missing or invalid quota file content found. [QUOTA WARNING] Usage inconsistent for ID 0:actual (1352478720, 344) != expected (1352474624, 344) [ASSERT] (fsck_chk_quota_files:1986) --> Quota file is missing or invalid quota file content found. [FSCK] Unreachable nat entries [Ok..] [0x0] [FSCK] SIT valid block bitmap checking [Ok..] [FSCK] Hard link checking for regular file [Ok..] [0x0] [FSCK] valid_block_count matching with CP [Ok..] [0xdf299] [FSCK] valid_node_count matcing with CP (de lookup) [Ok..] [0x2b01] [FSCK] valid_node_count matcing with CP (nat lookup) [Ok..] [0x2b01] [FSCK] valid_inode_count matched with CP [Ok..] [0x2665] [FSCK] free segment_count matched with CP [Ok..] [0xcb04] [FSCK] next block offset is free [Ok..] [FSCK] fixing SIT types [FSCK] other corrupted bugs [Fail] The root cause is: If we open file w/ readonly flag, disk quota info won't be initialized for this file, however, following mmap() will force to convert inline inode via f2fs_convert_inline_inode(), which may increase block usage for this inode w/o updating quota data, it causes inconsistent disk quota info. The issue will happen in following stack: open(file, O_RDONLY) mmap(file) - f2fs_convert_inline_inode - f2fs_convert_inline_page - f2fs_reserve_block - f2fs_reserve_new_block - f2fs_reserve_new_blocks - f2fs_i_blocks_write - dquot_claim_block inode->i_blocks increase, but the dqb_curspace keep the size for the dquots is NULL. To fix this issue, let's call dquot_initialize() anyway in both f2fs_truncate() and f2fs_convert_inline_inode() functions to avoid potential inconsistent quota data issue. Fixes: 0abd675e97e6 ("f2fs: support plain user/group quota") Signed-off-by: Daiyue Zhang <[email protected]> Signed-off-by: Dehe Gu <[email protected]> Signed-off-by: Junchao Jiang <[email protected]> Signed-off-by: Ge Qiu <[email protected]> Signed-off-by: Yi Chen <[email protected]> Reviewed-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2021-02-01	f2fs: flush data when enabling checkpoint back	Jaegeuk Kim	1	-0/+3
	During checkpoint=disable period, f2fs bypasses all the synchronous IOs such as sync and fsync. So, when enabling it back, we must flush all of them in order to keep the data persistent. Otherwise, suddern power-cut right after enabling checkpoint will cause data loss. Fixes: 4354994f097d ("f2fs: checkpoint disabling") Cc: [email protected] Reviewed-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2021-01-27	f2fs: deprecate f2fs_trace_io	Jaegeuk Kim	10	-239/+0
	This patch deprecates f2fs_trace_io, since f2fs uses page->private more broadly, resulting in more buggy cases. Acked-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2021-01-27	f2fs: Remove readahead collision detection	Matthew Wilcox (Oracle)	3	-28/+0
	With the new ->readahead operation, locked pages are added to the page cache, preventing two threads from racing with each other to read the same chunk of file, so this is dead code. Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2021-01-27	f2fs: remove unused stat_{inc, dec}_atomic_write	Jack Qiu	1	-2/+0
	Just clean code, no logical change. Signed-off-by: Jack Qiu <[email protected]> Reviewed-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2021-01-27	f2fs: introduce sb_status sysfs node	Chao Yu	2	-0/+31
	Introduce /sys/fs/f2fs/<devname>/stat/sb_status to show superblock status in real time as a hexadecimal value. value sb status macro description 0x1 SBI_IS_DIRTY, /* dirty flag for checkpoint / 0x2 SBI_IS_CLOSE, / specify unmounting / 0x4 SBI_NEED_FSCK, / need fsck.f2fs to fix / 0x8 SBI_POR_DOING, / recovery is doing or not / 0x10 SBI_NEED_SB_WRITE, / need to recover superblock / 0x20 SBI_NEED_CP, / need to checkpoint / 0x40 SBI_IS_SHUTDOWN, / shutdown by ioctl / 0x80 SBI_IS_RECOVERED, / recovered orphan/data / 0x100 SBI_CP_DISABLED, / CP was disabled last mount / 0x200 SBI_CP_DISABLED_QUICK, / CP was disabled quickly / 0x400 SBI_QUOTA_NEED_FLUSH, / need to flush quota info in CP / 0x800 SBI_QUOTA_SKIP_FLUSH, / skip flushing quota in current CP / 0x1000 SBI_QUOTA_NEED_REPAIR, / quota file may be corrupted / 0x2000 SBI_IS_RESIZEFS, / resizefs is in process */ Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2021-01-27	f2fs: fix to use per-inode maxbytes	Chengguang Xu	4	-9/+16
	F2FS inode may have different max size, e.g. compressed file have less blkaddr entries in all its direct-node blocks, result in being with less max filesize. So change to use per-inode maxbytes. Suggested-by: Chao Yu <[email protected]> Signed-off-by: Chengguang Xu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2021-01-27	f2fs: compress: fix potential deadlock	Chao Yu	3	-6/+11
	generic/269 reports a hangtask issue, the root cause is ABBA deadlock described as below: Thread A Thread B - down_write(&sbi->gc_lock) -- A - f2fs_write_data_pages - lock all pages in cluster -- B - f2fs_write_multi_pages - f2fs_write_raw_pages - f2fs_write_single_data_page - f2fs_balance_fs - down_write(&sbi->gc_lock) -- A - f2fs_gc - do_garbage_collect - ra_data_block - pagecache_get_page -- B To fix this, it needs to avoid calling f2fs_balance_fs() if there is still cluster pages been locked in context of cluster writeback, so instead, let's call f2fs_balance_fs() in the end of f2fs_write_raw_pages() when all cluster pages were unlocked. Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2021-01-27	libfs: unexport generic_ci_d_compare() and generic_ci_d_hash()	Eric Biggers	2	-10/+3
	Now that generic_set_encrypted_ci_d_ops() has been added and ext4 and f2fs are using it, it's no longer necessary to export generic_ci_d_compare() and generic_ci_d_hash() to filesystems. Reviewed-by: Gabriel Krisman Bertazi <[email protected]> Signed-off-by: Eric Biggers <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2021-01-27	f2fs: fix to set/clear I_LINKABLE under i_lock	Chao Yu	1	-0/+8
	fsstress + fault injection test case reports a warning message as below: WARNING: CPU: 13 PID: 6226 at fs/inode.c:361 inc_nlink+0x32/0x40 Call Trace: f2fs_init_inode_metadata+0x25c/0x4a0 [f2fs] f2fs_add_inline_entry+0x153/0x3b0 [f2fs] f2fs_add_dentry+0x75/0x80 [f2fs] f2fs_do_add_link+0x108/0x160 [f2fs] f2fs_rename2+0x6ab/0x14f0 [f2fs] vfs_rename+0x70c/0x940 do_renameat2+0x4d8/0x4f0 __x64_sys_renameat2+0x4b/0x60 do_syscall_64+0x33/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Following race case can cause this: Thread A Kworker - f2fs_rename - f2fs_create_whiteout - __f2fs_tmpfile - f2fs_i_links_write - f2fs_mark_inode_dirty_sync - mark_inode_dirty_sync - writeback_single_inode - __writeback_single_inode - spin_lock(&inode->i_lock) - inode->i_state \|= I_LINKABLE - inode->i_state &= ~dirty - spin_unlock(&inode->i_lock) - f2fs_add_link - f2fs_do_add_link - f2fs_add_dentry - f2fs_add_inline_entry - f2fs_init_inode_metadata - f2fs_i_links_write - inc_nlink - WARN_ON(!(inode->i_state & I_LINKABLE)) Fix to add i_lock to avoid i_state update race condition. Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2021-01-27	f2fs: fix null page reference in redirty_blocks	Daeho Jeong	1	-2/+4
	By Colin's static analysis, we found out there is a null page reference under low memory situation in redirty_blocks. I've made the page finding loop stop immediately and return an error not to cause further memory pressure when we run into a failure to find a page under low memory condition. Signed-off-by: Daeho Jeong <[email protected]> Reported-by: Colin Ian King <[email protected]> Fixes: 5fdb322ff2c2 ("f2fs: add F2FS_IOC_DECOMPRESS_FILE and F2FS_IOC_COMPRESS_FILE") Reviewed-by: Colin Ian King <[email protected]> Reviewed-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2021-01-27	f2fs: clean up post-read processing	Eric Biggers	3	-264/+297
	Rework the post-read processing logic to be much easier to understand. At least one bug is fixed by this: if an I/O error occurred when reading from disk, decryption and verity would be performed on the uninitialized data, causing misleading messages in the kernel log. Signed-off-by: Eric Biggers <[email protected]> Reviewed-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2021-01-27	f2fs: trival cleanup in move_data_block()	Chao Yu	1	-5/+3
	Trival cleanups: - relocate set_summary() before its use - relocate "allocate block address" to correct place - remove unneeded f2fs_wait_on_page_writeback() Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2021-01-27	f2fs: fix out-of-repair __setattr_copy()	Chao Yu	1	-1/+2
	__setattr_copy() was copied from setattr_copy() in fs/attr.c, there is two missing patches doesn't cover this inner function, fix it. Commit 7fa294c8991c ("userns: Allow chown and setgid preservation") Commit 23adbe12ef7d ("fs,userns: Change inode_capable to capable_wrt_inode_uidgid") Fixes: fbfa2cc58d53 ("f2fs: add file operations") Cc: [email protected] Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2021-01-27	f2fs: fix to tag FIEMAP_EXTENT_MERGED in f2fs_fiemap()	Chao Yu	1	-0/+1
	f2fs does not natively support extents in metadata, 'extent' in f2fs is used as a virtual concept, so in f2fs_fiemap() interface, it needs to tag FIEMAP_EXTENT_MERGED flag to indicated the extent status is a result of merging. Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2021-01-27	f2fs: introduce a new per-sb directory in sysfs	Chao Yu	2	-6/+68
	Add a new directory 'stat' in path of /sys/fs/f2fs/<devname>/, later we can add new readonly stat sysfs file into this directory, it will make <devname> directory less mess. Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2021-01-27	f2fs: compress: support compress level	Chao Yu	6	-5/+152
	Expand 'compress_algorithm' mount option to accept parameter as format of <algorithm>:<level>, by this way, it gives a way to allow user to do more specified config on lz4 and zstd compression level, then f2fs compression can provide higher compress ratio. In order to set compress level for lz4 algorithm, it needs to set CONFIG_LZ4HC_COMPRESS and CONFIG_F2FS_FS_LZ4HC config to enable lz4hc compress algorithm. CR and performance number on lz4/lz4hc algorithm: dd if=enwik9 of=compressed_file conv=fsync Original blocks: 244382 lz4 lz4hc-9 compressed blocks 170647 163270 compress ratio 69.8% 66.8% speed 16.4207 s, 60.9 MB/s 26.7299 s, 37.4 MB/s compress ratio = after / before Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2021-01-27	f2fs: compress: deny setting unsupported compress algorithm	Chao Yu	1	-0/+16
	If kernel doesn't support certain kinds of compress algorithm, deny to set them as compress algorithm of f2fs via 'compress_algorithm=%s' mount option. Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2021-01-27	f2fs: relocate f2fs_precache_extents()	Chao Yu	1	-1/+2
	Relocate f2fs_precache_extents() in prior to check_swap_activate(), then extent cache can be enabled before its use. Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2021-01-27	f2fs: enforce the immutable flag on open files	Chao Yu	1	-0/+17
	This patch ports commit 02b016ca7f99 ("ext4: enforce the immutable flag on open files") to f2fs. According to the chattr man page, "a file with the 'i' attribute cannot be modified..." Historically, this was only enforced when the file was opened, per the rest of the description, "... and the file can not be opened in write mode". There is general agreement that we should standardize all file systems to prevent modifications even for files that were opened at the time the immutable flag is set. Eventually, a change to enforce this at the VFS layer should be landing in mainline. Cc: [email protected] Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2021-01-27	f2fs: enhance to update i_mode and acl atomically in f2fs_setattr()	Chao Yu	3	-10/+35
	Previously, in f2fs_setattr(), we don't update S_ISUID\|S_ISGID\|S_ISVTX bits with S_IRWXUGO bits and acl entries atomically, so in error path, chmod() may partially success, this patch enhances to make chmod() flow being atomical. Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2021-01-27	f2fs: fix to set inode->i_mode correctly for posix_acl_update_mode	Weichao Guo	1	-0/+1
	We should update the ~S_IRWXUGO part of inode->i_mode in __setattr_copy, because posix_acl_update_mode updates mode based on inode->i_mode, which finally overwrites the ~S_IRWXUGO part of i_acl_mode with old i_mode. Testcase to reproduce this bug: 0. adduser abc 1. mkfs.f2fs /dev/sdd 2. mount -t f2fs /dev/sdd /mnt/f2fs 3. mkdir /mnt/f2fs/test 4. setfacl -m u:abc:r /mnt/f2fs/test 5. chmod +s /mnt/f2fs/test Signed-off-by: Weichao Guo <[email protected]> Signed-off-by: Bin Shu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2021-01-27	f2fs: Replace expression with offsetof()	Zheng Yongjun	1	-1/+1
	Use the existing offsetof() macro instead of duplicating code. Signed-off-by: Zheng Yongjun <[email protected]> Reviewed-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2021-01-27	f2fs: handle unallocated section and zone on pinned/atgc	Jaegeuk Kim	1	-2/+2
	If we have large section/zone, unallocated segment makes them corrupted. E.g., - Pinned file: -1 119304647 119304647 - ATGC data: -1 119304647 119304647 Reviewed-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2021-01-27	Merge branch 'parisc-5.11-2' of ↵	Linus Torvalds	3	-9/+12
	git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc fixes from Helge Deller: "Two small fixes: - Fix linking error with 64-bit kernel when modules are disabled, reported by kernel test robot - Remove leftover reference to power_tasklet, by Davidlohr Bueso" * 'parisc-5.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Enable -mlong-calls gcc option by default when !CONFIG_MODULES parisc: Remove leftover reference to the power_tasklet
2021-01-26	mailmap: remove the "repo-abbrev" comment	Ævar Arnfjörð Bjarmason	1	-3/+0
	Remove the magical "repo-abbrev" comment added when this file was introduced in e0ab1ec9fcd3 ([PATCH] add .mailmap for proper git-shortlog output, 2007-02-14). It's been an undocumented feature of git-shortlog(1), originally added to git for Linus's use. Since then he's no longer using it[1], and I've removed the feature in git.git's 4e168333a87 (shortlog: remove unused(?) "repo-abbrev" feature, 2021-01-12). It's on the "master" branch, but not yet in a release version. Let's also remove it from linux.git, both as a heads-up to any potential users of it in linux.git whose use would be broken sooner than later by git itself, and because it'll eventually be entirely redundant. 1. https://lore.kernel.org/git/CAHk-=wixHyBKZVUcxq+NCWMbkrX0xnppb7UCopRWw1+oExYpYw@mail.gmail.com/ Acked-by: Linus Torvalds <[email protected]> Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-01-26	parisc: Enable -mlong-calls gcc option by default when !CONFIG_MODULES	Helge Deller	2	-6/+12
	When building a kernel without module support, the CONFIG_MLONGCALL option needs to be enabled in order to reach symbols which are outside of a 22-bit branch. This patch changes the autodetection in the Kconfig script to always enable CONFIG_MLONGCALL when modules are disabled and uses a far call to preempt_schedule_irq() in intr_do_preempt() to reach the symbol in all cases. Signed-off-by: Helge Deller <[email protected]> Reported-by: kernel test robot <[email protected]> Cc: [email protected] # v5.6+
2021-01-26	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm	Linus Torvalds	15	-112/+172
	Pull kvm fixes from Paolo Bonzini: - x86 bugfixes - Documentation fixes - Avoid performance regression due to SEV-ES patches - ARM: - Don't allow tagged pointers to point to memslots - Filter out ARMv8.1+ PMU events on v8.0 hardware - Hide PMU registers from userspace when no PMU is configured - More PMU cleanups - Don't try to handle broken PSCI firmware - More sys_reg() to reg_to_encoding() conversions * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: allow KVM_REQ_GET_NESTED_STATE_PAGES outside guest mode for VMX KVM: x86: Revert "KVM: x86: Mark GPRs dirty when written" KVM: SVM: Unconditionally sync GPRs to GHCB on VMRUN of SEV-ES guest KVM: nVMX: Sync unsync'd vmcs02 state to vmcs12 on migration kvm: tracing: Fix unmatched kvm_entry and kvm_exit events KVM: Documentation: Update description of KVM_{GET,CLEAR}_DIRTY_LOG KVM: x86: get smi pending status correctly KVM: x86/pmu: Fix HW_REF_CPU_CYCLES event pseudo-encoding in intel_arch_events[] KVM: x86/pmu: Fix UBSAN shift-out-of-bounds warning in intel_pmu_refresh() KVM: x86: Add more protection against undefined behavior in rsvd_bits() KVM: Documentation: Fix spec for KVM_CAP_ENABLE_CAP_VM KVM: Forbid the use of tagged userspace addresses for memslots KVM: arm64: Filter out v8.1+ events on v8.0 HW KVM: arm64: Compute TPIDR_EL2 ignoring MTE tag KVM: arm64: Use the reg_to_encoding() macro instead of sys_reg() KVM: arm64: Allow PSCI SYSTEM_OFF/RESET to return KVM: arm64: Simplify handling of absent PMU system registers KVM: arm64: Hide PMU registers from userspace when not available
2021-01-26	Merge tag 'spi-fix-v5.11-rc5' of ↵	Linus Torvalds	2	-1/+3
	git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "One new device ID here, plus an error handling fix - nothing remarkable in either" * tag 'spi-fix-v5.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spidev: Add cisco device compatible spi: altera: Fix memory leak on error path
2021-01-26	Merge tag 'regulator-fix-v5.11-rc5' of ↵	Linus Torvalds	2	-11/+63
	git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fixes from Mark Brown: "The main thing here is a change to make sure that we don't try to double resolve the supply of a regulator if we have two probes going on simultaneously, plus an incremental fix on top of that to resolve a lockdep issue it introduced. There's also a patch from Dmitry Osipenko adding stubs for some functions to avoid build issues in consumers in some configurations" * tag 'regulator-fix-v5.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: Fix lockdep warning resolving supplies regulator: consumer: Add missing stubs to regulator/consumer.h regulator: core: avoid regulator_resolve_supply() race condition
2021-01-26	parisc: Remove leftover reference to the power_tasklet	Davidlohr Bueso	1	-3/+0
	This was removed long ago, back in: 6e16d9409e1 ([PARISC] Convert soft power switch driver to kthread) Signed-off-by: Davidlohr Bueso <[email protected]> Signed-off-by: Helge Deller <[email protected]>
2021-01-26	Revert "mm: fix initialization of struct page for holes in memory layout"	Linus Torvalds	1	-50/+34
	This reverts commit d3921cb8be29ce5668c64e23ffdaeec5f8c69399. Chris Wilson reports that it causes boot problems: "We have half a dozen or so different machines in CI that are silently failing to boot, that we believe is bisected to this patch" and the CI team confirmed that a revert fixed the issues. The cause is unknown for now, so let's revert it. Link: https://lore.kernel.org/lkml/[email protected]/ Reported-and-tested-by: Chris Wilson <[email protected]> Acked-by: Mike Rapoport <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-01-25	KVM: x86: allow KVM_REQ_GET_NESTED_STATE_PAGES outside guest mode for VMX	Paolo Bonzini	3	-9/+29
	VMX also uses KVM_REQ_GET_NESTED_STATE_PAGES for the Hyper-V eVMCS, which may need to be loaded outside guest mode. Therefore we cannot WARN in that case. However, that part of nested_get_vmcs12_pages is _not_ needed at vmentry time. Split it out of KVM_REQ_GET_NESTED_STATE_PAGES handling, so that both vmentry and migration (and in the latter case, independent of is_guest_mode) do the parts that are needed. Cc: <[email protected]> # 5.10.x: f2c7ef3ba: KVM: nSVM: cancel KVM_REQ_GET_NESTED_STATE_PAGES Cc: <[email protected]> # 5.10.x Signed-off-by: Paolo Bonzini <[email protected]>
2021-01-25	KVM: x86: Revert "KVM: x86: Mark GPRs dirty when written"	Sean Christopherson	1	-26/+25
	Revert the dirty/available tracking of GPRs now that KVM copies the GPRs to the GHCB on any post-VMGEXIT VMRUN, even if a GPR is not dirty. Per commit de3cd117ed2f ("KVM: x86: Omit caching logic for always-available GPRs"), tracking for GPRs noticeably impacts KVM's code footprint. This reverts commit 1c04d8c986567c27c56c05205dceadc92efb14ff. Signed-off-by: Sean Christopherson <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-01-25	KVM: SVM: Unconditionally sync GPRs to GHCB on VMRUN of SEV-ES guest	Sean Christopherson	1	-9/+6
	Drop the per-GPR dirty checks when synchronizing GPRs to the GHCB, the GRPs' dirty bits are set from time zero and never cleared, i.e. will always be seen as dirty. The obvious alternative would be to clear the dirty bits when appropriate, but removing the dirty checks is desirable as it allows reverting GPR dirty+available tracking, which adds overhead to all flavors of x86 VMs. Note, unconditionally writing the GPRs in the GHCB is tacitly allowed by the GHCB spec, which allows the hypervisor (or guest) to provide unnecessary info; it's the guest's responsibility to consume only what it needs (the hypervisor is untrusted after all). The guest and hypervisor can supply additional state if desired but must not rely on that additional state being provided. Cc: Brijesh Singh <[email protected]> Cc: Tom Lendacky <[email protected]> Fixes: 291bd20d5d88 ("KVM: SVM: Add initial support for a VMGEXIT VMEXIT") Signed-off-by: Sean Christopherson <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-01-25	KVM: nVMX: Sync unsync'd vmcs02 state to vmcs12 on migration	Maxim Levitsky	1	-5/+8
	Even when we are outside the nested guest, some vmcs02 fields may not be in sync vs vmcs12. This is intentional, even across nested VM-exit, because the sync can be delayed until the nested hypervisor performs a VMCLEAR or a VMREAD/VMWRITE that affects those rarely accessed fields. However, during KVM_GET_NESTED_STATE, the vmcs12 has to be up to date to be able to restore it. To fix that, call copy_vmcs02_to_vmcs12_rare() before the vmcs12 contents are copied to userspace. Fixes: 7952d769c29ca ("KVM: nVMX: Sync rarely accessed guest fields only when needed") Reviewed-by: Sean Christopherson <[email protected]> Signed-off-by: Maxim Levitsky <[email protected]> Message-Id: <[email protected]> Cc: [email protected] Signed-off-by: Paolo Bonzini <[email protected]>
2021-01-25	kvm: tracing: Fix unmatched kvm_entry and kvm_exit events	Lorenzo Brescia	3	-2/+5
	On VMX, if we exit and then re-enter immediately without leaving the vmx_vcpu_run() function, the kvm_entry event is not logged. That means we will see one (or more) kvm_exit, without its (their) corresponding kvm_entry, as shown here: CPU-1979 [002] 89.871187: kvm_entry: vcpu 1 CPU-1979 [002] 89.871218: kvm_exit: reason MSR_WRITE CPU-1979 [002] 89.871259: kvm_exit: reason MSR_WRITE It also seems possible for a kvm_entry event to be logged, but then we leave vmx_vcpu_run() right away (if vmx->emulation_required is true). In this case, we will have a spurious kvm_entry event in the trace. Fix these situations by moving trace_kvm_entry() inside vmx_vcpu_run() (where trace_kvm_exit() already is). A trace obtained with this patch applied looks like this: CPU-14295 [000] 8388.395387: kvm_entry: vcpu 0 CPU-14295 [000] 8388.395392: kvm_exit: reason MSR_WRITE CPU-14295 [000] 8388.395393: kvm_entry: vcpu 0 CPU-14295 [000] 8388.395503: kvm_exit: reason EXTERNAL_INTERRUPT Of course, not calling trace_kvm_entry() in common x86 code any longer means that we need to adjust the SVM side of things too. Signed-off-by: Lorenzo Brescia <[email protected]> Signed-off-by: Dario Faggioli <[email protected]> Message-Id: <160873470698.11652.13483635328769030605.stgit@Wayrath> Signed-off-by: Paolo Bonzini <[email protected]>
2021-01-25	KVM: Documentation: Update description of KVM_{GET,CLEAR}_DIRTY_LOG	Zenghui Yu	1	-9/+7
	Update various words, including the wrong parameter name and the vague description of the usage of "slot" field. Signed-off-by: Zenghui Yu <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-01-25	KVM: x86: get smi pending status correctly	Jay Zhou	1	-0/+4
	The injection process of smi has two steps: Qemu KVM Step1: cpu->interrupt_request &= \ ~CPU_INTERRUPT_SMI; kvm_vcpu_ioctl(cpu, KVM_SMI) call kvm_vcpu_ioctl_smi() and kvm_make_request(KVM_REQ_SMI, vcpu); Step2: kvm_vcpu_ioctl(cpu, KVM_RUN, 0) call process_smi() if kvm_check_request(KVM_REQ_SMI, vcpu) is true, mark vcpu->arch.smi_pending = true; The vcpu->arch.smi_pending will be set true in step2, unfortunately if vcpu paused between step1 and step2, the kvm_run->immediate_exit will be set and vcpu has to exit to Qemu immediately during step2 before mark vcpu->arch.smi_pending true. During VM migration, Qemu will get the smi pending status from KVM using KVM_GET_VCPU_EVENTS ioctl at the downtime, then the smi pending status will be lost. Signed-off-by: Jay Zhou <[email protected]> Signed-off-by: Shengen Zhuang <[email protected]> Message-Id: <[email protected]> Cc: [email protected] Signed-off-by: Paolo Bonzini <[email protected]>
2021-01-25	KVM: x86/pmu: Fix HW_REF_CPU_CYCLES event pseudo-encoding in intel_arch_events[]	Like Xu	1	-1/+1
	The HW_REF_CPU_CYCLES event on the fixed counter 2 is pseudo-encoded as 0x0300 in the intel_perfmon_event_map[]. Correct its usage. Fixes: 62079d8a4312 ("KVM: PMU: add proper support for fixed counter 2") Signed-off-by: Like Xu <[email protected]> Message-Id: <[email protected]> Reviewed-by: Sean Christopherson <[email protected]> Cc: [email protected] Signed-off-by: Paolo Bonzini <[email protected]>
2021-01-25	KVM: x86/pmu: Fix UBSAN shift-out-of-bounds warning in intel_pmu_refresh()	Like Xu	1	-0/+4
	Since we know vPMU will not work properly when (1) the guest bit_width(s) of the [gp\|fixed] counters are greater than the host ones, or (2) guest requested architectural events exceeds the range supported by the host, so we can setup a smaller left shift value and refresh the guest cpuid entry, thus fixing the following UBSAN shift-out-of-bounds warning: shift exponent 197 is too large for 64-bit type 'long long unsigned int' Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x107/0x163 lib/dump_stack.c:120 ubsan_epilogue+0xb/0x5a lib/ubsan.c:148 __ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:395 intel_pmu_refresh.cold+0x75/0x99 arch/x86/kvm/vmx/pmu_intel.c:348 kvm_vcpu_after_set_cpuid+0x65a/0xf80 arch/x86/kvm/cpuid.c:177 kvm_vcpu_ioctl_set_cpuid2+0x160/0x440 arch/x86/kvm/cpuid.c:308 kvm_arch_vcpu_ioctl+0x11b6/0x2d70 arch/x86/kvm/x86.c:4709 kvm_vcpu_ioctl+0x7b9/0xdb0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3386 vfs_ioctl fs/ioctl.c:48 [inline] __do_sys_ioctl fs/ioctl.c:753 [inline] __se_sys_ioctl fs/ioctl.c:739 [inline] __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:739 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Reported-by: [email protected] Signed-off-by: Like Xu <[email protected]> Message-Id: <[email protected]> Cc: [email protected] Signed-off-by: Paolo Bonzini <[email protected]>
2021-01-25	KVM: x86: Add more protection against undefined behavior in rsvd_bits()	Sean Christopherson	1	-1/+8
	Add compile-time asserts in rsvd_bits() to guard against KVM passing in garbage hardcoded values, and cap the upper bound at '63' for dynamic values to prevent generating a mask that would overflow a u64. Suggested-by: Paolo Bonzini <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-01-25	KVM: Documentation: Fix spec for KVM_CAP_ENABLE_CAP_VM	Quentin Perret	1	-1/+1
	The documentation classifies KVM_ENABLE_CAP with KVM_CAP_ENABLE_CAP_VM as a vcpu ioctl, which is incorrect. Fix it by specifying it as a VM ioctl. Fixes: e5d83c74a580 ("kvm: make KVM_CAP_ENABLE_CAP_VM architecture agnostic") Signed-off-by: Quentin Perret <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-01-25	Merge tag 'kvmarm-fixes-5.11-2' of ↵	Paolo Bonzini	6	-49/+74
	git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 5.11, take #2 - Don't allow tagged pointers to point to memslots - Filter out ARMv8.1+ PMU events on v8.0 hardware - Hide PMU registers from userspace when no PMU is configured - More PMU cleanups - Don't try to handle broken PSCI firmware - More sys_reg() to reg_to_encoding() conversions
2021-01-25	Merge branch 'linus' of ↵	Linus Torvalds	1	-2/+2
	git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fix from Herbert Xu: "Fix a regression in the cesa driver" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: marvel/cesa - Fix tdma descriptor on 64-bit