blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2024-05-21	pNFS: rework pnfs_generic_pg_check_layout to check IO range	Olga Kornievskaia	4	-36/+14
	All callers of pnfs_generic_pg_check_layout() also want to do a call to check that the layout's range covers the IO range. Merge the functionality of the pnfs_generic_pg_check_range() into that of pnfs_generic_pg_check_layout(). Signed-off-by: Olga Kornievskaia <[email protected]> Reviewed-by: Benjamin Coddington <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2024-05-21	pNFS/filelayout: check layout segment range	Olga Kornievskaia	1	-0/+2
	Before doing the IO, check that we have the layout covering the range of IO. Signed-off-by: Olga Kornievskaia <[email protected]> Reviewed-by: Benjamin Coddington <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2024-05-21	pNFS/filelayout: fixup pNfs allocation modes	Olga Kornievskaia	1	-2/+2
	Change left over allocation flags. Fixes: a245832aaa99 ("pNFS/files: Ensure pNFS allocation modes are consistent with nfsiod") Signed-off-by: Olga Kornievskaia <[email protected]> Reviewed-by: Benjamin Coddington <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2024-05-20	rpcrdma: fix handling for RDMA_CM_EVENT_DEVICE_REMOVAL	Dan Aloni	1	-1/+5
	Under the scenario of IB device bonding, when bringing down one of the ports, or all ports, we saw xprtrdma entering a non-recoverable state where it is not even possible to complete the disconnect and shut it down the mount, requiring a reboot. Following debug, we saw that transport connect never ended after receiving the RDMA_CM_EVENT_DEVICE_REMOVAL callback. The DEVICE_REMOVAL callback is irrespective of whether the CM_ID is connected, and ESTABLISHED may not have happened. So need to work with each of these states accordingly. Fixes: 2acc5cae2923 ('xprtrdma: Prevent dereferencing r_xprt->rx_ep after it is freed') Cc: Sagi Grimberg <[email protected]> Signed-off-by: Dan Aloni <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Reviewed-by: Chuck Lever <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2024-05-20	NFS: Don't enable NFS v2 by default	Anna Schumaker	1	-2/+2
	This came up during one of the Bake-a-thon discussions. NFS v2 support was dropped from nfs-utils/mount.nfs in December 2021. Let's turn it off by default in the kernel too, since this means there isn't a way to mount and test it. Signed-off-by: Anna Schumaker <[email protected]> Reviewed-by: Jeffrey Layton <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2024-05-20	NFS: Fix READ_PLUS when server doesn't support OP_READ_PLUS	Anna Schumaker	1	-1/+1
	Olga showed me a case where the client was sending multiple READ_PLUS calls to the server in parallel, and the server replied NFS4ERR_OPNOTSUPP to each. The client would fall back to READ for the first reply, but fail to retry the other calls. I fix this by removing the test for NFS_CAP_READ_PLUS in nfs4_read_plus_not_supported(). This allows us to reschedule any READ_PLUS call that has a NFS4ERR_OPNOTSUPP return value, even after the capability has been cleared. Reported-by: Olga Kornievskaia <[email protected]> Fixes: c567552612ec ("NFS: Add READ_PLUS data segment support") Cc: [email protected] # v5.10+ Signed-off-by: Anna Schumaker <[email protected]> Reviewed-by: Benjamin Coddington <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2024-05-20	sunrpc: fix NFSACL RPC retry on soft mount	Dan Aloni	1	-0/+1
	It used to be quite awhile ago since 1b63a75180c6 ('SUNRPC: Refactor rpc_clone_client()'), in 2012, that `cl_timeout` was copied in so that all mount parameters propagate to NFSACL clients. However since that change, if mount options as follows are given: soft,timeo=50,retrans=16,vers=3 The resultant NFSACL client receives: cl_softrtry: 1 cl_timeout: to_initval=60000, to_maxval=60000, to_increment=0, to_retries=2, to_exponential=0 These values lead to NFSACL operations not being retried under the condition of transient network outages with soft mount. Instead, getacl call fails after 60 seconds with EIO. The simple fix is to pass the existing client's `cl_timeout` as the new client timeout. Cc: Chuck Lever <[email protected]> Cc: Benjamin Coddington <[email protected]> Link: https://lore.kernel.org/all/[email protected]/T/ Fixes: 1b63a75180c6 ('SUNRPC: Refactor rpc_clone_client()') Signed-off-by: Dan Aloni <[email protected]> Reviewed-by: Benjamin Coddington <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2024-05-20	SUNRPC: fix handling expired GSS context	Olga Kornievskaia	1	-1/+12
	In the case where we have received a successful reply to an RPC request, but while processing the reply the client in rpc_decode_header() finds an expired context, the code ends up propagating the error to the caller instead of getting a new context and retrying the request. To give more details, in rpc_decode_header() we call rpcauth_checkverf() will call into the gss and internally will at some point call gss_validate() which has a check if the current’s context lifetime expired, and it would fail. The reason for the failure gets ‘scrubbed’ and translated to EACCES so when we get back to rpc_decode_header() we just go to “out_verifier” which for that error would get converted to “out_garbage” (ie it’s treated as garballed reply) and the next action is call_encode. Which (1) doesn’t reencode or re-send (not to mention no upcall happens because context expires as that reason just not known) and it again fails in the same decoding process. After re-trying it 3 times the error is propagated back to the caller (ie nfs4_write_done_cb() in the case a failing write). To fix this, instead we need to look to the case where the server decides that context has expired and replies with an RPC auth error. In that case, the rpc_decode_header() goes to "out_msg_denied" in that we return EKEYREJECTED which in call_decode() is sent to “call_reserve” which triggers an upcalls and a re-try of the operation. The proposed fix is in case of a failed rpc_decode_header() to check if credentials were set to be invalid and use that as a proxy for deciding that context has expired and then treat is same way as receiving an auth error. Signed-off-by: Olga Kornievskaia <[email protected]> Reviewed-by: Benjamin Coddington <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2024-05-20	nfs: keep server info for remounts	Martin Kaiser	1	-3/+6
	With newer kernels that use fs_context for nfs mounts, remounts fail with -EINVAL. $ mount -t nfs -o nolock 10.0.0.1:/tmp/test /mnt/test/ $ mount -t nfs -o remount /mnt/test/ mount: mounting 10.0.0.1:/tmp/test on /mnt/test failed: Invalid argument For remounts, the nfs server address and port are populated by nfs_init_fs_context and later overwritten with 0x00 bytes by nfs23_parse_monolithic. The remount then fails as the server address is invalid. Fix this by not overwriting nfs server info in nfs23_parse_monolithic if we're doing a remount. Fixes: f2aedb713c28 ("NFS: Add fs_context support.") Signed-off-by: Martin Kaiser <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2024-05-20	NFSv4: Fixup smatch warning for ambiguous return	Benjamin Coddington	1	-7/+5
	Dan Carpenter reports smatch warning for nfs4_try_migration() when a memory allocation failure results in a zero return value. In this case, a transient allocation failure error will likely be retried the next time the server responds with NFS4ERR_MOVED. We can fixup the smatch warning with a small refactor: attempt all three allocations before testing and returning on a failure. Reported-by: Dan Carpenter <[email protected]> Fixes: c3ed222745d9 ("NFSv4: Fix free of uninitialized nfs4_label on referral lookup.") Signed-off-by: Benjamin Coddington <[email protected]> Reviewed-by: Dan Carpenter <[email protected]> Reviewed-by: Chuck Lever <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2024-05-20	NFS: make sure lock/nolock overriding local_lock mount option	Chen Hanxiao	3	-0/+19
	Currently, mount option lock/nolock and local_lock option may override NFS_MOUNT_LOCAL_FLOCK NFS_MOUNT_LOCAL_FCNTL flags when passing in different order: mount -o vers=3,local_lock=all,lock: local_lock=none mount -o vers=3,lock,local_lock=all: local_lock=all This patch will let lock/nolock override local_lock option as nfs(5) suggested. Signed-off-by: Chen Hanxiao <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2024-05-20	NFS: add atomic_open for NFSv3 to handle O_TRUNC correctly.	NeilBrown	4	-3/+56
	With two clients, each with NFSv3 mounts of the same directory, the sequence: client1 client2 ls -l afile echo hello there > afile echo HELLO > afile cat afile will show HELLO there because the O_TRUNC requested in the final 'echo' doesn't take effect. This is because the "Negative dentry, just create a file" section in lookup_open() assumes that the file does get created since the dentry was negative, so it sets FMODE_CREATED, and this causes do_open() to clear O_TRUNC and so the file doesn't get truncated. Even mounting with -o lookupcache=none does not help as nfs_neg_need_reval() always returns false if LOOKUP_CREATE is set. This patch fixes the problem by providing an atomic_open inode operation for NFSv3 (and v2). The code is largely the code from the branch in lookup_open() when atomic_open is not provided. The significant change is that the O_TRUNC flag is passed a new nfs_do_create() which add 'trunc' handling to nfs_create(). With this change we also optimise away an unnecessary LOOKUP before the file is created. Signed-off-by: NeilBrown <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2024-05-20	pNFS/filelayout: Specify the layout segment range in LAYOUTGET	Anna Schumaker	1	-4/+4
	Move from only requesting full file layout segments to requesting layout segments that match our I/O size. This means the server is still free to return a full file layout if it wants, but partial layouts will no longer cause an error. Signed-off-by: Anna Schumaker <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2024-05-20	pNFS/filelayout: Remove the whole file layout requirement	Anna Schumaker	1	-8/+0
	Layout segments have been supported in pNFS for years, so remove the requirement that the server always sends whole file layouts. Signed-off-by: Anna Schumaker <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2024-05-12	Linux 6.9	Linus Torvalds	1	-1/+1

2024-05-12	Merge tag 'kselftest-fix-vfork-2024-05-12' of ↵	Linus Torvalds	4	-67/+147
	git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux Pull Kselftest fixes from Mickaël Salaün: "Fix Kselftest's vfork() side effects. As reported by Kernel Test Robot and Sean Christopherson, some tests fail since v6.9-rc1 . This is due to the use of vfork() which introduced some side effects. Similarly, while making it more generic, a previous commit made some Landlock file system tests flaky, and subject to the host's file system mount configuration. This fixes all these side effects by replacing vfork() with clone3() and CLONE_VFORK, which is cleaner (no arbitrary shared memory) and makes the Kselftest framework more robust" Link: https://lore.kernel.org/oe-lkp/[email protected] Link: https://lore.kernel.org/r/[email protected] Link: https://lore.kernel.org/r/[email protected] * tag 'kselftest-fix-vfork-2024-05-12' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: selftests/harness: Handle TEST_F()'s explicit exit codes selftests/harness: Fix vfork() side effects selftests/harness: Share _metadata between forked processes selftests/pidfd: Fix wrong expectation selftests/harness: Constify fixture variants selftests/landlock: Do not allocate memory in fixture data selftests/harness: Fix interleaved scheduling leading to race conditions selftests/harness: Fix fixture teardown selftests/landlock: Fix FS tests when run on a private mount point selftests/pidfd: Fix config for pidfd_setns_test
2024-05-12	Merge tag 'for-linus-6.9' of git://git.kernel.org/pub/scm/virt/kvm/kvm	Linus Torvalds	1	-1/+1
	Pull kvm fix from Paolo Bonzini: - Fix NULL pointer read on s390 in ioctl(KVM_CHECK_EXTENSION) for /dev/kvm * tag 'for-linus-6.9' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: s390: Check kvm pointer when testing KVM_CAP_S390_HPAGE_1M
2024-05-12	Merge tag 'edac_urgent_for_v6.9' of ↵	Linus Torvalds	1	-13/+37
	git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull EDAC fix from Borislav Petkov: - Fix a race condition when clearing error count bits and toggling the error interrupt throug the same register, in synopsys_edac * tag 'edac_urgent_for_v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: EDAC/synopsys: Fix ECC status and IRQ control race condition
2024-05-12	Merge tag 'x86_urgent_for_v6.9' of ↵	Linus Torvalds	3	-9/+9
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Add a new PCI ID which belongs to a new AMD CPU family 0x1a - Ensure that that last level cache ID is set in all cases, in the AMD CPU topology parsing code, in order to prevent invalid scheduling domain CPU masks * tag 'x86_urgent_for_v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/topology/amd: Ensure that LLC ID is initialized x86/amd_nb: Add new PCI IDs for AMD family 0x1a
2024-05-11	selftests/harness: Handle TEST_F()'s explicit exit codes	Mickaël Salaün	1	-1/+5
	If TEST_F() explicitly calls exit(code) with code different than 0, then _metadata->exit_code is set to this code (e.g. KVM_ONE_VCPU_TEST()). We need to keep in mind that _metadata->exit_code can be KSFT_SKIP while the process exit code is 0. Cc: Jakub Kicinski <[email protected]> Cc: Kees Cook <[email protected]> Cc: Mark Brown <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Will Drewry <[email protected]> Reported-by: Sean Christopherson <[email protected]> Tested-by: Sean Christopherson <[email protected]> Closes: https://lore.kernel.org/r/[email protected] Fixes: 0710a1a73fb4 ("selftests/harness: Merge TEST_F_FORK() into TEST_F()") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mickaël Salaün <[email protected]>
2024-05-11	selftests/harness: Fix vfork() side effects	Mickaël Salaün	2	-25/+57
	Setting the time namespace with CLONE_NEWTIME returns -EUSERS if the calling thread shares memory with another thread (because of the shared vDSO), which is the case when it is created with vfork(). Fix pidfd_setns_test by replacing test harness's vfork() call with a clone3() call with CLONE_VFORK, and an explicit sharing of the _metadata and self objects. Replace _metadata->teardown_parent with a new FIXTURE_TEARDOWN_PARENT() helper that can replace FIXTURE_TEARDOWN(). This is a cleaner approach and it enables to selectively share the fixture data between the child process running tests and the parent process running the fixture teardown. This also avoids updating several tests to not rely on the self object's copy-on-write property (e.g. storing the returned value of a fork() call). Cc: Christian Brauner <[email protected]> Cc: David S. Miller <[email protected]> Cc: Günther Noack <[email protected]> Cc: Jakub Kicinski <[email protected]> Cc: Mark Brown <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Will Drewry <[email protected]> Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/oe-lkp/[email protected] Fixes: 0710a1a73fb4 ("selftests/harness: Merge TEST_F_FORK() into TEST_F()") Reviewed-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mickaël Salaün <[email protected]>
2024-05-11	selftests/harness: Share _metadata between forked processes	Mickaël Salaün	1	-11/+15
	Unconditionally share _metadata between all forked processes, which enables to actually catch errors which were previously ignored. This is required for a following commit replacing vfork() with clone3() and CLONE_VFORK (i.e. not sharing the full memory) . It should also be useful to share _metadata to extend expectations to test process's forks. For instance, this change identified a wrong expectation in pidfd_setns_test. Because this _metadata is used by the new XFAIL_ADD(), use a global pointer initialized in TEST_F(). This is OK because only XFAIL_ADD() use it, and XFAIL_ADD() already depends on TEST_F(). Cc: Jakub Kicinski <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Will Drewry <[email protected]> Reviewed-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mickaël Salaün <[email protected]>
2024-05-11	selftests/pidfd: Fix wrong expectation	Mickaël Salaün	1	-1/+1
	Replace a wrong EXPECT_GT(self->child_pid_exited, 0) with EXPECT_GE(), which will be actually tested on the parent and child sides with a following commit. Cc: Shuah Khan <[email protected]> Reviewed-by: Kees Cook <[email protected]> Reviewed-by: Christian Brauner <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mickaël Salaün <[email protected]>
2024-05-11	selftests/harness: Constify fixture variants	Mickaël Salaün	1	-2/+2
	FIXTURE_VARIANT_ADD() types are passed as const pointers to FIXTURE_TEARDOWN(). Make that explicit by constifying the variants declarations. Cc: Shuah Khan <[email protected]> Cc: Will Drewry <[email protected]> Reviewed-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mickaël Salaün <[email protected]>
2024-05-11	selftests/landlock: Do not allocate memory in fixture data	Mickaël Salaün	1	-22/+35
	Do not allocate self->dir_path in the test process because this would not be visible in the FIXTURE_TEARDOWN() process when relying on fork()/clone3() instead of vfork(). This change is required for a following commit removing vfork() call to not break the layout3_fs.* test cases. Cc: Günther Noack <[email protected]> Cc: Shuah Khan <[email protected]> Reviewed-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mickaël Salaün <[email protected]>
2024-05-11	selftests/harness: Fix interleaved scheduling leading to race conditions	Mickaël Salaün	1	-1/+14
	Fix a race condition when running several FIXTURE_TEARDOWN() managing the same resource. This fixes a race condition in the Landlock file system tests when creating or unmounting the same directory. Using clone3() with CLONE_VFORK guarantees that the child and grandchild test processes are sequentially scheduled. This is implemented with a new clone3_vfork() helper replacing the fork() call. This avoids triggering this error in __wait_for_test(): Test ended in some other way [127] Cc: Christian Brauner <[email protected]> Cc: David S. Miller <[email protected]> Cc: Günther Noack <[email protected]> Cc: Jakub Kicinski <[email protected]> Cc: Mark Brown <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Will Drewry <[email protected]> Fixes: 41cca0542d7c ("selftests/harness: Fix TEST_F()'s vfork handling") Reviewed-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mickaël Salaün <[email protected]>
2024-05-11	selftests/harness: Fix fixture teardown	Mickaël Salaün	1	-5/+9
	Make sure fixture teardowns are run when test cases failed, including when _metadata->teardown_parent is set to true. Make sure only one fixture teardown is run per test case, handling the case where the test child forks. Cc: Jakub Kicinski <[email protected]> Cc: Shengyu Li <[email protected]> Cc: Shuah Khan <[email protected]> Fixes: 72d7cb5c190b ("selftests/harness: Prevent infinite loop due to Assert in FIXTURE_TEARDOWN") Fixes: 0710a1a73fb4 ("selftests/harness: Merge TEST_F_FORK() into TEST_F()") Reviewed-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/r/[email protected] Rule: add Link: https://lore.kernel.org/stable/20240506165518.474504-4-mic%40digikod.net Signed-off-by: Mickaël Salaün <[email protected]>
2024-05-11	selftests/landlock: Fix FS tests when run on a private mount point	Mickaël Salaün	1	-1/+9
	According to the test environment, the mount point of the test's working directory may be shared or not, which changes the visibility of the nested "tmp" mount point for the test's parent process calling umount("tmp"). This was spotted while running tests in containers [1], where mount points are private. Cc: Günther Noack <[email protected]> Cc: Shuah Khan <[email protected]> Link: https://github.com/landlock-lsm/landlock-test-tools/pull/4 [1] Fixes: 41cca0542d7c ("selftests/harness: Fix TEST_F()'s vfork handling") Reviewed-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mickaël Salaün <[email protected]>
2024-05-11	selftests/pidfd: Fix config for pidfd_setns_test	Mickaël Salaün	1	-0/+2
	Required by switch_timens() to open /proc/self/ns/time_for_children. CONFIG_GENERIC_VDSO_TIME_NS is not available on UML, so pidfd_setns_test cannot be run successfully on this architecture. Cc: Shuah Khan <[email protected]> Fixes: 2b40c5db73e2 ("selftests/pidfd: add pidfd setns tests") Reviewed-by: Kees Cook <[email protected]> Reviewed-by: Christian Brauner <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mickaël Salaün <[email protected]>
2024-05-10	Merge tag 'drm-fixes-2024-05-11' of https://gitlab.freedesktop.org/drm/kernel	Linus Torvalds	20	-203/+122
	Pull drm fixes from Dave Airlie: "This should be the last set of fixes for 6.9, i915, xe and amdgpu are the bulk here, one of the previous nouveau fixes turned up an issue, so reverting it, otherwise one core and a couple of meson fixes. core: - fix connector debugging output i915: - Automate CCS Mode setting during engine resets - Fix audio time stamp programming for DP - Fix parsing backlight BDB data xe: - Fix use zero-length element array - Move more from system wq to ordered private wq - Do not ignore return for drmm_mutex_init amdgpu: - DCN 3.5 fix - MST DSC fixes - S0i3 fix - S4 fix - HDP MMIO mapping fix - Fix a regression in visible vram handling amdkfd: - Spatial partition fix meson: - dw-hdmi: power-up fixes - dw-hdmi: add badngap setting for g12 nouveau: - revert SG_DEBUG fix that has a side effect" * tag 'drm-fixes-2024-05-11' of https://gitlab.freedesktop.org/drm/kernel: Revert "drm/nouveau/firmware: Fix SG_DEBUG error with nvkm_firmware_ctor()" drm/amdgpu: Fix comparison in amdgpu_res_cpu_visible drm/amdkfd: don't allow mapping the MMIO HDP page with large pages drm/xe: Use ordered WQ for G2H handler drm/xe/guc: Check error code when initializing the CT mutex drm/xe/ads: Use flexible-array Revert "drm/amdkfd: Add partition id field to location_id" dm/amd/pm: Fix problems with reboot/shutdown for some SMU 13.0.4/13.0.11 users drm/amd/display: MST DSC check for older devices drm/amd/display: Fix idle optimization checks for multi-display and dual eDP drm/amd/display: Fix DSC-re-computing drm/amd/display: Enable urgent latency adjustments for DCN35 drm/connector: Add \n to message about demoting connector force-probes drm/i915/bios: Fix parsing backlight BDB data drm/i915/audio: Fix audio time stamp programming for DP drm/i915/gt: Automate CCS Mode setting during engine resets drm/meson: dw-hdmi: add bandgap setting for g12 drm/meson: dw-hdmi: power up phy on device init
2024-05-10	Merge tag 'mm-hotfixes-stable-2024-05-10-13-14' of ↵	Linus Torvalds	16	-64/+116
	git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM fixes from Andrew Morton: "18 hotfixes, 7 of which are cc:stable. More fixups for this cycle's page_owner updates. And a few userfaultfd fixes. Otherwise, random singletons - see the individual changelogs for details" * tag 'mm-hotfixes-stable-2024-05-10-13-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mailmap: add entry for Barry Song selftests/mm: fix powerpc ARCH check mailmap: add entry for John Garry XArray: set the marks correctly when splitting an entry selftests/vDSO: fix runtime errors on LoongArch selftests/vDSO: fix building errors on LoongArch mm,page_owner: don't remove __GFP_NOLOCKDEP in add_stack_record_to_list fs/proc/task_mmu: fix uffd-wp confusion in pagemap_scan_pmd_entry() fs/proc/task_mmu: fix loss of young/dirty bits during pagemap scan mm/vmalloc: fix return value of vb_alloc if size is 0 mm: use memalloc_nofs_save() in page_cache_ra_order() kmsan: compiler_types: declare __no_sanitize_or_inline lib/test_xarray.c: fix error assumptions on check_xa_multi_store_adv_add() tools: fix userspace compilation with new test_xarray changes MAINTAINERS: update URL's for KEYS/KEYRINGS_INTEGRITY and TPM DEVICE DRIVER mm: page_owner: fix wrong information in dump_page_owner maple_tree: fix mas_empty_area_rev() null pointer dereference mm/userfaultfd: reset ptes when close() for wr-protected ones
2024-05-11	Revert "drm/nouveau/firmware: Fix SG_DEBUG error with nvkm_firmware_ctor()"	Dave Airlie	1	-12/+7
	This reverts commit 52a6947bf576b97ff8e14bb0a31c5eaf2d0d96e2. This causes loading failures in [ 0.367379] nouveau 0000:01:00.0: NVIDIA GP104 (134000a1) [ 0.474499] nouveau 0000:01:00.0: bios: version 86.04.50.80.13 [ 0.474620] nouveau 0000:01:00.0: pmu: firmware unavailable [ 0.474977] nouveau 0000:01:00.0: fb: 8192 MiB GDDR5 [ 0.484371] nouveau 0000:01:00.0: sec2(acr): mbox 00000001 00000000 [ 0.484377] nouveau 0000:01:00.0: sec2(acr):load: boot failed: -5 [ 0.484379] nouveau 0000:01:00.0: acr: init failed, -5 [ 0.484466] nouveau 0000:01:00.0: init failed with -5 [ 0.484468] nouveau: DRM-master:00000000:00000080: init failed with -5 [ 0.484470] nouveau 0000:01:00.0: DRM-master: Device allocation failed: -5 [ 0.485078] nouveau 0000:01:00.0: probe with driver nouveau failed with error -50 I tried tracking it down but ran out of time this week, will revisit next week. Reported-by: Dan Moulding <[email protected]> Cc: [email protected] Signed-off-by: Dave Airlie <[email protected]>
2024-05-11	Merge tag 'drm-misc-fixes-2024-05-10' of ↵	Dave Airlie	2	-40/+32
	https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes Short summary of fixes pull: core: - fix connector debugging output meson: - dw-hdmi: power-up fixes - dw-hdmi: add badngap setting for g12 Signed-off-by: Dave Airlie <[email protected]> From: Thomas Zimmermann <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-05-10	Merge tag 'gpio-fixes-for-v6.9' of ↵	Linus Torvalds	3	-30/+63
	git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux Pull gpio fixes from Bartosz Golaszewski: "Some last-minute fixes for this release from the GPIO subsystem. The first two address a regression in performance reported to me after the conversion to using SRCU in GPIOLIB that was merged during the v6.9 merge window. The second patch is not technically a fix but since after the first one we no longer need to use a per-descriptor SRCU struct, I think it's worth to simplify the code before it gets released on Sunday. The next two commits fix two memory issues: one use-after-free bug and one instance of possibly leaking kernel stack memory to user-space. Summary: - fix a performance regression in GPIO requesting and releasing after the conversion to SRCU - fix a use-after-free bug due to a race-condition - fix leaking stack memory to user-space in a GPIO uABI corner case" * tag 'gpio-fixes-for-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: gpiolib: cdev: fix uninitialised kfifo gpiolib: cdev: Fix use after free in lineinfo_changed_notify gpiolib: use a single SRCU struct for all GPIO descriptors gpiolib: fix the speed of descriptor label setting with SRCU
2024-05-11	Merge tag 'amd-drm-fixes-6.9-2024-05-10' of ↵	Dave Airlie	7	-18/+51
	https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.9-2024-05-10: amdgpu: - DCN 3.5 fix - MST DSC fixes - S0i3 fix - S4 fix - HDP MMIO mapping fix - Fix a regression in visible vram handling amdkfd: - Spatial partition fix Signed-off-by: Dave Airlie <[email protected]> From: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-05-10	mailmap: add entry for Barry Song	Barry Song	1	-0/+5
	Include a .mailmap entry to synchronize with both my past and current emails. Among them, three business mailboxes are dead. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Barry Song <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-05-10	selftests/mm: fix powerpc ARCH check	Michael Ellerman	1	-3/+3
	In commit 0518dbe97fe6 ("selftests/mm: fix cross compilation with LLVM") the logic to detect the machine architecture in the Makefile was changed to use ARCH, and only fallback to uname -m if ARCH is unset. However the tests of ARCH were not updated to account for the fact that ARCH is "powerpc" for powerpc builds, not "ppc64". Fix it by changing the checks to look for "powerpc", and change the uname -m logic to convert "ppc64." into "powerpc". With that fixed the following tests now build for powerpc again: protection_keys * va_high_addr_switch * virtual_address_range * write_to_hugetlbfs Link: https://lkml.kernel.org/r/[email protected] Fixes: 0518dbe97fe6 ("selftests/mm: fix cross compilation with LLVM") Signed-off-by: Michael Ellerman <[email protected]> Cc: Mark Brown <[email protected]> Cc: <[email protected]> [6.4+] Signed-off-by: Andrew Morton <[email protected]>
2024-05-10	Merge tag 'block-6.9-20240510' of git://git.kernel.dk/linux	Linus Torvalds	7	-21/+29
	Pull block fixes from Jens Axboe: - NVMe pull request via Keith: - nvme target fixes (Sagi, Dan, Maurizo) - new vendor quirk for broken MSI (Sean) - Virtual boundary fix for a regression in this merge window (Ming) * tag 'block-6.9-20240510' of git://git.kernel.dk/linux: nvmet-rdma: fix possible bad dereference when freeing rsps nvmet: prevent sprintf() overflow in nvmet_subsys_nsid_exists() nvmet: make nvmet_wq unbound nvmet-auth: return the error code to the nvmet_auth_ctrl_hash() callers nvme-pci: Add quirk for broken MSIs block: set default max segment size in case of virt_boundary
2024-05-10	Merge tag 'spi-fix-v6.9-rc7' of ↵	Linus Torvalds	2	-12/+3
	git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "Two device specific fixes here, one avoiding glitches on chip select with the STM32 driver and one for incorrectly configured clocks on the Microchip QSPI controller" * tag 'spi-fix-v6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: microchip-core-qspi: fix setting spi bus clock rate spi: stm32: enable controller before asserting CS
2024-05-10	Merge tag 'regulator-fix-v6.9-rc7' of ↵	Linus Torvalds	2	-15/+19
	git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fixes from Mark Brown: "Two fixes here, one from Johan which fixes error handling when we attempt to create duplicate debugfs files and one for an incorrect specification of ramp_delay with the rtq2208" * tag 'regulator-fix-v6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: core: fix debugfs creation regression regulator: rtq2208: Fix the BUCK ramp_delay range to maximum of 16mVstep/us
2024-05-10	Merge tag 'timers-urgent-2024-05-10' of ↵	Linus Torvalds	1	-2/+2
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Ingo Molnar: "Fix possible (but unlikely) out-of-bounds access in the timer migration per-CPU-init code" * tag 'timers-urgent-2024-05-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: timers/migration: Prevent out of bounds access on failure
2024-05-10	Merge tag 'iommu-fixes-v6.9-rc7' of ↵	Linus Torvalds	2	-3/+5
	git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu fixes from Joerg Roedel: - Fix offset miscalculation on ARM-SMMU driver - AMD IOMMU fix for initializing state of untrusted devices * tag 'iommu-fixes-v6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/arm-smmu: Use the correct type in nvidia_smmu_context_fault() iommu/amd: Enhance def_domain_type to handle untrusted device
2024-05-10	drm/amdgpu: Fix comparison in amdgpu_res_cpu_visible	Michel Dänzer	1	-1/+1
	It incorrectly claimed a resource isn't CPU visible if it's located at the very end of CPU visible VRAM. Fixes: a6ff969fe9cb ("drm/amdgpu: fix visible VRAM handling during faults") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3343 Reviewed-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reported-and-Tested-by: Jeremy Day <[email protected]> Signed-off-by: Michel Dänzer <[email protected]> Signed-off-by: Alex Deucher <[email protected]> CC: [email protected]
2024-05-10	drm/amdkfd: don't allow mapping the MMIO HDP page with large pages	Alex Deucher	1	-2/+5
	We don't get the right offset in that case. The GPU has an unused 4K area of the register BAR space into which you can remap registers. We remap the HDP flush registers into this space to allow userspace (CPU or GPU) to flush the HDP when it updates VRAM. However, on systems with >4K pages, we end up exposing PAGE_SIZE of MMIO space. Fixes: d8e408a82704 ("drm/amdkfd: Expose HDP registers to user space") Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2024-05-10	x86/topology/amd: Ensure that LLC ID is initialized	Thomas Gleixner	1	-9/+7
	The original topology evaluation code initialized cpu_data::topo::llc_id with the die ID initialy and then eventually overwrite it with information gathered from a CPUID leaf. The conversion analysis failed to spot that particular detail and omitted this initial assignment under the assumption that each topology evaluation path will set it up. That assumption is mostly correct, but turns out to be wrong in case that the CPUID leaf 0x80000006 does not provide a LLC ID. In that case, LLC ID is invalid and as a consequence the setup of the scheduling domain CPU masks is incorrect which subsequently causes the scheduler core to complain about it during CPU hotplug: BUG: arch topology borken the CLS domain not a subset of the MC domain Cure it by reusing legacy_set_llc() and assigning the die ID if the LLC ID is invalid after all possible parsers have been tried. Fixes: f7fb3b2dd92c ("x86/cpu: Provide an AMD/HYGON specific topology parser") Reported-by: Yuezhang Mo <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Tested-by: Yuezhang Mo <[email protected]> Link: https://lore.kernel.org/r/PUZPR04MB63168AC442C12627E827368581292@PUZPR04MB6316.apcprd04.prod.outlook.com
2024-05-10	gpiolib: cdev: fix uninitialised kfifo	Kent Gibson	1	-0/+14
	If a line is requested with debounce, and that results in debouncing in software, and the line is subsequently reconfigured to enable edge detection then the allocation of the kfifo to contain edge events is overlooked. This results in events being written to and read from an uninitialised kfifo. Read events are returned to userspace. Initialise the kfifo in the case where the software debounce is already active. Fixes: 65cff7046406 ("gpiolib: cdev: support setting debounce") Signed-off-by: Kent Gibson <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Bartosz Golaszewski <[email protected]>
2024-05-10	x86/amd_nb: Add new PCI IDs for AMD family 0x1a	Shyam Sundar S K	2	-0/+2
	Add the new PCI Device IDs to the MISC IDs list to support new generation of AMD 1Ah family 70h Models of processors. [ bp: Massage commit message. ] Signed-off-by: Shyam Sundar S K <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-05-10	iommu/arm-smmu: Use the correct type in nvidia_smmu_context_fault()	Jason Gunthorpe	1	-3/+1
	This was missed because of the function pointer indirection. nvidia_smmu_context_fault() is also installed as a irq function, and the 'void *' was changed to a struct arm_smmu_domain. Since the iommu_domain is embedded at a non-zero offset this causes nvidia_smmu_context_fault() to miscompute the offset. Fixup the types. Unable to handle kernel NULL pointer dereference at virtual address 0000000000000120 Mem abort info: ESR = 0x0000000096000004 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x04: level 0 translation fault Data abort info: ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 CM = 0, WnR = 0, TnD = 0, TagAccess = 0 GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 user pgtable: 4k pages, 48-bit VAs, pgdp=0000000107c9f000 [0000000000000120] pgd=0000000000000000, p4d=0000000000000000 Internal error: Oops: 0000000096000004 [#1] SMP Modules linked in: CPU: 1 PID: 47 Comm: kworker/u25:0 Not tainted 6.9.0-0.rc7.58.eln136.aarch64 #1 Hardware name: Unknown NVIDIA Jetson Orin NX/NVIDIA Jetson Orin NX, BIOS 3.1-32827747 03/19/2023 Workqueue: events_unbound deferred_probe_work_func pstate: 604000c9 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : nvidia_smmu_context_fault+0x1c/0x158 lr : __free_irq+0x1d4/0x2e8 sp : ffff80008044b6f0 x29: ffff80008044b6f0 x28: ffff000080a60b18 x27: ffffd32b5172e970 x26: 0000000000000000 x25: ffff0000802f5aac x24: ffff0000802f5a30 x23: ffff0000802f5b60 x22: 0000000000000057 x21: 0000000000000000 x20: ffff0000802f5a00 x19: ffff000087d4cd80 x18: ffffffffffffffff x17: 6234362066666666 x16: 6630303078302d30 x15: ffff00008156d888 x14: 0000000000000000 x13: ffff0000801db910 x12: ffff00008156d6d0 x11: 0000000000000003 x10: ffff0000801db918 x9 : ffffd32b50f94d9c x8 : 1fffe0001032fda1 x7 : ffff00008197ed00 x6 : 000000000000000f x5 : 000000000000010e x4 : 000000000000010e x3 : 0000000000000000 x2 : ffffd32b51720cd8 x1 : ffff000087e6f700 x0 : 0000000000000057 Call trace: nvidia_smmu_context_fault+0x1c/0x158 __free_irq+0x1d4/0x2e8 free_irq+0x3c/0x80 devm_free_irq+0x64/0xa8 arm_smmu_domain_free+0xc4/0x158 iommu_domain_free+0x44/0xa0 iommu_deinit_device+0xd0/0xf8 __iommu_group_remove_device+0xcc/0xe0 iommu_bus_notifier+0x64/0xa8 notifier_call_chain+0x78/0x148 blocking_notifier_call_chain+0x4c/0x90 bus_notify+0x44/0x70 device_del+0x264/0x3e8 pci_remove_bus_device+0x84/0x120 pci_remove_root_bus+0x5c/0xc0 dw_pcie_host_deinit+0x38/0xe0 tegra_pcie_config_rp+0xc0/0x1f0 tegra_pcie_dw_probe+0x34c/0x700 platform_probe+0x70/0xe8 really_probe+0xc8/0x3a0 __driver_probe_device+0x84/0x160 driver_probe_device+0x44/0x130 __device_attach_driver+0xc4/0x170 bus_for_each_drv+0x90/0x100 __device_attach+0xa8/0x1c8 device_initial_probe+0x1c/0x30 bus_probe_device+0xb0/0xc0 deferred_probe_work_func+0xbc/0x120 process_one_work+0x194/0x490 worker_thread+0x284/0x3b0 kthread+0xf4/0x108 ret_from_fork+0x10/0x20 Code: a9b97bfd 910003fd a9025bf5 f85a0035 (b94122a1) Cc: [email protected] Fixes: e0976331ad11 ("iommu/arm-smmu: Pass arm_smmu_domain to internal functions") Reported-by: Jerry Snitselaar <[email protected]> Closes: https://lore.kernel.org/all/jto5e3ili4auk6sbzpnojdvhppgwuegir7mpd755anfhwcbkfz@2u5gh7bxb4iv Signed-off-by: Jason Gunthorpe <[email protected]> Tested-by: Jerry Snitselaar <[email protected]> Acked-by: Jerry Snitselaar <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2024-05-10	Merge tag 'drm-xe-fixes-2024-05-09' of ↵	Dave Airlie	4	-3/+13
	https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes - Fix use zero-length element array - Move more from system wq to ordered private wq - Do not ignore return for drmm_mutex_init Signed-off-by: Dave Airlie <[email protected]> From: Lucas De Marchi <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/c3rduifdp5wipkljdpuq4x6uowkc2uyzgdoft4txvp6mgvzjaj@7zw7c6uw4wrf
2024-05-10	Merge tag 'drm-intel-fixes-2024-05-08' of ↵	Dave Airlie	6	-130/+19
	https://anongit.freedesktop.org/git/drm/drm-intel into drm-fixes - Automate CCS Mode setting during engine resets (Andi) - Fix audio time stamp programming for DP (Chaitanya) - Fix parsing backlight BDB data (Karthikeyan) Signed-off-by: Dave Airlie <[email protected]> From: Rodrigo Vivi <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]