Age | Commit message | Author | Files | Lines
2024-09-25 | vdpa/mlx5: Parallelize VQ suspend/resume for CVQ MQ command | Dragos Tatulea | 1 | -10/+12
change_num_qps() is still suspending/resuming VQs one by one. This change switches to parallel suspend/resume. When increasing the number of queues the flow has changed a bit for simplicity: the setup_vq() function will always be called before resume_vqs(). If the VQ is initialized, setup_vq() will exit early. If the VQ is not initialized, setup_vq() will create it and resume_vqs() will resume it. Signed-off-by: Dragos Tatulea <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]> Acked-by: Eugenio Pérez <[email protected]> Tested-by: Lei Yang <[email protected]>
2024-09-25 | vdpa/mlx5: Small improvement for change_num_qps() | Dragos Tatulea | 1 | -10/+11
change_num_qps() has a lot of multiplications by 2 to convert the number of VQ pairs to number of VQs. This patch simplifies the code by doing the VQP -> VQ count conversion at the beginning in a variable. Signed-off-by: Dragos Tatulea <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]> Acked-by: Eugenio Pérez <[email protected]> Tested-by: Lei Yang <[email protected]>
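The simplification described above can be illustrated with a minimal sketch. It is not the driver code: the function and variable names are hypothetical, and the only fact taken from the commit message is that one VQ pair corresponds to two VQs.

```python
def vqs_affected(cur_vqps: int, new_vqps: int) -> range:
    """Return the VQ indices touched when the VQ pair count changes.

    Hypothetical helper for illustration only.
    """
    # Convert pairs to queue counts once, up front, instead of
    # repeating `* 2` at every use site.
    cur_vqs = cur_vqps * 2
    new_vqs = new_vqps * 2
    low, high = sorted((cur_vqs, new_vqs))
    return range(low, high)
```

Growing from 1 to 2 pairs and shrinking from 2 to 1 pairs both touch VQs 2 and 3 in this model.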
2024-09-25 | vdpa/mlx5: Keep notifiers during suspend but ignore | Dragos Tatulea | 1 | -2/+4
Unregistering notifiers is a costly operation. Instead of removing the notifiers during device suspend and adding them back at resume, simply ignore the call when the device is suspended. At resume time call queue_link_work() to make sure that the device state is propagated in case there were changes. For 1 vDPA device x 32 VQs (16 VQPs) attached to a large VM (256 GB RAM, 32 CPUs x 2 threads per core), the device suspend time is reduced from ~13 ms to ~2.5 ms. Signed-off-by: Dragos Tatulea <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Acked-by: Eugenio Pérez <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]> Tested-by: Lei Yang <[email protected]>
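The idea above, keeping the notifier registered and dropping events while suspended, can be sketched as follows. This is a toy model under stated assumptions: the class and its counter are hypothetical, and only the name queue_link_work() comes from the commit message.

```python
class ToyDevice:
    """Hypothetical stand-in for a device with a link-state notifier."""

    def __init__(self):
        self.suspended = False
        self.link_work_queued = 0

    def queue_link_work(self):
        # Stands in for scheduling the work that propagates link state.
        self.link_work_queued += 1

    def on_link_event(self):
        # The notifier stays registered; events are ignored while suspended.
        if self.suspended:
            return
        self.queue_link_work()

    def suspend(self):
        self.suspended = True

    def resume(self):
        self.suspended = False
        # Propagate state once, in case events were ignored while suspended.
        self.queue_link_work()
```

The trade-off in this model is that an event arriving during suspend is not lost, only deferred to the single refresh at resume time.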
2024-09-25 | vdpa/mlx5: Parallelize device resume | Dragos Tatulea | 1 | -26/+14
Currently device resume works on vqs serially. Building up on previous changes that converted vq operations to the async api, this patch parallelizes the device resume. For 1 vDPA device x 32 VQs (16 VQPs) attached to a large VM (256 GB RAM, 32 CPUs x 2 threads per core), the device resume time is reduced from ~16 ms to ~4.5 ms. Signed-off-by: Dragos Tatulea <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Acked-by: Eugenio Pérez <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]> Tested-by: Lei Yang <[email protected]>
2024-09-25 | vdpa/mlx5: Parallelize device suspend | Dragos Tatulea | 1 | -27/+29
Currently device suspend works on vqs serially. Building on previous changes that converted vq operations to the async api, this patch parallelizes the device suspend: 1) Suspend all active vqs in parallel. 2) Query suspended vqs in parallel. For 1 vDPA device x 32 VQs (16 VQPs) attached to a large VM (256 GB RAM, 32 CPUs x 2 threads per core), the device suspend time is reduced from ~37 ms to ~13 ms. A later patch will remove the link unregister operation which will make it even faster. Signed-off-by: Dragos Tatulea <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Acked-by: Eugenio Pérez <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]> Tested-by: Lei Yang <[email protected]>
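The two-phase flow described above (suspend every vq in parallel, then query every vq in parallel) can be sketched with a thread pool. This is only an analogy: the driver uses the mlx5 async firmware command API, not threads, and suspend_vq and query_vq below are hypothetical callables supplied by the caller.

```python
from concurrent.futures import ThreadPoolExecutor


def suspend_device(vqs, suspend_vq, query_vq):
    """Suspend all vqs in parallel, then query all of them in parallel."""
    with ThreadPoolExecutor() as pool:
        # Phase 1: issue every suspend and wait for all of them to finish.
        list(pool.map(suspend_vq, vqs))
        # Phase 2: only then query the suspended vqs, again in parallel.
        return list(pool.map(query_vq, vqs))
```

Waiting for phase 1 to finish before starting phase 2 ensures that each query observes a vq that has already been suspended.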
2024-09-25 | vdpa/mlx5: Use async API for vq modify commands | Dragos Tatulea | 1 | -48/+106
Switch firmware vq modify command to be issued via the async API to allow future parallelization. The new refactored function applies the modify on a range of vqs and waits for their execution to complete. For now the command is still used in a serial fashion. A later patch will switch to modifying multiple vqs in parallel. Signed-off-by: Dragos Tatulea <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]> Acked-by: Eugenio Pérez <[email protected]> Tested-by: Lei Yang <[email protected]>
2024-09-25 | vdpa/mlx5: Use async API for vq query command | Dragos Tatulea | 2 | -25/+78
Switch firmware vq query command to be issued via the async API to allow future parallelization. For now the command is still serial but the infrastructure is there to issue commands in parallel, including ratelimiting the number of issued async commands to firmware. A later patch will switch to issuing more commands at a time. Signed-off-by: Dragos Tatulea <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]> Tested-by: Lei Yang <[email protected]>
2024-09-25 | vdpa/mlx5: Introduce async fw command wrapper | Dragos Tatulea | 2 | -0/+88
Introduce a new function mlx5_vdpa_exec_async_cmds() which wraps the mlx5_core async firmware command API in a way that will be used to parallelize certain operations in this driver. The wrapper deals with the case when mlx5_cmd_exec_cb() returns EBUSY due to the command being throttled. Signed-off-by: Dragos Tatulea <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]> Acked-by: Eugenio Pérez <[email protected]> Tested-by: Lei Yang <[email protected]>
2024-09-25 | vdpa/mlx5: Introduce error logging function | Dragos Tatulea | 2 | -12/+17
mlx5_vdpa_err() was missing. This patch adds it and uses it in the necessary places. Signed-off-by: Dragos Tatulea <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Acked-by: Eugenio Pérez <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]> Tested-by: Lei Yang <[email protected]>
2024-09-25 | net/mlx5: Support throttled commands from async API | Dragos Tatulea | 1 | -5/+16
Currently, commands that qualify as throttled can't be used via the async API. That's due to the fact that the throttle semaphore can sleep but the async API can't. This patch allows throttling in the async API by using the tentative variant of the semaphore and upon failure (semaphore at 0) returns EBUSY to signal to the caller that they need to wait for the completion of previously issued commands. Furthermore, make sure that the semaphore is released in the callback. Signed-off-by: Dragos Tatulea <[email protected]> Cc: Leon Romanovsky <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]> Tested-by: Lei Yang <[email protected]>
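The throttling scheme described above can be modelled in a few lines. This is a userspace sketch, not the mlx5 implementation: the class and method names are hypothetical, and Python's threading.Semaphore stands in for the kernel semaphore. What it keeps from the commit message is the non-sleeping acquire, the EBUSY return when the semaphore is at 0, and the release on completion.

```python
import errno
import threading
from collections import deque


class ThrottledAsyncQueue:
    """Hypothetical model of a throttled, non-sleeping submit path."""

    def __init__(self, max_in_flight: int):
        self._sem = threading.Semaphore(max_in_flight)
        self._in_flight = deque()

    def submit(self, callback) -> int:
        # Non-blocking acquire: the async path must not sleep.
        if not self._sem.acquire(blocking=False):
            # Caller must wait for earlier commands to complete first.
            return -errno.EBUSY
        self._in_flight.append(callback)
        return 0

    def complete_one(self) -> None:
        # Completion path: release the throttle slot, then run the callback.
        callback = self._in_flight.popleft()
        self._sem.release()
        callback()
```

A caller that receives -EBUSY is expected to wait for a completion and retry, which is the case the vdpa wrapper in this series is described as handling.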
2024-09-25 | xen/pciback: fix cast to restricted pci_ers_result_t and pci_power_t | Min-Hua Chen | 2 | -2/+2
This patch fixes the following sparse warnings by applying a __force cast to pci_ers_result_t and pci_power_t:
  drivers/xen/xen-pciback/pci_stub.c:760:16: sparse: warning: cast to restricted pci_ers_result_t
  drivers/xen/xen-pciback/conf_space_capability.c:125:22: sparse: warning: cast to restricted pci_power_t
No functional changes intended. Signed-off-by: Min-Hua Chen <[email protected]> Reviewed-by: Juergen Gross <[email protected]> Message-ID: <[email protected]> Signed-off-by: Juergen Gross <[email protected]>
2024-09-25 | Merge tag 'nvme-6.12-2024-09-25' of git://git.infradead.org/nvme into for-6.12/block | Jens Axboe | 3 | -8/+12
Pull NVMe fixes from Keith:
"nvme fixes for Linux 6.12
- Multipath fixes (Hannes)
- Sysfs attribute list NULL terminate fix (Shin'ichiro)
- Remove problematic read-back (Keith)"
* tag 'nvme-6.12-2024-09-25' of git://git.infradead.org/nvme:
  nvme: remove CC register read-back during enabling
  nvme: null terminate nvme_tls_attrs
  nvme-multipath: avoid hang on inaccessible namespaces
  nvme-multipath: system fails to create generic nvme device
2024-09-25 | Revert "driver core: don't always lock parent in shutdown" | Greg Kroah-Hartman | 1 | -2/+2
This reverts commit ba6353748e71bd1d7e422fec2b5c2e2dfc2e3bd9. The series is being reverted before -rc1 as there are still reports of lockups on shutdown, so it's not quite ready for "prime time." Reported-by: Andrey Skvortsov <[email protected]> Link: https://lore.kernel.org/r/[email protected] Cc: Christoph Hellwig <[email protected]> Cc: David Jeffery <[email protected]> Cc: Keith Busch <[email protected]> Cc: Laurence Oberman <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Sagi Grimberg <[email protected]> Cc: Stuart Hayes <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2024-09-25 | Revert "driver core: separate function to shutdown one device" | Greg Kroah-Hartman | 1 | -36/+30
This reverts commit 95dc7565253a8564911190ebd1e4ffceb4de208a. The series is being reverted before -rc1 as there are still reports of lockups on shutdown, so it's not quite ready for "prime time." Reported-by: Andrey Skvortsov <[email protected]> Link: https://lore.kernel.org/r/[email protected] Cc: Christoph Hellwig <[email protected]> Cc: David Jeffery <[email protected]> Cc: Keith Busch <[email protected]> Cc: Laurence Oberman <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Sagi Grimberg <[email protected]> Cc: Stuart Hayes <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2024-09-25 | Revert "driver core: shut down devices asynchronously" | Greg Kroah-Hartman | 3 | -59/+1
This reverts commit 8064952c65045f05ee2671fe437770e50c151776. The series is being reverted before -rc1 as there are still reports of lockups on shutdown, so it's not quite ready for "prime time." Reported-by: Andrey Skvortsov <[email protected]> Link: https://lore.kernel.org/r/[email protected] Cc: Christoph Hellwig <[email protected]> Cc: David Jeffery <[email protected]> Cc: Keith Busch <[email protected]> Cc: Laurence Oberman <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Sagi Grimberg <[email protected]> Cc: Stuart Hayes <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2024-09-25 | Revert "nvme-pci: Make driver prefer asynchronous shutdown" | Greg Kroah-Hartman | 1 | -1/+0
This reverts commit ba82e10c3c6b5b5d2c8279a8bd0dae5c2abaacfc. The series is being reverted before -rc1 as there are still reports of lockups on shutdown, so it's not quite ready for "prime time." Reported-by: Andrey Skvortsov <[email protected]> Link: https://lore.kernel.org/r/[email protected] Cc: Christoph Hellwig <[email protected]> Cc: David Jeffery <[email protected]> Cc: Keith Busch <[email protected]> Cc: Laurence Oberman <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Sagi Grimberg <[email protected]> Cc: Stuart Hayes <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2024-09-25 | Revert "driver core: fix async device shutdown hang" | Greg Kroah-Hartman | 1 | -9/+1
This reverts commit 4f2c346e621624315e2a1405e98616a0c5ac146f. The series is being reverted before -rc1 as there are still reports of lockups on shutdown, so it's not quite ready for "prime time." Reported-by: Andrey Skvortsov <[email protected]> Link: https://lore.kernel.org/r/[email protected] Cc: Christoph Hellwig <[email protected]> Cc: David Jeffery <[email protected]> Cc: Keith Busch <[email protected]> Cc: Laurence Oberman <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Sagi Grimberg <[email protected]> Cc: Stuart Hayes <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2024-09-25 | drm/i915/dp: Fix colorimetry detection | Ville Syrjälä | 1 | -3/+6
intel_dp_init_connector() is no place for detecting stuff via DPCD (except perhaps for eDP). Move the colorimetry stuff into a more appropriate place. Cc: Jouni Högander <[email protected]> Fixes: 00076671a648 ("drm/i915/display: Move colorimetry_support from intel_psr to intel_dp") Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Jouni Högander <[email protected]> (cherry picked from commit 35dba4834bded843d5416e8caadfe82bd0ce1904) Signed-off-by: Joonas Lahtinen <[email protected]>
2024-09-25 | xen/privcmd: Add new syscall to get gsi from dev | Jiqian Chen | 5 | -3/+84
On PVH dom0, when passing a device through to a domU, QEMU and the xl tools want to use the gsi number to do pirq mapping (see the QEMU code path xen_pt_realize->xc_physdev_map_pirq and the xl code path pci_add_dm_done->xc_physdev_map_pirq). In the current code, however, the gsi number is read from the file /sys/bus/pci/devices/<sbdf>/irq, which is wrong: irq is not equal to gsi, because they are in different number spaces, so the pirq mapping fails. The current Linux code also gives userspace no way to get the gsi. To address this, record the gsi of pcistub devices when pcistub is initialized, and add a new syscall to privcmd so that userspace can get the gsi when it needs to. Signed-off-by: Jiqian Chen <[email protected]> Signed-off-by: Huang Rui <[email protected]> Signed-off-by: Jiqian Chen <[email protected]> Reviewed-by: Stefano Stabellini <[email protected]> Message-ID: <[email protected]> Signed-off-by: Juergen Gross <[email protected]>
2024-09-25 | xen/pvh: Setup gsi for passthrough device | Jiqian Chen | 6 | -1/+113
In PVH dom0, the gsis don't get registered, but the gsi of a passthrough device must be configured for it to be able to be mapped into a domU. When assigning a device to passthrough, proactively setup the gsi of the device during that process. Signed-off-by: Jiqian Chen <[email protected]> Signed-off-by: Huang Rui <[email protected]> Signed-off-by: Jiqian Chen <[email protected]> Reviewed-by: Stefano Stabellini <[email protected]> Message-ID: <[email protected]> Signed-off-by: Juergen Gross <[email protected]>
2024-09-25 | xen/pci: Add a function to reset device for xen | Jiqian Chen | 4 | -3/+51
When a device on the dom0 side has been reset, the vpci on the Xen side does not get a notification, so the cached state in vpci is out of date with respect to the real device state. To solve that problem, add a new function to clear all vpci device state when the device is reset on the dom0 side, and call that function in pcistub_init_device. This is needed because using "pci-assignable-add" to assign a passthrough device in Xen resets the passthrough device, which leaves the vpci state out of date, and the device then fails to restore its BAR state. Signed-off-by: Jiqian Chen <[email protected]> Signed-off-by: Huang Rui <[email protected]> Signed-off-by: Jiqian Chen <[email protected]> Reviewed-by: Stefano Stabellini <[email protected]> Message-ID: <[email protected]> Signed-off-by: Juergen Gross <[email protected]>
2024-09-24 | nvme: remove CC register read-back during enabling | Keith Busch | 1 | -5/+0
Any non-posted read should flush the previous write, so we don't necessarily need to read back the value we just wrote. I've found at least some controllers that respond with 0 for short moments after writing the CC register with EN (enable) cleared, so the read-back is overwriting our valid ctrl_config value and ends up breaking on the subsequent enabling. Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Keith Busch <[email protected]>
2024-09-24 | nvme: null terminate nvme_tls_attrs | Shin'ichiro Kawasaki | 1 | -0/+1
Commit 1e48b34c9bc7 ("nvme: split off TLS sysfs attributes into a separate group") introduced the struct attribute array nvme_tls_attrs. However, the array was not null terminated and caused BUG KASAN global-out-of-bounds. To avoid the BUG, null terminate the array. Reported-by: Yi Zhang <[email protected]> Closes: https://lore.kernel.org/linux-nvme/jhllwfxcedrcxcnbajwl4x2l2ujcqowqcd4ps574zrafrqhjna@f4icvecutekm/ Fixes: 1e48b34c9bc7 ("nvme: split off TLS sysfs attributes into a separate group") Signed-off-by: Shin'ichiro Kawasaki <[email protected]> Tested-by: Yi Zhang <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Keith Busch <[email protected]>
2024-09-24 | nvme-multipath: avoid hang on inaccessible namespaces | Hannes Reinecke | 1 | -2/+10
During repetitive namespace remapping operations on the target the namespace might have changed between the time the initial scan was performed, and partition scan was invoked by device_add_disk() in nvme_mpath_set_live(). We then end up with a stuck scanning process:
  [<0>] folio_wait_bit_common+0x12a/0x310
  [<0>] filemap_read_folio+0x97/0xd0
  [<0>] do_read_cache_folio+0x108/0x390
  [<0>] read_part_sector+0x31/0xa0
  [<0>] read_lba+0xc5/0x160
  [<0>] efi_partition+0xd9/0x8f0
  [<0>] bdev_disk_changed+0x23d/0x6d0
  [<0>] blkdev_get_whole+0x78/0xc0
  [<0>] bdev_open+0x2c6/0x3b0
  [<0>] bdev_file_open_by_dev+0xcb/0x120
  [<0>] disk_scan_partitions+0x5d/0x100
  [<0>] device_add_disk+0x402/0x420
  [<0>] nvme_mpath_set_live+0x4f/0x1f0 [nvme_core]
  [<0>] nvme_mpath_add_disk+0x107/0x120 [nvme_core]
  [<0>] nvme_alloc_ns+0xac6/0xe60 [nvme_core]
  [<0>] nvme_scan_ns+0x2dd/0x3e0 [nvme_core]
  [<0>] nvme_scan_work+0x1a3/0x490 [nvme_core]
This happens when we have several paths, some of which are inaccessible, and the active paths are removed first. Then nvme_find_path() will requeue I/O in the ns_head (as paths are present), but the requeue list is never triggered as all remaining paths are inactive. This patch checks for NVME_NSHEAD_DISK_LIVE in nvme_available_path(), and requeues I/O after NVME_NSHEAD_DISK_LIVE has been cleared once the last path has been removed to properly terminate pending I/O. Signed-off-by: Hannes Reinecke <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Keith Busch <[email protected]>
2024-09-24 | nvme-multipath: system fails to create generic nvme device | Hannes Reinecke | 1 | -1/+1
NVME_NSHEAD_DISK_LIVE is a flag for struct nvme_ns_head, not nvme_ns. The current code has a typo causing NVME_NSHEAD_DISK_LIVE never to be cleared once device_add_disk() fails, causing the system never to create the 'generic' character device. Even several rescan attempts will not change the situation and the system has to be rebooted to fix the issue. Fixes: 11384580e332 ("nvme-multipath: add error handling support for add_disk()") Signed-off-by: Hannes Reinecke <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Keith Busch <[email protected]>
2024-09-24 | netfs, cifs: Fix mtime/ctime update for mmapped writes | David Howells | 1 | -0/+1
The cifs flag CIFS_INO_MODIFIED_ATTR, which indicates that the mtime and ctime need to be written back on close, got taken over by netfs as NETFS_ICTX_MODIFIED_ATTR to avoid the need to call a function pointer to set it. The flag gets set correctly on buffered writes, but doesn't get set by netfs_page_mkwrite(), leading to occasional failures in generic/080 and generic/215. Fix this by setting the flag in netfs_page_mkwrite(). Fixes: 73425800ac94 ("netfs, cifs: Move CIFS_INO_MODIFIED_ATTR to netfs_inode") Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/oe-lkp/[email protected] Signed-off-by: David Howells <[email protected]> Reviewed-by: Paulo Alcantara (Red Hat) <[email protected]> cc: Jeff Layton <[email protected]> cc: [email protected] cc: [email protected] cc: [email protected] Signed-off-by: Steve French <[email protected]>
2024-09-24 | cifs: update internal version number | Steve French | 1 | -2/+2
To 2.51 Signed-off-by: Steve French <[email protected]>
2024-09-24 | smb: client: print failed session logoffs with FYI | Paulo Alcantara | 1 | -2/+1
Do not flood dmesg with failed session logoffs as kerberos tickets getting expired or passwords being rotated is a very common scenario. Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]> Signed-off-by: Steve French <[email protected]>
2024-09-24 | cifs: Fix reversion of the iter in cifs_readv_receive(). | David Howells | 3 | -11/+7
cifs_read_iter_from_socket() copies the iterator that's passed in for the socket to modify as and if it will, and then advances the original iterator by the amount sent. However, both callers revert the advancement (although receive_encrypted_read() zeros beyond the iterator first). The problem is, though, that cifs_readv_receive() reverts by the original length, not the amount transmitted which can cause an oops in iov_iter_revert(). Fix this by:
 (1) Remove the iov_iter_advance() from cifs_read_iter_from_socket().
 (2) Remove the iov_iter_revert() from both callers. This fixes the bug in cifs_readv_receive().
 (3) In receive_encrypted_read(), if we didn't get back as much data as the buffer will hold, copy the iterator, advance the copy and use the copy to drive iov_iter_zero().
As a bonus, this gets rid of some unnecessary work. This was triggered by generic/074 with the "-o sign" mount option. Fixes: 3ee1a1fc3981 ("cifs: Cut over to using netfslib") Signed-off-by: David Howells <[email protected]> cc: Steve French <[email protected]> cc: Paulo Alcantara <[email protected]> cc: Shyam Prasad N <[email protected]> cc: Rohith Surabattula <[email protected]> cc: Jeff Layton <[email protected]> cc: [email protected] cc: [email protected] cc: [email protected] Signed-off-by: Steve French <[email protected]>
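The advance/revert mismatch described above can be reproduced with a toy cursor. This is a hypothetical model, not the kernel's iov_iter: the class and both functions are made up for illustration, and the model raises an exception where the commit message reports an oops.

```python
class Cursor:
    """Toy iterator position over a buffer of `size` bytes (hypothetical)."""

    def __init__(self, size: int):
        self.size = size
        self.pos = 0

    def advance(self, n: int) -> None:
        if n > self.size - self.pos:
            raise ValueError("advance past end")
        self.pos += n

    def revert(self, n: int) -> None:
        # Reverting further than we ever advanced is the failure mode.
        if n > self.pos:
            raise ValueError("revert past start")
        self.pos -= n


def read_buggy(cursor: Cursor, requested: int, received: int) -> int:
    cursor.advance(received)   # helper advances by what actually arrived
    cursor.revert(requested)   # caller reverts by what it asked for
    return received


def read_fixed(cursor: Cursor, requested: int, received: int) -> int:
    # Neither advance nor revert: the caller's cursor is left untouched.
    return received
```

With a short read (say 40 of 100 requested bytes), read_buggy() tries to revert 100 after advancing 40, while read_fixed() never moves the cursor, so there is nothing to undo.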
2024-09-24 | smb3: fix incorrect mode displayed for read-only files | Steve French | 1 | -8/+11
Commands like "chmod 0444" mark a file readonly via the attribute flag (when mapping of mode bits into the ACL are not set, or POSIX extensions are not negotiated), but they were not reported correctly for stat of directories (they were reported ok for files and for "ls"). See example below:
  root:~# ls /mnt2 -l
  total 12
  drwxr-xr-x 2 root root         0 Sep 21 18:03 normaldir
  -rwxr-xr-x 1 root root         0 Sep 21 23:24 normalfile
  dr-xr-xr-x 2 root root         0 Sep 21 17:55 readonly-dir
  -r-xr-xr-x 1 root root 209716224 Sep 21 18:15 readonly-file
  root:~# stat -c %a /mnt2/readonly-dir
  755
  root:~# stat -c %a /mnt2/readonly-file
  555
This fixes the stat of directories when ATTR_READONLY is set (in cases where the mode can not be obtained other ways).
  root:~# stat -c %a /mnt2/readonly-dir
  555
Cc: [email protected] Signed-off-by: Steve French <[email protected]>
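The 755 versus 555 difference in the example above is what clearing the write bits produces. The sketch below shows only that arithmetic; it assumes the read-only attribute maps to "no write permission for anyone", which is an illustration and not the client's actual mapping code. The function name is hypothetical.

```python
import stat


def mode_with_readonly(mode: int) -> int:
    """Clear the user, group and other write bits of a permission mode."""
    write_bits = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH  # 0o222
    return mode & ~write_bits
```

Applied to the default 0o755 this yields 0o555, matching the value stat reports for readonly-dir after the fix.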
2024-09-24 | smb: client: fix parsing of device numbers | Paulo Alcantara | 2 | -11/+4
Report correct major and minor numbers from special files created with NFS reparse points. Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]> Signed-off-by: Steve French <[email protected]>
2024-09-24 | smb: client: set correct device number on nfs reparse points | Paulo Alcantara | 1 | -2/+2
Fix major and minor numbers set on special files created with NFS reparse points. Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]> Signed-off-by: Steve French <[email protected]>
2024-09-24 | smb: client: propagate error from cifs_construct_tcon() | Paulo Alcantara | 1 | -6/+10
Propagate error from cifs_construct_tcon() in cifs_sb_tlink() instead of always returning -EACCES. Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]> Signed-off-by: Steve French <[email protected]>
2024-09-24 | smb: client: fix DFS failover in multiuser mounts | Paulo Alcantara | 1 | -1/+2
For sessions and tcons created on behalf of new users accessing a multiuser mount, matching their sessions in tcon_super_cb() with master tcon will always lead to false as every new user will have its own session and tcon. All multiuser sessions, however, will inherit ->dfs_root_ses from master tcon, so match it instead. Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]> Signed-off-by: Steve French <[email protected]>
2024-09-24 | cifs: Make the write_{enter,done,err} tracepoints display netfs info | David Howells | 2 | -10/+18
Make the write RPC tracepoints use the same trace macro complexes as the read tracepoints and display the netfs request and subrequest IDs where available (see commit 519be989717c "cifs: Add a tracepoint to track credits involved in R/W requests"). Signed-off-by: David Howells <[email protected]> cc: Steve French <[email protected]> cc: Paulo Alcantara (Red Hat) <[email protected]> cc: Jeff Layton <[email protected]> cc: [email protected] cc: [email protected] cc: [email protected] Signed-off-by: Steve French <[email protected]>
2024-09-24 | smb: client: fix DFS interlink failover | Paulo Alcantara | 9 | -86/+94
The DFS interlinks point to different DFS namespaces so make sure to use the correct DFS root server to chase any DFS links under it by storing the SMB session in dfs_ref_walk structure and then using it on every referral walk. Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]> Signed-off-by: Steve French <[email protected]>
2024-09-24 | smb: client: improve purging of cached referrals | Paulo Alcantara | 1 | -17/+15
Purge cached referrals that have a single target when reaching maximum of cache size as the client won't need them to failover. Otherwise remove oldest cache entry. Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]> Signed-off-by: Steve French <[email protected]>
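The eviction policy described above can be sketched with an ordered map. This is one reading of the commit message, not the client's code: the function name is hypothetical, insertion order is assumed to stand in for entry age, and the sketch drops every single-target entry rather than just one.

```python
from collections import OrderedDict


def purge_referrals(cache: OrderedDict) -> list:
    """Make room in a full cache and return the paths that were evicted.

    `cache` maps a DFS path to its list of targets, oldest entry first.
    """
    # Entries with a single target are useless for failover: drop them first.
    single_target = [path for path, targets in cache.items() if len(targets) == 1]
    for path in single_target:
        del cache[path]
    if single_target:
        return single_target
    # Otherwise fall back to evicting the oldest entry.
    oldest, _ = cache.popitem(last=False)
    return [oldest]
```

Preferring single-target entries keeps the referrals that actually offer an alternative server during failover.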
2024-09-24 | smb: client: avoid unnecessary reconnects when refreshing referrals | Paulo Alcantara | 1 | -70/+117
Do not mark tcons for reconnect when current connection matches any of the targets returned by new referral even when there is no cached entry. Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]> Signed-off-by: Steve French <[email protected]>
2024-09-25 | Merge tag 'drm-xe-next-fixes-2024-09-19' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next | Dave Airlie | 7 | -17/+69
Driver Changes:
- Fix macro for checking minimum GuC version (Michal Wajdeczko)
- Fix CCS offset calculation for some BMG SKUs (Matthew Auld)
- Fix locking on memory usage reporting via fdinfo and BO destroy (Matthew Auld)
- Fix GPU page fault handler on a closed VM (Matthew Brost)
- Fix overflow in oa batch buffer (José)
Signed-off-by: Dave Airlie <[email protected]> From: Lucas De Marchi <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/lr6vhd7x5eb7gubd7utfmnwzvfqfslji4kssxyqisynzlvqjni@svgm6jot7r66
2024-09-25 | Merge tag 'drm-intel-next-fixes-2024-09-19' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next | Dave Airlie | 4 | -14/+35
- Fix BMG support to UHBR13.5
- Two PSR fixes
Signed-off-by: Dave Airlie <[email protected]> From: Joonas Lahtinen <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-09-24 | Merge tag 'nfs-for-6.12-1' of git://git.linux-nfs.org/projects/anna/linux-nfs | Linus Torvalds | 58 | -464/+2519
Pull NFS client updates from Anna Schumaker:
"New Features:
- Add a 'noalignwrite' mount option for lock-less 'lost writes' prevention
- Add support for the LOCALIO protocol extension
Bugfixes:
- Fix memory leak in error path of nfs4_do_reclaim()
- Simplify and guarantee lock owner uniqueness
- Fix -Wformat-truncation warning
- Fix folio refcounts by using folio_attach_private()
- Fix failing the mount system call when the server is down
- Fix detection of "Proxying of Times" server support
Cleanups:
- Annotate struct nfs_cache_array with __counted_by()
- Remove unnecessary NULL checks before kfree()
- Convert RPC_TASK_* constants to an enum
- Remove obsolete or misleading comments and declarations"
* tag 'nfs-for-6.12-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (41 commits)
  nfs: Fix `make htmldocs` warnings in the localio documentation
  nfs: add "NFS Client and Server Interlock" section to localio.rst
  nfs: add FAQ section to Documentation/filesystems/nfs/localio.rst
  nfs: add Documentation/filesystems/nfs/localio.rst
  nfs: implement client support for NFS_LOCALIO_PROGRAM
  nfs/localio: use dedicated workqueues for filesystem read and write
  pnfs/flexfiles: enable localio support
  nfs: enable localio for non-pNFS IO
  nfs: add LOCALIO support
  nfs: pass struct nfsd_file to nfs_init_pgio and nfs_init_commit
  nfsd: implement server support for NFS_LOCALIO_PROGRAM
  nfsd: add LOCALIO support
  nfs_common: prepare for the NFS client to use nfsd_file for LOCALIO
  nfs_common: add NFS LOCALIO auxiliary protocol enablement
  SUNRPC: replace program list with program array
  SUNRPC: add svcauth_map_clnt_to_svc_cred_local
  SUNRPC: remove call_allocate() BUG_ONs
  nfsd: add nfsd_serv_try_get and nfsd_serv_put
  nfsd: add nfsd_file_acquire_local()
  nfsd: factor out __fh_verify to allow NULL rqstp to be passed
  ...
2024-09-24 | Merge tag 'fuse-update-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse | Linus Torvalds | 15 | -297/+552
Pull fuse updates from Miklos Szeredi:
- Add support for idmapped fuse mounts (Alexander Mikhalitsyn)
- Add optimization when checking for writeback (yangyun)
- Add tracepoints (Josef Bacik)
- Clean up writeback code (Joanne Koong)
- Clean up request queuing (me)
- Misc fixes
* tag 'fuse-update-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: (32 commits)
  fuse: use exclusive lock when FUSE_I_CACHE_IO_MODE is set
  fuse: clear FR_PENDING if abort is detected when sending request
  fs/fuse: convert to use invalid_mnt_idmap
  fs/mnt_idmapping: introduce an invalid_mnt_idmap
  fs/fuse: introduce and use fuse_simple_idmap_request() helper
  fs/fuse: fix null-ptr-deref when checking SB_I_NOIDMAP flag
  fuse: allow O_PATH fd for FUSE_DEV_IOC_BACKING_OPEN
  virtio_fs: allow idmapped mounts
  fuse: allow idmapped mounts
  fuse: warn if fuse_access is called when idmapped mounts are allowed
  fuse: handle idmappings properly in ->write_iter()
  fuse: support idmapped ->rename op
  fuse: support idmapped ->set_acl
  fuse: drop idmap argument from __fuse_get_acl
  fuse: support idmapped ->setattr op
  fuse: support idmapped ->permission inode op
  fuse: support idmapped getattr inode op
  fuse: support idmap for mkdir/mknod/symlink/create/tmpfile
  fuse: support idmapped FUSE_EXT_GROUPS
  fuse: add an idmap argument to fuse_simple_request
  ...
2024-09-24 | Merge tag 'exfat-for-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat | Linus Torvalds | 9 | -127/+200
Pull exfat updates from Namjae Jeon:
- Clean up unnecessary code now that ->valid_size is supported
- buffered-IO fallback is no longer needed when using direct-IO
- Move ->valid_size extension from mmap to ->page_mkwrite. This reduces the overhead caused by unnecessary zero-out during mmap.
- Fix memleaks from exfat_load_bitmap() and exfat_create_upcase_table()
- Add sops->shutdown and ioctl
- Add Yuezhang Mo as a reviewer
* tag 'exfat-for-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
  MAINTAINERS: exfat: add myself as reviewer
  exfat: resolve memory leak from exfat_create_upcase_table()
  exfat: move extend valid_size into ->page_mkwrite()
  exfat: fix memory leak in exfat_load_bitmap()
  exfat: Implement sops->shutdown and ioctl
  exfat: do not fallback to buffered write
  exfat: drop ->i_size_ondisk
2024-09-24 | Merge tag 'f2fs-for-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs | Linus Torvalds | 23 | -463/+798
Pull f2fs updates from Jaegeuk Kim:
"The main changes include converting major IO paths to use folio, and adding various knobs to control GC more flexibly for Zoned devices. In addition, there are several patches to address corner cases of atomic file operations and better support for file pinning on zoned device.
Enhancement:
- add knobs to tune foreground/background GCs for Zoned devices
- convert IO paths to use folio
- reduce expensive checkpoint trigger frequency
- allow F2FS_IPU_NOCACHE for pinned file
- forcibly migrate to secure space for zoned device file pinning
- get rid of buffer_head use
- add write priority option based on zone UFS
- get rid of online repair on corrupted directory
Bug fixes:
- fix to don't panic system for no free segment fault injection
- fix to don't set SB_RDONLY in f2fs_handle_critical_error()
- avoid unused block when dio write in LFS mode
- compress: don't redirty sparse cluster during {,de}compress
- check discard support for conventional zones
- atomic: prevent atomic file from being dirtied before commit
- atomic: fix to check atomic_file in f2fs ioctl interfaces
- atomic: fix to forbid dio in atomic_file
- atomic: fix to truncate pagecache before on-disk metadata truncation
- atomic: create COW inode from parent dentry
- atomic: fix to avoid racing w/ GC
- atomic: require FMODE_WRITE for atomic write ioctls
- fix to wait page writeback before setting gcing flag
- fix to avoid racing in between read and OPU dio write, dio completion
- fix several potential integer overflows in file offsets and dir_block_index
- fix to avoid use-after-free in f2fs_stop_gc_thread()
As usual, there are several code clean-ups and refactorings"
* tag 'f2fs-for-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (60 commits)
  f2fs: allow F2FS_IPU_NOCACHE for pinned file
  f2fs: forcibly migrate to secure space for zoned device file pinning
  f2fs: remove unused parameters
  f2fs: fix to don't panic system for no free segment fault injection
  f2fs: fix to don't set SB_RDONLY in f2fs_handle_critical_error()
  f2fs: add valid block ratio not to do excessive GC for one time GC
  f2fs: create gc_no_zoned_gc_percent and gc_boost_zoned_gc_percent
  f2fs: do FG_GC when GC boosting is required for zoned devices
  f2fs: increase BG GC migration window granularity when boosted for zoned devices
  f2fs: add reserved_segments sysfs node
  f2fs: introduce migration_window_granularity
  f2fs: make BG GC more aggressive for zoned devices
  f2fs: avoid unused block when dio write in LFS mode
  f2fs: fix to check atomic_file in f2fs ioctl interfaces
  f2fs: get rid of online repaire on corrupted directory
  f2fs: prevent atomic file from being dirtied before commit
  f2fs: get rid of page->index
  f2fs: convert read_node_page() to use folio
  f2fs: convert __write_node_page() to use folio
  f2fs: convert f2fs_write_data_page() to use folio
  ...
2024-09-24Merge tag 'bpf-next-6.12-struct-fd' of ↵Linus Torvalds12-303/+179
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Pull bpf 'struct fd' updates from Alexei Starovoitov:
 "This includes struct_fd BPF changes from Al and Andrii"

* tag 'bpf-next-6.12-struct-fd' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
  bpf: convert bpf_token_create() to CLASS(fd, ...)
  security,bpf: constify struct path in bpf_token_create() LSM hook
  bpf: more trivial fdget() conversions
  bpf: trivial conversions for fdget()
  bpf: switch maps to CLASS(fd, ...)
  bpf: factor out fetching bpf_map from FD and adding it to used_maps list
  bpf: switch fdget_raw() uses to CLASS(fd_raw, ...)
  bpf: convert __bpf_prog_get() to CLASS(fd, ...)
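
The CLASS(fd, ...) conversions listed above rely on scope-based cleanup: the reference is dropped automatically when the variable leaves scope, so early returns need no explicit release. The following is a minimal userspace sketch of that idea, not the kernel code. It assumes GCC or Clang (for the `cleanup` variable attribute), and `SCOPED_FD`, `fd_cleanup`, `close_calls`, and `read_first_byte` are hypothetical names chosen for illustration.

```c
#include <fcntl.h>
#include <unistd.h>

/* Counts how many descriptors the cleanup handler has closed (illustration only). */
static int close_calls;

/* Cleanup handler: invoked automatically when the annotated variable goes out of scope. */
static void fd_cleanup(int *fd)
{
    if (*fd >= 0) {
        close(*fd);
        close_calls++;
    }
}

/* Hypothetical helper macro: declare an fd that is closed on every exit path. */
#define SCOPED_FD(name, expr) \
    int name __attribute__((cleanup(fd_cleanup))) = (expr)

/* Returns the first byte of the file, -1 if it cannot be opened, -2 if it is empty. */
static int read_first_byte(const char *path)
{
    SCOPED_FD(fd, open(path, O_RDONLY));
    unsigned char c;

    if (fd < 0)
        return -1;      /* nothing to close; the handler skips negative fds */
    if (read(fd, &c, 1) != 1)
        return -2;      /* fd is closed here without an explicit close() */
    return c;           /* fd is closed here as well */
}
```

Every return path in `read_first_byte()` releases the descriptor exactly once, which is the property that makes such conversions mostly mechanical.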
2024-09-24ceph: remove the incorrect Fw reference check when dirtying pagesXiubo Li1-1/+0
When doing direct-io reads, the client will also try to mark pages dirty, but the read path won't hold the Fw caps and there is no case in which it will get the Fw reference. Fixes: 5dda377cf0a6 ("ceph: set i_head_snapc when getting CEPH_CAP_FILE_WR reference") Signed-off-by: Xiubo Li <[email protected]> Reviewed-by: Patrick Donnelly <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2024-09-24ceph: Remove empty definition in header fileZhang Zekun1-4/+0
The real definition of ceph_acl_chmod() has been removed since commit 4db658ea0ca2 ("ceph: Fix up after semantic merge conflict"), but the empty definition was left untouched in the header file. Let's remove the empty definition. Signed-off-by: Zhang Zekun <[email protected]> Reviewed-by: Xiubo Li <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2024-09-24ceph: Fix typo in the commentYan Zhen4-4/+4
Correctly spelled comments make it easier for the reader to understand the code. Replace 'tagert' with 'target', 'vaild' with 'valid', 'carefull' with 'careful', and 'trsaverse' with 'traverse' in the comments. Signed-off-by: Yan Zhen <[email protected]> Reviewed-by: Xiubo Li <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2024-09-24ceph: fix a memory leak on cap_auths in MDS clientLuis Henriques (SUSE)1-0/+12
The cap_auths that are allocated during an MDS session opening are never released, causing a memory leak detected by kmemleak. Fix this by freeing the memory allocated when shutting down the MDS client. Fixes: 1d17de9534cb ("ceph: save cap_auths in MDS client when session is opened") Signed-off-by: Luis Henriques (SUSE) <[email protected]> Reviewed-by: Xiubo Li <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
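
The shape of this fix, pairing an allocation made when a session is opened with a release at client shutdown, can be sketched in userspace C. This is an illustration under stated assumptions, not the kernel patch: `struct cap_auth`, `struct mds_client`, `client_open_session()`, and `client_shutdown()` are hypothetical stand-ins, and `malloc`/`free` stand in for the kernel allocators.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one per-session authorization record. */
struct cap_auth {
    char *path;
    char *fs_name;
};

/* Hypothetical stand-in for the client object that owns the records. */
struct mds_client {
    struct cap_auth *cap_auths;
    size_t cap_auths_num;
};

/* Copy a string into freshly allocated memory; returns NULL on failure. */
static char *dup_str(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);

    if (copy)
        memcpy(copy, s, len);
    return copy;
}

/* Allocate n records when the session is opened. */
static int client_open_session(struct mds_client *c, size_t n)
{
    c->cap_auths = calloc(n, sizeof(*c->cap_auths));
    if (!c->cap_auths)
        return -1;
    c->cap_auths_num = n;
    for (size_t i = 0; i < n; i++) {
        c->cap_auths[i].path = dup_str("/");
        c->cap_auths[i].fs_name = dup_str("cephfs");
    }
    return 0;
}

/* Release every nested allocation, then the array itself, at shutdown. */
static void client_shutdown(struct mds_client *c)
{
    for (size_t i = 0; i < c->cap_auths_num; i++) {
        free(c->cap_auths[i].path);     /* free(NULL) is a no-op */
        free(c->cap_auths[i].fs_name);
    }
    free(c->cap_auths);
    c->cap_auths = NULL;
    c->cap_auths_num = 0;
}
```

Resetting the pointer and the count after freeing keeps a second call to `client_shutdown()` harmless in this sketch.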
2024-09-24ceph: flush all caps releases when syncing the whole filesystemXiubo Li4-0/+25
We have hit a race between cap releases and a cap revoke request that causes check_caps() to miss sending a cap revoke ack to the MDS. The client then depends on the cap release to release the revoking caps, which could be delayed for unknown reasons. In Kclient we have figured out the RCA of the race, and a way to explicitly trigger the cap releases manually could help to get rid of the caps revoke stuck issue. Link: https://tracker.ceph.com/issues/67221 Signed-off-by: Xiubo Li <[email protected]> Reviewed-by: Ilya Dryomov <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>