aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2022-12-12vfio/mlx5: fix error code in mlx5vf_precopy_ioctl()Dan Carpenter1-1/+4
The copy_to_user() function returns the number of bytes remaining to be copied but we want to return a negative error code here. Fixes: 0dce165b1adf ("vfio/mlx5: Introduce vfio precopy ioctl implementation") Signed-off-by: Dan Carpenter <[email protected]> Reviewed-by: Yishai Hadas <[email protected]> Link: https://lore.kernel.org/r/Y5IKVknlf5Z5NPtU@kili Signed-off-by: Alex Williamson <[email protected]>
2022-12-12samples: vfio-mdev: Fix missing pci_disable_device() in mdpy_fb_probe()Shang XiaoJing1-1/+7
Add missing pci_disable_device() in fail path of mdpy_fb_probe(). Besides, fix missing release functions in mdpy_fb_remove(). Fixes: cacade1946a4 ("sample: vfio mdev display - guest driver") Signed-off-by: Shang XiaoJing <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2022-12-06hisi_acc_vfio_pci: Enable PRE_COPY flagShameer Kolothum1-1/+1
Now that we have everything to support the PRE_COPY state, enable it. Signed-off-by: Shameer Kolothum <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2022-12-06hisi_acc_vfio_pci: Move the dev compatibility tests for early checkShameer Kolothum2-12/+8
Instead of waiting till data transfer is complete to perform dev compatibility, do it as soon as we have enough data to perform the check. This will be useful when we enable the support for PRE_COPY. Signed-off-by: Shameer Kolothum <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2022-12-06hisi_acc_vfio_pci: Introduce support for PRE_COPY state transitionsShameer Kolothum1-3/+71
The saving_migf is open in PRE_COPY state if it is supported and reads initial device match data. hisi_acc_vf_stop_copy() is refactored to make use of common code. Signed-off-by: Shameer Kolothum <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2022-12-06hisi_acc_vfio_pci: Add support for precopy IOCTLShameer Kolothum2-0/+53
PRECOPY IOCTL in the case of HiSiIicon ACC driver can be used to perform the device compatibility check earlier during migration. Signed-off-by: Shameer Kolothum <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2022-12-06vfio/mlx5: Enable MIGRATION_PRE_COPY flagShay Drory1-0/+5
Now that everything has been set up for MIGRATION_PRE_COPY, enable it. Signed-off-by: Shay Drory <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Signed-off-by: Yishai Hadas <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2022-12-06vfio/mlx5: Fallback to STOP_COPY upon specific PRE_COPY errorShay Drory3-3/+32
Before a SAVE command is issued, a QUERY command is issued in order to know the device data size. In case PRE_COPY is used, the above commands are issued while the device is running. Thus, it is possible that between the QUERY and the SAVE commands the state of the device will be changed significantly and thus the SAVE will fail. Currently, if a SAVE command is failing, the driver will fail the migration. In the above case, don't fail the migration, but don't allow for new SAVEs to be executed while the device is in a RUNNING state. Once the device will be moved to STOP_COPY, SAVE can be executed again and the full device state will be read. Signed-off-by: Shay Drory <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Signed-off-by: Yishai Hadas <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2022-12-06vfio/mlx5: Introduce multiple loadsYishai Hadas3-45/+257
In order to support PRE_COPY, mlx5 driver transfers multiple states (images) of the device. e.g.: the source VF can save and transfer multiple states, and the target VF will load them by that order. This patch implements the changes for the target VF to decompose the header for each state and to write and load multiple states. Reviewed-by: Jason Gunthorpe <[email protected]> Signed-off-by: Yishai Hadas <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2022-12-06vfio/mlx5: Consider temporary end of stream as part of PRE_COPYYishai Hadas3-2/+14
During PRE_COPY the migration data FD may have a temporary "end of stream" that is reached when the initial_bytes were read and no other dirty data exists yet. For instance, this may indicate that the device is idle and not currently dirtying any internal state. When read() is done on this temporary end of stream the kernel driver should return ENOMSG from read(). Userspace can wait for more data or consider moving to STOP_COPY. To not block the user upon read() and let it get ENOMSG we add a new state named MLX5_MIGF_STATE_PRE_COPY on the migration file. In addition, we add the MLX5_MIGF_STATE_SAVE_LAST state to block the read() once we call the last SAVE upon moving to STOP_COPY. Any further error will be marked with MLX5_MIGF_STATE_ERROR and the user won't be blocked. Reviewed-by: Jason Gunthorpe <[email protected]> Signed-off-by: Yishai Hadas <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2022-12-06vfio/mlx5: Introduce vfio precopy ioctl implementationYishai Hadas2-0/+127
vfio precopy ioctl returns an estimation of data available for transferring from the device. Whenever a user is using VFIO_MIG_GET_PRECOPY_INFO, track the current state of the device, and if needed, append the dirty data to the transfer FD data. This is done by saving a middle state. As mlx5 runs the SAVE command asynchronously, make sure to query for incremental data only once there is no active save command. Running both in parallel, might end-up with a failure in the incremental query command on un-tracked vhca. Also, a middle state will be saved only after the previous state has finished its SAVE command and has been fully transferred, this prevents endless use resources. Co-developed-by: Shay Drory <[email protected]> Signed-off-by: Shay Drory <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Signed-off-by: Yishai Hadas <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2022-12-06vfio/mlx5: Introduce SW headers for migration statesYishai Hadas3-4/+67
As mentioned in the previous patches, mlx5 is transferring multiple states when the PRE_COPY protocol is used. This states mechanism requires the target VM to know the states' size in order to execute multiple loads. Therefore, add SW header, with the needed information, for each saved state the source VM is transferring to the target VM. This patch implements the source VM handling of the headers, following patch will implement the target VM handling of the headers. Reviewed-by: Jason Gunthorpe <[email protected]> Signed-off-by: Yishai Hadas <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2022-12-06vfio/mlx5: Introduce device transitions of PRE_COPYYishai Hadas3-18/+184
In order to support PRE_COPY, mlx5 driver is transferring multiple states (images) of the device. e.g.: the source VF can save and transfer multiple states, and the target VF will load them by that order. The device is saving three kinds of states: 1) Initial state - when the device moves to PRE_COPY state. 2) Middle state - during PRE_COPY phase via VFIO_MIG_GET_PRECOPY_INFO. There can be multiple states of this type. 3) Final state - when the device moves to STOP_COPY state. After moving to PRE_COPY state, user is holding the saving migf FD and can use it. For example: user can start transferring data via read() callback. Also, user can switch from PRE_COPY to STOP_COPY whenever he sees it fits. This will invoke saving of final state. This means that mlx5 VFIO device can be switched to STOP_COPY without transferring any data in PRE_COPY state. Therefore, when the device moves to STOP_COPY, mlx5 will store the final state on a dedicated queue entry on the list. Co-developed-by: Shay Drory <[email protected]> Signed-off-by: Shay Drory <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Signed-off-by: Yishai Hadas <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2022-12-06vfio/mlx5: Refactor to use queue based data chunksYishai Hadas3-38/+136
Refactor to use queue based data chunks on the migration file. The SAVE command adds a chunk to the tail of the queue while the read() API finds the required chunk and returns its data. In case the queue is empty but the state of the migration file is MLX5_MIGF_STATE_COMPLETE, read() may not be blocked but will return 0 to indicate end of file. This is a step towards maintaining multiple images and their meta data (i.e. headers) on the migration file as part of next patches from the series. Note: At that point, we still use a single chunk on the migration file but becomes ready to support multiple. Reviewed-by: Jason Gunthorpe <[email protected]> Signed-off-by: Yishai Hadas <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2022-12-06vfio/mlx5: Refactor migration file stateYishai Hadas3-8/+12
Refactor migration file state to be an emum which is mutual exclusive. As of that dropped the 'disabled' state as 'error' is the same from functional point of view. Next patches from the series will extend this enum for other relevant states. Reviewed-by: Jason Gunthorpe <[email protected]> Signed-off-by: Yishai Hadas <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2022-12-06vfio/mlx5: Refactor MKEY usageYishai Hadas3-113/+178
This patch refactors MKEY usage such as its life cycle will be as of the migration file instead of allocating/destroying it upon each SAVE/LOAD command. This is a preparation step towards the PRE_COPY series where multiple images will be SAVED/LOADED. We achieve it by having a new struct named mlx5_vhca_data_buffer which holds the mkey and its related stuff as of sg_append_table, allocated_length, etc. The above fields were taken out from the migration file main struct, into mlx5_vhca_data_buffer dedicated struct with the proper helpers in place. For now we have a single mlx5_vhca_data_buffer per migration file. However, in coming patches we'll have multiple of them to support multiple images. Reviewed-by: Jason Gunthorpe <[email protected]> Signed-off-by: Yishai Hadas <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2022-12-06vfio/mlx5: Refactor PD usageYishai Hadas3-31/+71
This patch refactors PD usage such as its life cycle will be as of the migration file instead of allocating/destroying it upon each SAVE/LOAD command. This is a preparation step towards the PRE_COPY series where multiple images will be SAVED/LOADED and a single PD can be simply reused. Reviewed-by: Jason Gunthorpe <[email protected]> Signed-off-by: Yishai Hadas <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2022-12-06vfio/mlx5: Enforce a single SAVE command at a timeYishai Hadas3-0/+14
Enforce a single SAVE command at a time. As the SAVE command is an asynchronous one, we must enforce running only a single command at a time. This will preserve ordering between multiple calls and protect from races on the migration file data structure. This is a must for the next patches from the series where as part of PRE_COPY we may have multiple images to be saved and multiple SAVE commands may be issued from different flows. Reviewed-by: Jason Gunthorpe <[email protected]> Signed-off-by: Yishai Hadas <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2022-12-06vfio: Extend the device migration protocol with PRE_COPYJason Gunthorpe2-6/+191
The optional PRE_COPY states open the saving data transfer FD before reaching STOP_COPY and allows the device to dirty track internal state changes with the general idea to reduce the volume of data transferred in the STOP_COPY stage. While in PRE_COPY the device remains RUNNING, but the saving FD is open. Only if the device also supports RUNNING_P2P can it support PRE_COPY_P2P, which halts P2P transfers while continuing the saving FD. PRE_COPY, with P2P support, requires the driver to implement 7 new arcs and exists as an optional FSM branch between RUNNING and STOP_COPY: RUNNING -> PRE_COPY -> PRE_COPY_P2P -> STOP_COPY A new ioctl VFIO_MIG_GET_PRECOPY_INFO is provided to allow userspace to query the progress of the precopy operation in the driver with the idea it will judge to move to STOP_COPY at least once the initial data set is transferred, and possibly after the dirty size has shrunk appropriately. This ioctl is valid only in PRE_COPY states and kernel driver should return -EINVAL from any other migration state. Compared to the v1 clarification, STOP_COPY -> PRE_COPY is blocked and to be defined in future. We also split the pending_bytes report into the initial and sustaining values, e.g.: initial_bytes and dirty_bytes. initial_bytes: Amount of initial precopy data. dirty_bytes: Device state changes relative to data previously retrieved. These fields are not required to have any bearing to STOP_COPY phase. It is recommended to leave PRE_COPY for STOP_COPY only after the initial_bytes field reaches zero. Leaving PRE_COPY earlier might make things slower. Signed-off-by: Jason Gunthorpe <[email protected]> Signed-off-by: Shay Drory <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Reviewed-by: Shameer Kolothum <[email protected]> Signed-off-by: Yishai Hadas <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2022-12-06net/mlx5: Introduce ifc bits for pre_copyShay Drory1-3/+11
Introduce ifc related stuff to enable PRE_COPY of VF during migration. Signed-off-by: Shay Drory <[email protected]> Acked-by: Leon Romanovsky <[email protected]> Signed-off-by: Yishai Hadas <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2022-12-05vfio/ap/ccw/samples: Fix device_register() unwind pathAlex Williamson5-11/+15
We always need to call put_device() if device_register() fails. All vfio drivers calling device_register() include a similar unwind stack via gotos, therefore split device_unregister() into its device_del() and put_device() components in the unwind path, and add a goto target to handle only the put_device() requirement. Reported-by: Ruan Jinjie <[email protected]> Link: https://lore.kernel.org/all/[email protected] Fixes: d61fc96f47fd ("sample: vfio mdev display - host device") Fixes: 9d1a546c53b4 ("docs: Sample driver to demonstrate how to use Mediated device framework.") Fixes: a5e6e6505f38 ("sample: vfio bochs vbe display (host device for bochs-drm)") Fixes: 9e6f07cd1eaa ("vfio/ccw: create a parent struct") Fixes: 36360658eb5a ("s390: vfio_ap: link the vfio_ap devices to the vfio_ap bus subsystem") Cc: Tony Krowiak <[email protected]> Cc: Halil Pasic <[email protected]> Cc: Jason Herne <[email protected]> Cc: Kirti Wankhede <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Reviewed-by: Eric Farman <[email protected]> Reviewed-by: Tony Krowiak <[email protected]> Reviewed-by: Jason J. Herne <[email protected]> Link: https://lore.kernel.org/r/166999942139.645727.12439756512449846442.stgit@omen Signed-off-by: Alex Williamson <[email protected]>
2022-12-05vfio: Fold vfio_virqfd.ko into vfio.koJason Gunthorpe5-18/+25
This is only 1.8k, putting it in its own module is not really necessary. The kconfig infrastructure is still there to completely remove it for systems that are trying for small footprint. Put it in the main vfio.ko module now that kbuild can support multiple .c files. Signed-off-by: Jason Gunthorpe <[email protected]> Reviewed-by: Cornelia Huck <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2022-12-05vfio: Remove CONFIG_VFIO_SPAPR_EEHJason Gunthorpe2-8/+3
We don't need a kconfig symbol for this, just directly test CONFIG_EEH in the few places that need it. Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Cornelia Huck <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2022-12-05vfio: Move vfio_spapr_iommu_eeh_ioctl into vfio_iommu_spapr_tce.cJason Gunthorpe4-103/+53
As with the previous patch EEH is always enabled if SPAPR_TCE_IOMMU, so move this last bit of code into the main module. Now that this function only processes VFIO_EEH_PE_OP remove a level of indenting as well, it is only called by a case statement that already checked VFIO_EEH_PE_OP. This eliminates an unnecessary module and SPAPR code in a global header. Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Cornelia Huck <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2022-12-05vfio/spapr: Move VFIO_CHECK_EXTENSION into tce_iommu_ioctl()Jason Gunthorpe2-12/+4
The PPC64 kconfig is a bit of a rats nest, but it turns out that if CONFIG_SPAPR_TCE_IOMMU is on then EEH must be too: config SPAPR_TCE_IOMMU bool "sPAPR TCE IOMMU Support" depends on PPC_POWERNV || PPC_PSERIES select IOMMU_API help Enables bits of IOMMU API required by VFIO. The iommu_ops is not implemented as it is not necessary for VFIO. config PPC_POWERNV select FORCE_PCI config PPC_PSERIES select FORCE_PCI config EEH bool depends on (PPC_POWERNV || PPC_PSERIES) && PCI default y So, just open code the call to eeh_enabled() into tce_iommu_ioctl(). Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Philippe Mathieu-Daudé <[email protected]> Reviewed-by: Cornelia Huck <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2022-12-05vfio/pci: Move all the SPAPR PCI specific logic to vfio_pci_core.koJason Gunthorpe3-26/+9
The vfio_spapr_pci_eeh_open/release() functions are one line wrappers around an arch function. Just call them directly. This eliminates some weird exported symbols that don't need to exist. Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> Reviewed-by: Cornelia Huck <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2022-12-02vfio/iova_bitmap: refactor iova_bitmap_set() to better handle page boundariesJoao Martins1-17/+13
Commit f38044e5ef58 ("vfio/iova_bitmap: Fix PAGE_SIZE unaligned bitmaps") had fixed the unaligned bitmaps by capping the remaining iterable set at the start of the bitmap. Although, that mistakenly worked around iova_bitmap_set() incorrectly setting bits across page boundary. Fix this by reworking the loop inside iova_bitmap_set() to iterate over a range of bits to set (cur_bit .. last_bit) which may span different pinned pages, thus updating @page_idx and @offset as it sets the bits. The previous cap to the first page is now adjusted to be always accounted rather than when there's only a non-zero pgoff. While at it, make @page_idx , @offset and @nbits to be unsigned int given that it won't be more than 512 and 4096 respectively (even a bigger PAGE_SIZE or a smaller struct page size won't make this bigger than the above 32-bit max). Also, delete the stale kdoc on Return type. Cc: Avihai Horon <[email protected]> Fixes: f38044e5ef58 ("vfio/iova_bitmap: Fix PAGE_SIZE unaligned bitmaps") Co-developed-by: Jason Gunthorpe <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> Signed-off-by: Joao Martins <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Tested-by: Avihai Horon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2022-11-14vfio/mlx5: Fix a typo in mlx5vf_cmd_load_vhca_state()Yishai Hadas1-2/+2
Fix a typo in mlx5vf_cmd_load_vhca_state() to use the 'load' memory layout. As in/out sizes are equal for save and load commands there wasn't any functional issue. Fixes: f1d98f346ee3 ("vfio/mlx5: Expose migration commands over mlx5 device") Signed-off-by: Yishai Hadas <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2022-11-14vfio: Add an option to get migration data sizeYishai Hadas6-1/+79
Add an option to get migration data size by introducing a new migration feature named VFIO_DEVICE_FEATURE_MIG_DATA_SIZE. Upon VFIO_DEVICE_FEATURE_GET the estimated data length that will be required to complete STOP_COPY is returned. This option may better enable user space to consider before moving to STOP_COPY whether it can meet the downtime SLA based on the returned data. The patch also includes the implementation for mlx5 and hisi for this new option to make it feature complete for the existing drivers in this area. Signed-off-by: Yishai Hadas <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Reviewed-by: Longfang Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2022-11-10vfio: Remove vfio_free_deviceEric Farman12-35/+4
With the "mess" sorted out, we should be able to inline the vfio_free_device call introduced by commit cb9ff3f3b84c ("vfio: Add helpers for unifying vfio_device life cycle") and remove them from driver release callbacks. Signed-off-by: Eric Farman <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Reviewed-by: Cornelia Huck <[email protected]> Reviewed-by: Tony Krowiak <[email protected]> # vfio-ap part Reviewed-by: Matthew Rosato <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2022-11-10vfio/ccw: replace vfio_init_device with _alloc_Eric Farman5-37/+23
Now that we have a reasonable separation of structs that follow the subchannel and mdev lifecycles, there's no reason we can't call the official vfio_alloc_device routine for our private data, and behave like everyone else. Signed-off-by: Eric Farman <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Reviewed-by: Matthew Rosato <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2022-11-10vfio/ccw: remove release completionEric Farman2-16/+1
There's enough separation between the parent and private structs now, that it is fine to remove the release completion hack. Signed-off-by: Eric Farman <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Reviewed-by: Matthew Rosato <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2022-11-10vfio/ccw: move private to mdev lifecycleEric Farman3-28/+16
Now that the mdev parent data is split out into its own struct, it is safe to move the remaining private data to follow the mdev probe/remove lifecycle. The mdev parent data will remain where it is, and follow the subchannel and the css driver interfaces. Signed-off-by: Eric Farman <[email protected]> Reviewed-by: Matthew Rosato <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2022-11-10vfio/ccw: move private initialization to callbackEric Farman3-66/+58
There's already a device initialization callback that is used to initialize the release completion workaround that was introduced by commit ebb72b765fb49 ("vfio/ccw: Use the new device life cycle helpers"). Move the other elements of the vfio_ccw_private struct that require distinct initialization over to that routine. With that done, the vfio_ccw_alloc_private routine only does a kzalloc, so fold it inline. Signed-off-by: Eric Farman <[email protected]> Reviewed-by: Matthew Rosato <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2022-11-10vfio/ccw: remove private->schEric Farman5-28/+26
These places all rely on the ability to jump from a private struct back to the subchannel struct. Rather than keeping a copy in our back pocket, let's use the relationship provided by the vfio_device embedded within the private. Signed-off-by: Eric Farman <[email protected]> Reviewed-by: Matthew Rosato <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2022-11-10vfio/ccw: create a parent structEric Farman3-25/+101
Move the stuff associated with the mdev parent (and thus the subchannel struct) into its own struct, and leave the rest in the existing private structure. The subchannel will point to the parent, and the parent will point to the private, for the areas where one or both are needed. Further separation of these structs will follow. Signed-off-by: Eric Farman <[email protected]> Reviewed-by: Matthew Rosato <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2022-11-09vfio/iova_bitmap: Fix PAGE_SIZE unaligned bitmapsJoao Martins1-2/+6
iova_bitmap_set() doesn't consider the end of the page boundary when the first bitmap page offset isn't zero, and wrongly changes the consecutive page right after. Consequently this leads to missing dirty pages from reported by the device as seen from the VMM. The current logic iterates over a given number of base pages and clamps it to the remaining indexes to iterate in the last page. Instead of having to consider extra pages to pin (e.g. first and extra pages), just handle the first page as its own range and let the rest of the bitmap be handled as if it was base page aligned. This is done by changing iova_bitmap_mapped_remaining() to return PAGE_SIZE - pgoff (on the first bitmap page), and leads to pgoff being set to 0 on following iterations. Fixes: 58ccf0190d19 ("vfio: Add an IOVA bitmap support") Reported-by: Avihai Horon <[email protected]> Tested-by: Avihai Horon <[email protected]> Signed-off-by: Joao Martins <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2022-11-09vfio/iova_bitmap: Explicitly include linux/slab.hJoao Martins1-0/+1
kzalloc/kzfree are used so include `slab.h`. While it happens to work without it, due to commit 8b9f3ac5b01d ("fs: introduce alloc_inode_sb() to allocate filesystems specific inode") which indirectly includes via: . ./include/linux/mm.h .. ./include/linux/huge_mm.h ... ./include/linux/fs.h .... ./include/linux/slab.h Make it explicit should any of its indirect dependencies be dropped/changed for entirely different reasons as it was the cause prior to commit above recently (i.e. <= v5.18). Signed-off-by: Joao Martins <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2022-11-09vfio: platform: Do not pass return buffer to ACPI _RST methodRafael Mendonca1-2/+1
The ACPI _RST method has no return value, there's no need to pass a return buffer to acpi_evaluate_object(). Fixes: d30daa33ec1d ("vfio: platform: call _RST method when using ACPI") Signed-off-by: Rafael Mendonca <[email protected]> Reviewed-by: Eric Auger <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2022-11-09vfio/mlx5: Switch to use module_pci_driver() macroShang XiaoJing1-12/+1
Since pci provides the helper macro module_pci_driver(), we may replace the module_init/exit with it. Signed-off-by: Shang XiaoJing <[email protected]> Reviewed-by: Yishai Hadas <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2022-11-09MAINTAINERS: git://github -> https://github.com for awilliamPalmer Dabbelt1-1/+1
Github deprecated the git:// links about a year ago, so let's move to the https:// URLs instead. Reported-by: Conor Dooley <[email protected]> Link: https://github.blog/2021-09-01-improving-git-protocol-security-github/ Signed-off-by: Palmer Dabbelt <[email protected]> Reviewed-by: Philippe Mathieu-Daudé <[email protected]> Tested-by: Philippe Mathieu-Daudé <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2022-11-06Linux 6.1-rc4Linus Torvalds1-1/+1
2022-11-06Merge tag 'cxl-fixes-for-6.1-rc4' of ↵Linus Torvalds8-91/+448
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull cxl fixes from Dan Williams: "Several fixes for CXL region creation crashes, leaks and failures. This is mainly fallout from the original implementation of dynamic CXL region creation (instantiate new physical memory pools) that arrived in v6.0-rc1. Given the theme of "failures in the presence of pass-through decoders" this also includes new regression test infrastructure for that case. Summary: - Fix region creation crash with pass-through decoders - Fix region creation crash when no decoder allocation fails - Fix region creation crash when scanning regions to enforce the increasing physical address order constraint that CXL mandates - Fix a memory leak for cxl_pmem_region objects, track 1:N instead of 1:1 memory-device-to-region associations. - Fix a memory leak for cxl_region objects when regions with active targets are deleted - Fix assignment of NUMA nodes to CXL regions by CFMWS (CXL Window) emulated proximity domains. - Fix region creation failure for switch attached devices downstream of a single-port host-bridge - Fix false positive memory leak of cxl_region objects by recycling recently used region ids rather than freeing them - Add regression test infrastructure for a pass-through decoder configuration - Fix some mailbox payload handling corner cases" * tag 'cxl-fixes-for-6.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: cxl/region: Recycle region ids cxl/region: Fix 'distance' calculation with passthrough ports tools/testing/cxl: Add a single-port host-bridge regression config tools/testing/cxl: Fix some error exits cxl/pmem: Fix cxl_pmem_region and cxl_memdev leak cxl/region: Fix cxl_region leak, cleanup targets at region delete cxl/region: Fix region HPA ordering validation cxl/pmem: Use size_add() against integer overflow cxl/region: Fix decoder allocation crash ACPI: NUMA: Add CXL CFMWS 'nodes' to the possible nodes set cxl/pmem: Fix failure to account for 8 byte header for writes to the device LSA. cxl/region: Fix null pointer dereference due to pass through decoder commit cxl/mbox: Add a check on input payload size
2022-11-06Merge tag 'hwmon-for-v6.1-rc4' of ↵Linus Torvalds2-14/+103
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon fixes from Guenter Roeck: "Fix two regressions: - Commit 54cc3dbfc10d ("hwmon: (pmbus) Add regulator supply into macro") resulted in regulator undercount when disabling regulators. Revert it. - The thermal subsystem rework caused the scmi driver to no longer register with the thermal subsystem because index values no longer match. To fix the problem, the scmi driver now directly registers with the thermal subsystem, no longer through the hwmon core" * tag 'hwmon-for-v6.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: Revert "hwmon: (pmbus) Add regulator supply into macro" hwmon: (scmi) Register explicitly with Thermal Framework
2022-11-06Merge tag 'perf_urgent_for_v6.1_rc4' of ↵Linus Torvalds4-11/+18
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Borislav Petkov: - Add Cooper Lake's stepping to the PEBS guest/host events isolation fixed microcode revisions checking quirk - Update Icelake and Sapphire Rapids events constraints - Use the standard energy unit for Sapphire Rapids in RAPL - Fix the hw_breakpoint test to fail more graciously on !SMP configs * tag 'perf_urgent_for_v6.1_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel: Add Cooper Lake stepping to isolation_ucodes[] perf/x86/intel: Fix pebs event constraints for SPR perf/x86/intel: Fix pebs event constraints for ICL perf/x86/rapl: Use standard Energy Unit for SPR Dram RAPL domain perf/hw_breakpoint: test: Skip the test if dependencies unmet
2022-11-06Merge tag 'x86_urgent_for_v6.1_rc4' of ↵Linus Torvalds3-9/+29
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Add new Intel CPU models - Enforce that TDX guests are successfully loaded only on TDX hardware where virtualization exception (#VE) delivery on kernel memory is disabled because handling those in all possible cases is "essentially impossible" - Add the proper include to the syscall wrappers so that BTF can see the real pt_regs definition and not only the forward declaration * tag 'x86_urgent_for_v6.1_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpu: Add several Intel server CPU model numbers x86/tdx: Panic on bad configs that #VE on "private" memory access x86/tdx: Prepare for using "INFO" call for a second purpose x86/syscall: Include asm/ptrace.h in syscall_wrapper header
2022-11-06Merge tag 'kbuild-fixes-v6.1-2' of ↵Linus Torvalds4-21/+16
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - Use POSIX-compatible grep options - Document git-related tips for reproducible builds - Fix a typo in the modpost rule - Suppress SIGPIPE error message from gcc-ar and llvm-ar - Fix segmentation fault in the menuconfig search * tag 'kbuild-fixes-v6.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kconfig: fix segmentation fault in menuconfig search kbuild: fix SIGPIPE error message for AR=gcc-ar and AR=llvm-ar kbuild: fix typo in modpost Documentation: kbuild: Add description of git for reproducible builds kbuild: use POSIX-compatible grep option
2022-11-06Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds11-60/+52
Pull kvm fixes from Paolo Bonzini: "ARM: - Fix the pKVM stage-1 walker erronously using the stage-2 accessor - Correctly convert vcpu->kvm to a hyp pointer when generating an exception in a nVHE+MTE configuration - Check that KVM_CAP_DIRTY_LOG_* are valid before enabling them - Fix SMPRI_EL1/TPIDR2_EL0 trapping on VHE - Document the boot requirements for FGT when entering the kernel at EL1 x86: - Use SRCU to protect zap in __kvm_set_or_clear_apicv_inhibit() - Make argument order consistent for kvcalloc() - Userspace API fixes for DEBUGCTL and LBRs" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: Fix a typo about the usage of kvcalloc() KVM: x86: Use SRCU to protect zap in __kvm_set_or_clear_apicv_inhibit() KVM: VMX: Ignore guest CPUID for host userspace writes to DEBUGCTL KVM: VMX: Fold vmx_supported_debugctl() into vcpu_supported_debugctl() KVM: VMX: Advertise PMU LBRs if and only if perf supports LBRs arm64: booting: Document our requirements for fine grained traps with SME KVM: arm64: Fix SMPRI_EL1/TPIDR2_EL0 trapping on VHE KVM: Check KVM_CAP_DIRTY_LOG_{RING, RING_ACQ_REL} prior to enabling them KVM: arm64: Fix bad dereference on MTE-enabled systems KVM: arm64: Use correct accessor to parse stage-1 PTEs
2022-11-06Merge tag 'for-linus-6.1-rc4-tag' of ↵Linus Torvalds2-18/+7
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: "One fix for silencing a smatch warning, and a small cleanup patch" * tag 'for-linus-6.1-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: x86/xen: simplify sysenter and syscall setup x86/xen: silence smatch warning in pmu_msr_chk_emulated()
2022-11-06Merge tag 'ext4_for_linus_stable' of ↵Linus Torvalds6-7/+21
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "Fix a number of bugs, including some regressions, the most serious of which was one which would cause online resizes to fail with file systems with metadata checksums enabled. Also fix a warning caused by the newly added fortify string checker, plus some bugs that were found using fuzzed file systems" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: fix fortify warning in fs/ext4/fast_commit.c:1551 ext4: fix wrong return err in ext4_load_and_init_journal() ext4: fix warning in 'ext4_da_release_space' ext4: fix BUG_ON() when directory entry has invalid rec_len ext4: update the backup superblock's at the end of the online resize