aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2022-01-14vdpa/vdpa_sim_net: Report max device capabilitiesEli Cohen1-0/+1
Configure max supported virtqueues features on the management device. This info can be retrieved using: $ vdpa mgmtdev show vdpasim_net: supported_classes net max_supported_vqs 2 dev_features MAC ANY_LAYOUT VERSION_1 ACCESS_PLATFORM Signed-off-by: Eli Cohen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]> Acked-by: Jason Wang <[email protected]>
2022-01-14vdpa: Use BIT_ULL for bit operationsEli Cohen1-6/+6
All masks in this file are 64 bits. Change BIT to BIT_ULL. Other occurences use (1 << val) which yields a 32 bit value. Change them to use BIT_ULL too. Reviewed-by: Si-Wei Liu <[email protected]> Signed-off-by: Eli Cohen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]> Acked-by: Jason Wang <[email protected]>
2022-01-14vdpa/vdpa_sim: Configure max supported virtqueuesEli Cohen1-0/+1
Configure max supported virtqueues on the management device. Signed-off-by: Eli Cohen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]> Acked-by: Jason Wang <[email protected]>
2022-01-14vdpa/mlx5: Report max device capabilitiesEli Cohen1-12/+23
Configure max supported virtqueues and features on the management device. This info can be retrieved using: $ vdpa mgmtdev show auxiliary/mlx5_core.sf.1: supported_classes net max_supported_vqs 257 dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS CTRL_VQ MQ \ CTRL_MAC_ADDR VERSION_1 ACCESS_PLATFORM Signed-off-by: Eli Cohen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]> Reviewed-by: Si-Wei Liu<[email protected]>
2022-01-14vdpa: Support reporting max device capabilitiesEli Cohen3-0/+14
Add max_supported_vqs and supported_features fields to struct vdpa_mgmt_dev. Upstream drivers need to feel these values according to the device capabilities. These values are reported back in a netlink message when showing management devices. Examples: $ auxiliary/mlx5_core.sf.1: supported_classes net max_supported_vqs 257 dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS CTRL_VQ MQ \ CTRL_MAC_ADDR VERSION_1 ACCESS_PLATFORM $ vdpa -j mgmtdev show {"mgmtdev":{"auxiliary/mlx5_core.sf.1":{"supported_classes":["net"], \ "max_supported_vqs":257,"dev_features":["CSUM","GUEST_CSUM","MTU", \ "HOST_TSO4","HOST_TSO6","STATUS","CTRL_VQ","MQ","CTRL_MAC_ADDR", \ "VERSION_1","ACCESS_PLATFORM"]}}} $ vdpa -jp mgmtdev show { "mgmtdev": { "auxiliary/mlx5_core.sf.1": { "supported_classes": [ "net" ], "max_supported_vqs": 257, "dev_features": ["CSUM","GUEST_CSUM","MTU","HOST_TSO4", \ "HOST_TSO6","STATUS","CTRL_VQ","MQ", \ "CTRL_MAC_ADDR","VERSION_1","ACCESS_PLATFORM"] } } } Signed-off-by: Eli Cohen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]> Reviewed-by: Si-Wei Liu<[email protected]>
2022-01-14vdpa/mlx5: Restore cur_num_vqs in case of failure in change_num_qps()Eli Cohen1-1/+3
Restore ndev->cur_num_vqs to the original value in case change_num_qps() fails. Fixes: 52893733f2c5 ("vdpa/mlx5: Add multiqueue support") Reviewed-by: Si-Wei Liu<[email protected]> Acked-by: Jason Wang <[email protected]> Signed-off-by: Eli Cohen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-01-14vdpa: Add support for returning device configuration informationEli Cohen2-0/+7
Add netlink attribute to store the negotiated features. This can be used by userspace to get the current state of the vdpa instance. Examples: $ vdpa dev config show vdpa-a vdpa-a: mac 00:00:00:00:88:88 link up link_announce false max_vq_pairs 16 mtu 1500 negotiated_features CSUM GUEST_CSUM MTU MAC HOST_TSO4 HOST_TSO6 STATUS \ CTRL_VQ MQ CTRL_MAC_ADDR VERSION_1 ACCESS_PLATFORM $ vdpa -j dev config show vdpa-a {"config":{"vdpa-a":{"mac":"00:00:00:00:88:88","link ":"up","link_announce":false, \ "max_vq_pairs":16,"mtu":1500,"negotiated_features":["CSUM","GUEST_CSUM","MTU","MAC", \ "HOST_TSO4","HOST_TSO6","STATUS","CTRL_VQ","MQ","CTRL_MAC_ADDR","VERSION_1", \ "ACCESS_PLATFORM"]}}} $ vdpa -jp dev config show vdpa-a { "config": { "vdpa-a": { "mac": "00:00:00:00:88:88", "link ": "up", "link_announce ": false, "max_vq_pairs": 16, "mtu": 1500, "negotiated_features": [ "CSUM","GUEST_CSUM","MTU","MAC","HOST_TSO4","HOST_TSO6","STATUS","CTRL_VQ","MQ", \ "CTRL_MAC_ADDR","VERSION_1","ACCESS_PLATFORM" ] } } } Signed-off-by: Eli Cohen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]> Acked-by: Jason Wang <[email protected]>
2022-01-14vdpa/mlx5: Support configuring max data virtqueueEli Cohen1-14/+40
Check whether the max number of data virtqueue pairs was provided when a adding a new device and verify the new value does not exceed device capabilities. In addition, change the arrays holding virtqueue and callback contexts to be dynamically allocated. Signed-off-by: Eli Cohen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Includes fixup: vdpa/mlx5: fix error handling in mlx5_vdpa_dev_add() Clang build fails with mlx5_vnet.c:2574:6: error: variable 'mvdev' is used uninitialized whenever 'if' condition is true if (!ndev->vqs || !ndev->event_cbs) { ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ mlx5_vnet.c:2660:14: note: uninitialized use occurs here put_device(&mvdev->vdev.dev); ^~~~~ This because mvdev is set after trying to allocate ndev->vqs,event_cbs. So move the allocation to after mvdev is set but before the arrays are used in init_mvqs() Signed-off-by: Tom Rix <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]> Includes fixup: vdpa/mlx5: fix endian-ness for max vqs sparse warnings: (new ones prefixed by >>) >> drivers/vdpa/mlx5/net/mlx5_vnet.c:1247:23: sparse: sparse: cast to restricted __le16 >> drivers/vdpa/mlx5/net/mlx5_vnet.c:1247:23: sparse: sparse: cast from restricted __virtio16 > 1247 num = le16_to_cpu(ndev->config.max_virtqueue_pairs); Address this using the appropriate wrapper. Cc: "Eli Cohen" <[email protected]> Reported-by: kernel test robot <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]> Acked-by: Jason Wang <[email protected]> Reviewed-by: Eli Cohen <[email protected]>
2022-01-14vdpa/mlx5: Fix config_attr_mask assignmentEli Cohen1-1/+1
Fix VDPA_ATTR_DEV_NET_CFG_MACADDR assignment to be explicit 64 bit assignment. No issue was seen since the value is well below 64 bit max value. Nevertheless it needs to be fixed. Fixes: a007d940040c ("vdpa/mlx5: Support configuration of MAC") Reviewed-by: Si-Wei Liu <[email protected]> Acked-by: Jason Wang <[email protected]> Signed-off-by: Eli Cohen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-01-14vdpa: Allow to configure max data virtqueuesEli Cohen4-7/+31
Add netlink support to configure the max virtqueue pairs for a device. At least one pair is required. The maximum is dictated by the device. Example: $ vdpa dev add name vdpa-a mgmtdev auxiliary/mlx5_core.sf.1 max_vqp 4 Signed-off-by: Eli Cohen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-01-14vdpa: Read device configuration only if FEATURES_OKEli Cohen1-12/+33
Avoid reading device configuration during feature negotiation. Read device status and verify that VIRTIO_CONFIG_S_FEATURES_OK is set. Protect the entire operation, including configuration read with cf_mutex to ensure integrity of the results. Signed-off-by: Eli Cohen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]> Acked-by: Jason Wang <[email protected]>
2022-01-14vdpa: Sync calls set/get config/status with cf_mutexEli Cohen4-6/+26
Add wrappers to get/set status and protect these operations with cf_mutex to serialize these operations with respect to get/set config operations. Signed-off-by: Eli Cohen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-01-14vdpa/mlx5: Distribute RX virtqueues in RQT objectEli Cohen1-23/+7
Distribute the available rx virtqueues amongst the available RQT entries. RQTs require to have a power of two entries. When creating or modifying the RQT, use the lowest number of power of two entries that is not less than the number of rx virtqueues. Distribute them in the available entries such that some virtqueus may be referenced twice. This allows to configure any number of virtqueue pairs when multiqueue is used. Reviewed-by: Si-Wei Liu <[email protected]> Acked-by: Jason Wang <[email protected]> Signed-off-by: Eli Cohen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-01-14vdpa: Provide interface to read driver featuresEli Cohen10-34/+87
Provide an interface to read the negotiated features. This is needed when building the netlink message in vdpa_dev_net_config_fill(). Also fix the implementation of vdpa_dev_net_config_fill() to use the negotiated features instead of the device features. To make APIs clearer, make the following name changes to struct vdpa_config_ops so they better describe their operations: get_features -> get_device_features set_features -> set_driver_features Finally, add get_driver_features to return the negotiated features and add implementation to all the upstream drivers. Acked-by: Jason Wang <[email protected]> Signed-off-by: Eli Cohen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-01-14vdpa: clean up get_config_size ret value handlingLaura Abbott1-1/+1
The return type of get_config_size is size_t so it makes sense to change the type of the variable holding its result. That said, this already got taken care of (differently, and arguably not as well) by commit 3ed21c1451a1 ("vdpa: check that offsets are within bounds"). The added 'c->off > size' test in that commit will be done as an unsigned comparison on 32-bit (safe due to not being signed). On a 64-bit platform, it will be done as a signed comparison, but in that case the comparison will be done in 64-bit, and 'c->off' being an u32 it will be valid thanks to the extended range (ie both values will be positive in 64 bits). So this was a real bug, but it was already addressed and marked for stable. Signed-off-by: Laura Abbott <[email protected]> Reported-by: Luo Likang <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-01-14virtio_ring: mark ring unused on errorMichael S. Tsirkin1-1/+3
A recently added error path does not mark ring unused when exiting on OOM, which will lead to BUG on the next entry in debug builds. TODO: refactor code so we have START_USE and END_USE in the same function. Fixes: fc6d70f40b3d ("virtio_ring: check desc == NULL when using indirect with packed") Cc: "Xuan Zhuo" <[email protected]> Cc: Jiasheng Jiang <[email protected]> Reviewed-by: Xuan Zhuo <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-01-14vhost/test: fix memory leak of vhost virtqueuesXianting Tian1-0/+1
We need free the vqs in .release(), which are allocated in .open(). Signed-off-by: Xianting Tian <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]> Acked-by: Jason Wang <[email protected]>
2022-01-14vdpa/mlx5: Fix wrong configuration of virtio_version_1_0Eli Cohen1-2/+0
Remove overriding of virtio_version_1_0 which forced the virtqueue object to version 1. Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices") Signed-off-by: Eli Cohen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]> Reviewed-by: Parav Pandit <[email protected]> Acked-by: Jason Wang <[email protected]> Reviewed-by: Si-Wei Liu <[email protected]>
2022-01-14virtio/virtio_pci_legacy_dev: ensure the correct return valuePeng Hao1-1/+3
When pci_iomap return NULL, the return value is zero. Signed-off-by: Peng Hao <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]> Acked-by: Jason Wang <[email protected]>
2022-01-14virtio/virtio_mem: handle a possible NULL as a memcpy parameterPeng Hao1-1/+1
There is a check for vm->sbm.sb_states before, and it should check it here as well. Signed-off-by: Peng Hao <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]> Fixes: 5f1f79bbc9e2 ("virtio-mem: Paravirtualized memory hotplug") Cc: [email protected] # v5.8+
2022-01-14virtio: fix a typo in function "vp_modern_remove" comments.Dapeng Mi1-1/+1
Function name "vp_modern_remove" in comments is written to "vp_modern_probe" incorrectly. Change it. Signed-off-by: Dapeng Mi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]> Reviewed-by: Stefano Garzarella <[email protected]>
2022-01-14virtio-pci: fix the confusing error message王贇1-1/+1
The error message on the failure of pfn check should tell virtio-pci rather than virtio-mmio, just fix it. Signed-off-by: Michael Wang <[email protected]> Suggested-by: Michael S. Tsirkin <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-01-14firmware: qemu_fw_cfg: remove sysfs entries explicitlyJohan Hovold1-0/+1
Explicitly remove the file entries from sysfs before dropping the final reference for symmetry reasons and for consistency with the rest of the driver. Signed-off-by: Johan Hovold <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-01-14firmware: qemu_fw_cfg: fix sysfs information leakJohan Hovold1-1/+1
Make sure to always NUL-terminate file names retrieved from the firmware to avoid accessing data beyond the entry slab buffer and exposing it through sysfs in case the firmware data is corrupt. Fixes: 75f3e8e47f38 ("firmware: introduce sysfs driver for QEMU's fw_cfg device") Cc: [email protected] # 4.6 Cc: Gabriel Somlo <[email protected]> Signed-off-by: Johan Hovold <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-01-14firmware: qemu_fw_cfg: fix kobject leak in probe error pathJohan Hovold1-7/+6
An initialised kobject must be freed using kobject_put() to avoid leaking associated resources (e.g. the object name). Commit fe3c60684377 ("firmware: Fix a reference count leak.") "fixed" the leak in the first error path of the file registration helper but left the second one unchanged. This "fix" would however result in a NULL pointer dereference due to the release function also removing the never added entry from the fw_cfg_entry_cache list. This has now been addressed. Fix the remaining kobject leak by restoring the common error path and adding the missing kobject_put(). Fixes: 75f3e8e47f38 ("firmware: introduce sysfs driver for QEMU's fw_cfg device") Cc: [email protected] # 4.6 Cc: Gabriel Somlo <[email protected]> Signed-off-by: Johan Hovold <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-01-14firmware: qemu_fw_cfg: fix NULL-pointer deref on duplicate entriesJohan Hovold1-4/+1
Commit fe3c60684377 ("firmware: Fix a reference count leak.") "fixed" a kobject leak in the file registration helper by properly calling kobject_put() for the entry in case registration of the object fails (e.g. due to a name collision). This would however result in a NULL pointer dereference when the release function tries to remove the never added entry from the fw_cfg_entry_cache list. Fix this by moving the list-removal out of the release function. Note that the offending commit was one of the benign looking umn.edu fixes which was reviewed but not reverted. [1][2] [1] https://lore.kernel.org/r/202105051005.49BFABCE@keescook [2] https://lore.kernel.org/all/[email protected] Fixes: fe3c60684377 ("firmware: Fix a reference count leak.") Cc: [email protected] # 5.8 Cc: Qiushi Wu <[email protected]> Cc: Kees Cook <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Signed-off-by: Johan Hovold <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-01-14vdpa: Mark vdpa_config_ops.get_vq_notification as optionalEugenio Pérez1-1/+1
Since vhost_vdpa_mmap checks for its existence before calling it. Signed-off-by: Eugenio Pérez <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]> Acked-by: Jason Wang <[email protected]> Reviewed-by: Stefano Garzarella <[email protected]>
2022-01-14vdpa: Avoid duplicate call to vp_vdpa get_statusEugenio Pérez1-1/+1
It has no sense to call get_status twice, since we already have a variable for that. Signed-off-by: Eugenio Pérez <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]> Acked-by: Jason Wang <[email protected]> Reviewed-by: Stefano Garzarella <[email protected]>
2022-01-14eni_vdpa: Simplify 'eni_vdpa_probe()'Christophe JAILLET1-12/+0
When 'pcim_enable_device()' is used, some resources become automagically managed. There is no need to call 'pci_free_irq_vectors()' when the driver is removed. The same will already be done by 'pcim_release()'. Signed-off-by: Christophe JAILLET <[email protected]> Link: https://lore.kernel.org/r/02045bdcbbb25f79bae4827f66029cfcddc90381.1636301587.git.christophe.jaillet@wanadoo.fr Signed-off-by: Michael S. Tsirkin <[email protected]> Acked-by: Jason Wang <[email protected]>
2022-01-14net/mlx5_vdpa: Offer VIRTIO_NET_F_MTU when setting MTUEli Cohen1-0/+1
Make sure to offer VIRTIO_NET_F_MTU since we configure the MTU based on what was queried from the device. This allows the virtio driver to allocate large enough buffers based on the reported MTU. Signed-off-by: Eli Cohen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]> Acked-by: Jason Wang <[email protected]> Reviewed-by: Si-Wei Liu <[email protected]>
2022-01-14virtio-mem: prepare fake page onlining code for granularity smaller than ↵David Hildenbrand1-13/+13
MAX_ORDER - 1 Let's prepare our fake page onlining code for subblock size smaller than MAX_ORDER - 1: we might get called for ranges not covering properly aligned MAX_ORDER - 1 pages. We have to detect the order to use dynamically. Signed-off-by: David Hildenbrand <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]> Reviewed-by: Zi Yan <[email protected]> Reviewed-by: Eric Ren <[email protected]>
2022-01-14virtio-mem: prepare page onlining code for granularity smaller than ↵David Hildenbrand1-24/+62
MAX_ORDER - 1 Let's prepare our page onlining code for subblock size smaller than MAX_ORDER - 1: we'll get called for a MAX_ORDER - 1 page but might have some subblocks in the range plugged and some unplugged. In that case, fallback to subblock granularity to properly only expose the plugged parts to the buddy. Signed-off-by: David Hildenbrand <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]> Reviewed-by: Zi Yan <[email protected]> Reviewed-by: Eric Ren <[email protected]>
2022-01-14vdpa: add driver_override supportStefano Garzarella3-0/+96
`driver_override` allows to control which of the vDPA bus drivers binds to a vDPA device. If `driver_override` is not set, the previous behaviour is followed: devices use the first vDPA bus driver loaded (unless auto binding is disabled). Tested on Fedora 34 with driverctl(8): $ modprobe virtio-vdpa $ modprobe vhost-vdpa $ modprobe vdpa-sim-net $ vdpa dev add mgmtdev vdpasim_net name dev1 # dev1 is attached to the first vDPA bus driver loaded $ driverctl -b vdpa list-devices dev1 virtio_vdpa $ driverctl -b vdpa set-override dev1 vhost_vdpa $ driverctl -b vdpa list-devices dev1 vhost_vdpa [*] Note: driverctl(8) integrates with udev so the binding is preserved. Suggested-by: Jason Wang <[email protected]> Acked-by: Jason Wang <[email protected]> Signed-off-by: Stefano Garzarella <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-01-14docs: document sysfs ABI for vDPA busStefano Garzarella2-0/+38
Add missing documentation of sysfs ABI for vDPA bus in the new Documentation/ABI/testing/sysfs-bus-vdpa file. Signed-off-by: Stefano Garzarella <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]> Acked-by: Jason Wang <[email protected]>
2022-01-14ifcvf/vDPA: fix misuse virtio-net device config size for blk devZhu Lingshan3-33/+41
This commit fixes a misuse of virtio-net device config size issue for virtio-block devices. A new member config_size in struct ifcvf_hw is introduced and would be initialized through vdpa_dev_add() to record correct device config size. To be more generic, rename ifcvf_hw.net_config to ifcvf_hw.dev_config, the helpers ifcvf_read/write_net_config() to ifcvf_read/write_dev_config() Signed-off-by: Zhu Lingshan <[email protected]> Reported-and-suggested-by: Stefano Garzarella <[email protected]> Reviewed-by: Stefano Garzarella <[email protected]> Fixes: 6ad31d162a4e ("vDPA/ifcvf: enable Intel C5000X-PL virtio-block for vDPA") Cc: <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-01-14vduse: moving kvfree into callerGuanjun1-1/+2
This free action should be moved into caller 'vduse_ioctl' in concert with the allocation. No functional change. Signed-off-by: Guanjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-01-14hwrng: virtio - unregister device before resetMichael S. Tsirkin1-1/+1
unregister after reset is clearly wrong - device can be used while it's reset. There's an attempt to protect against that using hwrng_removed but it seems racy since access can be in progress when the flag is set. Just unregister, then reset seems simpler and cleaner. NB: we might be able to drop hwrng_removed in a follow-up patch. Signed-off-by: Laurent Vivier <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-01-14virtio: wrap config->reset callsMichael S. Tsirkin26-33/+40
This will enable cleanups down the road. The idea is to disable cbs, then add "flush_queued_cbs" callback as a parameter, this way drivers can flush any work queued after callbacks have been disabled. Signed-off-by: Michael S. Tsirkin <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-01-14MAINTAINERS: Add Helge as fbdev maintainerHelge Deller1-3/+4
The fbdev layer is orphaned, but seems to need some care. So I'd like to step up as new maintainer. Signed-off-by: Helge Deller <[email protected]> Acked-by: Geert Uytterhoeven <[email protected]>
2022-01-14x86/fpu: Fix inline prefix warningsYang Zhong2-2/+2
Fix sparse warnings in xstate and remove inline prefix. Fixes: 980fe2fddcff ("x86/fpu: Extend fpu_xstate_prctl() with guest permissions") Signed-off-by: Yang Zhong <[email protected]> Reported-by: kernel test robot <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-01-14selftest: kvm: Add amx selftestYang Zhong2-0/+449
This selftest covers two aspects of AMX. The first is triggering #NM exception and checking the MSR XFD_ERR value. The second case is loading tile config and tile data into guest registers and trapping to the host side for a complete save/load of the guest state. TMM0 is also checked against memory data after save/restore. Signed-off-by: Yang Zhong <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-01-14selftest: kvm: Move struct kvm_x86_state to headerYang Zhong2-16/+15
Those changes can avoid dereferencing pointer compile issue when amx_test.c reference state->xsave. Move struct kvm_x86_state definition to processor.h. Signed-off-by: Yang Zhong <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-01-14selftest: kvm: Reorder vcpu_load_state steps for AMXPaolo Bonzini1-8/+9
For AMX support it is recommended to load XCR0 after XFD, so that KVM does not see XFD=0, XCR=1 for a save state that will eventually be disabled (which would lead to premature allocation of the space required for that save state). It is also required to load XSAVE data after XCR0 and XFD, so that KVM can trigger allocation of the extra space required to store AMX state. Adjust vcpu_load_state to obey these new requirements. Signed-off-by: Paolo Bonzini <[email protected]> Signed-off-by: Yang Zhong <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-01-14kvm: x86: Disable interception for IA32_XFD on demandKevin Tian4-6/+29
Always intercepting IA32_XFD causes non-negligible overhead when this register is updated frequently in the guest. Disable r/w emulation after intercepting the first WRMSR(IA32_XFD) with a non-zero value. Disable WRMSR emulation implies that IA32_XFD becomes out-of-sync with the software states in fpstate and the per-cpu xfd cache. This leads to two additional changes accordingly: - Call fpu_sync_guest_vmexit_xfd_state() after vm-exit to bring software states back in-sync with the MSR, before handle_exit_irqoff() is called. - Always trap #NM once write interception is disabled for IA32_XFD. The #NM exception is rare if the guest doesn't use dynamic features. Otherwise, there is at most one exception per guest task given a dynamic feature. p.s. We have confirmed that SDM is being revised to say that when setting IA32_XFD[18] the AMX register state is not guaranteed to be preserved. This clarification avoids adding mess for a creative guest which sets IA32_XFD[18]=1 before saving active AMX state to its own storage. Signed-off-by: Kevin Tian <[email protected]> Signed-off-by: Jing Liu <[email protected]> Signed-off-by: Yang Zhong <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-01-14x86/fpu: Provide fpu_sync_guest_vmexit_xfd_state()Thomas Gleixner2-0/+26
KVM can disable the write emulation for the XFD MSR when the vCPU's fpstate is already correctly sized to reduce the overhead. When write emulation is disabled the XFD MSR state after a VMEXIT is unknown and therefore not in sync with the software states in fpstate and the per CPU XFD cache. Provide fpu_sync_guest_vmexit_xfd_state() which has to be invoked after a VMEXIT before enabling interrupts when write emulation is disabled for the XFD MSR. It could be invoked unconditionally even when write emulation is enabled for the price of a pointless MSR read. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Jing Liu <[email protected]> Signed-off-by: Yang Zhong <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-01-14kvm: selftests: Add support for KVM_CAP_XSAVE2Wei Wang10-8/+130
When KVM_CAP_XSAVE2 is supported, userspace is expected to allocate buffer for KVM_GET_XSAVE2 and KVM_SET_XSAVE using the size returned by KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2). Signed-off-by: Wei Wang <[email protected]> Signed-off-by: Guang Zeng <[email protected]> Signed-off-by: Jing Liu <[email protected]> Signed-off-by: Yang Zhong <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-01-14kvm: x86: Add support for getting/setting expanded xstate bufferGuang Zeng6-5/+106
With KVM_CAP_XSAVE, userspace uses a hardcoded 4KB buffer to get/set xstate data from/to KVM. This doesn't work when dynamic xfeatures (e.g. AMX) are exposed to the guest as they require a larger buffer size. Introduce a new capability (KVM_CAP_XSAVE2). Userspace VMM gets the required xstate buffer size via KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2). KVM_SET_XSAVE is extended to work with both legacy and new capabilities by doing properly-sized memdup_user() based on the guest fpu container. KVM_GET_XSAVE is kept for backward-compatible reason. Instead, KVM_GET_XSAVE2 is introduced under KVM_CAP_XSAVE2 as the preferred interface for getting xstate buffer (4KB or larger size) from KVM (Link: https://lkml.org/lkml/2021/12/15/510) Also, update the api doc with the new KVM_GET_XSAVE2 ioctl. Signed-off-by: Guang Zeng <[email protected]> Signed-off-by: Wei Wang <[email protected]> Signed-off-by: Jing Liu <[email protected]> Signed-off-by: Kevin Tian <[email protected]> Signed-off-by: Yang Zhong <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-01-14x86/fpu: Add uabi_size to guest_fpuThomas Gleixner3-0/+7
Userspace needs to inquire KVM about the buffer size to work with the new KVM_SET_XSAVE and KVM_GET_XSAVE2. Add the size info to guest_fpu for KVM to access. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Wei Wang <[email protected]> Signed-off-by: Jing Liu <[email protected]> Signed-off-by: Yang Zhong <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-01-14kvm: x86: Add CPUID support for Intel AMXJing Liu2-2/+27
Extend CPUID emulation to support XFD, AMX_TILE, AMX_INT8 and AMX_BF16. Adding those bits into kvm_cpu_caps finally activates all previous logics in this series. Hide XFD on 32bit host kernels. Otherwise it leads to a weird situation where KVM tells userspace to migrate MSR_IA32_XFD and then rejects attempts to read/write the MSR. Signed-off-by: Jing Liu <[email protected]> Signed-off-by: Yang Zhong <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-01-14kvm: x86: Add XCR0 support for Intel AMXJing Liu1-1/+6
Two XCR0 bits are defined for AMX to support XSAVE mechanism. Bit 17 is for tilecfg and bit 18 is for tiledata. The value of XCR0[17:18] is always either 00b or 11b. Also, SDM recommends that only 64-bit operating systems enable Intel AMX by setting XCR0[18:17]. 32-bit host kernel never sets the tile bits in vcpu->arch.guest_supported_xcr0. Signed-off-by: Jing Liu <[email protected]> Signed-off-by: Kevin Tian <[email protected]> Signed-off-by: Yang Zhong <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>