Age Commit message Author Files Lines
2022-09-21io_uring/net: io_async_msghdr caches for sendzcPavel Begunkov1-5/+4
We already keep io_async_msghdr caches for normal send/recv requests, use them also for zerocopy send. Signed-off-by: Pavel Begunkov <[email protected]> Link: https://lore.kernel.org/r/42fa615b6e0be25f47a685c35d7b5e4f1b03d348.1662639236.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <[email protected]>
2022-09-21io_uring/net: use async caches for async prepPavel Begunkov2-3/+15
send/recv have async_data caches but they are only used from within issue handlers. Extend their use also to ->prep_async, should be handy with links and IOSQE_ASYNC. Signed-off-by: Pavel Begunkov <[email protected]> Link: https://lore.kernel.org/r/b9a2264b807582a97ed606c5bfcdc2399384e8a5.1662639236.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <[email protected]>
2022-09-21io_uring/net: reshuffle error handlingPavel Begunkov1-8/+8
We should prioritise send/recv retry cases over failures, they're more important. Shuffle -ERESTARTSYS after we handled retries. Signed-off-by: Pavel Begunkov <[email protected]> Link: https://lore.kernel.org/r/d9059691b30d0963b7269fa4a0c81ee7720555e6.1662639236.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <[email protected]>
2022-09-21io_uring: use io_cq_lock consistentlyPavel Begunkov1-1/+1
There is one place where we forgot to replace hand-coded spin locking with io_cq_lock(); change it to be more consistent. Note, the unlock part is already __io_cq_unlock_post(). Signed-off-by: Pavel Begunkov <[email protected]> Link: https://lore.kernel.org/r/91699b9a00a07128f7ca66136bdbbfc67a64659e.1662639236.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <[email protected]>
2022-09-21io_uring: kill an outdated commentPavel Begunkov1-4/+0
Request referencing has changed a while ago and there is no notion left of submission/completion references, kill an outdated comment. Signed-off-by: Pavel Begunkov <[email protected]> Link: https://lore.kernel.org/r/38902e7229d68cecd62702436d627d4858b0d9d4.1662639236.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <[email protected]>
2022-09-21io_uring: allow buffer recycling in READVDylan Yudaken2-94/+52
In commit 934447a603b2 ("io_uring: do not recycle buffer in READV") a temporary fix was put in io_kbuf_recycle to simply never recycle READV buffers. Instead of that, rather treat READV with REQ_F_BUFFER_SELECTED the same as a READ with REQ_F_BUFFER_SELECTED. Since READV requires iov_len of 1 they are essentially the same. In order to do this inside io_prep_rw() add some validation to check that it is in fact only length 1, and also extract the length of the buffer at prep time. This allows removal of the io_iov_buffer_select codepaths as they are only used from the READV op. Signed-off-by: Dylan Yudaken <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-09-21fs: add batch and poll flags to the uring_cmd_iopoll() handlerJens Axboe4-8/+16
We need the poll_flags to know how to poll for the IO, and we should have the batch structure in preparation for supporting batched completions with iopoll. Signed-off-by: Jens Axboe <[email protected]>
2022-09-21io_uring: ensure iopoll runs local task work as wellJens Axboe2-19/+26
Combine the two checks we have for task_work running and whether or not we need to shuffle the mutex into one, so we unify how task_work is run in the iopoll loop. This helps ensure that local task_work is run when needed, and also optimizes that path to avoid a mutex shuffle if it's not needed. Signed-off-by: Jens Axboe <[email protected]>
2022-09-21io_uring: add local task_work run helper that is entered lockedJens Axboe2-7/+17
We have a few spots that drop the mutex just to run local task_work, which immediately tries to grab it again. Add a helper that just passes in whether we're locked already. Signed-off-by: Jens Axboe <[email protected]>
2022-09-21io_uring: cleanly separate request types for iopollJens Axboe1-6/+9
After the addition of iopoll support for passthrough, there's a bit of a mixup here. Clean it up and get rid of the casting for the passthrough command type. Signed-off-by: Jens Axboe <[email protected]>
2022-09-21nvme: wire up async polling for io passthrough commandsKanchan Joshi4-5/+72
Store a cookie during submission, and use that to implement completion-polling inside the ->uring_cmd_iopoll handler. This handler makes use of existing bio poll facility. Signed-off-by: Kanchan Joshi <[email protected]> Signed-off-by: Anuj Gupta <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-09-21block: export blk_rq_is_pollKanchan Joshi2-1/+3
This is in preparation to support iopoll for nvme passthrough. Signed-off-by: Kanchan Joshi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-09-21io_uring: add iopoll infrastructure for io_uring_cmdKanchan Joshi5-5/+29
Put this up in the same way as iopoll is done for regular read/write IO. Make place for storing a cookie into struct io_uring_cmd on submission. Perform the completion using the ->uring_cmd_iopoll handler. Signed-off-by: Kanchan Joshi <[email protected]> Signed-off-by: Pankaj Raghav <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-09-21fs: add file_operations->uring_cmd_iopollKanchan Joshi1-0/+1
io_uring will invoke this to do completion polling on uring-cmd operations. Signed-off-by: Kanchan Joshi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-09-21io_uring: trace local task work runDylan Yudaken2-0/+32
Add tracing for io_run_local_task_work Signed-off-by: Dylan Yudaken <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-09-21io_uring: signal registered eventfd to process deferred task workDylan Yudaken2-23/+63
Some workloads rely on a registered eventfd (via io_uring_register_eventfd(3)) in order to wake up and process the io_uring. In the case of a ring setup with IORING_SETUP_DEFER_TASKRUN, that eventfd also needs to be signalled when there are tasks to run. This changes an old behaviour which assumed 1 eventfd signal implied at least 1 CQE, however only when this new flag is set (and so old users will not notice). This should be expected with the IORING_SETUP_DEFER_TASKRUN flag as it is not guaranteed that every task will result in a CQE. Signed-off-by: Dylan Yudaken <[email protected]> Link: https://lore.kernel.org/r/[email protected] [axboe: fold in call_rcu() serialization fix] Signed-off-by: Jens Axboe <[email protected]>
2022-09-21io_uring: move io_eventfd_putDylan Yudaken1-8/+8
Non functional change: move this function above io_eventfd_signal so it can be used from there Signed-off-by: Dylan Yudaken <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-09-21io_uring: add IORING_SETUP_DEFER_TASKRUNDylan Yudaken6-21/+168
Allow deferring async tasks until the user calls io_uring_enter(2) with the IORING_ENTER_GETEVENTS flag. Enable this mode with a flag at io_uring_setup time. This functionality requires that the later io_uring_enter will be called from the same submission task, and therefore restrict this flag to work only when IORING_SETUP_SINGLE_ISSUER is also set. Being able to hand pick when tasks are run prevents the problem where there is current work to be done, however task work runs anyway. For example, a common workload would obtain a batch of CQEs, and process each one. Interrupting this to run additional task work would add latency but not gain anything. If instead task work is deferred to just before more CQEs are obtained then no additional latency is added. The way this is implemented is by trying to keep task work local to an io_ring_ctx, rather than to the submission task. This is required, as the application will want to wake up only a single io_ring_ctx at a time to process work, and so the lists of work have to be kept separate. This has some other benefits like not having to check the task continually in handle_tw_list (and potentially unlocking/locking those), and reducing locks in the submit & process completions path. There are networking cases where using this option can reduce request latency by 50%. For instance, a contrived example using [1] where the client sends 2k data and receives the same data back while doing some system calls (to trigger task work) shows this reduction. The reason ends up being that if sending responses is delayed by processing task work, then the client side sits idle. Whereas reordering the sends first means that the client runs its workload in parallel with the local task work.
[1]: Using https://github.com/DylanZA/netbench/tree/defer_run Client: ./netbench --client_only 1 --control_port 10000 --host <host> --tx "epoll --threads 16 --per_thread 1 --size 2048 --resp 2048 --workload 1000" Server: ./netbench --server_only 1 --control_port 10000 --rx "io_uring --defer_taskrun 0 --workload 100" --rx "io_uring --defer_taskrun 1 --workload 100" Signed-off-by: Dylan Yudaken <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
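The deferral described in "io_uring: add IORING_SETUP_DEFER_TASKRUN" can be pictured with a small userspace model. This is only a sketch of the idea (work is queued on a per-ring list and runs when the application asks for events); the names ring_model, ring_defer, ring_get_events and complete_one are hypothetical and are not kernel or liburing symbols.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel symbols. */
#define MAX_WORK 16

typedef void (*work_fn)(int *completions);

struct ring_model {
    work_fn pending[MAX_WORK]; /* work kept local to this ring */
    size_t nr_pending;
    int completions;           /* stands in for posted CQEs */
};

/* Queue work on the ring instead of running it right away. */
static int ring_defer(struct ring_model *r, work_fn fn)
{
    assert(r != NULL);
    if (r->nr_pending == MAX_WORK)
        return -1;
    r->pending[r->nr_pending++] = fn;
    return 0;
}

/* Model of "enter with GETEVENTS": run deferred work, then report completions. */
static int ring_get_events(struct ring_model *r)
{
    assert(r != NULL);
    for (size_t i = 0; i < r->nr_pending; i++)
        r->pending[i](&r->completions);
    r->nr_pending = 0;
    return r->completions;
}

/* Example work item: posts a single completion. */
static void complete_one(int *completions)
{
    (*completions)++;
}
```

In this model nothing completes between ring_defer() and ring_get_events(), which mirrors the point of the commit: the application finishes processing its current batch before any further task work runs. On a real ring the mode is selected at io_uring_setup time with IORING_SETUP_DEFER_TASKRUN together with IORING_SETUP_SINGLE_ISSUER, as the commit message states.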
2022-09-21io_uring: do not run task work at the start of io_uring_enterDylan Yudaken1-2/+0
This is not needed, and it is normally better to wait for task work until after submissions. This will allow greater batching if either work arrives in the meanwhile, or if the submissions cause task work to be queued up. For SQPOLL this also no longer runs task work, but this is handled inside the SQPOLL loop anyway. For IOPOLL, io_iopoll_check will run task work anyway, and otherwise io_cqring_wait will run task work. Suggested-by: Pavel Begunkov <[email protected]> Signed-off-by: Dylan Yudaken <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-09-21io_uring: introduce io_has_workDylan Yudaken1-4/+9
This will be used later to know if the ring has outstanding work. Right now that only means overflow CQEs waiting to be copied to the main CQE ring, but later it will include deferred tasks. Signed-off-by: Dylan Yudaken <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-09-21io_uring: remove unnecessary variableDylan Yudaken1-4/+1
'running' is set once and read once, so can easily just remove it Signed-off-by: Dylan Yudaken <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-09-21eventfd: guard wake_up in eventfd fs calls as wellDylan Yudaken3-5/+9
Guard wakeups that the user can trigger, and that may end up triggering a call back into eventfd_signal. This is in addition to the current approach that only guards in eventfd_signal. Rename in_eventfd_signal -> in_eventfd at the same time to reflect this. Without this there would be a deadlock in the following code using libaio:

    #include <assert.h>
    #include <libaio.h>
    #include <poll.h>
    #include <stdint.h>
    #include <sys/eventfd.h>
    #include <unistd.h>

    int main()
    {
        struct io_context *ctx = NULL;
        struct iocb iocb;
        struct iocb *iocbs[] = { &iocb };
        int evfd;
        uint64_t val = 1;

        evfd = eventfd(0, EFD_CLOEXEC);
        assert(!io_setup(2, &ctx));
        /* poll the eventfd and also use it as the completion eventfd */
        io_prep_poll(&iocb, evfd, POLLIN);
        io_set_eventfd(&iocb, evfd);
        assert(1 == io_submit(ctx, 1, iocbs));
        /* this write triggers the wakeup that re-enters eventfd */
        write(evfd, &val, 8);
    }

Signed-off-by: Dylan Yudaken <[email protected]> Reviewed-by: Jens Axboe <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-09-21Merge tag 'dmaengine-fix-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine Linus Torvalds3-12/+17
Pull dmaengine fixes from Vinod Koul: "A couple of small driver fixes:
- xilinx_dma: devm_platform_ioremap_resource error handling, dma_set_mask_and_coherent failure handling, dt property read cleanup
- refcount leak fix for of_xudma_dev_get()
- zynqmp_dma: coverity fix for enum typecast"
* tag 'dmaengine-fix-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine:
  dmaengine: zynqmp_dma: Typecast with enum to fix the coverity warning
  dmaengine: ti: k3-udma-private: Fix refcount leak bug in of_xudma_dev_get()
  dmaengine: xilinx_dma: Report error in case of dma_set_mask_and_coherent API failure
  dmaengine: xilinx_dma: cleanup for fetching xlnx,num-fstores property
  dmaengine: xilinx_dma: Fix devm_platform_ioremap_resource error handling
2022-09-21Merge tag 'iommu-fixes-v6.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Linus Torvalds3-13/+27
Pull iommu fixes from Joerg Roedel: "Two fixes for Intel VT-d:
- Check the right capability bit for 5-level page table support.
- Revert a previous fix which caused a regression with Thunderbolt devices"
* tag 'iommu-fixes-v6.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/vt-d: Check correct capability for sagaw determination
  Revert "iommu/vt-d: Fix possible recursive locking in intel_iommu_init()"
2022-09-21Merge tag 'sound-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Linus Torvalds8-37/+80
Pull sound fixes from Takashi Iwai: "A bit more changes than wished, but still manageable amount. Most of commits are HD-audio specific device fixes / quirks, while there is a revert for the previous fix due to regressions and a double-free fix in ALSA core code"
* tag 'sound-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  Revert "ALSA: usb-audio: Split endpoint setups for hw_params and prepare"
  ALSA: core: Fix double-free at snd_card_new()
  ALSA: hda/realtek: Add a quirk for HP OMEN 16 (8902) mute LED
  ALSA: hda/hdmi: Fix the converter reuse for the silent stream
  ALSA: hda/realtek: Add quirk for ASUS GA503R laptop
  ALSA: hda/realtek: Add pincfg for ASUS G533Z HP jack
  ALSA: hda/realtek: Add pincfg for ASUS G513 HP jack
  ALSA: hda/realtek: Re-arrange quirk table entries
  ALSA: hda/realtek: Enable 4-speaker output Dell Precision 5530 laptop
  ALSA: hda/realtek: Enable 4-speaker output Dell Precision 5570 laptop
  ALSA: hda: Fix Nvidia dp infoframe
  ALSA: hda/realtek: Add quirk for Huawei WRT-WX9
  ALSA: hda/tegra: set depop delay for tegra
  ALSA: hda: add Intel 5 Series / 3400 PCI DID
  ALSA: hda: Fix hang at HD-audio codec unbinding due to refcount saturation
2022-09-21Merge tag 'exfat-for-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat Linus Torvalds1-2/+1
Pull exfat fix from Namjae Jeon:
- fix integer overflow on large partitions
* tag 'exfat-for-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
  exfat: fix overflow for large capacity partition
2022-09-21blk-wbt: call rq_qos_add() after wb_normal is initializedYu Kuai1-5/+4
Our test found a problem that wbt inflight counter is negative, which will cause io hang(noted that this problem doesn't exist in mainline):

t1: device create                t2: issue io
add_disk
 blk_register_queue
  wbt_enable_default
   wbt_init
    rq_qos_add
    // wb_normal is still 0
                                 /*
                                  * in mainline, disk can't be opened before
                                  * bdev_add(), however, in old kernels, disk
                                  * can be opened before blk_register_queue().
                                  */
                                 blkdev_issue_flush
                                 // disk size is 0, however, it's not checked
                                  submit_bio_wait
                                   submit_bio
                                    blk_mq_submit_bio
                                     rq_qos_throttle
                                      wbt_wait
                                       bio_to_wbt_flags
                                        rwb_enabled
                                        // wb_normal is 0, inflight is not increased
    wbt_queue_depth_changed(&rwb->rqos);
     wbt_update_limits
     // wb_normal is initialized
                                     rq_qos_track
                                      wbt_track
                                       rq->wbt_flags |= bio_to_wbt_flags(rwb, bio);
                                       // wb_normal is not 0,wbt_flags will be set
t3: io completion
blk_mq_free_request
 rq_qos_done
  wbt_done
   wbt_is_tracked
   // return true
   __wbt_done
    wbt_rqw_done
     atomic_dec_return(&rqw->inflight);
     // inflight is decreased

commit 8235b5c1e8c1 ("block: call bdev_add later in device_add_disk") can avoid this problem, however it's better to fix this problem in wbt:

1) Lower kernel can't backport this patch due to lots of refactor.
2) Root cause is that wbt call rq_qos_add() before wb_normal is initialized.

Fixes: e34cbd307477 ("blk-wbt: add general throttling mechanism") Cc: <[email protected]> Signed-off-by: Yu Kuai <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-09-21rnbd-srv: remove struct rnbd_devChristoph Hellwig3-50/+18
Given that rnbd_srv_sess_dev already has an open_flags member, there is no need for the rnbd_dev indirection as a simple block_device pointer works just as well. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Acked-by: Jack Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-09-21rnbd-srv: remove rnbd_dev_{open,close}Christoph Hellwig4-62/+18
These can be trivially open coded in the callers. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Acked-by: Jack Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-09-21rnbd-srv: remove rnbd_endioChristoph Hellwig2-15/+7
Fold rnbd_endio into the only caller. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Acked-by: Jack Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-09-21rnbd-srv: simplify rnbd_srv_fill_msg_open_rspChristoph Hellwig2-52/+13
Remove all the wrappers and just get the information directly from the block device, or where no such helpers exist the request_queue. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Acked-by: Jack Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-09-21block: Fix the enum blk_eh_timer_return documentationBart Van Assche1-2/+9
The documentation of the blk_eh_timer_return enumeration values does not reflect correctly how e.g. the SCSI core uses these values. Fix the documentation. Cc: Christoph Hellwig <[email protected]> Cc: Ming Lei <[email protected]> Cc: Hannes Reinecke <[email protected]> Cc: Damien Le Moal <[email protected]> Cc: Johannes Thumshirn <[email protected]> Fixes: 88b0cfad2888 ("block: document the blk_eh_timer_return values") Signed-off-by: Bart Van Assche <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Damien Le Moal <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-09-21s390/dasd: add device ping attributeStefan Haberland4-0/+81
Add a function to check if a device is accessible. This mostly makes sense for copy pair secondary devices but it will work for all devices. The sysfs attribute ping is a write-only attribute and will issue a NOP CCW to the device. In case of success it will return zero. If the device is not accessible it will return an error code. Signed-off-by: Stefan Haberland <[email protected]> Reviewed-by: Jan Hoeppner <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-09-21s390/dasd: suppress generic error messages for PPRC secondary devicesStefan Haberland2-0/+10
Suppress generic command reject messages and dump of sense data for Peer-To-Peer-Remote-Copy (PPRC) secondary errors. If IO is issued on a PPRC secondary device, a specific error message is printed instead. Signed-off-by: Stefan Haberland <[email protected]> Reviewed-by: Jan Hoeppner <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-09-21s390/dasd: add ioctl to perform a swap of the drivers copy pairStefan Haberland2-0/+67
The newly defined ioctl BIODASDCOPYPAIRSWAP takes a structure that specifies a copy pair that should be swapped. It will call the device discipline function to perform the swap operation. The structure looks as follows:

    struct dasd_copypair_swap_data_t {
        char primary[20];
        char secondary[20];
        __u8 reserved[64];
    };

where primary is the old primary device that will be replaced by the secondary device. The old primary will become a secondary device afterwards. Signed-off-by: Stefan Haberland <[email protected]> Reviewed-by: Jan Hoeppner <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
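As a rough illustration of how userspace might prepare the argument for BIODASDCOPYPAIRSWAP, the sketch below fills the structure from two bus IDs. The layout is taken from the commit message, with uint8_t standing in for __u8 so the snippet builds without kernel headers. The helper fill_swap_data is hypothetical, and both NUL-terminated IDs and a zeroed reserved area are assumptions rather than documented requirements.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Layout copied from the commit message; uint8_t stands in for __u8. */
struct dasd_copypair_swap_data_t {
    char primary[20];
    char secondary[20];
    uint8_t reserved[64];
};

/*
 * Hypothetical helper: fill the structure from two bus IDs.
 * Returns 0 on success, -1 if an ID does not fit with its terminator.
 */
static int fill_swap_data(struct dasd_copypair_swap_data_t *data,
                          const char *primary, const char *secondary)
{
    assert(data != NULL && primary != NULL && secondary != NULL);
    if (strlen(primary) >= sizeof(data->primary) ||
        strlen(secondary) >= sizeof(data->secondary))
        return -1;

    memset(data, 0, sizeof(*data)); /* also zeroes the reserved bytes */
    strcpy(data->primary, primary);
    strcpy(data->secondary, secondary);
    return 0;
}
```

The filled structure would then be handed to ioctl() on the DASD block device with the BIODASDCOPYPAIRSWAP request; that call is left out here because it needs the s390 DASD headers and hardware.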
2022-09-21s390/dasd: add copy pair swap capabilityStefan Haberland4-1/+117
In case of errors or misbehaviour of the primary device a controlled failover to one of the configured secondary devices needs to be performed. The swap processing stops I/O on the primary device, all requests are re-queued to the blocklayer queue, the entries in the copy relation are swapped and finally the link to the blockdevice is moved from primary to secondary dasd device. After this, the secondary becomes the new primary device and I/O is restarted on that device. Signed-off-by: Stefan Haberland <[email protected]> Reviewed-by: Jan Hoeppner <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-09-21s390/dasd: add copy pair setupStefan Haberland4-17/+623
A copy relation that is configured on the storage server side needs to be enabled separately in the device driver. A sysfs interface is created that allows userspace tooling to control such setup. The following sysfs entries are added to store and read copy relation information: copy_pair - Add/Delete a copy pair relation to the DASD device driver - Query all previously added copy pair relations copy_role - Query the copy pair role of the device To add a copy pair to the DASD device driver it has to be specified through the sysfs attribute copy_pair. Only one secondary device can be specified at a time together with the primary device. Both, secondary and primary can be used equally to define the copy pair. The secondary devices have to be offline when adding the copy relation. The primary device needs to be specified first followed by the comma separated secondary device. Read from the copy_pair attribute to get the current setup and write "clear" to the attribute to delete any existing setup. Example: $ echo 0.0.9700,0.0.9740 > /sys/bus/ccw/devices/0.0.9700/copy_pair $ cat /sys/bus/ccw/devices/0.0.9700/copy_pair 0.0.9700,0.0.9740 During device online processing the required data will be read from the storage server and the information will be compared to the setup requested through the copy_pair attribute. The registration of the primary and secondary device will be handled accordingly. A blockdevice is only allocated for copy relation primary devices. To query the copy role of a device read from the copy_role sysfs attribute. Possible values are primary, secondary, and none. Example: $ cat /sys/bus/ccw/devices/0.0.9700/copy_role primary Signed-off-by: Stefan Haberland <[email protected]> Reviewed-by: Jan Hoeppner <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
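Tooling that reads the copy_pair attribute has to split the "primary,secondary" value shown in the example above. A minimal sketch of such a parser follows; parse_copy_pair and BUS_ID_LEN are hypothetical names, the 20-byte limit is borrowed from the swap ioctl structure as an assumption, and the code assumes the attribute holds a single pair.

```c
#include <assert.h>
#include <string.h>

#define BUS_ID_LEN 20 /* assumption: same size as the swap ioctl fields */

/*
 * Hypothetical parser: split "primary,secondary" into two buffers.
 * A trailing newline, if present, is ignored.
 * Returns 0 on success, -1 on malformed input.
 */
static int parse_copy_pair(const char *value,
                           char primary[BUS_ID_LEN],
                           char secondary[BUS_ID_LEN])
{
    const char *comma;
    size_t plen, slen;

    assert(value != NULL);
    comma = strchr(value, ',');
    if (comma == NULL)
        return -1;

    plen = (size_t)(comma - value);
    slen = strcspn(comma + 1, "\n");
    if (plen == 0 || slen == 0 || plen >= BUS_ID_LEN || slen >= BUS_ID_LEN)
        return -1;

    memcpy(primary, value, plen);
    primary[plen] = '\0';
    memcpy(secondary, comma + 1, slen);
    secondary[slen] = '\0';
    return 0;
}
```

Input without a comma, such as the "clear" keyword that is written to delete a setup, is rejected by this parser rather than treated as a pair.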
2022-09-21s390/dasd: add query PPRC functionStefan Haberland3-0/+104
Add function to query the Peer-to-Peer-Remote-Copy (PPRC) state of a device by reading the related structure through a read subsystem data call. Signed-off-by: Stefan Haberland <[email protected]> Reviewed-by: Jan Hoeppner <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-09-21s390/dasd: put block allocation in separate functionStefan Haberland1-15/+23
Put block allocation into a separate function to put some copy pair logic in it in a later patch. Signed-off-by: Stefan Haberland <[email protected]> Reviewed-by: Jan Hoeppner <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-09-21KVM: s390: pci: register pci hooks without interpretationMatthew Rosato2-5/+13
The kvm registration hooks must be registered even if the facilities necessary for zPCI interpretation are unavailable, as vfio-pci-zdev will expect to use the hooks regardless. This fixes an issue where vfio-pci-zdev will fail its open function because of a missing kvm_register when running on hardware that does not support zPCI interpretation. Fixes: ca922fecda6c ("KVM: s390: pci: Hook to access KVM lowlevel from VFIO") Signed-off-by: Matthew Rosato <[email protected]> Reviewed-by: Pierre Morel <[email protected]> Link: https://lore.kernel.org/r/[email protected] Message-Id: <[email protected]> Signed-off-by: Janosch Frank <[email protected]>
2022-09-21KVM: s390: pci: fix GAIT physical vs virtual pointers usageMatthew Rosato2-2/+2
The GAIT and all of its entries must be represented by physical addresses as this structure is shared with underlying firmware. We can keep a virtual address of the GAIT origin in order to handle processing in the kernel, but when traversing the entries we must again convert the physical AISB stored in that GAIT entry into a virtual address in order to process it. Note: this currently doesn't fix a real bug, since virtual addresses are identical to physical ones. Reviewed-by: Pierre Morel <[email protected]> Acked-by: Nico Boehr <[email protected]> Signed-off-by: Matthew Rosato <[email protected]> Reviewed-by: Claudio Imbrenda <[email protected]> Link: https://lore.kernel.org/r/[email protected] Message-Id: <[email protected]> Signed-off-by: Janosch Frank <[email protected]>
2022-09-21KVM: s390: Pass initialized arg even if unusedJanis Schoetterl-Glausch1-3/+13
This silences smatch warnings reported by kbuild bot: arch/s390/kvm/gaccess.c:859 guest_range_to_gpas() error: uninitialized symbol 'prot'. arch/s390/kvm/gaccess.c:1064 access_guest_with_key() error: uninitialized symbol 'prot'. This is because it cannot tell that the value is not used in this case. The trans_exc* only examine prot if code is PGM_PROTECTION. Pass a dummy value for other codes. Reported-by: kernel test robot <[email protected]> Reported-by: Dan Carpenter <[email protected]> Signed-off-by: Janis Schoetterl-Glausch <[email protected]> Reviewed-by: Claudio Imbrenda <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Christian Borntraeger <[email protected]> Signed-off-by: Janosch Frank <[email protected]>
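The pattern behind "KVM: s390: Pass initialized arg even if unused" can be shown in isolation. The sketch below is not the gaccess.c code: report_exception, translate, the CODE_* constants and prot_kind are hypothetical stand-ins chosen only to show a callee that reads an argument for one code and a caller that still passes a defined value for all others.

```c
#include <assert.h>

/* Hypothetical stand-ins; values and names are illustrative only. */
enum { CODE_PROTECTION = 4, CODE_OTHER = 16 };
enum prot_kind { PROT_KIND_NONE = 0, PROT_KIND_WRITE = 1 };

/* Only reads 'prot' when code is CODE_PROTECTION. */
static int report_exception(int code, enum prot_kind prot)
{
    assert(code > 0);
    if (code == CODE_PROTECTION)
        return code * 100 + (int)prot;
    return code * 100;
}

/* Caller passes a dummy value when no protection kind applies. */
static int translate(int code, int is_protection_fault)
{
    enum prot_kind prot = PROT_KIND_NONE; /* dummy value, defined on every path */

    if (is_protection_fault)
        prot = PROT_KIND_WRITE;
    return report_exception(code, prot);
}
```

Because prot has a defined value on every path, a static analyzer no longer has to prove that the callee ignores it for non-protection codes.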
2022-09-21KVM: s390: pci: fix plain integer as NULL pointer warningsMatthew Rosato2-5/+5
Fix some sparse warnings that a plain integer 0 is being used instead of NULL. Reported-by: kernel test robot <[email protected]> Signed-off-by: Matthew Rosato <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Christian Borntraeger <[email protected]> Signed-off-by: Janosch Frank <[email protected]>
2022-09-21Merge tag 'linux-can-fixes-for-6.0-20220921' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Jakub Kicinski2-13/+18
Marc Kleine-Budde says:
====================
pull-request: can 2022-09-21
The 1st patch is by me, targets the flexcan driver and fixes a potential system hang on single core systems under high CAN packet rate. The next 2 patches are also by me and target the gs_usb driver. A potential race condition during the ndo_open callback as well as the return value if the ethtool identify feature is not supported are fixed.
* tag 'linux-can-fixes-for-6.0-20220921' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
  can: gs_usb: gs_usb_set_phys_id(): return with error if identify is not supported
  can: gs_usb: gs_can_open(): fix race dev->can.state condition
  can: flexcan: flexcan_mailbox_read() fix return value for drop = true
====================
Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-09-21perf jit: Include program header in ELF filesLieven Hey2-0/+18
The missing header makes it hard for programs like elfutils to open these files. Fixes: 2d86612aacb7805f ("perf symbol: Correct address for bss symbols") Reviewed-by: Leo Yan <[email protected]> Signed-off-by: Lieven Hey <[email protected]> Tested-by: Leo Yan <[email protected]> Cc: Leo Yan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-09-21perf test: Add a new test for perf stat cgroup BPF counterNamhyung Kim1-0/+83
$ sudo ./perf test -v each-cgroup 96: perf stat --bpf-counters --for-each-cgroup test : --- start --- test child forked, pid 79600 test child finished with 0 ---- end ---- perf stat --bpf-counters --for-each-cgroup test: Ok Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Song Liu <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-09-21perf stat: Use evsel->core.cpus to iterate cpus in BPF cgroup countersNamhyung Kim1-3/+3
If core and uncore events are mixed, each evsel would have a different cpu map. But the code assumed they are the same as the evlist's all_cpus and accessed them by the same index. This resulted in a crash like below. $ perf stat -a --bpf-counters --for-each-cgroup ^. -e cycles,imc/cas_count_read/ sleep 1 Segmentation fault While it's not recommended to use uncore events for cgroup aggregation, it should not crash. Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Song Liu <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-09-21perf stat: Fix cpu map index in bperf cgroup codeNamhyung Kim1-2/+2
The previous cpu map introduced a bug in the bperf cgroup counter. This results in a failure when user gives a partial cpu map starting from non-zero. $ sudo ./perf stat -C 1-2 --bpf-counters --for-each-cgroup ^. sleep 1 libbpf: prog 'on_cgrp_switch': failed to create BPF link for perf_event FD 0: -9 (Bad file descriptor) Failed to attach cgroup program To get the FD of an evsel, it should use a map index not the CPU number. Fixes: 0255571a16059c8e ("perf cpumap: Switch to using perf_cpu_map API") Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: [email protected] Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Song Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
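The difference between a CPU number and a map index in "perf stat: Fix cpu map index in bperf cgroup code" can be illustrated with a toy lookup. This is a sketch, not the perf or libperf API: cpu_map_model, map_index_of and fd_for_cpu are hypothetical, and the FD values in the usage note are made up.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: a CPU map lists CPU numbers; FDs are stored per map index. */
struct cpu_map_model {
    const int *cpus;
    size_t nr;
};

/* Returns the position of 'cpu' in the map, or -1 if it is not part of it. */
static int map_index_of(const struct cpu_map_model *map, int cpu)
{
    assert(map != NULL);
    for (size_t i = 0; i < map->nr; i++)
        if (map->cpus[i] == cpu)
            return (int)i;
    return -1;
}

/* Looks up the FD for 'cpu' by map index; returns -1 when the CPU is not in the map. */
static int fd_for_cpu(const struct cpu_map_model *map, const int *fds, int cpu)
{
    int idx = map_index_of(map, cpu);

    return idx < 0 ? -1 : fds[idx];
}
```

For a partial map such as -C 1-2, CPU 1 sits at index 0 and CPU 2 at index 1. With hypothetical FDs {7, 9}, fd_for_cpu() returns 7 for CPU 1, whereas indexing the FD array by the CPU number itself would pick the wrong slot for CPU 1 and read past the end for CPU 2.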
2022-09-21perf stat: Fix BPF program section nameNamhyung Kim1-1/+1
It seems the recent libbpf got more strict about the section name. I'm seeing a failure like this: $ sudo ./perf stat -a --bpf-counters --for-each-cgroup ^. sleep 1 libbpf: prog 'on_cgrp_switch': missing BPF prog type, check ELF section name 'perf_events' libbpf: prog 'on_cgrp_switch': failed to load: -22 libbpf: failed to load object 'bperf_cgroup_bpf' libbpf: failed to load BPF skeleton 'bperf_cgroup_bpf': -22 Failed to load cgroup skeleton The section name should be 'perf_event' (without the trailing 's'). Although it's related to the libbpf change, it'd be better to fix the section name in the first place. Fixes: 944138f048f7d759 ("perf stat: Enable BPF counter with --for-each-cgroup") Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: [email protected] Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Song Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-09-21Merge tag 'fpga-for-6.0-final' of git://git.kernel.org/pub/scm/linux/kernel/git/fpga/linux-fpga into char-misc-linus Greg Kroah-Hartman1-4/+4
Xu writes: FPGA Manager changes for 6.0-final
Intel m10 bmc secure update
- Russ's change fixes the memory leak for a sysfs node reading
All patches have been reviewed on the mailing list, and have been in the last linux-next releases (as part of our for-6.0 branch). Signed-off-by: Xu Yilun <[email protected]>
* tag 'fpga-for-6.0-final' of git://git.kernel.org/pub/scm/linux/kernel/git/fpga/linux-fpga:
  fpga: m10bmc-sec: Fix possible memory leak of flash_buf