aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2024-09-25Merge tag 'rtc-6.12' of ↵Linus Torvalds17-39/+633
git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux Pull RTC updates from Alexandre Belloni: "More conversions of DT bindings to yaml. There is one new driver, for the DFRobot SD2405AL and support for important features of the stm32 RTC. Summary: New driver: - DFRobot SD2405AL Drivers: - stm32: add alarm A out and LSCO support - sun6i: disable automatic clock input switching - m48t59: set range" * tag 'rtc-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: rtc: rc5t619: use proper module tables rtc: m48t59: set range dt-bindings: rtc: microcrystal,rv3028: add #clock-cells property rtc: m48t59: Remove division condition with direct comparison rtc: at91sam9: fix OF node leak in probe() error path rtc: sun6i: disable automatic clock input switching dt-bindings: rtc: Drop non-trivial duplicate compatibles dt-bindings: vendor-prefixes: Add DFRobot. dt-bindings: rtc: Add support for SD2405AL. rtc: Add driver for SD2405AL rtc: s35390a: Drop vendorless compatible string from match table rtc: twl: convert comma to semicolon dt-bindings: rtc: sprd,sc2731-rtc: convert to YAML rtc: stm32: add alarm A out feature rtc: stm32: add Low Speed Clock Output (LSCO) support rtc: stm32: add pinctrl and pinmux interfaces dt-bindings: rtc: stm32: describe pinmux nodes
2024-09-25dt-bindings: input: Revert "dt-bindings: input: Goodix SPI HID Touchscreen"Krzysztof Kozlowski1-71/+0
This reverts commit 9184b17fbc23 ("dt-bindings: input: Goodix SPI HID Touchscreen") because it duplicates existing binding leadings to errors: goodix,gt7986u.example.dtb: touchscreen@0: compatible: 'oneOf' conditional failed, one must be fixed: ['goodix,gt7986u'] is too short 'goodix,gt7375p' was expected This was reported on mailing list on 6th of September, but no reaction happened from contributor or maintainer to fix it. Therefore let's drop binding which breaks and duplicates existing one. Fixes: 9184b17fbc23 ("dt-bindings: input: Goodix SPI HID Touchscreen") Reported-by: Rob Herring <[email protected]> Closes: https://lore.kernel.org/all/CAL_Jsq+QfTtRj_JCqXzktQ49H8VUnztVuaBjvvkg3fwEHniUHw@mail.gmail.com/ Signed-off-by: Krzysztof Kozlowski <[email protected]> Signed-off-by: Jiri Kosina <[email protected]>
2024-09-25HID: hid-goodix: drop unsupported and undocumented DT partKrzysztof Kozlowski1-9/+0
Drop support for Devicetree from, because the binding is being reverted (on basis of duplicating existing binding) and property was not added to the original binding. Signed-off-by: Krzysztof Kozlowski <[email protected]> Signed-off-by: Jiri Kosina <[email protected]>
2024-09-25Merge tag 'memblock-v6.12-rc1' of ↵Linus Torvalds16-15/+104
git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock Pull memblock updates from Mike Rapoport: - new memblock_estimated_nr_free_pages() helper to replace totalram_pages() which is less accurate when CONFIG_DEFERRED_STRUCT_PAGE_INIT is set - fixes for memblock tests * tag 'memblock-v6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock: s390/mm: get estimated free pages by memblock api kernel/fork.c: get estimated free pages by memblock api mm/memblock: introduce a new helper memblock_estimated_nr_free_pages() memblock test: fix implicit declaration of function 'strscpy' memblock test: fix implicit declaration of function 'isspace' memblock test: fix implicit declaration of function 'memparse' memblock test: add the definition of __setup() memblock test: fix implicit declaration of function 'virt_to_phys' tools/testing: abstract two init.h into common include directory memblock tests: include export.h in linkage.h as kernel dose memblock tests: include memory_hotplug.h in mmzone.h as kernel dose
2024-09-25Merge tag 'sparc-for-6.12-tag1' of ↵Linus Torvalds1-7/+1
git://git.kernel.org/pub/scm/linux/kernel/git/alarsson/linux-sparc Pull sparc32 update from Andreas Larsson: - Remove an unused variable for sparc32 * tag 'sparc-for-6.12-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/alarsson/linux-sparc: arch/sparc: remove unused varible paddrbase in function leon_swprobe()
2024-09-25Merge tag 'powerpc-6.12-2' of ↵Linus Torvalds2-100/+100
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix build error in vdso32 when building 64-bit with COMPAT=y and -Os - Fix build error in pseries EEH when CONFIG_DEBUG_FS is not set Thanks to Christophe Leroy, Narayana Murty N, Christian Zigotzky, and Ritesh Harjani. * tag 'powerpc-6.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/pseries/eeh: move pseries_eeh_err_inject() outside CONFIG_DEBUG_FS block powerpc/vdso32: Fix use of crtsavres for PPC64
2024-09-25Merge tag 'clang-format-6.12' of https://github.com/ojeda/linuxLinus Torvalds1-9/+30
Pull clang-format updates from Miguel Ojeda: "A routine update of the 'for_each' macro list" * tag 'clang-format-6.12' of https://github.com/ojeda/linux: clang-format: Update with v6.11-rc1's `for_each` macro list
2024-09-25Kbuild: make MODVERSIONS support depend on not being a compile test buildLinus Torvalds1-0/+1
Currently the Rust support is gated on not having MODVERSIONS enabled, and as a result an "allmodconfig" build will disable Rust build tests. While MODVERSIONS configurations are worth build testing, the feature is not actually meaningful unless you run the result, and I'd rather get build coverage of Rust than MODVERSIONS. So let's disable MODVERSIONS for build testing until the Rust side clears up. Signed-off-by: Linus Torvalds <[email protected]>
2024-09-25Merge tag 'rust-6.12' of https://github.com/Rust-for-Linux/linuxLinus Torvalds63-394/+3870
Pull Rust updates from Miguel Ojeda: "Toolchain and infrastructure: - Support 'MITIGATION_{RETHUNK,RETPOLINE,SLS}' (which cleans up objtool warnings), teach objtool about 'noreturn' Rust symbols and mimic '___ADDRESSABLE()' for 'module_{init,exit}'. With that, we should be objtool-warning-free, so enable it to run for all Rust object files. - KASAN (no 'SW_TAGS'), KCFI and shadow call sanitizer support. - Support 'RUSTC_VERSION', including re-config and re-build on change. - Split helpers file into several files in a folder, to avoid conflicts in it. Eventually those files will be moved to the right places with the new build system. In addition, remove the need to manually export the symbols defined there, reusing existing machinery for that. - Relax restriction on configurations with Rust + GCC plugins to just the RANDSTRUCT plugin. 'kernel' crate: - New 'list' module: doubly-linked linked list for use with reference counted values, which is heavily used by the upcoming Rust Binder. This includes 'ListArc' (a wrapper around 'Arc' that is guaranteed unique for the given ID), 'AtomicTracker' (tracks whether a 'ListArc' exists using an atomic), 'ListLinks' (the prev/next pointers for an item in a linked list), 'List' (the linked list itself), 'Iter' (an iterator over a 'List'), 'Cursor' (a cursor into a 'List' that allows to remove elements), 'ListArcField' (a field exclusively owned by a 'ListArc'), as well as support for heterogeneous lists. - New 'rbtree' module: red-black tree abstractions used by the upcoming Rust Binder. This includes 'RBTree' (the red-black tree itself), 'RBTreeNode' (a node), 'RBTreeNodeReservation' (a memory reservation for a node), 'Iter' and 'IterMut' (immutable and mutable iterators), 'Cursor' (bidirectional cursor that allows to remove elements), as well as an entry API similar to the Rust standard library one. - 'init' module: add 'write_[pin_]init' methods and the 'InPlaceWrite' trait. Add the 'assert_pinned!' macro. - 'sync' module: implement the 'InPlaceInit' trait for 'Arc' by introducing an associated type in the trait. - 'alloc' module: add 'drop_contents' method to 'BoxExt'. - 'types' module: implement the 'ForeignOwnable' trait for 'Pin<Box<T>>' and improve the trait's documentation. In addition, add the 'into_raw' method to the 'ARef' type. - 'error' module: in preparation for the upcoming Rust support for 32-bit architectures, like arm, locally allow Clippy lint for those. Documentation: - https://rust.docs.kernel.org has been announced, so link to it. - Enable rustdoc's "jump to definition" feature, making its output a bit closer to the experience in a cross-referencer. - Debian Testing now also provides recent Rust releases (outside of the freeze period), so add it to the list. MAINTAINERS: - Trevor is joining as reviewer of the "RUST" entry. And a few other small bits" * tag 'rust-6.12' of https://github.com/Rust-for-Linux/linux: (54 commits) kasan: rust: Add KASAN smoke test via UAF kbuild: rust: Enable KASAN support rust: kasan: Rust does not support KHWASAN kbuild: rust: Define probing macros for rustc kasan: simplify and clarify Makefile rust: cfi: add support for CFI_CLANG with Rust cfi: add CONFIG_CFI_ICALL_NORMALIZE_INTEGERS rust: support for shadow call stack sanitizer docs: rust: include other expressions in conditional compilation section kbuild: rust: replace proc macros dependency on `core.o` with the version text kbuild: rust: rebuild if the version text changes kbuild: rust: re-run Kconfig if the version text changes kbuild: rust: add `CONFIG_RUSTC_VERSION` rust: avoid `box_uninit_write` feature MAINTAINERS: add Trevor Gross as Rust reviewer rust: rbtree: add `RBTree::entry` rust: rbtree: add cursor rust: rbtree: add mutable iterator rust: rbtree: add iterator rust: rbtree: add red-black tree implementation backed by the C version ...
2024-09-25sefltests/tracing: Add a test for tracepoint events on modulesMasami Hiramatsu (Google)2-0/+62
Add a test case for tracepoint events on modules. This checks if it can add and remove the events correctly. Link: https://lore.kernel.org/all/172397781494.286558.7581515061075998225.stgit@devnote2/ Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
2024-09-25tracing/fprobe: Support raw tracepoints on future loaded modulesMasami Hiramatsu (Google)2-51/+101
Support raw tracepoint events on future loaded (unloaded) modules. This allows user to create raw tracepoint events which can be used from module's __init functions. Note: since the kernel does not have any information about the tracepoints in the unloaded modules, fprobe events can not check whether the tracepoint exists nor extend the BTF based arguments. Link: https://lore.kernel.org/all/172397780593.286558.18360375226968537828.stgit@devnote2/ Suggested-by: Mathieu Desnoyers <[email protected]> Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
2024-09-25tracing/fprobe: Support raw tracepoint events on modulesMasami Hiramatsu (Google)1-8/+38
Support raw tracepoint event on module by fprobe events. Since it only uses for_each_kernel_tracepoint() to find a tracepoint, the tracepoints on modules are not handled. Thus if user specified a tracepoint on a module, it shows an error. This adds new for_each_module_tracepoint() API to tracepoint subsystem, and uses it to find tracepoints on modules. Link: https://lore.kernel.org/all/172397779651.286558.15903703620679186867.stgit@devnote2/ Reported-by: don <[email protected]> Closes: https://lore.kernel.org/all/[email protected]/ Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
2024-09-25tracepoint: Support iterating tracepoints in a loading moduleMasami Hiramatsu (Google)2-10/+44
Add for_each_tracepoint_in_module() function to iterate tracepoints in a module. This API is needed for handling tracepoints in a loading module from tracepoint_module_notifier callback function. This also update for_each_module_tracepoint() to pass the module to callback function so that it can find module easily. Link: https://lore.kernel.org/all/172397778740.286558.15781131277732977643.stgit@devnote2/ Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
2024-09-25tracepoint: Support iterating over tracepoints on modulesMasami Hiramatsu (Google)2-0/+28
Add for_each_module_tracepoint() for iterating over tracepoints on modules. This is similar to the for_each_kernel_tracepoint() but only for the tracepoints on modules (not including kernel built-in tracepoints). Link: https://lore.kernel.org/all/172397777800.286558.14554748203446214056.stgit@devnote2/ Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
2024-09-25x86/pvh: Add 64bit relocation page tablesJason Andryuk1-1/+103
The PVH entry point is 32bit. For a 64bit kernel, the entry point must switch to 64bit mode, which requires a set of page tables. In the past, PVH used init_top_pgt. This works fine when the kernel is loaded at LOAD_PHYSICAL_ADDR, as the page tables are prebuilt for this address. If the kernel is loaded at a different address, they need to be adjusted. __startup_64() adjusts the prebuilt page tables for the physical load address, but it is 64bit code. The 32bit PVH entry code can't call it to adjust the page tables, so it can't readily be re-used. 64bit PVH entry needs page tables set up for identity map, the kernel high map and the direct map. pvh_start_xen() enters identity mapped. Inside xen_prepare_pvh(), it jumps through a pv_ops function pointer into the highmap. The direct map is used for __va() on the initramfs and other guest physical addresses. Add a dedicated set of prebuild page tables for PVH entry. They are adjusted in assembly before loading. Add XEN_ELFNOTE_PHYS32_RELOC to indicate support for relocation along with the kernel's loading constraints. The maximum load address, KERNEL_IMAGE_SIZE - 1, is determined by a single pvh_level2_ident_pgt page. It could be larger with more pages. Signed-off-by: Jason Andryuk <[email protected]> Reviewed-by: Juergen Gross <[email protected]> Message-ID: <[email protected]> Signed-off-by: Juergen Gross <[email protected]>
2024-09-25x86/kernel: Move page table macros to headerJason Andryuk2-21/+22
The PVH entry point will need an additional set of prebuild page tables. Move the macros and defines to pgtable_64.h, so they can be re-used. Signed-off-by: Jason Andryuk <[email protected]> Reviewed-by: Juergen Gross <[email protected]> Acked-by: Dave Hansen <[email protected]> Message-ID: <[email protected]> Signed-off-by: Juergen Gross <[email protected]>
2024-09-25x86/pvh: Set phys_base when calling xen_prepare_pvh()Jason Andryuk1-0/+13
phys_base needs to be set for __pa() to work in xen_pvh_init() when finding the hypercall page. Set it before calling into xen_prepare_pvh(), which calls xen_pvh_init(). Clear it afterward to avoid __startup_64() adding to it and creating an incorrect value. Signed-off-by: Jason Andryuk <[email protected]> Reviewed-by: Juergen Gross <[email protected]> Message-ID: <[email protected]> Signed-off-by: Juergen Gross <[email protected]>
2024-09-25x86/pvh: Make PVH entrypoint PIC for x86-64Jason Andryuk1-12/+34
The PVH entrypoint is 32bit non-PIC code running the uncompressed vmlinux at its load address CONFIG_PHYSICAL_START - default 0x1000000 (16MB). The kernel is loaded at that physical address inside the VM by the VMM software (Xen/QEMU). When running a Xen PVH Dom0, the host reserved addresses are mapped 1-1 into the PVH container. There exist system firmwares (Coreboot/EDK2) with reserved memory at 16MB. This creates a conflict where the PVH kernel cannot be loaded at that address. Modify the PVH entrypoint to be position-indepedent to allow flexibility in load address. Only the 64bit entry path is converted. A 32bit kernel is not PIC, so calling into other parts of the kernel, like xen_prepare_pvh() and mk_pgtable_32(), don't work properly when relocated. This makes the code PIC, but the page tables need to be updated as well to handle running from the kernel high map. The UNWIND_HINT_END_OF_STACK is to silence: vmlinux.o: warning: objtool: pvh_start_xen+0x7f: unreachable instruction after the lret into 64bit code. Signed-off-by: Jason Andryuk <[email protected]> Reviewed-by: Juergen Gross <[email protected]> Message-ID: <[email protected]> Signed-off-by: Juergen Gross <[email protected]>
2024-09-25xen: sync elfnote.h from xen treeJason Andryuk1-5/+88
Sync Xen's elfnote.h header from xen.git to pull in the XEN_ELFNOTE_PHYS32_RELOC define. xen commit dfc9fab00378 ("x86/PVH: Support relocatable dom0 kernels") This is a copy except for the removal of the emacs editor config at the end of the file. Signed-off-by: Jason Andryuk <[email protected]> Reviewed-by: Juergen Gross <[email protected]> Message-ID: <[email protected]> Signed-off-by: Juergen Gross <[email protected]>
2024-09-25kprobes: Remove obsoleted declaration for init_test_probesGaosheng Cui1-9/+0
The init_test_probes() have been removed since commit e44e81c5b90f ("kprobes: convert tests to kunit"), and now it is useless, so remove it. Link: https://lore.kernel.org/all/[email protected]/ Signed-off-by: Gaosheng Cui <[email protected]> Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
2024-09-25uprobes: turn trace_uprobe's nhit counter to be per-CPU oneAndrii Nakryiko1-3/+21
trace_uprobe->nhit counter is not incremented atomically, so its value is questionable in when uprobe is hit on multiple CPUs simultaneously. Also, doing this shared counter increment across many CPUs causes heavy cache line bouncing, limiting uprobe/uretprobe performance scaling with number of CPUs. Solve both problems by making this a per-CPU counter. Link: https://lore.kernel.org/all/[email protected]/ Reviewed-by: Oleg Nesterov <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
2024-09-25vsock/virtio: avoid queuing packets when intermediate queue is emptyLuigi Leonardi1-4/+35
When the driver needs to send new packets to the device, it always queues the new sk_buffs into an intermediate queue (send_pkt_queue) and schedules a worker (send_pkt_work) to then queue them into the virtqueue exposed to the device. This increases the chance of batching, but also introduces a lot of latency into the communication. So we can optimize this path by adding a fast path to be taken when there is no element in the intermediate queue, there is space available in the virtqueue, and no other process that is sending packets (tx_lock held). The following benchmarks were run to check improvements in latency and throughput. The test bed is a host with Intel i7-10700KF CPU @ 3.80GHz and L1 guest running on QEMU/KVM with vhost process and all vCPUs pinned individually to pCPUs. - Latency Tool: Fio version 3.37-56 Mode: pingpong (h-g-h) Test runs: 50 Runtime-per-test: 50s Type: SOCK_STREAM In the following fio benchmark (pingpong mode) the host sends a payload to the guest and waits for the same payload back. fio process pinned both inside the host and the guest system. Before: Linux 6.9.8 Payload 64B: 1st perc. overall 99th perc. Before 12.91 16.78 42.24 us After 9.77 13.57 39.17 us Payload 512B: 1st perc. overall 99th perc. Before 13.35 17.35 41.52 us After 10.25 14.11 39.58 us Payload 4K: 1st perc. overall 99th perc. Before 14.71 19.87 41.52 us After 10.51 14.96 40.81 us - Throughput Tool: iperf-vsock The size represents the buffer length (-l) to read/write P represents the number of parallel streams P=1 4K 64K 128K Before 6.87 29.3 29.5 Gb/s After 10.5 39.4 39.9 Gb/s P=2 4K 64K 128K Before 10.5 32.8 33.2 Gb/s After 17.8 47.7 48.5 Gb/s P=4 4K 64K 128K Before 12.7 33.6 34.2 Gb/s After 16.9 48.1 50.5 Gb/s The performance improvement is related to this optimization, I used a ebpf kretprobe on virtio_transport_send_skb to check that each packet was sent directly to the virtqueue Co-developed-by: Marco Pinna <[email protected]> Signed-off-by: Marco Pinna <[email protected]> Signed-off-by: Luigi Leonardi <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]> Reviewed-by: Stefano Garzarella <[email protected]>
2024-09-25vsock/virtio: refactor virtio_transport_send_pkt_workMarco Pinna1-46/+59
Preliminary patch to introduce an optimization to the enqueue system. All the code used to enqueue a packet into the virtqueue is removed from virtio_transport_send_pkt_work() and moved to the new virtio_transport_send_skb() function. Co-developed-by: Luigi Leonardi <[email protected]> Signed-off-by: Luigi Leonardi <[email protected]> Signed-off-by: Marco Pinna <[email protected]> Reviewed-by: Stefano Garzarella <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2024-09-25fw_cfg: Constify struct kobj_typeHongbo Li1-1/+1
This 'struct kobj_type' is not modified. It is only used in kobject_init_and_add() which takes a 'const struct kobj_type *ktype' parameter. Constifying this structure and moving it to a read-only section, and this can increase over all security. ``` [Before] text data bss dec hex filename 5974 1008 96 7078 1ba6 drivers/firmware/qemu_fw_cfg.o [After] text data bss dec hex filename 6038 944 96 7078 1ba6 drivers/firmware/qemu_fw_cfg.o ``` Signed-off-by: Hongbo Li <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2024-09-25vdpa/mlx5: Postpone MR deletionDragos Tatulea3-5/+64
Currently, when a new MR is set up, the old MR is deleted. MR deletion is about 30-40% the time of MR creation. As deleting the old MR is not important for the process of setting up the new MR, this operation can be postponed. This series adds a workqueue that does MR garbage collection at a later point. If the MR lock is taken, the handler will back off and reschedule. The exception during shutdown: then the handler must not postpone the work. Note that this is only a speculative optimization: if there is some mapping operation that is triggered while the garbage collector handler has the lock taken, this operation it will have to wait for the handler to finish. Signed-off-by: Dragos Tatulea <[email protected]> Reviewed-by: Cosmin Ratiu <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2024-09-25vdpa/mlx5: Introduce init/destroy for MR resourcesDragos Tatulea4-5/+26
There's currently not a lot of action happening during the init/destroy of MR resources. But more will be added in the upcoming patches. As the mr mutex lock init/destroy has been moved to these new functions, the lifetime has now shifted away from mlx5_vdpa_alloc_resources() / mlx5_vdpa_free_resources() into these new functions. However, the lifetime at the outer scope remains the same: mlx5_vdpa_dev_add() / mlx5_vdpa_dev_free() Signed-off-by: Dragos Tatulea <[email protected]> Reviewed-by: Cosmin Ratiu <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2024-09-25vdpa/mlx5: Rename mr_mtx -> lockDragos Tatulea4-16/+16
Now that the mr resources have their own namespace in the struct, give the lock a clearer name. Signed-off-by: Dragos Tatulea <[email protected]> Reviewed-by: Cosmin Ratiu <[email protected]> Acked-by: Eugenio Pérez <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2024-09-25vdpa/mlx5: Extract mr members in own resource structDragos Tatulea4-41/+44
Group all mapping related resources into their own structure. Upcoming patches will add more members in this new structure. Signed-off-by: Dragos Tatulea <[email protected]> Reviewed-by: Cosmin Ratiu <[email protected]> Acked-by: Eugenio Pérez <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2024-09-25vdpa/mlx5: Rename functionDragos Tatulea3-6/+6
A followup patch will use this name for something else. Signed-off-by: Dragos Tatulea <[email protected]> Reviewed-by: Cosmin Ratiu <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2024-09-25vdpa/mlx5: Delete direct MKEYs in parallelDragos Tatulea1-0/+64
Use the async interface to issue MTT MKEY deletion. This makes destroy_user_mr() on average 8x times faster. This number is also dependent on the size of the MR being deleted. Signed-off-by: Dragos Tatulea <[email protected]> Reviewed-by: Cosmin Ratiu <[email protected]> Acked-by: Eugenio Pérez <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2024-09-25vdpa/mlx5: Create direct MKEYs in parallelDragos Tatulea1-22/+98
Use the async interface to issue MTT MKEY creation. Extra care is taken at the allocation of FW input commands due to the MTT tables having variable sizes depending on MR. The indirect MKEY is still created synchronously at the end as the direct MKEYs need to be filled in. This makes create_user_mr() 3-5x faster, depending on the size of the MR. Signed-off-by: Dragos Tatulea <[email protected]> Reviewed-by: Cosmin Ratiu <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2024-09-25MAINTAINERS: add virtio-vsock driver in the VIRTIO CORE sectionStefano Garzarella1-0/+1
The virtio-vsock driver is already under VM SOCKETS (AF_VSOCK), managed pricipally with the net tree, and VIRTIO AND VHOST VSOCK DRIVER. However, changes that only affect the virtio part usually go with Michael's tree, so let's also put the driver in the VIRTIO CORE section to have its maintainers in CC for changes to the virtio-vsock driver. Cc: "Michael S. Tsirkin" <[email protected]> Cc: Jason Wang <[email protected]> Signed-off-by: Stefano Garzarella <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]> Reviewed-by: Stefan Hajnoczi <[email protected]> Acked-by: Jason Wang <[email protected]>
2024-09-25virtio_fs: add sysfs entries for queue informationMax Gurtovoy1-8/+139
Introduce sysfs entries to provide visibility to the multiple queues used by the Virtio FS device. This enhancement allows users to query information about these queues. Specifically, add two sysfs entries: 1. Queue name: Provides the name of each queue (e.g. hiprio/requests.8). 2. CPU list: Shows the list of CPUs that can process requests for each queue. The CPU list feature is inspired by similar functionality in the block MQ layer, which provides analogous sysfs entries for block devices. These new sysfs entries will improve observability and aid in debugging and performance tuning of Virtio FS devices. Reviewed-by: Idan Zach <[email protected]> Reviewed-by: Shai Malin <[email protected]> Signed-off-by: Max Gurtovoy <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2024-09-25virtio_fs: introduce virtio_fs_put_locked helperMax Gurtovoy1-6/+11
Introduce a new helper function virtio_fs_put_locked to encapsulate the common pattern of releasing a virtio_fs reference while holding a lock. The existing virtio_fs_put helper will be used to release a virtio_fs reference while not holding a lock. Also add an assertion in case the lock is not taken when it should. Reviewed-by: Idan Zach <[email protected]> Reviewed-by: Shai Malin <[email protected]> Signed-off-by: Max Gurtovoy <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]> Reviewed-by: Stefan Hajnoczi <[email protected]>
2024-09-25vdpa: Remove unused declarationsYue Haibing2-4/+0
There is no caller and implementation in tree. Signed-off-by: Yue Haibing <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]> Reviewed-by: Shannon Nelson <[email protected]> Reviewed-by: Zhu Lingshan <[email protected]> Reviewed-by: Shannon Nelson &lt;<a href="mailto:[email protected]" target="_blank">[email protected]</a>&gt;<br> Reviewed-by: Zhu Lingshan <[email protected]>
2024-09-25vdpa/mlx5: Parallelize VQ suspend/resume for CVQ MQ commandDragos Tatulea1-10/+12
change_num_qps() is still suspending/resuming VQs one by one. This change switches to parallel suspend/resume. When increasing the number of queues the flow has changed a bit for simplicity: the setup_vq() function will always be called before resume_vqs(). If the VQ is initialized, setup_vq() will exit early. If the VQ is not initialized, setup_vq() will create it and resume_vqs() will resume it. Signed-off-by: Dragos Tatulea <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]> Acked-by: Eugenio Pérez <[email protected]> Tested-by: Lei Yang <[email protected]>
2024-09-25vdpa/mlx5: Small improvement for change_num_qps()Dragos Tatulea1-10/+11
change_num_qps() has a lot of multiplications by 2 to convert the number of VQ pairs to number of VQs. This patch simplifies the code by doing the VQP -> VQ count conversion at the beginning in a variable. Signed-off-by: Dragos Tatulea <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]> Acked-by: Eugenio Pérez <[email protected]> Tested-by: Lei Yang <[email protected]>
2024-09-25vdpa/mlx5: Keep notifiers during suspend but ignoreDragos Tatulea1-2/+4
Unregistering notifiers is a costly operation. Instead of removing the notifiers during device suspend and adding them back at resume, simply ignore the call when the device is suspended. At resume time call queue_link_work() to make sure that the device state is propagated in case there were changes. For 1 vDPA device x 32 VQs (16 VQPs) attached to a large VM (256 GB RAM, 32 CPUs x 2 threads per core), the device suspend time is reduced from ~13 ms to ~2.5 ms. Signed-off-by: Dragos Tatulea <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Acked-by: Eugenio Pérez <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]> Tested-by: Lei Yang <[email protected]>
2024-09-25vdpa/mlx5: Parallelize device resumeDragos Tatulea1-26/+14
Currently device resume works on vqs serially. Building up on previous changes that converted vq operations to the async api, this patch parallelizes the device resume. For 1 vDPA device x 32 VQs (16 VQPs) attached to a large VM (256 GB RAM, 32 CPUs x 2 threads per core), the device resume time is reduced from ~16 ms to ~4.5 ms. Signed-off-by: Dragos Tatulea <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Acked-by: Eugenio Pérez <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]> Tested-by: Lei Yang <[email protected]>
2024-09-25vdpa/mlx5: Parallelize device suspendDragos Tatulea1-27/+29
Currently device suspend works on vqs serially. Building up on previous changes that converted vq operations to the async api, this patch parallelizes the device suspend: 1) Suspend all active vqs parallel. 2) Query suspended vqs in parallel. For 1 vDPA device x 32 VQs (16 VQPs) attached to a large VM (256 GB RAM, 32 CPUs x 2 threads per core), the device suspend time is reduced from ~37 ms to ~13 ms. A later patch will remove the link unregister operation which will make it even faster. Signed-off-by: Dragos Tatulea <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Acked-by: Eugenio Pérez <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]> Tested-by: Lei Yang <[email protected]>
2024-09-25vdpa/mlx5: Use async API for vq modify commandsDragos Tatulea1-48/+106
Switch firmware vq modify command to be issued via the async API to allow future parallelization. The new refactored function applies the modify on a range of vqs and waits for their execution to complete. For now the command is still used in a serial fashion. A later patch will switch to modifying multiple vqs in parallel. Signed-off-by: Dragos Tatulea <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]> Acked-by: Eugenio Pérez <[email protected]> Tested-by: Lei Yang <[email protected]>
2024-09-25vdpa/mlx5: Use async API for vq query commandDragos Tatulea2-25/+78
Switch firmware vq query command to be issued via the async API to allow future parallelization. For now the command is still serial but the infrastructure is there to issue commands in parallel, including ratelimiting the number of issued async commands to firmware. A later patch will switch to issuing more commands at a time. Signed-off-by: Dragos Tatulea <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]> Tested-by: Lei Yang <[email protected]>
2024-09-25vdpa/mlx5: Introduce async fw command wrapperDragos Tatulea2-0/+88
Introduce a new function mlx5_vdpa_exec_async_cmds() which wraps the mlx5_core async firmware command API in a way that will be used to parallelize certain operation in this driver. The wrapper deals with the case when mlx5_cmd_exec_cb() returns EBUSY due to the command being throttled. Signed-off-by: Dragos Tatulea <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]> Acked-by: Eugenio Pérez <[email protected]> Tested-by: Lei Yang <[email protected]>
2024-09-25vdpa/mlx5: Introduce error logging functionDragos Tatulea2-12/+17
mlx5_vdpa_err() was missing. This patch adds it and uses it in the necessary places. Signed-off-by: Dragos Tatulea <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Acked-by: Eugenio Pérez <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]> Tested-by: Lei Yang <[email protected]>
2024-09-25net/mlx5: Support throttled commands from async APIDragos Tatulea1-5/+16
Currently, commands that qualify as throttled can't be used via the async API. That's due to the fact that the throttle semaphore can sleep but the async API can't. This patch allows throttling in the async API by using the tentative variant of the semaphore and upon failure (semaphore at 0) returns EBUSY to signal to the caller that they need to wait for the completion of previously issued commands. Furthermore, make sure that the semaphore is released in the callback. Signed-off-by: Dragos Tatulea <[email protected]> Cc: Leon Romanovsky <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]> Tested-by: Lei Yang <[email protected]>
2024-09-25xen/pciback: fix cast to restricted pci_ers_result_t and pci_power_tMin-Hua Chen2-2/+2
This patch fix the following sparse warning by applying __force cast to pci_ers_result_t and pci_power_t. drivers/xen/xen-pciback/pci_stub.c:760:16: sparse: warning: cast to restricted pci_ers_result_t drivers/xen/xen-pciback/conf_space_capability.c:125:22: sparse: warning: cast to restricted pci_power_t No functional changes intended. Signed-off-by: Min-Hua Chen <[email protected]> Reviewed-by: Juergen Gross <[email protected]> Message-ID: <[email protected]> Signed-off-by: Juergen Gross <[email protected]>
2024-09-25Merge tag 'nvme-6.12-2024-09-25' of git://git.infradead.org/nvme into ↵Jens Axboe3-8/+12
for-6.12/block Pull NVMe fixes from Keith: "nvme fixes for Linux 6.12 - Multipath fixes (Hannes) - Sysfs attribute list NULL terminate fix (Shin'ichiro) - Remove problematic read-back (Keith)" * tag 'nvme-6.12-2024-09-25' of git://git.infradead.org/nvme: nvme: remove CC register read-back during enabling nvme: null terminate nvme_tls_attrs nvme-multipath: avoid hang on inaccessible namespaces nvme-multipath: system fails to create generic nvme device
2024-09-25Revert "driver core: don't always lock parent in shutdown"Greg Kroah-Hartman1-2/+2
This reverts commit ba6353748e71bd1d7e422fec2b5c2e2dfc2e3bd9. The series is being reverted before -rc1 as there are still reports of lockups on shutdown, so it's not quite ready for "prime time." Reported-by: Andrey Skvortsov <[email protected]> Link: https://lore.kernel.org/r/[email protected] Cc: Christoph Hellwig <[email protected]> Cc: David Jeffery <[email protected]> Cc: Keith Busch <[email protected]> Cc: Laurence Oberman <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Sagi Grimberg <[email protected]> Cc: Stuart Hayes <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2024-09-25Revert "driver core: separate function to shutdown one device"Greg Kroah-Hartman1-36/+30
This reverts commit 95dc7565253a8564911190ebd1e4ffceb4de208a. The series is being reverted before -rc1 as there are still reports of lockups on shutdown, so it's not quite ready for "prime time." Reported-by: Andrey Skvortsov <[email protected]> Link: https://lore.kernel.org/r/[email protected] Cc: Christoph Hellwig <[email protected]> Cc: David Jeffery <[email protected]> Cc: Keith Busch <[email protected]> Cc: Laurence Oberman <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Sagi Grimberg <[email protected]> Cc: Stuart Hayes <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2024-09-25Revert "driver core: shut down devices asynchronously"Greg Kroah-Hartman3-59/+1
This reverts commit 8064952c65045f05ee2671fe437770e50c151776. The series is being reverted before -rc1 as there are still reports of lockups on shutdown, so it's not quite ready for "prime time." Reported-by: Andrey Skvortsov <[email protected]> Link: https://lore.kernel.org/r/[email protected] Cc: Christoph Hellwig <[email protected]> Cc: David Jeffery <[email protected]> Cc: Keith Busch <[email protected]> Cc: Laurence Oberman <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Sagi Grimberg <[email protected]> Cc: Stuart Hayes <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>