aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2022-01-15mm/page_alloc: refactor memmap_init_zone_device() page initJoao Martins1-33/+41
Move struct page init to an helper function __init_zone_device_page(). This is in preparation for sharing the storage for compound page metadata. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Joao Martins <[email protected]> Reviewed-by: Dan Williams <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Dave Jiang <[email protected]> Cc: Jane Chu <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: John Hubbard <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Muchun Song <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: Vishal Verma <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-15mm/page_alloc: split prep_compound_page into head and tail subpartsJoao Martins1-10/+20
Patch series "mm, device-dax: Introduce compound pages in devmap", v7. This series converts device-dax to use compound pages, and moves away from the 'struct page per basepage on PMD/PUD' that is done today. Doing so 1) unlocks a few noticeable improvements on unpin_user_pages() and makes device-dax+altmap case 4x times faster in pinning (numbers below and in last patch) 2) as mentioned in various other threads it's one important step towards cleaning up ZONE_DEVICE refcounting. I've split the compound pages on devmap part from the rest based on recent discussions on devmap pending and future work planned[5][6]. There is consensus that device-dax should be using compound pages to represent its PMD/PUDs just like HugeTLB and THP, and that leads to less specialization of the dax parts. I will pursue the rest of the work in parallel once this part is merged, particular the GUP-{slow,fast} improvements [7] and the tail struct page deduplication memory savings part[8]. To summarize what the series does: Patch 1: Prepare hwpoisoning to work with dax compound pages. Patches 2-3: Split the current utility function of prep_compound_page() into head and tail and use those two helpers where appropriate to take advantage of caches being warm after __init_single_page(). This is used when initializing zone device when we bring up device-dax namespaces. Patches 4-10: Add devmap support for compound pages in device-dax. memmap_init_zone_device() initialize its metadata as compound pages, and it introduces a new devmap property known as vmemmap_shift which outlines how the vmemmap is structured (defaults to base pages as done today). The property describe the page order of the metadata essentially. While at it do a few cleanups in device-dax in patches 5-9. Finally enable device-dax usage of devmap @vmemmap_shift to a value based on its own @align property. @vmemmap_shift returns 0 by default (which is today's case of base pages in devmap, like fsdax or the others) and the usage of compound devmap is optional. Starting with device-dax (*not* fsdax) we enable it by default. There are a few pinning improvements particular on the unpinning case and altmap, as well as unpin_user_page_range_dirty_lock() being just as effective as THP/hugetlb[0] pages. $ gup_test -f /dev/dax1.0 -m 16384 -r 10 -S -a -n 512 -w (pin_user_pages_fast 2M pages) put:~71 ms -> put:~22 ms [altmap] (pin_user_pages_fast 2M pages) get:~524ms put:~525 ms -> get: ~127ms put:~71ms $ gup_test -f /dev/dax1.0 -m 129022 -r 10 -S -a -n 512 -w (pin_user_pages_fast 2M pages) put:~513 ms -> put:~188 ms [altmap with -m 127004] (pin_user_pages_fast 2M pages) get:~4.1 secs put:~4.12 secs -> get:~1sec put:~563ms Tested on x86 with 1Tb+ of pmem (alongside registering it with RDMA with and without altmap), alongside gup_test selftests with dynamic dax regions and static dax regions. Coupled with ndctl unit tests for dynamic dax devices that exercise all of this. Note, for dynamic dax regions I had to revert commit 8aa83e6395 ("x86/setup: Call early_reserve_memory() earlier"), it is a known issue that this commit broke efi_fake_mem=. This patch (of 11): Split the utility function prep_compound_page() into head and tail counterparts, and use them accordingly. This is in preparation for sharing the storage for compound page metadata. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Joao Martins <[email protected]> Acked-by: Mike Kravetz <[email protected]> Reviewed-by: Dan Williams <[email protected]> Reviewed-by: Muchun Song <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Dave Jiang <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: John Hubbard <[email protected]> Cc: Jane Chu <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Jason Gunthorpe <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-15mm: defer kmemleak object creation of module_alloc()Kefeng Wang7-12/+27
Yongqiang reports a kmemleak panic when module insmod/rmmod with KASAN enabled(without KASAN_VMALLOC) on x86[1]. When the module area allocates memory, it's kmemleak_object is created successfully, but the KASAN shadow memory of module allocation is not ready, so when kmemleak scan the module's pointer, it will panic due to no shadow memory with KASAN check. module_alloc __vmalloc_node_range kmemleak_vmalloc kmemleak_scan update_checksum kasan_module_alloc kmemleak_ignore Note, there is no problem if KASAN_VMALLOC enabled, the modules area entire shadow memory is preallocated. Thus, the bug only exits on ARCH which supports dynamic allocation of module area per module load, for now, only x86/arm64/s390 are involved. Add a VM_DEFER_KMEMLEAK flags, defer vmalloc'ed object register of kmemleak in module_alloc() to fix this issue. [1] https://lore.kernel.org/all/[email protected]/ [[email protected]: fix build] Link: https://lkml.kernel.org/r/[email protected] [[email protected]: simplify ifdefs, per Andrey] Link: https://lkml.kernel.org/r/CA+fCnZcnwJHUQq34VuRxpdoY6_XbJCDJ-jopksS5Eia4PijPzw@mail.gmail.com Link: https://lkml.kernel.org/r/[email protected] Fixes: 793213a82de4 ("s390/kasan: dynamic shadow mem allocation for modules") Fixes: 39d114ddc682 ("arm64: add KASAN support") Fixes: bebf56a1b176 ("kasan: enable instrumentation of global variables") Signed-off-by: Kefeng Wang <[email protected]> Reported-by: Yongqiang Liu <[email protected]> Cc: Andrey Konovalov <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Kefeng Wang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-15mm: kmemleak: alloc gray object for reserved region with direct mapCalvin Zhang1-1/+5
Reserved regions with direct mapping may contain references to other regions. CMA region with fixed location is reserved without creating kmemleak_object for it. So add them as gray kmemleak objects. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Calvin Zhang <[email protected]> Cc: Rob Herring <[email protected]> Cc: Frank Rowand <[email protected]> Cc: Catalin Marinas <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-15kmemleak: fix kmemleak false positive report with HW tag-based kasan enableKuan-Ying Lee1-7/+14
With HW tag-based kasan enable, We will get the warning when we free object whose address starts with 0xFF. It is because kmemleak rbtree stores tagged object and this freeing object's tag does not match with rbtree object. In the example below, kmemleak rbtree stores the tagged object in the kmalloc(), and kfree() gets the pointer with 0xFF tag. Call sequence: ptr = kmalloc(size, GFP_KERNEL); page = virt_to_page(ptr); offset = offset_in_page(ptr); kfree(page_address(page) + offset); ptr = kmalloc(size, GFP_KERNEL); A sequence like that may cause the warning as following: 1) Freeing unknown object: In kfree(), we will get free unknown object warning in kmemleak_free(). Because object(0xFx) in kmemleak rbtree and pointer(0xFF) in kfree() have different tag. 2) Overlap existing: When we allocate that object with the same hw-tag again, we will find the overlap in the kmemleak rbtree and kmemleak thread will be killed. kmemleak: Freeing unknown object at 0xffff000003f88000 CPU: 5 PID: 177 Comm: cat Not tainted 5.16.0-rc1-dirty #21 Hardware name: linux,dummy-virt (DT) Call trace: dump_backtrace+0x0/0x1ac show_stack+0x1c/0x30 dump_stack_lvl+0x68/0x84 dump_stack+0x1c/0x38 kmemleak_free+0x6c/0x70 slab_free_freelist_hook+0x104/0x200 kmem_cache_free+0xa8/0x3d4 test_version_show+0x270/0x3a0 module_attr_show+0x28/0x40 sysfs_kf_seq_show+0xb0/0x130 kernfs_seq_show+0x30/0x40 seq_read_iter+0x1bc/0x4b0 seq_read_iter+0x1bc/0x4b0 kernfs_fop_read_iter+0x144/0x1c0 generic_file_splice_read+0xd0/0x184 do_splice_to+0x90/0xe0 splice_direct_to_actor+0xb8/0x250 do_splice_direct+0x88/0xd4 do_sendfile+0x2b0/0x344 __arm64_sys_sendfile64+0x164/0x16c invoke_syscall+0x48/0x114 el0_svc_common.constprop.0+0x44/0xec do_el0_svc+0x74/0x90 el0_svc+0x20/0x80 el0t_64_sync_handler+0x1a8/0x1b0 el0t_64_sync+0x1ac/0x1b0 ... kmemleak: Cannot insert 0xf2ff000003f88000 into the object search tree (overlaps existing) CPU: 5 PID: 178 Comm: cat Not tainted 5.16.0-rc1-dirty #21 Hardware name: linux,dummy-virt (DT) Call trace: dump_backtrace+0x0/0x1ac show_stack+0x1c/0x30 dump_stack_lvl+0x68/0x84 dump_stack+0x1c/0x38 create_object.isra.0+0x2d8/0x2fc kmemleak_alloc+0x34/0x40 kmem_cache_alloc+0x23c/0x2f0 test_version_show+0x1fc/0x3a0 module_attr_show+0x28/0x40 sysfs_kf_seq_show+0xb0/0x130 kernfs_seq_show+0x30/0x40 seq_read_iter+0x1bc/0x4b0 kernfs_fop_read_iter+0x144/0x1c0 generic_file_splice_read+0xd0/0x184 do_splice_to+0x90/0xe0 splice_direct_to_actor+0xb8/0x250 do_splice_direct+0x88/0xd4 do_sendfile+0x2b0/0x344 __arm64_sys_sendfile64+0x164/0x16c invoke_syscall+0x48/0x114 el0_svc_common.constprop.0+0x44/0xec do_el0_svc+0x74/0x90 el0_svc+0x20/0x80 el0t_64_sync_handler+0x1a8/0x1b0 el0t_64_sync+0x1ac/0x1b0 kmemleak: Kernel memory leak detector disabled kmemleak: Object 0xf2ff000003f88000 (size 128): kmemleak: comm "cat", pid 177, jiffies 4294921177 kmemleak: min_count = 1 kmemleak: count = 0 kmemleak: flags = 0x1 kmemleak: checksum = 0 kmemleak: backtrace: kmem_cache_alloc+0x23c/0x2f0 test_version_show+0x1fc/0x3a0 module_attr_show+0x28/0x40 sysfs_kf_seq_show+0xb0/0x130 kernfs_seq_show+0x30/0x40 seq_read_iter+0x1bc/0x4b0 kernfs_fop_read_iter+0x144/0x1c0 generic_file_splice_read+0xd0/0x184 do_splice_to+0x90/0xe0 splice_direct_to_actor+0xb8/0x250 do_splice_direct+0x88/0xd4 do_sendfile+0x2b0/0x344 __arm64_sys_sendfile64+0x164/0x16c invoke_syscall+0x48/0x114 el0_svc_common.constprop.0+0x44/0xec do_el0_svc+0x74/0x90 kmemleak: Automatic memory scanning thread ended [[email protected]: whitespace tweak] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Kuan-Ying Lee <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Cc: Doug Berger <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Vlastimil Babka <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-15mm: slab: make slab iterator functions staticMuchun Song3-20/+15
There is no external users of slab_start/next/stop(), so make them static. And the memory.kmem.slabinfo is deprecated, which outputs nothing now, so move memcg_slab_show() into mm/memcontrol.c and rename it to mem_cgroup_slab_show to be consistent with other function names. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Muchun Song <[email protected]> Reviewed-by: Vlastimil Babka <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: David Rientjes <[email protected]> Cc: Joonsoo Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-15mm/slab_common: use WARN() if cache still has objects on destroyMarco Elver1-8/+3
Calling kmem_cache_destroy() while the cache still has objects allocated is a kernel bug, and will usually result in the entire cache being leaked. While the message in kmem_cache_destroy() resembles a warning, it is currently not implemented using a real WARN(). This is problematic for infrastructure testing the kernel, all of which rely on the specific format of WARN()s to pick up on bugs. Some 13 years ago this used to be a simple WARN_ON() in slub, but commit d629d8195793 ("slub: improve kmem_cache_destroy() error message") changed it into an open-coded warning to avoid confusion with a bug in slub itself. Instead, turn the open-coded warning into a real WARN() with the message preserved, so that test systems can actually identify these issues, and we get all the other benefits of using a normal WARN(). The warning message is extended with "when called from <caller-ip>" to make it even clearer where the fault lies. For most configurations this is only a cosmetic change, however, note that WARN() here will now also respect panic_on_warn. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Marco Elver <[email protected]> Reviewed-by: Vlastimil Babka <[email protected]> Acked-by: David Rientjes <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Joonsoo Kim <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-15fs/ioctl: remove unnecessary __user annotationAmit Daniel Kachhap1-1/+1
__user annotations are used by the checker (e.g sparse) to mark user pointers. However here __user is applied to a struct directly, without a pointer being directly involved. Although the presence of __user does not cause sparse to emit a warning, __user should be removed for consistency with other uses of offsetof(). Note: No functional changes intended. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Amit Daniel Kachhap <[email protected]> Cc: Vincenzo Frascino <[email protected]> Cc: Kevin Brodsky <[email protected]> Cc: Al Viro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-15ocfs2: remove redundant assignment to variable free_spaceColin Ian King1-1/+1
The variable 'free_space' is being initialized with a value that is not read, it is being re-assigned later in the two paths of an if statement. The early initialization is redundant and can be removed. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Colin Ian King <[email protected]> Acked-by: Joseph Qi <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Changwei Ge <[email protected]> Cc: Gang He <[email protected]> Cc: Jun Piao <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-15ocfs2: cluster: use default_groups in kobj_typeGreg Kroah-Hartman1-5/+6
There are currently two ways to create a set of sysfs files for a kobj_type, through the default_attrs field, and the default_groups field. Move the ocfs2 cluster sysfs code to use default_groups field which has been the preferred way since aa30f47cf666 ("kobject: Add support for default attribute groups to kobj_type") so that we can soon get rid of the obsolete default_attrs field. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]> Reviewed-by: Joseph Qi <[email protected]> Tested-by: Joseph Qi <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Changwei Ge <[email protected]> Cc: Gang He <[email protected]> Cc: Jun Piao <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-15ocfs2: remove redundant assignment to pointer root_bhColin Ian King1-1/+1
The variable 'root_bh' is being initialized with a value that is not read, it is being re-assigned later on closer to its use. The early initialization is redundant and can be removed. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Colin Ian King <[email protected]> Acked-by: Joseph Qi <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Changwei Ge <[email protected]> Cc: Gang He <[email protected]> Cc: Jun Piao <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-15ocfs2: use default_groups in kobj_typeGreg Kroah-Hartman1-1/+2
There are currently two ways to create a set of sysfs files for a kobj_type, through the default_attrs field, and the default_groups field. Move the ocfs2 code to use default_groups field which has been the preferred way since aa30f47cf666 ("kobject: Add support for default attribute groups to kobj_type") so that we can soon get rid of the obsolete default_attrs field. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]> Acked-by: Joseph Qi <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Changwei Ge <[email protected]> Cc: Gang He <[email protected]> Cc: Jun Piao <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-15ocfs2: clearly handle ocfs2_grab_pages_for_write() return valueJoseph Qi1-13/+13
ocfs2_grab_pages_for_write() may return -EAGAIN if write context type is mmap and it could not lock the target page. In this case, we exit with no error and no target page. And then trigger the caller page_mkwrite() to retry. Since there are other caller types, e.g. buffer and direct io, make the return value handling more clear. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Joseph Qi <[email protected]> Reported-by: Dan Carpenter <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Changwei Ge <[email protected]> Cc: Gang He <[email protected]> Cc: Jun Piao <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-15ocfs2: use BUG_ON instead of if condition followed by BUG.Zhang Mingyu1-4/+2
This issue was detected with the help of Coccinelle. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Zhang Mingyu <[email protected]> Reported-by: Zeal Robot <[email protected]> Acked-by: Joseph Qi <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Changwei Ge <[email protected]> Cc: Gang He <[email protected]> Cc: Jun Piao <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-15squashfs: provide backing_dev_info in order to disable read-aheadZheng Liang1-0/+33
Commit c1f6925e1091 ("mm: put readahead pages in cache earlier") causes the read performance of squashfs to deteriorate.Through testing, we find that the performance will be back by closing the readahead of squashfs. So we want to learn the way of ubifs, provides backing_dev_info and disable read-ahead We tested the following data by fio. squashfs image blocksize=128K test command: fio --name basic --bs=? --filename="/mnt/test_file" --rw=? --iodepth=1 --ioengine=psync --runtime=200 --time_based turn on squashfs readahead in 5.10 kernel bs(k) read/randread MB/s 4 randread 271 128 randread 231 1024 randread 246 4 read 310 128 read 245 1024 read 247 turn off squashfs readahead in 5.10 kernel bs(k) read/randread MB/s 4 randread 293 128 randread 330 1024 randread 363 4 read 338 128 read 360 1024 read 365 turn on squashfs readahead and revert the commit c1f6925e1091("mm: put readahead pages in cache earlier") in 5.10 kernel bs(k) read/randread MB/s 4 randread 289 128 randread 306 1024 randread 335 4 read 337 128 read 336 1024 read 338 Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Zheng Liang <[email protected]> Reviewed-by: Phillip Lougher <[email protected]> Cc: Zhang Yi <[email protected]> Cc: Hou Tao <[email protected]> Cc: Miao Xie <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-15fs/ntfs/attrib.c: fix one kernel-doc commentYang Li1-1/+1
The comments for the file should not be in kernel-doc format: /** * attrib.c - NTFS attribute operations. Part of the Linux-NTFS as it causes it to be incorrectly identified for function ntfs_map_runlist_nolock(), causing some warnings found by running scripts/kernel-doc.: fs/ntfs/attrib.c:25: warning: Incorrect use of kernel-doc format: * ntfs_map_runlist_nolock - map (a part of) a runlist of an ntfs inode fs/ntfs/attrib.c:71: warning: Function parameter or member 'ni' not described in 'ntfs_map_runlist_nolock' fs/ntfs/attrib.c:71: warning: Function parameter or member 'vcn' not described in 'ntfs_map_runlist_nolock' fs/ntfs/attrib.c:71: warning: Function parameter or member 'ctx' not described in 'ntfs_map_runlist_nolock' fs/ntfs/attrib.c:71: warning: expecting prototype for attrib.c - NTFS attribute operations. Part of the Linux(). Prototype was for ntfs_map_runlist_nolock() instead Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Yang Li <[email protected]> Reported-by: Abaci Robot <[email protected]> Acked-by: Randy Dunlap <[email protected]> Cc: Anton Altaparmakov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-15scripts/spelling.txt: add "oveflow"Drew Fustini1-0/+1
Add typo "oveflow" for "overflow". This typo was found and fixed in tools/testing/selftests/bpf/prog_tests/btf_dump.c Link: https://lore.kernel.org/all/[email protected]/ Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Drew Fustini <[email protected]> Suggested-by: Gustavo A. R. Silva <[email protected]> Cc: Colin Ian King <[email protected]> Cc: Drew Fustini <[email protected]> Cc: zuoqilin <[email protected]> Cc: Tom Saeger <[email protected]> Cc: Sven Eckelmann <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-15ia64: topology: use default_groups in kobj_typeGreg Kroah-Hartman1-1/+2
There are currently two ways to create a set of sysfs files for a kobj_type, through the default_attrs field, and the default_groups field. Move the ia64 topology sysfs code to use default_groups field which has been the preferred way since aa30f47cf666 ("kobject: Add support for default attribute groups to kobj_type") so that we can soon get rid of the obsolete default_attrs field. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: David Hildenbrand <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-15ia64: fix typo in a commentJason Wang1-1/+1
The double `the' in a comment is repeated, thus it should be removed. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Jason Wang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-15arch/ia64/kernel/setup.c: use swap() to make code cleanerYang Guang1-4/+1
Use the macro 'swap()' defined in 'include/linux/minmax.h' to avoid opencoding it. Link: https://lkml.kernel.org/r/[email protected] Reported-by: Zeal Robot <[email protected]> Signed-off-by: Yang Guang <[email protected]> Cc: David Yang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-15ia64: module: use swap() to make code cleanerYang Guang1-4/+2
Use the macro 'swap()' defined in 'include/linux/minmax.h' to avoid opencoding it. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Yang Guang <[email protected]> Reported-by: Zeal Robot <[email protected]> Cc: David Yang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-15trace/hwlat: make use of the helper function kthread_run_on_cpu()Cai Huoqing1-5/+1
Replace kthread_create_on_cpu/wake_up_process() with kthread_run_on_cpu() to simplify the code. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Cai Huoqing <[email protected]> Cc: Bernard Metzler <[email protected]> Cc: Daniel Bristot de Oliveira <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Doug Ledford <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Joel Fernandes (Google) <[email protected]> Cc: Josh Triplett <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: "Paul E . McKenney" <[email protected]> Cc: Steven Rostedt <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-15trace/osnoise: make use of the helper function kthread_run_on_cpu()Cai Huoqing1-2/+1
Replace kthread_create_on_cpu/wake_up_process() with kthread_run_on_cpu() to simplify the code. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Cai Huoqing <[email protected]> Cc: Bernard Metzler <[email protected]> Cc: Daniel Bristot de Oliveira <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Doug Ledford <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Joel Fernandes (Google) <[email protected]> Cc: Josh Triplett <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: "Paul E . McKenney" <[email protected]> Cc: Steven Rostedt <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-15rcutorture: make use of the helper function kthread_run_on_cpu()Cai Huoqing1-5/+2
Replace kthread_create_on_node/kthread_bind/wake_up_process() with kthread_run_on_cpu() to simplify the code. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Cai Huoqing <[email protected]> Cc: Bernard Metzler <[email protected]> Cc: Daniel Bristot de Oliveira <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Doug Ledford <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Joel Fernandes (Google) <[email protected]> Cc: Josh Triplett <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: "Paul E . McKenney" <[email protected]> Cc: Steven Rostedt <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-15ring-buffer: make use of the helper function kthread_run_on_cpu()Cai Huoqing1-5/+2
Replace kthread_create/kthread_bind/wake_up_process() with kthread_run_on_cpu() to simplify the code. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Cai Huoqing <[email protected]> Cc: Bernard Metzler <[email protected]> Cc: Daniel Bristot de Oliveira <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Doug Ledford <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Joel Fernandes (Google) <[email protected]> Cc: Josh Triplett <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: "Paul E . McKenney" <[email protected]> Cc: Steven Rostedt <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-15RDMA/siw: make use of the helper function kthread_run_on_cpu()Cai Huoqing1-4/+3
Replace kthread_create/kthread_bind/wake_up_process() with kthread_run_on_cpu() to simplify the code. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Cai Huoqing <[email protected]> Cc: Bernard Metzler <[email protected]> Cc: Daniel Bristot de Oliveira <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Doug Ledford <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Joel Fernandes (Google) <[email protected]> Cc: Josh Triplett <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: "Paul E . McKenney" <[email protected]> Cc: Steven Rostedt <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-15kthread: add the helper function kthread_run_on_cpu()Cai Huoqing2-0/+26
Add a new helper function kthread_run_on_cpu(), which includes kthread_create_on_cpu/wake_up_process(). In some cases, use kthread_run_on_cpu() directly instead of kthread_create_on_node/kthread_bind/wake_up_process() or kthread_create_on_cpu/wake_up_process() or kthreadd_create/kthread_bind/wake_up_process() to simplify the code. [[email protected]: export kthread_create_on_cpu to modules] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Cai Huoqing <[email protected]> Cc: Bernard Metzler <[email protected]> Cc: Cai Huoqing <[email protected]> Cc: Daniel Bristot de Oliveira <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Doug Ledford <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Joel Fernandes (Google) <[email protected]> Cc: Josh Triplett <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: "Paul E . McKenney" <[email protected]> Cc: Steven Rostedt <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-15drop fen.cocciJulia Lawall1-124/+0
This semantic patch does not take into account the fact that of_node_put can be safely applied to NULL. Thus it gives only false positives. Drop it. Reported-by: Qing Wang <[email protected]> Signed-off-by: Julia Lawall <[email protected]>
2022-01-15scripts/coccinelle: drop bugon.cocciJulia Lawall1-63/+0
The BUG_ON script was never safe, in that it was not able to check whether the condition was side-effecting. At this point, BUG_ON should be well known, so it has probably outlived its usefuless. Signed-off-by: Julia Lawall <[email protected]> Suggested-by: Matthew Wilcox <[email protected]>
2022-01-15MAINTAINERS: remove Gilles MullerJulia Lawall1-1/+0
Gilles Muller passed away on November 17, 2021. We would like to thank him for his continued support for the development of Coccinelle. Signed-off-by: Julia Lawall <[email protected]>
2022-01-15Merge tag 'xfs-5.17-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds3-46/+72
Pull xfs fixes from Darrick Wong: "These are the last few obvious fixes that I found while stress testing online fsck for XFS prior to initiating a design review of the whole giant machinery. - Fix a minor locking inconsistency in readdir - Fix incorrect fs feature bit validation for secondary superblocks" * tag 'xfs-5.17-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: fix online fsck handling of v5 feature bits on secondary supers xfs: take the ILOCK when readdir inspects directory mapping data
2022-01-14af_unix: annote lockless accesses to unix_tot_inflight & gc_in_progressEric Dumazet2-5/+15
wait_for_unix_gc() reads unix_tot_inflight & gc_in_progress without synchronization. Adds READ_ONCE()/WRITE_ONCE() and their associated comments to better document the intent. BUG: KCSAN: data-race in unix_inflight / wait_for_unix_gc write to 0xffffffff86e2b7c0 of 4 bytes by task 9380 on cpu 0: unix_inflight+0x1e8/0x260 net/unix/scm.c:63 unix_attach_fds+0x10c/0x1e0 net/unix/scm.c:121 unix_scm_to_skb net/unix/af_unix.c:1674 [inline] unix_dgram_sendmsg+0x679/0x16b0 net/unix/af_unix.c:1817 unix_seqpacket_sendmsg+0xcc/0x110 net/unix/af_unix.c:2258 sock_sendmsg_nosec net/socket.c:704 [inline] sock_sendmsg net/socket.c:724 [inline] ____sys_sendmsg+0x39a/0x510 net/socket.c:2409 ___sys_sendmsg net/socket.c:2463 [inline] __sys_sendmmsg+0x267/0x4c0 net/socket.c:2549 __do_sys_sendmmsg net/socket.c:2578 [inline] __se_sys_sendmmsg net/socket.c:2575 [inline] __x64_sys_sendmmsg+0x53/0x60 net/socket.c:2575 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae read to 0xffffffff86e2b7c0 of 4 bytes by task 9375 on cpu 1: wait_for_unix_gc+0x24/0x160 net/unix/garbage.c:196 unix_dgram_sendmsg+0x8e/0x16b0 net/unix/af_unix.c:1772 unix_seqpacket_sendmsg+0xcc/0x110 net/unix/af_unix.c:2258 sock_sendmsg_nosec net/socket.c:704 [inline] sock_sendmsg net/socket.c:724 [inline] ____sys_sendmsg+0x39a/0x510 net/socket.c:2409 ___sys_sendmsg net/socket.c:2463 [inline] __sys_sendmmsg+0x267/0x4c0 net/socket.c:2549 __do_sys_sendmmsg net/socket.c:2578 [inline] __se_sys_sendmmsg net/socket.c:2575 [inline] __x64_sys_sendmmsg+0x53/0x60 net/socket.c:2575 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae value changed: 0x00000002 -> 0x00000004 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 9375 Comm: syz-executor.1 Not tainted 5.16.0-rc7-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Fixes: 9915672d4127 ("af_unix: limit unix_tot_inflight") Signed-off-by: Eric Dumazet <[email protected]> Reported-by: syzbot <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-01-14vdpa/mlx5: Fix tracking of current number of VQsEli Cohen1-4/+8
Modify the code such that ndev->cur_num_vqs better reflects the actual number of data virtqueues. The value can be accurately realized after features have been negotiated. This is to prevent possible failures when modifying the RQT object if the cur_num_vqs bears invalid value. No issue was actually encountered but this also makes the code more readable. Fixes: c5a5cd3d3217 ("vdpa/mlx5: Support configuring max data virtqueue") Signed-off-by: Eli Cohen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]> Reviewed-by: Si-Wei Liu<[email protected]> Acked-by: Jason Wang <[email protected]>
2022-01-14vdpa/mlx5: Fix is_index_valid() to refer to featuresEli Cohen1-3/+7
Make sure the decision whether an index received through a callback is valid or not consults the negotiated features. The motivation for this was due to a case encountered where I shut down the VM. After the reset operation was called features were already clear, I got get_vq_state() call which caused out array bounds access since is_index_valid() reported the index value. So this is more of not hit a bug since the call shouldn't have been made first place. Signed-off-by: Eli Cohen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]> Reviewed-by: Si-Wei Liu<[email protected]> Acked-by: Jason Wang <[email protected]>
2022-01-14vdpa: Protect vdpa reset with cf_mutexEli Cohen1-1/+1
Call reset using the wrapper function vdpa_reset() to make sure the operation is serialized with cf_mutex. This comes to protect from the following possible scenario: vhost_vdpa_set_status() could call the reset op. Since the call is not protected by cf_mutex, a netlink thread calling vdpa_dev_config_fill could get passed the VIRTIO_CONFIG_S_FEATURES_OK check in vdpa_dev_config_fill() and end up reporting wrong features. Fixes: 5f6e85953d8f ("vdpa: Read device configuration only if FEATURES_OK") Signed-off-by: Eli Cohen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]> Reviewed-by: Si-Wei Liu<[email protected]> Acked-by: Jason Wang <[email protected]>
2022-01-14vdpa: Avoid taking cf_mutex lock on get statusEli Cohen3-14/+3
Avoid the wrapper holding cf_mutex since it is not protecting anything. To avoid confusion and unnecessary overhead incurred by it, remove. Fixes: f489f27bc0ab ("vdpa: Sync calls set/get config/status with cf_mutex") Signed-off-by: Eli Cohen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]> Reviewed-by: Si-Wei Liu<[email protected]> Acked-by: Jason Wang <[email protected]>
2022-01-14vdpa/vdpa_sim_net: Report max device capabilitiesEli Cohen1-0/+1
Configure max supported virtqueues features on the management device. This info can be retrieved using: $ vdpa mgmtdev show vdpasim_net: supported_classes net max_supported_vqs 2 dev_features MAC ANY_LAYOUT VERSION_1 ACCESS_PLATFORM Signed-off-by: Eli Cohen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]> Acked-by: Jason Wang <[email protected]>
2022-01-14vdpa: Use BIT_ULL for bit operationsEli Cohen1-6/+6
All masks in this file are 64 bits. Change BIT to BIT_ULL. Other occurences use (1 << val) which yields a 32 bit value. Change them to use BIT_ULL too. Reviewed-by: Si-Wei Liu <[email protected]> Signed-off-by: Eli Cohen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]> Acked-by: Jason Wang <[email protected]>
2022-01-14vdpa/vdpa_sim: Configure max supported virtqueuesEli Cohen1-0/+1
Configure max supported virtqueues on the management device. Signed-off-by: Eli Cohen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]> Acked-by: Jason Wang <[email protected]>
2022-01-14vdpa/mlx5: Report max device capabilitiesEli Cohen1-12/+23
Configure max supported virtqueues and features on the management device. This info can be retrieved using: $ vdpa mgmtdev show auxiliary/mlx5_core.sf.1: supported_classes net max_supported_vqs 257 dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS CTRL_VQ MQ \ CTRL_MAC_ADDR VERSION_1 ACCESS_PLATFORM Signed-off-by: Eli Cohen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]> Reviewed-by: Si-Wei Liu<[email protected]>
2022-01-14vdpa: Support reporting max device capabilitiesEli Cohen3-0/+14
Add max_supported_vqs and supported_features fields to struct vdpa_mgmt_dev. Upstream drivers need to feel these values according to the device capabilities. These values are reported back in a netlink message when showing management devices. Examples: $ auxiliary/mlx5_core.sf.1: supported_classes net max_supported_vqs 257 dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS CTRL_VQ MQ \ CTRL_MAC_ADDR VERSION_1 ACCESS_PLATFORM $ vdpa -j mgmtdev show {"mgmtdev":{"auxiliary/mlx5_core.sf.1":{"supported_classes":["net"], \ "max_supported_vqs":257,"dev_features":["CSUM","GUEST_CSUM","MTU", \ "HOST_TSO4","HOST_TSO6","STATUS","CTRL_VQ","MQ","CTRL_MAC_ADDR", \ "VERSION_1","ACCESS_PLATFORM"]}}} $ vdpa -jp mgmtdev show { "mgmtdev": { "auxiliary/mlx5_core.sf.1": { "supported_classes": [ "net" ], "max_supported_vqs": 257, "dev_features": ["CSUM","GUEST_CSUM","MTU","HOST_TSO4", \ "HOST_TSO6","STATUS","CTRL_VQ","MQ", \ "CTRL_MAC_ADDR","VERSION_1","ACCESS_PLATFORM"] } } } Signed-off-by: Eli Cohen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]> Reviewed-by: Si-Wei Liu<[email protected]>
2022-01-14vdpa/mlx5: Restore cur_num_vqs in case of failure in change_num_qps()Eli Cohen1-1/+3
Restore ndev->cur_num_vqs to the original value in case change_num_qps() fails. Fixes: 52893733f2c5 ("vdpa/mlx5: Add multiqueue support") Reviewed-by: Si-Wei Liu<[email protected]> Acked-by: Jason Wang <[email protected]> Signed-off-by: Eli Cohen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-01-14vdpa: Add support for returning device configuration informationEli Cohen2-0/+7
Add netlink attribute to store the negotiated features. This can be used by userspace to get the current state of the vdpa instance. Examples: $ vdpa dev config show vdpa-a vdpa-a: mac 00:00:00:00:88:88 link up link_announce false max_vq_pairs 16 mtu 1500 negotiated_features CSUM GUEST_CSUM MTU MAC HOST_TSO4 HOST_TSO6 STATUS \ CTRL_VQ MQ CTRL_MAC_ADDR VERSION_1 ACCESS_PLATFORM $ vdpa -j dev config show vdpa-a {"config":{"vdpa-a":{"mac":"00:00:00:00:88:88","link ":"up","link_announce":false, \ "max_vq_pairs":16,"mtu":1500,"negotiated_features":["CSUM","GUEST_CSUM","MTU","MAC", \ "HOST_TSO4","HOST_TSO6","STATUS","CTRL_VQ","MQ","CTRL_MAC_ADDR","VERSION_1", \ "ACCESS_PLATFORM"]}}} $ vdpa -jp dev config show vdpa-a { "config": { "vdpa-a": { "mac": "00:00:00:00:88:88", "link ": "up", "link_announce ": false, "max_vq_pairs": 16, "mtu": 1500, "negotiated_features": [ "CSUM","GUEST_CSUM","MTU","MAC","HOST_TSO4","HOST_TSO6","STATUS","CTRL_VQ","MQ", \ "CTRL_MAC_ADDR","VERSION_1","ACCESS_PLATFORM" ] } } } Signed-off-by: Eli Cohen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]> Acked-by: Jason Wang <[email protected]>
2022-01-14vdpa/mlx5: Support configuring max data virtqueueEli Cohen1-14/+40
Check whether the max number of data virtqueue pairs was provided when a adding a new device and verify the new value does not exceed device capabilities. In addition, change the arrays holding virtqueue and callback contexts to be dynamically allocated. Signed-off-by: Eli Cohen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Includes fixup: vdpa/mlx5: fix error handling in mlx5_vdpa_dev_add() Clang build fails with mlx5_vnet.c:2574:6: error: variable 'mvdev' is used uninitialized whenever 'if' condition is true if (!ndev->vqs || !ndev->event_cbs) { ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ mlx5_vnet.c:2660:14: note: uninitialized use occurs here put_device(&mvdev->vdev.dev); ^~~~~ This because mvdev is set after trying to allocate ndev->vqs,event_cbs. So move the allocation to after mvdev is set but before the arrays are used in init_mvqs() Signed-off-by: Tom Rix <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]> Includes fixup: vdpa/mlx5: fix endian-ness for max vqs sparse warnings: (new ones prefixed by >>) >> drivers/vdpa/mlx5/net/mlx5_vnet.c:1247:23: sparse: sparse: cast to restricted __le16 >> drivers/vdpa/mlx5/net/mlx5_vnet.c:1247:23: sparse: sparse: cast from restricted __virtio16 > 1247 num = le16_to_cpu(ndev->config.max_virtqueue_pairs); Address this using the appropriate wrapper. Cc: "Eli Cohen" <[email protected]> Reported-by: kernel test robot <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]> Acked-by: Jason Wang <[email protected]> Reviewed-by: Eli Cohen <[email protected]>
2022-01-14vdpa/mlx5: Fix config_attr_mask assignmentEli Cohen1-1/+1
Fix VDPA_ATTR_DEV_NET_CFG_MACADDR assignment to be explicit 64 bit assignment. No issue was seen since the value is well below 64 bit max value. Nevertheless it needs to be fixed. Fixes: a007d940040c ("vdpa/mlx5: Support configuration of MAC") Reviewed-by: Si-Wei Liu <[email protected]> Acked-by: Jason Wang <[email protected]> Signed-off-by: Eli Cohen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-01-14vdpa: Allow to configure max data virtqueuesEli Cohen4-7/+31
Add netlink support to configure the max virtqueue pairs for a device. At least one pair is required. The maximum is dictated by the device. Example: $ vdpa dev add name vdpa-a mgmtdev auxiliary/mlx5_core.sf.1 max_vqp 4 Signed-off-by: Eli Cohen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-01-14vdpa: Read device configuration only if FEATURES_OKEli Cohen1-12/+33
Avoid reading device configuration during feature negotiation. Read device status and verify that VIRTIO_CONFIG_S_FEATURES_OK is set. Protect the entire operation, including configuration read with cf_mutex to ensure integrity of the results. Signed-off-by: Eli Cohen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]> Acked-by: Jason Wang <[email protected]>
2022-01-14vdpa: Sync calls set/get config/status with cf_mutexEli Cohen4-6/+26
Add wrappers to get/set status and protect these operations with cf_mutex to serialize these operations with respect to get/set config operations. Signed-off-by: Eli Cohen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-01-14vdpa/mlx5: Distribute RX virtqueues in RQT objectEli Cohen1-23/+7
Distribute the available rx virtqueues amongst the available RQT entries. RQTs require to have a power of two entries. When creating or modifying the RQT, use the lowest number of power of two entries that is not less than the number of rx virtqueues. Distribute them in the available entries such that some virtqueus may be referenced twice. This allows to configure any number of virtqueue pairs when multiqueue is used. Reviewed-by: Si-Wei Liu <[email protected]> Acked-by: Jason Wang <[email protected]> Signed-off-by: Eli Cohen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]>
2022-01-14vdpa: Provide interface to read driver featuresEli Cohen10-34/+87
Provide an interface to read the negotiated features. This is needed when building the netlink message in vdpa_dev_net_config_fill(). Also fix the implementation of vdpa_dev_net_config_fill() to use the negotiated features instead of the device features. To make APIs clearer, make the following name changes to struct vdpa_config_ops so they better describe their operations: get_features -> get_device_features set_features -> set_driver_features Finally, add get_driver_features to return the negotiated features and add implementation to all the upstream drivers. Acked-by: Jason Wang <[email protected]> Signed-off-by: Eli Cohen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]>