aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2022-08-02Merge tag 'v5.20-p1' of ↵Linus Torvalds114-1203/+9140
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "API: - Make proc files report fips module name and version Algorithms: - Move generic SHA1 code into lib/crypto - Implement Chinese Remainder Theorem for RSA - Remove blake2s - Add XCTR with x86/arm64 acceleration - Add POLYVAL with x86/arm64 acceleration - Add HCTR2 - Add ARIA Drivers: - Add support for new CCP/PSP device ID in ccp" * tag 'v5.20-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (89 commits) crypto: tcrypt - Remove the static variable initialisations to NULL crypto: arm64/poly1305 - fix a read out-of-bound crypto: hisilicon/zip - Use the bitmap API to allocate bitmaps crypto: hisilicon/sec - fix auth key size error crypto: ccree - Remove a useless dma_supported() call crypto: ccp - Add support for new CCP/PSP device ID crypto: inside-secure - Add missing MODULE_DEVICE_TABLE for of crypto: hisilicon/hpre - don't use GFP_KERNEL to alloc mem during softirq crypto: testmgr - some more fixes to RSA test vectors cyrpto: powerpc/aes - delete the rebundant word "block" in comments hwrng: via - Fix comment typo crypto: twofish - Fix comment typo crypto: rmd160 - fix Kconfig "its" grammar crypto: keembay-ocs-ecc - Drop if with an always false condition Documentation: qat: rewrite description Documentation: qat: Use code block for qat sysfs example crypto: lib - add module license to libsha1 crypto: lib - make the sha1 library optional crypto: lib - move lib/sha1.c into lib/crypto/ crypto: fips - make proc files report fips module name and version ...
2022-08-02Merge tag 'random-6.0-rc1-for-linus' of ↵Linus Torvalds34-300/+204
git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull random number generator updates from Jason Donenfeld: "Though there's been a decent amount of RNG-related development during this last cycle, not all of it is coming through this tree, as this cycle saw a shift toward tackling early boot time seeding issues, which took place in other trees as well. Here's a summary of the various patches: - The CONFIG_ARCH_RANDOM .config option and the "nordrand" boot option have been removed, as they overlapped with the more widely supported and more sensible options, CONFIG_RANDOM_TRUST_CPU and "random.trust_cpu". This change allowed simplifying a bit of arch code. - x86's RDRAND boot time test has been made a bit more robust, with RDRAND disabled if it's clearly producing bogus results. This would be a tip.git commit, technically, but I took it through random.git to avoid a large merge conflict. - The RNG has long since mixed in a timestamp very early in boot, on the premise that a computer that does the same things, but does so starting at different points in wall time, could be made to still produce a different RNG state. Unfortunately, the clock isn't set early in boot on all systems, so now we mix in that timestamp when the time is actually set. - User Mode Linux now uses the host OS's getrandom() syscall to generate a bootloader RNG seed and later on treats getrandom() as the platform's RDRAND-like faculty. - The arch_get_random_{seed_,}_long() family of functions is now arch_get_random_{seed_,}_longs(), which enables certain platforms, such as s390, to exploit considerable performance advantages from requesting multiple CPU random numbers at once, while at the same time compiling down to the same code as before on platforms like x86. - A small cleanup changing a cmpxchg() into a try_cmpxchg(), from Uros. - A comment spelling fix" More info about other random number changes that come in through various architecture trees in the full commentary in the pull request: https://lore.kernel.org/all/[email protected]/ * tag 'random-6.0-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: random: correct spelling of "overwrites" random: handle archrandom with multiple longs um: seed rng using host OS rng random: use try_cmpxchg in _credit_init_bits timekeeping: contribute wall clock to rng on time change x86/rdrand: Remove "nordrand" flag in favor of "random.trust_cpu" random: remove CONFIG_ARCH_RANDOM
2022-08-02block: change the blk_queue_bounce calling conventionChristoph Hellwig3-18/+20
The double indirect bio leads to somewhat suboptimal code generation. Instead return the (original or split) bio, and make sure the request_queue arguments to the lower level helpers is passed after the bio to avoid constant reshuffling of the argument passing registers. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Damien Le Moal <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-08-02block: change the blk_queue_split calling conventionChristoph Hellwig11-65/+63
The double indirect bio leads to somewhat suboptimal code generation. Instead return the (original or split) bio, and make sure the request_queue arguments to the lower level helpers is passed after the bio to avoid constant reshuffling of the argument passing registers. Also give it and the helpers used to implement it more descriptive names. Signed-off-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvme: update MAINTAINERS for the new auth codeChristoph Hellwig1-1/+2
Add the common subdirectory and match all nvme* headers in include/linux/. Reported-by: Dan Carpenter <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvmet-tcp: fix lockdep complaint on nvmet_tcp_wq flush during queue teardownSagi Grimberg1-1/+2
We probably need nvmet_tcp_wq to have MEM_RECLAIM as we are sending/receiving for the socket from works on this workqueue. Also this eliminates lockdep complaints: -- [ 6174.010200] workqueue: WQ_MEM_RECLAIM nvmet-wq:nvmet_tcp_release_queue_work [nvmet_tcp] is flushing !WQ_MEM_RECLAIM nvmet_tcp_wq:nvmet_tcp_io_work [nvmet_tcp] [ 6174.010216] WARNING: CPU: 20 PID: 14456 at kernel/workqueue.c:2628 check_flush_dependency+0x110/0x14c Reported-by: Yi Zhang <[email protected]> Signed-off-by: Sagi Grimberg <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvme: enable generic interface (/dev/ngXnY) for unknown command setsJoel Granados1-7/+34
Extend nvme_alloc_ns() and nvme_validate_ns() for unknown command-set as well. Both are made to use a new helper (nvme_update_ns_info_cs_indep) which is similar to nvme_update_ns_info but performs fewer operations to get the generic interface up. Suggested-by: Christoph Hellwig <[email protected]> Signed-off-by: Joel Granados <[email protected]> Signed-off-by: Kanchan Joshi <[email protected]> [hch: rebased on other refactoring patches] Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Javier González <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvme: factor out a nvme_ns_is_readonly helperChristoph Hellwig1-5/+10
Add a little helper to check if a namespace should be marked read-only that uses a new is_readonly flag in the nvme_ns_info structure. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Javier González <[email protected]> Reviewed-by: Joel Granados <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Reviewed-by: Kanchan Joshi <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvme: refactor namespace probingChristoph Hellwig1-105/+125
Change nvme_ns_scan to gather all information needed for generic namespace setup into a nvme_ns_info structure. This structure is filled from the Command Set Idependent Identify Namespace data structure if it is available or else the legacy Identify namespace structure. With that everything related to the NVM command set (and the ZNS command set derived from it) can be encapsulated in the nvme_update_ns_info_block function while keeping the rest of the namespace probing generic. The downside is that we now always issue two Identify Namespace calls for each probed namespace instead of usually just a single one previously. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Javier González <[email protected]> Reviewed-by: Joel Granados <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvme: generalize the nvme_multi_css check in nvme_scan_nsChristoph Hellwig1-6/+6
Check for multiple command set support early on an error out if is not supported when a !NVM command set namespace is found. This prepares for adding command set independent passthrough support. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Javier González <[email protected]> Reviewed-by: Joel Granados <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Reviewed-by: Kanchan Joshi <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvme: rename nvme_validate_or_alloc_ns to nvme_scan_nsChristoph Hellwig1-3/+3
This shorter name much better fits what this function does in the scanning process. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Javier González <[email protected]> Reviewed-by: Joel Granados <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Reviewed-by: Kanchan Joshi <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvme: catch -ENODEV from nvme_revalidate_zones againChristoph Hellwig1-6/+7
nvme_revalidate_zones can also return -ENODEV if e.g. zone sizes aren't constant or not a power of two. In that case we should jump to marking the gendisk hidden and only support pass through. Fixes: 602e57c9799c ("nvme: also mark passthrough-only namespaces ready in nvme_update_ns_info") Reported-by: Joel Granados <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Joel Granados <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvmet-auth: select the intended CRYPTO_DH_RFC7919_GROUPSLukas Bulwahn1-1/+1
Commit 71ebe3842ebe ("nvmet-auth: Diffie-Hellman key exchange support") intends to select 'Support for RFC 7919 FFDHE group parameters' for using FFDHE groups for NVMe In-Band Authentication. It however selects CRYPTO_DH_GROUPS_RFC7919, instead of the intended CRYPTO_DH_RFC7919_GROUPS; notice the swapping of words here. Correct the select to the intended config option. Signed-off-by: Lukas Bulwahn <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvmet-auth: fix return value check in auth receiveChaitanya Kulkarni1-2/+1
nvmet_auth_challenge() return type is int and currently it uses status variable that is of type u16 in nvmet_execute_auth_receive(). Catch the return value of nvmet_auth_challenge() into int and set the NVME_SC_INTERNAL as status variable before we jump to error. Reported-by: Dan Carpenter <[email protected]> Signed-off-by: Chaitanya Kulkarni <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvmet-auth: fix return value check in auth sendChaitanya Kulkarni1-2/+2
nvmet_setup_auth() return type is int and currently it uses status variable that is of type u16 in nvmet_execute_auth_send(). Catch the return value of nvmet_setup_auth() into int and set the NVME_SC_INTERNAL as status variable before we jump to error. Reported-by: Dan Carpenter <[email protected]> Signed-off-by: Chaitanya Kulkarni <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvmet-auth: fix a couple of spelling mistakesColin Ian King2-2/+2
There are a couple of spelling mistakes in pr_warn and pr_debug messages. Fix them. Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvmet: fix a format specifier in nvmet_auth_ctrl_exponentialChristoph Hellwig1-1/+1
dh_keysize is a size_t, use the proper format specifier for printing it. Reported-by: Guenter Roeck <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Reviewed-by: Guenter Roeck <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvmet: don't check for NULL pointer before kfree in nvmet_host_releaseChristoph Hellwig1-2/+2
And add an empty line after the variable declaration. Reported-by: kernel test robot <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvme-apple: stop casting function pointer signaturesChristoph Hellwig1-6/+15
Casting function pointers breaks control flow enforcement and is generally a horrible coding style. Add two wrappers to get rid of these casts. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Reviewed-by: Sven Peter <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvme-tcp: split nvme_tcp_alloc_tagsetChristoph Hellwig1-41/+41
Split nvme_tcp_alloc_tagset into one helper for the admin tag_set and one for the I/O tag set. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Keith Busch <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvme-rdma: split nvme_rdma_alloc_tagsetChristoph Hellwig1-46/+46
Split nvme_rdma_alloc_tagset into one helper for the admin tag_set and one for the I/O tag set. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Keith Busch <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvme-pci: split nvme_dev_addChristoph Hellwig1-35/+37
Split nvme_dev_add into a helper to actually allocate the tag set, and one that just update the number of queues. Add a local variable for the tag_set to clean up the code a bit. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Keith Busch <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvme-pci: split nvme_alloc_admin_tagsChristoph Hellwig1-29/+31
Split nvme_alloc_admin_tags into a helper to actually allocate the tag set, and one that just restarts the admin queue. Add a local variable for the tag_set to clean up the code a bit. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Keith Busch <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvme-pci: print the command name of aborted commandsChristoph Hellwig2-2/+5
To allow for slightly better debugging, print the command name when aborting an command. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Keith Busch <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvme-pci: remove useless assignment in nvme_pci_setup_prpsLiu Song1-1/+0
If prp_list is NULL, nvme_unmap_sg will be performed, and the assignment to first_dma is meaningless, so remove it. Signed-off-by: Liu Song <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvme-auth: uninitialized variable in nvme_auth_transform_key()Dan Carpenter1-9/+16
A couple of the early error gotos call kfree_sensitive(transformed_key); before "transformed_key" has been initialized. Fixes: db1312dd9548 ("nvmet: implement basic In-Band Authentication") Signed-off-by: Dan Carpenter <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvme-auth: fix off by one checksDan Carpenter1-5/+5
The > ARRAY_SIZE() checks need to be >= ARRAY_SIZE() to prevent reading one element beyond the end of the arrays. Fixes: db1312dd9548 ("nvmet: implement basic In-Band Authentication") Signed-off-by: Dan Carpenter <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvme: define compat_ioctl again to unbreak 32-bit userspace.Nick Bowler2-0/+2
Commit 89b3d6e60550 ("nvme: simplify the compat ioctl handling") removed the initialization of compat_ioctl from the nvme block_device_operations structures. Presumably the expectation was that 32-bit ioctls would be directed through the regular handler but this is not the case: failing to assign .compat_ioctl actually means that the compat case is disabled entirely, and any attempt to submit nvme ioctls from 32-bit userspace fails outright with -ENOTTY. For example: % smartctl -x /dev/nvme0n1 [...] Read NVMe Identify Controller failed: NVME_IOCTL_ADMIN_CMD: Inappropriate ioctl for device The blkdev_compat_ptr_ioctl helper can be used to direct compat calls through the main ioctl handler and makes things work again. Fixes: 89b3d6e60550 ("nvme: simplify the compat ioctl handling") Signed-off-by: Nick Bowler <[email protected]> Reviewed-by: Guixin Liu <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvme: don't always build constants.oChristoph Hellwig2-3/+2
The entire content of constants.c if guarded by an ifdef, so switch to just building the file conditionally instead. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Keith Busch <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvme: use command_id instead of req->tag in trace_nvme_complete_rq()Bean Huo1-1/+1
Use command_id instead of req->tag in trace_nvme_complete_rq(), because of commit e7006de6c238 ("nvme: code command_id with a genctr for use authentication after release"), cmd->common.command_id is set to ((genctl & 0xf)< 12 | req->tag), no longer req->tag, which makes cid in trace_nvme_complete_rq and trace_nvme_setup_cmd are not the same. Fixes: e7006de6c238 ("nvme: code command_id with a genctr for use authentication after release") Signed-off-by: Bean Huo <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02md-raid10: fix KASAN warningMikulas Patocka1-1/+4
There's a KASAN warning in raid10_remove_disk when running the lvm test lvconvert-raid-reshape.sh. We fix this warning by verifying that the value "number" is valid. BUG: KASAN: slab-out-of-bounds in raid10_remove_disk+0x61/0x2a0 [raid10] Read of size 8 at addr ffff889108f3d300 by task mdX_raid10/124682 CPU: 3 PID: 124682 Comm: mdX_raid10 Not tainted 5.19.0-rc6 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x34/0x44 print_report.cold+0x45/0x57a ? __lock_text_start+0x18/0x18 ? raid10_remove_disk+0x61/0x2a0 [raid10] kasan_report+0xa8/0xe0 ? raid10_remove_disk+0x61/0x2a0 [raid10] raid10_remove_disk+0x61/0x2a0 [raid10] Buffer I/O error on dev dm-76, logical block 15344, async page read ? __mutex_unlock_slowpath.constprop.0+0x1e0/0x1e0 remove_and_add_spares+0x367/0x8a0 [md_mod] ? super_written+0x1c0/0x1c0 [md_mod] ? mutex_trylock+0xac/0x120 ? _raw_spin_lock+0x72/0xc0 ? _raw_spin_lock_bh+0xc0/0xc0 md_check_recovery+0x848/0x960 [md_mod] raid10d+0xcf/0x3360 [raid10] ? sched_clock_cpu+0x185/0x1a0 ? rb_erase+0x4d4/0x620 ? var_wake_function+0xe0/0xe0 ? psi_group_change+0x411/0x500 ? preempt_count_sub+0xf/0xc0 ? _raw_spin_lock_irqsave+0x78/0xc0 ? __lock_text_start+0x18/0x18 ? raid10_sync_request+0x36c0/0x36c0 [raid10] ? preempt_count_sub+0xf/0xc0 ? _raw_spin_unlock_irqrestore+0x19/0x40 ? del_timer_sync+0xa9/0x100 ? try_to_del_timer_sync+0xc0/0xc0 ? _raw_spin_lock_irqsave+0x78/0xc0 ? __lock_text_start+0x18/0x18 ? _raw_spin_unlock_irq+0x11/0x24 ? __list_del_entry_valid+0x68/0xa0 ? finish_wait+0xa3/0x100 md_thread+0x161/0x260 [md_mod] ? unregister_md_personality+0xa0/0xa0 [md_mod] ? _raw_spin_lock_irqsave+0x78/0xc0 ? prepare_to_wait_event+0x2c0/0x2c0 ? unregister_md_personality+0xa0/0xa0 [md_mod] kthread+0x148/0x180 ? kthread_complete_and_exit+0x20/0x20 ret_from_fork+0x1f/0x30 </TASK> Allocated by task 124495: kasan_save_stack+0x1e/0x40 __kasan_kmalloc+0x80/0xa0 setup_conf+0x140/0x5c0 [raid10] raid10_run+0x4cd/0x740 [raid10] md_run+0x6f9/0x1300 [md_mod] raid_ctr+0x2531/0x4ac0 [dm_raid] dm_table_add_target+0x2b0/0x620 [dm_mod] table_load+0x1c8/0x400 [dm_mod] ctl_ioctl+0x29e/0x560 [dm_mod] dm_compat_ctl_ioctl+0x7/0x20 [dm_mod] __do_compat_sys_ioctl+0xfa/0x160 do_syscall_64+0x90/0xc0 entry_SYSCALL_64_after_hwframe+0x46/0xb0 Last potentially related work creation: kasan_save_stack+0x1e/0x40 __kasan_record_aux_stack+0x9e/0xc0 kvfree_call_rcu+0x84/0x480 timerfd_release+0x82/0x140 L __fput+0xfa/0x400 task_work_run+0x80/0xc0 exit_to_user_mode_prepare+0x155/0x160 syscall_exit_to_user_mode+0x12/0x40 do_syscall_64+0x42/0xc0 entry_SYSCALL_64_after_hwframe+0x46/0xb0 Second to last potentially related work creation: kasan_save_stack+0x1e/0x40 __kasan_record_aux_stack+0x9e/0xc0 kvfree_call_rcu+0x84/0x480 timerfd_release+0x82/0x140 __fput+0xfa/0x400 task_work_run+0x80/0xc0 exit_to_user_mode_prepare+0x155/0x160 syscall_exit_to_user_mode+0x12/0x40 do_syscall_64+0x42/0xc0 entry_SYSCALL_64_after_hwframe+0x46/0xb0 The buggy address belongs to the object at ffff889108f3d200 which belongs to the cache kmalloc-256 of size 256 The buggy address is located 0 bytes to the right of 256-byte region [ffff889108f3d200, ffff889108f3d300) The buggy address belongs to the physical page: page:000000007ef2a34c refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1108f3c head:000000007ef2a34c order:2 compound_mapcount:0 compound_pincount:0 flags: 0x4000000000010200(slab|head|zone=2) raw: 4000000000010200 0000000000000000 dead000000000001 ffff889100042b40 raw: 0000000000000000 0000000080200020 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff889108f3d200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff889108f3d280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff889108f3d300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ^ ffff889108f3d380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff889108f3d400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Signed-off-by: Mikulas Patocka <[email protected]> Cc: [email protected] Signed-off-by: Song Liu <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02md-raid: destroy the bitmap after destroying the threadMikulas Patocka1-1/+1
When we ran the lvm test "shell/integrity-blocksize-3.sh" on a kernel with kasan, we got failure in write_page. The reason for the failure is that md_bitmap_destroy is called before destroying the thread and the thread may be waiting in the function write_page for the bio to complete. When the thread finishes waiting, it executes "if (test_bit(BITMAP_WRITE_ERROR, &bitmap->flags))", which triggers the kasan warning. Note that the commit 48df498daf62 that caused this bug claims that it is neede for md-cluster, you should check md-cluster and possibly find another bugfix for it. BUG: KASAN: use-after-free in write_page+0x18d/0x680 [md_mod] Read of size 8 at addr ffff889162030c78 by task mdX_raid1/5539 CPU: 10 PID: 5539 Comm: mdX_raid1 Not tainted 5.19.0-rc2 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x34/0x44 print_report.cold+0x45/0x57a ? __lock_text_start+0x18/0x18 ? write_page+0x18d/0x680 [md_mod] kasan_report+0xa8/0xe0 ? write_page+0x18d/0x680 [md_mod] kasan_check_range+0x13f/0x180 write_page+0x18d/0x680 [md_mod] ? super_sync+0x4d5/0x560 [dm_raid] ? md_bitmap_file_kick+0xa0/0xa0 [md_mod] ? rs_set_dev_and_array_sectors+0x2e0/0x2e0 [dm_raid] ? mutex_trylock+0x120/0x120 ? preempt_count_add+0x6b/0xc0 ? preempt_count_sub+0xf/0xc0 md_update_sb+0x707/0xe40 [md_mod] md_reap_sync_thread+0x1b2/0x4a0 [md_mod] md_check_recovery+0x533/0x960 [md_mod] raid1d+0xc8/0x2a20 [raid1] ? var_wake_function+0xe0/0xe0 ? psi_group_change+0x411/0x500 ? preempt_count_sub+0xf/0xc0 ? _raw_spin_lock_irqsave+0x78/0xc0 ? __lock_text_start+0x18/0x18 ? raid1_end_read_request+0x2a0/0x2a0 [raid1] ? preempt_count_sub+0xf/0xc0 ? _raw_spin_unlock_irqrestore+0x19/0x40 ? del_timer_sync+0xa9/0x100 ? try_to_del_timer_sync+0xc0/0xc0 ? _raw_spin_lock_irqsave+0x78/0xc0 ? __lock_text_start+0x18/0x18 ? __list_del_entry_valid+0x68/0xa0 ? finish_wait+0xa3/0x100 md_thread+0x161/0x260 [md_mod] ? unregister_md_personality+0xa0/0xa0 [md_mod] ? _raw_spin_lock_irqsave+0x78/0xc0 ? prepare_to_wait_event+0x2c0/0x2c0 ? unregister_md_personality+0xa0/0xa0 [md_mod] kthread+0x148/0x180 ? kthread_complete_and_exit+0x20/0x20 ret_from_fork+0x1f/0x30 </TASK> Allocated by task 5522: kasan_save_stack+0x1e/0x40 __kasan_kmalloc+0x80/0xa0 md_bitmap_create+0xa8/0xe80 [md_mod] md_run+0x777/0x1300 [md_mod] raid_ctr+0x249c/0x4a30 [dm_raid] dm_table_add_target+0x2b0/0x620 [dm_mod] table_load+0x1c8/0x400 [dm_mod] ctl_ioctl+0x29e/0x560 [dm_mod] dm_compat_ctl_ioctl+0x7/0x20 [dm_mod] __do_compat_sys_ioctl+0xfa/0x160 do_syscall_64+0x90/0xc0 entry_SYSCALL_64_after_hwframe+0x46/0xb0 Freed by task 5680: kasan_save_stack+0x1e/0x40 kasan_set_track+0x21/0x40 kasan_set_free_info+0x20/0x40 __kasan_slab_free+0xf7/0x140 kfree+0x80/0x240 md_bitmap_free+0x1c3/0x280 [md_mod] __md_stop+0x21/0x120 [md_mod] md_stop+0x9/0x40 [md_mod] raid_dtr+0x1b/0x40 [dm_raid] dm_table_destroy+0x98/0x1e0 [dm_mod] __dm_destroy+0x199/0x360 [dm_mod] dev_remove+0x10c/0x160 [dm_mod] ctl_ioctl+0x29e/0x560 [dm_mod] dm_compat_ctl_ioctl+0x7/0x20 [dm_mod] __do_compat_sys_ioctl+0xfa/0x160 do_syscall_64+0x90/0xc0 entry_SYSCALL_64_after_hwframe+0x46/0xb0 Signed-off-by: Mikulas Patocka <[email protected]> Cc: [email protected] Fixes: 48df498daf62 ("md: move bitmap_destroy to the beginning of __md_stop") Signed-off-by: Song Liu <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02md: return the allocated devices from md_allocChristoph Hellwig3-49/+30
Two callers of md_alloc want to use the newly allocated devices, so return it instead of letting them find it cumbersomely after the allocation. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Reviewed-and-tested-by: Logan Gunthorpe <[email protected]> Signed-off-by: Song Liu <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02md: open code md_probe in autorun_devicesChristoph Hellwig1-1/+1
autorun_devices should not be limited to the controls for the legacy probe on open, so just call md_alloc directly. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Reviewed-and-tested-by: Logan Gunthorpe <[email protected]> Signed-off-by: Song Liu <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02md: remove unneeded semicolonYang Li1-1/+1
Eliminate the following coccicheck warning: ./drivers/md/md.c:8208:2-3: Unneeded semicolon Reported-by: Abaci Robot <[email protected]> Signed-off-by: Yang Li <[email protected]> Reported-by: kernel test robot <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Song Liu <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02remove the sx8 block driverChristoph Hellwig3-1593/+0
This driver is for fairly obscure hardware, and has only seen random drive-by changes after the maintainer stopped working on it in 2005 (about a year and a half after it was introduced). It has some "interesting" block layer interactions, so let's just drop it unless anyone complains. Signed-off-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] [axboe: fix date typo, it was in 2005, not 2015] Signed-off-by: Jens Axboe <[email protected]>
2022-08-02md: fix build failure for !MODULEStephen Rothwell1-0/+2
After merging the block tree, today's linux-next build (x86_64 allmodconfig) failed like this: drivers/md/md.c:717:22: error: 'mddev_find' defined but not used [-Werror=unused-function] 717 | static struct mddev *mddev_find(dev_t unit) | ^~~~~~~~~~ cc1: all warnings being treated as errors Caused by commit 4500d5c17910 ("md: simplify md_open") Make mddev_find() available only for non-modular builds. Signed-off-by: Stephen Rothwell <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-08-02raid5: fix duplicate checks for rdev->saved_raid_diskJackie Liu1-2/+1
'first' will always be greater than or equal to 0, it is unnecessary to repeat the 0 check, clean it up. Signed-off-by: Jackie Liu <[email protected]> Signed-off-by: Song Liu <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02md: simplify md_openChristoph Hellwig1-27/+15
Now that devices are on the all_mddevs list until the gendisk is freed, there can't be any duplicates. Remove the global list lookup and just grab a reference. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Reviewed-by: Logan Gunthorpe <[email protected]> Signed-off-by: Song Liu <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02md: only delete entries from all_mddevs when the disk is freedChristoph Hellwig2-18/+40
This ensures device names don't get prematurely reused. Instead add a deleted flag to skip already deleted devices in mddev_get and other places that only want to see live mddevs. Reported-by: Logan Gunthorpe <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Reviewed-by: Logan Gunthorpe <[email protected]> Signed-off-by: Song Liu <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02md: stop using for_each_mddev in md_exitChristoph Hellwig1-28/+11
Just do a simple list_for_each_entry_safe on all_mddevs, and only grab a reference when we drop the lock and delete the now unused for_each_mddev macro. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Logan Gunthorpe <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Song Liu <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02md: stop using for_each_mddev in md_notify_rebootChristoph Hellwig1-3/+9
Just do a simple list_for_each_entry_safe on all_mddevs, and only grab a reference when we drop the lock. Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Logan Gunthorpe <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Song Liu <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02md: stop using for_each_mddev in md_do_syncChristoph Hellwig1-3/+5
Just do a plain list_for_each that only grabs a mddev reference in the case where the thread sleeps and restarts the list iteration. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Logan Gunthorpe <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Song Liu <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02md: factor out the rdev overlaps check from rdev_size_storeChristoph Hellwig1-45/+39
This splits the code into nicely readable chunks and also avoids the refcount inc/dec manipulations. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Logan Gunthorpe <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Song Liu <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02md: rename md_free to md_kobj_releaseChristoph Hellwig1-2/+2
The md_free name is rather misleading, so pick a better one. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Logan Gunthorpe <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Song Liu <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02md: implement ->free_diskChristoph Hellwig1-6/+12
Ensure that all private data is only freed once all accesses are done. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Logan Gunthorpe <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Song Liu <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02md: fix error handling in md_allocChristoph Hellwig1-13/+32
Error handling in md_alloc is a mess. Untangle it to just free the mddev directly before add_disk is called and thus the gendisk is globally visible. After that clear the hold flag and let the mddev_put take care of cleaning up the mddev through the usual mechanisms. Fixes: 5e55e2f5fc95 ("[PATCH] md: convert compile time warnings into runtime warnings") Fixes: 9be68dd7ac0e ("md: add error handling support for add_disk()") Fixes: 7ad1069166c0 ("md: properly unwind when failing to add the kobject in md_alloc") Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Logan Gunthorpe <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Song Liu <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02md: fix mddev->kobj lifetimeChristoph Hellwig1-6/+4
Once a kobject is initialized, the containing object should not be directly freed. So delay initialization until it is added. Also remove the kobject_del call as the last put will remove the kobject as well. The explicitly delete isn't needed here, and dropping it will simplify further fixes. With this md_free now does not need to check that ->gendisk is non-NULL as it is always set by the time that kobject_init is called on mddev->kobj. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Logan Gunthorpe <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Song Liu <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02md/raid5: Convert prepare_to_wait() to wait_woken() apiLogan Gunthorpe1-7/+6
raid5_get_active_stripe() can sleep in various situations and it is called by make_stripe_request() while inside the prepare_to_wait()/finish_wait() section. Nested waits like this are not supported. This was noticed while making other changes that add different sleeps to raid5_get_active_stripe() that caused a WARNING with CONFIG_DEBUG_ATOMIC_SLEEP. No ill effects have been noticed with the code as is, but theoretically a nested and here could cause a dead lock so it should be fixed. To fix this, convert the prepare_to_wait() call to use wake_woken() which supports nested sleeps. Link: https://lwn.net/Articles/628628/ Signed-off-by: Logan Gunthorpe <[email protected]> Signed-off-by: Song Liu <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02md/raid5: Fix sectors_to_do bitmap overflow in raid5_make_request()Logan Gunthorpe1-8/+11
For unaligned IO that have nearly maximum sectors, the number of stripes will end up being one greater than the size of the bitmap. When this happens, the last stripe in the IO will not be processed as it should be, resulting in data corruption. However, this is not normally seen when the backing block devices have 4K physical block sizes since the block layer will split the request before that happens. To fix this increase the bitmap size by one bit and ensure the full number of stripes are checked when calling find_first_bit(). Reported-by: David Sloan <[email protected]> Fixes: 7e55c60acfbb ("md/raid5: Pivot raid5_make_request()") Signed-off-by: Logan Gunthorpe <[email protected]> Signed-off-by: Song Liu <[email protected]> Signed-off-by: Jens Axboe <[email protected]>