aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2022-08-02block: fix leaking page ref on truncated direct ioKeith Busch1-15/+15
The size being added to a bio from an iov is aligned to a block size after the pages were gotten. If the new aligned size truncates the last page, its reference was being leaked. Ensure all pages that were not added to the bio have their reference released. Since this essentially requires doing the same that bio_put_pages(), and there was only one caller for that function, this patch makes the put_page() loop common for everyone. Fixes: b1a000d3b8ec5 ("block: relax direct io memory alignment") Reported-by: Al Viro <[email protected]> Signed-off-by: Keith Busch <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-08-02block: ensure bio_iov_add_page can't failKeith Busch1-10/+9
Adding the page could fail on the bio_full() condition, which checks for either exceeding the bio's max segments or total size exceeding UINT_MAX. We already ensure the max segments can't be exceeded, so just ensure the total size won't reach the limit. This simplifies error handling and removes unnecessary repeated bio_full() checks. Signed-off-by: Keith Busch <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-08-02block: ensure iov_iter advances for added pagesKeith Busch1-4/+4
There are cases where a bio may not accept additional pages, and the iov needs to advance to the last data length that was accepted. The zone append used to handle this correctly, but was inadvertently broken when the setup was made common with the normal r/w case. Fixes: 576ed9135489c ("block: use bio_add_page in bio_iov_iter_get_pages") Fixes: c58c0074c54c2 ("block/bio: remove duplicate append pages code") Signed-off-by: Keith Busch <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-08-02drivers:md:fix a potential use-after-free bugWentao_Liang1-1/+1
In line 2884, "raid5_release_stripe(sh);" drops the reference to sh and may cause sh to be released. However, sh is subsequently used in lines 2886 "if (sh->batch_head && sh != sh->batch_head)". This may result in an use-after-free bug. It can be fixed by moving "raid5_release_stripe(sh);" to the bottom of the function. Signed-off-by: Wentao_Liang <[email protected]> Signed-off-by: Song Liu <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02md/raid5: Ensure batch_last is released before sleeping for quiesceLogan Gunthorpe2-9/+29
A race condition exists where if raid5_quiesce() is called in the middle of a request that has set batch_last, it will deadlock. batch_last will hold a reference to a stripe when raid5_quiesce() is called. This will cause the next raid5_get_active_stripe() call to sleep waiting for the quiesce to finish, but the raid5_quiesce() thread will wait for active_stripes to go to zero which will never happen because request thread is waiting for the quiesce to stop. Fix this by creating a special __raid5_get_active_stripe() function which takes the request context and clears the last_batch before sleeping. While we're at it, change the arguments of raid5_get_active_stripe() to bools. Fixes: 3312e6c887fe ("md/raid5: Keep a reference to last stripe_head for batch") Reported-by: David Sloan <[email protected]> Signed-off-by: Logan Gunthorpe <[email protected]> Signed-off-by: Song Liu <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02md/raid5: Move stripe_request_ctx upLogan Gunthorpe1-27/+27
Move stripe_request_ctx up. No functional changes intended. This will be necessary in the next patch to release the batch_last in the context before sleeping. Signed-off-by: Logan Gunthorpe <[email protected]> Signed-off-by: Song Liu <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02md/raid5: Drop unnecessary call to r5c_check_stripe_cache_usage()Logan Gunthorpe1-1/+0
Now that raid5_get_active_stripe() has been refactored it is appearant that r5c_check_stripe_cache_usage() doesn't need to be called in the wait_for_stripe branch. r5c_check_stripe_cache_usage() will only conditionally call r5l_wake_reclaim(), but that function is called two lines later. Drop the call for cleanup. Reported-by: Martin Oliveira <[email protected]> Signed-off-by: Logan Gunthorpe <[email protected]> Signed-off-by: Song Liu <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02md/raid5: Make is_inactive_blocked() helperLogan Gunthorpe1-5/+19
The logic to wait_for_stripe is difficult to parse being on so many lines and with confusing operator precedence. Move it to a helper function to make it easier to read. No functional changes intended. Signed-off-by: Logan Gunthorpe <[email protected]> Signed-off-by: Song Liu <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02md/raid5: Refactor raid5_get_active_stripe()Logan Gunthorpe1-31/+36
Refactor the raid5_get_active_stripe() to read more linearly in the order it's typically executed. The init_stripe() call is called if a free stripe is found and the function is exited early which removes a lot of if (sh) checks and unindents the following code. Remove the while loop in favour of the 'goto retry' pattern, which reduces indentation further. And use a 'goto wait_for_stripe' instead of an additional indent seeing it is the unusual path and this makes the code easier to read. No functional changes intended. Will make subsequent changes in patches easier to understand. Signed-off-by: Logan Gunthorpe <[email protected]> Signed-off-by: Song Liu <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02block: pass struct queue_limits to the bio splitting helpersChristoph Hellwig5-72/+68
Allow using the splitting helpers on just a queue_limits instead of a full request_queue structure. This will eventually allow file systems or remapping drivers to split REQ_OP_ZONE_APPEND bios based on limits calculated as the minimum common capabilities over multiple devices. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Damien Le Moal <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-08-02block: move bio_allowed_max_sectors to blk-merge.cChristoph Hellwig2-10/+10
Move this helper into the only file where it is used. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Damien Le Moal <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-08-02block: move the call to get_max_io_size out of blk_bio_segment_splitChristoph Hellwig1-4/+5
Prepare for reusing blk_bio_segment_split for (file system controlled) splits of REQ_OP_ZONE_APPEND bios by letting the caller control the maximum size of the bio. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Damien Le Moal <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-08-02block: move ->bio_split to the gendiskChristoph Hellwig6-16/+15
Only non-passthrough requests are split by the block layer and use the ->bio_split bio_set. Move it from the request_queue to the gendisk. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Damien Le Moal <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-08-02Merge tag 'flexible-array-transformations-UAPI-6.0-rc1' of ↵Linus Torvalds82-216/+216
git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux Pull uapi flexible array update from Gustavo Silva: "A treewide patch that replaces zero-length arrays with flexible-array members in UAPI. This has been baking in linux-next for 5 weeks now. '-fstrict-flex-arrays=3' is coming and we need to land these changes to prevent issues like these in the short future: fs/minix/dir.c:337:3: warning: 'strcpy' will always overflow; destination buffer has size 0, but the source string has length 2 (including NUL byte) [-Wfortify-source] strcpy(de3->name, "."); ^ Since these are all [0] to [] changes, the risk to UAPI is nearly zero. If this breaks anything, we can use a union with a new member name" Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101836 * tag 'flexible-array-transformations-UAPI-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux: treewide: uapi: Replace zero-length arrays with flexible-array members
2022-08-02Merge tag 'linux-kselftest-next-5.20-rc1' of ↵Linus Torvalds30-145/+111
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull Kselftest updates from Shuah Khan: - timers test build fixes and cleanups for new tool chains - removing khdr from kselftest framework and main Makefile - changes to test output messages to improve reports * tag 'linux-kselftest-next-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (24 commits) Makefile: replace headers_install with headers for kselftest selftests/landlock: drop deprecated headers dependency selftests: timers: clocksource-switch: adapt to kselftest framework selftests: timers: clocksource-switch: add 'runtime' command line parameter selftests: timers: clocksource-switch: add command line switch to skip sanity check selftests: timers: clocksource-switch: sort includes selftests: timers: clocksource-switch: fix passing errors from child selftests: timers: inconsistency-check: adapt to kselftest framework selftests: timers: nanosleep: adapt to kselftest framework selftests: timers: fix declarations of main() selftests: timers: valid-adjtimex: build fix for newer toolchains Makefile: add headers_install to kselftest targets selftests: drop KSFT_KHDR_INSTALL make target selftests: stop using KSFT_KHDR_INSTALL selftests: drop khdr make target selftests: drivers/dma-buf: Improve message in selftest summary selftests/kcmp: Make the test output consistent and clear selftests:timers: globals don't need initialization to 0 selftests/drivers/gpu: Add error messages to drm_mm.sh selftests/tpm2: increase timeout for kselftests ...
2022-08-02Merge tag 'linux-kselftest-kunit-5.20-rc1' of ↵Linus Torvalds36-592/+657
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull KUnit updates from Shuah Khan: "This consists of several fixes and an important feature to discourage running KUnit tests on production systems. Running tests on a production system could leave the system in a bad state. Summary: - Add a new taint type, TAINT_TEST to signal that a test has been run. This should discourage people from running these tests on production systems, and to make it easier to tell if tests have been run accidentally (by loading the wrong configuration, etc) - Several documentation and tool enhancements and fixes" * tag 'linux-kselftest-kunit-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (29 commits) Documentation: KUnit: Fix example with compilation error Documentation: kunit: Add CLI args for kunit_tool kcsan: test: Add a .kunitconfig to run KCSAN tests kunit: executor: Fix a memory leak on failure in kunit_filter_tests clk: explicitly disable CONFIG_UML_PCI_OVER_VIRTIO in .kunitconfig mmc: sdhci-of-aspeed: test: Use kunit_test_suite() macro nitro_enclaves: test: Use kunit_test_suite() macro thunderbolt: test: Use kunit_test_suite() macro kunit: flatten kunit_suite*** to kunit_suite** in .kunit_test_suites kunit: unify module and builtin suite definitions selftest: Taint kernel when test module loaded module: panic: Taint the kernel when selftest modules load Documentation: kunit: fix example run_kunit func to allow spaces in args Documentation: kunit: Cleanup run_wrapper, fix x-ref kunit: test.h: fix a kernel-doc markup kunit: tool: Enable virtio/PCI by default on UML kunit: tool: make --kunitconfig repeatable, blindly concat kunit: add coverage_uml.config to enable GCOV on UML kunit: tool: refactor internal kconfig handling, allow overriding kunit: tool: introduce --qemu_args ...
2022-08-02Merge tag 'docs-6.0' of git://git.lwn.net/linuxLinus Torvalds133-1400/+3274
Pull documentation updates from Jonathan Corbet: "This was a moderately busy cycle for documentation, but nothing all that earth-shaking: - More Chinese translations, and an update to the Italian translations. The Japanese, Korean, and traditional Chinese translations are more-or-less unmaintained at this point, instead. - Some build-system performance improvements. - The removal of the archaic submitting-drivers.rst document, with the movement of what useful material that remained into other docs. - Improvements to sphinx-pre-install to, hopefully, give more useful suggestions. - A number of build-warning fixes Plus the usual collection of typo fixes, updates, and more" * tag 'docs-6.0' of git://git.lwn.net/linux: (92 commits) docs: efi-stub: Fix paths for x86 / arm stubs Docs/zh_CN: Update the translation of sched-stats to 5.19-rc8 Docs/zh_CN: Update the translation of pci to 5.19-rc8 Docs/zh_CN: Update the translation of pci-iov-howto to 5.19-rc8 Docs/zh_CN: Update the translation of usage to 5.19-rc8 Docs/zh_CN: Update the translation of testing-overview to 5.19-rc8 Docs/zh_CN: Update the translation of sparse to 5.19-rc8 Docs/zh_CN: Update the translation of kasan to 5.19-rc8 Docs/zh_CN: Update the translation of iio_configfs to 5.19-rc8 doc:it_IT: align Italian documentation docs: Remove spurious tag from admin-guide/mm/overcommit-accounting.rst Documentation: process: Update email client instructions for Thunderbird docs: ABI: correct QEMU fw_cfg spec path doc/zh_CN: remove submitting-driver reference from docs docs: zh_TW: align to submitting-drivers removal docs: zh_CN: align to submitting-drivers removal docs: ko_KR: howto: remove reference to removed submitting-drivers docs: ja_JP: howto: remove reference to removed submitting-drivers docs: it_IT: align to submitting-drivers removal docs: process: remove outdated submitting-drivers.rst ...
2022-08-02Merge tag 'nolibc.2022.07.27a' of ↵Linus Torvalds4-8/+43
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu Pull nolibc updates from Paul McKenney: "This provides nolibc updates, perhaps most notably improved testing via the 'cd tools/include/nolibc; make headers' command. This should be considered a smoke test. More thorough testing is in the works" * tag 'nolibc.2022.07.27a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: tools/nolibc: add a help target to list supported targets tools/nolibc: make the default target build the headers tools/nolibc: fix the makefile to also work as "make -C tools ..." tools/nolibc/stdio: Add format attribute to enable printf warnings tools/nolibc/stdlib: Support overflow checking for older compiler versions
2022-08-02Merge tag 'rcu.2022.07.26a' of ↵Linus Torvalds76-1279/+2124
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu Pull RCU updates from Paul McKenney: - Documentation updates - Miscellaneous fixes - Callback-offload updates, perhaps most notably a new RCU_NOCB_CPU_DEFAULT_ALL Kconfig option that causes all CPUs to be offloaded at boot time, regardless of kernel boot parameters. This is useful to battery-powered systems such as ChromeOS and Android. In addition, a new RCU_NOCB_CPU_CB_BOOST kernel boot parameter prevents offloaded callbacks from interfering with real-time workloads and with energy-efficiency mechanisms - Polled grace-period updates, perhaps most notably making these APIs account for both normal and expedited grace periods - Tasks RCU updates, perhaps most notably reducing the CPU overhead of RCU tasks trace grace periods by more than a factor of two on a system with 15,000 tasks. The reduction is expected to increase with the number of tasks, so it seems reasonable to hypothesize that a system with 150,000 tasks might see a 20-fold reduction in CPU overhead - Torture-test updates - Updates that merge RCU's dyntick-idle tracking into context tracking, thus reducing the overhead of transitioning to kernel mode from either idle or nohz_full userspace execution for kernels that track context independently of RCU. This is expected to be helpful primarily for kernels built with CONFIG_NO_HZ_FULL=y * tag 'rcu.2022.07.26a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (98 commits) rcu: Add irqs-disabled indicator to expedited RCU CPU stall warnings rcu: Diagnose extended sync_rcu_do_polled_gp() loops rcu: Put panic_on_rcu_stall() after expedited RCU CPU stall warnings rcutorture: Test polled expedited grace-period primitives rcu: Add polled expedited grace-period primitives rcutorture: Verify that polled GP API sees synchronous grace periods rcu: Make Tiny RCU grace periods visible to polled APIs rcu: Make polled grace-period API account for expedited grace periods rcu: Switch polled grace-period APIs to ->gp_seq_polled rcu/nocb: Avoid polling when my_rdp->nocb_head_rdp list is empty rcu/nocb: Add option to opt rcuo kthreads out of RT priority rcu: Add nocb_cb_kthread check to rcu_is_callbacks_kthread() rcu/nocb: Add an option to offload all CPUs on boot rcu/nocb: Fix NOCB kthreads spawn failure with rcu_nocb_rdp_deoffload() direct call rcu/nocb: Invert rcu_state.barrier_mutex VS hotplug lock locking order rcu/nocb: Add/del rdp to iterate from rcuog itself rcu/tree: Add comment to describe GP-done condition in fqs loop rcu: Initialize first_gp_fqs at declaration in rcu_gp_fqs() rcu/kvfree: Remove useless monitor_todo flag rcu: Cleanup RCU urgency state for offline CPU ...
2022-08-02Merge tag 'v5.20-p1' of ↵Linus Torvalds114-1203/+9140
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "API: - Make proc files report fips module name and version Algorithms: - Move generic SHA1 code into lib/crypto - Implement Chinese Remainder Theorem for RSA - Remove blake2s - Add XCTR with x86/arm64 acceleration - Add POLYVAL with x86/arm64 acceleration - Add HCTR2 - Add ARIA Drivers: - Add support for new CCP/PSP device ID in ccp" * tag 'v5.20-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (89 commits) crypto: tcrypt - Remove the static variable initialisations to NULL crypto: arm64/poly1305 - fix a read out-of-bound crypto: hisilicon/zip - Use the bitmap API to allocate bitmaps crypto: hisilicon/sec - fix auth key size error crypto: ccree - Remove a useless dma_supported() call crypto: ccp - Add support for new CCP/PSP device ID crypto: inside-secure - Add missing MODULE_DEVICE_TABLE for of crypto: hisilicon/hpre - don't use GFP_KERNEL to alloc mem during softirq crypto: testmgr - some more fixes to RSA test vectors cyrpto: powerpc/aes - delete the rebundant word "block" in comments hwrng: via - Fix comment typo crypto: twofish - Fix comment typo crypto: rmd160 - fix Kconfig "its" grammar crypto: keembay-ocs-ecc - Drop if with an always false condition Documentation: qat: rewrite description Documentation: qat: Use code block for qat sysfs example crypto: lib - add module license to libsha1 crypto: lib - make the sha1 library optional crypto: lib - move lib/sha1.c into lib/crypto/ crypto: fips - make proc files report fips module name and version ...
2022-08-02Merge tag 'random-6.0-rc1-for-linus' of ↵Linus Torvalds34-300/+204
git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull random number generator updates from Jason Donenfeld: "Though there's been a decent amount of RNG-related development during this last cycle, not all of it is coming through this tree, as this cycle saw a shift toward tackling early boot time seeding issues, which took place in other trees as well. Here's a summary of the various patches: - The CONFIG_ARCH_RANDOM .config option and the "nordrand" boot option have been removed, as they overlapped with the more widely supported and more sensible options, CONFIG_RANDOM_TRUST_CPU and "random.trust_cpu". This change allowed simplifying a bit of arch code. - x86's RDRAND boot time test has been made a bit more robust, with RDRAND disabled if it's clearly producing bogus results. This would be a tip.git commit, technically, but I took it through random.git to avoid a large merge conflict. - The RNG has long since mixed in a timestamp very early in boot, on the premise that a computer that does the same things, but does so starting at different points in wall time, could be made to still produce a different RNG state. Unfortunately, the clock isn't set early in boot on all systems, so now we mix in that timestamp when the time is actually set. - User Mode Linux now uses the host OS's getrandom() syscall to generate a bootloader RNG seed and later on treats getrandom() as the platform's RDRAND-like faculty. - The arch_get_random_{seed_,}_long() family of functions is now arch_get_random_{seed_,}_longs(), which enables certain platforms, such as s390, to exploit considerable performance advantages from requesting multiple CPU random numbers at once, while at the same time compiling down to the same code as before on platforms like x86. - A small cleanup changing a cmpxchg() into a try_cmpxchg(), from Uros. - A comment spelling fix" More info about other random number changes that come in through various architecture trees in the full commentary in the pull request: https://lore.kernel.org/all/[email protected]/ * tag 'random-6.0-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: random: correct spelling of "overwrites" random: handle archrandom with multiple longs um: seed rng using host OS rng random: use try_cmpxchg in _credit_init_bits timekeeping: contribute wall clock to rng on time change x86/rdrand: Remove "nordrand" flag in favor of "random.trust_cpu" random: remove CONFIG_ARCH_RANDOM
2022-08-02block: change the blk_queue_bounce calling conventionChristoph Hellwig3-18/+20
The double indirect bio leads to somewhat suboptimal code generation. Instead return the (original or split) bio, and make sure the request_queue arguments to the lower level helpers is passed after the bio to avoid constant reshuffling of the argument passing registers. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Damien Le Moal <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-08-02block: change the blk_queue_split calling conventionChristoph Hellwig11-65/+63
The double indirect bio leads to somewhat suboptimal code generation. Instead return the (original or split) bio, and make sure the request_queue arguments to the lower level helpers is passed after the bio to avoid constant reshuffling of the argument passing registers. Also give it and the helpers used to implement it more descriptive names. Signed-off-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvme: update MAINTAINERS for the new auth codeChristoph Hellwig1-1/+2
Add the common subdirectory and match all nvme* headers in include/linux/. Reported-by: Dan Carpenter <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvmet-tcp: fix lockdep complaint on nvmet_tcp_wq flush during queue teardownSagi Grimberg1-1/+2
We probably need nvmet_tcp_wq to have MEM_RECLAIM as we are sending/receiving for the socket from works on this workqueue. Also this eliminates lockdep complaints: -- [ 6174.010200] workqueue: WQ_MEM_RECLAIM nvmet-wq:nvmet_tcp_release_queue_work [nvmet_tcp] is flushing !WQ_MEM_RECLAIM nvmet_tcp_wq:nvmet_tcp_io_work [nvmet_tcp] [ 6174.010216] WARNING: CPU: 20 PID: 14456 at kernel/workqueue.c:2628 check_flush_dependency+0x110/0x14c Reported-by: Yi Zhang <[email protected]> Signed-off-by: Sagi Grimberg <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvme: enable generic interface (/dev/ngXnY) for unknown command setsJoel Granados1-7/+34
Extend nvme_alloc_ns() and nvme_validate_ns() for unknown command-set as well. Both are made to use a new helper (nvme_update_ns_info_cs_indep) which is similar to nvme_update_ns_info but performs fewer operations to get the generic interface up. Suggested-by: Christoph Hellwig <[email protected]> Signed-off-by: Joel Granados <[email protected]> Signed-off-by: Kanchan Joshi <[email protected]> [hch: rebased on other refactoring patches] Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Javier González <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvme: factor out a nvme_ns_is_readonly helperChristoph Hellwig1-5/+10
Add a little helper to check if a namespace should be marked read-only that uses a new is_readonly flag in the nvme_ns_info structure. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Javier González <[email protected]> Reviewed-by: Joel Granados <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Reviewed-by: Kanchan Joshi <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvme: refactor namespace probingChristoph Hellwig1-105/+125
Change nvme_ns_scan to gather all information needed for generic namespace setup into a nvme_ns_info structure. This structure is filled from the Command Set Idependent Identify Namespace data structure if it is available or else the legacy Identify namespace structure. With that everything related to the NVM command set (and the ZNS command set derived from it) can be encapsulated in the nvme_update_ns_info_block function while keeping the rest of the namespace probing generic. The downside is that we now always issue two Identify Namespace calls for each probed namespace instead of usually just a single one previously. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Javier González <[email protected]> Reviewed-by: Joel Granados <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvme: generalize the nvme_multi_css check in nvme_scan_nsChristoph Hellwig1-6/+6
Check for multiple command set support early on an error out if is not supported when a !NVM command set namespace is found. This prepares for adding command set independent passthrough support. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Javier González <[email protected]> Reviewed-by: Joel Granados <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Reviewed-by: Kanchan Joshi <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvme: rename nvme_validate_or_alloc_ns to nvme_scan_nsChristoph Hellwig1-3/+3
This shorter name much better fits what this function does in the scanning process. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Javier González <[email protected]> Reviewed-by: Joel Granados <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Reviewed-by: Kanchan Joshi <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvme: catch -ENODEV from nvme_revalidate_zones againChristoph Hellwig1-6/+7
nvme_revalidate_zones can also return -ENODEV if e.g. zone sizes aren't constant or not a power of two. In that case we should jump to marking the gendisk hidden and only support pass through. Fixes: 602e57c9799c ("nvme: also mark passthrough-only namespaces ready in nvme_update_ns_info") Reported-by: Joel Granados <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Joel Granados <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvmet-auth: select the intended CRYPTO_DH_RFC7919_GROUPSLukas Bulwahn1-1/+1
Commit 71ebe3842ebe ("nvmet-auth: Diffie-Hellman key exchange support") intends to select 'Support for RFC 7919 FFDHE group parameters' for using FFDHE groups for NVMe In-Band Authentication. It however selects CRYPTO_DH_GROUPS_RFC7919, instead of the intended CRYPTO_DH_RFC7919_GROUPS; notice the swapping of words here. Correct the select to the intended config option. Signed-off-by: Lukas Bulwahn <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvmet-auth: fix return value check in auth receiveChaitanya Kulkarni1-2/+1
nvmet_auth_challenge() return type is int and currently it uses status variable that is of type u16 in nvmet_execute_auth_receive(). Catch the return value of nvmet_auth_challenge() into int and set the NVME_SC_INTERNAL as status variable before we jump to error. Reported-by: Dan Carpenter <[email protected]> Signed-off-by: Chaitanya Kulkarni <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvmet-auth: fix return value check in auth sendChaitanya Kulkarni1-2/+2
nvmet_setup_auth() return type is int and currently it uses status variable that is of type u16 in nvmet_execute_auth_send(). Catch the return value of nvmet_setup_auth() into int and set the NVME_SC_INTERNAL as status variable before we jump to error. Reported-by: Dan Carpenter <[email protected]> Signed-off-by: Chaitanya Kulkarni <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvmet-auth: fix a couple of spelling mistakesColin Ian King2-2/+2
There are a couple of spelling mistakes in pr_warn and pr_debug messages. Fix them. Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvmet: fix a format specifier in nvmet_auth_ctrl_exponentialChristoph Hellwig1-1/+1
dh_keysize is a size_t, use the proper format specifier for printing it. Reported-by: Guenter Roeck <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Reviewed-by: Guenter Roeck <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvmet: don't check for NULL pointer before kfree in nvmet_host_releaseChristoph Hellwig1-2/+2
And add an empty line after the variable declaration. Reported-by: kernel test robot <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvme-apple: stop casting function pointer signaturesChristoph Hellwig1-6/+15
Casting function pointers breaks control flow enforcement and is generally a horrible coding style. Add two wrappers to get rid of these casts. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Reviewed-by: Sven Peter <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvme-tcp: split nvme_tcp_alloc_tagsetChristoph Hellwig1-41/+41
Split nvme_tcp_alloc_tagset into one helper for the admin tag_set and one for the I/O tag set. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Keith Busch <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvme-rdma: split nvme_rdma_alloc_tagsetChristoph Hellwig1-46/+46
Split nvme_rdma_alloc_tagset into one helper for the admin tag_set and one for the I/O tag set. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Keith Busch <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvme-pci: split nvme_dev_addChristoph Hellwig1-35/+37
Split nvme_dev_add into a helper to actually allocate the tag set, and one that just update the number of queues. Add a local variable for the tag_set to clean up the code a bit. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Keith Busch <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvme-pci: split nvme_alloc_admin_tagsChristoph Hellwig1-29/+31
Split nvme_alloc_admin_tags into a helper to actually allocate the tag set, and one that just restarts the admin queue. Add a local variable for the tag_set to clean up the code a bit. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Keith Busch <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvme-pci: print the command name of aborted commandsChristoph Hellwig2-2/+5
To allow for slightly better debugging, print the command name when aborting an command. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Keith Busch <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvme-pci: remove useless assignment in nvme_pci_setup_prpsLiu Song1-1/+0
If prp_list is NULL, nvme_unmap_sg will be performed, and the assignment to first_dma is meaningless, so remove it. Signed-off-by: Liu Song <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvme-auth: uninitialized variable in nvme_auth_transform_key()Dan Carpenter1-9/+16
A couple of the early error gotos call kfree_sensitive(transformed_key); before "transformed_key" has been initialized. Fixes: db1312dd9548 ("nvmet: implement basic In-Band Authentication") Signed-off-by: Dan Carpenter <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvme-auth: fix off by one checksDan Carpenter1-5/+5
The > ARRAY_SIZE() checks need to be >= ARRAY_SIZE() to prevent reading one element beyond the end of the arrays. Fixes: db1312dd9548 ("nvmet: implement basic In-Band Authentication") Signed-off-by: Dan Carpenter <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvme: define compat_ioctl again to unbreak 32-bit userspace.Nick Bowler2-0/+2
Commit 89b3d6e60550 ("nvme: simplify the compat ioctl handling") removed the initialization of compat_ioctl from the nvme block_device_operations structures. Presumably the expectation was that 32-bit ioctls would be directed through the regular handler but this is not the case: failing to assign .compat_ioctl actually means that the compat case is disabled entirely, and any attempt to submit nvme ioctls from 32-bit userspace fails outright with -ENOTTY. For example: % smartctl -x /dev/nvme0n1 [...] Read NVMe Identify Controller failed: NVME_IOCTL_ADMIN_CMD: Inappropriate ioctl for device The blkdev_compat_ptr_ioctl helper can be used to direct compat calls through the main ioctl handler and makes things work again. Fixes: 89b3d6e60550 ("nvme: simplify the compat ioctl handling") Signed-off-by: Nick Bowler <[email protected]> Reviewed-by: Guixin Liu <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvme: don't always build constants.oChristoph Hellwig2-3/+2
The entire content of constants.c if guarded by an ifdef, so switch to just building the file conditionally instead. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Keith Busch <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02nvme: use command_id instead of req->tag in trace_nvme_complete_rq()Bean Huo1-1/+1
Use command_id instead of req->tag in trace_nvme_complete_rq(), because of commit e7006de6c238 ("nvme: code command_id with a genctr for use authentication after release"), cmd->common.command_id is set to ((genctl & 0xf)< 12 | req->tag), no longer req->tag, which makes cid in trace_nvme_complete_rq and trace_nvme_setup_cmd are not the same. Fixes: e7006de6c238 ("nvme: code command_id with a genctr for use authentication after release") Signed-off-by: Bean Huo <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2022-08-02md-raid10: fix KASAN warningMikulas Patocka1-1/+4
There's a KASAN warning in raid10_remove_disk when running the lvm test lvconvert-raid-reshape.sh. We fix this warning by verifying that the value "number" is valid. BUG: KASAN: slab-out-of-bounds in raid10_remove_disk+0x61/0x2a0 [raid10] Read of size 8 at addr ffff889108f3d300 by task mdX_raid10/124682 CPU: 3 PID: 124682 Comm: mdX_raid10 Not tainted 5.19.0-rc6 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x34/0x44 print_report.cold+0x45/0x57a ? __lock_text_start+0x18/0x18 ? raid10_remove_disk+0x61/0x2a0 [raid10] kasan_report+0xa8/0xe0 ? raid10_remove_disk+0x61/0x2a0 [raid10] raid10_remove_disk+0x61/0x2a0 [raid10] Buffer I/O error on dev dm-76, logical block 15344, async page read ? __mutex_unlock_slowpath.constprop.0+0x1e0/0x1e0 remove_and_add_spares+0x367/0x8a0 [md_mod] ? super_written+0x1c0/0x1c0 [md_mod] ? mutex_trylock+0xac/0x120 ? _raw_spin_lock+0x72/0xc0 ? _raw_spin_lock_bh+0xc0/0xc0 md_check_recovery+0x848/0x960 [md_mod] raid10d+0xcf/0x3360 [raid10] ? sched_clock_cpu+0x185/0x1a0 ? rb_erase+0x4d4/0x620 ? var_wake_function+0xe0/0xe0 ? psi_group_change+0x411/0x500 ? preempt_count_sub+0xf/0xc0 ? _raw_spin_lock_irqsave+0x78/0xc0 ? __lock_text_start+0x18/0x18 ? raid10_sync_request+0x36c0/0x36c0 [raid10] ? preempt_count_sub+0xf/0xc0 ? _raw_spin_unlock_irqrestore+0x19/0x40 ? del_timer_sync+0xa9/0x100 ? try_to_del_timer_sync+0xc0/0xc0 ? _raw_spin_lock_irqsave+0x78/0xc0 ? __lock_text_start+0x18/0x18 ? _raw_spin_unlock_irq+0x11/0x24 ? __list_del_entry_valid+0x68/0xa0 ? finish_wait+0xa3/0x100 md_thread+0x161/0x260 [md_mod] ? unregister_md_personality+0xa0/0xa0 [md_mod] ? _raw_spin_lock_irqsave+0x78/0xc0 ? prepare_to_wait_event+0x2c0/0x2c0 ? unregister_md_personality+0xa0/0xa0 [md_mod] kthread+0x148/0x180 ? kthread_complete_and_exit+0x20/0x20 ret_from_fork+0x1f/0x30 </TASK> Allocated by task 124495: kasan_save_stack+0x1e/0x40 __kasan_kmalloc+0x80/0xa0 setup_conf+0x140/0x5c0 [raid10] raid10_run+0x4cd/0x740 [raid10] md_run+0x6f9/0x1300 [md_mod] raid_ctr+0x2531/0x4ac0 [dm_raid] dm_table_add_target+0x2b0/0x620 [dm_mod] table_load+0x1c8/0x400 [dm_mod] ctl_ioctl+0x29e/0x560 [dm_mod] dm_compat_ctl_ioctl+0x7/0x20 [dm_mod] __do_compat_sys_ioctl+0xfa/0x160 do_syscall_64+0x90/0xc0 entry_SYSCALL_64_after_hwframe+0x46/0xb0 Last potentially related work creation: kasan_save_stack+0x1e/0x40 __kasan_record_aux_stack+0x9e/0xc0 kvfree_call_rcu+0x84/0x480 timerfd_release+0x82/0x140 L __fput+0xfa/0x400 task_work_run+0x80/0xc0 exit_to_user_mode_prepare+0x155/0x160 syscall_exit_to_user_mode+0x12/0x40 do_syscall_64+0x42/0xc0 entry_SYSCALL_64_after_hwframe+0x46/0xb0 Second to last potentially related work creation: kasan_save_stack+0x1e/0x40 __kasan_record_aux_stack+0x9e/0xc0 kvfree_call_rcu+0x84/0x480 timerfd_release+0x82/0x140 __fput+0xfa/0x400 task_work_run+0x80/0xc0 exit_to_user_mode_prepare+0x155/0x160 syscall_exit_to_user_mode+0x12/0x40 do_syscall_64+0x42/0xc0 entry_SYSCALL_64_after_hwframe+0x46/0xb0 The buggy address belongs to the object at ffff889108f3d200 which belongs to the cache kmalloc-256 of size 256 The buggy address is located 0 bytes to the right of 256-byte region [ffff889108f3d200, ffff889108f3d300) The buggy address belongs to the physical page: page:000000007ef2a34c refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1108f3c head:000000007ef2a34c order:2 compound_mapcount:0 compound_pincount:0 flags: 0x4000000000010200(slab|head|zone=2) raw: 4000000000010200 0000000000000000 dead000000000001 ffff889100042b40 raw: 0000000000000000 0000000080200020 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff889108f3d200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff889108f3d280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff889108f3d300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ^ ffff889108f3d380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff889108f3d400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Signed-off-by: Mikulas Patocka <[email protected]> Cc: [email protected] Signed-off-by: Song Liu <[email protected]> Signed-off-by: Jens Axboe <[email protected]>