aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2021-09-21scsi: ufs: core: Unbreak the reset handlerBart Van Assche1-1/+1
A command tag is passed as the second argument of the __ufshcd_transfer_req_compl() call in ufshcd_eh_device_reset_handler() instead of a bitmask. Fix this by passing a bitmask as argument instead of a command tag. Link: https://lore.kernel.org/r/[email protected] Fixes: a45f937110fa ("scsi: ufs: Optimize host lock on transfer requests send/compl paths") Cc: Can Guo <[email protected]> Reviewed-by: Avri Altman <[email protected]> Signed-off-by: Bart Van Assche <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2021-09-21scsi: sd_zbc: Support disks with more than 2**32 logical blocksBart Van Assche1-1/+1
This patch addresses the following Coverity report about the zno * sdkp->zone_blocks expression: CID 1475514 (#1 of 1): Unintentional integer overflow (OVERFLOW_BEFORE_WIDEN) overflow_before_widen: Potentially overflowing expression zno * sdkp->zone_blocks with type unsigned int (32 bits, unsigned) is evaluated using 32-bit arithmetic, and then used in a context that expects an expression of type sector_t (64 bits, unsigned). Link: https://lore.kernel.org/r/[email protected] Fixes: 5795eb443060 ("scsi: sd_zbc: emulate ZONE_APPEND commands") Cc: Johannes Thumshirn <[email protected]> Cc: Damien Le Moal <[email protected]> Cc: Hannes Reinecke <[email protected]> Reviewed-by: Damien Le Moal <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Himanshu Madhani <[email protected]> Signed-off-by: Bart Van Assche <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2021-09-21scsi: ufs: core: Revert "scsi: ufs: Synchronize SCSI and UFS error handling"Adrian Hunter2-57/+58
This reverts commit a113eaaf86373362b053279049907ff82b5df6c8. There are a couple of issues with the commit: 1. It causes deadlocks. 2. It causes the shost->eh_cmd_q list of failed requests not to be processed, ever. So revert it. 1. Deadlocks The SCSI error handler runs with requests blocked beginning when scsi_schedule_eh() sets SHOST_RECOVERY state, continuing through scsi_error_handler() callback ->eh_strategy_handler() until scsi_restart_operations() is called. By setting eh_strategy_handler to ufshcd_err_handler, the patch changed the UFS error handler to run with requests blocked, including PM requests, for the entire run of the error handler. That conflicts with UFS error handler existing synchronization with UFS device PM operations. The UFS error handler synchronizes with runtime PM by doing pm_runtime_get_sync() prior to blocking requests itself. It synchronizes with system PM by use of hba->host_sem, again before blocking requests itself. However, if requests are already blocked, then PM operations will block. So: the UFS error handler blocks waiting on PM + PM blocks waiting on SCSI PM requests to process or fail + PM requests are blocked waiting on error handling to finish = deadlock This happens both for runtime PM and system PM. Prior to the patch, these deadlocks could not happen even if SCSI error handling was running, because the presence of requests in shost->eh_cmd_q would mean the queues could not be suspended, which would mean that, should the UFS error handler run at the same time, it would not need to wait for PM or vice versa. Please note these scenarios are not just theoretical, they were found during testing on a Samsung Galaxy Book S. 2. ->eh_strategy_handler() must process shost->eh_cmd_q list of failed requests, as all other eh_strategy_handler's do except UFS error handler. Refer for example: scsi_unjam_host(), ata_scsi_error() and sas_scsi_recover_host(). Link: https://lore.kernel.org/r/[email protected] Fixes: a113eaaf8637 ("scsi: ufs: Synchronize SCSI and UFS error handling") Reviewed-by: Bart Van Assche <[email protected]> Signed-off-by: Adrian Hunter <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2021-09-21Merge branch 's390-qeth-fixes-2021-09-21'Jakub Kicinski6-19/+18
Julian Wiedmann says: ==================== s390/qeth: fixes 2021-09-21 This brings two fixes for deadlocks when a device is removed while it has certain types of async work pending. And one additional fix for a missing NULL check in an error case. ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2021-09-21s390/qeth: fix deadlock during failing recoveryAlexandra Winter3-4/+11
Commit 0b9902c1fcc5 ("s390/qeth: fix deadlock during recovery") removed taking discipline_mutex inside qeth_do_reset(), fixing potential deadlocks. An error path was missed though, that still takes discipline_mutex and thus has the original deadlock potential. Intermittent deadlocks were seen when a qeth channel path is configured offline, causing a race between qeth_do_reset and ccwgroup_remove. Call qeth_set_offline() directly in the qeth_do_reset() error case and then a new variant of ccwgroup_set_offline(), without taking discipline_mutex. Fixes: b41b554c1ee7 ("s390/qeth: fix locking for discipline setup / removal") Signed-off-by: Alexandra Winter <[email protected]> Reviewed-by: Julian Wiedmann <[email protected]> Signed-off-by: Julian Wiedmann <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2021-09-21s390/qeth: Fix deadlock in remove_disciplineAlexandra Winter4-15/+4
Problem: qeth_close_dev_handler is a worker that tries to acquire card->discipline_mutex via drv->set_offline() in ccwgroup_set_offline(). Since commit b41b554c1ee7 ("s390/qeth: fix locking for discipline setup / removal") qeth_remove_discipline() is called under card->discipline_mutex and cancels the work and waits for it to finish. STOPLAN reception with reason code IPA_RC_VEPA_TO_VEB_TRANSITION is the only situation that schedules close_dev_work. In that situation scheduling qeth recovery will also result in an offline interface, when resetting the isolation mode fails, if the external switch is still set to VEB. And since commit 0b9902c1fcc5 ("s390/qeth: fix deadlock during recovery") qeth recovery does not aquire card->discipline_mutex anymore. So we accept the longer pathlength of qeth_schedule_recovery in this error situation and re-use the existing function. As a side-benefit this changes the hwtrap to behave like during recovery instead of like during a user-triggered set_offline. Fixes: b41b554c1ee7 ("s390/qeth: fix locking for discipline setup / removal") Signed-off-by: Alexandra Winter <[email protected]> Acked-by: Julian Wiedmann <[email protected]> Signed-off-by: Julian Wiedmann <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2021-09-21s390/qeth: fix NULL deref in qeth_clear_working_pool_list()Julian Wiedmann1-0/+3
When qeth_set_online() calls qeth_clear_working_pool_list() to roll back after an error exit from qeth_hardsetup_card(), we are at risk of accessing card->qdio.in_q before it was allocated by qeth_alloc_qdio_queues() via qeth_mpc_initialize(). qeth_clear_working_pool_list() then dereferences NULL, and by writing to queue->bufs[i].pool_entry scribbles all over the CPU's lowcore. Resulting in a crash when those lowcore areas are used next (eg. on the next machine-check interrupt). Such a scenario would typically happen when the device is first set online and its queues aren't allocated yet. An early IO error or certain misconfigs (eg. mismatched transport mode, bad portno) then cause us to error out from qeth_hardsetup_card() with card->qdio.in_q still being NULL. Fix it by checking the pointer for NULL before accessing it. Note that we also have (rare) paths inside qeth_mpc_initialize() where a configuration change can cause us to free the existing queues, expecting that subsequent code will allocate them again. If we then error out before that re-allocation happens, the same bug occurs. Fixes: eff73e16ee11 ("s390/qeth: tolerate pre-filled RX buffer") Reported-by: Stefan Raspl <[email protected]> Root-caused-by: Heiko Carstens <[email protected]> Signed-off-by: Julian Wiedmann <[email protected]> Reviewed-by: Alexandra Winter <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2021-09-21cifs: fix a sign extension bugDan Carpenter1-1/+1
The problem is the mismatched types between "ctx->total_len" which is an unsigned int, "rc" which is an int, and "ctx->rc" which is a ssize_t. The code does: ctx->rc = (rc == 0) ? ctx->total_len : rc; We want "ctx->rc" to store the negative "rc" error code. But what happens is that "rc" is type promoted to a high unsigned int and 'ctx->rc" will store the high positive value instead of a negative value. The fix is to change "rc" from an int to a ssize_t. Fixes: c610c4b619e5 ("CIFS: Add asynchronous write support through kernel AIO") Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Steve French <[email protected]>
2021-09-21spi: Revert modalias changesMark Brown1-8/+0
During the v5.13 cycle we updated the SPI subsystem to generate OF style modaliases for SPI devices, replacing the old Linux style modalises we used to generate based on spi_device_id which are the DT style name with the vendor removed. Unfortunately this means that we start only reporting OF style modalises and not the old ones and there is nothing that ensures that drivers list every possible OF compatible string in their OF ID table. The result is that there are systems which have been relying on loading modules based on the old style that are now broken, as found by Russell King with spi-nor on Macchiatobin. spi-nor is a particularly problematic case for this, it only lists a single generic DT compatible jedec,spi-nor in the driver but supports a huge raft of device specific compatibles, with a large set of part numbers many of which are offered by multiple vendors. Russell's searches of upstream device trees has turned up examples with vendor names written in non-standard ways too. To make matters worse up until 8ff16cf77ce3 ("Documentation: devicetree: m25p80: add "nor-jedec" binding") the generic compatible was not part of the binding so there are device trees out there written to that binding version which don't list it all. The sheer number of parts supported together with our previous approach of ignoring the vendor ID makes robustly fixing this by adding compatibles to the spi-nor driver seem problematic, the current DT binding document does not list all the parts supported by the driver at the minute (further patches will fix this). I've also investigated supporting both formats of modalias simultaneously but that doesn't seem possible, especially without breaking our userspace ABI which is obviously not viable. Instead revert the relevant changes for now: e09f2ab8eecc ("spi: update modalias_show after of_device_uevent_modalias support") 3ce6c9e2617e ("spi: add of_device_uevent_modalias support") This will unfortunately mean that any system which had started having modules autoload based on the OF compatibles for drivers that list things there but not in the spi_device_ids will now not have those modules load which is itself a regression. Since it affects a narrower time window and the particularly problematic spi-nor driver may be critical to system boot on smaller systems this seems the best of a series of bad options. I will start an audit of SPI drivers to identify and fix cases where things won't autoload using spi_device_id, this is not great but seems to be the best way forward that anyone has been able to identify. Thanks to Russell for both his report and the additional diagnostic and analysis work he has done here, the detailed research above was his work. Fixes: e09f2ab8eecc ("spi: update modalias_show after of_device_uevent_modalias support") Fixes: 3ce6c9e2617e ("spi: add of_device_uevent_modalias support") Reported-by: Russell King (Oracle) <[email protected]> Suggested-by: Russell King (Oracle) <[email protected]> Signed-off-by: Mark Brown <[email protected]> Tested-by: Russell King (Oracle) <[email protected]> Cc: Andreas Schwab <[email protected]> Cc: Marco Felsch <[email protected]>
2021-09-21ksmbd: add default data stream name in FILE_STREAM_INFORMATIONNamjae Jeon1-5/+3
Windows client expect to get default stream name(::DATA) in FILE_STREAM_INFORMATION response even if there is no stream data in file. This patch fix update failure when writing ppt or doc files. Signed-off-by: Namjae Jeon <[email protected]> Reviewed-By: Tom Talpey <[email protected]> Signed-off-by: Steve French <[email protected]>
2021-09-21ksmbd: log that server is experimental at module loadSteve French1-0/+3
While we are working through detailed security reviews of ksmbd server code we should remind users that it is an experimental module by adding a warning when the module loads. Currently the module shows as experimental in Kconfig and is disabled by default, but we don't want to confuse users. Although ksmbd passes a wide variety of the important functional tests (since initial focus had been largely on functional testing such as smbtorture, xfstests etc.), and ksmbd has added key security features (e.g. GCM256 encryption, Kerberos support), there are ongoing detailed reviews of the code base for path processing and network buffer decoding, and this patch reminds users that the module should be considered "experimental." Reviewed-by: Namjae Jeon <[email protected]> Reviewed-by: Paulo Alcantara (SUSE) <[email protected]> Reviewed-by: Ronnie Sahlberg <[email protected]> Signed-off-by: Steve French <[email protected]>
2021-09-21kselftest/arm64: signal: Skip tests if required features are missingCristian Marussi1-2/+5
During initialization of a signal testcase, features declared as required are properly checked against the running system but no action is then taken to effectively skip such a testcase. Fix core signals test logic to abort initialization and report such a testcase as skipped to the KSelfTest framework. Fixes: f96bf4340316 ("kselftest: arm64: mangle_pstate_invalid_compat_toggle and common utils") Signed-off-by: Cristian Marussi <[email protected]> Reviewed-by: Mark Brown <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]>
2021-09-21Merge tag 's390-5.15-ebpf-jit-fixes' of ↵Linus Torvalds1-32/+38
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 eBPF fixes from Vasily Gorbik: "Johan Almbladh has implemented a number of new testcases for eBPF [1], which uncovered three miscompilation issues in the s390 eBPF JIT" Link: https://lore.kernel.org/bpf/[email protected]/ [1] * tag 's390-5.15-ebpf-jit-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/bpf: Fix optimizing out zero-extensions s390/bpf: Fix 64-bit subtraction of the -0x80000000 constant s390/bpf: Fix branch shortening during codegen pass
2021-09-21comedi: Fix memory leak in compat_insnlist()Ian Abbott1-0/+1
`compat_insnlist()` handles the 32-bit version of the `COMEDI_INSNLIST` ioctl (whenwhen `CONFIG_COMPAT` is enabled). It allocates memory to temporarily hold an array of `struct comedi_insn` converted from the 32-bit version in user space. This memory is only being freed if there is a fault while filling the array, otherwise it is leaked. Add a call to `kfree()` to fix the leak. Fixes: b8d47d881305 ("comedi: get rid of compat_alloc_user_space() mess in COMEDI_INSNLIST compat") Cc: Al Viro <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: [email protected] Cc: <[email protected]> # 5.13+ Signed-off-by: Ian Abbott <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2021-09-21ceph: fix off by one bugs in unsafe_request_wait()Dan Carpenter1-2/+2
The "> max" tests should be ">= max" to prevent an out of bounds access on the next lines. Fixes: e1a4541ec0b9 ("ceph: flush the mdlog before waiting on unsafe reqs") Signed-off-by: Dan Carpenter <[email protected]> Reviewed-by: Ilya Dryomov <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2021-09-21nvmem: NVMEM_NINTENDO_OTP should depend on WIIGeert Uytterhoeven1-0/+1
The Nintendo Wii and Wii U OTP is only present on Nintendo Wii and Wii U consoles. Hence add a dependency on WII, to prevent asking the user about this driver when configuring a kernel without Nintendo Wii and Wii U console support. Fixes: 3683b761fe3a10ad ("nvmem: nintendo-otp: Add new driver for the Wii and Wii U OTP") Reviewed-by: Emmanuel Gil Peyrot <[email protected]> Signed-off-by: Geert Uytterhoeven <[email protected]> Link: https://lore.kernel.org/r/01318920709dddc4d85fe895e2083ca0eee234d8.1631611652.git.geert+renesas@glider.be Signed-off-by: Greg Kroah-Hartman <[email protected]>
2021-09-21qnx4: work around gcc false positive warning bugLinus Torvalds1-9/+27
In commit b7213ffa0e58 ("qnx4: avoid stringop-overread errors") I tried to teach gcc about how the directory entry structure can be two different things depending on a status flag. It made the code clearer, and it seemed to make gcc happy. However, Arnd points to a gcc bug, where despite using two different members of a union, gcc then gets confused, and uses the size of one of the members to decide if a string overrun happens. And not necessarily the rigth one. End result: with some configurations, gcc-11 will still complain about the source buffer size being overread: fs/qnx4/dir.c: In function 'qnx4_readdir': fs/qnx4/dir.c:76:32: error: 'strnlen' specified bound [16, 48] exceeds source size 1 [-Werror=stringop-overread] 76 | size = strnlen(name, size); | ^~~~~~~~~~~~~~~~~~~ fs/qnx4/dir.c:26:22: note: source object declared here 26 | char de_name; | ^~~~~~~ because gcc will get confused about which union member entry is actually getting accessed, even when the source code is very clear about it. Gcc internally will have combined two "redundant" pointers (pointing to different union elements that are at the same offset), and takes the size checking from one or the other - not necessarily the right one. This is clearly a gcc bug, but we can work around it fairly easily. The biggest thing here is the big honking comment about why we do what we do. Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99578#c6 Reported-and-tested-by: Arnd Bergmann <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-21usb: musb: tusb6010: uninitialized data in tusb_fifo_write_unaligned()Dan Carpenter1-0/+1
This is writing to the first 1 - 3 bytes of "val" and then writing all four bytes to musb_writel(). The last byte is always going to be garbage. Zero out the last bytes instead. Fixes: 550a7375fe72 ("USB: Add MUSB and TUSB support") Signed-off-by: Dan Carpenter <[email protected]> Cc: stable <[email protected]> Link: https://lore.kernel.org/r/20210916135737.GI25094@kili Signed-off-by: Greg Kroah-Hartman <[email protected]>
2021-09-21usb-storage: Add quirk for ScanLogic SL11R-IDE older than 2.6cOndrej Zary1-1/+8
ScanLogic SL11R-IDE with firmware older than 2.6c (the latest one) has broken tag handling, preventing the device from working at all: usb 1-1: new full-speed USB device number 2 using uhci_hcd usb 1-1: New USB device found, idVendor=04ce, idProduct=0002, bcdDevice= 2.60 usb 1-1: New USB device strings: Mfr=1, Product=1, SerialNumber=0 usb 1-1: Product: USB Device usb 1-1: Manufacturer: USB Device usb-storage 1-1:1.0: USB Mass Storage device detected scsi host2: usb-storage 1-1:1.0 usbcore: registered new interface driver usb-storage usb 1-1: reset full-speed USB device number 2 using uhci_hcd usb 1-1: reset full-speed USB device number 2 using uhci_hcd usb 1-1: reset full-speed USB device number 2 using uhci_hcd usb 1-1: reset full-speed USB device number 2 using uhci_hcd Add US_FL_BULK_IGNORE_TAG to fix it. Also update my e-mail address. 2.6c is the only firmware that claims Linux compatibility. The firmware can be upgraded using ezotgdbg utility: https://github.com/asciilifeform/ezotgdbg Acked-by: Alan Stern <[email protected]> Signed-off-by: Ondrej Zary <[email protected]> Cc: stable <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2021-09-21Re-enable UAS for LaCie Rugged USB3-FW with fk quirkJulian Sikorski1-1/+1
Further testing has revealed that LaCie Rugged USB3-FW does work with uas as long as US_FL_NO_REPORT_OPCODES and US_FL_NO_SAME are enabled. Link: https://lore.kernel.org/linux-usb/[email protected]/ Cc: stable <[email protected]> Suggested-by: Hans de Goede <[email protected]> Reviewed-by: Hans de Goede <[email protected]> Acked-by: Oliver Neukum <[email protected]> Signed-off-by: Julian Sikorski <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2021-09-21misc: bcm-vk: fix tty registration raceJohan Hovold1-3/+3
Make sure to set the tty class-device driver data before registering the tty to avoid having a racing open() dereference a NULL pointer. Fixes: 91ca10d6fa07 ("misc: bcm-vk: add ttyVK support") Cc: [email protected] # 5.12 Signed-off-by: Johan Hovold <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2021-09-21arm64: Mitigate MTE issues with str{n}cmp()Robin Murphy4-2/+9
As with strlen(), the patches importing the updated str{n}cmp() implementations were originally developed and tested before the advent of CONFIG_KASAN_HW_TAGS, and have subsequently revealed not to be MTE-safe. Since in-kernel MTE is still a rather niche case, let it temporarily fall back to the generic C versions for correctness until we can figure out the best fix. Fixes: 758602c04409 ("arm64: Import latest version of Cortex Strings' strcmp") Fixes: 020b199bc70d ("arm64: Import latest version of Cortex Strings' strncmp") Cc: <[email protected]> # 5.14.x Reported-by: Branislav Rankov <[email protected]> Signed-off-by: Robin Murphy <[email protected]> Acked-by: Mark Rutland <[email protected]> Link: https://lore.kernel.org/r/34dc4d12eec0adae49b0ac927df642ed10089d40.1631890770.git.robin.murphy@arm.com Signed-off-by: Catalin Marinas <[email protected]>
2021-09-21platform/x86: gigabyte-wmi: add support for B550I Aorus Pro AXTobias Jakobi1-0/+1
Tested with a AMD Ryzen 7 5800X. Signed-off-by: Tobias Jakobi <[email protected]> Acked-by: Thomas Weißschuh <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Hans de Goede <[email protected]>
2021-09-21platform/x86/intel: hid: Add DMI switches allow listJosé Expósito1-5/+22
Some devices, even non convertible ones, can send incorrect SW_TABLET_MODE reports. Add an allow list and accept such reports only from devices in it. Bug reported for Dell XPS 17 9710 on: https://gitlab.freedesktop.org/libinput/libinput/-/issues/662 Reported-by: Tobias Gurtzick <[email protected]> Suggested-by: Hans de Goede <[email protected]> Tested-by: Tobias Gurtzick <[email protected]> Signed-off-by: José Expósito <[email protected]> Link: https://lore.kernel.org/r/[email protected] [[email protected]: Check dmi_switches_auto_add_allow_list only once] Signed-off-by: Hans de Goede <[email protected]>
2021-09-21platform/x86: dell: fix DELL_WMI_PRIVACY dependencies & build errorRandy Dunlap1-2/+1
When DELL_WMI=y, DELL_WMI_PRIVACY=y, and LEDS_TRIGGER_AUDIO=m, there is a linker error since the LEDS trigger code is built as a loadable module. This happens because DELL_WMI_PRIVACY is a bool that depends on a tristate (LEDS_TRIGGER_AUDIO=m), which can be dangerous. ld: drivers/platform/x86/dell/dell-wmi-privacy.o: in function `dell_privacy_wmi_probe': dell-wmi-privacy.c:(.text+0x3df): undefined reference to `ledtrig_audio_get' Fixes: 8af9fa37b8a3 ("platform/x86: dell-privacy: Add support for Dell hardware privacy") Signed-off-by: Randy Dunlap <[email protected]> Cc: Perry Yuan <[email protected]> Cc: [email protected] Cc: [email protected] Cc: Hans de Goede <[email protected]> Cc: Mark Gross <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Hans de Goede <[email protected]> Signed-off-by: Hans de Goede <[email protected]>
2021-09-21thermal/drivers/tsens: Fix wrong check for tzd in irq handlersAnsuel Smith1-2/+2
Some devices can have some thermal sensors disabled from the factory. The current two irq handler functions check all the sensor by default and the check if the sensor was actually registered is wrong. The tzd is actually never set if the registration fails hence the IS_ERR check is wrong. Signed-off-by: Ansuel Smith <[email protected]> Reviewed-by: Matthias Kaehlcke <[email protected]> Signed-off-by: Daniel Lezcano <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-09-21thermal/core: Potential buffer overflow in thermal_build_list_of_policies()Dan Carpenter1-4/+3
After printing the list of thermal governors, then this function prints a newline character. The problem is that "size" has not been updated after printing the last governor. This means that it can write one character (the NUL terminator) beyond the end of the buffer. Get rid of the "size" variable and just use "PAGE_SIZE - count" directly. Fixes: 1b4f48494eb2 ("thermal: core: group functions related to governor handling") Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Daniel Lezcano <[email protected]> Link: https://lore.kernel.org/r/20210916131342.GB25094@kili
2021-09-21Merge branch 'dsa-devres'David S. Miller2-4/+10
Vladimir Oltean says: ==================== Fix mdiobus users with devres Commit ac3a68d56651 ("net: phy: don't abuse devres in devm_mdiobus_register()") by Bartosz Golaszewski has introduced two classes of potential bugs by making the devres callback of devm_mdiobus_alloc stop calling mdiobus_unregister. The exact buggy circumstances are presented in the individual commit messages. I have searched the tree for other occurrences, but at the moment: - for issue (a) I have no concrete proof that other buses except SPI and I2C suffer from it, and the only SPI or I2C device drivers that call of_mdiobus_alloc are the DSA drivers that leave a NULL ds->slave_mii_bus and a non-NULL ds->ops->phy_read, aka ksz9477, ksz8795, lan9303_i2c, vsc73xx-spi. - for issue (b), all drivers which call of_mdiobus_alloc either use of_mdiobus_register too, or call mdiobus_unregister sometime within the ->remove path. Although at this point I've seen enough strangeness caused by this "device_del during ->shutdown" that I'm just going to copy the SPI and I2C subsystem maintainers to this patch series, to get their feedback whether they've had reports about things like this before. I don't think other buses behave in this way, it forces SPI and I2C devices to have to protect themselves from a really strange set of issues. ==================== Signed-off-by: David S. Miller <[email protected]>
2021-09-21net: dsa: realtek: register the MDIO bus under devresVladimir Oltean1-1/+1
The Linux device model permits both the ->shutdown and ->remove driver methods to get called during a shutdown procedure. Example: a DSA switch which sits on an SPI bus, and the SPI bus driver calls this on its ->shutdown method: spi_unregister_controller -> device_for_each_child(&ctlr->dev, NULL, __unregister); -> spi_unregister_device(to_spi_device(dev)); -> device_del(&spi->dev); So this is a simple pattern which can theoretically appear on any bus, although the only other buses on which I've been able to find it are I2C: i2c_del_adapter -> device_for_each_child(&adap->dev, NULL, __unregister_client); -> i2c_unregister_device(client); -> device_unregister(&client->dev); The implication of this pattern is that devices on these buses can be unregistered after having been shut down. The drivers for these devices might choose to return early either from ->remove or ->shutdown if the other callback has already run once, and they might choose that the ->shutdown method should only perform a subset of the teardown done by ->remove (to avoid unnecessary delays when rebooting). So in other words, the device driver may choose on ->remove to not do anything (therefore to not unregister an MDIO bus it has registered on ->probe), because this ->remove is actually triggered by the device_shutdown path, and its ->shutdown method has already run and done the minimally required cleanup. This used to be fine until the blamed commit, but now, the following BUG_ON triggers: void mdiobus_free(struct mii_bus *bus) { /* For compatibility with error handling in drivers. */ if (bus->state == MDIOBUS_ALLOCATED) { kfree(bus); return; } BUG_ON(bus->state != MDIOBUS_UNREGISTERED); bus->state = MDIOBUS_RELEASED; put_device(&bus->dev); } In other words, there is an attempt to free an MDIO bus which was not unregistered. The attempt to free it comes from the devres release callbacks of the SPI device, which are executed after the device is unregistered. I'm not saying that the fact that MDIO buses allocated using devres would automatically get unregistered wasn't strange. I'm just saying that the commit didn't care about auditing existing call paths in the kernel, and now, the following code sequences are potentially buggy: (a) devm_mdiobus_alloc followed by plain mdiobus_register, for a device located on a bus that unregisters its children on shutdown. After the blamed patch, either both the alloc and the register should use devres, or none should. (b) devm_mdiobus_alloc followed by plain mdiobus_register, and then no mdiobus_unregister at all in the remove path. After the blamed patch, nobody unregisters the MDIO bus anymore, so this is even more buggy than the previous case which needs a specific bus configuration to be seen, this one is an unconditional bug. In this case, the Realtek drivers fall under category (b). To solve it, we can register the MDIO bus under devres too, which restores the previous behavior. Fixes: ac3a68d56651 ("net: phy: don't abuse devres in devm_mdiobus_register()") Reported-by: Lino Sanfilippo <[email protected]> Reported-by: Alvin Šipraga <[email protected]> Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-09-21net: dsa: don't allocate the slave_mii_bus using devresVladimir Oltean1-3/+9
The Linux device model permits both the ->shutdown and ->remove driver methods to get called during a shutdown procedure. Example: a DSA switch which sits on an SPI bus, and the SPI bus driver calls this on its ->shutdown method: spi_unregister_controller -> device_for_each_child(&ctlr->dev, NULL, __unregister); -> spi_unregister_device(to_spi_device(dev)); -> device_del(&spi->dev); So this is a simple pattern which can theoretically appear on any bus, although the only other buses on which I've been able to find it are I2C: i2c_del_adapter -> device_for_each_child(&adap->dev, NULL, __unregister_client); -> i2c_unregister_device(client); -> device_unregister(&client->dev); The implication of this pattern is that devices on these buses can be unregistered after having been shut down. The drivers for these devices might choose to return early either from ->remove or ->shutdown if the other callback has already run once, and they might choose that the ->shutdown method should only perform a subset of the teardown done by ->remove (to avoid unnecessary delays when rebooting). So in other words, the device driver may choose on ->remove to not do anything (therefore to not unregister an MDIO bus it has registered on ->probe), because this ->remove is actually triggered by the device_shutdown path, and its ->shutdown method has already run and done the minimally required cleanup. This used to be fine until the blamed commit, but now, the following BUG_ON triggers: void mdiobus_free(struct mii_bus *bus) { /* For compatibility with error handling in drivers. */ if (bus->state == MDIOBUS_ALLOCATED) { kfree(bus); return; } BUG_ON(bus->state != MDIOBUS_UNREGISTERED); bus->state = MDIOBUS_RELEASED; put_device(&bus->dev); } In other words, there is an attempt to free an MDIO bus which was not unregistered. The attempt to free it comes from the devres release callbacks of the SPI device, which are executed after the device is unregistered. I'm not saying that the fact that MDIO buses allocated using devres would automatically get unregistered wasn't strange. I'm just saying that the commit didn't care about auditing existing call paths in the kernel, and now, the following code sequences are potentially buggy: (a) devm_mdiobus_alloc followed by plain mdiobus_register, for a device located on a bus that unregisters its children on shutdown. After the blamed patch, either both the alloc and the register should use devres, or none should. (b) devm_mdiobus_alloc followed by plain mdiobus_register, and then no mdiobus_unregister at all in the remove path. After the blamed patch, nobody unregisters the MDIO bus anymore, so this is even more buggy than the previous case which needs a specific bus configuration to be seen, this one is an unconditional bug. In this case, DSA falls into category (a), it tries to be helpful and registers an MDIO bus on behalf of the switch, which might be on such a bus. I've no idea why it does it under devres. It does this on probe: if (!ds->slave_mii_bus && ds->ops->phy_read) alloc and register mdio bus and this on remove: if (ds->slave_mii_bus && ds->ops->phy_read) unregister mdio bus I _could_ imagine using devres because the condition used on remove is different than the condition used on probe. So strictly speaking, DSA cannot determine whether the ds->slave_mii_bus it sees on remove is the ds->slave_mii_bus that _it_ has allocated on probe. Using devres would have solved that problem. But nonetheless, the existing code already proceeds to unregister the MDIO bus, even though it might be unregistering an MDIO bus it has never registered. So I can only guess that no driver that implements ds->ops->phy_read also allocates and registers ds->slave_mii_bus itself. So in that case, if unregistering is fine, freeing must be fine too. Stop using devres and free the MDIO bus manually. This will make devres stop attempting to free a still registered MDIO bus on ->shutdown. Fixes: ac3a68d56651 ("net: phy: don't abuse devres in devm_mdiobus_register()") Reported-by: Lino Sanfilippo <[email protected]> Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Tested-by: Lino Sanfilippo <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-09-21arm64: add MTE supported check to thread switching and syscall entry/exitPeter Collingbourne2-6/+10
This lets us avoid doing unnecessary work on hardware that does not support MTE, and will allow us to freely use MTE instructions in the code called by mte_thread_switch(). Since this would mean that we do a redundant check in mte_check_tfsr_el1(), remove it and add two checks now required in its callers. This also avoids an unnecessary DSB+ISB sequence on the syscall exit path for hardware not supporting MTE. Fixes: 65812c6921cc ("arm64: mte: Enable async tag check fault") Cc: <[email protected]> # 5.13.x Signed-off-by: Peter Collingbourne <[email protected]> Link: https://linux-review.googlesource.com/id/I02fd000d1ef2c86c7d2952a7f099b254ec227a5d Link: https://lore.kernel.org/r/[email protected] [[email protected]: adjust the commit log slightly] Signed-off-by: Catalin Marinas <[email protected]>
2021-09-21drm/i915: Free all DMC payloadsChris Wilson1-1/+4
Free all the DMC payloads, not just DMC_MAIN. unreferenced object 0xffff88ff32d4d800 (size 1024): comm "kworker/1:5", pid 701, jiffies 4294904239 (age 109.736s) hex dump (first 32 bytes): 40 40 00 0c 03 00 00 00 00 00 00 00 00 00 00 00 @@.............. 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000ba9d0d95>] dmc_load_work_fn+0x34d/0x510 [i915] [<000000001049fcab>] process_one_work+0x261/0x550 [<00000000eeb995ac>] worker_thread+0x49/0x3c0 [<0000000021031dc3>] kthread+0x10b/0x140 [<000000004a0f69ee>] ret_from_fork+0x1f/0x30 unreferenced object 0xffff88ff0bde4000 (size 1024): comm "kworker/0:3", pid 708, jiffies 4294904469 (age 108.816s) hex dump (first 32 bytes): 40 40 00 0c 01 00 00 00 00 00 00 00 00 00 00 00 @@.............. 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000ba9d0d95>] dmc_load_work_fn+0x34d/0x510 [i915] [<000000001049fcab>] process_one_work+0x261/0x550 [<00000000eeb995ac>] worker_thread+0x49/0x3c0 [<0000000021031dc3>] kthread+0x10b/0x140 [<000000004a0f69ee>] ret_from_fork+0x1f/0x30 Fixes: 3d5928a168a9 ("drm/i915/xelpd: Pipe A DMC plugging") Cc: Anusha Srivatsa <[email protected]> Cc: José Roberto de Souza <[email protected]> Signed-off-by: Chris Wilson <[email protected]> Signed-off-by: Lucas De Marchi <[email protected]> Reviewed-by: José Roberto de Souza <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit 064b877dff4252ced91a1c8b1f129073f2991f6e) Signed-off-by: Jani Nikula <[email protected]>
2021-09-21drm/i915: Move __i915_gem_free_object to ttm_bo_destroyMaarten Lankhorst1-4/+5
When we implement delayed destroy, we may have a second call to the delete_mem_notify() handler, while free_object() only should be called once. Move it to bo->destroy(), to ensure it's only called once. This fixes some weird memory corruption issues with delayed destroy when async eviction is used. Signed-off-by: Maarten Lankhorst <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Fixes: 213d50927763 ("drm/i915/ttm: Introduce a TTM i915 gem object backend") Cc: Thomas Hellström <[email protected]> Reviewed-by: Thomas Hellström <[email protected]> (cherry picked from commit 48b0961269546716c3232748bf37e64e49fb866c) Signed-off-by: Jani Nikula <[email protected]>
2021-09-21drm/i915: Update memory bandwidth parametersRadhakrishna Sripada1-3/+16
Earlier while calculating derated bw we would use 90% of the calculated bw. Starting ADL-P we use a non standard derating. Updating the formulae to reflect the same. Bspec: 64631 v2: Use the new derating value only for ADL-P(MattR) Fixes: 4d32fe2f14a7 ("drm/i915/adl_p: Update memory bandwidth parameters") Cc: Matt Roper <[email protected]> Signed-off-by: Radhakrishna Sripada <[email protected]> Reviewed-by: Matt Roper <[email protected]> Signed-off-by: Matt Roper <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit f6d66fc8cf5f673ea76407be84dc17dbb3eda108) Signed-off-by: Jani Nikula <[email protected]>
2021-09-21Doc: networking: Fox a typo in ice.rstMasanari Iida1-1/+1
This patch fixes a spelling typo in ice.rst Signed-off-by: Masanari Iida <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-09-21net: dsa: fix dsa_tree_setup error pathVladimir Oltean1-0/+1
Since the blamed commit, dsa_tree_teardown_switches() was split into two smaller functions, dsa_tree_teardown_switches and dsa_tree_teardown_ports. However, the error path of dsa_tree_setup stopped calling dsa_tree_teardown_ports. Fixes: a57d8c217aad ("net: dsa: flush switchdev workqueue before tearing down CPU/DSA ports") Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-09-21Merge branch 'smc-fixes'David S. Miller2-1/+4
Karsten Graul says: ==================== net/smc: fixes 2021-09-20 Please apply the following patches for smc to netdev's net tree. The first patch adds a missing error check, and the second patch fixes a possible leak of a lock in a worker. ==================== Signed-off-by: David S. Miller <[email protected]>
2021-09-21net/smc: fix 'workqueue leaked lock' in smc_conn_abort_workKarsten Graul1-0/+2
The abort_work is scheduled when a connection was detected to be out-of-sync after a link failure. The work calls smc_conn_kill(), which calls smc_close_active_abort() and that might end up calling smc_close_cancel_work(). smc_close_cancel_work() cancels any pending close_work and tx_work but needs to release the sock_lock before and acquires the sock_lock again afterwards. So when the sock_lock was NOT acquired before then it may be held after the abort_work completes. Thats why the sock_lock is acquired before the call to smc_conn_kill() in __smc_lgr_terminate(), but this is missing in smc_conn_abort_work(). Fix that by acquiring the sock_lock first and release it after the call to smc_conn_kill(). Fixes: b286a0651e44 ("net/smc: handle incoming CDC validation message") Signed-off-by: Karsten Graul <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-09-21net/smc: add missing error check in smc_clc_prfx_set()Karsten Graul1-1/+2
Coverity stumbled over a missing error check in smc_clc_prfx_set(): *** CID 1475954: Error handling issues (CHECKED_RETURN) /net/smc/smc_clc.c: 233 in smc_clc_prfx_set() >>> CID 1475954: Error handling issues (CHECKED_RETURN) >>> Calling "kernel_getsockname" without checking return value (as is done elsewhere 8 out of 10 times). 233 kernel_getsockname(clcsock, (struct sockaddr *)&addrs); Add the return code check in smc_clc_prfx_set(). Fixes: c246d942eabc ("net/smc: restructure netinfo for CLC proposal msgs") Reported-by: Julian Wiedmann <[email protected]> Signed-off-by: Karsten Graul <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-09-21x86/setup: Call early_reserve_memory() earlierJuergen Gross1-12/+14
Commit in Fixes introduced early_reserve_memory() to do all needed initial memblock_reserve() calls in one function. Unfortunately, the call of early_reserve_memory() is done too late for Xen dom0, as in some cases a Xen hook called by e820__memory_setup() will need those memory reservations to have happened already. Move the call of early_reserve_memory() before the call of e820__memory_setup() in order to avoid such problems. Fixes: a799c2bd29d1 ("x86/setup: Consolidate early memory reservations") Reported-by: Marek Marczykowski-Górecki <[email protected]> Signed-off-by: Juergen Gross <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Tested-by: Marek Marczykowski-Górecki <[email protected]> Tested-by: Nathan Chancellor <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2021-09-21xen/x86: fix PV trap handling on secondary processorsJan Beulich1-6/+9
The initial observation was that in PV mode under Xen 32-bit user space didn't work anymore. Attempts of system calls ended in #GP(0x402). All of the sudden the vector 0x80 handler was not in place anymore. As it turns out up to 5.13 redundant initialization did occur: Once from cpu_initialize_context() (through its VCPUOP_initialise hypercall) and a 2nd time while each CPU was brought fully up. This 2nd initialization is now gone, uncovering that the 1st one was flawed: Unlike for the set_trap_table hypercall, a full virtual IDT needs to be specified here; the "vector" fields of the individual entries are of no interest. With many (kernel) IDT entries still(?) (i.e. at that point at least) empty, the syscall vector 0x80 ended up in slot 0x20 of the virtual IDT, thus becoming the domain's handler for vector 0x20. Make xen_convert_trap_info() fit for either purpose, leveraging the fact that on the xen_copy_trap_info() path the table starts out zero-filled. This includes moving out the writing of the sentinel, which would also have lead to a buffer overrun in the xen_copy_trap_info() case if all (kernel) IDT entries were populated. Convert the writing of the sentinel to clearing of the entire table entry rather than just the address field. (I didn't bother trying to identify the commit which uncovered the issue in 5.14; the commit named below is the one which actually introduced the bad code.) Fixes: f87e4cac4f4e ("xen: SMP guest support") Cc: [email protected] Signed-off-by: Jan Beulich <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Juergen Gross <[email protected]>
2021-09-21xen/balloon: fix balloon kthread freezingJuergen Gross1-2/+2
Commit 8480ed9c2bbd56 ("xen/balloon: use a kernel thread instead a workqueue") switched the Xen balloon driver to use a kernel thread. Unfortunately the patch omitted to call try_to_freeze() or to use wait_event_freezable_timeout(), causing a system suspend to fail. Fixes: 8480ed9c2bbd56 ("xen/balloon: use a kernel thread instead a workqueue") Signed-off-by: Juergen Gross <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Juergen Gross <[email protected]>
2021-09-21nvme: keep ctrl->namespaces orderedChristoph Hellwig1-16/+17
Various places in the nvme code that rely on ctrl->namespace to be ordered. Ensure that the namespae is inserted into the list at the right position from the start instead of sorting it after the fact. Fixes: 540c801c65eb ("NVMe: Implement namespace list scanning") Reported-by: Anton Eidelman <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Keith Busch <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Reviewed-by: Damien Le Moal <[email protected]>
2021-09-21nvme-tcp: fix incorrect h2cdata pdu offset accountingSagi Grimberg1-3/+10
When the controller sends us multiple r2t PDUs in a single request we need to account for it correctly as our send/recv context run concurrently (i.e. we get a new r2t with r2t_offset before we updated our iterator and req->data_sent marker). This can cause wrong offsets to be sent to the controller. To fix that, we will first know that this may happen only in the send sequence of the last page, hence we will take the r2t_offset to the h2c PDU data_offset, and in nvme_tcp_try_send_data loop, we make sure to increment the request markers also when we completed a PDU but we are expecting more r2t PDUs as we still did not send the entire data of the request. Fixes: 825619b09ad3 ("nvme-tcp: fix possible use-after-completion") Reported-by: Nowak, Lukasz <[email protected]> Tested-by: Nowak, Lukasz <[email protected]> Signed-off-by: Sagi Grimberg <[email protected]> Reviewed-by: Keith Busch <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2021-09-21nvme-fc: remove freeze/unfreeze around update_nr_hw_queuesJames Smart1-2/+0
Remove the freeze/unfreeze around changes to the number of hardware queues. Study and retest has indicated there are no ios that can be active at this point so there is nothing to freeze. nvme-fc is draining the queues in the shutdown and error recovery path in __nvme_fc_abort_outstanding_ios. This patch primarily reverts 88e837ed0f1f "nvme-fc: wait for queues to freeze before calling update_hr_hw_queues". It's not an exact revert as it leaves the adjusting of hw queues only if the count changes. Signed-off-by: James Smart <[email protected]> [dwagner: added explanation why no IO is pending] Signed-off-by: Daniel Wagner <[email protected]> Reviewed-by: Ming Lei <[email protected]> Reviewed-by: Himanshu Madhani <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2021-09-21nvme-fc: avoid race between time out and tear downJames Smart1-0/+2
To avoid race between time out and tear down, in tear down process, first we quiesce the queue, and then delete the timer and cancel the time out work for the queue. This patch merges the admin and io sync ops into the queue teardown logic as shown in the RDMA patch 3017013dcc "nvme-rdma: avoid race between time out and tear down". There is no teardown_lock in nvme-fc. Signed-off-by: James Smart <[email protected]> Tested-by: Daniel Wagner <[email protected]> Reviewed-by: Himanshu Madhani <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Reviewed-by: Daniel Wagner <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2021-09-21nvme-fc: update hardware queues before using themDaniel Wagner1-8/+8
In case the number of hardware queues changes, we need to update the tagset and the mapping of ctx to hctx first. If we try to create and connect the I/O queues first, this operation will fail (target will reject the connect call due to the wrong number of queues) and hence we bail out of the recreate function. Then we will to try the very same operation again, thus we don't make any progress. Signed-off-by: Daniel Wagner <[email protected]> Reviewed-by: Ming Lei <[email protected]> Reviewed-by: Himanshu Madhani <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Reviewed-by: James Smart <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2021-09-20Merge tag 'afs-fixes-20210913' of ↵Linus Torvalds17-120/+294
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull AFS fixes from David Howells: "Fixes for AFS problems that can cause data corruption due to interaction with another client modifying data cached locally: - When d_revalidating a dentry, don't look at the inode to which it points. Only check the directory to which the dentry belongs. This was confusing things and causing the silly-rename cleanup code to remove the file now at the dentry of a file that got deleted. - Fix mmap data coherency. When a callback break is received that relates to a file that we have cached, the data content may have been changed (there are other reasons, such as the user's rights having been changed). However, we're checking it lazily, only on entry to the kernel, which doesn't happen if we have a writeable shared mapped page on that file. We make the kernel keep track of mmapped files and clear all PTEs mapping to that file as soon as the callback comes in by calling unmap_mapping_pages() (we don't necessarily want to zap the pagecache). This causes the kernel to be reentered when userspace tries to access the mmapped address range again - and at that point we can query the server and, if we need to, zap the page cache. Ideally, I would check each file at the point of notification, but that involves poking the server[*] - which is holding an exclusive lock on the vnode it is changing, waiting for all the clients it notified to reply. This could then deadlock against the server. Further, invalidating the pagecache might call ->launder_page(), which would try to write to the file, which would definitely deadlock. (AFS doesn't lease file access). [*] Checking to see if the file content has changed is a matter of comparing the current data version number, but we have to ask the server for that. We also need to get a new callback promise and we need to poke the server for that too. - Add some more points at which the inode is validated, since we're doing it lazily, notably in ->read_iter() and ->page_mkwrite(), but also when performing some directory operations. Ideally, checking in ->read_iter() would be done in some derivation of filemap_read(). If we're going to call the server to read the file, then we get the file status fetch as part of that. - The above is now causing us to make a lot more calls to afs_validate() to check the inode - and afs_validate() takes the RCU read lock each time to make a quick check (ie. afs_check_validity()). This is entirely for the purpose of checking cb_s_break to see if the server we're using reinitialised its list of callbacks - however this isn't a very common event, so most of the time we're taking this needlessly. Add a new cell-wide counter to count the number of reinitialisations done by any server and check that - and only if that changes, take the RCU read lock and check the server list (the server list may change, but the cell a file is part of won't). - Don't update vnode->cb_s_break and ->cb_v_break inside the validity checking loop. The cb_lock is done with read_seqretry, so we might go round the loop a second time after resetting those values - and that could cause someone else checking validity to miss something (I think). Also included are patches for fixes for some bugs encountered whilst debugging this: - Fix a leak of afs_read objects and fix a leak of keys hidden by that. - Fix a leak of pages that couldn't be added to extend a writeback. - Fix the maintenance of i_blocks when i_size is changed by a local write or a local dir edit" Link: https://bugzilla.kernel.org/show_bug.cgi?id=214217 [1] Link: https://lore.kernel.org/r/163111665183.283156.17200205573146438918.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163113612442.352844.11162345591911691150.stgit@warthog.procyon.org.uk/ # i_blocks patch * tag 'afs-fixes-20210913' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: afs: Fix updating of i_blocks on file/dir extension afs: Fix corruption in reads at fpos 2G-4G from an OpenAFS server afs: Try to avoid taking RCU read lock when checking vnode validity afs: Fix mmap coherency vs 3rd-party changes afs: Fix incorrect triggering of sillyrename on 3rd-party invalidation afs: Add missing vnode validation checks afs: Fix page leak afs: Fix missing put on afs_read objects and missing get on the key therein
2021-09-20Merge tag '5.15-rc1-ksmbd' of git://git.samba.org/ksmbdLinus Torvalds4-17/+81
Pull ksmbd server fixes from Steve French: "Three ksmbd fixes, including an important security fix for path processing, and a buffer overflow check, and a trivial fix for incorrect header inclusion" * tag '5.15-rc1-ksmbd' of git://git.samba.org/ksmbd: ksmbd: add validation for FILE_FULL_EA_INFORMATION of smb2_get_info ksmbd: prevent out of share access ksmbd: transport_rdma: Don't include rwlock.h directly
2021-09-20Merge tag '5.15-rc1-smb3' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds48-57/+67
Pull cifs client fixes from Steve French: - two deferred close fixes (for bugs found with xfstests 478 and 461) - a deferred close improvement in rename - two trivial fixes for incorrect Linux comment formatting of multiple cifs files (pointed out by automated kernel test robot and checkpatch) * tag '5.15-rc1-smb3' of git://git.samba.org/sfrench/cifs-2.6: cifs: Not to defer close on file when lock is set cifs: Fix soft lockup during fsstress cifs: Deferred close performance improvements cifs: fix incorrect kernel doc comments cifs: remove pathname for file from SPDX header