aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2023-01-19Merge tag 'usb-serial-6.2-rc5' of ↵Greg Kroah-Hartman2-0/+18
https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus Johan writes: "USB-serial fixes for 6.2-rc5 Here are some new device ids, mostly for Quectel modems. All have been in linux-next with no reported issues." * tag 'usb-serial-6.2-rc5' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial: USB: serial: option: add Quectel EM05CN modem USB: serial: option: add Quectel EM05CN (SG) modem USB: serial: cp210x: add SCALANCE LPE-9000 device id USB: serial: option: add Quectel EC200U modem USB: serial: option: add Quectel EM05-G (RS) modem USB: serial: option: add Quectel EM05-G (GR) modem USB: serial: option: add Quectel EM05-G (CS) modem
2023-01-19Merge tag 'zonefs-6.2-rc5' of ↵Linus Torvalds1-0/+22
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs Pull zonefs fix from Damien Le Moal: - A single patch to fix sync write operations to detect and handle errors due to external zone corruptions resulting in writes at invalid location, from me. * tag 'zonefs-6.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs: zonefs: Detect append writes at invalid locations
2023-01-19io_uring/msg_ring: fix missing lock on overflow for IOPOLLJens Axboe1-9/+30
If the target ring is configured with IOPOLL, then we always need to hold the target ring uring_lock before posting CQEs. We could just grab it unconditionally, but since we don't expect many target rings to be of this type, make grabbing the uring_lock conditional on the ring type. Link: https://lore.kernel.org/io-uring/Y8krlYa52%[email protected]/ Reported-by: Xingyuan Mo <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2023-01-19net: dsa: microchip: ksz9477: port map correction in ALU table entry registerRakesh Sankaranarayanan1-2/+2
ALU table entry 2 register in KSZ9477 have bit positions reserved for forwarding port map. This field is referred in ksz9477_fdb_del() for clearing forward port map and alu table. But current fdb_del refer ALU table entry 3 register for accessing forward port map. Update ksz9477_fdb_del() to get forward port map from correct alu table entry register. With this bug, issue can be observed while deleting static MAC entries. Delete any specific MAC entry using "bridge fdb del" command. This should clear all the specified MAC entries. But it is observed that entries with self static alone are retained. Tested on LAN9370 EVB since ksz9477_fdb_del() is used common across LAN937x and KSZ series. Fixes: b987e98e50ab ("dsa: add DSA switch driver for Microchip KSZ9477") Signed-off-by: Rakesh Sankaranarayanan <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-01-19selftests/net: toeplitz: fix race on tpacket_v3 block closeWillem de Bruijn1-5/+7
Avoid race between process wakeup and tpacket_v3 block timeout. The test waits for cfg_timeout_msec for packets to arrive. Packets arrive in tpacket_v3 rings, which pass packets ("frames") to the process in batches ("blocks"). The sk waits for req3.tp_retire_blk_tov msec to release a block. Set the block timeout lower than the process waiting time, else the process may find that no block has been released by the time it scans the socket list. Convert to a ring of more than one, smaller, blocks with shorter timeouts. Blocks must be page aligned, so >= 64KB. Fixes: 5ebfb4cc3048 ("selftests/net: toeplitz test") Signed-off-by: Willem de Bruijn <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-01-19net/ulp: use consistent error code when blocking ULPPaolo Abeni1-1/+1
The referenced commit changed the error code returned by the kernel when preventing a non-established socket from attaching the ktls ULP. Before to such a commit, the user-space got ENOTCONN instead of EINVAL. The existing self-tests depend on such error code, and the change caused a failure: RUN global.non_established ... tls.c:1673:non_established:Expected errno (22) == ENOTCONN (107) non_established: Test failed at step #3 FAIL global.non_established In the unlikely event existing applications do the same, address the issue by restoring the prior error code in the above scenario. Note that the only other ULP performing similar checks at init time - smc_ulp_ops - also fails with ENOTCONN when trying to attach the ULP to a non-established socket. Reported-by: Sabrina Dubroca <[email protected]> Fixes: 2c02d41d71f9 ("net/ulp: prevent ULP without clone op from entering the LISTEN status") Signed-off-by: Paolo Abeni <[email protected]> Reviewed-by: Sabrina Dubroca <[email protected]> Link: https://lore.kernel.org/r/7bb199e7a93317fb6f8bf8b9b2dc71c18f337cde.1674042685.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <[email protected]>
2023-01-19driver core: Fix test_async_probe_init saves device in wrong arrayChen Zhongjin1-1/+1
In test_async_probe_init, second set of asynchronous devices are saved in sync_dev[sync_id], which should be async_dev[async_id]. This makes these devices not unregistered when exit. > modprobe test_async_driver_probe && \ > modprobe -r test_async_driver_probe && \ > modprobe test_async_driver_probe ... > sysfs: cannot create duplicate filename '/devices/platform/test_async_driver.4' > kobject_add_internal failed for test_async_driver.4 with -EEXIST, don't try to register things with the same name in the same directory. Fixes: 57ea974fb871 ("driver core: Rewrite test_async_driver_probe to cover serialization and NUMA affinity") Signed-off-by: Chen Zhongjin <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2023-01-19w1: fix WARNING after calling w1_process()Yang Yingliang1-1/+3
I got the following WARNING message while removing driver(ds2482): ------------[ cut here ]------------ do not call blocking ops when !TASK_RUNNING; state=1 set at [<000000002d50bfb6>] w1_process+0x9e/0x1d0 [wire] WARNING: CPU: 0 PID: 262 at kernel/sched/core.c:9817 __might_sleep+0x98/0xa0 CPU: 0 PID: 262 Comm: w1_bus_master1 Tainted: G N 6.1.0-rc3+ #307 RIP: 0010:__might_sleep+0x98/0xa0 Call Trace: exit_signals+0x6c/0x550 do_exit+0x2b4/0x17e0 kthread_exit+0x52/0x60 kthread+0x16d/0x1e0 ret_from_fork+0x1f/0x30 The state of task is set to TASK_INTERRUPTIBLE in loop in w1_process(), set it to TASK_RUNNING when it breaks out of the loop to avoid the warning. Fixes: 3c52e4e62789 ("W1: w1_process, block or sleep") Signed-off-by: Yang Yingliang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2023-01-19w1: fix deadloop in __w1_remove_master_device()Yang Yingliang2-3/+4
I got a deadloop report while doing device(ds2482) add/remove test: [ 162.241881] w1_master_driver w1_bus_master1: Waiting for w1_bus_master1 to become free: refcnt=1. [ 163.272251] w1_master_driver w1_bus_master1: Waiting for w1_bus_master1 to become free: refcnt=1. [ 164.296157] w1_master_driver w1_bus_master1: Waiting for w1_bus_master1 to become free: refcnt=1. ... __w1_remove_master_device() can't return, because the dev->refcnt is not zero. w1_add_master_device() | w1_alloc_dev() | atomic_set(&dev->refcnt, 2) | kthread_run() | |__w1_remove_master_device() | kthread_stop() // KTHREAD_SHOULD_STOP is set, | // threadfn(w1_process) won't be | // called. | kthread() | | // refcnt will never be 0, it's deadloop. | while (atomic_read(&dev->refcnt)) {...} After calling w1_add_master_device(), w1_process() is not really invoked, before w1_process() starting, if kthread_stop() is called in __w1_remove_master_device(), w1_process() will never be called, the refcnt can not be decreased, then it causes deadloop in remove function because of non-zero refcnt. We need to make sure w1_process() is really started, so move the set refcnt into w1_process() to fix this problem. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Yang Yingliang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2023-01-19comedi: adv_pci1760: Fix PWM instruction handlingIan Abbott1-1/+1
(Actually, this is fixing the "Read the Current Status" command sent to the device's outgoing mailbox, but it is only currently used for the PWM instructions.) The PCI-1760 is operated mostly by sending commands to a set of Outgoing Mailbox registers, waiting for the command to complete, and reading the result from the Incoming Mailbox registers. One of these commands is the "Read the Current Status" command. The number of this command is 0x07 (see the User's Manual for the PCI-1760 at <https://advdownload.advantech.com/productfile/Downloadfile2/1-11P6653/PCI-1760.pdf>. The `PCI1760_CMD_GET_STATUS` macro defined in the driver should expand to this command number 0x07, but unfortunately it currently expands to 0x03. (Command number 0x03 is not defined in the User's Manual.) Correct the definition of the `PCI1760_CMD_GET_STATUS` macro to fix it. This is used by all the PWM subdevice related instructions handled by `pci1760_pwm_insn_config()` which are probably all broken. The effect of sending the undefined command number 0x03 is not known. Fixes: 14b93bb6bbf0 ("staging: comedi: adv_pci_dio: separate out PCI-1760 support") Cc: <[email protected]> # v4.5+ Signed-off-by: Ian Abbott <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2023-01-19io_uring/msg_ring: move double lock/unlock helpers higher upJens Axboe1-24/+23
In preparation for needing them somewhere else, move them and get rid of the unused 'issue_flags' for the unlock side. No functional changes in this patch. Signed-off-by: Jens Axboe <[email protected]>
2023-01-19tty: serial: qcom_geni: avoid duplicate struct member initArnd Bergmann1-6/+8
When -Woverride-init is enabled in a build, gcc points out that qcom_geni_serial_pm_ops contains conflicting initializers: drivers/tty/serial/qcom_geni_serial.c:1586:20: error: initialized field overwritten [-Werror=override-init] 1586 | .restore = qcom_geni_serial_sys_hib_resume, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/tty/serial/qcom_geni_serial.c:1586:20: note: (near initialization for 'qcom_geni_serial_pm_ops.restore') drivers/tty/serial/qcom_geni_serial.c:1587:17: error: initialized field overwritten [-Werror=override-init] 1587 | .thaw = qcom_geni_serial_sys_hib_resume, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Open-code the initializers with the version that was already used, and use the pm_sleep_ptr() method to deal with unused ones, in place of the __maybe_unused annotation. Fixes: 35781d8356a2 ("tty: serial: qcom-geni-serial: Add support for Hibernation feature") Signed-off-by: Arnd Bergmann <[email protected]> Reviewed-by: Douglas Anderson <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2023-01-19serial: atmel: fix incorrect baudrate setupTobias Schramm1-7/+1
Commit ba47f97a18f2 ("serial: core: remove baud_rates when serial console setup") changed uart_set_options to select the correct baudrate configuration based on the absolute error between requested baudrate and available standard baudrate settings. Prior to that commit the baudrate was selected based on which predefined standard baudrate did not exceed the requested baudrate. This change of selection logic was never reflected in the atmel serial driver. Thus the comment left in the atmel serial driver is no longer accurate. Additionally the manual rounding up described in that comment and applied via (quot - 1) requests an incorrect baudrate. Since uart_set_options uses tty_termios_encode_baud_rate to determine the appropriate baudrate flags this can cause baudrate selection to fail entirely because tty_termios_encode_baud_rate will only select a baudrate if relative error between requested and selected baudrate does not exceed +/-2%. Fix that by requesting actual, exact baudrate used by the serial. Fixes: ba47f97a18f2 ("serial: core: remove baud_rates when serial console setup") Cc: stable <[email protected]> Signed-off-by: Tobias Schramm <[email protected]> Acked-by: Richard Genoud <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2023-01-19tty: fix possible null-ptr-defer in spk_ttyio_releaseGaosheng Cui1-0/+3
Run the following tests on the qemu platform: syzkaller:~# modprobe speakup_audptr input: Speakup as /devices/virtual/input/input4 initialized device: /dev/synth, node (MAJOR 10, MINOR 125) speakup 3.1.6: initialized synth name on entry is: (null) synth probe spk_ttyio_initialise_ldisc failed because tty_kopen_exclusive returned failed (errno -16), then remove the module, we will get a null-ptr-defer problem, as follow: syzkaller:~# modprobe -r speakup_audptr releasing synth audptr BUG: kernel NULL pointer dereference, address: 0000000000000080 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 0 P4D 0 Oops: 0002 [#1] PREEMPT SMP PTI CPU: 2 PID: 204 Comm: modprobe Not tainted 6.1.0-rc6-dirty #1 RIP: 0010:mutex_lock+0x14/0x30 Call Trace: <TASK> spk_ttyio_release+0x19/0x70 [speakup] synth_release.part.6+0xac/0xc0 [speakup] synth_remove+0x56/0x60 [speakup] __x64_sys_delete_module+0x156/0x250 ? fpregs_assert_state_consistent+0x1d/0x50 do_syscall_64+0x37/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd </TASK> Modules linked in: speakup_audptr(-) speakup Dumping ftrace buffer: in_synth->dev was not initialized during modprobe, so we add check for in_synth->dev to fix this bug. Fixes: 4f2a81f3a882 ("speakup: Reference synth from tty and tty from synth") Cc: stable <[email protected]> Signed-off-by: Gaosheng Cui <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Samuel Thibault <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2023-01-19Merge tag 'mlx5-fixes-2023-01-18' of ↵Paolo Abeni12-40/+27
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== This series provides bug fixes to mlx5 driver. * tag 'mlx5-fixes-2023-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net: mlx5: eliminate anonymous module_init & module_exit net/mlx5: E-switch, Fix switchdev mode after devlink reload net/mlx5e: Protect global IPsec ASO net/mlx5e: Remove optimization which prevented update of ESN state net/mlx5e: Set decap action based on attr for sample net/mlx5e: QoS, Fix wrongfully setting parent_element_id on MODIFY_SCHEDULING_ELEMENT net/mlx5: E-switch, Fix setting of reserved fields on MODIFY_SCHEDULING_ELEMENT net/mlx5e: Remove redundant xsk pointer check in mlx5e_mpwrq_validate_xsk net/mlx5e: Avoid false lock dependency warning on tc_ht even more net/mlx5: fix missing mutex_unlock in mlx5_fw_fatal_reporter_err_work() ==================== Link: https://lore.kernel.org/r/ Signed-off-by: Paolo Abeni <[email protected]>
2023-01-19Merge branch 'rework/console-list-lock' into for-linusPetr Mladek3-15/+12
2023-01-19serial: stm32: Merge hard IRQ and threaded IRQ handling into single IRQ handlerMarek Vasut1-27/+4
Requesting an interrupt with IRQF_ONESHOT will run the primary handler in the hard-IRQ context even in the force-threaded mode. The force-threaded mode is used by PREEMPT_RT in order to avoid acquiring sleeping locks (spinlock_t) in hard-IRQ context. This combination makes it impossible and leads to "sleeping while atomic" warnings. Use one interrupt handler for both handlers (primary and secondary) and drop the IRQF_ONESHOT flag which is not needed. Fixes: e359b4411c283 ("serial: stm32: fix threaded interrupt handling") Reviewed-by: Sebastian Andrzej Siewior <[email protected]> Tested-by: Valentin Caron <[email protected]> # V3 Signed-off-by: Marek Vasut <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2023-01-19serial: amba-pl011: fix high priority character transmission in rs486 modeLino Sanfilippo1-4/+4
In RS485 mode the transmission of a high priority character fails since it is written to the data register before the transmitter is enabled. Fix this in pl011_tx_chars() by enabling RS485 transmission before writing the character. Fixes: 8d479237727c ("serial: amba-pl011: add RS485 support") Cc: [email protected] Signed-off-by: Lino Sanfilippo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2023-01-19serial: pch_uart: Pass correct sg to dma_unmap_sg()Ilpo Järvinen1-1/+1
A local variable sg is used to store scatterlist pointer in pch_dma_tx_complete(). The for loop doing Tx byte accounting before dma_unmap_sg() alters sg in its increment statement. Therefore, the pointer passed into dma_unmap_sg() won't match to the one given to dma_map_sg(). To fix the problem, use priv->sg_tx_p directly in dma_unmap_sg() instead of the local variable. Fixes: da3564ee027e ("pch_uart: add multi-scatter processing") Cc: [email protected] Signed-off-by: Ilpo Järvinen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2023-01-19tty: serial: qcom-geni-serial: fix slab-out-of-bounds on RX FIFO bufferKrzysztof Kozlowski1-2/+16
Driver's probe allocates memory for RX FIFO (port->rx_fifo) based on default RX FIFO depth, e.g. 16. Later during serial startup the qcom_geni_serial_port_setup() updates the RX FIFO depth (port->rx_fifo_depth) to match real device capabilities, e.g. to 32. The RX UART handle code will read "port->rx_fifo_depth" number of words into "port->rx_fifo" buffer, thus exceeding the bounds. This can be observed in certain configurations with Qualcomm Bluetooth HCI UART device and KASAN: Bluetooth: hci0: QCA Product ID :0x00000010 Bluetooth: hci0: QCA SOC Version :0x400a0200 Bluetooth: hci0: QCA ROM Version :0x00000200 Bluetooth: hci0: QCA Patch Version:0x00000d2b Bluetooth: hci0: QCA controller version 0x02000200 Bluetooth: hci0: QCA Downloading qca/htbtfw20.tlv bluetooth hci0: Direct firmware load for qca/htbtfw20.tlv failed with error -2 Bluetooth: hci0: QCA Failed to request file: qca/htbtfw20.tlv (-2) Bluetooth: hci0: QCA Failed to download patch (-2) ================================================================== BUG: KASAN: slab-out-of-bounds in handle_rx_uart+0xa8/0x18c Write of size 4 at addr ffff279347d578c0 by task swapper/0/0 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.1.0-rt5-00350-gb2450b7e00be-dirty #26 Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT) Call trace: dump_backtrace.part.0+0xe0/0xf0 show_stack+0x18/0x40 dump_stack_lvl+0x8c/0xb8 print_report+0x188/0x488 kasan_report+0xb4/0x100 __asan_store4+0x80/0xa4 handle_rx_uart+0xa8/0x18c qcom_geni_serial_handle_rx+0x84/0x9c qcom_geni_serial_isr+0x24c/0x760 __handle_irq_event_percpu+0x108/0x500 handle_irq_event+0x6c/0x110 handle_fasteoi_irq+0x138/0x2cc generic_handle_domain_irq+0x48/0x64 If the RX FIFO depth changes after probe, be sure to resize the buffer. Fixes: f9d690b6ece7 ("tty: serial: qcom_geni_serial: Allocate port->rx_fifo buffer in probe") Cc: <[email protected]> Signed-off-by: Krzysztof Kozlowski <[email protected]> Reviewed-by: Jiri Slaby <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2023-01-19device property: fix of node refcount leak in fwnode_graph_get_next_endpoint()Yang Yingliang1-6/+12
The 'parent' returned by fwnode_graph_get_port_parent() with refcount incremented when 'prev' is not NULL, it needs be put when finish using it. Because the parent is const, introduce a new variable to store the returned fwnode, then put it before returning from fwnode_graph_get_next_endpoint(). Fixes: b5b41ab6b0c1 ("device property: Check fwnode->secondary in fwnode_graph_get_next_endpoint()") Signed-off-by: Yang Yingliang <[email protected]> Reviewed-by: Sakari Ailus <[email protected]> Reviewed-by: Andy Shevchenko <[email protected]> Reviewed-and-tested-by: Daniel Scally <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2023-01-19ptdma: pt_core_execute_cmd() should use spinlockEric Pilmore2-4/+5
The interrupt handler (pt_core_irq_handler()) of the ptdma driver can be called from interrupt context. The code flow in this function can lead down to pt_core_execute_cmd() which will attempt to grab a mutex, which is not appropriate in interrupt context and ultimately leads to a kernel panic. The fix here changes this mutex to a spinlock, which has been verified to resolve the issue. Fixes: fa5d823b16a9 ("dmaengine: ptdma: Initial driver for the AMD PTDMA") Signed-off-by: Eric Pilmore <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Vinod Koul <[email protected]>
2023-01-19usb: dwc3: fix extcon dependencyArnd Bergmann1-1/+1
The dwc3 core support now links against the extcon subsystem, so it cannot be built-in when extcon is a loadable module: arm-linux-gnueabi-ld: drivers/usb/dwc3/core.o: in function `dwc3_get_extcon': core.c:(.text+0x572): undefined reference to `extcon_get_edev_by_phandle' arm-linux-gnueabi-ld: core.c:(.text+0x596): undefined reference to `extcon_get_extcon_dev' arm-linux-gnueabi-ld: core.c:(.text+0x5ea): undefined reference to `extcon_find_edev_by_node' There was already a Kconfig dependency in the dual-role support, but this is now needed for the entire dwc3 driver. It is still possible to build dwc3 without extcon, but this prevents it from being set to built-in when extcon is a loadable module. Fixes: d182c2e1bc92 ("usb: dwc3: Don't switch OTG -> peripheral if extcon is present") Signed-off-by: Arnd Bergmann <[email protected]> Reviewed-by: Andy Shevchenko <[email protected]> Acked-by: Thinh Nguyen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2023-01-19octeontx2-pf: Fix the use of GFP_KERNEL in atomic context on rtKevin Hao2-9/+4
The commit 4af1b64f80fb ("octeontx2-pf: Fix lmtst ID used in aura free") uses the get/put_cpu() to protect the usage of percpu pointer in ->aura_freeptr() callback, but it also unnecessarily disable the preemption for the blockable memory allocation. The commit 87b93b678e95 ("octeontx2-pf: Avoid use of GFP_KERNEL in atomic context") tried to fix these sleep inside atomic warnings. But it only fix the one for the non-rt kernel. For the rt kernel, we still get the similar warnings like below. BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper/0 preempt_count: 1, expected: 0 RCU nest depth: 0, expected: 0 3 locks held by swapper/0/1: #0: ffff800009fc5fe8 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock+0x24/0x30 #1: ffff000100c276c0 (&mbox->lock){+.+.}-{3:3}, at: otx2_init_hw_resources+0x8c/0x3a4 #2: ffffffbfef6537e0 (&cpu_rcache->lock){+.+.}-{2:2}, at: alloc_iova_fast+0x1ac/0x2ac Preemption disabled at: [<ffff800008b1908c>] otx2_rq_aura_pool_init+0x14c/0x284 CPU: 20 PID: 1 Comm: swapper/0 Tainted: G W 6.2.0-rc3-rt1-yocto-preempt-rt #1 Hardware name: Marvell OcteonTX CN96XX board (DT) Call trace: dump_backtrace.part.0+0xe8/0xf4 show_stack+0x20/0x30 dump_stack_lvl+0x9c/0xd8 dump_stack+0x18/0x34 __might_resched+0x188/0x224 rt_spin_lock+0x64/0x110 alloc_iova_fast+0x1ac/0x2ac iommu_dma_alloc_iova+0xd4/0x110 __iommu_dma_map+0x80/0x144 iommu_dma_map_page+0xe8/0x260 dma_map_page_attrs+0xb4/0xc0 __otx2_alloc_rbuf+0x90/0x150 otx2_rq_aura_pool_init+0x1c8/0x284 otx2_init_hw_resources+0xe4/0x3a4 otx2_open+0xf0/0x610 __dev_open+0x104/0x224 __dev_change_flags+0x1e4/0x274 dev_change_flags+0x2c/0x7c ic_open_devs+0x124/0x2f8 ip_auto_config+0x180/0x42c do_one_initcall+0x90/0x4dc do_basic_setup+0x10c/0x14c kernel_init_freeable+0x10c/0x13c kernel_init+0x2c/0x140 ret_from_fork+0x10/0x20 Of course, we can shuffle the get/put_cpu() to only wrap the invocation of ->aura_freeptr() as what commit 87b93b678e95 does. But there are only two ->aura_freeptr() callbacks, otx2_aura_freeptr() and cn10k_aura_freeptr(). There is no usage of perpcu variable in the otx2_aura_freeptr() at all, so the get/put_cpu() seems redundant to it. We can move the get/put_cpu() into the corresponding callback which really has the percpu variable usage and avoid the sprinkling of get/put_cpu() in several places. Fixes: 4af1b64f80fb ("octeontx2-pf: Fix lmtst ID used in aura free") Signed-off-by: Kevin Hao <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2023-01-19tcp: avoid the lookup process failing to get sk in ehash tableJason Xing2-6/+19
While one cpu is working on looking up the right socket from ehash table, another cpu is done deleting the request socket and is about to add (or is adding) the big socket from the table. It means that we could miss both of them, even though it has little chance. Let me draw a call trace map of the server side. CPU 0 CPU 1 ----- ----- tcp_v4_rcv() syn_recv_sock() inet_ehash_insert() -> sk_nulls_del_node_init_rcu(osk) __inet_lookup_established() -> __sk_nulls_add_node_rcu(sk, list) Notice that the CPU 0 is receiving the data after the final ack during 3-way shakehands and CPU 1 is still handling the final ack. Why could this be a real problem? This case is happening only when the final ack and the first data receiving by different CPUs. Then the server receiving data with ACK flag tries to search one proper established socket from ehash table, but apparently it fails as my map shows above. After that, the server fetches a listener socket and then sends a RST because it finds a ACK flag in the skb (data), which obeys RST definition in RFC 793. Besides, Eric pointed out there's one more race condition where it handles tw socket hashdance. Only by adding to the tail of the list before deleting the old one can we avoid the race if the reader has already begun the bucket traversal and it would possibly miss the head. Many thanks to Eric for great help from beginning to end. Fixes: 5e0724d027f0 ("tcp/dccp: fix hashdance race for passive sessions") Suggested-by: Eric Dumazet <[email protected]> Signed-off-by: Jason Xing <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Reviewed-by: Kuniyuki Iwashima <[email protected]> Link: https://lore.kernel.org/lkml/[email protected]/ Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2023-01-19EDAC/device: Respect any driver-supplied workqueue polling valueManivannan Sadhasivam1-8/+7
The EDAC drivers may optionally pass the poll_msec value. Use that value if available, else fall back to 1000ms. [ bp: Touchups. ] Fixes: e27e3dac6517 ("drivers/edac: add edac_device class") Reported-by: Luca Weiss <[email protected]> Signed-off-by: Manivannan Sadhasivam <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Tested-by: Steev Klimaszewski <[email protected]> # Thinkpad X13s Tested-by: Andrew Halaney <[email protected]> # sa8540p-ride Cc: <[email protected]> # 4.9 Link: https://lore.kernel.org/r/COZYL8MWN97H.MROQ391BGA09@otso
2023-01-19nvme-pci: fix timeout request state checkKeith Busch1-1/+1
Polling the completion can progress the request state to IDLE, either inline with the completion, or through softirq. Either way, the state may not be COMPLETED, so don't check for that. We only care if the state isn't IN_FLIGHT. This is fixing an issue where the driver aborts an IO that we just completed. Seeing the "aborting" message instead of "polled" is very misleading as to where the timeout problem resides. Fixes: bf392a5dc02a9b ("nvme-pci: Remove tag from process cq") Signed-off-by: Keith Busch <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2023-01-19nvme-apple: only reset the controller when RTKit is runningJanne Grunau1-3/+3
NVMe controller register access hangs indefinitely when the co-processor is not running. A missed reset is preferable over a hanging thread since it could be recoverable. Signed-off-by: Janne Grunau <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2023-01-19nvme-apple: reset controller during shutdownJanne Grunau1-1/+17
This is a functional revert of c76b8308e4c9 ("nvme-apple: fix controller shutdown in apple_nvme_disable"). The commit broke suspend/resume since apple_nvme_reset_work() tries to disable the controller on resume. This does not work for the apple NVMe controller since register access only works while the co-processor firmware is running. Disabling the NVMe controller in the shutdown path is also required for shutting the co-processor down. The original code was appropriate for this hardware. Add a comment to prevent a similar breaking changes in the future. Fixes: c76b8308e4c9 ("nvme-apple: fix controller shutdown in apple_nvme_disable") Reported-by: Janne Grunau <[email protected]> Link: https://lore.kernel.org/all/[email protected]/ Signed-off-by: Janne Grunau <[email protected]> [hch: updated with a more descriptive comment from Hector Martin] Signed-off-by: Christoph Hellwig <[email protected]>
2023-01-18Revert "net: team: use IFF_NO_ADDRCONF flag to prevent ipv6 addrconf"Xin Long1-2/+0
This reverts commit 0aa64df30b382fc71d4fb1827d528e0eb3eff854. Currently IFF_NO_ADDRCONF is used to prevent all ipv6 addrconf for the slave ports of team, bonding and failover devices and it means no ipv6 packets can be sent out through these slave ports. However, for team device, "nsna_ping" link_watch requires ipv6 addrconf. Otherwise, the link will be marked failure. This patch removes the IFF_NO_ADDRCONF flag set for team port, and we will fix the original issue in another patch, as Jakub suggested. Fixes: 0aa64df30b38 ("net: team: use IFF_NO_ADDRCONF flag to prevent ipv6 addrconf") Signed-off-by: Xin Long <[email protected]> Link: https://lore.kernel.org/r/63e09531fc47963d2e4eff376653d3db21b97058.1673980932.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski <[email protected]>
2023-01-18MAINTAINERS: add networking entries for WillemJakub Kicinski1-0/+20
We often have to ping Willem asking for reviews of patches because he doesn't get included in the CC list. Add MAINTAINERS entries for some of the areas he covers so that ./scripts/ will know to add him. Acked-by: Willem de Bruijn <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-01-18net: sched: gred: prevent races when adding offloads to statsJakub Kicinski1-0/+2
Naresh reports seeing a warning that gred is calling u64_stats_update_begin() with preemption enabled. Arnd points out it's coming from _bstats_update(). We should be holding the qdisc lock when writing to stats, they are also updated from the datapath. Reported-by: Linux Kernel Functional Testing <[email protected]> Link: https://lore.kernel.org/all/CA+G9fYsTr9_r893+62u6UGD3dVaCE-kN9C-Apmb2m=hxjc1Cqg@mail.gmail.com/ Fixes: e49efd5288bd ("net: sched: gred: support reporting stats from offloads") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-01-18drm/amd/display: disable S/G display on DCN 3.1.4Alex Deucher1-1/+0
Causes flickering or white screens in some configurations. Disable it for now until we can fix the issue. Cc: [email protected] Cc: [email protected] Acked-by: Christian König <[email protected]> Reviewed-by: Roman Li <[email protected]> Reviewed-by: Yifan Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected] # 6.1.x
2023-01-18drm/amd/display: disable S/G display on DCN 3.1.5Alex Deucher1-1/+0
Causes flickering or white screens in some configurations. Disable it for now until we can fix the issue. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2354 Cc: [email protected] Cc: [email protected] Acked-by: Christian König <[email protected]> Reviewed-by: Yifan Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected] # 6.1.x
2023-01-18drm/amdgpu: allow multipipe policy on ASICs with one MECLang Yu1-0/+3
Always enable multipipe policy on ASICs with GC VERSION > 9.0.0 instead of MEC number > 1. This will allow multipipe policy on ASICs with one MEC, e.g., gfx11 APUs. Signed-off-by: Lang Yu <[email protected]> Reviewed-by: Aaron Liu <[email protected]> Reviewed-by: Yifan Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected] # 6.1.x
2023-01-18drm/amdgpu: correct MEC number for gfx11 APUsLang Yu1-2/+9
There is only one MEC on these APUs. Signed-off-by: Lang Yu <[email protected]> Reviewed-by: Aaron Liu <[email protected]> Reviewed-by: Yifan Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected] # 6.1.x
2023-01-18drm/amd/display: fix issues with driver unloadHamza Mahfooz2-5/+0
Currently, we run into a number of WARN()s when attempting to unload the amdgpu driver (e.g. using "modprobe -r amdgpu"). These all stem from calling drm_encoder_cleanup() too early. So, to fix this we can stop calling drm_encoder_cleanup() from amdgpu_dm_fini() and instead have it be called from amdgpu_dm_encoder_destroy(). Also, we don't need to free in amdgpu_dm_encoder_destroy() since mst_encoders[] isn't explicitly allocated by the slab allocator. Fixes: f74367e492ba ("drm/amdgpu/display: create fake mst encoders ahead of time (v4)") Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Hamza Mahfooz <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-01-18drm/amdgpu: fix amdgpu_job_free_resources v2Christian König1-2/+8
It can be that neither fence were initialized when we run out of UVD streams for example. v2: fix typo breaking compile Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2324 Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected] # 6.1.x
2023-01-18drm/amd/display: Fix COLOR_SPACE_YCBCR2020_TYPE matrixJoshua Ashton1-2/+2
The YCC conversion matrix for RGB -> COLOR_SPACE_YCBCR2020_TYPE is missing the values for the fourth column of the matrix. The fourth column of the matrix is essentially just a value that is added given that the color is 3 components in size. These values are needed to bias the chroma from the [-1, 1] -> [0, 1] range. This fixes color being very green when using Gamescope HDR on HDMI output which prefers YCC 4:4:4. Fixes: 40df2f809e8f ("drm/amd/display: color space ycbcr709 support") Reviewed-by: Melissa Wen <[email protected]> Signed-off-by: Joshua Ashton <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2023-01-18drm/amd/display: Calculate output_color_space after pixel encoding adjustmentJoshua Ashton1-2/+2
Code in get_output_color_space depends on knowing the pixel encoding to determine whether to pick between eg. COLOR_SPACE_SRGB or COLOR_SPACE_YCBCR709 for transparent RGB -> YCbCr 4:4:4 in the driver. v2: Fixed patch being accidentally based on a personal feature branch, oops! Fixes: ea117312ea9f ("drm/amd/display: Reduce HDMI pixel encoding if max clock is exceeded") Reviewed-by: Melissa Wen <[email protected]> Signed-off-by: Joshua Ashton <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2023-01-18drm/amdgpu: fix cleaning up reserved VMID on releaseChristian König1-0/+1
We need to reset this or otherwise run into list corruption later on. Fixes: e44a0fe630c5 ("drm/amdgpu: rework reserved VMID handling") Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Tested-by: Candice Li <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-01-18Merge tag 'wireless-2023-01-18' of ↵Jakub Kicinski7-87/+117
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Kalle Valo says: ==================== wireless fixes for v6.2 Third set of fixes for v6.2. This time most of them are for drivers, only one revert for mac80211. For an important mt76 fix we had to cherry pick two commits from wireless-next. * tag 'wireless-2023-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: Revert "wifi: mac80211: fix memory leak in ieee80211_if_add()" wifi: mt76: dma: fix a regression in adding rx buffers wifi: mt76: handle possible mt76_rx_token_consume failures wifi: mt76: dma: do not increment queue head if mt76_dma_add_buf fails wifi: rndis_wlan: Prevent buffer overflow in rndis_query_oid wifi: brcmfmac: fix regression for Broadcom PCIe wifi devices wifi: brcmfmac: avoid NULL-deref in survey dump for 2G only device wifi: brcmfmac: avoid handling disabled channels for survey dump ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-01-18drm/amdgpu: Correct the power calcultion for Renior/Cezanne.jie1zhan1-1/+6
From smu firmware,the value of power is transferred in units of watts. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2321 Fixes: 137aac26a2ed ("drm/amdgpu/smu12: fix power reporting on renoir") Acked-by: Alex Deucher <[email protected]> Signed-off-by: Jesse Zhang <[email protected]> Reviewed-by: Aaron Liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2023-01-18drm/amd/display: Fix set scaling doesn's workhongao1-2/+2
[Why] Setting scaling does not correctly update CRTC state. As a result dc stream state's src (composition area) && dest (addressable area) was not calculated as expected. This causes set scaling doesn's work. [How] Correctly update CRTC state when setting scaling property. Reviewed-by: Harry Wentland <[email protected]> Tested-by: Rodrigo Siqueira <[email protected]> Signed-off-by: hongao <[email protected]> Signed-off-by: Rodrigo Siqueira <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2023-01-18scsi: device_handler: alua: Remove a might_sleep() annotationBart Van Assche1-2/+3
The might_sleep() annotation in alua_rtpg_queue() is not correct since the command completion code may call this function from atomic context. Calling alua_rtpg_queue() from atomic context in the command completion path is fine since request submitters must hold an sdev reference until command execution has completed. This patch fixes the following kernel complaint: BUG: sleeping function called from invalid context at drivers/scsi/device_handler/scsi_dh_alua.c:992 Call Trace: dump_stack_lvl+0xac/0x100 __might_resched+0x284/0x2c8 alua_rtpg_queue+0x3c/0x98 [scsi_dh_alua] alua_check+0x122/0x250 [scsi_dh_alua] alua_check_sense+0x172/0x228 [scsi_dh_alua] scsi_check_sense+0x8a/0x2e0 scsi_decide_disposition+0x286/0x298 scsi_complete+0x6a/0x108 blk_complete_reqs+0x6e/0x88 __do_softirq+0x13e/0x6b8 __irq_exit_rcu+0x14a/0x170 irq_exit_rcu+0x22/0x50 do_ext_irq+0x10a/0x1d0 Link: https://lore.kernel.org/r/[email protected] Reported-by: Steffen Maier <[email protected]> Reviewed-by: Martin Wilck <[email protected]> Tested-by: Steffen Maier <[email protected]> Signed-off-by: Bart Van Assche <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-01-18scsi: iscsi_tcp: Fix UAF during login when accessing the shost ipaddressMike Christie1-3/+6
If during iscsi_sw_tcp_session_create() iscsi_tcp_r2tpool_alloc() fails, userspace could be accessing the host's ipaddress attr. If we then free the session via iscsi_session_teardown() while userspace is still accessing the session we will hit a use after free bug. Set the tcp_sw_host->session after we have completed session creation and can no longer fail. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mike Christie <[email protected]> Reviewed-by: Lee Duncan <[email protected]> Acked-by: Ding Hui <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-01-18scsi: iscsi_tcp: Fix UAF during logout when accessing the shost ipaddressMike Christie3-9/+42
Bug report and analysis from Ding Hui. During iSCSI session logout, if another task accesses the shost ipaddress attr, we can get a KASAN UAF report like this: [ 276.942144] BUG: KASAN: use-after-free in _raw_spin_lock_bh+0x78/0xe0 [ 276.942535] Write of size 4 at addr ffff8881053b45b8 by task cat/4088 [ 276.943511] CPU: 2 PID: 4088 Comm: cat Tainted: G E 6.1.0-rc8+ #3 [ 276.943997] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020 [ 276.944470] Call Trace: [ 276.944943] <TASK> [ 276.945397] dump_stack_lvl+0x34/0x48 [ 276.945887] print_address_description.constprop.0+0x86/0x1e7 [ 276.946421] print_report+0x36/0x4f [ 276.947358] kasan_report+0xad/0x130 [ 276.948234] kasan_check_range+0x35/0x1c0 [ 276.948674] _raw_spin_lock_bh+0x78/0xe0 [ 276.949989] iscsi_sw_tcp_host_get_param+0xad/0x2e0 [iscsi_tcp] [ 276.951765] show_host_param_ISCSI_HOST_PARAM_IPADDRESS+0xe9/0x130 [scsi_transport_iscsi] [ 276.952185] dev_attr_show+0x3f/0x80 [ 276.953005] sysfs_kf_seq_show+0x1fb/0x3e0 [ 276.953401] seq_read_iter+0x402/0x1020 [ 276.954260] vfs_read+0x532/0x7b0 [ 276.955113] ksys_read+0xed/0x1c0 [ 276.955952] do_syscall_64+0x38/0x90 [ 276.956347] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 276.956769] RIP: 0033:0x7f5d3a679222 [ 276.957161] Code: c0 e9 b2 fe ff ff 50 48 8d 3d 32 c0 0b 00 e8 a5 fe 01 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24 [ 276.958009] RSP: 002b:00007ffc864d16a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [ 276.958431] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007f5d3a679222 [ 276.958857] RDX: 0000000000020000 RSI: 00007f5d3a4fe000 RDI: 0000000000000003 [ 276.959281] RBP: 00007f5d3a4fe000 R08: 00000000ffffffff R09: 0000000000000000 [ 276.959682] R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000020000 [ 276.960126] R13: 0000000000000003 R14: 0000000000000000 R15: 0000557a26dada58 [ 276.960536] </TASK> [ 276.961357] Allocated by task 2209: [ 276.961756] kasan_save_stack+0x1e/0x40 [ 276.962170] kasan_set_track+0x21/0x30 [ 276.962557] __kasan_kmalloc+0x7e/0x90 [ 276.962923] __kmalloc+0x5b/0x140 [ 276.963308] iscsi_alloc_session+0x28/0x840 [scsi_transport_iscsi] [ 276.963712] iscsi_session_setup+0xda/0xba0 [libiscsi] [ 276.964078] iscsi_sw_tcp_session_create+0x1fd/0x330 [iscsi_tcp] [ 276.964431] iscsi_if_create_session.isra.0+0x50/0x260 [scsi_transport_iscsi] [ 276.964793] iscsi_if_recv_msg+0xc5a/0x2660 [scsi_transport_iscsi] [ 276.965153] iscsi_if_rx+0x198/0x4b0 [scsi_transport_iscsi] [ 276.965546] netlink_unicast+0x4d5/0x7b0 [ 276.965905] netlink_sendmsg+0x78d/0xc30 [ 276.966236] sock_sendmsg+0xe5/0x120 [ 276.966576] ____sys_sendmsg+0x5fe/0x860 [ 276.966923] ___sys_sendmsg+0xe0/0x170 [ 276.967300] __sys_sendmsg+0xc8/0x170 [ 276.967666] do_syscall_64+0x38/0x90 [ 276.968028] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 276.968773] Freed by task 2209: [ 276.969111] kasan_save_stack+0x1e/0x40 [ 276.969449] kasan_set_track+0x21/0x30 [ 276.969789] kasan_save_free_info+0x2a/0x50 [ 276.970146] __kasan_slab_free+0x106/0x190 [ 276.970470] __kmem_cache_free+0x133/0x270 [ 276.970816] device_release+0x98/0x210 [ 276.971145] kobject_cleanup+0x101/0x360 [ 276.971462] iscsi_session_teardown+0x3fb/0x530 [libiscsi] [ 276.971775] iscsi_sw_tcp_session_destroy+0xd8/0x130 [iscsi_tcp] [ 276.972143] iscsi_if_recv_msg+0x1bf1/0x2660 [scsi_transport_iscsi] [ 276.972485] iscsi_if_rx+0x198/0x4b0 [scsi_transport_iscsi] [ 276.972808] netlink_unicast+0x4d5/0x7b0 [ 276.973201] netlink_sendmsg+0x78d/0xc30 [ 276.973544] sock_sendmsg+0xe5/0x120 [ 276.973864] ____sys_sendmsg+0x5fe/0x860 [ 276.974248] ___sys_sendmsg+0xe0/0x170 [ 276.974583] __sys_sendmsg+0xc8/0x170 [ 276.974891] do_syscall_64+0x38/0x90 [ 276.975216] entry_SYSCALL_64_after_hwframe+0x63/0xcd We can easily reproduce by two tasks: 1. while :; do iscsiadm -m node --login; iscsiadm -m node --logout; done 2. while :; do cat \ /sys/devices/platform/host*/iscsi_host/host*/ipaddress; done iscsid | cat --------------------------------+--------------------------------------- |- iscsi_sw_tcp_session_destroy | |- iscsi_session_teardown | |- device_release | |- iscsi_session_release ||- dev_attr_show |- kfree | |- show_host_param_ | ISCSI_HOST_PARAM_IPADDRESS | |- iscsi_sw_tcp_host_get_param | |- r/w tcp_sw_host->session (UAF) |- iscsi_host_remove | |- iscsi_host_free | Fix the above bug by splitting the session removal into 2 parts: 1. removal from iSCSI class which includes sysfs and removal from host tracking. 2. freeing of session. During iscsi_tcp host and session removal we can remove the session from sysfs then remove the host from sysfs. At this point we know userspace is not accessing the kernel via sysfs so we can free the session and host. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mike Christie <[email protected]> Reviewed-by: Lee Duncan <[email protected]> Acked-by: Ding Hui <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-01-18scsi: ufs: core: Fix devfreq deadlocksJohan Hovold2-14/+17
There is a lock inversion and rwsem read-lock recursion in the devfreq target callback which can lead to deadlocks. Specifically, ufshcd_devfreq_scale() already holds a clk_scaling_lock read lock when toggling the write booster, which involves taking the dev_cmd mutex before taking another clk_scaling_lock read lock. This can lead to a deadlock if another thread: 1) tries to acquire the dev_cmd and clk_scaling locks in the correct order, or 2) takes a clk_scaling write lock before the attempt to take the clk_scaling read lock a second time. Fix this by dropping the clk_scaling_lock before toggling the write booster as was done before commit 0e9d4ca43ba8 ("scsi: ufs: Protect some contexts from unexpected clock scaling"). While the devfreq callbacks are already serialised, add a second serialising mutex to handle the unlikely case where a callback triggered through the devfreq sysfs interface is racing with a request to disable clock scaling through the UFS controller 'clkscale_enable' sysfs attribute. This could otherwise lead to the write booster being left disabled after having disabled clock scaling. Also take the new mutex in ufshcd_clk_scaling_allow() to make sure that any pending write booster update has completed on return. Note that this currently only affects Qualcomm platforms since commit 87bd05016a64 ("scsi: ufs: core: Allow host driver to disable wb toggling during clock scaling"). The lock inversion (i.e. 1 above) was reported by lockdep as: ====================================================== WARNING: possible circular locking dependency detected 6.1.0-next-20221216 #211 Not tainted ------------------------------------------------------ kworker/u16:2/71 is trying to acquire lock: ffff076280ba98a0 (&hba->dev_cmd.lock){+.+.}-{3:3}, at: ufshcd_query_flag+0x50/0x1c0 but task is already holding lock: ffff076280ba9cf0 (&hba->clk_scaling_lock){++++}-{3:3}, at: ufshcd_devfreq_scale+0x2b8/0x380 which lock already depends on the new lock. [ +0.011606] the existing dependency chain (in reverse order) is: -> #1 (&hba->clk_scaling_lock){++++}-{3:3}: lock_acquire+0x68/0x90 down_read+0x58/0x80 ufshcd_exec_dev_cmd+0x70/0x2c0 ufshcd_verify_dev_init+0x68/0x170 ufshcd_probe_hba+0x398/0x1180 ufshcd_async_scan+0x30/0x320 async_run_entry_fn+0x34/0x150 process_one_work+0x288/0x6c0 worker_thread+0x74/0x450 kthread+0x118/0x120 ret_from_fork+0x10/0x20 -> #0 (&hba->dev_cmd.lock){+.+.}-{3:3}: __lock_acquire+0x12a0/0x2240 lock_acquire.part.0+0xcc/0x220 lock_acquire+0x68/0x90 __mutex_lock+0x98/0x430 mutex_lock_nested+0x2c/0x40 ufshcd_query_flag+0x50/0x1c0 ufshcd_query_flag_retry+0x64/0x100 ufshcd_wb_toggle+0x5c/0x120 ufshcd_devfreq_scale+0x2c4/0x380 ufshcd_devfreq_target+0xf4/0x230 devfreq_set_target+0x84/0x2f0 devfreq_update_target+0xc4/0xf0 devfreq_monitor+0x38/0x1f0 process_one_work+0x288/0x6c0 worker_thread+0x74/0x450 kthread+0x118/0x120 ret_from_fork+0x10/0x20 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&hba->clk_scaling_lock); lock(&hba->dev_cmd.lock); lock(&hba->clk_scaling_lock); lock(&hba->dev_cmd.lock); *** DEADLOCK *** Fixes: 0e9d4ca43ba8 ("scsi: ufs: Protect some contexts from unexpected clock scaling") Cc: [email protected] # 5.12 Cc: Can Guo <[email protected]> Tested-by: Andrew Halaney <[email protected]> Signed-off-by: Johan Hovold <[email protected]> Reviewed-by: Bart Van Assche <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-01-18scsi: hpsa: Fix allocation size for scsi_host_alloc()Alexey V. Vissarionov1-1/+1
The 'h' is a pointer to struct ctlr_info, so it's just 4 or 8 bytes, while the structure itself is much bigger. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: edd163687ea5 ("hpsa: add driver for HP Smart Array controllers.") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexey V. Vissarionov <[email protected]> Acked-by: Don Brace <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-01-18Merge tag 'for-linus-2023011801' of ↵Linus Torvalds11-14/+96
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fixes from Jiri Kosina: - fixes for potential empty list handling in HID core (Pietro Borrello) - fix for NULL pointer dereference in betop driver that could be triggered by malicious device (Pietro Borrello) - fixes for handling calibration data preventing division by zero in Playstation driver (Roderick Colenbrander) - fix for memory leak on error path in amd-sfh driver (Basavaraj Natikar) - other few assorted small fixes and device ID-specific handling * tag 'for-linus-2023011801' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: betop: check shape of output reports HID: playstation: sanity check DualSense calibration data. HID: playstation: sanity check DualShock4 calibration data. HID: uclogic: Add support for XP-PEN Deco 01 V2 HID: revert CHERRY_MOUSE_000C quirk HID: check empty report_list in bigben_probe() HID: check empty report_list in hid_validate_values() HID: amd_sfh: Fix warning unwind goto HID: intel_ish-hid: Add check for ishtp_dma_tx_map