aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2021-09-21xen/balloon: fix balloon kthread freezingJuergen Gross1-2/+2
Commit 8480ed9c2bbd56 ("xen/balloon: use a kernel thread instead a workqueue") switched the Xen balloon driver to use a kernel thread. Unfortunately the patch omitted to call try_to_freeze() or to use wait_event_freezable_timeout(), causing a system suspend to fail. Fixes: 8480ed9c2bbd56 ("xen/balloon: use a kernel thread instead a workqueue") Signed-off-by: Juergen Gross <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Juergen Gross <[email protected]>
2021-09-21nvme: keep ctrl->namespaces orderedChristoph Hellwig1-16/+17
Various places in the nvme code that rely on ctrl->namespace to be ordered. Ensure that the namespae is inserted into the list at the right position from the start instead of sorting it after the fact. Fixes: 540c801c65eb ("NVMe: Implement namespace list scanning") Reported-by: Anton Eidelman <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Keith Busch <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Reviewed-by: Damien Le Moal <[email protected]>
2021-09-21nvme-tcp: fix incorrect h2cdata pdu offset accountingSagi Grimberg1-3/+10
When the controller sends us multiple r2t PDUs in a single request we need to account for it correctly as our send/recv context run concurrently (i.e. we get a new r2t with r2t_offset before we updated our iterator and req->data_sent marker). This can cause wrong offsets to be sent to the controller. To fix that, we will first know that this may happen only in the send sequence of the last page, hence we will take the r2t_offset to the h2c PDU data_offset, and in nvme_tcp_try_send_data loop, we make sure to increment the request markers also when we completed a PDU but we are expecting more r2t PDUs as we still did not send the entire data of the request. Fixes: 825619b09ad3 ("nvme-tcp: fix possible use-after-completion") Reported-by: Nowak, Lukasz <[email protected]> Tested-by: Nowak, Lukasz <[email protected]> Signed-off-by: Sagi Grimberg <[email protected]> Reviewed-by: Keith Busch <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2021-09-21nvme-fc: remove freeze/unfreeze around update_nr_hw_queuesJames Smart1-2/+0
Remove the freeze/unfreeze around changes to the number of hardware queues. Study and retest has indicated there are no ios that can be active at this point so there is nothing to freeze. nvme-fc is draining the queues in the shutdown and error recovery path in __nvme_fc_abort_outstanding_ios. This patch primarily reverts 88e837ed0f1f "nvme-fc: wait for queues to freeze before calling update_hr_hw_queues". It's not an exact revert as it leaves the adjusting of hw queues only if the count changes. Signed-off-by: James Smart <[email protected]> [dwagner: added explanation why no IO is pending] Signed-off-by: Daniel Wagner <[email protected]> Reviewed-by: Ming Lei <[email protected]> Reviewed-by: Himanshu Madhani <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2021-09-21nvme-fc: avoid race between time out and tear downJames Smart1-0/+2
To avoid race between time out and tear down, in tear down process, first we quiesce the queue, and then delete the timer and cancel the time out work for the queue. This patch merges the admin and io sync ops into the queue teardown logic as shown in the RDMA patch 3017013dcc "nvme-rdma: avoid race between time out and tear down". There is no teardown_lock in nvme-fc. Signed-off-by: James Smart <[email protected]> Tested-by: Daniel Wagner <[email protected]> Reviewed-by: Himanshu Madhani <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Reviewed-by: Daniel Wagner <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2021-09-21nvme-fc: update hardware queues before using themDaniel Wagner1-8/+8
In case the number of hardware queues changes, we need to update the tagset and the mapping of ctx to hctx first. If we try to create and connect the I/O queues first, this operation will fail (target will reject the connect call due to the wrong number of queues) and hence we bail out of the recreate function. Then we will to try the very same operation again, thus we don't make any progress. Signed-off-by: Daniel Wagner <[email protected]> Reviewed-by: Ming Lei <[email protected]> Reviewed-by: Himanshu Madhani <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Reviewed-by: James Smart <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2021-09-20Merge tag 'afs-fixes-20210913' of ↵Linus Torvalds17-120/+294
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull AFS fixes from David Howells: "Fixes for AFS problems that can cause data corruption due to interaction with another client modifying data cached locally: - When d_revalidating a dentry, don't look at the inode to which it points. Only check the directory to which the dentry belongs. This was confusing things and causing the silly-rename cleanup code to remove the file now at the dentry of a file that got deleted. - Fix mmap data coherency. When a callback break is received that relates to a file that we have cached, the data content may have been changed (there are other reasons, such as the user's rights having been changed). However, we're checking it lazily, only on entry to the kernel, which doesn't happen if we have a writeable shared mapped page on that file. We make the kernel keep track of mmapped files and clear all PTEs mapping to that file as soon as the callback comes in by calling unmap_mapping_pages() (we don't necessarily want to zap the pagecache). This causes the kernel to be reentered when userspace tries to access the mmapped address range again - and at that point we can query the server and, if we need to, zap the page cache. Ideally, I would check each file at the point of notification, but that involves poking the server[*] - which is holding an exclusive lock on the vnode it is changing, waiting for all the clients it notified to reply. This could then deadlock against the server. Further, invalidating the pagecache might call ->launder_page(), which would try to write to the file, which would definitely deadlock. (AFS doesn't lease file access). [*] Checking to see if the file content has changed is a matter of comparing the current data version number, but we have to ask the server for that. We also need to get a new callback promise and we need to poke the server for that too. - Add some more points at which the inode is validated, since we're doing it lazily, notably in ->read_iter() and ->page_mkwrite(), but also when performing some directory operations. Ideally, checking in ->read_iter() would be done in some derivation of filemap_read(). If we're going to call the server to read the file, then we get the file status fetch as part of that. - The above is now causing us to make a lot more calls to afs_validate() to check the inode - and afs_validate() takes the RCU read lock each time to make a quick check (ie. afs_check_validity()). This is entirely for the purpose of checking cb_s_break to see if the server we're using reinitialised its list of callbacks - however this isn't a very common event, so most of the time we're taking this needlessly. Add a new cell-wide counter to count the number of reinitialisations done by any server and check that - and only if that changes, take the RCU read lock and check the server list (the server list may change, but the cell a file is part of won't). - Don't update vnode->cb_s_break and ->cb_v_break inside the validity checking loop. The cb_lock is done with read_seqretry, so we might go round the loop a second time after resetting those values - and that could cause someone else checking validity to miss something (I think). Also included are patches for fixes for some bugs encountered whilst debugging this: - Fix a leak of afs_read objects and fix a leak of keys hidden by that. - Fix a leak of pages that couldn't be added to extend a writeback. - Fix the maintenance of i_blocks when i_size is changed by a local write or a local dir edit" Link: https://bugzilla.kernel.org/show_bug.cgi?id=214217 [1] Link: https://lore.kernel.org/r/163111665183.283156.17200205573146438918.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163113612442.352844.11162345591911691150.stgit@warthog.procyon.org.uk/ # i_blocks patch * tag 'afs-fixes-20210913' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: afs: Fix updating of i_blocks on file/dir extension afs: Fix corruption in reads at fpos 2G-4G from an OpenAFS server afs: Try to avoid taking RCU read lock when checking vnode validity afs: Fix mmap coherency vs 3rd-party changes afs: Fix incorrect triggering of sillyrename on 3rd-party invalidation afs: Add missing vnode validation checks afs: Fix page leak afs: Fix missing put on afs_read objects and missing get on the key therein
2021-09-20Merge tag '5.15-rc1-ksmbd' of git://git.samba.org/ksmbdLinus Torvalds4-17/+81
Pull ksmbd server fixes from Steve French: "Three ksmbd fixes, including an important security fix for path processing, and a buffer overflow check, and a trivial fix for incorrect header inclusion" * tag '5.15-rc1-ksmbd' of git://git.samba.org/ksmbd: ksmbd: add validation for FILE_FULL_EA_INFORMATION of smb2_get_info ksmbd: prevent out of share access ksmbd: transport_rdma: Don't include rwlock.h directly
2021-09-20Merge tag '5.15-rc1-smb3' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds48-57/+67
Pull cifs client fixes from Steve French: - two deferred close fixes (for bugs found with xfstests 478 and 461) - a deferred close improvement in rename - two trivial fixes for incorrect Linux comment formatting of multiple cifs files (pointed out by automated kernel test robot and checkpatch) * tag '5.15-rc1-smb3' of git://git.samba.org/sfrench/cifs-2.6: cifs: Not to defer close on file when lock is set cifs: Fix soft lockup during fsstress cifs: Deferred close performance improvements cifs: fix incorrect kernel doc comments cifs: remove pathname for file from SPDX header
2021-09-20x86/fault: Fix wrong signal when vsyscall fails with pkeyJiashuo Liang3-10/+20
The function __bad_area_nosemaphore() calls kernelmode_fixup_or_oops() with the parameter @signal being actually @pkey, which will send a signal numbered with the argument in @pkey. This bug can be triggered when the kernel fails to access user-given memory pages that are protected by a pkey, so it can go down the do_user_addr_fault() path and pass the !user_mode() check in __bad_area_nosemaphore(). Most cases will simply run the kernel fixup code to make an -EFAULT. But when another condition current->thread.sig_on_uaccess_err is met, which is only used to emulate vsyscall, the kernel will generate the wrong signal. Add a new parameter @pkey to kernelmode_fixup_or_oops() to fix this. [ bp: Massage commit message, fix build error as reported by the 0day bot: https://lkml.kernel.org/r/[email protected] ] Fixes: 5042d40a264c ("x86/fault: Bypass no_context() for implicit kernel faults from usermode") Reported-by: kernel test robot <[email protected]> Signed-off-by: Jiashuo Liang <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Acked-by: Dave Hansen <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-09-20Merge tag 'spi-fix-v5.15-rc2' of ↵Linus Torvalds2-5/+10
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark BrownL "This contains a couple of fixes, one fix for handling of zero length transfers on Rockchip devices and a warning fix which will conflict with a version you did but cleans up some extra unneeded forward declarations as well which seems a bit neater" * tag 'spi-fix-v5.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: tegra20-slink: Declare runtime suspend and resume functions conditionally spi: rockchip: handle zero length transfers without timing out
2021-09-20Merge tag 'regulator-fix-v5.15-rc2' of ↵Linus Torvalds2-3/+1
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fixes from Mark Brown: "A couple of small device specific fixes that have been sent since the merge window, neither of which stands out particularly" * tag 'regulator-fix-v5.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: max14577: Revert "regulator: max14577: Add proper module aliases strings" regulator: qcom-rpmh-regulator: fix pm8009-1 ldo7 resource name
2021-09-20drm/nouveau/nvkm: Replace -ENOSYS with -ENODEVGuenter Roeck1-1/+1
nvkm test builds fail with the following error. drivers/gpu/drm/nouveau/nvkm/engine/device/ctrl.c: In function 'nvkm_control_mthd_pstate_info': drivers/gpu/drm/nouveau/nvkm/engine/device/ctrl.c:60:35: error: overflow in conversion from 'int' to '__s8' {aka 'signed char'} changes value from '-251' to '5' The code builds on most architectures, but fails on parisc where ENOSYS is defined as 251. Replace the error code with -ENODEV (-19). The actual error code does not really matter and is not passed to userspace - it just has to be negative. Fixes: 7238eca4cf18 ("drm/nouveau: expose pstate selection per-power source in sysfs") Signed-off-by: Guenter Roeck <[email protected]> Cc: Ben Skeggs <[email protected]> Cc: David Airlie <[email protected]> Cc: Daniel Vetter <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-20sparc64: fix pci_iounmap() when CONFIG_PCI is not setLinus Torvalds1-0/+2
Guenter reported [1] that the pci_iounmap() changes remain problematic, with sparc64 allnoconfig and tinyconfig still not building due to the header file changes and confusion with the arch-specific pci_iounmap() implementation. I'm pretty convinced that sparc should just use GENERIC_IOMAP instead of doing its own thing, since it turns out that the sparc64 version of pci_iounmap() is somewhat buggy (see [2]). But in the meantime, this just fixes the build by avoiding the trivial re-definition of the empty case. Link: https://lore.kernel.org/lkml/[email protected]/ [1] Link: https://lore.kernel.org/lkml/CAHk-=wgheheFx9myQyy5osh79BAazvmvYURAtub2gQtMvLrhqQ@mail.gmail.com/ [2] Reported-by: Guenter Roeck <[email protected]> Cc: David Miller <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-20swiotlb-xen: this is PV-only on x86Jan Beulich3-13/+1
The code is unreachable for HVM or PVH, and it also makes little sense in auto-translated environments. On Arm, with xen_{create,destroy}_contiguous_region() both being stubs, I have a hard time seeing what good the Xen specific variant does - the generic one ought to be fine for all purposes there. Still Arm code explicitly references symbols here, so the code will continue to be included there. Instead of making PCI_XEN's "select" conditional, simply drop it - SWIOTLB_XEN will be available unconditionally in the PV case anyway, and is - as explained above - dead code in non-PV environments. This in turn allows dropping the stubs for xen_{create,destroy}_contiguous_region(), the former of which was broken anyway - it failed to set the DMA handle output. Signed-off-by: Jan Beulich <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Stefano Stabellini <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Juergen Gross <[email protected]>
2021-09-20xen/pci-swiotlb: reduce visibility of symbolsJan Beulich2-7/+3
xen_swiotlb and pci_xen_swiotlb_init() are only used within the file defining them, so make them static and remove the stubs. Otoh pci_xen_swiotlb_detect() has a use (as function pointer) from the main pci-swiotlb.c file - convert its stub to a #define to NULL. Signed-off-by: Jan Beulich <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Juergen Gross <[email protected]>
2021-09-20PCI: only build xen-pcifront in PV-enabled environmentsJan Beulich1-1/+1
The driver's module init function, pcifront_init(), invokes xen_pv_domain() first thing. That construct produces constant "false" when !CONFIG_XEN_PV. Hence there's no point building the driver in non-PV configurations. Drop the (now implicit and generally wrong) X86 dependency: At present, XEN_PV can only be set when X86 is also enabled. In general an architecture supporting Xen PV (and PCI) would want to have this driver built. Signed-off-by: Jan Beulich <[email protected]> Reviewed-by: Stefano Stabellini <[email protected]> Acked-by: Bjorn Helgaas <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Juergen Gross <[email protected]>
2021-09-20swiotlb-xen: ensure to issue well-formed XENMEM_exchange requestsJan Beulich1-3/+4
While the hypervisor hasn't been enforcing this, we would still better avoid issuing requests with GFNs not aligned to the requested order. Instead of altering the value also in the call to panic(), drop it there for being static and hence easy to determine without being part of the panic message. Signed-off-by: Jan Beulich <[email protected]> Reviewed-by: Stefano Stabellini <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Juergen Gross <[email protected]>
2021-09-20Xen/gntdev: don't ignore kernel unmapping errorJan Beulich1-0/+8
While working on XSA-361 and its follow-ups, I failed to spot another place where the kernel mapping part of an operation was not treated the same as the user space part. Detect and propagate errors and add a 2nd pr_debug(). Signed-off-by: Jan Beulich <[email protected]> Reviewed-by: Juergen Gross <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Juergen Gross <[email protected]>
2021-09-20xen/x86: drop redundant zeroing from cpu_initialize_context()Jan Beulich1-4/+0
Just after having obtained the pointer from kzalloc() there's no reason at all to set part of the area to all zero yet another time. Similarly there's no point explicitly clearing "ldt_ents". Signed-off-by: Jan Beulich <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Juergen Gross <[email protected]>
2021-09-20Merge branch 'hns3-fixes'David S. Miller5-49/+103
Guangbin Huang says: ==================== net: hns3: add some fixes for -net This series adds some fixes for the HNS3 ethernet driver. ==================== Signed-off-by: David S. Miller <[email protected]>
2021-09-20net: hns3: fix a return value error in hclge_get_reset_status()Yufeng Mo1-4/+11
hclge_get_reset_status() should return the tqp reset status. However, if the CMDQ fails, the caller will take it as tqp reset success status by mistake. Therefore, uses a parameters to get the tqp reset status instead. Fixes: 46a3df9f9718 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support") Signed-off-by: Yufeng Mo <[email protected]> Signed-off-by: Guangbin Huang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-09-20net: hns3: check vlan id before using itliaoguojia1-0/+3
The input parameters may not be reliable, so check the vlan id before using it, otherwise may set wrong vlan id into hardware. Fixes: dc8131d846d4 ("net: hns3: Fix for packet loss due wrong filter config in VLAN tbls") Signed-off-by: liaoguojia <[email protected]> Signed-off-by: Guangbin Huang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-09-20net: hns3: check queue id range before usingYufeng Mo1-0/+8
The input parameters may not be reliable. Before using the queue id, we should check this parameter. Otherwise, memory overwriting may occur. Fixes: d34100184685 ("net: hns3: refactor the mailbox message between PF and VF") Signed-off-by: Yufeng Mo <[email protected]> Signed-off-by: Guangbin Huang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-09-20net: hns3: fix misuse vf id and vport id in some logsJiaran Zhang4-10/+12
vport_id include PF and VFs, vport_id = 0 means PF, other values mean VFs. So the actual vf id is equal to vport_id minus 1. Some VF print logs are actually vport, and logs of vf id actually use vport id, so this patch fixes them. Fixes: ac887be5b0fe ("net: hns3: change print level of RAS error log from warning to error") Fixes: adcf738b804b ("net: hns3: cleanup some print format warning") Signed-off-by: Jiaran Zhang <[email protected]> Signed-off-by: Guangbin Huang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-09-20net: hns3: fix inconsistent vf id printJian Shen1-2/+5
The vf id from ethtool is added 1 before configured to driver. So it's necessary to minus 1 when printing it, in order to keep consistent with user's configuration. Fixes: dd74f815dd41 ("net: hns3: Add support for rule add/delete for flow director") Signed-off-by: Jian Shen <[email protected]> Signed-off-by: Guangbin Huang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-09-20net: hns3: fix change RSS 'hfunc' ineffective issueJian Shen2-33/+64
When user change rss 'hfunc' without set rss 'hkey' by ethtool -X command, the driver will ignore the 'hfunc' for the hkey is NULL. It's unreasonable. So fix it. Fixes: 46a3df9f9718 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support") Fixes: 374ad291762a ("net: hns3: Add RSS general configuration support for VF") Signed-off-by: Jian Shen <[email protected]> Signed-off-by: Guangbin Huang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-09-20staging: r8188eu: fix -Wrestrict warningsArnd Bergmann1-4/+4
Adding back the nonstandard ioctl commands caused -Wrestrict warnings when building with 'make W=1': drivers/staging/r8188eu/os_dep/ioctl_linux.c: In function 'rtw_mp_read_rf': drivers/staging/r8188eu/os_dep/ioctl_linux.c:5515:27: error: 'sprintf' argument 3 overlaps destination object 'extra' [-Werror=restrict] 5515 | sprintf(extra, "%s %d", extra, strtou); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/staging/r8188eu/os_dep/ioctl_linux.c:5470:54: note: destination object referenced by 'restrict'-qualified argument 1 was declared here 5470 | struct iw_point *wrqu, char *extra) | ~~~~~~^~~~~ Change these to the same construct used elsewhere in that driver, with an offset to the string to make the warning go away. The ioctl commands were previously removed, and it's unlikely that anything is actually using them, so ideally I would prefer to have them removed again. The lack of range checking of the 'extra' output buffer is also slightly worrying, but I did not check whether this could cause harm. Fixes: 2b42bd58b321 ("staging: r8188eu: introduce new os_dep dir for RTL8188eu driver") Signed-off-by: Arnd Bergmann <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2021-09-20ptp: ocp: add COMMON_CLK dependencyArnd Bergmann1-0/+1
Without CONFIG_COMMON_CLK, this fails to link: arm-linux-gnueabi-ld: drivers/ptp/ptp_ocp.o: in function `ptp_ocp_register_i2c': ptp_ocp.c:(.text+0xcc0): undefined reference to `__clk_hw_register_fixed_rate' arm-linux-gnueabi-ld: ptp_ocp.c:(.text+0xcf4): undefined reference to `devm_clk_hw_register_clkdev' arm-linux-gnueabi-ld: drivers/ptp/ptp_ocp.o: in function `ptp_ocp_detach': ptp_ocp.c:(.text+0x1c24): undefined reference to `clk_hw_unregister_fixed_rate' Fixes: a7e1abad13f3 ("ptp: Add clock driver for the OpenCompute TimeCard.") Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-09-20USB: serial: option: remove duplicate USB device IDKrzysztof Kozlowski1-1/+0
The device ZTE 0x0094 is already on the list. Signed-off-by: Krzysztof Kozlowski <[email protected]> Fixes: b9e44fe5ecda ("USB: option: cleanup zte 3g-dongle's pid in option.c") Cc: [email protected] Signed-off-by: Johan Hovold <[email protected]>
2021-09-20USB: serial: mos7840: remove duplicated 0xac24 device IDKrzysztof Kozlowski1-2/+0
0xac24 device ID is already defined and used via BANDB_DEVICE_ID_USO9ML2_4. Remove the duplicate from the list. Fixes: 27f1281d5f72 ("USB: serial: Extra device/vendor ID for mos7840 driver") Signed-off-by: Krzysztof Kozlowski <[email protected]> Cc: [email protected] Signed-off-by: Johan Hovold <[email protected]>
2021-09-20bnxt_en: Fix TX timeout when TX ring size is set to the smallestMichael Chan3-5/+10
The smallest TX ring size we support must fit a TX SKB with MAX_SKB_FRAGS + 1. Because the first TX BD for a packet is always a long TX BD, we need an extra TX BD to fit this packet. Define BNXT_MIN_TX_DESC_CNT with this value to make this more clear. The current code uses a minimum that is off by 1. Fix it using this constant. The tx_wake_thresh to determine when to wake up the TX queue is half the ring size but we must have at least BNXT_MIN_TX_DESC_CNT for the next packet which may have maximum fragments. So the comparison of the available TX BDs with tx_wake_thresh should be >= instead of > in the current code. Otherwise, at the smallest ring size, we will never wake up the TX queue and will cause TX timeout. Fixes: c0c050c58d84 ("bnxt_en: New Broadcom ethernet driver.") Reviewed-by: Pavan Chebbi <[email protected]> Signed-off-by: Michael Chan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-09-20nexthop: Fix division by zero while replacing a resilient groupIdo Schimmel1-0/+2
The resilient nexthop group torture tests in fib_nexthop.sh exposed a possible division by zero while replacing a resilient group [1]. The division by zero occurs when the data path sees a resilient nexthop group with zero buckets. The tests replace a resilient nexthop group in a loop while traffic is forwarded through it. The tests do not specify the number of buckets while performing the replacement, resulting in the kernel allocating a stub resilient table (i.e, 'struct nh_res_table') with zero buckets. This table should never be visible to the data path, but the old nexthop group (i.e., 'oldg') might still be used by the data path when the stub table is assigned to it. Fix this by only assigning the stub table to the old nexthop group after making sure the group is no longer used by the data path. Tested with fib_nexthops.sh: Tests passed: 222 Tests failed: 0 [1] divide error: 0000 [#1] PREEMPT SMP KASAN CPU: 0 PID: 1850 Comm: ping Not tainted 5.14.0-custom-10271-ga86eb53057fe #1107 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-4.fc34 04/01/2014 RIP: 0010:nexthop_select_path+0x2d2/0x1a80 [...] Call Trace: fib_select_multipath+0x79b/0x1530 fib_select_path+0x8fb/0x1c10 ip_route_output_key_hash_rcu+0x1198/0x2da0 ip_route_output_key_hash+0x190/0x340 ip_route_output_flow+0x21/0x120 raw_sendmsg+0x91d/0x2e10 inet_sendmsg+0x9e/0xe0 __sys_sendto+0x23d/0x360 __x64_sys_sendto+0xe1/0x1b0 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae Cc: [email protected] Fixes: 283a72a5599e ("nexthop: Add implementation of resilient next-hop groups") Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Petr Machata <[email protected]> Reviewed-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-09-20napi: fix race inside napi_enableXuan Zhuo1-6/+10
The process will cause napi.state to contain NAPI_STATE_SCHED and not in the poll_list, which will cause napi_disable() to get stuck. The prefix "NAPI_STATE_" is removed in the figure below, and NAPI_STATE_HASHED is ignored in napi.state. CPU0 | CPU1 | napi.state =============================================================================== napi_disable() | | SCHED | NPSVC napi_enable() | | { | | smp_mb__before_atomic(); | | clear_bit(SCHED, &n->state); | | NPSVC | napi_schedule_prep() | SCHED | NPSVC | napi_poll() | | napi_complete_done() | | { | | if (n->state & (NPSVC | | (1) | _BUSY_POLL))) | | return false; | | ................ | | } | SCHED | NPSVC | | clear_bit(NPSVC, &n->state); | | SCHED } | | | | napi_schedule_prep() | | SCHED | MISSED (2) (1) Here return direct. Because of NAPI_STATE_NPSVC exists. (2) NAPI_STATE_SCHED exists. So not add napi.poll_list to sd->poll_list Since NAPI_STATE_SCHED already exists and napi is not in the sd->poll_list queue, NAPI_STATE_SCHED cannot be cleared and will always exist. 1. This will cause this queue to no longer receive packets. 2. If you encounter napi_disable under the protection of rtnl_lock, it will cause the entire rtnl_lock to be locked, affecting the overall system. This patch uses cmpxchg to implement napi_enable(), which ensures that there will be no race due to the separation of clear two bits. Fixes: 2d8bff12699abc ("netpoll: Close race condition between poll_one_napi and napi_disable") Signed-off-by: Xuan Zhuo <[email protected]> Reviewed-by: Dust Li <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-09-20Merge tag 'fpga-fixes-5.15' of ↵Greg Kroah-Hartman2-7/+13
git://git.kernel.org/pub/scm/linux/kernel/git/mdf/linux-fpga into char-misc-linus FPGA Manager fixes for 5.15 Tom and Jiapeng's fixes address smatch warnings around missing return values in error cases. Russ' change addresses an issue where registers are being accessed too early resulting in invalid data being read. All patches have been reviewed on the mailing list, and have been in the last few linux-next releases (as part of my fixes branch) without issues. Signed-off-by: Moritz Fischer <[email protected]> * tag 'fpga-fixes-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/mdf/linux-fpga: fpga: dfl: Avoid reads to AFU CSRs during enumeration fpga: machxo2-spi: Fix missing error code in machxo2_write_complete() fpga: machxo2-spi: Return an error on failure
2021-09-20Merge tag 'misc-habanalabs-fixes-2021-09-19' of ↵Greg Kroah-Hartman5-74/+134
https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/linux into char-misc-linus Oded writes: This tag contains the following fixes for 5.15-rc3: - Fix potential race when user waiting for interrupt ioctl - Prevent possible kernel oops in staged CS ioctl - Use direct MSI mechanism in Gaudi as a WA for a H/W issue regarding FLR - Don't support collective wait ioctl operation when it is not supported. e.g. when the NIC ports are disabled - Fix configuration of one of the security mechanism. - Change error print to be rate-limited as it can be initiated by the user and spam the kernel log - Fix return value of CS ioctl when doing staged CS - Fix CS ioctl code when user doesn't supply an offset for the memory area that we use as fence. - Spelling mistake fix * tag 'misc-habanalabs-fixes-2021-09-19' of https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/linux: habanalabs: expose a single cs seq in staged submissions habanalabs: fix wait offset handling habanalabs: rate limit multi CS completion errors habanalabs/gaudi: fix LBW RR configuration habanalabs: Fix spelling mistake "FEADBACK" -> "FEEDBACK" habanalabs: fail collective wait when not supported habanalabs/gaudi: use direct MSI in single mode habanalabs: fix kernel OOPs related to staged cs habanalabs: fix potential race in interrupt wait ioctl
2021-09-19init: don't panic if mount_nodev_root failedLeon Romanovsky1-3/+0
Attempt to mount 9p file system as root gives the following kernel panic: 9pnet_virtio: no channels available for device root Kernel panic - not syncing: VFS: Unable to mount root "root" (9p), err=-2 CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.15.0-rc1+ #127 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: dump_stack_lvl+0x45/0x59 panic+0x1e2/0x44b ? __warn_printk+0xf3/0xf3 ? free_unref_page+0x2d4/0x4a0 ? trace_hardirqs_on+0x32/0x120 ? free_unref_page+0x2d4/0x4a0 mount_root+0x189/0x1e0 prepare_namespace+0x136/0x165 kernel_init_freeable+0x3b8/0x3cb ? rest_init+0x2e0/0x2e0 kernel_init+0x19/0x130 ret_from_fork+0x1f/0x30 Kernel Offset: disabled ---[ end Kernel panic - not syncing: VFS: Unable to mount root "root" (9p), err=-2 ]--- QEMU command line: "qemu-system-x86_64 -append root=/dev/root rw rootfstype=9p rootflags=trans=virtio ..." This error is because root_device_name is truncated in prepare_namespace() from being "/dev/root" to be "root" prior to call to mount_nodev_root(). As a solution, don't treat errors in mount_nodev_root() as errors that require panics and allow failback to the mount flow that existed before patch citied in Fixes tag. Fixes: f9259be6a9e7 ("init: allow mounting arbitrary non-blockdevice filesystems as root") Signed-off-by: Leon Romanovsky <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Al Viro <[email protected]>
2021-09-19init/do_mounts.c: Harden split_fs_names() against buffer overflowVivek Goyal1-11/+16
split_fs_names() currently takes comma separate list of filesystems and converts it into individual filesystem strings. Pleaces these strings in the input buffer passed by caller and returns number of strings. If caller manages to pass input string bigger than buffer, then we can write beyond the buffer. Or if string just fits buffer, we will still write beyond the buffer as we append a '\0' byte at the end. Pass size of input buffer to split_fs_names() and put enough checks in place so such buffer overrun possibilities do not occur. This patch does few things. - Add a parameter "size" to split_fs_names(). This specifies size of input buffer. - Use strlcpy() (instead of strcpy()) so that we can't go beyond buffer size. If input string "names" is larger than passed in buffer, input string will be truncated to fit in buffer. - Stop appending extra '\0' character at the end and avoid one possibility of going beyond the input buffer size. - Do not use extra loop to count number of strings. - Previously if one passed "rootfstype=foo,,bar", split_fs_names() will return only 1 string "foo" (and "bar" will be truncated due to extra ,). After this patch, now split_fs_names() will return 3 strings ("foo", zero-sized-string, and "bar"). Callers of split_fs_names() have been modified to check for zero sized string and skip to next one. Reported-by: xu xin <[email protected]> Signed-off-by: Vivek Goyal <[email protected]> Reviewed-by: Jan Kara <[email protected]> Signed-off-by: Al Viro <[email protected]>
2021-09-19Linux 5.15-rc2Linus Torvalds1-1/+1
2021-09-19pci_iounmap'2: Electric Boogaloo: try to make sense of it allLinus Torvalds2-23/+46
Nathan Chancellor reports that the recent change to pci_iounmap in commit 9caea0007601 ("parisc: Declare pci_iounmap() parisc version only when CONFIG_PCI enabled") causes build errors on arm64. It took me about two hours to convince myself that I think I know what the logic of that mess of #ifdef's in the <asm-generic/io.h> header file really aim to do, and rewrite it to be easier to follow. Famous last words. Anyway, the code has now been lifted from that grotty header file into lib/pci_iomap.c, and has fairly extensive comments about what the logic is. It also avoids indirecting through another confusing (and badly named) helper function that has other preprocessor config conditionals. Let's see what odd architecture did something else strange in this area to break things. But my arm64 cross build is clean. Fixes: 9caea0007601 ("parisc: Declare pci_iounmap() parisc version only when CONFIG_PCI enabled") Reported-by: Nathan Chancellor <[email protected]> Cc: Helge Deller <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Guenter Roeck <[email protected]> Cc: Ulrich Teichert <[email protected]> Cc: James Bottomley <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-19Merge tag 'x86_urgent_for_v5.15_rc2' of ↵Linus Torvalds5-15/+47
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Prevent a infinite loop in the MCE recovery on return to user space, which was caused by a second MCE queueing work for the same page and thereby creating a circular work list. - Make kern_addr_valid() handle existing PMD entries, which are marked not present in the higher level page table, correctly instead of blindly dereferencing them. - Pass a valid address to sanitize_phys(). This was caused by the mixture of inclusive and exclusive ranges. memtype_reserve() expect 'end' being exclusive, but sanitize_phys() wants it inclusive. This worked so far, but with end being the end of the physical address space the fail is exposed. - Increase the maximum supported GPIO numbers for 64bit. Newer SoCs exceed the previous maximum. * tag 'x86_urgent_for_v5.15_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce: Avoid infinite loop for copy from user recovery x86/mm: Fix kern_addr_valid() to cope with existing but not present entries x86/platform: Increase maximum GPIO number for X86_64 x86/pat: Pass valid address to sanitize_phys()
2021-09-19Merge tag 'perf-urgent-2021-09-19' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf event fix from Thomas Gleixner: "A single fix for the perf core where a value read with READ_ONCE() was checked and then reread which makes all the checks invalid. Reuse the already read value instead" * tag 'perf-urgent-2021-09-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: events: Reuse value read using READ_ONCE instead of re-reading it
2021-09-19Merge tag 'locking-urgent-2021-09-19' of ↵Linus Torvalds1-20/+45
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Thomas Gleixner: "A set of updates for the RT specific reader/writer locking base code: - Make the fast path reader ordering guarantees correct. - Code reshuffling to make the fix simpler" [ This plays ugly games with atomic_add_return_release() because we don't have a plain atomic_add_release(), and should really be cleaned up, I think - Linus ] * tag 'locking-urgent-2021-09-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/rwbase: Take care of ordering guarantee for fastpath reader locking/rwbase: Extract __rwbase_write_trylock() locking/rwbase: Properly match set_and_save_state() to restore_state()
2021-09-19Merge tag 'powerpc-5.15-2' of ↵Linus Torvalds7-55/+159
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix crashes when scv (System Call Vectored) is used to make a syscall when a transaction is active, on Power9 or later. - Fix bad interactions between rfscv (Return-from scv) and Power9 fake-suspend mode. - Fix crashes when handling machine checks in LPARs using the Hash MMU. - Partly revert a recent change to our XICS interrupt controller code, which broke the recently added Microwatt support. Thanks to Cédric Le Goater, Eirik Fuller, Ganesh Goudar, Gustavo Romero, Joel Stanley, Nicholas Piggin. * tag 'powerpc-5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/xics: Set the IRQ chip data for the ICS native backend powerpc/mce: Fix access error in mce handler KVM: PPC: Book3S HV: Tolerate treclaim. in fake-suspend mode changing registers powerpc/64s: system call rfscv workaround for TM bugs selftests/powerpc: Add scv versions of the basic TM syscall tests powerpc/64s: system call scv tabort fix for corrupt irq soft-mask state
2021-09-19Merge tag 'kbuild-fixes-v5.15' of ↵Linus Torvalds6-20/+27
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - Fix bugs in checkkconfigsymbols.py - Fix missing sys import in gen_compile_commands.py - Fix missing FORCE warning for ARCH=sh builds - Fix -Wignored-optimization-argument warnings for Clang builds - Turn -Wignored-optimization-argument into an error in order to stop building instead of sprinkling warnings * tag 'kbuild-fixes-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kbuild: Add -Werror=ignored-optimization-argument to CLANG_FLAGS x86/build: Do not add -falign flags unconditionally for clang kbuild: Fix comment typo in scripts/Makefile.modpost sh: Add missing FORCE prerequisites in Makefile gen_compile_commands: fix missing 'sys' package checkkconfigsymbols.py: Remove skipping of help lines in parse_kconfig_file checkkconfigsymbols.py: Forbid passing 'HEAD' to --commit
2021-09-19Merge tag 'perf-tools-fixes-for-v5.15-2021-09-18' of ↵Linus Torvalds7-51/+100
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tools fixes from Arnaldo Carvalho de Melo: - Fix ip display in 'perf script' when output type != attr->type. - Ignore deprecation warning when using libbpf'sg btf__get_from_id(), fixing the build with libbpf v0.6+. - Make use of FD() robust in libperf, fixing a segfault with 'perf stat --iostat list'. - Initialize addr_location:srcline pointer to NULL when resolving callchain addresses. - Fix fused instruction logic for assembly functions in 'perf annotate'. * tag 'perf-tools-fixes-for-v5.15-2021-09-18' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: perf bpf: Ignore deprecation warning when using libbpf's btf__get_from_id() libperf evsel: Make use of FD robust. perf machine: Initialize srcline string member in add_location struct perf script: Fix ip display when type != attr->type perf annotate: Fix fused instr logic for assembly functions
2021-09-19dmascc: use proper 'virt_to_bus()' rather than casting to 'int'Linus Torvalds1-3/+3
The old dmascc driver depends on the legacy ISA_DMA_API, and blindly just casts the kernel virtual address to 'int' for set_dma_addr(). That works only incidentally, and because the high bits of the address will be ignored anyway. And on 64-bit architectures it causes warnings. Admittedly, 64-bit architectures with ISA are basically dead - I think the only example of this is alpha, and nobody would ever use the dmascc driver there. But hey, the fix is easy enough, the end result is cleaner, and it's yet another configuration that now builds without warnings. If somebody actually uses this driver on an alpha and this fixes it for you, please email me. Because that is just incredibly bizarre. Signed-off-by: Linus Torvalds <[email protected]>
2021-09-19alpha: enable GENERIC_PCI_IOMAP unconditionallyLinus Torvalds1-1/+1
With the previous commit (9caea0007601: "parisc: Declare pci_iounmap() parisc version only when CONFIG_PCI enabled") we can now enable GENERIC_PCI_IOMAP unconditionally on alpha, and if PCI is not enabled we will just get the nice empty helper functions that allow mixed-bus drivers to build. Example driver: the old 3com/3c59x.c driver works with either the PCI or the EISA version of the 3x59x card, but wouldn't build in an EISA-only configuration because of missing pci_iomap() and pci_iounmap() dummy wrappers. Most of the other PCI infrastructure just becomes empty wrappers even without GENERIC_PCI_IOMAP, and it's not obvious that the pci_iomap functionality shouldn't do the same, but this works. Cc: Ulrich Teichert <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-19parisc: Declare pci_iounmap() parisc version only when CONFIG_PCI enabledHelge Deller3-11/+6
Linus noticed odd declaration rules for pci_iounmap() in iomap.h and pci_iomap.h, where it dependend on either NO_GENERIC_PCI_IOPORT_MAP or GENERIC_IOMAP when CONFIG_PCI was disabled. Testing on parisc seems to indicate that we need pci_iounmap() only when CONFIG_PCI is enabled, so the declaration of pci_iounmap() can be moved cleanly into pci_iomap.h in sync with the declarations of pci_iomap(). Link: https://lore.kernel.org/all/CAHk-=wjRrh98pZoQ+AzfWmsTZacWxTJKXZ9eKU2X_0+jM=O8nw@mail.gmail.com/ Signed-off-by: Helge Deller <[email protected]> Suggested-by: Linus Torvalds <[email protected]> Fixes: 97a29d59fc22 ("[PARISC] fix compile break caused by iomap: make IOPORT/PCI mapping functions conditional") Cc: Arnd Bergmann <[email protected]> Cc: Guenter Roeck <[email protected]> Cc: Ulrich Teichert <[email protected]> Cc: James Bottomley <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-19Revert "drm/vc4: hdmi: Remove drm_encoder->crtc usage"Linus Torvalds1-27/+13
This reverts commit 27da370e0fb343a0baf308f503bb3e5dcdfe3362. Sudip Mukherjee reports that this broke pulseaudio with a NULL pointer dereference in vc4_hdmi_audio_prepare(), bisected it to this commit, and confirmed that a revert fixed the problem. Revert the problematic commit until fixed. Link: https://lore.kernel.org/all/CADVatmPB9-oKd=ypvj25UYysVo6EZhQ6bCM7EvztQBMyiZfAyw@mail.gmail.com/ Link: https://lore.kernel.org/all/CADVatmN5EpRshGEPS_JozbFQRXg5w_8LFB3OMP1Ai-ghxd3w4g@mail.gmail.com/ Reported-and-tested-by: Sudip Mukherjee <[email protected]> Cc: Maxime Ripard <[email protected]> Cc: Emma Anholt <[email protected]> Cc: Dave Airlie <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>