aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2020-11-18atm: nicstar: Unmap DMA on send errorSebastian Andrzej Siewior1-0/+2
The `skb' is mapped for DMA in ns_send() but does not unmap DMA in case push_scqe() fails to submit the `skb'. The memory of the `skb' is released so only the DMA mapping is leaking. Unmap the DMA mapping in case push_scqe() failed. Fixes: 864a3ff635fa7 ("atm: [nicstar] remove virt_to_bus() and support 64-bit platforms") Cc: Chas Williams <[email protected]> Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2020-11-18page_frag: Recover from memory pressureDongli Zhang1-0/+5
The ethernet driver may allocate skb (and skb->data) via napi_alloc_skb(). This ends up to page_frag_alloc() to allocate skb->data from page_frag_cache->va. During the memory pressure, page_frag_cache->va may be allocated as pfmemalloc page. As a result, the skb->pfmemalloc is always true as skb->data is from page_frag_cache->va. The skb will be dropped if the sock (receiver) does not have SOCK_MEMALLOC. This is expected behaviour under memory pressure. However, once kernel is not under memory pressure any longer (suppose large amount of memory pages are just reclaimed), the page_frag_alloc() may still re-use the prior pfmemalloc page_frag_cache->va to allocate skb->data. As a result, the skb->pfmemalloc is always true unless page_frag_cache->va is re-allocated, even if the kernel is not under memory pressure any longer. Here is how kernel runs into issue. 1. The kernel is under memory pressure and allocation of PAGE_FRAG_CACHE_MAX_ORDER in __page_frag_cache_refill() will fail. Instead, the pfmemalloc page is allocated for page_frag_cache->va. 2: All skb->data from page_frag_cache->va (pfmemalloc) will have skb->pfmemalloc=true. The skb will always be dropped by sock without SOCK_MEMALLOC. This is an expected behaviour. 3. Suppose a large amount of pages are reclaimed and kernel is not under memory pressure any longer. We expect skb->pfmemalloc drop will not happen. 4. Unfortunately, page_frag_alloc() does not proactively re-allocate page_frag_alloc->va and will always re-use the prior pfmemalloc page. The skb->pfmemalloc is always true even kernel is not under memory pressure any longer. Fix this by freeing and re-allocating the page instead of recycling it. References: https://lore.kernel.org/lkml/[email protected]/ References: https://lore.kernel.org/linux-mm/[email protected]/ Suggested-by: Matthew Wilcox (Oracle) <[email protected]> Cc: Aruna Ramakrishna <[email protected]> Cc: Bert Barbe <[email protected]> Cc: Rama Nichanamatlu <[email protected]> Cc: Venkat Venkatsubra <[email protected]> Cc: Manjunath Patil <[email protected]> Cc: Joe Jin <[email protected]> Cc: SRINIVAS <[email protected]> Fixes: 79930f5892e1 ("net: do not deplete pfmemalloc reserve") Signed-off-by: Dongli Zhang <[email protected]> Acked-by: Vlastimil Babka <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-11-18drm/amd/display: Always get CRTC updated constant values inside commit tailRodrigo Siqueira1-1/+2
We recently improved our display atomic commit and tail sequence to avoid some issues related to concurrency. One of the major changes consisted of moving the interrupt disable and the stream release from our atomic commit to our atomic tail (commit 6d90a208cfff ("drm/amd/display: Move disable interrupt into commit tail")) . However, the new code introduced inside our commit tail function was inserted right after the function drm_atomic_helper_update_legacy_modeset_state(), which has routines for updating internal data structs related to timestamps. As a result, in certain conditions, the display module can reach a situation where we update our constants and, after that, clean it. This situation generates the following warning: amdgpu 0000:03:00.0: drm_WARN_ON_ONCE(drm_drv_uses_atomic_modeset(dev)) WARNING: CPU: 6 PID: 1269 at drivers/gpu/drm/drm_vblank.c:722 drm_crtc_vblank_helper_get_vblank_timestamp_internal+0x32b/0x340 [drm] ... RIP: 0010:drm_crtc_vblank_helper_get_vblank_timestamp_internal+0x32b/0x340 [drm] ... Call Trace: ? dc_stream_get_vblank_counter+0x57/0x60 [amdgpu] drm_crtc_vblank_helper_get_vblank_timestamp+0x1c/0x20 [drm] drm_get_last_vbltimestamp+0xad/0xc0 [drm] drm_reset_vblank_timestamp+0x63/0xd0 [drm] drm_crtc_vblank_on+0x85/0x150 [drm] amdgpu_dm_atomic_commit_tail+0xaf1/0x2330 [amdgpu] commit_tail+0x99/0x130 [drm_kms_helper] drm_atomic_helper_commit+0x123/0x150 [drm_kms_helper] amdgpu_dm_atomic_commit+0x11/0x20 [amdgpu] drm_atomic_commit+0x4a/0x50 [drm] drm_atomic_helper_set_config+0x7c/0xc0 [drm_kms_helper] drm_mode_setcrtc+0x20b/0x7e0 [drm] ? tomoyo_path_number_perm+0x6f/0x200 ? drm_mode_getcrtc+0x190/0x190 [drm] drm_ioctl_kernel+0xae/0xf0 [drm] drm_ioctl+0x245/0x400 [drm] ? drm_mode_getcrtc+0x190/0x190 [drm] amdgpu_drm_ioctl+0x4e/0x80 [amdgpu] __x64_sys_ioctl+0x91/0xc0 do_syscall_64+0x38/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xa9 ... For fixing this issue we rely upon a refactor introduced on drm_atomic_helper_update_legacy_modeset_state ("Remove the timestamping constant update from drm_atomic_helper_update_legacy_modeset_state()") which decouples constant values update from drm_atomic_helper_update_legacy_modeset_state to a new helper. Basically, this commit uses this new helper and place it right after our release module to avoid a situation where our CRTC struct gets wrong values. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1373 Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1349 Reviewed-by: Harry Wentland <[email protected]> Signed-off-by: Rodrigo Siqueira <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-11-18Merge tag 'gfs2-v5.10-rc4-fixes' of ↵Linus Torvalds1-1/+12
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 fix from Andreas Gruenbacher: "Fix gfs2 freeze/thaw" * tag 'gfs2-v5.10-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: gfs2: Fix regression in freeze_go_sync
2020-11-18Merge tag 'nfsd-5.10-2' of git://linux-nfs.org/~bfields/linuxLinus Torvalds1-1/+2
Pull nfsd fix from Bruce Fields: "Just one quick fix for a tracing oops" * tag 'nfsd-5.10-2' of git://linux-nfs.org/~bfields/linux: SUNRPC: Fix oops in the rpc_xdr_buf event class
2020-11-18Merge tag 'linux-kselftest-kunit-fixes-5.10-rc5' of ↵Linus Torvalds9-54/+80
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull Kunit fixes from Shuah Khan: "Several fixes to Kunit documentation and tools, and to not pollute the source directory. Also remove the incorrect kunit .gitattributes file" * tag 'linux-kselftest-kunit-fixes-5.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: kunit: fix display of failed expectations for strings kunit: tool: fix extra trailing \n in raw + parsed test output kunit: tool: print out stderr from make (like build warnings) KUnit: Docs: usage: wording fixes KUnit: Docs: style: fix some Kconfig example issues KUnit: Docs: fix a wording typo kunit: Do not pollute source directory with generated files (test.log) kunit: Do not pollute source directory with generated files (.kunitconfig) kunit: tool: fix pre-existing python type annotation errors kunit: Fix kunit.py parse subcommand (use null build_dir) kunit: tool: unmark test_data as binary blobs
2020-11-18net: dsa: mv88e6xxx: Wait for EEPROM done after HW resetAndrew Lunn3-0/+34
When the switch is hardware reset, it reads the contents of the EEPROM. This can contain instructions for programming values into registers and to perform waits between such programming. Reading the EEPROM can take longer than the 100ms mv88e6xxx_hardware_reset() waits after deasserting the reset GPIO. So poll the EEPROM done bit to ensure it is complete. Signed-off-by: Andrew Lunn <[email protected]> Signed-off-by: Ruslan Sushko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-11-18Merge branch 'mlxsw-couple-of-fixes'Jakub Kicinski2-2/+3
Ido Schimmel says: ==================== mlxsw: Couple of fixes Patch #1 fixes firmware flashing when CONFIG_MLXSW_CORE=y and CONFIG_MLXFW=m. Patch #2 prevents EMAD transactions from needlessly failing when the system is under heavy load by using exponential backoff. Please consider patch #2 for stable. ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-11-18mlxsw: core: Use variable timeout for EMAD retriesIdo Schimmel1-1/+2
The driver sends Ethernet Management Datagram (EMAD) packets to the device for configuration purposes and waits for up to 200ms for a reply. A request is retried up to 5 times. When the system is under heavy load, replies are not always processed in time and EMAD transactions fail. Make the process more robust to such delays by using exponential backoff. First wait for up to 200ms, then retransmit and wait for up to 400ms and so on. Fixes: caf7297e7ab5 ("mlxsw: core: Introduce support for asynchronous EMAD register access") Reported-by: Denis Yulevich <[email protected]> Tested-by: Denis Yulevich <[email protected]> Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2020-11-18mlxsw: Fix firmware flashingIdo Schimmel1-1/+1
The commit cited below moved firmware flashing functionality from mlxsw_spectrum to mlxsw_core, but did not adjust the Kconfig dependencies. This makes it possible to have mlxsw_core as built-in and mlxfw as a module. The mlxfw code is therefore not reachable from mlxsw_core and firmware flashing fails: # devlink dev flash pci/0000:01:00.0 file mellanox/mlxsw_spectrum-13.2008.1310.mfa2 devlink answers: Operation not supported Fix by having mlxsw_core select mlxfw. Fixes: b79cb787ac70 ("mlxsw: Move fw flashing code into core.c") Signed-off-by: Ido Schimmel <[email protected]> Reported-by: Vadim Pasternak <[email protected]> Tested-by: Vadim Pasternak <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2020-11-18net: Have netpoll bring-up DSA management interfaceFlorian Fainelli1-4/+18
DSA network devices rely on having their DSA management interface up and running otherwise their ndo_open() will return -ENETDOWN. Without doing this it would not be possible to use DSA devices as netconsole when configured on the command line. These devices also do not utilize the upper/lower linking so the check about the netpoll device having upper is not going to be a problem. The solution adopted here is identical to the one done for net/ipv4/ipconfig.c with 728c02089a0e ("net: ipv4: handle DSA enabled master network devices"), with the network namespace scope being restricted to that of the process configuring netpoll. Fixes: 04ff53f96a93 ("net: dsa: Add netconsole support") Tested-by: Vladimir Oltean <[email protected]> Signed-off-by: Florian Fainelli <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-11-18atl1e: fix error return code in atl1e_probe()Zhang Changzhong1-2/+2
Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: a6a5325239c2 ("atl1e: Atheros L1E Gigabit Ethernet driver") Reported-by: Hulk Robot <[email protected]> Signed-off-by: Zhang Changzhong <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-11-18atl1c: fix error return code in atl1c_probe()Zhang Changzhong1-2/+2
Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: 43250ddd75a3 ("atl1c: Atheros L1C Gigabit Ethernet driver") Reported-by: Hulk Robot <[email protected]> Signed-off-by: Zhang Changzhong <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-11-18ah6: fix error return code in ah6_input()Zhang Changzhong1-1/+2
Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: Hulk Robot <[email protected]> Signed-off-by: Zhang Changzhong <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-11-18net: usb: qmi_wwan: Set DTR quirk for MR400Filip Moc1-1/+1
LTE module MR400 embedded in TL-MR6400 v4 requires DTR to be set. Signed-off-by: Filip Moc <[email protected]> Acked-by: Bjørn Mork <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-11-18regulator: ti-abb: Fix array out of bound read access on the first transitionNishanth Menon1-1/+11
At the start of driver initialization, we do not know what bias setting the bootloader has configured the system for and we only know for certain the very first time we do a transition. However, since the initial value of the comparison index is -EINVAL, this negative value results in an array out of bound access on the very first transition. Since we don't know what the setting is, we just set the bias configuration as there is nothing to compare against. This prevents the array out of bound access. NOTE: Even though we could use a more relaxed check of "< 0" the only valid values(ignoring cosmic ray induced bitflips) are -EINVAL, 0+. Fixes: 40b1936efebd ("regulator: Introduce TI Adaptive Body Bias(ABB) on-chip LDO driver") Link: https://lore.kernel.org/linux-mm/CA+G9fYuk4imvhyCN7D7T6PMDH6oNp6HDCRiTUKMQ6QXXjBa4ag@mail.gmail.com/ Reported-by: Naresh Kamboju <[email protected]> Reviewed-by: Arnd Bergmann <[email protected]> Signed-off-by: Nishanth Menon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]>
2020-11-18ASOC: Intel: kbl_rt5663_rt5514_max98927: Do not try to disable disabled clockGuenter Roeck1-0/+2
In kabylake_set_bias_level(), enabling mclk may fail if the clock has already been enabled by the firmware. Attempts to disable that clock later will fail with a warning backtrace. mclk already disabled WARNING: CPU: 2 PID: 108 at drivers/clk/clk.c:952 clk_core_disable+0x1b6/0x1cf ... Call Trace: clk_disable+0x2d/0x3a kabylake_set_bias_level+0x72/0xfd [snd_soc_kbl_rt5663_rt5514_max98927] snd_soc_card_set_bias_level+0x2b/0x6f snd_soc_dapm_set_bias_level+0xe1/0x209 dapm_pre_sequence_async+0x63/0x96 async_run_entry_fn+0x3d/0xd1 process_one_work+0x2a9/0x526 ... Only disable the clock if it has been enabled. Fixes: 15747a802075 ("ASoC: eve: implement set_bias_level function for rt5514") Cc: Brent Lu <[email protected]> Cc: Curtis Malainey <[email protected]> Signed-off-by: Guenter Roeck <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]>
2020-11-18xfs: return corresponding errcode if xfs_initialize_perag() failYu Kuai1-3/+8
In xfs_initialize_perag(), if kmem_zalloc(), xfs_buf_hash_init(), or radix_tree_preload() failed, the returned value 'error' is not set accordingly. Reported-as-fixing: 8b26c5825e02 ("xfs: handle ENOMEM correctly during initialisation of perag structures") Fixes: 9b2471797942 ("xfs: cache unlinked pointers in an rhashtable") Reported-by: Hulk Robot <[email protected]> Signed-off-by: Yu Kuai <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2020-11-18xfs: ensure inobt record walks always make forward progressDarrick J. Wong1-3/+24
The aim of the inode btree record iterator function is to call a callback on every record in the btree. To avoid having to tear down and recreate the inode btree cursor around every callback, it caches a certain number of records in a memory buffer. After each batch of callback invocations, we have to perform a btree lookup to find the next record after where we left off. However, if the keys of the inode btree are corrupt, the lookup might put us in the wrong part of the inode btree, causing the walk function to loop forever. Therefore, we add extra cursor tracking to make sure that we never go backwards neither when performing the lookup nor when jumping to the next inobt record. This also fixes an off by one error where upon resume the lookup should have been for the inode /after/ the point at which we stopped. Found by fuzzing xfs/460 with keys[2].startino = ones causing bulkstat and quotacheck to hang. Fixes: a211432c27ff ("xfs: create simplified inode walk function") Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Chandan Babu R <[email protected]>
2020-11-18xfs: fix forkoff miscalculation related to XFS_LITINO(mp)Gao Xiang1-1/+7
Currently, commit e9e2eae89ddb dropped a (int) decoration from XFS_LITINO(mp), and since sizeof() expression is also involved, the result of XFS_LITINO(mp) is simply as the size_t type (commonly unsigned long). Considering the expression in xfs_attr_shortform_bytesfit(): offset = (XFS_LITINO(mp) - bytes) >> 3; let "bytes" be (int)340, and "XFS_LITINO(mp)" be (unsigned long)336. on 64-bit platform, the expression is offset = ((unsigned long)336 - (int)340) >> 3 = (int)(0xfffffffffffffffcUL >> 3) = -1 but on 32-bit platform, the expression is offset = ((unsigned long)336 - (int)340) >> 3 = (int)(0xfffffffcUL >> 3) = 0x1fffffff instead. so offset becomes a large positive number on 32-bit platform, and cause xfs_attr_shortform_bytesfit() returns maxforkoff rather than 0. Therefore, one result is "ASSERT(new_size <= XFS_IFORK_SIZE(ip, whichfork));" assertion failure in xfs_idata_realloc(), which was also the root cause of the original bugreport from Dennis, see: https://bugzilla.redhat.com/show_bug.cgi?id=1894177 And it can also be manually triggered with the following commands: $ touch a; $ setfattr -n user.0 -v "`seq 0 80`" a; $ setfattr -n user.1 -v "`seq 0 80`" a on 32-bit platform. Fix the case in xfs_attr_shortform_bytesfit() by bailing out "XFS_LITINO(mp) < bytes" in advance suggested by Eric and a misleading comment together with this bugfix suggested by Darrick. It seems the other users of XFS_LITINO(mp) are not impacted. Fixes: e9e2eae89ddb ("xfs: only check the superblock version for dinode size calculation") Cc: <[email protected]> # 5.7+ Reported-and-tested-by: Dennis Gilmore <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Gao Xiang <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2020-11-18xfs: directory scrub should check the null bestfree entries tooDarrick J. Wong1-4/+17
Teach the directory scrubber to check all the bestfree entries, including the null ones. We want to be able to detect the case where the entry is null but there actually /is/ a directory data block. Found by fuzzing lbests[0] = ones in xfs/391. Fixes: df481968f33b ("xfs: scrub directory freespace") Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Chandan Babu R <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2020-11-18xfs: strengthen rmap record flags checkingDarrick J. Wong1-4/+4
We always know the correct state of the rmap record flags (attr, bmbt, unwritten) so check them by direct comparison. Fixes: d852657ccfc0 ("xfs: cross-reference reverse-mapping btree") Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Chandan Babu R <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2020-11-18xfs: fix the minrecs logic when dealing with inode root child blocksDarrick J. Wong1-18/+27
The comment and logic in xchk_btree_check_minrecs for dealing with inode-rooted btrees isn't quite correct. While the direct children of the inode root are allowed to have fewer records than what would normally be allowed for a regular ondisk btree block, this is only true if there is only one child block and the number of records don't fit in the inode root. Fixes: 08a3a692ef58 ("xfs: btree scrub should check minrecs") Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Chandan Babu R <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2020-11-18can: m_can: process interrupt only when not runtime suspendedJarkko Nikula1-0/+2
Avoid processing bogus interrupt statuses when the HW is runtime suspended and the M_CAN_IR register read may get all bits 1's. Handler can be called if the interrupt request is shared with other peripherals or at the end of free_irq(). Therefore check the runtime suspended status before processing. Fixes: cdf8259d6573 ("can: m_can: Add PM Support") Signed-off-by: Jarkko Nikula <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Marc Kleine-Budde <[email protected]>
2020-11-18gfs2: Fix regression in freeze_go_syncBob Peterson1-1/+12
Patch 541656d3a513 ("gfs2: freeze should work on read-only mounts") changed the check for glock state in function freeze_go_sync() from "gl->gl_state == LM_ST_SHARED" to "gl->gl_req == LM_ST_EXCLUSIVE". That's wrong and it regressed gfs2's freeze/thaw mechanism because it caused only the freezing node (which requests the glock in EX) to queue freeze work. All nodes go through this go_sync code path during the freeze to drop their SHared hold on the freeze glock, allowing the freezing node to acquire it in EXclusive mode. But all the nodes must freeze access to the file system locally, so they ALL must queue freeze work. The freeze_work calls freeze_func, which makes a request to reacquire the freeze glock in SH, effectively blocking until the thaw from the EX holder. Once thawed, the freezing node drops its EX hold on the freeze glock, then the (blocked) freeze_func reacquires the freeze glock in SH again (on all nodes, including the freezer) so all nodes go back to a thawed state. This patch changes the check back to gl_state == LM_ST_SHARED like it was prior to 541656d3a513. Fixes: 541656d3a513 ("gfs2: freeze should work on read-only mounts") Cc: [email protected] # v5.8+ Signed-off-by: Bob Peterson <[email protected]> Signed-off-by: Andreas Gruenbacher <[email protected]>
2020-11-18can: flexcan: flexcan_chip_start(): fix erroneous ↵Marc Kleine-Budde1-9/+9
flexcan_transceiver_enable() during bus-off recovery If the CAN controller goes into bus off, the do_set_mode() callback with CAN_MODE_START can be used to recover the controller, which then calls flexcan_chip_start(). If configured, this is done automatically by the framework or manually by the user. In flexcan_chip_start() there is an explicit call to flexcan_transceiver_enable(), which does a regulator_enable() on the transceiver regulator. This results in a net usage counter increase, as there is no corresponding flexcan_transceiver_disable() in the bus off code path. This further leads to the transceiver stuck enabled, even if the CAN interface is shut down. To fix this problem the flexcan_transceiver_enable()/flexcan_transceiver_disable() are moved out of flexcan_chip_start()/flexcan_chip_stop() into flexcan_open()/flexcan_close(). Fixes: e955cead0311 ("CAN: Add Flexcan CAN controller driver") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Marc Kleine-Budde <[email protected]>
2020-11-18io_uring: order refnode recyclingPavel Begunkov1-10/+23
Don't recycle a refnode until we're done with all requests of nodes ejected before. Signed-off-by: Pavel Begunkov <[email protected]> Cc: [email protected] # v5.7+ Signed-off-by: Jens Axboe <[email protected]>
2020-11-18io_uring: get an active ref_node from files_dataPavel Begunkov1-3/+1
An active ref_node always can be found in ctx->files_data, it's much safer to get it this way instead of poking into files_data->ref_list. Signed-off-by: Pavel Begunkov <[email protected]> Cc: [email protected] # v5.7+ Signed-off-by: Jens Axboe <[email protected]>
2020-11-18iommu/vt-d: Avoid panic if iommu init fails in tboot systemZhenzhong Duan3-6/+3
"intel_iommu=off" command line is used to disable iommu but iommu is force enabled in a tboot system for security reason. However for better performance on high speed network device, a new option "intel_iommu=tboot_noforce" is introduced to disable the force on. By default kernel should panic if iommu init fail in tboot for security reason, but it's unnecessory if we use "intel_iommu=tboot_noforce,off". Fix the code setting force_on and move intel_iommu_tboot_noforce from tboot code to intel iommu code. Fixes: 7304e8f28bb2 ("iommu/vt-d: Correctly disable Intel IOMMU force on") Signed-off-by: Zhenzhong Duan <[email protected]> Tested-by: Lukasz Hawrylko <[email protected]> Acked-by: Lu Baolu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2020-11-18dmaengine: fix error codes in channel_register()Dan Carpenter1-8/+9
The error codes were not set on some of these error paths. Also the error handling was more confusing than it needed to be so I cleaned it up and shuffled it around a bit. Fixes: d2fb0a043838 ("dmaengine: break out channel registration") Signed-off-by: Dan Carpenter <[email protected]> Reviewed-by: Peter Ujfalusi <[email protected]> Link: https://lore.kernel.org/r/20201113101631.GE168908@mwanda Signed-off-by: Vinod Koul <[email protected]>
2020-11-18x86/dumpstack: Do not try to access user space code of other tasksThomas Gleixner1-4/+19
sysrq-t ends up invoking show_opcodes() for each task which tries to access the user space code of other processes, which is obviously bogus. It either manages to dump where the foreign task's regs->ip points to in a valid mapping of the current task or triggers a pagefault and prints "Code: Bad RIP value.". Both is just wrong. Add a safeguard in copy_code() and check whether the @regs pointer matches currents pt_regs. If not, do not even try to access it. While at it, add commentary why using copy_from_user_nmi() is safe in copy_code() even if the function name suggests otherwise. Reported-by: Oleg Nesterov <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Acked-by: Oleg Nesterov <[email protected]> Tested-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-11-18can: kvaser_usb: kvaser_usb_hydra: Fix KCAN bittiming limitsJimmy Assarsson1-1/+1
Use correct bittiming limits for the KCAN CAN controller. Fixes: aec5fb2268b7 ("can: kvaser_usb: Add support for Kvaser USB hydra family") Signed-off-by: Jimmy Assarsson <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Marc Kleine-Budde <[email protected]>
2020-11-18can: kvaser_pciefd: Fix KCAN bittiming limitsJimmy Assarsson1-2/+2
Use correct bittiming limits for the KCAN CAN controller. Fixes: 26ad340e582d ("can: kvaser_pciefd: Add driver for Kvaser PCIEcan devices") Signed-off-by: Jimmy Assarsson <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Marc Kleine-Budde <[email protected]>
2020-11-18drm/sun4i: backend: Fix probe failure with multiple backendsMaxime Ripard1-1/+7
Commit e0d072782c73 ("dma-mapping: introduce DMA range map, supplanting dma_pfn_offset") introduced a regression in our code since the second backed to probe will now get -EINVAL back from dma_direct_set_offset and will prevent the entire DRM device from probing. Ignore -EINVAL as a temporary measure to get it back working, before removing that call entirely. Fixes: e0d072782c73 ("dma-mapping: introduce DMA range map, supplanting dma_pfn_offset") Signed-off-by: Maxime Ripard <[email protected]> Reviewed-by: Chen-Yu Tsai <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Acked-by: Daniel Vetter <[email protected]>
2020-11-17ARC: stack unwinding: reorganize how initial register state setupVineet Gupta1-19/+18
This is a non-functional change, if anything a better fall-back handling. Signed-off-by: Vineet Gupta <[email protected]>
2020-11-17ARC: stack unwinding: don't assume non-current task is sleepingVineet Gupta1-8/+15
To start stack unwinding (SP, PC and BLINK) are needed. When the explicit execution context (pt_regs etc) is not available, unwinder assumes the task is sleeping (in __switch_to()) and fetches SP and BLINK from kernel mode stack. But this assumption is not true, specially in a SMP system, when top runs on 1 core, there may be active running processes on all cores. So when unwinding non courrent tasks, ensure they are NOT running. And while at it, handle the self unwinding case explicitly. This came out of investigation of a customer reported hang with rcutorture+top Link: https://github.com/foss-for-synopsys-dwc-arc-processors/linux/issues/31 Signed-off-by: Vineet Gupta <[email protected]>
2020-11-17ARC: mm: fix spelling mistakesFlavio Suligoi1-12/+12
Signed-off-by: Flavio Suligoi <[email protected]> Signed-off-by: Vineet Gupta <[email protected]>
2020-11-17ARC: bitops: Remove unecessary operation and valueGustavo Pimentel1-3/+1
The 1-bit shift rotation to the left on x variable located on 4 last if statement can be removed because the computed value is will not be used afront. Signed-off-by: Gustavo Pimentel <[email protected]> Signed-off-by: Vineet Gupta <[email protected]>
2020-11-17ipv4: use IS_ENABLED instead of ifdefFlorian Klink1-1/+1
Checking for ifdef CONFIG_x fails if CONFIG_x=m. Use IS_ENABLED instead, which is true for both built-ins and modules. Otherwise, a > ip -4 route add 1.2.3.4/32 via inet6 fe80::2 dev eth1 fails with the message "Error: IPv6 support not enabled in kernel." if CONFIG_IPV6 is `m`. In the spirit of b8127113d01e53adba15b41aefd37b90ed83d631. Fixes: d15662682db2 ("ipv4: Allow ipv6 gateway with ipv4 routes") Cc: Kim Phillips <[email protected]> Signed-off-by: Florian Klink <[email protected]> Reviewed-by: David Ahern <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-11-17qed: fix ILT configuration of SRC blockDmitry Bogdanov2-5/+2
The code refactoring of ILT configuration was not complete, the old unused variables were used for the SRC block. That could lead to the memory corruption by HW when rx filters are configured. This patch completes that refactoring. Fixes: 8a52bbab39c9 (qed: Debug feature: ilt and mdump) Signed-off-by: Igor Russkikh <[email protected]> Signed-off-by: Ariel Elior <[email protected]> Signed-off-by: Dmitry Bogdanov <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-11-17inet_diag: Fix error path to cancel the meseage in inet_req_diag_fill()Wang Hai1-1/+3
nlmsg_cancel() needs to be called in the error path of inet_req_diag_fill to cancel the message. Fixes: d545caca827b ("net: inet: diag: expose the socket mark to privileged processes.") Reported-by: Hulk Robot <[email protected]> Signed-off-by: Wang Hai <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-11-17tools/testing/scatterlist: Fix test to compile and runMaor Gottlieb2-2/+3
Add missing define of ALIGN_DOWN to make the test build and run. In addition, __sg_alloc_table_from_pages now support unaligned maximum segment, so adapt the test result accordingly. Fixes: 07da1223ec93 ("lib/scatterlist: Add support in dynamic allocation of SG table from pages") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Maor Gottlieb <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2020-11-18bpf, sockmap: Avoid failures from skb_to_sgvec when skb has frag_listJohn Fastabend1-2/+9
When skb has a frag_list its possible for skb_to_sgvec() to fail. This happens when the scatterlist has fewer elements to store pages than would be needed for the initial skb plus any of its frags. This case appears rare, but is possible when running an RX parser/verdict programs exposed to the internet. Currently, when this happens we throw an error, break the pipe, and kfree the msg. This effectively breaks the application or forces it to do a retry. Lets catch this case and handle it by doing an skb_linearize() on any skb we receive with frags. At this point skb_to_sgvec should not fail because the failing conditions would require frags to be in place. Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: John Fastabend <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Jakub Sitnicki <[email protected]> Link: https://lore.kernel.org/bpf/160556576837.73229.14800682790808797635.stgit@john-XPS-13-9370
2020-11-18bpf, sockmap: Handle memory acct if skb_verdict prog redirects to selfJohn Fastabend1-0/+8
If the skb_verdict_prog redirects an skb knowingly to itself, fix your BPF program this is not optimal and an abuse of the API please use SK_PASS. That said there may be cases, such as socket load balancing, where picking the socket is hashed based or otherwise picks the same socket it was received on in some rare cases. If this happens we don't want to confuse userspace giving them an EAGAIN error if we can avoid it. To avoid double accounting in these cases. At the moment even if the skb has already been charged against the sockets rcvbuf and forward alloc we check it again and do set_owner_r() causing it to be orphaned and recharged. For one this is useless work, but more importantly we can have a case where the skb could be put on the ingress queue, but because we are under memory pressure we return EAGAIN. The trouble here is the skb has already been accounted for so any rcvbuf checks include the memory associated with the packet already. This rolls up and can result in unnecessary EAGAIN errors in userspace read() calls. Fix by doing an unlikely check and skipping checks if skb->sk == sk. Fixes: 51199405f9672 ("bpf: skb_verdict, support SK_PASS on RX BPF path") Signed-off-by: John Fastabend <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Jakub Sitnicki <[email protected]> Link: https://lore.kernel.org/bpf/160556574804.73229.11328201020039674147.stgit@john-XPS-13-9370
2020-11-18bpf, sockmap: Avoid returning unneeded EAGAIN when redirecting to selfJohn Fastabend1-19/+53
If a socket redirects to itself and it is under memory pressure it is possible to get a socket stuck so that recv() returns EAGAIN and the socket can not advance for some time. This happens because when redirecting a skb to the same socket we received the skb on we first check if it is OK to enqueue the skb on the receiving socket by checking memory limits. But, if the skb is itself the object holding the memory needed to enqueue the skb we will keep retrying from kernel side and always fail with EAGAIN. Then userspace will get a recv() EAGAIN error if there are no skbs in the psock ingress queue. This will continue until either some skbs get kfree'd causing the memory pressure to reduce far enough that we can enqueue the pending packet or the socket is destroyed. In some cases its possible to get a socket stuck for a noticeable amount of time if the socket is only receiving skbs from sk_skb verdict programs. To reproduce I make the socket memory limits ridiculously low so sockets are always under memory pressure. More often though if under memory pressure it looks like a spurious EAGAIN error on user space side causing userspace to retry and typically enough has moved on the memory side that it works. To fix skip memory checks and skb_orphan if receiving on the same sock as already assigned. For SK_PASS cases this is easy, its always the same socket so we can just omit the orphan/set_owner pair. For backlog cases we need to check skb->sk and decide if the orphan and set_owner pair are needed. Fixes: 51199405f9672 ("bpf: skb_verdict, support SK_PASS on RX BPF path") Signed-off-by: John Fastabend <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Jakub Sitnicki <[email protected]> Link: https://lore.kernel.org/bpf/160556572660.73229.12566203819812939627.stgit@john-XPS-13-9370
2020-11-18bpf, sockmap: Use truesize with sk_rmem_schedule()John Fastabend1-1/+1
We use skb->size with sk_rmem_scheduled() which is not correct. Instead use truesize to align with socket and tcp stack usage of sk_rmem_schedule. Suggested-by: Daniel Borkman <[email protected]> Signed-off-by: John Fastabend <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Jakub Sitnicki <[email protected]> Link: https://lore.kernel.org/bpf/160556570616.73229.17003722112077507863.stgit@john-XPS-13-9370
2020-11-18bpf, sockmap: Ensure SO_RCVBUF memory is observed on ingress redirectJohn Fastabend2-5/+18
Fix sockmap sk_skb programs so that they observe sk_rcvbuf limits. This allows users to tune SO_RCVBUF and sockmap will honor them. We can refactor the if(charge) case out in later patches. But, keep this fix to the point. Fixes: 51199405f9672 ("bpf: skb_verdict, support SK_PASS on RX BPF path") Suggested-by: Jakub Sitnicki <[email protected]> Signed-off-by: John Fastabend <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Jakub Sitnicki <[email protected]> Link: https://lore.kernel.org/bpf/160556568657.73229.8404601585878439060.stgit@john-XPS-13-9370
2020-11-18bpf, sockmap: Fix partial copy_page_to_iter so progress can still be madeJohn Fastabend1-6/+9
If copy_page_to_iter() fails or even partially completes, but with fewer bytes copied than expected we currently reset sg.start and return EFAULT. This proves problematic if we already copied data into the user buffer before we return an error. Because we leave the copied data in the user buffer and fail to unwind the scatterlist so kernel side believes data has been copied and user side believes data has _not_ been received. Expected behavior should be to return number of bytes copied and then on the next read we need to return the error assuming its still there. This can happen if we have a copy length spanning multiple scatterlist elements and one or more complete before the error is hit. The error is rare enough though that my normal testing with server side programs, such as nginx, httpd, envoy, etc., I have never seen this. The only reliable way to reproduce that I've found is to stream movies over my browser for a day or so and wait for it to hang. Not very scientific, but with a few extra WARN_ON()s in the code the bug was obvious. When we review the errors from copy_page_to_iter() it seems we are hitting a page fault from copy_page_to_iter_iovec() where the code checks fault_in_pages_writeable(buf, copy) where buf is the user buffer. It also seems typical server applications don't hit this case. The other way to try and reproduce this is run the sockmap selftest tool test_sockmap with data verification enabled, but it doesn't reproduce the fault. Perhaps we can trigger this case artificially somehow from the test tools. I haven't sorted out a way to do that yet though. Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: John Fastabend <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Jakub Sitnicki <[email protected]> Link: https://lore.kernel.org/bpf/160556566659.73229.15694973114605301063.stgit@john-XPS-13-9370
2020-11-17net/tls: Fix wrong record sn in async mode of device resyncTariq Toukan2-11/+42
In async_resync mode, we log the TCP seq of records until the async request is completed. Later, in case one of the logged seqs matches the resync request, we return it, together with its record serial number. Before this fix, we mistakenly returned the serial number of the current record instead. Fixes: ed9b7646b06a ("net/tls: Add asynchronous resync") Signed-off-by: Tariq Toukan <[email protected]> Reviewed-by: Boris Pismenny <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-11-17io_uring: don't double complete failed reissue requestJens Axboe1-1/+0
Zorro reports that an xfstest test case is failing, and it turns out that for the reissue path we can potentially issue a double completion on the request for the failure path. There's an issue around the retry as well, but for now, at least just make sure that we handle the error path correctly. Cc: [email protected] Fixes: b63534c41e20 ("io_uring: re-issue block requests that failed because of resources") Reported-by: Zorro Lang <[email protected]> Signed-off-by: Jens Axboe <[email protected]>