aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2021-03-02block/bfq: update comments and default value in docs for fifo_expireJoseph Qi2-3/+3
Correct the comments since bfq_fifo_expire[0] is for async request, while bfq_fifo_expire[1] is for sync request. Also update docs, according the source code, the default fifo_expire_async is 250ms, and fifo_expire_sync is 125ms. Signed-off-by: Joseph Qi <[email protected]> Acked-by: Paolo Valente <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-03-02rsxx: remove unused including <linux/version.h>Tian Tao1-1/+0
Remove including <linux/version.h> that don't need it. Signed-off-by: Tian Tao <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-03-02ALSA: hda/hdmi: let new platforms assign the pcm slot dynamicallyHui Wang1-1/+17
If the platform set the dyn_pcm_assign to true, it will call hdmi_find_pcm_slot() to find a pcm slot when hdmi/dp monitor is connected and need to create a pcm. So far only intel_hsw_common_init() and patch_nvhdmi() set the dyn_pcm_assign to true, here we let tgl platforms assign the pcm slot dynamically first, if the driver runs for a period of time and there is no regression reported, we could set no_fixed_assgin to true in the intel_hsw_common_init(), and then set it to true in the patch_nvhdmi(). This change comes from the discussion between Takashi and Kai Vehmanen. Please refer to: https://github.com/alsa-project/alsa-lib/pull/118 Suggested-and-reviewed-by: Takashi Iwai <[email protected]> Suggested-and-reviewed-by: Kai Vehmanen <[email protected]> Signed-off-by: Hui Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2021-03-02ALSA: hda/realtek: Add quirk for Clevo NH55RZQEckhart Mohr1-0/+1
This applies a SND_PCI_QUIRK(...) to the Clevo NH55RZQ barebone. This fixes the issue of the device not recognizing a pluged in microphone. The device has both, a microphone only jack, and a speaker + microphone combo jack. The combo jack already works. The microphone-only jack does not recognize when a device is pluged in without this patch. Signed-off-by: Eckhart Mohr <[email protected]> Co-developed-by: Werner Sembach <[email protected]> Signed-off-by: Werner Sembach <[email protected]> Cc: <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2021-03-02Merge tag 'tags/sound-sdw-kconfig-fixes' into for-linusTakashi Iwai5934-156147/+234516
ALSA/ASoC/SOF/SoundWire: fix Kconfig issues In January, Intel kbuild bot and Arnd Bergmann reported multiple issues with randconfig. This patchset builds on Arnd's suggestions to a) expose ACPI and PCI devices in separate modules, while sof-acpi-dev and sof-pci-dev become helpers. This will result in minor changes required for developers/testers, i.e. modprobe snd-sof-pci will no longer result in a probe. The SOF CI was already updated to deal with this module dependency change and introduction of new modules. b) Fix SOF/SoundWire/DSP_config dependencies by moving the code required to detect SoundWire presence in ACPI tables to sound/hda. Link: https://lore.kernel.org/r/[email protected]
2021-03-02btrfs: subpage: fix the false data csum mismatch errorQu Wenruo1-5/+16
[BUG] When running fstresss, we can hit strange data csum mismatch where the on-disk data is in fact correct (passes scrub). With some extra debug info added, we have the following traces: 0482us: btrfs_do_readpage: root=5 ino=284 offset=393216, submit force=0 pgoff=0 iosize=8192 0494us: btrfs_do_readpage: root=5 ino=284 offset=401408, submit force=0 pgoff=8192 iosize=4096 0498us: btrfs_submit_data_bio: root=5 ino=284 bio first bvec=393216 len=8192 0591us: btrfs_do_readpage: root=5 ino=284 offset=405504, submit force=0 pgoff=12288 iosize=36864 0594us: btrfs_submit_data_bio: root=5 ino=284 bio first bvec=401408 len=4096 0863us: btrfs_submit_data_bio: root=5 ino=284 bio first bvec=405504 len=36864 0933us: btrfs_verify_data_csum: root=5 ino=284 offset=393216 len=8192 0967us: btrfs_do_readpage: root=5 ino=284 offset=442368, skip beyond isize pgoff=49152 iosize=16384 1047us: btrfs_verify_data_csum: root=5 ino=284 offset=401408 len=4096 1163us: btrfs_verify_data_csum: root=5 ino=284 offset=405504 len=36864 1290us: check_data_csum: !!! root=5 ino=284 offset=438272 pg_off=45056 !!! 7387us: end_bio_extent_readpage: root=5 ino=284 before pending_read_bios=0 [CAUSE] Normally we expect all submitted bio reads to only touch the range we specified, and under subpage context, it means we should only touch the range specified in each bvec. But in data read path, inside end_bio_extent_readpage(), we have page zeroing which only takes regular page size into consideration. This means for subpage if we have an inode whose content looks like below: 0 16K 32K 48K 64K |///////| |///////| | |//| = data needs to be read from disk | | = hole And i_size is 64K initially. Then the following race can happen: T1 | T2 --------------------------------+-------------------------------- btrfs_do_readpage() | |- isize = 64K; | | At this time, the isize is | | 64K | | | |- submit_extent_page() | | submit previous assembled bio| | assemble bio for [0, 16K) | | | |- submit_extent_page() | submit read bio for [0, 16K) | assemble read bio for | [32K, 48K) | | | btrfs_setsize() | |- i_size_write(, 16K); | Now i_size is only 16K end_io() for [0K, 16K) | |- end_bio_extent_readpage() | |- btrfs_verify_data_csum() | | No csum error | |- i_size = 16K; | |- zero_user_segment(16K, | PAGE_SIZE); | !!! We zeroed range | !!! [32K, 48K) | | end_io for [32K, 48K) | |- end_bio_extent_readpage() | |- btrfs_verify_data_csum() | ! CSUM MISMATCH ! | ! As the range is zeroed now ! [FIX] To fix the problem, make end_bio_extent_readpage() to only zero the range of bvec. The bug only affects subpage read-write support, as for full read-only mount we can't change i_size thus won't hit the race condition. Signed-off-by: Qu Wenruo <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2021-03-02btrfs: fix warning when creating a directory with smack enabledFilipe Manana1-4/+27
When we have smack enabled, during the creation of a directory smack may attempt to add a "smack transmute" xattr on the inode, which results in the following warning and trace: WARNING: CPU: 3 PID: 2548 at fs/btrfs/transaction.c:537 start_transaction+0x489/0x4f0 Modules linked in: nft_objref nf_conntrack_netbios_ns (...) CPU: 3 PID: 2548 Comm: mkdir Not tainted 5.9.0-rc2smack+ #81 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014 RIP: 0010:start_transaction+0x489/0x4f0 Code: e9 be fc ff ff (...) RSP: 0018:ffffc90001887d10 EFLAGS: 00010202 RAX: ffff88816f1e0000 RBX: 0000000000000201 RCX: 0000000000000003 RDX: 0000000000000201 RSI: 0000000000000002 RDI: ffff888177849000 RBP: ffff888177849000 R08: 0000000000000001 R09: 0000000000000004 R10: ffffffff825e8f7a R11: 0000000000000003 R12: ffffffffffffffe2 R13: 0000000000000000 R14: ffff88803d884270 R15: ffff8881680d8000 FS: 00007f67317b8440(0000) GS:ffff88817bcc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f67247a22a8 CR3: 000000004bfbc002 CR4: 0000000000370ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ? slab_free_freelist_hook+0xea/0x1b0 ? trace_hardirqs_on+0x1c/0xe0 btrfs_setxattr_trans+0x3c/0xf0 __vfs_setxattr+0x63/0x80 smack_d_instantiate+0x2d3/0x360 security_d_instantiate+0x29/0x40 d_instantiate_new+0x38/0x90 btrfs_mkdir+0x1cf/0x1e0 vfs_mkdir+0x14f/0x200 do_mkdirat+0x6d/0x110 do_syscall_64+0x2d/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f673196ae6b Code: 8b 05 11 (...) RSP: 002b:00007ffc3c679b18 EFLAGS: 00000246 ORIG_RAX: 0000000000000053 RAX: ffffffffffffffda RBX: 00000000000001ff RCX: 00007f673196ae6b RDX: 0000000000000000 RSI: 00000000000001ff RDI: 00007ffc3c67a30d RBP: 00007ffc3c67a30d R08: 00000000000001ff R09: 0000000000000000 R10: 000055d3e39fe930 R11: 0000000000000246 R12: 0000000000000000 R13: 00007ffc3c679cd8 R14: 00007ffc3c67a30d R15: 00007ffc3c679ce0 irq event stamp: 11029 hardirqs last enabled at (11037): [<ffffffff81153fe6>] console_unlock+0x486/0x670 hardirqs last disabled at (11044): [<ffffffff81153c01>] console_unlock+0xa1/0x670 softirqs last enabled at (8864): [<ffffffff81e0102f>] asm_call_on_stack+0xf/0x20 softirqs last disabled at (8851): [<ffffffff81e0102f>] asm_call_on_stack+0xf/0x20 This happens because at btrfs_mkdir() we call d_instantiate_new() while holding a transaction handle, which results in the following call chain: btrfs_mkdir() trans = btrfs_start_transaction(root, 5); d_instantiate_new() smack_d_instantiate() __vfs_setxattr() btrfs_setxattr_trans() btrfs_start_transaction() start_transaction() WARN_ON() --> a tansaction start has TRANS_EXTWRITERS set in its type h->orig_rsv = h->block_rsv h->block_rsv = NULL btrfs_end_transaction(trans) Besides the warning triggered at start_transaction, we set the handle's block_rsv to NULL which may cause some surprises later on. So fix this by making btrfs_setxattr_trans() not start a transaction when we already have a handle on one, stored in current->journal_info, and use that handle. We are good to use the handle because at btrfs_mkdir() we did reserve space for the xattr and the inode item. Reported-by: Casey Schaufler <[email protected]> CC: [email protected] # 5.4+ Acked-by: Casey Schaufler <[email protected]> Tested-by: Casey Schaufler <[email protected]> Link: https://lore.kernel.org/linux-btrfs/[email protected]/ Signed-off-by: Filipe Manana <[email protected]> Signed-off-by: David Sterba <[email protected]>
2021-03-02btrfs: don't flush from btrfs_delayed_inode_reserve_metadataNikolay Borisov2-2/+3
Calling btrfs_qgroup_reserve_meta_prealloc from btrfs_delayed_inode_reserve_metadata can result in flushing delalloc while holding a transaction and delayed node locks. This is deadlock prone. In the past multiple commits: * ae5e070eaca9 ("btrfs: qgroup: don't try to wait flushing if we're already holding a transaction") * 6f23277a49e6 ("btrfs: qgroup: don't commit transaction when we already hold the handle") Tried to solve various aspects of this but this was always a whack-a-mole game. Unfortunately those 2 fixes don't solve a deadlock scenario involving btrfs_delayed_node::mutex. Namely, one thread can call btrfs_dirty_inode as a result of reading a file and modifying its atime: PID: 6963 TASK: ffff8c7f3f94c000 CPU: 2 COMMAND: "test" #0 __schedule at ffffffffa529e07d #1 schedule at ffffffffa529e4ff #2 schedule_timeout at ffffffffa52a1bdd #3 wait_for_completion at ffffffffa529eeea <-- sleeps with delayed node mutex held #4 start_delalloc_inodes at ffffffffc0380db5 #5 btrfs_start_delalloc_snapshot at ffffffffc0393836 #6 try_flush_qgroup at ffffffffc03f04b2 #7 __btrfs_qgroup_reserve_meta at ffffffffc03f5bb6 <-- tries to reserve space and starts delalloc inodes. #8 btrfs_delayed_update_inode at ffffffffc03e31aa <-- acquires delayed node mutex #9 btrfs_update_inode at ffffffffc0385ba8 #10 btrfs_dirty_inode at ffffffffc038627b <-- TRANSACTIION OPENED #11 touch_atime at ffffffffa4cf0000 #12 generic_file_read_iter at ffffffffa4c1f123 #13 new_sync_read at ffffffffa4ccdc8a #14 vfs_read at ffffffffa4cd0849 #15 ksys_read at ffffffffa4cd0bd1 #16 do_syscall_64 at ffffffffa4a052eb #17 entry_SYSCALL_64_after_hwframe at ffffffffa540008c This will cause an asynchronous work to flush the delalloc inodes to happen which can try to acquire the same delayed_node mutex: PID: 455 TASK: ffff8c8085fa4000 CPU: 5 COMMAND: "kworker/u16:30" #0 __schedule at ffffffffa529e07d #1 schedule at ffffffffa529e4ff #2 schedule_preempt_disabled at ffffffffa529e80a #3 __mutex_lock at ffffffffa529fdcb <-- goes to sleep, never wakes up. #4 btrfs_delayed_update_inode at ffffffffc03e3143 <-- tries to acquire the mutex #5 btrfs_update_inode at ffffffffc0385ba8 <-- this is the same inode that pid 6963 is holding #6 cow_file_range_inline.constprop.78 at ffffffffc0386be7 #7 cow_file_range at ffffffffc03879c1 #8 btrfs_run_delalloc_range at ffffffffc038894c #9 writepage_delalloc at ffffffffc03a3c8f #10 __extent_writepage at ffffffffc03a4c01 #11 extent_write_cache_pages at ffffffffc03a500b #12 extent_writepages at ffffffffc03a6de2 #13 do_writepages at ffffffffa4c277eb #14 __filemap_fdatawrite_range at ffffffffa4c1e5bb #15 btrfs_run_delalloc_work at ffffffffc0380987 <-- starts running delayed nodes #16 normal_work_helper at ffffffffc03b706c #17 process_one_work at ffffffffa4aba4e4 #18 worker_thread at ffffffffa4aba6fd #19 kthread at ffffffffa4ac0a3d #20 ret_from_fork at ffffffffa54001ff To fully address those cases the complete fix is to never issue any flushing while holding the transaction or the delayed node lock. This patch achieves it by calling qgroup_reserve_meta directly which will either succeed without flushing or will fail and return -EDQUOT. In the latter case that return value is going to be propagated to btrfs_dirty_inode which will fallback to start a new transaction. That's fine as the majority of time we expect the inode will have BTRFS_DELAYED_NODE_INODE_DIRTY flag set which will result in directly copying the in-memory state. Fixes: c53e9653605d ("btrfs: qgroup: try to flush qgroup space when we get -EDQUOT") CC: [email protected] # 5.10+ Reviewed-by: Qu Wenruo <[email protected]> Signed-off-by: Nikolay Borisov <[email protected]> Signed-off-by: David Sterba <[email protected]>
2021-03-02btrfs: export and rename qgroup_reserve_metaNikolay Borisov2-4/+6
Signed-off-by: Nikolay Borisov <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2021-03-02btrfs: free correct amount of space in btrfs_delayed_inode_reserve_metadataNikolay Borisov1-1/+1
Following commit f218ea6c4792 ("btrfs: delayed-inode: Remove wrong qgroup meta reservation calls") this function now reserves num_bytes, rather than the fixed amount of nodesize. As such this requires the same amount to be freed in case of failure. Fix this by adjusting the amount we are freeing. Fixes: f218ea6c4792 ("btrfs: delayed-inode: Remove wrong qgroup meta reservation calls") CC: [email protected] # 4.19+ Reviewed-by: Qu Wenruo <[email protected]> Signed-off-by: Nikolay Borisov <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2021-03-02btrfs: fix spurious free_space_tree remount warningBoris Burkov1-2/+2
The intended logic of the check is to catch cases where the desired free_space_tree setting doesn't match the mounted setting, and the remount is anything but ro->rw. However, it makes the mistake of checking equality on a masked integer (btrfs_test_opt) against a boolean (btrfs_fs_compat_ro). If you run the reproducer: $ mount -o space_cache=v2 dev mnt $ mount -o remount,ro mnt you would expect no warning, because the remount is not attempting to change the free space tree setting, but we do see the warning. To fix this, add explicit bool type casts to the condition. I tested a variety of transitions: sudo mount -o space_cache=v2 /dev/vg0/lv0 mnt/lol (fst enabled) mount -o remount,ro mnt/lol (no warning, no fst change) sudo mount -o remount,rw,space_cache=v1,clear_cache (no warning, ro->rw) sudo mount -o remount,rw,space_cache=v2 mnt (warning, rw->rw with change) sudo mount -o remount,ro mnt (no warning, no fst change) sudo mount -o remount,rw,space_cache=v2 mnt (no warning, no fst change) Reported-by: Chris Murphy <[email protected]> CC: [email protected] # 5.11 Signed-off-by: Boris Burkov <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2021-03-02btrfs: validate qgroup inherit for SNAP_CREATE_V2 ioctlDan Carpenter1-1/+18
The problem is we're copying "inherit" from user space but we don't necessarily know that we're copying enough data for a 64 byte struct. Then the next problem is that 'inherit' has a variable size array at the end, and we have to verify that array is the size we expected. Fixes: 6f72c7e20dba ("Btrfs: add qgroup inheritance") CC: [email protected] # 4.4+ Signed-off-by: Dan Carpenter <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2021-03-02btrfs: unlock extents in btrfs_zero_range in case of quota reservation errorsNikolay Borisov1-1/+4
If btrfs_qgroup_reserve_data returns an error (i.e quota limit reached) the handling logic directly goes to the 'out' label without first unlocking the extent range between lockstart, lockend. This results in deadlocks as other processes try to lock the same extent. Fixes: a7f8b1c2ac21 ("btrfs: file: reserve qgroup space after the hole punch range is locked") CC: [email protected] # 5.10+ Reviewed-by: Qu Wenruo <[email protected]> Signed-off-by: Nikolay Borisov <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2021-03-02btrfs: ref-verify: use 'inline void' keyword orderingRandy Dunlap1-2/+2
Fix build warnings of function signature when CONFIG_STACKTRACE is not enabled by reordering the 'inline' and 'void' keywords. ../fs/btrfs/ref-verify.c:221:1: warning: ‘inline’ is not at beginning of declaration [-Wold-style-declaration] static void inline __save_stack_trace(struct ref_action *ra) ../fs/btrfs/ref-verify.c:225:1: warning: ‘inline’ is not at beginning of declaration [-Wold-style-declaration] static void inline __print_stack_trace(struct btrfs_fs_info *fs_info, Signed-off-by: Randy Dunlap <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2021-03-02netfilter: nftables: disallow updates on table ownershipPablo Neira Ayuso1-0/+6
Disallow updating the ownership bit on an existing table: Do not allow to grab ownership on an existing table. Do not allow to drop ownership on an existing table. Fixes: 6001a930ce03 ("netfilter: nftables: introduce table ownership") Signed-off-by: Pablo Neira Ayuso <[email protected]>
2021-03-02ALSA: hda: intel-sdw-acpi: add missing include filesPierre-Louis Bossart1-0/+5
We rely on implicit includes, list out explicitly what this code relies on. Suggested-by: Guennadi Liakhovetski <[email protected]> Signed-off-by: Pierre-Louis Bossart <[email protected]> Reviewed-by: Kai Vehmanen <[email protected]> Reviewed-by: Guennadi Liakhovetski <[email protected]> Reviewed-by: Bard Liao <[email protected]> Acked-by: Mark Brown <[email protected]> Acked-by: Vinod Koul <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2021-03-02ALSA: hda: move Intel SoundWire ACPI scan to dedicated modulePierre-Louis Bossart9-161/+186
The ACPI scan capabilities is called from the intel-dspconfig as well as the SOF/HDaudio drivers. This creates dependencies and randconfig issues when HDaudio and SOF/SoundWire are not all configured as modules. To simplify Kconfig dependencies between HDAudio, SoundWire, SOF and intel-dspconfig, move the ACPI scan helpers to a dedicated module. This follows the same idea as NHLT helpers which are already handled as a dedicated module. The only functional change is that the kernel parameter to filter links is now handled by a different module, but that was only provided for developers needing work-arounds for early BIOS releases. Reported-by: Arnd Bergmann <[email protected]> Signed-off-by: Pierre-Louis Bossart <[email protected]> Reviewed-by: Kai Vehmanen <[email protected]> Reviewed-by: Guennadi Liakhovetski <[email protected]> Reviewed-by: Bard Liao <[email protected]> Acked-by: Mark Brown <[email protected]> Acked-by: Vinod Koul <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2021-03-02ASoC: SOF: Intel: SoundWire: simplify KconfigPierre-Louis Bossart1-17/+9
The Kconfig file is way too convoluted. Track platforms where SoundWire is supported, and add simpler conditions to make sure there is no module/built-in issue. The use of 'depends on' is less intuitive if a required 'depend' is missing, but that's a small price to pay for clarity and simplicity. Suggested-by: Arnd Bergmann <[email protected]> Signed-off-by: Pierre-Louis Bossart <[email protected]> Reviewed-by: Kai Vehmanen <[email protected]> Reviewed-by: Guennadi Liakhovetski <[email protected]> Reviewed-by: Bard Liao <[email protected]> Acked-by: Mark Brown <[email protected]> Acked-by: Vinod Koul <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2021-03-02ASoC: SOF: pci: move DSP_CONFIG use to platform-specific driversPierre-Louis Bossart7-12/+24
There is no reason why we should call the intel_dspcfg helpers from common code, this should be moved in Intel-specific code and only called from platforms where a conflict may occur with the HDaudio or SST/Skylake driver. Suggested-by: Arnd Bergmann <[email protected]> Signed-off-by: Pierre-Louis Bossart <[email protected]> Reviewed-by: Kai Vehmanen <[email protected]> Reviewed-by: Guennadi Liakhovetski <[email protected]> Reviewed-by: Bard Liao <[email protected]> Acked-by: Mark Brown <[email protected]> Acked-by: Vinod Koul <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2021-03-02ASoC: SOF: pci: split PCI into different driversPierre-Louis Bossart11-447/+562
Move PCI IDs and device-specific definitions out of common code. No functionality change for now, just code split and removal of IF_ENABLED() which made the configurations too complicated in case of reuse of IP across generations. Additional changes to address the DSP_CONFIG case and SoundWire depends/select confusions are provided in follow-up patches. Suggested-by: Arnd Bergmann <[email protected]> Signed-off-by: Pierre-Louis Bossart <[email protected]> Reviewed-by: Kai Vehmanen <[email protected]> Reviewed-by: Guennadi Liakhovetski <[email protected]> Reviewed-by: Bard Liao <[email protected]> Acked-by: Mark Brown <[email protected]> Acked-by: Vinod Koul <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2021-03-02ASoC: SOF: ACPI: avoid reverse module dependencyArnd Bergmann9-176/+215
The SOF-ACPI driver is backwards from the normal Linux model, it has a generic driver that knows about all the specific drivers, as opposed to having hardware specific drivers that link against a common framework. This requires ugly Kconfig magic and leads to missed dependencies as seen in this link error: arm-linux-gnueabi-ld: sound/soc/sof/sof-pci-dev.o: in function `sof_acpi_probe': sof-pci-dev.c:(.text+0x1c): undefined reference to `snd_intel_dsp_driver_probe' Change it to use the normal probe order of starting with a specific device in a driver, turning the sof-acpi-dev.c driver into a library (exported symbols are name-spaced to avoid symbol pollution). For backwards-compatibility with previous Kconfigs, the default values for platform drivers uses the top-level ACPI configurations. The modules were also renamed to allow for gradual transitions in test scripts. Co-developed-by: Pierre-Louis Bossart <[email protected]> Signed-off-by: Pierre-Louis Bossart <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]> Reviewed-by: Kai Vehmanen <[email protected]> Reviewed-by: Guennadi Liakhovetski <[email protected]> Reviewed-by: Bard Liao <[email protected]> Acked-by: Mark Brown <[email protected]> Acked-by: Vinod Koul <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2021-03-02ASoC: soc-acpi: allow for partial match in parent namePierre-Louis Bossart1-1/+1
To change the module dependencies and simplify Kconfigs, we need to introduce new driver names (sof-audio-acpi-intel-byt and sof-audio-acpi-intel-bdw), and move from an exact string match to a partial one. Signed-off-by: Pierre-Louis Bossart <[email protected]> Reviewed-by: Kai Vehmanen <[email protected]> Reviewed-by: Guennadi Liakhovetski <[email protected]> Reviewed-by: Bard Liao <[email protected]> Acked-by: Mark Brown <[email protected]> Acked-by: Vinod Koul <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2021-03-02drm/nouveau/fifo/gk104-gp1xx: fix creation of sw classBen Skeggs1-0/+3
Fixes: 496162037cd24191 ("drm/nouveau/fifo: add id_engine hook") Signed-off-by: Ben Skeggs <[email protected]>
2021-03-02powerpc/sstep: Fix VSX instruction emulationJordan Niethe1-2/+2
Commit af99da74333b ("powerpc/sstep: Support VSX vector paired storage access instructions") added loading and storing 32 word long data into adjacent VSRs. However the calculation used to determine if two VSRs needed to be loaded/stored inadvertently prevented the load/storing taking place for instructions with a data length less than 16 words. This causes the emulation to not function correctly, which can be seen by the alignment_handler selftest: $ ./alignment_handler [snip] test: test_alignment_handler_vsx_207 tags: git_version:powerpc-5.12-1-0-g82d2c16b350f VSX: 2.07B Doing lxsspx: PASSED Doing lxsiwax: FAILED: Wrong Data Doing lxsiwzx: PASSED Doing stxsspx: PASSED Doing stxsiwx: PASSED failure: test_alignment_handler_vsx_207 test: test_alignment_handler_vsx_300 tags: git_version:powerpc-5.12-1-0-g82d2c16b350f VSX: 3.00B Doing lxsd: PASSED Doing lxsibzx: PASSED Doing lxsihzx: PASSED Doing lxssp: FAILED: Wrong Data Doing lxv: PASSED Doing lxvb16x: PASSED Doing lxvh8x: PASSED Doing lxvx: PASSED Doing lxvwsx: FAILED: Wrong Data Doing lxvl: PASSED Doing lxvll: PASSED Doing stxsd: PASSED Doing stxsibx: PASSED Doing stxsihx: PASSED Doing stxssp: PASSED Doing stxv: PASSED Doing stxvb16x: PASSED Doing stxvh8x: PASSED Doing stxvx: PASSED Doing stxvl: PASSED Doing stxvll: PASSED failure: test_alignment_handler_vsx_300 [snip] Fix this by making sure all VSX instruction emulation correctly load/store from the VSRs. Fixes: af99da74333b ("powerpc/sstep: Support VSX vector paired storage access instructions") Signed-off-by: Jordan Niethe <[email protected]> Reviewed-by: Ravi Bangoria <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-03-02powerpc/perf: Fix handling of privilege level checks in perf interrupt contextAthira Rajeev1-2/+2
Running "perf mem record" in powerpc platforms with selinux enabled resulted in soft lockup's. Below call-trace was seen in the logs: CPU: 58 PID: 3751 Comm: sssd_nss Not tainted 5.11.0-rc7+ #2 NIP: c000000000dff3d4 LR: c000000000dff3d0 CTR: 0000000000000000 REGS: c000007fffab7d60 TRAP: 0100 Not tainted (5.11.0-rc7+) ... NIP _raw_spin_lock_irqsave+0x94/0x120 LR _raw_spin_lock_irqsave+0x90/0x120 Call Trace: 0xc00000000fd47260 (unreliable) skb_queue_tail+0x3c/0x90 audit_log_end+0x6c/0x180 common_lsm_audit+0xb0/0xe0 slow_avc_audit+0xa4/0x110 avc_has_perm+0x1c4/0x260 selinux_perf_event_open+0x74/0xd0 security_perf_event_open+0x68/0xc0 record_and_restart+0x6e8/0x7f0 perf_event_interrupt+0x22c/0x560 performance_monitor_exception0x4c/0x60 performance_monitor_common_virt+0x1c8/0x1d0 interrupt: f00 at _raw_spin_lock_irqsave+0x38/0x120 NIP: c000000000dff378 LR: c000000000b5fbbc CTR: c0000000007d47f0 REGS: c00000000fd47860 TRAP: 0f00 Not tainted (5.11.0-rc7+) ... NIP _raw_spin_lock_irqsave+0x38/0x120 LR skb_queue_tail+0x3c/0x90 interrupt: f00 0x38 (unreliable) 0xc00000000aae6200 audit_log_end+0x6c/0x180 audit_log_exit+0x344/0xf80 __audit_syscall_exit+0x2c0/0x320 do_syscall_trace_leave+0x148/0x200 syscall_exit_prepare+0x324/0x390 system_call_common+0xfc/0x27c The above trace shows that while the CPU was handling a performance monitor exception, there was a call to security_perf_event_open() function. In powerpc core-book3s, this function is called from perf_allow_kernel() check during recording of data address in the sample via perf_get_data_addr(). Commit da97e18458fb ("perf_event: Add support for LSM and SELinux checks") introduced security enhancements to perf. As part of this commit, the new security hook for perf_event_open() was added in all places where perf paranoid check was previously used. In powerpc core-book3s code, originally had paranoid checks in perf_get_data_addr() and power_pmu_bhrb_read(). So perf_paranoid_kernel() checks were replaced with perf_allow_kernel() in these PMU helper functions as well. The intention of paranoid checks in core-book3s was to verify privilege access before capturing some of the sample data. Along with paranoid checks, perf_allow_kernel() also does a security_perf_event_open(). Since these functions are accessed while recording a sample, we end up calling selinux_perf_event_open() in PMI context. Some of the security functions use spinlock like sidtab_sid2str_put(). If a perf interrupt hits under a spin lock and if we end up in calling selinux hook functions in PMI handler, this could cause a dead lock. Since the purpose of this security hook is to control access to perf_event_open(), it is not right to call this in interrupt context. The paranoid checks in powerpc core-book3s were done at interrupt time which is also not correct. Reference commits: Commit cd1231d7035f ("powerpc/perf: Prevent kernel address leak via perf_get_data_addr()") Commit bb19af816025 ("powerpc/perf: Prevent kernel address leak to userspace via BHRB buffer") We only allow creation of events that have already passed the privilege checks in perf_event_open(). So these paranoid checks are not needed at event time. As a fix, patch uses 'event->attr.exclude_kernel' check to prevent exposing kernel address for userspace only sampling. Fixes: cd1231d7035f ("powerpc/perf: Prevent kernel address leak via perf_get_data_addr()") Cc: [email protected] # v4.17+ Suggested-by: Michael Ellerman <[email protected]> Signed-off-by: Athira Rajeev <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-03-02powerpc: Force inlining of mmu_has_feature to fix build failureChristophe Leroy1-2/+2
The test robot has managed to generate a random config leading to following build failure: LD .tmp_vmlinux.kallsyms1 powerpc64-linux-ld: arch/powerpc/mm/pgtable.o: in function `ptep_set_access_flags': pgtable.c:(.text.ptep_set_access_flags+0xf0): undefined reference to `hash__flush_tlb_page' powerpc64-linux-ld: arch/powerpc/mm/book3s32/mmu.o: in function `MMU_init_hw_patch': mmu.c:(.init.text+0x452): undefined reference to `patch__hash_page_A0' powerpc64-linux-ld: mmu.c:(.init.text+0x45e): undefined reference to `patch__hash_page_A0' powerpc64-linux-ld: mmu.c:(.init.text+0x46a): undefined reference to `patch__hash_page_A1' powerpc64-linux-ld: mmu.c:(.init.text+0x476): undefined reference to `patch__hash_page_A1' powerpc64-linux-ld: mmu.c:(.init.text+0x482): undefined reference to `patch__hash_page_A2' powerpc64-linux-ld: mmu.c:(.init.text+0x48e): undefined reference to `patch__hash_page_A2' powerpc64-linux-ld: mmu.c:(.init.text+0x49e): undefined reference to `patch__hash_page_B' powerpc64-linux-ld: mmu.c:(.init.text+0x4aa): undefined reference to `patch__hash_page_B' powerpc64-linux-ld: mmu.c:(.init.text+0x4b6): undefined reference to `patch__hash_page_C' powerpc64-linux-ld: mmu.c:(.init.text+0x4c2): undefined reference to `patch__hash_page_C' powerpc64-linux-ld: mmu.c:(.init.text+0x4ce): undefined reference to `patch__flush_hash_A0' powerpc64-linux-ld: mmu.c:(.init.text+0x4da): undefined reference to `patch__flush_hash_A0' powerpc64-linux-ld: mmu.c:(.init.text+0x4e6): undefined reference to `patch__flush_hash_A1' powerpc64-linux-ld: mmu.c:(.init.text+0x4f2): undefined reference to `patch__flush_hash_A1' powerpc64-linux-ld: mmu.c:(.init.text+0x4fe): undefined reference to `patch__flush_hash_A2' powerpc64-linux-ld: mmu.c:(.init.text+0x50a): undefined reference to `patch__flush_hash_A2' powerpc64-linux-ld: mmu.c:(.init.text+0x522): undefined reference to `patch__flush_hash_B' powerpc64-linux-ld: mmu.c:(.init.text+0x532): undefined reference to `patch__flush_hash_B' powerpc64-linux-ld: arch/powerpc/mm/book3s32/mmu.o: in function `update_mmu_cache': mmu.c:(.text.update_mmu_cache+0xa0): undefined reference to `add_hash_page' powerpc64-linux-ld: mm/memory.o: in function `zap_pte_range': memory.c:(.text.zap_pte_range+0x160): undefined reference to `flush_hash_pages' powerpc64-linux-ld: mm/memory.o: in function `handle_pte_fault': memory.c:(.text.handle_pte_fault+0x180): undefined reference to `hash__flush_tlb_page' This is due to mmu_has_feature() not being inlined. See extract of build of mmu.c with -Winline: In file included from ./include/linux/mm_types.h:19, from ./include/linux/mmzone.h:21, from ./include/linux/gfp.h:6, from ./include/linux/mm.h:10, from arch/powerpc/mm/book3s32/mmu.c:21: ./arch/powerpc/include/asm/mmu.h: In function 'find_free_bat': ./arch/powerpc/include/asm/mmu.h:231:20: warning: inlining failed in call to 'early_mmu_has_feature': call is unlikely and code size would grow [-Winline] 231 | static inline bool early_mmu_has_feature(unsigned long feature) | ^~~~~~~~~~~~~~~~~~~~~ ./arch/powerpc/include/asm/mmu.h:291:9: note: called from here 291 | return early_mmu_has_feature(feature); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The code relies on constant folding of MMU_FTRS_POSSIBLE at buildtime and elimination of non possible parts of code at compile time. For this to work, mmu_has_feature() and early_mmu_has_feature() must be inlined. Fixes: 259149cf7c3c ("powerpc/32s: Only build hash code when CONFIG_PPC_BOOK3S_604 is selected") Reported-by: kernel test robot <[email protected]> Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/cf61345912c078c96f171afd0fcc48ef27cbdc3f.1614443418.git.christophe.leroy@csgroup.eu
2021-03-02vio: make remove callback return voidUwe Kleine-König13-37/+15
The driver core ignores the return value of struct bus_type::remove() because there is only little that can be done. To simplify the quest to make this function return void, let struct vio_driver::remove() return void, too. All users already unconditionally return 0, this commit makes it obvious that returning an error code is a bad idea. Note there are two nominally different implementations for a vio bus: one in arch/sparc/kernel/vio.c and the other in arch/powerpc/platforms/pseries/vio.c. This patch only adapts the powerpc one. Before this patch for a device that was bound to a driver without a remove callback vio_cmo_bus_remove(viodev) wasn't called. As the device core still considers the device unbound after vio_bus_remove() returns calling this unconditionally is the consistent behaviour which is implemented here. Signed-off-by: Uwe Kleine-König <[email protected]> Reviewed-by: Tyrel Datwyler <[email protected]> Acked-by: Lijun Pan <[email protected]> Acked-by: Greg Kroah-Hartman <[email protected]> [mpe: Drop unneeded hvcs_remove() forward declaration, squash in change from sfr to drop ibmvnic_remove() forward declaration] Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-03-02selftests/bpf: Mask bpf_csum_diff() return value to 16 bits in test_verifierYauheni Kaliuta1-1/+2
The verifier test labelled "valid read map access into a read-only array 2" calls the bpf_csum_diff() helper and checks its return value. However, architecture implementations of csum_partial() (which is what the helper uses) differ in whether they fold the return value to 16 bit or not. For example, x86 version has ... if (unlikely(odd)) { result = from32to16(result); result = ((result >> 8) & 0xff) | ((result & 0xff) << 8); } ... while generic lib/checksum.c does: result = from32to16(result); if (odd) result = ((result >> 8) & 0xff) | ((result & 0xff) << 8); This makes the helper return different values on different architectures, breaking the test on non-x86. To fix this, add an additional instruction to always mask the return value to 16 bits, and update the expected return value accordingly. Fixes: fb2abb73e575 ("bpf, selftest: test {rd, wr}only flags and direct value access") Signed-off-by: Yauheni Kaliuta <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-03-02selftests/bpf: Use the last page in test_snprintf_btf on s390Ilya Leoshkevich1-3/+10
test_snprintf_btf fails on s390, because NULL points to a readable struct lowcore there. Fix by using the last page instead. Error message example: printing fffffffffffff000 should generate error, got (361) Fixes: 076a95f5aff2 ("selftests/bpf: Add bpf_snprintf_btf helper tests") Signed-off-by: Ilya Leoshkevich <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Heiko Carstens <[email protected]> Acked-by: Yonghong Song <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-03-02ALSA: hda: intel-nhlt: verify config typePierre-Louis Bossart2-9/+50
Multiple bug reports report issues with the SOF and SST drivers when dealing with single microphone cases. We currently read the DMIC array information unconditionally but we don't check that the configuration type is actually a mic array. When the DMIC link does not rely on a mic array configuration, the recommendation is to check the format information to infer the maximum number of channels, and map this to the number of microphones. This leaves a potential for a mismatch between actual microphones available in hardware and what the ACPI table contains, but we have no other source of information. Note that single microphone configurations can alternatively be handled with a 'mic array' configuration along with a 'vendor-defined' geometry. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201251 BugLink: https://github.com/thesofproject/linux/issues/2725 Fixes: 7a33ea70e1868 ('ALSA: hda: intel-nhlt: handle NHLT VENDOR_DEFINED DMIC geometry') Signed-off-by: Pierre-Louis Bossart <[email protected]> Reviewed-by: Guennadi Liakhovetski <[email protected]> Reviewed-by: Rander Wang <[email protected]> Reviewed-by: Kai Vehmanen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2021-03-02ALSA: hda: fix kernel-doc warningsPierre-Louis Bossart7-15/+14
v5.12-rc1 flags new warnings with make W=1, fix missing or broken function descriptors. sound/pci/hda/hda_codec.c:3492: warning: expecting prototype for snd_hda_input_mux_info_info(). Prototype was for snd_hda_input_mux_info() instead sound/pci/hda/hda_codec.c:3521: warning: expecting prototype for snd_hda_input_mux_info_put(). Prototype was for snd_hda_input_mux_put() instead sound/pci/hda/hda_codec.c:3958: warning: expecting prototype for _snd_hda_pin_ctl(). Prototype was for _snd_hda_set_pin_ctl() instead sound/pci/hda/hda_jack.c:223: warning: expecting prototype for snd_hda_set_dirty_all(). Prototype was for snd_hda_jack_set_dirty_all() instead sound/pci/hda/hda_jack.c:309: warning: expecting prototype for snd_hda_jack_detect_enable_mst(). Prototype was for snd_hda_jack_detect_enable_callback_mst() instead sound/pci/hda/hda_generic.c:3933: warning: expecting prototype for snd_dha_gen_add_mute_led_cdev(). Prototype was for snd_hda_gen_add_mute_led_cdev() instead sound/pci/hda/hda_generic.c:4093: warning: expecting prototype for snd_dha_gen_add_micmute_led_cdev(). Prototype was for snd_hda_gen_add_micmute_led_cdev() instead sound/pci/hda/patch_ca0132.c:2357: warning: expecting prototype for Prepare and send the SCP message to DSP(). Prototype was for dspio_scp() instead sound/pci/hda/patch_ca0132.c:2883: warning: expecting prototype for Allocate router ports(). Prototype was for dsp_allocate_router_ports() instead sound/pci/hda/patch_ca0132.c:3202: warning: expecting prototype for Write a block of data into DSP code or data RAM using pre(). Prototype was for dspxfr_one_seg() instead sound/pci/hda/patch_ca0132.c:3397: warning: expecting prototype for data overlay to DSP memories(). Prototype was for dspxfr_image() instead sound/hda/hdac_regmap.c:393: warning: expecting prototype for snd_hdac_regmap_init(). Prototype was for snd_hdac_regmap_exit() instead sound/hda/ext/hdac_ext_controller.c:142: warning: expecting prototype for snd_hdac_ext_bus_get_link_index(). Prototype was for snd_hdac_ext_bus_get_link() instead sound/hda/ext/hdac_ext_stream.c:140: warning: expecting prototype for snd_hdac_ext_linkstream_start(). Prototype was for snd_hdac_ext_link_stream_start() instead Signed-off-by: Pierre-Louis Bossart <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2021-03-01gcc-plugins: latent_entropy: remove unneeded semicolonJason Yan1-1/+1
Fix the following coccicheck warning: scripts/gcc-plugins/latent_entropy_plugin.c:539:2-3: Unneeded semicolon Reported-by: Hulk Robot <[email protected]> Signed-off-by: Jason Yan <[email protected]> Signed-off-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-03-01gcc-plugins: structleak: remove unneeded variable 'ret'Jason Yan1-2/+1
Fix the following coccicheck warning: scripts/gcc-plugins/structleak_plugin.c:177:14-17: Unneeded variable: "ret". Return "0" on line 207 Reported-by: Hulk Robot <[email protected]> Signed-off-by: Jason Yan <[email protected]> Signed-off-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-03-01tcp: add sanity tests to TCP_QUEUE_SEQEric Dumazet1-8/+15
Qingyu Li reported a syzkaller bug where the repro changes RCV SEQ _after_ restoring data in the receive queue. mprotect(0x4aa000, 12288, PROT_READ) = 0 mmap(0x1ffff000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x1ffff000 mmap(0x20000000, 16777216, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x20000000 mmap(0x21000000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x21000000 socket(AF_INET6, SOCK_STREAM, IPPROTO_IP) = 3 setsockopt(3, SOL_TCP, TCP_REPAIR, [1], 4) = 0 connect(3, {sa_family=AF_INET6, sin6_port=htons(0), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, 28) = 0 setsockopt(3, SOL_TCP, TCP_REPAIR_QUEUE, [1], 4) = 0 sendmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="0x0000000000000003\0\0", iov_len=20}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 20 setsockopt(3, SOL_TCP, TCP_REPAIR, [0], 4) = 0 setsockopt(3, SOL_TCP, TCP_QUEUE_SEQ, [128], 4) = 0 recvfrom(3, NULL, 20, 0, NULL, NULL) = -1 ECONNRESET (Connection reset by peer) syslog shows: [ 111.205099] TCP recvmsg seq # bug 2: copied 80, seq 0, rcvnxt 80, fl 0 [ 111.207894] WARNING: CPU: 1 PID: 356 at net/ipv4/tcp.c:2343 tcp_recvmsg_locked+0x90e/0x29a0 This should not be allowed. TCP_QUEUE_SEQ should only be used when queues are empty. This patch fixes this case, and the tx path as well. Fixes: ee9952831cfd ("tcp: Initial repair mode") Signed-off-by: Eric Dumazet <[email protected]> Cc: Pavel Emelyanov <[email protected]> Link: https://bugzilla.kernel.org/show_bug.cgi?id=212005 Reported-by: Qingyu Li <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-03-01hv_netvsc: Fix validation in netvsc_linkstatus_callback()Andrea Parri (Microsoft)3-6/+11
Contrary to the RNDIS protocol specification, certain (pre-Fe) implementations of Hyper-V's vSwitch did not account for the status buffer field in the length of an RNDIS packet; the bug was fixed in newer implementations. Validate the status buffer fields using the length of the 'vmtransfer_page' packet (all implementations), that is known/validated to be less than or equal to the receive section size and not smaller than the length of the RNDIS message. Reported-by: Dexuan Cui <[email protected]> Suggested-by: Haiyang Zhang <[email protected]> Signed-off-by: Andrea Parri (Microsoft) <[email protected]> Fixes: 505e3f00c3f36 ("hv_netvsc: Add (more) validation for untrusted Hyper-V values") Signed-off-by: David S. Miller <[email protected]>
2021-03-01net: dsa: tag_mtk: fix 802.1ad VLAN egressDENG Qingfang1-6/+13
A different TPID bit is used for 802.1ad VLAN frames. Reported-by: Ilario Gelmetti <[email protected]> Fixes: f0af34317f4b ("net: dsa: mediatek: combine MediaTek tag with VLAN tag") Signed-off-by: DENG Qingfang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-03-01net: expand textsearch ts_state to fit skb_seq_stateWillem de Bruijn2-1/+3
The referenced commit expands the skb_seq_state used by skb_find_text with a 4B frag_off field, growing it to 48B. This exceeds container ts_state->cb, causing a stack corruption: [ 73.238353] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: skb_find_text+0xc5/0xd0 [ 73.247384] CPU: 1 PID: 376 Comm: nping Not tainted 5.11.0+ #4 [ 73.252613] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 [ 73.260078] Call Trace: [ 73.264677] dump_stack+0x57/0x6a [ 73.267866] panic+0xf6/0x2b7 [ 73.270578] ? skb_find_text+0xc5/0xd0 [ 73.273964] __stack_chk_fail+0x10/0x10 [ 73.277491] skb_find_text+0xc5/0xd0 [ 73.280727] string_mt+0x1f/0x30 [ 73.283639] ipt_do_table+0x214/0x410 The struct is passed between skb_find_text and its callbacks skb_prepare_seq_read, skb_seq_read and skb_abort_seq read through the textsearch interface using TS_SKB_CB. I assumed that this mapped to skb->cb like other .._SKB_CB wrappers. skb->cb is 48B. But it maps to ts_state->cb, which is only 40B. skb->cb was increased from 40B to 48B after ts_state was introduced, in commit 3e3850e989c5 ("[NETFILTER]: Fix xfrm lookup in ip_route_me_harder/ip6_route_me_harder"). Increase ts_state.cb[] to 48 to fit the struct. Also add a BUILD_BUG_ON to avoid a repeat. The alternative is to directly add a dependency from textsearch onto linux/skbuff.h, but I think the intent is textsearch to have no such dependencies on its callers. Link: https://bugzilla.kernel.org/show_bug.cgi?id=211911 Fixes: 97550f6fa592 ("net: compound page support in skb_seq_read") Reported-by: Kris Karas <[email protected]> Signed-off-by: Willem de Bruijn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-03-01docs: networking: bonding.rst Fix a typo in bonding.rstMasanari Iida1-1/+1
This patch fixes a spelling typo in bonding.rst. Signed-off-by: Masanari Iida <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-03-01Merge tag 'linux-can-fixes-for-5.12-20210301' of ↵David S. Miller4-31/+28
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2021-03-01 this is a pull request of 6 patches for net/master. The first 3 patches are by Joakim Zhang for the flexcan driver and fix the probing and starting of the chip. The next patch is by me, for the mcp251xfd driver and reverts the BQL support. BQL support got mainline with rc1 and assumes that CAN frames are always echoed, which is not the case. A proper fix requires changes more changes and will be rolled out via linux-can-next later. Oleksij Rempel's patch fixes the socket ref counting if socket was closed before setting skb ownership. Torin Cooper-Bennun's patch for the tcan4x5x driver fixes a race condition, where the chip is first attached the bus and then the MRAM is initialized, which may result in lost data. ==================== Signed-off-by: David S. Miller <[email protected]>
2021-03-01Merge branch 'enetc-fixes'David S. Miller5-63/+152
Vladimir Oltean says: ==================== Fixes for NXP ENETC driver This contains an assorted set of fixes collected over the past 2 weeks on the enetc driver. Some are related to VLAN processing, some to physical link settings, some are fixups of previous hardware workarounds, and some are simply zero-day data path bugs that for some reason were never caught or at least identified. ==================== Signed-off-by: David S. Miller <[email protected]>
2021-03-01net: enetc: keep RX ring consumer index in sync with hardwareVladimir Oltean1-0/+2
The RX rings have a producer index owned by hardware, where newly received frame buffers are placed, and a consumer index owned by software, where newly allocated buffers are placed, in expectation of hardware being able to place frame data in them. Hardware increments the producer index when a frame is received, however it is not allowed to increment the producer index to match the consumer index (RBCIR) since the ring can hold at most RBLENR[LENGTH]-1 received BDs. Whenever the producer index matches the value of the consumer index, the ring has no unprocessed received frames and all BDs in the ring have been initialized/prepared by software, i.e. hardware owns all BDs in the ring. The code uses the next_to_clean variable to keep track of the producer index, and the next_to_use variable to keep track of the consumer index. The RX rings are seeded from enetc_refill_rx_ring, which is called from two places: 1. initially the ring is seeded until full with enetc_bd_unused(rx_ring), i.e. with 511 buffers. This will make next_to_clean=0 and next_to_use=511: .ndo_open -> enetc_open -> enetc_setup_bdrs -> enetc_setup_rxbdr -> enetc_refill_rx_ring 2. then during the data path processing, it is refilled with 16 buffers at a time: enetc_msix -> napi_schedule -> enetc_poll -> enetc_clean_rx_ring -> enetc_refill_rx_ring There is just one problem: the initial seeding done during .ndo_open updates just the producer index (ENETC_RBPIR) with 0, and the software next_to_clean and next_to_use variables. Notably, it will not update the consumer index to make the hardware aware of the newly added buffers. Wait, what? So how does it work? Well, the reset values of the producer index and of the consumer index of a ring are both zero. As per the description in the second paragraph, it means that the ring is full of buffers waiting for hardware to put frames in them, which by coincidence is almost true, because we have in fact seeded 511 buffers into the ring. But will the hardware attempt to access the 512th entry of the ring, which has an invalid BD in it? Well, no, because in order to do that, it would have to first populate the first 511 entries, and the NAPI enetc_poll will kick in by then. Eventually, after 16 processed slots have become available in the RX ring, enetc_clean_rx_ring will call enetc_refill_rx_ring and then will [ finally ] update the consumer index with the new software next_to_use variable. From now on, the next_to_clean and next_to_use variables are in sync with the producer and consumer ring indices. So the day is saved, right? Well, not quite. Freeing the memory allocated for the rings is done in: enetc_close -> enetc_clear_bdrs -> enetc_clear_rxbdr -> this just disables the ring -> enetc_free_rxtx_rings -> enetc_free_rx_ring -> sets next_to_clean and next_to_use to 0 but again, nothing is committed to the hardware producer and consumer indices (yay!). The assumption is that the ring is disabled, so the indices don't matter anyway, and it's the responsibility of the "open" code path to set those up. .. Except that the "open" code path does not set those up properly. While initially, things almost work, during subsequent enetc_close -> enetc_open sequences, we have problems. To be precise, the enetc_open that is subsequent to enetc_close will again refill the ring with 511 entries, but it will leave the consumer index untouched. Untouched means, of course, equal to the value it had before disabling the ring and draining the old buffers in enetc_close. But as mentioned, enetc_setup_rxbdr will at least update the producer index though, through this line of code: enetc_rxbdr_wr(hw, idx, ENETC_RBPIR, 0); so at this stage we'll have: next_to_clean=0 (in hardware 0) next_to_use=511 (in hardware we'll have the refill index prior to enetc_close) Again, the next_to_clean and producer index are in sync and set to correct values, so the driver manages to limp on. Eventually, 16 ring entries will be consumed by enetc_poll, and the savior enetc_clean_rx_ring will come and call enetc_refill_rx_ring, and then update the hardware consumer ring based upon the new next_to_use. So.. it works? Well, by coincidence, it almost does, but there's a circumstance where enetc_clean_rx_ring won't be there to save us. If the previous value of the consumer index was 15, there's a problem, because the NAPI poll sequence will only issue a refill when 16 or more buffers have been consumed. It's easiest to illustrate this with an example: ip link set eno0 up ip addr add 192.168.100.1/24 dev eno0 ping 192.168.100.1 -c 20 # ping this port from another board ip link set eno0 down ip link set eno0 up ping 192.168.100.1 -c 20 # ping it again from the same other board One by one: 1. ip link set eno0 up -> calls enetc_setup_rxbdr: -> calls enetc_refill_rx_ring(511 buffers) -> next_to_clean=0 (in hw 0) -> next_to_use=511 (in hw 0) 2. ping 192.168.100.1 -c 20 # ping this port from another board enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=1 next_to_clean 0 (in hw 1) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=2 next_to_clean 1 (in hw 2) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=3 next_to_clean 2 (in hw 3) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=4 next_to_clean 3 (in hw 4) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=5 next_to_clean 4 (in hw 5) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=6 next_to_clean 5 (in hw 6) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=7 next_to_clean 6 (in hw 7) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=8 next_to_clean 7 (in hw 8) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=9 next_to_clean 8 (in hw 9) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=10 next_to_clean 9 (in hw 10) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=11 next_to_clean 10 (in hw 11) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=12 next_to_clean 11 (in hw 12) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=13 next_to_clean 12 (in hw 13) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=14 next_to_clean 13 (in hw 14) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=15 next_to_clean 14 (in hw 15) next_to_use 511 (in hw 0) enetc_clean_rx_ring: enetc_refill_rx_ring(16) increments next_to_use by 16 (mod 512) and writes it to hw enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=0 next_to_clean 15 (in hw 16) next_to_use 15 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=1 next_to_clean 16 (in hw 17) next_to_use 15 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=2 next_to_clean 17 (in hw 18) next_to_use 15 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=3 next_to_clean 18 (in hw 19) next_to_use 15 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=4 next_to_clean 19 (in hw 20) next_to_use 15 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=5 next_to_clean 20 (in hw 21) next_to_use 15 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=6 next_to_clean 21 (in hw 22) next_to_use 15 (in hw 15) 20 packets transmitted, 20 packets received, 0% packet loss 3. ip link set eno0 down enetc_free_rx_ring: next_to_clean 0 (in hw 22), next_to_use 0 (in hw 15) 4. ip link set eno0 up -> calls enetc_setup_rxbdr: -> calls enetc_refill_rx_ring(511 buffers) -> next_to_clean=0 (in hw 0) -> next_to_use=511 (in hw 15) 5. ping 192.168.100.1 -c 20 # ping it again from the same other board enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=1 next_to_clean 0 (in hw 1) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=2 next_to_clean 1 (in hw 2) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=3 next_to_clean 2 (in hw 3) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=4 next_to_clean 3 (in hw 4) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=5 next_to_clean 4 (in hw 5) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=6 next_to_clean 5 (in hw 6) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=7 next_to_clean 6 (in hw 7) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=8 next_to_clean 7 (in hw 8) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=9 next_to_clean 8 (in hw 9) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=10 next_to_clean 9 (in hw 10) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=11 next_to_clean 10 (in hw 11) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=12 next_to_clean 11 (in hw 12) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=13 next_to_clean 12 (in hw 13) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=14 next_to_clean 13 (in hw 14) next_to_use 511 (in hw 15) 20 packets transmitted, 12 packets received, 40% packet loss And there it dies. No enetc_refill_rx_ring (because cleaned_cnt must be equal to 15 for that to happen), no nothing. The hardware enters the condition where the producer (14) + 1 is equal to the consumer (15) index, which makes it believe it has no more free buffers to put packets in, so it starts discarding them: ip netns exec ns0 ethtool -S eno0 | grep -v ': 0' NIC statistics: Rx ring 0 discarded frames: 8 Summarized, if the interface receives between 16 and 32 (mod 512) frames and then there is a link flap, then the port will eventually die with no way to recover. If it receives less than 16 (mod 512) frames, then the initial NAPI poll [ before the link flap ] will not update the consumer index in hardware (it will remain zero) which will be ok when the buffers are later reinitialized. If more than 32 (mod 512) frames are received, the initial NAPI poll has the chance to refill the ring twice, updating the consumer index to at least 32. So after the link flap, the consumer index is still wrong, but the post-flap NAPI poll gets a chance to refill the ring once (because it passes through cleaned_cnt=15) and makes the consumer index be again back in sync with next_to_use. The solution to this problem is actually simple, we just need to write next_to_use into the hardware consumer index at enetc_open time, which always brings it back in sync after an initial buffer seeding process. The simpler thing would be to put the write to the consumer index into enetc_refill_rx_ring directly, but there are issues with the MDIO locking: in the NAPI poll code we have the enetc_lock_mdio() taken from top-level and we use the unlocked enetc_wr_reg_hot, whereas in enetc_open, the enetc_lock_mdio() is not taken at the top level, but instead by each individual enetc_wr_reg, so we are forced to put an additional enetc_wr_reg in enetc_setup_rxbdr. Better organization of the code is left as a refactoring exercise. Fixes: d4fd0404c1c9 ("enetc: Introduce basic PF and VF ENETC ethernet drivers") Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-03-01net: enetc: remove bogus write to SIRXIDR from enetc_setup_rxbdrVladimir Oltean1-1/+0
The Station Interface Receive Interrupt Detect Register (SIRXIDR) contains a 16-bit wide mask of 'interrupt detected' events for each ring associated with a port. Bit i is write-1-to-clean for RX ring i. I have no explanation whatsoever how this line of code came to be inserted in the blamed commit. I checked the downstream versions of that patch and none of them have it. The somewhat comical aspect of it is that we're writing a binary number to the SIRXIDR register, which is derived from enetc_bd_unused(rx_ring). Since the RX rings have 512 buffer descriptors, we end up writing 511 to this register, which is 0x1ff, so we are effectively clearing the 'interrupt detected' event for rings 0-8. This register is not what is used for interrupt handling though - it only provides a summary for the entire SI. The hardware provides one separate Interrupt Detect Register per RX ring, which auto-clears upon read. So there doesn't seem to be any adverse effect caused by this bogus write. There is, however, one reason why this should be handled as a bugfix: next_to_clean _should_ be committed to hardware, just not to that register, and this was obscuring the fact that it wasn't. This is fixed in the next patch, and removing the bogus line now allows the fix patch to be backported beyond that point. Fixes: fd5736bf9f23 ("enetc: Workaround for MDIO register access issue") Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-03-01net: enetc: force the RGMII speed and duplex instead of operating in inband modeVladimir Oltean2-10/+56
The ENETC port 0 MAC supports in-band status signaling coming from a PHY when operating in RGMII mode, and this feature is enabled by default. It has been reported that RGMII is broken in fixed-link, and that is not surprising considering the fact that no PHY is attached to the MAC in that case, but a switch. This brings us to the topic of the patch: the enetc driver should have not enabled the optional in-band status signaling for RGMII unconditionally, but should have forced the speed and duplex to what was resolved by phylink. Note that phylink does not accept the RGMII modes as valid for in-band signaling, and these operate a bit differently than 1000base-x and SGMII (notably there is no clause 37 state machine so no ACK required from the MAC, instead the PHY sends extra code words on RXD[3:0] whenever it is not transmitting something else, so it should be safe to leave a PHY with this option unconditionally enabled even if we ignore it). The spec talks about this here: https://e2e.ti.com/cfs-file/__key/communityserver-discussions-components-files/138/RGMIIv1_5F00_3.pdf Fixes: 71b77a7a27a3 ("enetc: Migrate to PHYLINK and PCS_LYNX") Cc: Florian Fainelli <[email protected]> Cc: Andrew Lunn <[email protected]> Cc: Russell King <[email protected]> Signed-off-by: Vladimir Oltean <[email protected]> Acked-by: Russell King <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-03-01net: enetc: don't disable VLAN filtering in IFF_PROMISC modeVladimir Oltean1-5/+0
Quoting from the blamed commit: In promiscuous mode, it is more intuitive that all traffic is received, including VLAN tagged traffic. It appears that it is necessary to set the flag in PSIPVMR for that to be the case, so VLAN promiscuous mode is also temporarily enabled. On exit from promiscuous mode, the setting made by ethtool is restored. Intuitive or not, there isn't any definition issued by a standards body which says that promiscuity has anything to do with VLAN filtering - it only has to do with accepting packets regardless of destination MAC address. In fact people are already trying to use this misunderstanding/bug of the enetc driver as a justification to transform promiscuity into something it never was about: accepting every packet (maybe that would be the "rx-all" netdev feature?): https://lore.kernel.org/netdev/[email protected]/ This is relevant because there are use cases in the kernel (such as tc-flower rules with the protocol 802.1Q and a vlan_id key) which do not (yet) use the vlan_vid_add API to be compatible with VLAN-filtering NICs such as enetc, so for those, disabling rx-vlan-filter is currently the only right solution to make these setups work: https://lore.kernel.org/netdev/CA+h21hoxwRdhq4y+w8Kwgm74d4cA0xLeiHTrmT-VpSaM7obhkg@mail.gmail.com/ The blamed patch has unintentionally introduced one more way for this to work, which is to enable IFF_PROMISC, however this is non-portable because port promiscuity is not meant to disable VLAN filtering. Therefore, it could invite people to write broken scripts for enetc, and then wonder why they are broken when migrating to other drivers that don't handle promiscuity in the same way. Fixes: 7070eea5e95a ("enetc: permit configuration of rx-vlan-filter with ethtool") Cc: Markus Blöchl <[email protected]> Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-03-01net: enetc: fix incorrect TPID when receiving 802.1ad tagged packetsVladimir Oltean2-8/+29
When the enetc ports have rx-vlan-offload enabled, they report a TPID of ETH_P_8021Q regardless of what was actually in the packet. When rx-vlan-offload is disabled, packets have the proper TPID. Fix this inconsistency by finishing the TODO left in the code. Fixes: d4fd0404c1c9 ("enetc: Introduce basic PF and VF ENETC ethernet drivers") Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-03-01net: enetc: take the MDIO lock only once per NAPI poll cycleVladimir Oltean2-22/+11
The workaround for the ENETC MDIO erratum caused a performance degradation of 82 Kpps (seen with IP forwarding of two 1Gbps streams of 64B packets). This is due to excessive locking and unlocking in the fast path, which can be avoided. By taking the MDIO read-side lock only once per NAPI poll cycle, we are able to regain 54 Kpps (65%) of the performance hit. The rest of the performance degradation comes from the TX data path, but unfortunately it doesn't look like we can optimize that away easily, even with netdev_xmit_more(), there just isn't any skb batching done, to help with taking the MDIO lock less often than once per packet. We need to change the register accessor type for enetc_get_tx_tstamp, because it now runs under the enetc_lock_mdio as per the new call path detailed below: enetc_msix -> napi_schedule -> enetc_poll -> enetc_lock_mdio -> enetc_clean_tx_ring -> enetc_get_tx_tstamp -> enetc_clean_rx_ring -> enetc_unlock_mdio Fixes: fd5736bf9f23 ("enetc: Workaround for MDIO register access issue") Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-03-01net: enetc: initialize RFS/RSS memories for unused ports tooVladimir Oltean3-9/+36
Michael reports that since linux-next-20210211, the AER messages for ECC errors have started reappearing, and this time they can be reliably reproduced with the first ping on one of his LS1028A boards. $ ping 1[ 33.258069] pcieport 0000:00:1f.0: AER: Multiple Corrected error received: 0000:00:00.0 72.16.0.1 PING [ 33.267050] pcieport 0000:00:1f.0: AER: can't find device of ID0000 172.16.0.1 (172.16.0.1): 56 data bytes 64 bytes from 172.16.0.1: seq=0 ttl=64 time=17.124 ms 64 bytes from 172.16.0.1: seq=1 ttl=64 time=0.273 ms $ devmem 0x1f8010e10 32 0xC0000006 It isn't clear why this is necessary, but it seems that for the errors to go away, we must clear the entire RFS and RSS memory, not just for the ports in use. Sadly the code is structured in such a way that we can't have unified logic for the used and unused ports. For the minimal initialization of an unused port, we need just to enable and ioremap the PF memory space, and a control buffer descriptor ring. Unused ports must then free the CBDR because the driver will exit, but used ports can not pick up from where that code path left, since the CBDR API does not reinitialize a ring when setting it up, so its producer and consumer indices are out of sync between the software and hardware state. So a separate enetc_init_unused_port function was created, and it gets called right after the PF memory space is enabled. Fixes: 07bf34a50e32 ("net: enetc: initialize the RFS and RSS memories") Reported-by: Michael Walle <[email protected]> Cc: Jesse Brandeburg <[email protected]> Signed-off-by: Vladimir Oltean <[email protected]> Tested-by: Michael Walle <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-03-01net: enetc: don't overwrite the RSS indirection table when initializingVladimir Oltean4-8/+18
After the blamed patch, all RX traffic gets hashed to CPU 0 because the hashing indirection table set up in: enetc_pf_probe -> enetc_alloc_si_resources -> enetc_configure_si -> enetc_setup_default_rss_table is overwritten later in: enetc_pf_probe -> enetc_init_port_rss_memory which zero-initializes the entire port RSS table in order to avoid ECC errors. The trouble really is that enetc_init_port_rss_memory really neads enetc_alloc_si_resources to be called, because it depends upon enetc_alloc_cbdr and enetc_setup_cbdr. But that whole enetc_configure_si thing could have been better thought out, it has nothing to do in a function called "alloc_si_resources", especially since its counterpart, "free_si_resources", does nothing to unwind the configuration of the SI. The point is, we need to pull out enetc_configure_si out of enetc_alloc_resources, and move it after enetc_init_port_rss_memory. This allows us to set up the default RSS indirection table after initializing the memory. Fixes: 07bf34a50e32 ("net: enetc: initialize the RFS and RSS memories") Cc: Jesse Brandeburg <[email protected]> Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-03-01inetpeer: use div64_ul() and clamp_val() calculate inet_peer_thresholdYejune Deng1-14/+7
In inet_initpeers(), struct inet_peer on IA32 uses 128 bytes in nowdays. Get rid of the cascade and use div64_ul() and clamp_val() calculate that will not need to be adjusted in the future as suggested by Eric Dumazet. Suggested-by: Eric Dumazet <[email protected]> Signed-off-by: Yejune Deng <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-03-01net/qrtr: fix __netdev_alloc_skb callPavel Skripkin1-1/+1
syzbot found WARNING in __alloc_pages_nodemask()[1] when order >= MAX_ORDER. It was caused by a huge length value passed from userspace to qrtr_tun_write_iter(), which tries to allocate skb. Since the value comes from the untrusted source there is no need to raise a warning in __alloc_pages_nodemask(). [1] WARNING in __alloc_pages_nodemask+0x5f8/0x730 mm/page_alloc.c:5014 Call Trace: __alloc_pages include/linux/gfp.h:511 [inline] __alloc_pages_node include/linux/gfp.h:524 [inline] alloc_pages_node include/linux/gfp.h:538 [inline] kmalloc_large_node+0x60/0x110 mm/slub.c:3999 __kmalloc_node_track_caller+0x319/0x3f0 mm/slub.c:4496 __kmalloc_reserve net/core/skbuff.c:150 [inline] __alloc_skb+0x4e4/0x5a0 net/core/skbuff.c:210 __netdev_alloc_skb+0x70/0x400 net/core/skbuff.c:446 netdev_alloc_skb include/linux/skbuff.h:2832 [inline] qrtr_endpoint_post+0x84/0x11b0 net/qrtr/qrtr.c:442 qrtr_tun_write_iter+0x11f/0x1a0 net/qrtr/tun.c:98 call_write_iter include/linux/fs.h:1901 [inline] new_sync_write+0x426/0x650 fs/read_write.c:518 vfs_write+0x791/0xa30 fs/read_write.c:605 ksys_write+0x12d/0x250 fs/read_write.c:658 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Reported-by: [email protected] Signed-off-by: Pavel Skripkin <[email protected]> Acked-by: Alexander Lobakin <[email protected]> Signed-off-by: David S. Miller <[email protected]>