aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2023-02-18net/mlx5e: Implement CT entry updateVlad Buslov1-1/+117
With support for UDP NEW offload the flow_table may now send updates for existing flows. Support properly replacing existing entries by updating flow restore_cookie and replacing the rule with new one with the same match but new mod_hdr action that sets updated ctinfo. Signed-off-by: Vlad Buslov <[email protected]> Reviewed-by: Oz Shlomo <[email protected]> Reviewed-by: Paul Blakey <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2023-02-18net/mlx5: Simplify eq list traversalParav Pandit1-4/+4
EQ list is read only while finding the matching EQ. Hence, avoid *_safe() version. Signed-off-by: Parav Pandit <[email protected]> Reviewed-by: Shay Drory <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]> Reviewed-by: Maciej Fijalkowski <[email protected]>
2023-02-18net/mlx5e: Remove redundant page argument in mlx5e_xdp_handle()Tariq Toukan4-11/+9
Remove the page parameter, it can be derived from the xdp_buff member of mlx5e_xdp_buff. Signed-off-by: Tariq Toukan <[email protected]> Reviewed-by: Dragos Tatulea <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2023-02-18net/mlx5e: Remove redundant page argument in mlx5e_xmit_xdp_buff()Tariq Toukan1-2/+3
Remove the page parameter, it can be derived from the xdp_buff. Signed-off-by: Tariq Toukan <[email protected]> Reviewed-by: Dragos Tatulea <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2023-02-18net/mlx5e: Switch to using napi_build_skb()Tariq Toukan1-1/+1
Use napi_build_skb() which uses NAPI percpu caches to obtain skbuff_head instead of inplace allocation. napi_build_skb() calls napi_skb_cache_get(), which returns a cached skb, or allocates a bulk of NAPI_SKB_CACHE_BULK (16) if cache is empty. Performance test: TCP single stream, single ring, single core, default MTU (1500B). Before: 26.5 Gbits/sec After: 30.1 Gbits/sec (+13.6%) Signed-off-by: Tariq Toukan <[email protected]> Reviewed-by: Gal Pressman <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]> Reviewed-by: Maciej Fijalkowski <[email protected]> Reviewed-by: Alexander Lobakin <[email protected]>
2023-02-18x86/Xen: drop leftover VM-assist usesJan Beulich1-4/+0
Both the 4Gb-segments and the PAE-extended-CR3 one are applicable to 32-bit guests only. The PAE-extended-CR3 use, furthermore, was redundant with the PAE_MODE ELF note anyway. Signed-off-by: Jan Beulich <[email protected]> Reviewed-by: Juergen Gross <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Juergen Gross <[email protected]>
2023-02-17Merge tag 'mm-hotfixes-stable-2023-02-17-15-16-2' of ↵Linus Torvalds9-7/+38
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "Six hotfixes. Five are cc:stable: four for MM, one for nilfs2. Also a MAINTAINERS update" * tag 'mm-hotfixes-stable-2023-02-17-15-16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: nilfs2: fix underflow in second superblock position calculations hugetlb: check for undefined shift on 32 bit architectures mm/migrate: fix wrongly apply write bit after mkdirty on sparc64 MAINTAINERS: update FPU EMULATOR web page mm/MADV_COLLAPSE: set EAGAIN on unexpected page refcount mm/filemap: fix page end in filemap_get_read_batch
2023-02-17nilfs2: fix underflow in second superblock position calculationsRyusuke Konishi3-1/+23
Macro NILFS_SB2_OFFSET_BYTES, which computes the position of the second superblock, underflows when the argument device size is less than 4096 bytes. Therefore, when using this macro, it is necessary to check in advance that the device size is not less than a lower limit, or at least that underflow does not occur. The current nilfs2 implementation lacks this check, causing out-of-bound block access when mounting devices smaller than 4096 bytes: I/O error, dev loop0, sector 36028797018963960 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2 NILFS (loop0): unable to read secondary superblock (blocksize = 1024) In addition, when trying to resize the filesystem to a size below 4096 bytes, this underflow occurs in nilfs_resize_fs(), passing a huge number of segments to nilfs_sufile_resize(), corrupting parameters such as the number of segments in superblocks. This causes excessive loop iterations in nilfs_sufile_resize() during a subsequent resize ioctl, causing semaphore ns_segctor_sem to block for a long time and hang the writer thread: INFO: task segctord:5067 blocked for more than 143 seconds. Not tainted 6.2.0-rc8-syzkaller-00015-gf6feea56f66d #0 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:segctord state:D stack:23456 pid:5067 ppid:2 flags:0x00004000 Call Trace: <TASK> context_switch kernel/sched/core.c:5293 [inline] __schedule+0x1409/0x43f0 kernel/sched/core.c:6606 schedule+0xc3/0x190 kernel/sched/core.c:6682 rwsem_down_write_slowpath+0xfcf/0x14a0 kernel/locking/rwsem.c:1190 nilfs_transaction_lock+0x25c/0x4f0 fs/nilfs2/segment.c:357 nilfs_segctor_thread_construct fs/nilfs2/segment.c:2486 [inline] nilfs_segctor_thread+0x52f/0x1140 fs/nilfs2/segment.c:2570 kthread+0x270/0x300 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308 </TASK> ... Call Trace: <TASK> folio_mark_accessed+0x51c/0xf00 mm/swap.c:515 __nilfs_get_page_block fs/nilfs2/page.c:42 [inline] nilfs_grab_buffer+0x3d3/0x540 fs/nilfs2/page.c:61 nilfs_mdt_submit_block+0xd7/0x8f0 fs/nilfs2/mdt.c:121 nilfs_mdt_read_block+0xeb/0x430 fs/nilfs2/mdt.c:176 nilfs_mdt_get_block+0x12d/0xbb0 fs/nilfs2/mdt.c:251 nilfs_sufile_get_segment_usage_block fs/nilfs2/sufile.c:92 [inline] nilfs_sufile_truncate_range fs/nilfs2/sufile.c:679 [inline] nilfs_sufile_resize+0x7a3/0x12b0 fs/nilfs2/sufile.c:777 nilfs_resize_fs+0x20c/0xed0 fs/nilfs2/super.c:422 nilfs_ioctl_resize fs/nilfs2/ioctl.c:1033 [inline] nilfs_ioctl+0x137c/0x2440 fs/nilfs2/ioctl.c:1301 ... This fixes these issues by inserting appropriate minimum device size checks or anti-underflow checks, depending on where the macro is used. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ryusuke Konishi <[email protected]> Reported-by: <[email protected]> Tested-by: Ryusuke Konishi <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-02-17hugetlb: check for undefined shift on 32 bit architecturesMike Kravetz1-1/+4
Users can specify the hugetlb page size in the mmap, shmget and memfd_create system calls. This is done by using 6 bits within the flags argument to encode the base-2 logarithm of the desired page size. The routine hstate_sizelog() uses the log2 value to find the corresponding hugetlb hstate structure. Converting the log2 value (page_size_log) to potential hugetlb page size is the simple statement: 1UL << page_size_log Because only 6 bits are used for page_size_log, the left shift can not be greater than 63. This is fine on 64 bit architectures where a long is 64 bits. However, if a value greater than 31 is passed on a 32 bit architecture (where long is 32 bits) the shift will result in undefined behavior. This was generally not an issue as the result of the undefined shift had to exactly match hugetlb page size to proceed. Recent improvements in runtime checking have resulted in this undefined behavior throwing errors such as reported below. Fix by comparing page_size_log to BITS_PER_LONG before doing shift. Link: https://lkml.kernel.org/r/[email protected] Link: https://lore.kernel.org/lkml/CA+G9fYuei_Tr-vN9GS7SfFyU1y9hNysnf=PB7kT0=yv4MiPgVg@mail.gmail.com/ Fixes: 42d7395feb56 ("mm: support more pagesizes for MAP_HUGETLB/SHM_HUGETLB") Signed-off-by: Mike Kravetz <[email protected]> Reported-by: Naresh Kamboju <[email protected]> Reviewed-by: Jesper Juhl <[email protected]> Acked-by: Muchun Song <[email protected]> Tested-by: Linux Kernel Functional Testing <[email protected]> Tested-by: Naresh Kamboju <[email protected]> Cc: Anders Roxell <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Sasha Levin <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-02-17mm/migrate: fix wrongly apply write bit after mkdirty on sparc64Peter Xu2-2/+6
Nick Bowler reported another sparc64 breakage after the young/dirty persistent work for page migration (per "Link:" below). That's after a similar report [2]. It turns out page migration was overlooked, and it wasn't failing before because page migration was not enabled in the initial report test environment. David proposed another way [2] to fix this from sparc64 side, but that patch didn't land somehow. Neither did I check whether there's any other arch that has similar issues. Let's fix it for now as simple as moving the write bit handling to be after dirty, like what we did before. Note: this is based on mm-unstable, because the breakage was since 6.1 and we're at a very late stage of 6.2 (-rc8), so I assume for this specific case we should target this at 6.3. [1] https://lore.kernel.org/all/[email protected]/ [2] https://lore.kernel.org/all/[email protected]/ Link: https://lkml.kernel.org/r/[email protected] Fixes: 2e3468778dbe ("mm: remember young/dirty bit for page migrations") Link: https://lore.kernel.org/all/CADyTPExpEqaJiMGoV+Z6xVgL50ZoMJg49B10LcZ=8eg19u34BA@mail.gmail.com/ Signed-off-by: Peter Xu <[email protected]> Reported-by: Nick Bowler <[email protected]> Acked-by: David Hildenbrand <[email protected]> Tested-by: Nick Bowler <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-02-17Merge tag 'powerpc-6.2-6' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fix from Michael Ellerman: - Prevent fallthrough to hash TLB flush when using radix Thanks to Benjamin Gray and Erhard Furtner. * tag 'powerpc-6.2-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/64s: Prevent fallthrough to hash TLB flush when using radix
2023-02-17Merge tag 'nfs-for-6.2-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds1-4/+4
Pull NFS client fix from Trond Myklebust: "Unfortunately, we found another bug in the NFSv4.2 READ_PLUS code. Since it has not been possible to fix the bug in time for the 6.2 release, let's just revert the Kconfig change that enables it: - Revert 'NFSv4.2: Change the default KConfig value for READ_PLUS'" * tag 'nfs-for-6.2-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: Revert "NFSv4.2: Change the default KConfig value for READ_PLUS"
2023-02-17Merge tag 'sound-fix-6.2' of ↵Linus Torvalds5-7/+15
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "A few last-minute fixes. The significant ones are two ASoC SOF regression fixes while the rest are trivial HD-audio quirks. All are small / one-liners and should be pretty safe to take" * tag 'sound-fix-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ASoC: SOF: Intel: hda-dai: fix possible stream_tag leak ALSA: hda/realtek: Enable mute/micmute LEDs and speaker support for HP Laptops ALSA: hda/realtek: fix mute/micmute LEDs don't work for a HP platform. ALSA: hda/realtek - fixed wrong gpio assigned ALSA: hda: Fix codec device field initializan ALSA: hda/conexant: add a new hda codec SN6180 ASoC: SOF: ops: refine parameters order in function snd_sof_dsp_update8
2023-02-17Merge tag 'gpio-fixes-for-v6.2-part2' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux Pull gpio fix from Bartosz Golaszewski: - fix a memory leak in gpio-sim that was triggered every time libgpiod tests are run in user-space * tag 'gpio-fixes-for-v6.2-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: gpio: sim: fix a memory leak
2023-02-17Merge tag 'ata-6.2-rc8' of ↵Linus Torvalds3-8/+12
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata Pull ata fixes from Damien Le Moal: "Three small fixes for 6.2 final: - Disable READ LOG DMA EXT for Samsung MZ7LH drives as these drives choke on that command, from Patrick. - Add Intel Tiger Lake UP{3,4} to the list of supported AHCI controllers (this is not technically a bug fix, but it is trivial enough that I add it here), from Simon. - Fix code comments in the pata_octeon_cf driver as incorrect formatting was causing warnings from kernel-doc, from Randy" * tag 'ata-6.2-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata: ata: pata_octeon_cf: drop kernel-doc notation ata: ahci: Add Tiger Lake UP{3,4} AHCI controller ata: libata-core: Disable READ LOG DMA EXT for Samsung MZ7LH
2023-02-17Merge tag 'mmc-v6.2-rc5' of ↵Linus Torvalds5-29/+41
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC fixes from Ulf Hansson: "MMC core: - Fix potential resource leaks in SDIO card detection error path MMC host: - jz4740: Decrease maximum clock rate to workaround bug on JZ4760(B) - meson-gx: Fix SDIO support to get some WiFi modules to work again - mmc_spi: Fix error handling in ->probe()" * tag 'mmc-v6.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: jz4740: Work around bug on JZ4760(B) mmc: mmc_spi: fix error handling in mmc_spi_probe() mmc: sdio: fix possible resource leaks in some error paths mmc: meson-gx: fix SDIO mode if cap_sdio_irq isn't set
2023-02-17Merge tag 'sched-urgent-2023-02-17' of ↵Linus Torvalds3-11/+21
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: - Fix user-after-free bug in call_usermodehelper_exec() - Fix missing user_cpus_ptr update in __set_cpus_allowed_ptr_locked() - Fix PSI use-after-free bug in ep_remove_wait_queue() * tag 'sched-urgent-2023-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/psi: Fix use-after-free in ep_remove_wait_queue() sched/core: Fix a missed update of user_cpus_ptr freezer,umh: Fix call_usermode_helper_exec() vs SIGKILL
2023-02-17selftests/bpf: Add bpf_fib_lookup testMartin KaFai Lau2-0/+209
This patch tests the bpf_fib_lookup helper when looking up a neigh in NUD_FAILED and NUD_STALE state. It also adds test for the new BPF_FIB_LOOKUP_SKIP_NEIGH flag. Signed-off-by: Martin KaFai Lau <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2023-02-17bpf: Add BPF_FIB_LOOKUP_SKIP_NEIGH for bpf_fib_lookupMartin KaFai Lau3-13/+38
The bpf_fib_lookup() also looks up the neigh table. This was done before bpf_redirect_neigh() was added. In the use case that does not manage the neigh table and requires bpf_fib_lookup() to lookup a fib to decide if it needs to redirect or not, the bpf prog can depend only on using bpf_redirect_neigh() to lookup the neigh. It also keeps the neigh entries fresh and connected. This patch adds a bpf_fib_lookup flag, SKIP_NEIGH, to avoid the double neigh lookup when the bpf prog always call bpf_redirect_neigh() to do the neigh lookup. The params->smac output is skipped together when SKIP_NEIGH is set because bpf_redirect_neigh() will figure out the smac also. Signed-off-by: Martin KaFai Lau <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2023-02-17riscv, bpf: Add bpf trampoline support for RV64Pu Lehui1-0/+317
BPF trampoline is the critical infrastructure of the BPF subsystem, acting as a mediator between kernel functions and BPF programs. Numerous important features, such as using BPF program for zero overhead kernel introspection, rely on this key component. We can't wait to support bpf trampoline on RV64. The related tests have passed, as well as the test_verifier with no new failure ceses. Signed-off-by: Pu Lehui <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Tested-by: Björn Töpel <[email protected]> Acked-by: Björn Töpel <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2023-02-17riscv, bpf: Add bpf_arch_text_poke support for RV64Pu Lehui2-2/+91
Implement bpf_arch_text_poke for RV64. For call scenario, to make BPF trampoline compatible with the kernel and BPF context, we follow the framework of RV64 ftrace to reserve 4 nops for BPF programs as function entry, and use auipc+jalr instructions for function call. However, since auipc+jalr call instruction is non-atomic operation, we need to use stop-machine to make sure instructions patching in atomic context. Also, we use auipc+jalr pair and need to patch in stop-machine context for jump scenario. Signed-off-by: Pu Lehui <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Tested-by: Björn Töpel <[email protected]> Acked-by: Björn Töpel <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2023-02-17riscv, bpf: Factor out emit_call for kernel and bpf contextPu Lehui1-17/+13
The current emit_call function is not suitable for kernel function call as it store return value to bpf R0 register. We can separate it out for common use. Meanwhile, simplify judgment logic, that is, fixed function address can use jal or auipc+jalr, while the unfixed can use only auipc+jalr. Signed-off-by: Pu Lehui <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Tested-by: Björn Töpel <[email protected]> Acked-by: Björn Töpel <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2023-02-17riscv: Extend patch_text for multiple instructionsPu Lehui3-15/+21
Extend patch_text for multiple instructions. This is the preparaiton for multiple instructions text patching in riscv BPF trampoline, and may be useful for other scenario. Signed-off-by: Pu Lehui <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Tested-by: Björn Töpel <[email protected]> Reviewed-by: Conor Dooley <[email protected]> Acked-by: Björn Töpel <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2023-02-17Revert "bpf, test_run: fix &xdp_frame misplacement for LIVE_FRAMES"Martin KaFai Lau2-27/+9
This reverts commit 6c20822fada1b8adb77fa450d03a0d449686a4a9. build bot failed on arch with different cache line size: https://lore.kernel.org/bpf/[email protected]/ Signed-off-by: Martin KaFai Lau <[email protected]>
2023-02-17selftests/bpf: Add global subprog context passing testsAndrii Nakryiko2-0/+106
Add tests validating that it's possible to pass context arguments into global subprogs for various types of programs, including a particularly tricky KPROBE programs (which cover kprobes, uprobes, USDTs, a vast and important class of programs). Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Stanislav Fomichev <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2023-02-17selftests/bpf: Convert test_global_funcs test to test_loader frameworkAndrii Nakryiko18-123/+174
Convert 17 test_global_funcs subtests into test_loader framework for easier maintenance and more declarative way to define expected failures/successes. Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Stanislav Fomichev <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2023-02-17bpf: Fix global subprog context argument resolution logicAndrii Nakryiko1-2/+11
KPROBE program's user-facing context type is defined as typedef bpf_user_pt_regs_t. This leads to a problem when trying to passing kprobe/uprobe/usdt context argument into global subprog, as kernel always strip away mods and typedefs of user-supplied type, but takes expected type from bpf_ctx_convert as is, which causes mismatch. Current way to work around this is to define a fake struct with the same name as expected typedef: struct bpf_user_pt_regs_t {}; __noinline my_global_subprog(struct bpf_user_pt_regs_t *ctx) { ... } This patch fixes the issue by resolving expected type, if it's not a struct. It still leaves the above work-around working for backwards compatibility. Fixes: 91cc1a99740e ("bpf: Annotate context types") Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Stanislav Fomichev <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2023-02-17LoongArch, bpf: Use 4 instructions for function address in JITHengqi Chen2-1/+22
This patch fixes the following issue of function calls in JIT, like: [ 29.346981] multi-func JIT bug 105 != 103 The issus can be reproduced by running the "inline simple bpf_loop call" verifier test. This is because we are emiting 2-4 instructions for 64-bit immediate moves. During the first pass of JIT, the placeholder address is zero, emiting two instructions for it. In the extra pass, the function address is in XKVRANGE, emiting four instructions for it. This change the instruction index in JIT context. Let's always use 4 instructions for function address in JIT. So that the instruction sequences don't change between the first pass and the extra pass for function calls. Fixes: 5dc615520c4d ("LoongArch: Add BPF JIT support") Signed-off-by: Hengqi Chen <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Tested-by: Tiezhu Yang <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2023-02-17wifi: rtl8xxxu: add LEDS_CLASS dependencyArnd Bergmann1-0/+1
rtl8xxxu now unconditionally uses LEDS_CLASS, so a Kconfig dependency is required to avoid link errors: aarch64-linux-ld: drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.o: in function `rtl8xxxu_disconnect': rtl8xxxu_core.c:(.text+0x730): undefined reference to `led_classdev_unregister' ERROR: modpost: "led_classdev_unregister" [drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu.ko] undefined! ERROR: modpost: "led_classdev_register_ext" [drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu.ko] undefined! Fixes: 3be01622995b ("wifi: rtl8xxxu: Register the LED and make it blink") Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-02-17Merge tag 'nvme-6.2-2022-02-17' of git://git.infradead.org/nvme into block-6.2Jens Axboe1-0/+8
Pull NVMe fix from Christoph: "nvme fix for Linux 6.2 - fix visibility of the CMB sysfs attributes (Keith Busch)" * tag 'nvme-6.2-2022-02-17' of git://git.infradead.org/nvme: nvme-pci: refresh visible attrs for cmb attributes
2023-02-17bpf: bpf_fib_lookup should not return neigh in NUD_FAILED stateMartin KaFai Lau1-2/+2
The bpf_fib_lookup() helper does not only look up the fib (ie. route) but it also looks up the neigh. Before returning the neigh, the helper does not check for NUD_VALID. When a neigh state (neigh->nud_state) is in NUD_FAILED, its dmac (neigh->ha) could be all zeros. The helper still returns SUCCESS instead of NO_NEIGH in this case. Because of the SUCCESS return value, the bpf prog directly uses the returned dmac and ends up filling all zero in the eth header. This patch checks for NUD_VALID and returns NO_NEIGH if the neigh is not valid. Signed-off-by: Martin KaFai Lau <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2023-02-17bpf: Disable bh in bpf_test_run for xdp and tc progMartin KaFai Lau1-0/+2
Some of the bpf helpers require bh disabled. eg. The bpf_fib_lookup helper that will be used in a latter selftest. In particular, it calls ___neigh_lookup_noref that expects the bh disabled. This patch disables bh before calling bpf_prog_run[_xdp], so the testing prog can also use those helpers. Signed-off-by: Martin KaFai Lau <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2023-02-17xsk: check IFF_UP earlier in Tx pathMaciej Fijalkowski1-26/+33
Xsk Tx can be triggered via either sendmsg() or poll() syscalls. These two paths share a call to common function xsk_xmit() which has two sanity checks within. A pseudo code example to show the two paths: __xsk_sendmsg() : xsk_poll(): if (unlikely(!xsk_is_bound(xs))) if (unlikely(!xsk_is_bound(xs))) return -ENXIO; return mask; if (unlikely(need_wait)) (...) return -EOPNOTSUPP; xsk_xmit() mark napi id (...) xsk_xmit() xsk_xmit(): if (unlikely(!(xs->dev->flags & IFF_UP))) return -ENETDOWN; if (unlikely(!xs->tx)) return -ENOBUFS; As it can be observed above, in sendmsg() napi id can be marked on interface that was not brought up and this causes a NULL ptr dereference: [31757.505631] BUG: kernel NULL pointer dereference, address: 0000000000000018 [31757.512710] #PF: supervisor read access in kernel mode [31757.517936] #PF: error_code(0x0000) - not-present page [31757.523149] PGD 0 P4D 0 [31757.525726] Oops: 0000 [#1] PREEMPT SMP NOPTI [31757.530154] CPU: 26 PID: 95641 Comm: xdpsock Not tainted 6.2.0-rc5+ #40 [31757.536871] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019 [31757.547457] RIP: 0010:xsk_sendmsg+0xde/0x180 [31757.551799] Code: 00 75 a2 48 8b 00 a8 04 75 9b 84 d2 74 69 8b 85 14 01 00 00 85 c0 75 1b 48 8b 85 28 03 00 00 48 8b 80 98 00 00 00 48 8b 40 20 <8b> 40 18 89 85 14 01 00 00 8b bd 14 01 00 00 81 ff 00 01 00 00 0f [31757.570840] RSP: 0018:ffffc90034f27dc0 EFLAGS: 00010246 [31757.576143] RAX: 0000000000000000 RBX: ffffc90034f27e18 RCX: 0000000000000000 [31757.583389] RDX: 0000000000000001 RSI: ffffc90034f27e18 RDI: ffff88984cf3c100 [31757.590631] RBP: ffff88984714a800 R08: ffff88984714a800 R09: 0000000000000000 [31757.597877] R10: 0000000000000001 R11: 0000000000000000 R12: 00000000fffffffa [31757.605123] R13: 0000000000000000 R14: 0000000000000003 R15: 0000000000000000 [31757.612364] FS: 00007fb4c5931180(0000) GS:ffff88afdfa00000(0000) knlGS:0000000000000000 [31757.620571] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [31757.626406] CR2: 0000000000000018 CR3: 000000184b41c003 CR4: 00000000007706e0 [31757.633648] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [31757.640894] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [31757.648139] PKRU: 55555554 [31757.650894] Call Trace: [31757.653385] <TASK> [31757.655524] sock_sendmsg+0x8f/0xa0 [31757.659077] ? sockfd_lookup_light+0x12/0x70 [31757.663416] __sys_sendto+0xfc/0x170 [31757.667051] ? do_sched_setscheduler+0xdb/0x1b0 [31757.671658] __x64_sys_sendto+0x20/0x30 [31757.675557] do_syscall_64+0x38/0x90 [31757.679197] entry_SYSCALL_64_after_hwframe+0x72/0xdc [31757.687969] Code: 8e f6 ff 44 8b 4c 24 2c 4c 8b 44 24 20 41 89 c4 44 8b 54 24 28 48 8b 54 24 18 b8 2c 00 00 00 48 8b 74 24 10 8b 7c 24 08 0f 05 <48> 3d 00 f0 ff ff 77 3a 44 89 e7 48 89 44 24 08 e8 b5 8e f6 ff 48 [31757.707007] RSP: 002b:00007ffd49c73c70 EFLAGS: 00000293 ORIG_RAX: 000000000000002c [31757.714694] RAX: ffffffffffffffda RBX: 000055a996565380 RCX: 00007fb4c5727c16 [31757.721939] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003 [31757.729184] RBP: 0000000000000040 R08: 0000000000000000 R09: 0000000000000000 [31757.736429] R10: 0000000000000040 R11: 0000000000000293 R12: 0000000000000000 [31757.743673] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [31757.754940] </TASK> To fix this, let's make xsk_xmit a function that will be responsible for generic Tx, where RCU is handled accordingly and pull out sanity checks and xs->zc handling. Populate sanity checks to __xsk_sendmsg() and xsk_poll(). Fixes: ca2e1a627035 ("xsk: Mark napi_id on sendmsg()") Fixes: 18b1ab7aa76b ("xsk: Fix race at socket teardown") Signed-off-by: Maciej Fijalkowski <[email protected]> Reviewed-by: Alexander Lobakin <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin KaFai Lau <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]>
2023-02-17Revert "NFSv4.2: Change the default KConfig value for READ_PLUS"Anna Schumaker1-4/+4
This reverts commit 7fd461c47c6cfab4ca4d003790ec276209e52978. Unfortunately, it has come to our attention that there is still a bug somewhere in the READ_PLUS code that can result in nfsroot systems on ARM to crash during boot. Let's do the right thing and revert this change so we don't break people's nfsroot setups. Signed-off-by: Anna Schumaker <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2023-02-17brd: use radix_tree_maybe_preload instead of radix_tree_preloadPankaj Raghav1-1/+1
Unconditionally calling radix_tree_preload_end() results in a OOPS message as the preload is only conditionally called for gfpflags_allow_blocking(). [ 20.267323] BUG: using smp_processor_id() in preemptible [00000000] code: fio/416 [ 20.267837] caller is brd_insert_page.part.0+0xbe/0x190 [brd] [ 20.269436] Call Trace: [ 20.269598] <TASK> [ 20.269742] dump_stack_lvl+0x32/0x50 [ 20.269982] check_preemption_disabled+0xd1/0xe0 [ 20.270289] brd_insert_page.part.0+0xbe/0x190 [brd] [ 20.270664] brd_submit_bio+0x33f/0xf40 [brd] Use radix_tree_maybe_preload() which does preload only if gfpflags_allow_blocking() is true but also takes the lock. Therefore, unconditionally calling radix_tree_preload_end() should not create any issues and the message disappears. Fixes: 6ded703c56c2 ("brd: check for REQ_NOWAIT and set correct page allocation mask") Signed-off-by: Pankaj Raghav <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2023-02-17netfilter: let reset rules clean out conntrack entriesFlorian Westphal7-0/+76
iptables/nftables support responding to tcp packets with tcp resets. The generated tcp reset packet passes through both output and postrouting netfilter hooks, but conntrack will never see them because the generated skb has its ->nfct pointer copied over from the packet that triggered the reset rule. If the reset rule is used for established connections, this may result in the conntrack entry to be around for a very long time (default timeout is 5 days). One way to avoid this would be to not copy the nf_conn pointer so that the rest packet passes through conntrack too. Problem is that output rules might not have the same conntrack zone setup as the prerouting ones, so its possible that the reset skb won't find the correct entry. Generating a template entry for the skb seems error prone as well. Add an explicit "closing" function that switches a confirmed conntrack entry to closed state and wire this up for tcp. If the entry isn't confirmed, no action is needed because the conntrack entry will never be committed to the table. Reported-by: Russel King <[email protected]> Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2023-02-17Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/netDavid S. Miller196-775/+1611
Some of the devlink bits were tricky, but I think I got it right. Signed-off-by: David S. Miller <[email protected]>
2023-02-17gpio: sim: fix a memory leakBartosz Golaszewski1-1/+1
Fix an inverted logic bug in gpio_sim_remove_hogs() that leads to GPIO hog structures never being freed. Fixes: cb8c474e79be ("gpio: sim: new testing module") Reported-by: Mirsad Goran Todorovac <[email protected]> Signed-off-by: Bartosz Golaszewski <[email protected]> Reviewed-by: Andy Shevchenko <[email protected]>
2023-02-17wifi: iwlegacy: avoid fortify warningJohannes Berg1-1/+1
There are two different alive messages, the "init" one is bigger than the other one, so we have a fortify read warn here. Avoid it by copying from the variable-sized 'raw' instead. Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-02-17wifi: iwlwifi: mvm: remove unused iwl_dbgfs_is_match()Johannes Berg1-7/+0
This inline function is unused, remove it. Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/20230216205754.d500dcc2e90c.Id87df297263f86b5bba002f7cbb387abc13adf53@changeid
2023-02-17wifi: rtw89: fix AP mode authentication transmission failedPo-Hao Huang1-21/+26
For some ICs, packets can't be sent correctly without initializing CMAC table first. Previous flow do this initialization after associated, results in authentication response fails to transmit. Move the initialization up front to a proper place to solve this. Signed-off-by: Po-Hao Huang <[email protected]> Signed-off-by: Ping-Ke Shih <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-02-17wifi: rtw88: use RTW_FLAG_POWERON flag to prevent to power on/off twicePing-Ke Shih5-5/+15
Use power state to decide whether we can enter or leave IPS accurately, and then prevent to power on/off twice. The commit 6bf3a083407b ("wifi: rtw88: add flag check before enter or leave IPS") would like to prevent this as well, but it still can't entirely handle all cases. The exception is that WiFi gets connected and does suspend/resume, it will power on twice and cause it failed to power on after resuming, like: rtw_8723de 0000:03:00.0: failed to poll offset=0x6 mask=0x2 value=0x2 rtw_8723de 0000:03:00.0: mac power on failed rtw_8723de 0000:03:00.0: failed to power on mac rtw_8723de 0000:03:00.0: leave idle state failed rtw_8723de 0000:03:00.0: failed to leave ips state rtw_8723de 0000:03:00.0: failed to leave idle state rtw_8723de 0000:03:00.0: failed to send h2c command To fix this, introduce new flag RTW_FLAG_POWERON to reflect power state, and call rtw_mac_pre_system_cfg() to configure registers properly between power-off/-on. Reported-by: Paul Gover <[email protected]> Link: https://bugzilla.kernel.org/show_bug.cgi?id=217016 Fixes: 6bf3a083407b ("wifi: rtw88: add flag check before enter or leave IPS") Cc: <[email protected]> Signed-off-by: Ping-Ke Shih <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-02-17Merge tag 'asoc-fix-v6.2-rc8-2' of ↵Takashi Iwai1-4/+4
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus ASoC: One more fix for v6.2 One more fix from Peter which he'd very much like to get into v6.2.
2023-02-17nvme-pci: refresh visible attrs for cmb attributesKeith Busch1-0/+8
The sysfs group containing the cmb attributes is registered before the driver knows if they need to be visible or not. Update the group when cmb attributes are known to exist so the visibility setting is correct. Link: https://bugzilla.kernel.org/show_bug.cgi?id=217037 Fixes: 86adbf0cdb9ec65 ("nvme: simplify transport specific device attribute handling") Signed-off-by: Keith Busch <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2023-02-16Merge tag 'drm-fixes-2023-02-17' of git://anongit.freedesktop.org/drm/drmLinus Torvalds19-38/+73
Pull drm fixes from Dave Airlie: "Just a final collection of misc fixes, the biggest disables the recently added dynamic debugging support, it has a regression that needs some bigger fixes. Otherwise a bunch of fixes across the board, vc4, amdgpu and vmwgfx mostly, with some smaller i915 and ast fixes. drm: - dynamic debug disable for now fbdev: - deferred i/o device close fix amdgpu: - Fix GC11.x suspend warning - Fix display warning vc4: - YUV planes fix - hdmi display fix - crtc reduced blanking fix ast: - fix start address computation vmwgfx: - fix bo/handle races i915: - gen11 WA fix" * tag 'drm-fixes-2023-02-17' of git://anongit.freedesktop.org/drm/drm: drm/amd/display: Fail atomic_check early on normalize_zpos error drm/amd/amdgpu: fix warning during suspend drm/vmwgfx: Do not drop the reference to the handle too soon drm/vmwgfx: Stop accessing buffer objects which failed init drm/i915/gen11: Wa_1408615072/Wa_1407596294 should be on GT list drm: Disable dynamic debug as broken drm/ast: Fix start address computation fbdev: Fix invalid page access after closing deferred I/O devices drm/vc4: crtc: Increase setup cost in core clock calculation to handle extreme reduced blanking drm/vc4: hdmi: Always enable GCP with AVMUTE cleared drm/vc4: Fix YUV plane handling when planes are in different buffers
2023-02-16block: use proper return value from bio_failfast()Jens Axboe1-1/+1
kernel test robot complains about a type mismatch: block/blk-merge.c:984:42: sparse: expected restricted blk_opf_t const [usertype] ff block/blk-merge.c:984:42: sparse: got unsigned int block/blk-merge.c:1010:42: sparse: sparse: incorrect type in initializer (different base types) @@ expected restricted blk_opf_t const [usertype] ff @@ got unsigned int @@ block/blk-merge.c:1010:42: sparse: expected restricted blk_opf_t const [usertype] ff block/blk-merge.c:1010:42: sparse: got unsigned int because bio_failfast() is return an unsigned int rather than the appropriate blk_opt_f type. Fix it up. Fixes: 3ce6a115980c ("block: sync mixed merged request's failfast with 1st bio's") Reported-by: kernel test robot <[email protected]> Link: https://lore.kernel.org/oe-kbuild-all/[email protected]/ Signed-off-by: Jens Axboe <[email protected]>
2023-02-16MAINTAINERS: update FPU EMULATOR web pageRandy Dunlap1-1/+1
The web page entry for the FPU EMULATOR no longer works. I notified Bill of this and he asked me to update it to this new entry. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Randy Dunlap <[email protected]> Acked-by: Bill Metzenthen <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-02-16mm/MADV_COLLAPSE: set EAGAIN on unexpected page refcountZach O'Keefe1-0/+1
During collapse, in a few places we check to see if a given small page has any unaccounted references. If the refcount on the page doesn't match our expectations, it must be there is an unknown user concurrently interested in the page, and so it's not safe to move the contents elsewhere. However, the unaccounted pins are likely an ephemeral state. In this situation, MADV_COLLAPSE returns -EINVAL when it should return -EAGAIN. This could cause userspace to conclude that the syscall failed, when it in fact could succeed by retrying. Link: https://lkml.kernel.org/r/[email protected] Fixes: 7d8faaf15545 ("mm/madvise: introduce MADV_COLLAPSE sync hugepage collapse") Signed-off-by: Zach O'Keefe <[email protected]> Reported-by: Hugh Dickins <[email protected]> Acked-by: Hugh Dickins <[email protected]> Reviewed-by: Yang Shi <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-02-16mm/filemap: fix page end in filemap_get_read_batchQian Yingjin1-2/+3
I was running traces of the read code against an RAID storage system to understand why read requests were being misaligned against the underlying RAID strips. I found that the page end offset calculation in filemap_get_read_batch() was off by one. When a read is submitted with end offset 1048575, then it calculates the end page for read of 256 when it should be 255. "last_index" is the index of the page beyond the end of the read and it should be skipped when get a batch of pages for read in @filemap_get_read_batch(). The below simple patch fixes the problem. This code was introduced in kernel 5.12. Link: https://lkml.kernel.org/r/[email protected] Fixes: cbd59c48ae2b ("mm/filemap: use head pages in generic_file_buffered_read") Signed-off-by: Qian Yingjin <[email protected]> Reviewed-by: Matthew Wilcox (Oracle) <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-02-17powerpc/64s: Prevent fallthrough to hash TLB flush when using radixBenjamin Gray1-2/+2
In the fix reconnecting hash__tlb_flush() to tlb_flush() the void return on radix__tlb_flush() was not restored and subsequently falls through to the restored hash__tlb_flush(). Guard hash__tlb_flush() under an else to prevent this. Fixes: 1665c027afb2 ("powerpc/64s: Reconnect tlb_flush() to hash__tlb_flush()") Reported-by: "Erhard F." <[email protected]> Suggested-by: Christophe Leroy <[email protected]> Signed-off-by: Benjamin Gray <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]