aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2024-08-16net/mlx5: Fix IPsec RoCE MPV trace callPatrisious Haddad1-2/+4
Prevent the call trace below from happening, by not allowing IPsec creation over a slave, if master device doesn't support IPsec. WARNING: CPU: 44 PID: 16136 at kernel/locking/rwsem.c:240 down_read+0x75/0x94 Modules linked in: esp4_offload esp4 act_mirred act_vlan cls_flower sch_ingress mlx5_vdpa vringh vhost_iotlb vdpa mst_pciconf(OE) nfsv3 nfs_acl nfs lockd grace fscache netfs xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_counter nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill cuse fuse rpcrdma sunrpc rdma_ucm ib_srpt ib_isert iscsi_target_mod target_core_mod ib_umad ib_iser libiscsi scsi_transport_iscsi rdma_cm ib_ipoib iw_cm ib_cm ipmi_ssif intel_rapl_msr intel_rapl_common amd64_edac edac_mce_amd kvm_amd kvm irqbypass crct10dif_pclmul crc32_pclmul mlx5_ib ghash_clmulni_intel sha1_ssse3 dell_smbios ib_uverbs aesni_intel crypto_simd dcdbas wmi_bmof dell_wmi_descriptor cryptd pcspkr ib_core acpi_ipmi sp5100_tco ccp i2c_piix4 ipmi_si ptdma k10temp ipmi_devintf ipmi_msghandler acpi_power_meter acpi_cpufreq ext4 mbcache jbd2 sd_mod t10_pi sg mgag200 drm_kms_helper syscopyarea sysfillrect mlx5_core sysimgblt fb_sys_fops cec ahci libahci mlxfw drm pci_hyperv_intf libata tg3 sha256_ssse3 tls megaraid_sas i2c_algo_bit psample wmi dm_mirror dm_region_hash dm_log dm_mod [last unloaded: mst_pci] CPU: 44 PID: 16136 Comm: kworker/44:3 Kdump: loaded Tainted: GOE 5.15.0-20240509.el8uek.uek7_u3_update_v6.6_ipsec_bf.x86_64 #2 Hardware name: Dell Inc. PowerEdge R7525/074H08, BIOS 2.0.3 01/15/2021 Workqueue: events xfrm_state_gc_task RIP: 0010:down_read+0x75/0x94 Code: 00 48 8b 45 08 65 48 8b 14 25 80 fc 01 00 83 e0 02 48 09 d0 48 83 c8 01 48 89 45 08 5d 31 c0 89 c2 89 c6 89 c7 e9 cb 88 3b 00 <0f> 0b 48 8b 45 08 a8 01 74 b2 a8 02 75 ae 48 89 c2 48 83 ca 02 f0 RSP: 0018:ffffb26387773da8 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffffa08b658af900 RCX: 0000000000000001 RDX: 0000000000000000 RSI: ff886bc5e1366f2f RDI: 0000000000000000 RBP: ffffa08b658af940 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffffa0a9bfb31540 R13: ffffa0a9bfb37900 R14: 0000000000000000 R15: ffffa0a9bfb37905 FS: 0000000000000000(0000) GS:ffffa0a9bfb00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055a45ed814e8 CR3: 000000109038a000 CR4: 0000000000350ee0 Call Trace: <TASK> ? show_trace_log_lvl+0x1d6/0x2f9 ? show_trace_log_lvl+0x1d6/0x2f9 ? mlx5_devcom_for_each_peer_begin+0x29/0x60 [mlx5_core] ? down_read+0x75/0x94 ? __warn+0x80/0x113 ? down_read+0x75/0x94 ? report_bug+0xa4/0x11d ? handle_bug+0x35/0x8b ? exc_invalid_op+0x14/0x75 ? asm_exc_invalid_op+0x16/0x1b ? down_read+0x75/0x94 ? down_read+0xe/0x94 mlx5_devcom_for_each_peer_begin+0x29/0x60 [mlx5_core] mlx5_ipsec_fs_roce_tx_destroy+0xb1/0x130 [mlx5_core] tx_destroy+0x1b/0xc0 [mlx5_core] tx_ft_put+0x53/0xc0 [mlx5_core] mlx5e_xfrm_free_state+0x45/0x90 [mlx5_core] ___xfrm_state_destroy+0x10f/0x1a2 xfrm_state_gc_task+0x81/0xa9 process_one_work+0x1f1/0x3c6 worker_thread+0x53/0x3e4 ? process_one_work.cold+0x46/0x3c kthread+0x127/0x144 ? set_kthread_struct+0x60/0x52 ret_from_fork+0x22/0x2d </TASK> ---[ end trace 5ef7896144d398e1 ]--- Fixes: dfbd229abeee ("net/mlx5: Configure IPsec steering for egress RoCEv2 MPV traffic") Reviewed-by: Leon Romanovsky <[email protected]> Signed-off-by: Patrisious Haddad <[email protected]> Signed-off-by: Tariq Toukan <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-08-16net/mlx5e: XPS, Fix oversight of Multi-PF Netdev changesCarolina Jubran1-5/+8
The offending commit overlooked the Multi-PF Netdev changes. Revert mlx5e_set_default_xps_cpumasks to incorporate Multi-PF Netdev changes. Fixes: bcee093751f8 ("net/mlx5e: Modifying channels number and updating TX queues") Signed-off-by: Carolina Jubran <[email protected]> Signed-off-by: Tariq Toukan <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-08-16net/mlx5e: SHAMPO, Release in progress headersDragos Tatulea3-10/+24
The change in the fixes tag cleaned up too much: it removed the part that was releasing header pages that were posted via UMR but haven't been acknowledged yet on the ICOSQ. This patch corrects this omission by setting the bits between pi and ci to on when shutting down a queue with SHAMPO. To be consistent with the Striding RQ code, this action is done in mlx5e_free_rx_missing_descs(). Fixes: e839ac9a89cb ("net/mlx5e: SHAMPO, Simplify header page release in teardown") Signed-off-by: Dragos Tatulea <[email protected]> Signed-off-by: Tariq Toukan <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-08-16net/mlx5e: SHAMPO, Fix page leakDragos Tatulea1-0/+1
When SHAMPO is used, a receive queue currently almost always leaks one page on shutdown. A page has MLX5E_SHAMPO_WQ_HEADER_PER_PAGE (8) headers. These headers are tracked in the SHAMPO bitmap. Each page is released when the last header index in the group is processed. During header allocation, there can be leftovers from a page that will be used in a subsequent allocation. This is normally fine, except for the following scenario (simplified a bit): 1) Allocate N new page fragments, showing only the relevant last 4 fragments: 0: new page 1: new page 2: new page 3: new page 4: page from previous allocation 5: page from previous allocation 6: page from previous allocation 7: page from previous allocation 2) NAPI processes header indices 4-7 because they are the oldest allocated. Bit 7 will be set to 0. 3) Receive queue shutdown occurs. All the remaining bits are being iterated on to release the pages. But the page assigned to header indices 0-3 will not be freed due to what happened in step 2. This patch fixes the issue by making sure that on allocation, header fragments are always allocated in groups of MLX5E_SHAMPO_WQ_HEADER_PER_PAGE so that there is never a partial page left over between allocations. A more appropriate fix would be a refactoring of mlx5e_alloc_rx_hd_mpwqe() and mlx5e_build_shampo_hd_umr(). But this refactoring is too big for net. It will be targeted for net-next. Fixes: e839ac9a89cb ("net/mlx5e: SHAMPO, Simplify header page release in teardown") Signed-off-by: Dragos Tatulea <[email protected]> Signed-off-by: Tariq Toukan <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-08-16Merge tag 'block-6.11-20240824' of git://git.kernel.dk/linuxLinus Torvalds7-60/+63
Pull block fixes from Jens Axboe: - Fix corruption issues with s390/dasd (Eric, Stefan) - Fix a misuse of non irq locking grab of a lock (Li) - MD pull request with a single data corruption fix for raid1 (Yu) * tag 'block-6.11-20240824' of git://git.kernel.dk/linux: block: Fix lockdep warning in blk_mq_mark_tag_wait md/raid1: Fix data corruption for degraded array with slow disk s390/dasd: fix error recovery leading to data corruption on ESE devices s390/dasd: Remove DMA alignment
2024-08-16Merge tag 'io_uring-6.11-20240824' of git://git.kernel.dk/linuxLinus Torvalds4-5/+4
Pull io_uring fixes from Jens Axboe: - Fix a comment in the uapi header using the wrong member name (Caleb) - Fix KCSAN warning for a debug check in sqpoll (me) - Two more NAPI tweaks (Olivier) * tag 'io_uring-6.11-20240824' of git://git.kernel.dk/linux: io_uring: fix user_data field name in comment io_uring/sqpoll: annotate debug task == current with data_race() io_uring/napi: remove duplicate io_napi_entry timeout assignation io_uring/napi: check napi_enabled in io_napi_add() before proceeding
2024-08-16Merge tag 'devicetree-fixes-for-6.11-2' of ↵Linus Torvalds26-28/+36
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull devicetree fixes from Rob Herring: - Fix a possible (but unlikely) out-of-bounds read in interrupts parsing code - Add AT25 EEPROM "fujitsu,mb85rs256" compatible - Update Konrad Dybcio's email * tag 'devicetree-fixes-for-6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: of/irq: Prevent device address out-of-bounds read in interrupt map walk dt-bindings: eeprom: at25: add fujitsu,mb85rs256 compatible dt-bindings: Batch-update Konrad Dybcio's email
2024-08-16btrfs: only enable extent map shrinker for DEBUG buildsQu Wenruo1-1/+7
Although there are several patches improving the extent map shrinker, there are still reports of too frequent shrinker behavior, taking too much CPU for the kswapd process. So let's only enable extent shrinker for now, until we got more comprehensive understanding and a better solution. Link: https://lore.kernel.org/linux-btrfs/[email protected]/ Link: https://lore.kernel.org/linux-btrfs/[email protected]/ Fixes: 956a17d9d050 ("btrfs: add a shrinker for extent maps") CC: [email protected] # 6.10+ Signed-off-by: Qu Wenruo <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2024-08-16Merge tag 'thermal-6.11-rc4' of ↵Linus Torvalds3-18/+69
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull thermal control fix from Rafael Wysocki: "Fix a Bang-bang thermal governor issue causing it to fail to reset the state of cooling devices if they are 'on' to start with, but the thermal zone temperature is always below the corresponding trip point (Rafael Wysocki)" * tag 'thermal-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: thermal: gov_bang_bang: Use governor_data to reduce overhead thermal: gov_bang_bang: Add .manage() callback thermal: gov_bang_bang: Split bang_bang_control() thermal: gov_bang_bang: Call __thermal_cdev_update() directly
2024-08-16Merge tag 'acpi-6.11-rc4' of ↵Linus Torvalds7-74/+30
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fix from Rafael Wysocki: "Fix an issue related to the ACPI EC device handling that causes the _REG control method to be evaluated for EC operation regions that are not expected to be used. This confuses the platform firmware and provokes various types of misbehavior on some systems (Rafael Wysocki)" * tag 'acpi-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: EC: Evaluate _REG outside the EC scope more carefully ACPICA: Add a depth argument to acpi_execute_reg_methods() Revert "ACPI: EC: Evaluate orphan _REG under EC device"
2024-08-16Merge tag 'libnvdimm-fixes-6.11-rc4' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm fix from Ira Weiny: "Commit f467fee48da4 ("block: move the dax flag to queue_limits") broke the DAX tests by skipping over the legacy pmem mapping pages case. Set the DAX flag in this case as well" * tag 'libnvdimm-fixes-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: nvdimm/pmem: Set dax flag for all 'PFN_MAP' cases
2024-08-16io_uring: fix user_data field name in commentCaleb Sander Mateos1-1/+1
io_uring_cqe's user_data field refers to `sqe->data`, but io_uring_sqe does not have a data field. Fix the comment to say `sqe->user_data`. Signed-off-by: Caleb Sander Mateos <[email protected]> Link: https://github.com/axboe/liburing/pull/1206 Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2024-08-16Merge tag 'rust-fixes-6.11' of https://github.com/Rust-for-Linux/linuxLinus Torvalds7-11/+18
Pull rust fixes from Miguel Ojeda: - Fix '-Os' Rust 1.80.0+ builds adding more intrinsics (also tweaked in upstream Rust for the upcoming 1.82.0). - Fix support for the latest version of rust-analyzer due to a change on rust-analyzer config file semantics (considered a fix since most developers use the latest version of the tool, which is the only one actually supported by upstream). I am discussing stability of the config file with upstream -- they may be able to start versioning it. - Fix GCC 14 builds due to '-fmin-function-alignment' not skipped for libclang (bindgen). - A couple Kconfig fixes around '{RUSTC,BINDGEN}_VERSION_TEXT' to suppress error messages in a foreign architecture chroot and to use a proper default format. - Clean 'rust-analyzer' target warning due to missing recursive make invocation mark. - Clean Clippy warning due to missing indentation in docs. - Clean LLVM 19 build warning due to removed 3dnow feature upstream. * tag 'rust-fixes-6.11' of https://github.com/Rust-for-Linux/linux: rust: x86: remove `-3dnow{,a}` from target features kbuild: rust-analyzer: mark `rust_is_available.sh` invocation as recursive rust: add intrinsics to fix `-Os` builds kbuild: rust: skip -fmin-function-alignment in bindgen flags rust: Support latest version of `rust-analyzer` rust: macros: indent list item in `module!`'s docs rust: fix the default format for CONFIG_{RUSTC,BINDGEN}_VERSION_TEXT rust: suppress error messages from CONFIG_{RUSTC,BINDGEN}_VERSION_TEXT
2024-08-16Merge tag 'riscv-for-linus-6.11-rc4' of ↵Linus Torvalds11-35/+54
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - reintroduce the text patching global icache flush - fix syscall entry code to correctly initialize a0, which manifested as a strace bug - XIP kernels now map the entire kernel, which fixes boot under at least DEBUG_VIRTUAL=y - initialize all nodes in the acpi_early_node_map initializer - fix OOB access in the Andes vendor extension probing code - A new key for scalar misaligned access performance in hwprobe, which correctly treat the values as an enum (as opposed to a bitmap) * tag 'riscv-for-linus-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: Fix out-of-bounds when accessing Andes per hart vendor extension array RISC-V: hwprobe: Add SCALAR to misaligned perf defines RISC-V: hwprobe: Add MISALIGNED_PERF key RISC-V: ACPI: NUMA: initialize all values of acpi_early_node_map to NUMA_NO_NODE riscv: change XIP's kernel_map.size to be size of the entire kernel riscv: entry: always initialize regs->a0 to -ENOSYS riscv: Re-introduce global icache flush in patch_text_XXX()
2024-08-16Merge tag 'trace-v6.11-rc3' of ↵Linus Torvalds2-8/+5
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: "A couple of fixes for tracing: - Prevent a NULL pointer dereference in the error path of RTLA tool - Fix an infinite loop bug when reading from the ring buffer when closed. If there's a thread trying to read the ring buffer and it gets closed by another thread, the one reading will go into an infinite loop when the buffer is empty instead of exiting back to user space" * tag 'trace-v6.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: rtla/osnoise: Prevent NULL dereference in error handling tracing: Return from tracing_buffers_read() if the file has been closed
2024-08-16Merge tag 'keys-trusted-next-6.11-rc4' of ↵Linus Torvalds1-13/+22
git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd Pull key fixes from Jarkko Sakkinen: "Two bug fixes for a memory corruption bug and a memory leak bug in the DCP trusted keys type. Just as a reminder DCP was a crypto coprocessor in i.MX SoCs" * tag 'keys-trusted-next-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd: KEYS: trusted: dcp: fix leak of blob encryption key KEYS: trusted: fix DCP blob payload length assignment
2024-08-16bcachefs: fix incorrect i_state usageKent Overstreet1-1/+1
Reported-by: [email protected] Signed-off-by: Kent Overstreet <[email protected]>
2024-08-16bcachefs: avoid overflowing LRU_TIME_BITS for cached data lruKent Overstreet1-1/+3
Reported-by: [email protected] Signed-off-by: Kent Overstreet <[email protected]>
2024-08-16bcachefs: Fix forgetting to pass trans to fsck_err()Kent Overstreet1-1/+1
Reported-by: [email protected] Signed-off-by: Kent Overstreet <[email protected]>
2024-08-16bcachefs: Increase size of cuckoo hash table on too many rehashesKent Overstreet1-2/+9
Also, improve the calculation of the new table size, so that it can shrink when needed. Signed-off-by: Kent Overstreet <[email protected]>
2024-08-16Merge tag 'for-6.11/dm-fixes' of ↵Linus Torvalds4-13/+32
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mikulas Patocka: - fix misbehavior if suspend or resume is interrupted by a signal - fix wrong indentation in dm-crypt.rst - fix memory allocation failure in dm-persistent-data * tag 'for-6.11/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm persistent data: fix memory allocation failure Documentation: dm-crypt.rst warning + error fix dm resume: don't return EINVAL when signalled dm suspend: return -ERESTARTSYS instead of -EINTR
2024-08-16Merge tag 'iommu-fixes-v6.11-rc3' of ↵Linus Torvalds2-2/+1
git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux Pull iommu fixes from Joerg Roedel: - Bring back a lost return statement in io-page-fault code - Remove an unused function declaration * tag 'iommu-fixes-v6.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: iommu: Remove unused declaration iommu_sva_unbind_gpasid() iommu: Restore lost return in iommu_report_device_fault()
2024-08-16Merge tag 'gpio-fixes-for-v6.11-rc4' of ↵Linus Torvalds1-0/+14
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux Pull gpio fix from Bartosz Golaszewski: - add the shutdown() callback to gpio-mlxbf3 in order to disable interrupts during graceful reboot * tag 'gpio-fixes-for-v6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: gpio: mlxbf3: Support shutdown() function
2024-08-16Merge tag 'sound-6.11-rc4' of ↵Linus Torvalds8-10/+132
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "All small fixes, mostly for usual suspects, HD-audio and USB-audio device-specific fixes / quirks. The Cirrus codec support took the update of SPI header as well. Other than that, there is a regression fix in the sanity check of ALSA timer code" * tag 'sound-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda/tas2781: Use correct endian conversion ALSA: usb-audio: Support Yamaha P-125 quirk entry ALSA: hda: cs35l41: Remove redundant call to hda_cs_dsp_control_remove() ALSA: hda: cs35l56: Remove redundant call to hda_cs_dsp_control_remove() ALSA: hda/tas2781: fix wrong calibrated data order ALSA: usb-audio: Add delay quirk for VIVO USB-C-XE710 HEADSET ALSA: hda/realtek: Add support for new HP G12 laptops ALSA: hda/realtek: Fix noise from speakers on Lenovo IdeaPad 3 15IAU7 ALSA: timer: Relax start tick time check for slave timer elements spi: Add empty versions of ACPI functions
2024-08-16Merge tag 'drm-fixes-2024-08-16' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds47-426/+699
Pull drm fixes from Dave Airlie: "Weekly drm fixes, mostly amdgpu and xe. The larger amdgpu fix is for a new IP block introduced in rc1, so should be fine. The xe fixes contain some missed fixes from the end of the previous round along with some fixes which required precursor changes, but otherwise everything seems fine, mediatek: - fix cursor crash amdgpu: - Fix MES ring buffer overflow - DCN 3.5 fix - DCN 3.2.1 fix - DP MST fix - Cursor fixes - JPEG fixes - Context ops validation - MES 12 fixes - VCN 5.0 fix - HDP fix panel: - dt bindings style fix - orientation quirks rockchip: - inno-hdmi: fix infoframe upload v3d: - fix OOB access in v3d_csd_job_run() xe: - Validate user fence during creation - Fix use after free when client stats are captured - SRIOV fixes - Runtime PM fixes" * tag 'drm-fixes-2024-08-16' of https://gitlab.freedesktop.org/drm/kernel: (37 commits) drm/xe: Hold a PM ref when GT TLB invalidations are inflight drm/xe: Drop xe_gt_tlb_invalidation_wait drm/xe: Add xe_gt_tlb_invalidation_fence_init helper drm/xe/pf: Fix VF config validation on multi-GT platforms drm/xe: Build PM into GuC CT layer drm/xe/vf: Fix register value lookup drm/xe: Fix use after free when client stats are captured drm/xe: Take a ref to xe file when user creates a VM drm/xe: Add ref counting for xe_file drm/xe: Move part of xe_file cleanup to a helper drm/xe: Validate user fence during creation drm/rockchip: inno-hdmi: Fix infoframe upload drm/amd/amdgpu: add HDP_SD support on gc 12.0.0/1 drm/amdgpu: Update kmd_fw_shared for VCN5 drm/amd/amdgpu: command submission parser for JPEG drm/amdgpu/mes12: fix suspend issue drm/amdgpu/mes12: sw/hw fini for unified mes drm/amdgpu/mes12: configure two pipes hardware resources drm/amdgpu/mes12: adjust mes12 sw/hw init for multiple pipes drm/amdgpu/mes12: add mes pipe switch support ...
2024-08-16Merge tag 'i2c-host-fixes-6.11-rc4' of ↵Wolfram Sang2-3/+5
git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-current Two fixes in this update: Tegra I2C Controller: Addresses a potential double-locking issue during probe. ACPI devices are not IRQ-safe when invoking runtime suspend and resume functions, so the irq_safe flag should not be set. Qualcomm GENI I2C Controller: Fixes an oversight in the exit path of the runtime_resume() function, which was missed in the previous release.
2024-08-16Documentation/llvm: turn make command for ccache into code blockJavier Carrasco1-1/+1
The command provided to use ccache with clang is not a literal code block. Once built, the documentation displays the '' symbols as a " character, which is wrong, and the command can not be applied as provided. Turn the command into a literal code block. Signed-off-by: Javier Carrasco <[email protected]> Reviewed-by: Nathan Chancellor <[email protected]> Signed-off-by: Masahiro Yamada <[email protected]>
2024-08-16thermal: gov_bang_bang: Use governor_data to reduce overheadRafael J. Wysocki3-1/+21
After running once, the for_each_trip_desc() loop in bang_bang_manage() is pure needless overhead because it is not going to make any changes unless a new cooling device has been bound to one of the trips in the thermal zone or the system is resuming from sleep. For this reason, make bang_bang_manage() set governor_data for the thermal zone and check it upfront to decide whether or not it needs to do anything. However, governor_data needs to be reset in some cases to let bang_bang_manage() know that it should walk the trips again, so add an .update_tz() callback to the governor and make the core additionally invoke it during system resume. To avoid affecting the other users of that callback unnecessarily, add a special notification reason for system resume, THERMAL_TZ_RESUME, and also pass it to __thermal_zone_device_update() called during system resume for consistency. Signed-off-by: Rafael J. Wysocki <[email protected]> Acked-by: Peter Kästle <[email protected]> Reviewed-by: Zhang Rui <[email protected]> Cc: 6.10+ <[email protected]> # 6.10+ Link: https://patch.msgid.link/[email protected]
2024-08-16thermal: gov_bang_bang: Add .manage() callbackRafael J. Wysocki1-0/+30
After recent changes, the Bang-bang governor may not adjust the initial configuration of cooling devices to the actual situation. Namely, if a cooling device bound to a certain trip point starts in the "on" state and the thermal zone temperature is below the threshold of that trip point, the trip point may never be crossed on the way up in which case the state of the cooling device will never be adjusted because the thermal core will never invoke the governor's .trip_crossed() callback. [Note that there is no issue if the zone temperature is at the trip threshold or above it to start with because .trip_crossed() will be invoked then to indicate the start of thermal mitigation for the given trip.] To address this, add a .manage() callback to the Bang-bang governor and use it to ensure that all of the thermal instances managed by the governor have been initialized properly and the states of all of the cooling devices involved have been adjusted to the current zone temperature as appropriate. Fixes: 530c932bdf75 ("thermal: gov_bang_bang: Use .trip_crossed() instead of .throttle()") Link: https://lore.kernel.org/linux-pm/[email protected]/ Cc: 6.10+ <[email protected]> # 6.10+ Signed-off-by: Rafael J. Wysocki <[email protected]> Acked-by: Peter Kästle <[email protected]> Reviewed-by: Zhang Rui <[email protected]> Link: https://patch.msgid.link/[email protected]
2024-08-16thermal: gov_bang_bang: Split bang_bang_control()Rafael J. Wysocki1-19/+23
Move the setting of the thermal instance target state from bang_bang_control() into a separate function that will be also called in a different place going forward. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <[email protected]> Acked-by: Peter Kästle <[email protected]> Reviewed-by: Zhang Rui <[email protected]> Cc: 6.10+ <[email protected]> # 6.10+ Link: https://patch.msgid.link/[email protected]
2024-08-16thermal: gov_bang_bang: Call __thermal_cdev_update() directlyRafael J. Wysocki1-4/+1
Instead of clearing the "updated" flag for each cooling device affected by the trip point crossing in bang_bang_control() and walking all thermal instances to run thermal_cdev_update() for all of the affected cooling devices, call __thermal_cdev_update() directly for each of them. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <[email protected]> Acked-by: Peter Kästle <[email protected]> Reviewed-by: Zhang Rui <[email protected]> Cc: 6.10+ <[email protected]> # 6.10+ Link: https://patch.msgid.link/[email protected]
2024-08-16Merge branch 'vln-ocelot-fixes'David S. Miller15-173/+1036
Vladimir Oltean says: ==================== VLAN fixes for Ocelot driver This is a collection of patches I've gathered over the past several months. Patches 1-6/14 are supporting patches for selftests. Patch 9/14 fixes PTP TX from a VLAN upper of a VLAN-aware bridge port when using the "ocelot-8021q" tagging protocol. Patch 7/14 is its supporting selftest. Patch 10/14 fixes the QoS class used by PTP in the same case as above. It is hard to quantify - there is no selftest. Patch 11/14 fixes potential data corruption during PTP TX in the same case as above. Again, there is no selftest. Patch 13/14 fixes RX in the same case as above - 8021q upper of a VLAN-aware bridge port, with the "ocelot-8021q" tagging protocol. Patch 12/14 is a supporting patch for this in the DSA core, and 7/14 is also its selftest. Patch 14/14 ensures that VLAN-aware bridges offloaded to Ocelot only react to the ETH_P_8021Q TPID, and treat absolutely everything else as VLAN-untagged, including ETH_P_8021AD. Patch 8/14 is the supporting selftest. ==================== Signed-off-by: David S. Miller <[email protected]>
2024-08-16net: mscc: ocelot: treat 802.1ad tagged traffic as 802.1Q-untaggedVladimir Oltean3-11/+180
I was revisiting the topic of 802.1ad treatment in the Ocelot switch [0] and realized that not only is its basic VLAN classification pipeline improper for offloading vlan_protocol 802.1ad bridges, but also improper for offloading regular 802.1Q bridges already. Namely, 802.1ad-tagged traffic should be treated as VLAN-untagged by bridged ports, but this switch treats it as if it was 802.1Q-tagged with the same VID as in the 802.1ad header. This is markedly different to what the Linux bridge expects; see the "other_tpid()" function in tools/testing/selftests/net/forwarding/bridge_vlan_aware.sh. An idea came to me that the VCAP IS1 TCAM is more powerful than I'm giving it credit for, and that it actually overwrites the classified VID before the VLAN Table lookup takes place. In other words, it can be used even to save a packet from being dropped on ingress due to VLAN membership. Add a sophisticated TCAM rule hardcoded into the driver to force the switch to behave like a Linux bridge with vlan_filtering 1 vlan_protocol 802.1Q. Regarding the lifetime of the filter: eventually the bridge will disappear, and vlan_filtering on the port will be restored to 0 for standalone mode. Then the filter will be deleted. [0]: https://lore.kernel.org/netdev/20201009122947.nvhye4hvcha3tljh@skbuf/ Fixes: 7142529f1688 ("net: mscc: ocelot: add VLAN filtering") Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-08-16net: dsa: felix: fix VLAN tag loss on CPU reception with ocelot-8021qVladimir Oltean1-6/+109
There is a major design bug with ocelot-8021q, which is that it expects more of the hardware than the hardware can actually do. The short summary of the issue is that when a port is under a VLAN-aware bridge and we use this tagging protocol, VLAN upper interfaces of this port do not see RX traffic. We use VCAP ES0 (egress rewriter) rules towards the tag_8021q CPU port to encapsulate packets with an outer tag, later stripped by software, that depends on the source user port. We do this so that packets can be identified in ocelot_rcv(). To be precise, we create rules with push_outer_tag = OCELOT_ES0_TAG and push_inner_tag = 0. With this configuration, we expect the switch to keep the inner tag configuration as found in the packet (if it was untagged on user port ingress, keep it untagged, otherwise preserve the VLAN tag unmodified as the inner tag towards the tag_8021q CPU port). But this is not what happens. Instead, table "Tagging Combinations" from the user manual suggests that when the ES0 action is "PUSH_OUTER_TAG=1 and PUSH_INNER_TAG=0", there will be "no inner tag". Experimentation further clarifies what this means. It appears that this "inner tag" which is not pushed into the packet on its egress towards the CPU is none other than the classified VLAN. When the ingress user port is standalone or under a VLAN-unaware bridge, the classified VLAN is a discardable quantity: it is a fixed value - the result of ocelot_vlan_unaware_pvid()'s configuration, and actually independent of the VID from any 802.1Q header that may be in the frame. It is actually preferable to discard the "inner tag" in this case. The problem is when the ingress port is under a VLAN-aware bridge. Then, the classified VLAN is taken from the frame's 802.1Q header, with a fallback on the bridge port's PVID. It would be very good to not discard the "inner tag" here, because if we do, we break communication with any 8021q VLAN uppers that the port might have. These have a processing path outside the bridge. There seems to be nothing else we can do except to change the configuration for VCAP ES0 rules, to actually push the inner VLAN into the frame. There are 2 options for that, first is to push a fixed value specified in the rule, and second is to push a fixed value, plus (aka arithmetic +) the classified VLAN. We choose the second option, and we select that fixed value as 0. Thus, what is pushed in the inner tag is just the classified VLAN. From there, we need to perform software untagging, in the receive path, of stuff that was untagged on the wire. Fixes: 7c83a7c539ab ("net: dsa: add a second tagger for Ocelot switches based on tag_8021q") Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-08-16net: dsa: provide a software untagging function on RX for VLAN-aware bridgesVladimir Oltean3-38/+118
Through code analysis, I realized that the ds->untag_bridge_pvid logic is contradictory - see the newly added FIXME above the kernel-doc for dsa_software_untag_vlan_unaware_bridge(). Moreover, for the Felix driver, I need something very similar, but which is actually _not_ contradictory: untag the bridge PVID on RX, but for VLAN-aware bridges. The existing logic does it for VLAN-unaware bridges. Since I don't want to change the functionality of drivers which were supposedly properly tested with the ds->untag_bridge_pvid flag, I have introduced a new one: ds->untag_vlan_aware_bridge_pvid, and I have refactored the DSA reception code into a common path for both flags. TODO: both flags should be unified under a single ds->software_vlan_untag, which users of both current flags should set. This is not something that can be carried out right away. It needs very careful examination of all drivers which make use of this functionality, since some of them actually get this wrong in the first place. For example, commit 9130c2d30c17 ("net: dsa: microchip: ksz8795: Use software untagging on CPU port") uses this in a driver which has ds->configure_vlan_while_not_filtering = true. The latter mechanism has been known for many years to be broken by design: https://lore.kernel.org/netdev/CABumfLzJmXDN_W-8Z=p9KyKUVi_HhS7o_poBkeKHS2BkAiyYpw@mail.gmail.com/ and we have the situation of 2 bugs canceling each other. There is no private VLAN, and the port follows the PVID of the VLAN-unaware bridge. So, it's kinda ok for that driver to use the ds->untag_bridge_pvid mechanism, in a broken way. Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-08-16net: mscc: ocelot: serialize access to the injection/extraction groupsVladimir Oltean4-0/+76
As explained by Horatiu Vultur in commit 603ead96582d ("net: sparx5: Add spinlock for frame transmission from CPU") which is for a similar hardware design, multiple CPUs can simultaneously perform injection or extraction. There are only 2 register groups for injection and 2 for extraction, and the driver only uses one of each. So we'd better serialize access using spin locks, otherwise frame corruption is possible. Note that unlike in sparx5, FDMA in ocelot does not have this issue because struct ocelot_fdma_tx_ring already contains an xmit_lock. I guess this is mostly a problem for NXP LS1028A, as that is dual core. I don't think VSC7514 is. So I'm blaming the commit where LS1028A (aka the felix DSA driver) started using register-based packet injection and extraction. Fixes: 0a6f17c6ae21 ("net: dsa: tag_ocelot_8021q: add support for PTP timestamping") Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-08-16net: mscc: ocelot: fix QoS class for injected packets with "ocelot-8021q"Vladimir Oltean2-2/+9
There are 2 distinct code paths (listed below) in the source code which set up an injection header for Ocelot(-like) switches. Code path (2) lacks the QoS class and source port being set correctly. Especially the improper QoS classification is a problem for the "ocelot-8021q" alternative DSA tagging protocol, because we support tc-taprio and each packet needs to be scheduled precisely through its time slot. This includes PTP, which is normally assigned to a traffic class other than 0, but would be sent through TC 0 nonetheless. The code paths are: (1) ocelot_xmit_common() from net/dsa/tag_ocelot.c - called only by the standard "ocelot" DSA tagging protocol which uses NPI-based injection - sets up bit fields in the tag manually to account for a small difference (destination port offset) between Ocelot and Seville. Namely, ocelot_ifh_set_dest() is omitted out of ocelot_xmit_common(), because there's also seville_ifh_set_dest(). (2) ocelot_ifh_set_basic(), called by: - ocelot_fdma_prepare_skb() for FDMA transmission of the ocelot switchdev driver - ocelot_port_xmit() -> ocelot_port_inject_frame() for register-based transmission of the ocelot switchdev driver - felix_port_deferred_xmit() -> ocelot_port_inject_frame() for the DSA tagger ocelot-8021q when it must transmit PTP frames (also through register-based injection). sets the bit fields according to its own logic. The problem is that (2) doesn't call ocelot_ifh_set_qos_class(). Copying that logic from ocelot_xmit_common() fixes that. Unfortunately, although desirable, it is not easily possible to de-duplicate code paths (1) and (2), and make net/dsa/tag_ocelot.c directly call ocelot_ifh_set_basic()), because of the ocelot/seville difference. This is the "minimal" fix with some logic duplicated (but at least more consolidated). Fixes: 0a6f17c6ae21 ("net: dsa: tag_ocelot_8021q: add support for PTP timestamping") Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-08-16net: mscc: ocelot: use ocelot_xmit_get_vlan_info() also for FDMA and ↵Vladimir Oltean5-43/+75
register injection Problem description ------------------- On an NXP LS1028A (felix DSA driver) with the following configuration: - ocelot-8021q tagging protocol - VLAN-aware bridge (with STP) spanning at least swp0 and swp1 - 8021q VLAN upper interfaces on swp0 and swp1: swp0.700, swp1.700 - ptp4l on swp0.700 and swp1.700 we see that the ptp4l instances do not see each other's traffic, and they all go to the grand master state due to the ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES condition. Jumping to the conclusion for the impatient ------------------------------------------- There is a zero-day bug in the ocelot switchdev driver in the way it handles VLAN-tagged packet injection. The correct logic already exists in the source code, in function ocelot_xmit_get_vlan_info() added by commit 5ca721c54d86 ("net: dsa: tag_ocelot: set the classified VLAN during xmit"). But it is used only for normal NPI-based injection with the DSA "ocelot" tagging protocol. The other injection code paths (register-based and FDMA-based) roll their own wrong logic. This affects and was noticed on the DSA "ocelot-8021q" protocol because it uses register-based injection. By moving ocelot_xmit_get_vlan_info() to a place that's common for both the DSA tagger and the ocelot switch library, it can also be called from ocelot_port_inject_frame() in ocelot.c. We need to touch the lines with ocelot_ifh_port_set()'s prototype anyway, so let's rename it to something clearer regarding what it does, and add a kernel-doc. ocelot_ifh_set_basic() should do. Investigation notes ------------------- Debugging reveals that PTP event (aka those carrying timestamps, like Sync) frames injected into swp0.700 (but also swp1.700) hit the wire with two VLAN tags: 00000000: 01 1b 19 00 00 00 00 01 02 03 04 05 81 00 02 bc ~~~~~~~~~~~ 00000010: 81 00 02 bc 88 f7 00 12 00 2c 00 00 02 00 00 00 ~~~~~~~~~~~ 00000020: 00 00 00 00 00 00 00 00 00 00 00 01 02 ff fe 03 00000030: 04 05 00 01 00 04 00 00 00 00 00 00 00 00 00 00 00000040: 00 00 The second (unexpected) VLAN tag makes felix_check_xtr_pkt() -> ptp_classify_raw() fail to see these as PTP packets at the link partner's receiving end, and return PTP_CLASS_NONE (because the BPF classifier is not written to expect 2 VLAN tags). The reason why packets have 2 VLAN tags is because the transmission code treats VLAN incorrectly. Neither ocelot switchdev, nor felix DSA, declare the NETIF_F_HW_VLAN_CTAG_TX feature. Therefore, at xmit time, all VLANs should be in the skb head, and none should be in the hwaccel area. This is done by: static struct sk_buff *validate_xmit_vlan(struct sk_buff *skb, netdev_features_t features) { if (skb_vlan_tag_present(skb) && !vlan_hw_offload_capable(features, skb->vlan_proto)) skb = __vlan_hwaccel_push_inside(skb); return skb; } But ocelot_port_inject_frame() handles things incorrectly: ocelot_ifh_port_set(ifh, port, rew_op, skb_vlan_tag_get(skb)); void ocelot_ifh_port_set(struct sk_buff *skb, void *ifh, int port, u32 rew_op) { (...) if (vlan_tag) ocelot_ifh_set_vlan_tci(ifh, vlan_tag); (...) } The way __vlan_hwaccel_push_inside() pushes the tag inside the skb head is by calling: static inline void __vlan_hwaccel_clear_tag(struct sk_buff *skb) { skb->vlan_present = 0; } which does _not_ zero out skb->vlan_tci as seen by skb_vlan_tag_get(). This means that ocelot, when it calls skb_vlan_tag_get(), sees (and uses) a residual skb->vlan_tci, while the same VLAN tag is _already_ in the skb head. The trivial fix for double VLAN headers is to replace the content of ocelot_ifh_port_set() with: if (skb_vlan_tag_present(skb)) ocelot_ifh_set_vlan_tci(ifh, skb_vlan_tag_get(skb)); but this would not be correct either, because, as mentioned, vlan_hw_offload_capable() is false for us, so we'd be inserting dead code and we'd always transmit packets with VID=0 in the injection frame header. I can't actually test the ocelot switchdev driver and rely exclusively on code inspection, but I don't think traffic from 8021q uppers has ever been injected properly, and not double-tagged. Thus I'm blaming the introduction of VLAN fields in the injection header - early driver code. As hinted at in the early conclusion, what we _want_ to happen for VLAN transmission was already described once in commit 5ca721c54d86 ("net: dsa: tag_ocelot: set the classified VLAN during xmit"). ocelot_xmit_get_vlan_info() intends to ensure that if the port through which we're transmitting is under a VLAN-aware bridge, the outer VLAN tag from the skb head is stripped from there and inserted into the injection frame header (so that the packet is processed in hardware through that actual VLAN). And in all other cases, the packet is sent with VID=0 in the injection frame header, since the port is VLAN-unaware and has logic to strip this VID on egress (making it invisible to the wire). Fixes: 08d02364b12f ("net: mscc: fix the injection header") Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-08-16selftests: net: bridge_vlan_aware: test that other TPIDs are seen as untaggedVladimir Oltean1-1/+53
The bridge VLAN implementation w.r.t. VLAN protocol is described in merge commit 1a0b20b25732 ("Merge branch 'bridge-next'"). We are only sensitive to those VLAN tags whose TPID is equal to the bridge's vlan_protocol. Thus, an 802.1ad VLAN should be treated as 802.1Q-untagged. Add 3 tests which validate that: - 802.1ad-tagged traffic is learned into the PVID of an 802.1Q-aware bridge - Double-tagged traffic is forwarded when just the PVID of the port is present in the VLAN group of the ports - Double-tagged traffic is not forwarded when the PVID of the port is absent from the VLAN group of the ports The test passes with both veth and ocelot. Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Ido Schimmel <[email protected]> Tested-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-08-16selftests: net: local_termination: add PTP frames to the mixVladimir Oltean1-13/+148
A breakage in the felix DSA driver shows we do not have enough test coverage. More generally, it is sufficiently special that it is likely drivers will treat it differently. This is not meant to be a full PTP test, it just makes sure that PTP packets sent to the different addresses corresponding to their profiles are received correctly. The local_termination selftest seemed like the most appropriate place for this addition. PTP RX/TX in some cases makes no sense (over a bridge) and this is why $skip_ptp exists. And in others - PTP over a bridge port - the IP stack needs convincing through the available bridge netfilter hooks to leave the PTP packets alone and not stolen by the bridge rx_handler. It is safe to assume that users have that figured out already. This is a driver level test, and by using tcpdump, all that extra setup is out of scope here. send_non_ip() was an unfinished idea; written but never used. Replace it with a more generic send_raw(), and send 3 PTP packet types times 3 transports. Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-08-16selftests: net: local_termination: don't use xfail_on_veth()Vladimir Oltean2-16/+99
xfail_on_veth() for this test is an incorrect approximation which gives false positives and false negatives. When local_termination fails with "reception succeeded, but should have failed", it is because the DUT ($h2) accepts packets even when not configured as promiscuous. This is not something specific to veth; even the bridge behaves that way, but this is not captured by the xfail_on_veth test. The IFF_UNICAST_FLT flag is not explicitly exported to user space, but it can somewhat be determined from the interface's behavior. We have to create a macvlan upper with a different MAC address. This forces a dev_uc_add() call in the kernel. When the unicast filtering list is not empty, but the device doesn't support IFF_UNICAST_FLT, __dev_set_rx_mode() force-enables promiscuity on the interface, to ensure correct behavior (that the requested address is received). We can monitor the change in the promiscuity flag and infer from it whether the device supports unicast filtering. There is no equivalent thing for allmulti, unfortunately. We never know what's hiding behind a device which has allmulti=off. Whether it will actually perform RX multicast filtering of unknown traffic is a strong "maybe". The bridge driver, for example, completely ignores the flag. We'll have to keep the xfail behavior, but instead of XFAIL on just veth, always XFAIL. Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-08-16selftests: net: local_termination: introduce new tests which capture VLAN ↵Vladimir Oltean1-7/+110
behavior Add more coverage to the local termination selftest as follows: - 8021q upper of $h2 - 8021q upper of $h2, where $h2 is a port of a VLAN-unaware bridge - 8021q upper of $h2, where $h2 is a port of a VLAN-aware bridge - 8021q upper of VLAN-unaware br0, which is the upper of $h2 - 8021q upper of VLAN-aware br0, which is the upper of $h2 Especially the cases with traffic sent through the VLAN upper of a VLAN-aware bridge port will be immediately relevant when we will start transmitting PTP packets as an additional kind of traffic. Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-08-16selftests: net: local_termination: add one more test for VLAN-aware bridgesVladimir Oltean1-4/+18
The current bridge() test is for packet reception on a VLAN-unaware bridge. Some things are different enough with VLAN-aware bridges that it's worth renaming this test into vlan_unaware_bridge(), and add a new vlan_aware_bridge() test. The two will share the same implementation: bridge() becomes a common function, which receives $vlan_filtering as an argument. Rename it to test_bridge() at the same time, because just bridge() pollutes the global namespace and we cannot invoke the binary with the same name from the iproute2 package currently. Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-08-16selftests: net: local_termination: parameterize test nameVladimir Oltean1-18/+20
There are upcoming tests which verify the RX filtering of a bridge (or bridge port), but under differing vlan_filtering conditions. Since we currently print $h2 (the DUT) in the log_test() output, it becomes necessary to make a further distinction between tests, to not give the user the impression that the exact same thing is run twice. Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-08-16selftests: net: local_termination: parameterize sending interfaceVladimir Oltean1-19/+20
In future changes we will want to subject the DUT, $h2, to additional VLAN-tagged traffic. For that, we need to run the tests using $h1.100 as a sending interface, rather than the currently hardcoded $h1. Add a parameter to run_test() and modify its 2 callers to explicitly pass $h1, as was implicit before. Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-08-16selftests: net: local_termination: refactor macvlan creation/deletionVladimir Oltean1-12/+18
This will be used in other subtests as well; make new macvlan_create() and macvlan_destroy() functions. Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-08-16char: xillybus: Check USB endpoints when probing deviceEli Billauer1-2/+20
Ensure, as the driver probes the device, that all endpoints that the driver may attempt to access exist and are of the correct type. All XillyUSB devices must have a Bulk IN and Bulk OUT endpoint at address 1. This is verified in xillyusb_setup_base_eps(). On top of that, a XillyUSB device may have additional Bulk OUT endpoints. The information about these endpoints' addresses is deduced from a data structure (the IDT) that the driver fetches from the device while probing it. These endpoints are checked in setup_channels(). A XillyUSB device never has more than one IN endpoint, as all data towards the host is multiplexed in this single Bulk IN endpoint. This is why setup_channels() only checks OUT endpoints. Reported-by: [email protected] Cc: stable <[email protected]> Closes: https://lore.kernel.org/all/[email protected]/T/ Fixes: a53d1202aef1 ("char: xillybus: Add driver for XillyUSB (Xillybus variant for USB)"). Signed-off-by: Eli Billauer <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2024-08-16char: xillybus: Refine workqueue handlingEli Billauer1-3/+5
As the wakeup work item now runs on a separate workqueue, it needs to be flushed separately along with flushing the device's workqueue. Also, move the destroy_workqueue() call to the end of the exit method, so that deinitialization is done in the opposite order of initialization. Fixes: ccbde4b128ef ("char: xillybus: Don't destroy workqueue from work item running on it") Cc: stable <[email protected]> Signed-off-by: Eli Billauer <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2024-08-15mm/migrate: fix deadlock in migrate_pages_batch() on large foliosGao Xiang1-5/+11
Currently, migrate_pages_batch() can lock multiple locked folios with an arbitrary order. Although folio_trylock() is used to avoid deadlock as commit 2ef7dbb26990 ("migrate_pages: try migrate in batch asynchronously firstly") mentioned, it seems try_split_folio() is still missing. It was found by compaction stress test when I explicitly enable EROFS compressed files to use large folios, which case I cannot reproduce with the same workload if large folio support is off (current mainline). Typically, filesystem reads (with locked file-backed folios) could use another bdev/meta inode to load some other I/Os (e.g. inode extent metadata or caching compressed data), so the locking order will be: file-backed folios (A) bdev/meta folios (B) The following calltrace shows the deadlock: Thread 1 takes (B) lock and tries to take folio (A) lock Thread 2 takes (A) lock and tries to take folio (B) lock [Thread 1] INFO: task stress:1824 blocked for more than 30 seconds. Tainted: G OE 6.10.0-rc7+ #6 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:stress state:D stack:0 pid:1824 tgid:1824 ppid:1822 flags:0x0000000c Call trace: __switch_to+0xec/0x138 __schedule+0x43c/0xcb0 schedule+0x54/0x198 io_schedule+0x44/0x70 folio_wait_bit_common+0x184/0x3f8 <-- folio mapping ffff00036d69cb18 index 996 (**) __folio_lock+0x24/0x38 migrate_pages_batch+0x77c/0xea0 // try_split_folio (mm/migrate.c:1486:2) // migrate_pages_batch (mm/migrate.c:1734:16) <--- LIST_HEAD(unmap_folios) has .. folio mapping 0xffff0000d184f1d8 index 1711; (*) folio mapping 0xffff0000d184f1d8 index 1712; .. migrate_pages+0xb28/0xe90 compact_zone+0xa08/0x10f0 compact_node+0x9c/0x180 sysctl_compaction_handler+0x8c/0x118 proc_sys_call_handler+0x1a8/0x280 proc_sys_write+0x1c/0x30 vfs_write+0x240/0x380 ksys_write+0x78/0x118 __arm64_sys_write+0x24/0x38 invoke_syscall+0x78/0x108 el0_svc_common.constprop.0+0x48/0xf0 do_el0_svc+0x24/0x38 el0_svc+0x3c/0x148 el0t_64_sync_handler+0x100/0x130 el0t_64_sync+0x190/0x198 [Thread 2] INFO: task stress:1825 blocked for more than 30 seconds. Tainted: G OE 6.10.0-rc7+ #6 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:stress state:D stack:0 pid:1825 tgid:1825 ppid:1822 flags:0x0000000c Call trace: __switch_to+0xec/0x138 __schedule+0x43c/0xcb0 schedule+0x54/0x198 io_schedule+0x44/0x70 folio_wait_bit_common+0x184/0x3f8 <-- folio = 0xfffffdffc6b503c0 (mapping == 0xffff0000d184f1d8 index == 1711) (*) __folio_lock+0x24/0x38 z_erofs_runqueue+0x384/0x9c0 [erofs] z_erofs_readahead+0x21c/0x350 [erofs] <-- folio mapping 0xffff00036d69cb18 range from [992, 1024] (**) read_pages+0x74/0x328 page_cache_ra_order+0x26c/0x348 ondemand_readahead+0x1c0/0x3a0 page_cache_sync_ra+0x9c/0xc0 filemap_get_pages+0xc4/0x708 filemap_read+0x104/0x3a8 generic_file_read_iter+0x4c/0x150 vfs_read+0x27c/0x330 ksys_pread64+0x84/0xd0 __arm64_sys_pread64+0x28/0x40 invoke_syscall+0x78/0x108 el0_svc_common.constprop.0+0x48/0xf0 do_el0_svc+0x24/0x38 el0_svc+0x3c/0x148 el0t_64_sync_handler+0x100/0x130 el0t_64_sync+0x190/0x198 Link: https://lkml.kernel.org/r/[email protected] Fixes: 5dfab109d519 ("migrate_pages: batch _unmap and _move") Signed-off-by: Gao Xiang <[email protected]> Reviewed-by: "Huang, Ying" <[email protected]> Acked-by: David Hildenbrand <[email protected]> Cc: Matthew Wilcox <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-08-15alloc_tag: mark pages reserved during CMA activation as not taggedSuren Baghdasaryan1-0/+2
During CMA activation, pages in CMA area are prepared and then freed without being allocated. This triggers warnings when memory allocation debug config (CONFIG_MEM_ALLOC_PROFILING_DEBUG) is enabled. Fix this by marking these pages not tagged before freeing them. Link: https://lkml.kernel.org/r/[email protected] Fixes: d224eb0287fb ("codetag: debug: mark codetags for reserved pages as empty") Signed-off-by: Suren Baghdasaryan <[email protected]> Acked-by: David Hildenbrand <[email protected]> Cc: Kees Cook <[email protected]> Cc: Kent Overstreet <[email protected]> Cc: Pasha Tatashin <[email protected]> Cc: Sourav Panda <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: <[email protected]> [6.10] Signed-off-by: Andrew Morton <[email protected]>