blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2021-09-10	arm64: kdump: Skip kmemleak scan reserved memory for kdump	Chen Wandun	1	-0/+6
	Trying to boot with kdump + kmemleak, command will result in a crash: "echo scan > /sys/kernel/debug/kmemleak" crashkernel reserved: 0x0000000007c00000 - 0x0000000027c00000 (512 MB) Kernel command line: BOOT_IMAGE=(hd1,gpt2)/vmlinuz-5.14.0-rc5-next-20210809+ root=/dev/mapper/ao-root ro rd.lvm.lv=ao/root rd.lvm.lv=ao/swap crashkernel=512M Unable to handle kernel paging request at virtual address ffff000007c00000 Mem abort info: ESR = 0x96000007 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x07: level 3 translation fault Data abort info: ISV = 0, ISS = 0x00000007 CM = 0, WnR = 0 swapper pgtable: 64k pages, 48-bit VAs, pgdp=00002024f0d80000 [ffff000007c00000] pgd=1800205ffffd0003, p4d=1800205ffffd0003, pud=1800205ffffd0003, pmd=1800205ffffc0003, pte=0068000007c00f06 Internal error: Oops: 96000007 [#1] SMP pstate: 804000c9 (Nzcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : scan_block+0x98/0x230 lr : scan_block+0x94/0x230 sp : ffff80008d6cfb70 x29: ffff80008d6cfb70 x28: 0000000000000000 x27: 0000000000000000 x26: 00000000000000c0 x25: 0000000000000001 x24: 0000000000000000 x23: ffffa88a6b18b398 x22: ffff000007c00ff9 x21: ffffa88a6ac7fc40 x20: ffffa88a6af6a830 x19: ffff000007c00000 x18: 0000000000000000 x17: 0000000000000000 x16: 0000000000000000 x15: ffffffffffffffff x14: ffffffff00000000 x13: ffffffffffffffff x12: 0000000000000020 x11: 0000000000000000 x10: 0000000001080000 x9 : ffffa88a6951c77c x8 : ffffa88a6a893988 x7 : ffff203ff6cfb3c0 x6 : ffffa88a6a52b3c0 x5 : ffff203ff6cfb3c0 x4 : 0000000000000000 x3 : 0000000000000000 x2 : 0000000000000001 x1 : ffff20226cb56a40 x0 : 0000000000000000 Call trace: scan_block+0x98/0x230 scan_gray_list+0x120/0x270 kmemleak_scan+0x3a0/0x648 kmemleak_write+0x3ac/0x4c8 full_proxy_write+0x6c/0xa0 vfs_write+0xc8/0x2b8 ksys_write+0x70/0xf8 __arm64_sys_write+0x24/0x30 invoke_syscall+0x4c/0x110 el0_svc_common+0x9c/0x190 do_el0_svc+0x30/0x98 el0_svc+0x28/0xd8 el0t_64_sync_handler+0x90/0xb8 el0t_64_sync+0x180/0x184 The reserved memory for kdump will be looked up by kmemleak, this area will be set invalid when kdump service is bring up. That will result in crash when kmemleak scan this area. Fixes: a7259df76702 ("memblock: make memblock_find_in_range method private") Signed-off-by: Chen Wandun <[email protected]> Reviewed-by: Kefeng Wang <[email protected]> Reviewed-by: Mike Rapoport <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]>
2021-09-10	qed: Handle management FW error	Shai Malin	1	-1/+5
	Handle MFW (management FW) error response in order to avoid a crash during recovery flows. Changes from v1: - Add "Fixes tag". Fixes: tag 5e7ba042fd05 ("qed: Fix reading stale configuration information") Signed-off-by: Ariel Elior <[email protected]> Signed-off-by: Shai Malin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-09-10	net/packet: clarify source of pr_*() messages	Baruch Siach	1	-0/+2
	Add pr_fmt macro to spell out the source of messages in prefix. Before this patch: packet size is too long (1543 > 1518) With this patch: af_packet: packet size is too long (1543 > 1518) Signed-off-by: Baruch Siach <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-09-10	r6040: Restore MDIO clock frequency after MAC reset	Florian Fainelli	1	-1/+8
	A number of users have reported that they were not able to get the PHY to successfully link up, especially after commit c36757eb9dee ("net: phy: consider AN_RESTART status when reading link status") where we stopped reading just BMSR, but we also read BMCR to determine the link status. Andrius at NetBSD did a wonderful job at debugging the problem and found out that the MDIO bus clock frequency would be incorrectly set back to its default value which would prevent the MDIO bus controller from reading PHY registers properly. Back when we only read BMSR, if we read all 1s, we could falsely indicate a link status, though in general there is a cable plugged in, so this went unnoticed. After a second read of BMCR was added, a wrong read will lead to the inability to determine a link UP condition which is when it started to be visibly broken, even if it was long before that. The fix consists in restoring the value of the MD_CSR register that was set prior to the MAC reset. Link: http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=53494 Fixes: 90f750a81a29 ("r6040: consolidate MAC reset to its own function") Reported-by: Andrius V <[email protected]> Reported-by: Darek Strugacz <[email protected]> Tested-by: Darek Strugacz <[email protected]> Signed-off-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-09-10	ice: Correctly deal with PFs that do not support RDMA	Dave Ertman	2	-0/+8
	There are two cases where the current PF does not support RDMA functionality. The first is if the NVM loaded on the device is set to not support RDMA (common_caps.rdma is false). The second is if the kernel bonding driver has included the current PF in an active link aggregate. When the driver has determined that this PF does not support RDMA, then auxiliary devices should not be created on the auxiliary bus. Without a device on the auxiliary bus, even if the irdma driver is present, there will be no RDMA activity attempted on this PF. Currently, in the reset flow, an attempt to create auxiliary devices is performed without regard to the ability of the PF. There needs to be a check in ice_aux_plug_dev (as the central point that creates auxiliary devices) to see if the PF is in a state to support the functionality. When disabling and re-enabling RDMA due to the inclusion/removal of the PF in a link aggregate, we also need to set/clear the bit which controls auxiliary device creation so that a reset recovery in a link aggregate situation doesn't try to create auxiliary devices when it shouldn't. Fixes: f9f5301e7e2d ("ice: Register auxiliary device to provide RDMA") Reported-by: Yongxin Liu <[email protected]> Signed-off-by: Dave Ertman <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-09-10	drm/ttm: Fix a deadlock if the target BO is not idle during swap	xinhui pan	1	-3/+3
	The ret value might be -EBUSY, caller will think lru lock is still locked but actually NOT. So return -ENOSPC instead. Otherwise we hit list corruption. ttm_bo_cleanup_refs might fail too if BO is not idle. If we return 0, caller(ttm_tt_populate -> ttm_global_swapout ->ttm_device_swapout) will be stuck as we actually did not free any BO memory. This usually happens when the fence is not signaled for a long time. Signed-off-by: xinhui pan <[email protected]> Reviewed-by: Christian König <[email protected]> Fixes: ebd59851c796 ("drm/ttm: move swapout logic around v3") Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Christian König <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
2021-09-10	Merge tag 'drm-misc-next-fixes-2021-09-09' of ↵	Dave Airlie	2	-1/+9
	git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next-fixes for v5.15: - Make some dma-buf config options depend on DMA_SHARED_BUFFER. - Handle multiplication overflow of fbdev xres/yres in the core. Signed-off-by: Dave Airlie <[email protected]> From: Maarten Lankhorst <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-09-09	Merge tag '5.15-rc-ksmbd-part2' of git://git.samba.org/ksmbd	Linus Torvalds	12	-223/+413
	Pull ksmbd fixes from Steve French: - various fixes pointed out by coverity, and a minor cleanup patch - id mapping and ownership fixes - an smbdirect fix * tag '5.15-rc-ksmbd-part2' of git://git.samba.org/ksmbd: ksmbd: fix control flow issues in sid_to_id() ksmbd: fix read of uninitialized variable ret in set_file_basic_info ksmbd: add missing assignments to ret on ndr_read_int64 read calls ksmbd: add validation for ndr read/write functions ksmbd: remove unused ksmbd_file_table_flush function ksmbd: smbd: fix dma mapping error in smb_direct_post_send_data ksmbd: Reduce error log 'speed is unknown' to debug ksmbd: defer notify_change() call ksmbd: remove setattr preparations in set_file_basic_info() ksmbd: ensure error is surfaced in set_file_basic_info() ndr: fix translation in ndr_encode_posix_acl() ksmbd: fix translation in sid_to_id() ksmbd: fix subauth 0 handling in sid_to_id() ksmbd: fix translation in acl entries ksmbd: fix translation in ksmbd_acls_fattr() ksmbd: fix translation in create_posix_rsp_buf() ksmbd: fix translation in smb2_populate_readdir_entry() ksmbd: fix lookup on idmapped mounts
2021-09-09	bootconfig: Rename xbc_node_find_child() to xbc_node_find_subkey()	Masami Hiramatsu	3	-18/+18
	Rename xbc_node_find_child() to xbc_node_find_subkey() for clarifying that function returns a key node (no value node). Since there are xbc_node_for_each_child() (loop on all child nodes) and xbc_node_for_each_subkey() (loop on only subkey nodes), this name distinction is necessary to avoid confusing users. Link: https://lkml.kernel.org/r/163119459826.161018.11200274779483115300.stgit@devnote2 Signed-off-by: Masami Hiramatsu <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2021-09-09	tracing/boot: Fix to check the histogram control param is a leaf node	Masami Hiramatsu	1	-3/+3
	Since xbc_node_find_child() doesn't ensure the returned node is a leaf node (key-value pair or do not have subkeys), use xbc_node_find_value to ensure the histogram control parameter is a leaf node in trace_boot_compose_hist_cmd(). Link: https://lkml.kernel.org/r/163119459059.161018.18341288218424528962.stgit@devnote2 Fixes: e66ed86ca6c5 ("tracing/boot: Add per-event histogram action options") Signed-off-by: Masami Hiramatsu <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2021-09-09	tracing/boot: Fix trace_boot_hist_add_array() to check array is value	Masami Hiramatsu	1	-4/+3
	trace_boot_hist_add_array() uses the combination of xbc_node_find_child() and xbc_node_get_child() to get the child node of the key node. But since it missed to check the child node is data node or not, user can pass the subkey node for the array node (anode). To avoid this issue, check the array node is a data node. Actually, there is xbc_node_find_value(node, key, vnode), which ensures the @vnode is a value node, so use it in trace_boot_hist_add_array() to fix this issue. Link: https://lkml.kernel.org/r/163119458308.161018.1516455973625940212.stgit@devnote2 Fixes: e66ed86ca6c5 ("tracing/boot: Add per-event histogram action options") Signed-off-by: Masami Hiramatsu <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2021-09-09	Merge tag 'for-5.15-tag' of ↵	Linus Torvalds	6	-47/+77
	git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - fix max_inline mount option limit on 64k page system - lockdep fixes: - update bdev time in a safer way - move bdev put outside of sb write section when removing device - fix possible deadlock when mounting seed/sprout filesystem - zoned mode: fix split extent accounting - minor include fixup * tag 'for-5.15-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: zoned: fix double counting of split ordered extent btrfs: fix lockdep warning while mounting sprout fs btrfs: delay blkdev_put until after the device remove btrfs: update the bdev time directly when closing btrfs: use correct header for div_u64 in misc.h btrfs: fix upper limit for max_inline for page size 64K
2021-09-09	Merge tag 'sound-fix-5.15-rc1' of ↵	Linus Torvalds	13	-82/+111
	git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "A collection of small fixes that have been gathered before rc1, including a few regression fixes for the problem in the previous pull request" * tag 'sound-fix-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: gus: Fix repeated probe for ISA interwave card ALSA: gus: Fix repeated probes of snd_gus_create() ALSA: vx222: fix null-ptr-deref ASoC: rockchip: i2s: Fix concurrency between tx/rx ASoC: mt8195: correct the dts parsing logic about DPTX and HDMITX ASoC: Intel: boards: Fix CONFIG_SND_SOC_SDW_MOCKUP select ASoC: dt-bindings: fsl_rpmsg: Add compatible string for i.MX8ULP ALSA: usb-audio: Add registration quirk for JBL Quantum 800 ASoC: rt5682: fix headset background noise when S3 state ASoC: dt-bindings: mt8195: remove dependent headers in the example ASoC: mediatek: SND_SOC_MT8195 should depend on ARCH_MEDIATEK ASoC: samsung: s3c24xx_simtec: fix spelling mistake "devicec" -> "device" ASoC: audio-graph: respawn Platform Support ASoC: mediatek: mt8195: add MTK_PMIC_WRAP dependency
2021-09-09	cifs: properly invalidate cached root handle when closing it	Enzo Matsumiya	1	-7/+13
	Cached root file was not being completely invalidated sometimes. Reproducing: - With a DFS share with 2 targets, one disabled and one enabled - start some I/O on the mount # while true; do ls /mnt/dfs; done - at the same time, disable the enabled target and enable the disabled one - wait for DFS cache to expire - on reconnect, the previous cached root handle should be invalid, but open_cached_dir_by_dentry() will still try to use it, but throws a use-after-free warning (kref_get()) Make smb2_close_cached_fid() invalidate all fields every time, but only send an SMB2_close() when the entry is still valid. Signed-off-by: Enzo Matsumiya <[email protected]> Reviewed-by: Paulo Alcantara (SUSE) <[email protected]> Signed-off-by: Steve French <[email protected]>
2021-09-09	parisc: Implement __get/put_kernel_nofault()	Helge Deller	6	-86/+62
	Remove CONFIG_SET_FS from parisc, so we need to add __get_kernel_nofault() and __put_kernel_nofault(), define HAVE_GET_KERNEL_NOFAULT and remove set_fs(), get_fs(), load_sr2(), thread_info->addr_limit, KERNEL_DS and USER_DS. The nice side-effect of this patch is that we now can directly access userspace via sr3 without the need to use a temporary sr2 which is either copied from sr3 or set to zero (for kernel space). Signed-off-by: Helge Deller <[email protected]> Suggested-by: Arnd Bergmann <[email protected]>
2021-09-09	Merge tag 'for-linus-5.15-rc1' of ↵	Linus Torvalds	10	-42/+124
	git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml Pull UML updates from Richard Weinberger: - Support for VMAP_STACK - Support for splice_write in hostfs - Fixes for virt-pci - Fixes for virtio_uml - Various fixes * tag 'for-linus-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml: um: fix stub location calculation um: virt-pci: fix uapi documentation um: enable VMAP_STACK um: virt-pci: don't do DMA from stack hostfs: support splice_write um: virtio_uml: fix memory leak on init failures um: virtio_uml: include linux/virtio-uml.h lib/logic_iomem: fix sparse warnings um: make PCI emulation driver init/exit static
2021-09-09	Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm	Linus Torvalds	30	-371/+399
	Pull ARM development updates from Russell King: - Rename "mod_init" and "mod_exit" so that initcall debug output is actually useful (Randy Dunlap) - Update maintainers entries for linux-arm-kernel to indicate it is moderated for non-subscribers (Randy Dunlap) - Move install rules to arch/arm/Makefile (Masahiro Yamada) - Drop unnecessary ARCH_NR_GPIOS definition (Linus Walleij) - Don't warn about atags_to_fdt() stack size (David Heidelberg) - Speed up unaligned copy_{from,to}_kernel_nofault (Arnd Bergmann) - Get rid of set_fs() usage (Arnd Bergmann) - Remove checks for GCC prior to v4.6 (Geert Uytterhoeven) * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: 9118/1: div64: Remove always-true __div64_const32_is_OK() duplicate ARM: 9117/1: asm-generic: div64: Remove always-true __div64_const32_is_OK() ARM: 9116/1: unified: Remove check for gcc < 4 ARM: 9110/1: oabi-compat: fix oabi epoll sparse warning ARM: 9113/1: uaccess: remove set_fs() implementation ARM: 9112/1: uaccess: add __{get,put}_kernel_nofault ARM: 9111/1: oabi-compat: rework fcntl64() emulation ARM: 9114/1: oabi-compat: rework sys_semtimedop emulation ARM: 9108/1: oabi-compat: rework epoll_wait/epoll_pwait emulation ARM: 9107/1: syscall: always store thread_info->abi_syscall ARM: 9109/1: oabi-compat: add epoll_pwait handler ARM: 9106/1: traps: use get_kernel_nofault instead of set_fs() ARM: 9115/1: mm/maccess: fix unaligned copy_{from,to}_kernel_nofault ARM: 9105/1: atags_to_fdt: don't warn about stack size ARM: 9103/1: Drop ARCH_NR_GPIOS definition ARM: 9102/1: move theinstall rules to arch/arm/Makefile ARM: 9100/1: MAINTAINERS: mark all linux-arm-kernel@infradead list as moderated ARM: 9099/1: crypto: rename 'mod_init' & 'mod_exit' functions to be module-specific
2021-09-09	n64cart: fix return value check in n64cart_probe()	Yang Yingliang	1	-2/+2
	In case of error, the function devm_platform_ioremap_resource() returns ERR_PTR() and never returns NULL. The NULL test in the return value check should be replaced with IS_ERR(). Fixes: d9b2a2bbbb4d ("block: Add n64 cart driver") Reported-by: Hulk Robot <[email protected]> Signed-off-by: Yang Yingliang <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2021-09-09	Merge tag 'trace-v5.15-2' of ↵	Linus Torvalds	14	-39/+122
	git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull more tracing updates from Steven Rostedt: - Add migrate-disable counter to tracing header - Fix error handling in event probes - Fix missed unlock in osnoise in error path - Fix merge issue with tools/bootconfig - Clean up bootconfig data when init memory is removed - Fix bootconfig to loop only on subkeys - Have kernel command lines override bootconfig options - Increase field counts for synthetic events - Have histograms dynamic allocate event elements to save space - Fixes in testing and documentation * tag 'trace-v5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing/boot: Fix to loop on only subkeys selftests/ftrace: Exclude "(fault)" in testing add/remove eprobe events tracing: Dynamically allocate the per-elt hist_elt_data array tracing: synth events: increase max fields count tools/bootconfig: Show whole test command for each test case bootconfig: Fix missing return check of xbc_node_compose_key function tools/bootconfig: Fix tracing_on option checking in ftrace2bconf.sh docs: bootconfig: Add how to use bootconfig for kernel parameters init/bootconfig: Reorder init parameter from bootconfig and cmdline init: bootconfig: Remove all bootconfig data when the init memory is removed tracing/osnoise: Fix missed cpus_read_unlock() in start_per_cpu_kthreads() tracing: Fix some alloc_event_probe() error handling bugs tracing: Add migrate-disabled counter to tracing output.
2021-09-09	Merge tag 's390-5.15-2' of ↵	Linus Torvalds	38	-562/+139
	git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull more s390 updates from Heiko Carstens: "Except for the xpram device driver removal it is all about fixes and cleanups. - Fix topology update on cpu hotplug, so notifiers see expected masks. This bug was uncovered with SCHED_CORE support. - Fix stack unwinding so that the correct number of entries are omitted like expected by common code. This fixes KCSAN selftests. - Add kmemleak annotation to stack_alloc to avoid false positive kmemleak warnings. - Avoid layering violation in common I/O code and don't unregister subchannel from child-drivers. - Remove xpram device driver for which no real use case exists since the kernel is 64 bit only. Also all hypervisors got required support removed in the meantime, which means the xpram device driver is dead code. - Fix -ENODEV handling of clp_get_state in our PCI code. - Enable KFENCE in debug defconfig. - Cleanup hugetlbfs s390 specific Kconfig dependency. - Quite a lot of trivial fixes to get rid of "W=1" warnings, and and other simple cleanups" * tag 's390-5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: hugetlbfs: s390 is always 64bit s390/ftrace: remove incorrect __va usage s390/zcrypt: remove incorrect kernel doc indicators scsi: zfcp: fix kernel doc comments s390/sclp: add __nonstring annotation s390/hmcdrv_ftp: fix kernel doc comment s390: remove xpram device driver s390/pci: read clp_list_pci_req only once s390/pci: fix clp_get_state() handling of -ENODEV s390/cio: fix kernel doc comment s390/ctrlchar: fix kernel doc comment s390/con3270: use proper type for tasklet function s390/cpum_cf: move array from header to C file s390/mm: fix kernel doc comments s390/topology: fix topology information when calling cpu hotplug notifiers s390/unwind: use current_frame_address() to unwind current task s390/configs: enable CONFIG_KFENCE in debug_defconfig s390/entry: make oklabel within CHKSTG macro local s390: add kmemleak annotation in stack_alloc() s390/cio: dont unregister subchannel from child-drivers
2021-09-09	Merge branch 'work.gfs2' of ↵	Linus Torvalds	3	-21/+35
	git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull gfs2 setattr updates from Al Viro: "Make it possible for filesystems to use a generic 'may_setattr()' and switch gfs2 to using it" * 'work.gfs2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: gfs2: Switch to may_setattr in gfs2_setattr fs: Move notify_change permission checks into may_setattr
2021-09-09	Merge branch 'work.init' of ↵	Linus Torvalds	3	-36/+83
	git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull root filesystem type handling updates from Al Viro: "Teach init/do_mounts.c to handle non-block filesystems, hopefully preventing even more special-cased kludges (such as root=/dev/nfs, etc)" * 'work.init' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs: simplify get_filesystem_list / get_all_fs_names init: allow mounting arbitrary non-blockdevice filesystems as root init: split get_fs_names
2021-09-09	Merge branch 'work.iov_iter' of ↵	Linus Torvalds	2	-1/+7
	git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull iov_iter fixes from Al Viro: "Fixes for io-uring handling of iov_iter reexpands" * 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: io_uring: reexpand under-reexpanded iters iov_iter: track truncated size
2021-09-09	Merge tag 'cxl-for-5.15' of ↵	Linus Torvalds	19	-889/+1352
	git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull CXL (Compute Express Link) updates from Dan Williams: - Fix detection of CXL host bridges to filter out disabled ACPI0016 devices in the ACPI DSDT. - Fix kernel lockdown integration to disable raw commands when raw PCI access is disabled. - Fix a broken debug message. - Add support for "Get Partition Info". I.e. enumerate the split between volatile and persistent capacity on bi-modal CXL memory expanders. - Re-factor the core by subject area. This is a work in progress. - Prepare libnvdimm to understand CXL labels in addition to EFI labels. This is a work in progress. * tag 'cxl-for-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (25 commits) cxl/registers: Fix Documentation warning cxl/pmem: Fix Documentation warning cxl/uapi: Fix defined but not used warnings cxl/pci: Fix debug message in cxl_probe_regs() cxl/pci: Fix lockdown level cxl/acpi: Do not add DSDT disabled ACPI0016 host bridge ports libnvdimm/labels: Add claim class helpers libnvdimm/labels: Add type-guid helpers libnvdimm/labels: Add blk special cases for nlabel and position helpers libnvdimm/labels: Add blk isetcookie set / validation helpers libnvdimm/labels: Add a checksum calculation helper libnvdimm/labels: Introduce label setter helpers libnvdimm/labels: Add isetcookie validation helper libnvdimm/labels: Introduce getters for namespace label fields cxl/mem: Adjust ram/pmem range to represent DPA ranges cxl/mem: Account for partitionable space in ram/pmem ranges cxl/pci: Store memory capacity values cxl/pci: Simplify register setup cxl/pci: Ignore unknown register block types cxl/core: Move memdev management to core ...
2021-09-09	Merge tag 'libnvdimm-for-5.15' of ↵	Linus Torvalds	10	-172/+120
	git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm updates from Dan Williams: - Fix a race condition in the teardown path of raw mode pmem namespaces. - Cleanup the code that filesystems use to detect filesystem-dax capabilities of their underlying block device. * tag 'libnvdimm-for-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: dax: remove bdev_dax_supported xfs: factor out a xfs_buftarg_is_dax helper dax: stub out dax_supported for !CONFIG_FS_DAX dax: remove __generic_fsdax_supported dax: move the dax_read_lock() locking into dax_supported dax: mark dax_get_by_host static dm: use fs_dax_get_by_bdev instead of dax_get_by_host dax: stop using bdevname fsdax: improve the FS_DAX Kconfig description and help text libnvdimm/pmem: Fix crash triggered when I/O in-flight during unbind
2021-09-09	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma	Linus Torvalds	4	-6/+8
	Pull rdma fixes from Jason Gunthorpe: "I don't usually send a second PR in the merge window, but the fix to mlx5 is significant enough that it should start going through the process ASAP. Along with it comes some of the usual -rc stuff that would normally wait for a -rc2 or so. Summary: Important error case regression fixes in mlx5: - Wrong size used when computing the error path smaller allocation request leads to corruption - Confusing but ultimately harmless alignment mis-calculation Static checker warning fixes: - NULL pointer subtraction in qib - kcalloc in bnxt_re - Missing static on global variable in hfi1" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: IB/hfi1: make hist static RDMA/bnxt_re: Prefer kcalloc over open coded arithmetic IB/qib: Fix null pointer subtraction compiler warning RDMA/mlx5: Fix xlt_chunk_align calculation RDMA/mlx5: Fix number of allocated XLT entries
2021-09-09	Merge tag 'dmaengine-5.15-rc1' of ↵	Linus Torvalds	54	-928/+4080
	git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine Pull dmaengine updates from Vinod Koul: "New drivers/devices - Support for Renesas RZ/G2L dma controller - New driver for AMD PTDMA controller Updates: - Big pile of idxd updates - Updates for Altera driver, stm32-dma, dw etc" * tag 'dmaengine-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (83 commits) dmaengine: sh: fix some NULL dereferences dmaengine: sh: Fix unused initialization of pointer lmdesc MAINTAINERS: Fix AMD PTDMA DRIVER entry dmaengine: ptdma: remove PT_OFFSET to avoid redefnition dmaengine: ptdma: Add debugfs entries for PTDMA dmaengine: ptdma: register PTDMA controller as a DMA resource dmaengine: ptdma: Initial driver for the AMD PTDMA dmaengine: fsl-dpaa2-qdma: Fix spelling mistake "faile" -> "failed" dmaengine: idxd: remove interrupt disable for dev_lock dmaengine: idxd: remove interrupt disable for cmd_lock dmaengine: idxd: fix setting up priv mode for dwq dmaengine: xilinx_dma: Set DMA mask for coherent APIs dmaengine: ti: k3-psil-j721e: Add entry for CSI2RX dmaengine: sh: Add DMAC driver for RZ/G2L SoC dmaengine: Extend the dma_slave_width for 128 bytes dt-bindings: dma: Document RZ/G2L bindings dmaengine: ioat: depends on !UML dmaengine: idxd: set descriptor allocation size to threshold for swq dmaengine: idxd: make submit failure path consistent on desc freeing dmaengine: idxd: remove interrupt flag for completion list spinlock ...
2021-09-09	arm64: mm: limit linear region to 51 bits for KVM in nVHE mode	Ard Biesheuvel	1	-1/+15
	KVM in nVHE mode divides up its VA space into two equal halves, and picks the half that does not conflict with the HYP ID map to map its linear region. This worked fine when the kernel's linear map itself was guaranteed to cover precisely as many bits of VA space, but this was changed by commit f4693c2716b35d08 ("arm64: mm: extend linear region for 52-bit VA configurations"). The result is that, depending on the placement of the ID map, kernel-VA to hyp-VA translations may produce addresses that either conflict with other HYP mappings (including the ID map itself) or generate addresses outside of the 52-bit addressable range, neither of which is likely to lead to anything useful. Given that 52-bit capable cores are guaranteed to implement VHE, this only affects configurations such as pKVM where we opt into non-VHE mode even if the hardware is VHE capable. So just for these configurations, let's limit the kernel linear map to 51 bits and work around the problem. Fixes: f4693c2716b3 ("arm64: mm: extend linear region for 52-bit VA configurations") Signed-off-by: Ard Biesheuvel <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]>
2021-09-09	io_uring: fail links of cancelled timeouts	Pavel Begunkov	1	-0/+2
	When we cancel a timeout we should mark it with REQ_F_FAIL, so linked requests are cancelled as well, but not queued for further execution. Cc: [email protected] Signed-off-by: Pavel Begunkov <[email protected]> Link: https://lore.kernel.org/r/fff625b44eeced3a5cae79f60e6acf3fbdf8f990.1631192135.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <[email protected]>
2021-09-09	thermal/drivers/qcom/spmi-adc-tm5: Don't abort probing if a sensor is not used	Matthias Kaehlcke	1	-0/+6
	adc_tm5_register_tzd() registers the thermal zone sensors for all channels of the thermal monitor. If the registration of one channel fails the function skips the processing of the remaining channels and returns an error, which results in _probe() being aborted. One of the reasons the registration could fail is that none of the thermal zones is using the channel/sensor, which hardly is a critical error (if it is an error at all). If this case is detected emit a warning and continue with processing the remaining channels. Fixes: ca66dca5eda6 ("thermal: qcom: add support for adc-tm5 PMIC thermal monitor") Signed-off-by: Matthias Kaehlcke <[email protected]> Reported-by: Stephen Boyd <[email protected]> Reviewed-by: Stephen Boyd <[email protected]> Reviewed-by: Dmitry Baryshkov <[email protected]> Signed-off-by: Daniel Lezcano <[email protected]> Link: https://lore.kernel.org/r/20210823134726.1.I1dd23ddf77e5b3568625d80d6827653af071ce19@changeid
2021-09-09	thermal/drivers/intel: Allow processing of HWP interrupt	Srinivas Pandruvada	2	-1/+9
	Add a weak function to process HWP (Hardware P-states) notifications and move updating HWP_STATUS MSR to this function. This allows HWP interrupts to be processed by the intel_pstate driver in HWP mode by overriding the implementation. Signed-off-by: Srinivas Pandruvada <[email protected]> Acked-by: Zhang Rui <[email protected]> Signed-off-by: Daniel Lezcano <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-09-09	io-wq: fix memory leak in create_io_worker()	Qiang.zhang	1	-0/+3
	BUG: memory leak unreferenced object 0xffff888126fcd6c0 (size 192): comm "syz-executor.1", pid 11934, jiffies 4294983026 (age 15.690s) backtrace: [<ffffffff81632c91>] kmalloc_node include/linux/slab.h:609 [inline] [<ffffffff81632c91>] kzalloc_node include/linux/slab.h:732 [inline] [<ffffffff81632c91>] create_io_worker+0x41/0x1e0 fs/io-wq.c:739 [<ffffffff8163311e>] io_wqe_create_worker fs/io-wq.c:267 [inline] [<ffffffff8163311e>] io_wqe_enqueue+0x1fe/0x330 fs/io-wq.c:866 [<ffffffff81620b64>] io_queue_async_work+0xc4/0x200 fs/io_uring.c:1473 [<ffffffff8162c59c>] __io_queue_sqe+0x34c/0x510 fs/io_uring.c:6933 [<ffffffff8162c7ab>] io_req_task_submit+0x4b/0xa0 fs/io_uring.c:2233 [<ffffffff8162cb48>] io_async_task_func+0x108/0x1c0 fs/io_uring.c:5462 [<ffffffff816259e3>] tctx_task_work+0x1b3/0x3a0 fs/io_uring.c:2158 [<ffffffff81269b43>] task_work_run+0x73/0xb0 kernel/task_work.c:164 [<ffffffff812dcdd1>] tracehook_notify_signal include/linux/tracehook.h:212 [inline] [<ffffffff812dcdd1>] handle_signal_work kernel/entry/common.c:146 [inline] [<ffffffff812dcdd1>] exit_to_user_mode_loop kernel/entry/common.c:172 [inline] [<ffffffff812dcdd1>] exit_to_user_mode_prepare+0x151/0x180 kernel/entry/common.c:209 [<ffffffff843ff25d>] __syscall_exit_to_user_mode_work kernel/entry/common.c:291 [inline] [<ffffffff843ff25d>] syscall_exit_to_user_mode+0x1d/0x40 kernel/entry/common.c:302 [<ffffffff843fa4a2>] do_syscall_64+0x42/0xb0 arch/x86/entry/common.c:86 [<ffffffff84600068>] entry_SYSCALL_64_after_hwframe+0x44/0xae when create_io_thread() return error, and not retry, the worker object need to be freed. Reported-by: [email protected] Signed-off-by: Qiang.zhang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2021-09-09	iommu: Clarify default domain Kconfig	Robin Murphy	1	-1/+1
	Although strictly it is the AMD and Intel drivers which have an existing expectation of lazy behaviour by default, it ends up being rather unintuitive to describe this literally in Kconfig. Express it instead as an architecture dependency, to clarify that it is a valid config-time decision. The end result is the same since virtio-iommu doesn't support lazy mode and thus falls back to strict at runtime regardless. The per-architecture disparity is a matter of historical expectations: the AMD and Intel drivers have been lazy by default since 2008, and changing that gets noticed by people asking where their I/O throughput has gone. Conversely, Arm-based systems with their wider assortment of IOMMU drivers mostly only support strict mode anyway; only the Arm SMMU drivers have later grown support for passthrough and lazy mode, for users who wanted to explicitly trade off isolation for performance. These days, reducing the default level of isolation in a way which may go unnoticed by users who expect otherwise hardly seems worth risking for the sake of one line of Kconfig, so here's where we are. Reported-by: Linus Torvalds <[email protected]> Signed-off-by: Robin Murphy <[email protected]> Link: https://lore.kernel.org/r/69a0c6f17b000b54b8333ee42b3124c1d5a869e2.1631105737.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <[email protected]>
2021-09-09	iommu/vt-d: Fix a deadlock in intel_svm_drain_prq()	Fenghua Yu	1	-0/+12
	pasid_mutex and dev->iommu->param->lock are held while unbinding mm is flushing IO page fault workqueue and waiting for all page fault works to finish. But an in-flight page fault work also need to hold the two locks while unbinding mm are holding them and waiting for the work to finish. This may cause an ABBA deadlock issue as shown below: idxd 0000:00:0a.0: unbind PASID 2 ====================================================== WARNING: possible circular locking dependency detected 5.14.0-rc7+ #549 Not tainted [ 186.615245] ---------- dsa_test/898 is trying to acquire lock: ffff888100d854e8 (&param->lock){+.+.}-{3:3}, at: iopf_queue_flush_dev+0x29/0x60 but task is already holding lock: ffffffff82b2f7c8 (pasid_mutex){+.+.}-{3:3}, at: intel_svm_unbind+0x34/0x1e0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (pasid_mutex){+.+.}-{3:3}: __mutex_lock+0x75/0x730 mutex_lock_nested+0x1b/0x20 intel_svm_page_response+0x8e/0x260 iommu_page_response+0x122/0x200 iopf_handle_group+0x1c2/0x240 process_one_work+0x2a5/0x5a0 worker_thread+0x55/0x400 kthread+0x13b/0x160 ret_from_fork+0x22/0x30 -> #1 (&param->fault_param->lock){+.+.}-{3:3}: __mutex_lock+0x75/0x730 mutex_lock_nested+0x1b/0x20 iommu_report_device_fault+0xc2/0x170 prq_event_thread+0x28a/0x580 irq_thread_fn+0x28/0x60 irq_thread+0xcf/0x180 kthread+0x13b/0x160 ret_from_fork+0x22/0x30 -> #0 (&param->lock){+.+.}-{3:3}: __lock_acquire+0x1134/0x1d60 lock_acquire+0xc6/0x2e0 __mutex_lock+0x75/0x730 mutex_lock_nested+0x1b/0x20 iopf_queue_flush_dev+0x29/0x60 intel_svm_drain_prq+0x127/0x210 intel_svm_unbind+0xc5/0x1e0 iommu_sva_unbind_device+0x62/0x80 idxd_cdev_release+0x15a/0x200 [idxd] __fput+0x9c/0x250 ____fput+0xe/0x10 task_work_run+0x64/0xa0 exit_to_user_mode_prepare+0x227/0x230 syscall_exit_to_user_mode+0x2c/0x60 do_syscall_64+0x48/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae other info that might help us debug this: Chain exists of: &param->lock --> &param->fault_param->lock --> pasid_mutex Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(pasid_mutex); lock(&param->fault_param->lock); lock(pasid_mutex); lock(&param->lock); * DEADLOCK * 2 locks held by dsa_test/898: #0: ffff888100cc1cc0 (&group->mutex){+.+.}-{3:3}, at: iommu_sva_unbind_device+0x53/0x80 #1: ffffffff82b2f7c8 (pasid_mutex){+.+.}-{3:3}, at: intel_svm_unbind+0x34/0x1e0 stack backtrace: CPU: 2 PID: 898 Comm: dsa_test Not tainted 5.14.0-rc7+ #549 Hardware name: Intel Corporation Kabylake Client platform/KBL S DDR4 UD IMM CRB, BIOS KBLSE2R1.R00.X050.P01.1608011715 08/01/2016 Call Trace: dump_stack_lvl+0x5b/0x74 dump_stack+0x10/0x12 print_circular_bug.cold+0x13d/0x142 check_noncircular+0xf1/0x110 __lock_acquire+0x1134/0x1d60 lock_acquire+0xc6/0x2e0 ? iopf_queue_flush_dev+0x29/0x60 ? pci_mmcfg_read+0xde/0x240 __mutex_lock+0x75/0x730 ? iopf_queue_flush_dev+0x29/0x60 ? pci_mmcfg_read+0xfd/0x240 ? iopf_queue_flush_dev+0x29/0x60 mutex_lock_nested+0x1b/0x20 iopf_queue_flush_dev+0x29/0x60 intel_svm_drain_prq+0x127/0x210 ? intel_pasid_tear_down_entry+0x22e/0x240 intel_svm_unbind+0xc5/0x1e0 iommu_sva_unbind_device+0x62/0x80 idxd_cdev_release+0x15a/0x200 pasid_mutex protects pasid and svm data mapping data. It's unnecessary to hold pasid_mutex while flushing the workqueue. To fix the deadlock issue, unlock pasid_pasid during flushing the workqueue to allow the works to be handled. Fixes: d5b9e4bfe0d8 ("iommu/vt-d: Report prq to io-pgfault framework") Reported-and-tested-by: Dave Jiang <[email protected]> Signed-off-by: Fenghua Yu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lu Baolu <[email protected]> Link: https://lore.kernel.org/r/[email protected] [joro: Removed timing information from kernel log messages] Signed-off-by: Joerg Roedel <[email protected]>
2021-09-09	iommu/vt-d: Fix PASID leak in intel_svm_unbind_mm()	Fenghua Yu	1	-3/+0
	The mm->pasid will be used in intel_svm_free_pasid() after load_pasid() during unbinding mm. Clearing it in load_pasid() will cause PASID cannot be freed in intel_svm_free_pasid(). Additionally mm->pasid was updated already before load_pasid() during pasid allocation. No need to update it again in load_pasid() during binding mm. Don't update mm->pasid to avoid the issues in both binding mm and unbinding mm. Fixes: 4048377414162 ("iommu/vt-d: Use iommu_sva_alloc(free)_pasid() helpers") Reported-and-tested-by: Dave Jiang <[email protected]> Co-developed-by: Jacob Pan <[email protected]> Signed-off-by: Jacob Pan <[email protected]> Signed-off-by: Fenghua Yu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lu Baolu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2021-09-09	iommu/amd: Remove iommu_init_ga()	Suravee Suthikulpanit	1	-13/+4
	Since the function has been simplified and only call iommu_init_ga_log(), remove the function and replace with iommu_init_ga_log() instead. Signed-off-by: Suravee Suthikulpanit <[email protected]> Link: https://lore.kernel.org/r/[email protected] Fixes: 8bda0cfbdc1a ("iommu/amd: Detect and initialize guest vAPIC log") Signed-off-by: Joerg Roedel <[email protected]>
2021-09-09	iommu/amd: Relocate GAMSup check to early_enable_iommus	Wei Huang	1	-7/+24
	Currently, iommu_init_ga() checks and disables IOMMU VAPIC support (i.e. AMD AVIC support in IOMMU) when GAMSup feature bit is not set. However it forgets to clear IRQ_POSTING_CAP from the previously set amd_iommu_irq_ops.capability. This triggers an invalid page fault bug during guest VM warm reboot if AVIC is enabled since the irq_remapping_cap(IRQ_POSTING_CAP) is incorrectly set, and crash the system with the following kernel trace. BUG: unable to handle page fault for address: 0000000000400dd8 RIP: 0010:amd_iommu_deactivate_guest_mode+0x19/0xbc Call Trace: svm_set_pi_irte_mode+0x8a/0xc0 [kvm_amd] ? kvm_make_all_cpus_request_except+0x50/0x70 [kvm] kvm_request_apicv_update+0x10c/0x150 [kvm] svm_toggle_avic_for_irq_window+0x52/0x90 [kvm_amd] svm_enable_irq_window+0x26/0xa0 [kvm_amd] vcpu_enter_guest+0xbbe/0x1560 [kvm] ? avic_vcpu_load+0xd5/0x120 [kvm_amd] ? kvm_arch_vcpu_load+0x76/0x240 [kvm] ? svm_get_segment_base+0xa/0x10 [kvm_amd] kvm_arch_vcpu_ioctl_run+0x103/0x590 [kvm] kvm_vcpu_ioctl+0x22a/0x5d0 [kvm] __x64_sys_ioctl+0x84/0xc0 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xae Fixes by moving the initializing of AMD IOMMU interrupt remapping mode (amd_iommu_guest_ir) earlier before setting up the amd_iommu_irq_ops.capability with appropriate IRQ_POSTING_CAP flag. [joro: Squashed the two patches and limited check_features_on_all_iommus() to CONFIG_IRQ_REMAP to fix a compile warning.] Signed-off-by: Wei Huang <[email protected]> Co-developed-by: Suravee Suthikulpanit <[email protected]> Signed-off-by: Suravee Suthikulpanit <[email protected]> Link: https://lore.kernel.org/r/[email protected] Link: https://lore.kernel.org/r/[email protected] Fixes: 8bda0cfbdc1a ("iommu/amd: Detect and initialize guest vAPIC log") Signed-off-by: Joerg Roedel <[email protected]>
2021-09-09	parisc: Mark sched_clock unstable only if clocks are not syncronized	Helge Deller	2	-6/+3
	We check at runtime if the cr16 clocks are stable across CPUs. Only mark the sched_clock unstable by calling clear_sched_clock_stable() if we know that we run on a system which isn't syncronized across CPUs. Signed-off-by: Helge Deller <[email protected]>
2021-09-09	parisc: Move pci_dev_is_behind_card_dino to where it is used	Guenter Roeck	1	-9/+9
	parisc build test images fail to compile with the following error. drivers/parisc/dino.c:160:12: error: 'pci_dev_is_behind_card_dino' defined but not used Move the function just ahead of its only caller to avoid the error. Fixes: 5fa1659105fa ("parisc: Disable HP HSC-PCI Cards to prevent kernel crash") Cc: Helge Deller <[email protected]> Signed-off-by: Guenter Roeck <[email protected]> Signed-off-by: Helge Deller <[email protected]>
2021-09-09	parisc: Reduce sigreturn trampoline to 3 instructions	Helge Deller	3	-9/+8
	We can move the INSN_LDI_R20 instruction into the branch delay slot. Signed-off-by: Helge Deller <[email protected]>
2021-09-09	parisc: Check user signal stack trampoline is inside TASK_SIZE	Helge Deller	1	-7/+10
	Add some additional checks to ensure the signal stack is inside userspace bounds. Signed-off-by: Helge Deller <[email protected]>
2021-09-09	parisc: Drop useless debug info and comments from signal.c	Helge Deller	1	-15/+0
	Signed-off-by: Helge Deller <[email protected]>
2021-09-09	parisc: Drop strnlen_user() in favour of generic version	Helge Deller	4	-38/+1
	As suggested by Arnd Bergmann, drop the parisc version of strnlen_user() and switch to the generic version. Suggested-by: Arnd Bergmann <[email protected]> Acked-by: Arnd Bergmann <[email protected]> Signed-off-by: Helge Deller <[email protected]>
2021-09-09	parisc: Add missing FORCE prerequisite in Makefile	Helge Deller	1	-9/+9
	Signed-off-by: Helge Deller <[email protected]>
2021-09-09	net: ni65: Avoid typecast of pointer to u32	Guenter Roeck	1	-1/+1
	Building alpha:allmodconfig results in the following error. drivers/net/ethernet/amd/ni65.c: In function 'ni65_stop_start': drivers/net/ethernet/amd/ni65.c:751:37: error: cast from pointer to integer of different size buffer[i] = (u32) isa_bus_to_virt(tmdp->u.buffer); 'buffer[]' is declared as unsigned long, so replace the typecast to u32 with a typecast to unsigned long to fix the problem. Cc: Arnd Bergmann <[email protected]> Signed-off-by: Guenter Roeck <[email protected]> Acked-by: Arnd Bergmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-09-09	Merge branch 'sfx-xdp-fallback-tx-queues'	David S. Miller	3	-40/+103
	Íñigo Huguet says: ==================== sfc: fallback for lack of xdp tx queues If there are not enough hardware resources to allocate one tx queue per CPU for XDP, XDP_TX and XDP_REDIRECT actions were unavailable, and using them resulted each time with the packet being drop and this message in the logs: XDP TX failed (-22) These patches implement 2 fallback solutions for 2 different situations that might happen: 1. There are not enough free resources for all the tx queues, but there are some free resources available 2. There are not enough free resources at all for tx queues. Both solutions are based in sharing tx queues, using __netif_tx_lock for synchronization. In the second case, as there are not XDP TX queues to share, network stack queues are used instead, but since we're taking __netif_tx_lock, concurrent access to the queues is correctly protected. The solution for this second case might affect performance both of XDP traffic and normal traffice due to lock contention if both are used intensively. That's why I call it a "last resort" fallback: it's not a desirable situation, but at least we have XDP TX working. Some tests has shown good results and indicate that the non-fallback case is not being damaged by this changes. They are also promising for the fallback cases. This is the test: 1. From another machine, send high amount of packets with pktgen, script samples/pktgen/pktgen_sample04_many_flows.sh 2. In the tested machine, run samples/bpf/xdp_rxq_info with arguments "-a XDP_TX --swapmac" and see the results 3. In the tested machine, run also pktgen_sample04 to create high TX normal traffic, and see how xdp_rxq_info results vary Note that this test doesn't check the worst situations for the fallback solutions because XDP_TX will only be executed from the same CPUs that are processed by sfc, and not from every CPU in the system, so the performance drop due to the highest locking contention doesn't happen. I'd like to test that, as well, but I don't have access right now to a proper environment. Test results: Without doing TX: Before changes: ~2,900,000 pps After changes, 1 queues/core: ~2,900,000 pps After changes, 2 queues/core: ~2,900,000 pps After changes, 8 queues/core: ~2,900,000 pps After changes, borrowing from network stack: ~2,900,000 pps With multiflow TX at the same time: Before changes: ~1,700,000 - 2,900,000 pps After changes, 1 queues/core: ~1,700,000 - 2,900,000 pps After changes, 2 queues/core: ~1,700,000 pps After changes, 8 queues/core: ~1,700,000 pps After changes, borrowing from network stack: 1,150,000 pps Sporadic "XDP TX failed (-5)" warnings are shown when running xdp program and pktgen simultaneously. This was expected because XDP doesn't have any buffering system if the NIC is under very high pressure. Thousands of these warnings are shown in the case of borrowing net stack queues. As I said before, this was also expected. ==================== Signed-off-by: David S. Miller <[email protected]>
2021-09-09	sfc: last resort fallback for lack of xdp tx queues	Íñigo Huguet	3	-40/+58
	Previous patch addressed the situation of having some free resources for xdp tx but not enough for one tx queue per CPU. This patch address the worst case of not having resources at all for xdp tx. Instead of using queues dedicated to xdp, normal queues used by network stack are shared for both cases, using __netif_tx_lock for synchronization. Also queue stop/restart must be considered in the xdp path to avoid freezing the queue. This is not the ideal situation we might want to be, and a performance penalty is expected both for normal and xdp traffic, but at least XDP will work in all possible situations (with a warning in the logs), improving a bit the pain of not knowing in what situations we can use it and in what situations we cannot. Signed-off-by: Íñigo Huguet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-09-09	sfc: fallback for lack of xdp tx queues	Íñigo Huguet	3	-19/+64
	If there are not enough resources to allocate one TX queue per core for XDP TX it was completely disabled. This patch implements a fallback solution for sharing the available queues using __netif_tx_lock for synchronization. In the normal case that there is one TX queue per CPU, no locking is done, as it was before. With this fallback solution, XDP TX will work in much more cases that were failing, specially in machines with many CPUs. It's hard for XDP users to know what features are supported across different NICs and configurations, so they will benefit on having wider support. Signed-off-by: Íñigo Huguet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-09-09	net: stmmac: platform: fix build warning when with !CONFIG_PM_SLEEP	Joakim Zhang	1	-2/+2
	Use __maybe_unused for noirq_suspend()/noirq_resume() hooks to avoid build warning with !CONFIG_PM_SLEEP: >> drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c:796:12: error: 'stmmac_pltfr_noirq_resume' defined but not used [-Werror=unused-function] 796 \| static int stmmac_pltfr_noirq_resume(struct device dev) \| ^~~~~~~~~~~~~~~~~~~~~~~~~ >> drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c:775:12: error: 'stmmac_pltfr_noirq_suspend' defined but not used [-Werror=unused-function] 775 \| static int stmmac_pltfr_noirq_suspend(struct device dev) \| ^~~~~~~~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors Fixes: 276aae377206 ("net: stmmac: fix system hang caused by eee_ctrl_timer during suspend/resume") Reported-by: kernel test robot <[email protected]> Signed-off-by: Joakim Zhang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-09-09	net/l2tp: Fix reference count leak in l2tp_udp_recv_core	Xiyu Yang	1	-1/+3
	The reference count leak issue may take place in an error handling path. If both conditions of tunnel->version == L2TP_HDR_VER_3 and the return value of l2tp_v3_ensure_opt_in_linear is nonzero, the function would directly jump to label invalid, without decrementing the reference count of the l2tp_session object session increased earlier by l2tp_tunnel_get_session(). This may result in refcount leaks. Fix this issue by decrease the reference count before jumping to the label invalid. Fixes: 4522a70db7aa ("l2tp: fix reading optional fields of L2TPv3") Signed-off-by: Xiyu Yang <[email protected]> Signed-off-by: Xin Xiong <[email protected]> Signed-off-by: Xin Tan <[email protected]> Signed-off-by: David S. Miller <[email protected]>