aboutsummaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2024-04-14Merge tag 'locking-urgent-2024-04-14' of ↵Linus Torvalds2-5/+5
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fix from Ingo Molnar: "Fix a PREEMPT_RT build bug" * tag 'locking-urgent-2024-04-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking: Make rwsem_assert_held_write_nolockdep() build with PREEMPT_RT=y
2024-04-14Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds2-9/+13
Pull virtio bugfixes from Michael Tsirkin: "Some small, obvious (in hindsight) bugfixes: - new ioctl in vhost-vdpa has a wrong # - not too late to fix - vhost has apparently been lacking an smp_rmb() - due to code duplication :( The duplication will be fixed in the next merge cycle, this is a minimal fix - an error message in vhost talks about guest moving used index - which of course never happens, guest only ever moves the available index - i2c-virtio didn't set the driver owner so it did not get refcounted correctly" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vhost: correct misleading printing information vhost-vdpa: change ioctl # for VDPA_GET_VRING_SIZE virtio: store owner from modules with register_virtio_driver() vhost: Add smp_rmb() in vhost_enable_notify() vhost: Add smp_rmb() in vhost_vq_avail_empty()
2024-04-14net: change maximum number of UDP segments to 128Yuri Benditovich1-1/+1
The commit fc8b2a619469 ("net: more strict VIRTIO_NET_HDR_GSO_UDP_L4 validation") adds check of potential number of UDP segments vs UDP_MAX_SEGMENTS in linux/virtio_net.h. After this change certification test of USO guest-to-guest transmit on Windows driver for virtio-net device fails, for example with packet size of ~64K and mss of 536 bytes. In general the USO should not be more restrictive than TSO. Indeed, in case of unreasonably small mss a lot of segments can cause queue overflow and packet loss on the destination. Limit of 128 segments is good for any practical purpose, with minimal meaningful mss of 536 the maximal UDP packet will be divided to ~120 segments. The number of segments for UDP packets is validated vs UDP_MAX_SEGMENTS also in udp.c (v4,v6), this does not affect quest-to-guest path but does affect packets sent to host, for example. It is important to mention that UDP_MAX_SEGMENTS is kernel-only define and not available to user mode socket applications. In order to request MSS smaller than MTU the applications just uses setsockopt with SOL_UDP and UDP_SEGMENT and there is no limitations on socket API level. Fixes: fc8b2a619469 ("net: more strict VIRTIO_NET_HDR_GSO_UDP_L4 validation") Signed-off-by: Yuri Benditovich <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-04-14bootconfig: use memblock_free_late to free xbc memory to buddyQiang Zhang1-1/+6
On the time to free xbc memory in xbc_exit(), memblock may has handed over memory to buddy allocator. So it doesn't make sense to free memory back to memblock. memblock_free() called by xbc_exit() even causes UAF bugs on architectures with CONFIG_ARCH_KEEP_MEMBLOCK disabled like x86. Following KASAN logs shows this case. This patch fixes the xbc memory free problem by calling memblock_free() in early xbc init error rewind path and calling memblock_free_late() in xbc exit path to free memory to buddy allocator. [ 9.410890] ================================================================== [ 9.418962] BUG: KASAN: use-after-free in memblock_isolate_range+0x12d/0x260 [ 9.426850] Read of size 8 at addr ffff88845dd30000 by task swapper/0/1 [ 9.435901] CPU: 9 PID: 1 Comm: swapper/0 Tainted: G U 6.9.0-rc3-00208-g586b5dfb51b9 #5 [ 9.446403] Hardware name: Intel Corporation RPLP LP5 (CPU:RaptorLake)/RPLP LP5 (ID:13), BIOS IRPPN02.01.01.00.00.19.015.D-00000000 Dec 28 2023 [ 9.460789] Call Trace: [ 9.463518] <TASK> [ 9.465859] dump_stack_lvl+0x53/0x70 [ 9.469949] print_report+0xce/0x610 [ 9.473944] ? __virt_addr_valid+0xf5/0x1b0 [ 9.478619] ? memblock_isolate_range+0x12d/0x260 [ 9.483877] kasan_report+0xc6/0x100 [ 9.487870] ? memblock_isolate_range+0x12d/0x260 [ 9.493125] memblock_isolate_range+0x12d/0x260 [ 9.498187] memblock_phys_free+0xb4/0x160 [ 9.502762] ? __pfx_memblock_phys_free+0x10/0x10 [ 9.508021] ? mutex_unlock+0x7e/0xd0 [ 9.512111] ? __pfx_mutex_unlock+0x10/0x10 [ 9.516786] ? kernel_init_freeable+0x2d4/0x430 [ 9.521850] ? __pfx_kernel_init+0x10/0x10 [ 9.526426] xbc_exit+0x17/0x70 [ 9.529935] kernel_init+0x38/0x1e0 [ 9.533829] ? _raw_spin_unlock_irq+0xd/0x30 [ 9.538601] ret_from_fork+0x2c/0x50 [ 9.542596] ? __pfx_kernel_init+0x10/0x10 [ 9.547170] ret_from_fork_asm+0x1a/0x30 [ 9.551552] </TASK> [ 9.555649] The buggy address belongs to the physical page: [ 9.561875] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x45dd30 [ 9.570821] flags: 0x200000000000000(node=0|zone=2) [ 9.576271] page_type: 0xffffffff() [ 9.580167] raw: 0200000000000000 ffffea0011774c48 ffffea0012ba1848 0000000000000000 [ 9.588823] raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000 [ 9.597476] page dumped because: kasan: bad access detected [ 9.605362] Memory state around the buggy address: [ 9.610714] ffff88845dd2ff00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 9.618786] ffff88845dd2ff80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 9.626857] >ffff88845dd30000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 9.634930] ^ [ 9.638534] ffff88845dd30080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 9.646605] ffff88845dd30100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 9.654675] ================================================================== Link: https://lore.kernel.org/all/[email protected]/ Fixes: 40caa127f3c7 ("init: bootconfig: Remove all bootconfig data when the init memory is removed") Cc: [email protected] Signed-off-by: Qiang Zhang <[email protected]> Acked-by: Masami Hiramatsu (Google) <[email protected]> Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
2024-04-13Merge tag 'counter-updates-for-6.10' of ↵Greg Kroah-Hartman2-1/+16
git://git.kernel.org/pub/scm/linux/kernel/git/wbg/counter into char-misc-next William writes: Counter updates for 6.10 Three key updates of note herein: - Introduction of the COUNTER_COMP_FREQUENCY() macro to simplify creation of "frequency" Counter extensions - Three additional Signals (Clock, Channel 3, and Channel 4) are supported for the stm32-timer-cnt - Counter events support added for the stm32-timer-cnt There are also some miscellaneous cleanups and improvements, such as constifying Counter structures, resolving a kernel-doc description warning, and converting platform_driver remove callbacks to remove_new. * tag 'counter-updates-for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/wbg/counter: counter: ti-ecap-capture: Utilize COUNTER_COMP_FREQUENCY macro counter: ti-eqep: Convert to platform remove callback returning void counter: ti-ecap-capture: Convert to platform remove callback returning void MAINTAINERS: Update email addresses for William Breathitt Gray counter: stm32-timer-cnt: add support for capture events counter: stm32-timer-cnt: add support for overflow events counter: stm32-timer-cnt: probe number of channels from registers counter: stm32-timer-cnt: introduce channels counter: stm32-timer-cnt: add checks on quadrature encoder capability counter: stm32-timer-cnt: add counter prescaler extension counter: stm32-timer-cnt: introduce clock signal counter: stm32-timer-cnt: adopt signal definitions counter: stm32-timer-cnt: rename counter counter: stm32-timer-cnt: rename quadrature signal counter: Introduce the COUNTER_COMP_FREQUENCY() macro counter: constify the struct device_type usage counter: make counter_bus_type const counter: linux/counter.h: fix Excess kernel-doc description warning
2024-04-13vfs: relax linkat() AT_EMPTY_PATH - aka flink() - requirementsLinus Torvalds1-0/+1
"The definition of insanity is doing the same thing over and over again and expecting different results” We've tried to do this before, most recently with commit bb2314b47996 ("fs: Allow unprivileged linkat(..., AT_EMPTY_PATH) aka flink") about a decade ago. But the effort goes back even further than that, eg this thread back from 1998 that is so old that we don't even have it archived in lore: https://lkml.org/lkml/1998/3/10/108 which also points out some of the reasons why it's dangerous. Or, how about then in 2003: https://lkml.org/lkml/2003/4/6/112 where we went through some of the same arguments, just wirh different people involved. In particular, having access to a file descriptor does not necessarily mean that you have access to the path that was used for lookup, and there may be very good reasons why you absolutely must not have access to a path to said file. For example, if we were passed a file descriptor from the outside into some limited environment (think chroot, but also user namespaces etc) a 'flink()' system call could now make that file visible inside a context where it's not supposed to be visible. In the process the user may also be able to re-open it with permissions that the original file descriptor did not have (eg a read-only file descriptor may be associated with an underlying file that is writable). Another variation on this is if somebody else (typically root) opens a file in a directory that is not accessible to others, and passes the file descriptor on as a read-only file. Again, the access to the file descriptor does not imply that you should have access to a path to the file in the filesystem. So while we have tried this several times in the past, it never works. The last time we did this, that commit bb2314b47996 quickly got reverted again in commit f0cc6ffb8ce8 (Revert "fs: Allow unprivileged linkat(..., AT_EMPTY_PATH) aka flink"), with a note saying "We may re-do this once the whole discussion about the interface is done". Well, the discussion is long done, and didn't come to any resolution. There's no question that 'flink()' would be a useful operation, but it's a dangerous one. However, it does turn out that since 2008 (commit d76b0d9b2d87: "CRED: Use creds in file structs") we have had a fairly straightforward way to check whether the file descriptor was opened by the same credentials as the credentials of the flink(). That allows the most common patterns that people want to use, which tend to be to either open the source carefully (ie using the openat2() RESOLVE_xyz flags, and/or checking ownership with fstat() before linking), or to use O_TMPFILE and fill in the file contents before it's exposed to the world with linkat(). But it also means that if the file descriptor was opened by somebody else, or we've gone through a credentials change since, the operation no longer works (unless we have CAP_DAC_READ_SEARCH capabilities in the opener's user namespace, as before). Note that the credential equality check is done by using pointer equality, which means that it's not enough that you have effectively the same user - they have to be literally identical, since our credentials are using copy-on-write semantics. So you can't change your credentials to something else and try to change it back to the same ones between the open() and the linkat(). This is not meant to be some kind of generic permission check, this is literally meant as a "the open and link calls are 'atomic' wrt user credentials" check. It also means that you can't just move things between namespaces, because the credentials aren't just a list of uid's and gid's: they includes the pointer to the user_ns that the capabilities are relative to. So let's try this one more time and see if maybe this approach ends up being workable after all. Cc: Andrew Lutomirski <[email protected]> Cc: Al Viro <[email protected]> Cc: Christian Brauner <[email protected]> Cc: Peter Anvin <[email protected]> Cc: Jan Kara <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Link: https://lore.kernel.org/r/[email protected] [brauner: relax capability check to opener of the file] Link: https://lore.kernel.org/all/20231113-undenkbar-gediegen-efde5f1c34bc@brauner Signed-off-by: Christian Brauner <[email protected]>
2024-04-13efi: Clear up misconceptions about a maximum variable name sizeTim Schumacher1-5/+4
The UEFI specification does not make any mention of a maximum variable name size, so the headers and implementation shouldn't claim that one exists either. Comments referring to this limit have been removed or rewritten, as this is an implementation detail local to the Linux kernel. Where appropriate, the magic value of 1024 has been replaced with EFI_VAR_NAME_LEN, as this is used for the efi_variable struct definition. This in itself does not change any behavior, but should serve as points of interest when making future changes in the same area. A related build-time check has been added to ensure that the special 512 byte sized buffer will not overflow with a potentially decreased EFI_VAR_NAME_LEN. Signed-off-by: Tim Schumacher <[email protected]> Signed-off-by: Ard Biesheuvel <[email protected]>
2024-04-12Merge tag 'io_uring-6.9-20240412' of git://git.kernel.dk/linuxLinus Torvalds1-1/+1
Pull io_uring fixes from Jens Axboe: - Fix for sigmask restoring while waiting for events (Alexey) - Typo fix in comment (Haiyue) - Fix for a msg_control retstore on SEND_ZC retries (Pavel) * tag 'io_uring-6.9-20240412' of git://git.kernel.dk/linux: io-uring: correct typo in comment for IOU_F_TWQ_LAZY_WAKE io_uring/net: restore msg_control on sendzc retry io_uring: Fix io_cqring_wait() not restoring sigmask on get_timespec64() failure
2024-04-12Merge tag 'drm-fixes-2024-04-12' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds1-0/+7
Pull drm fixes from Dave Airlie: "Looks like everyone woke up after holidays, this weeks pull has a bunch of stuff all over, 2 weeks worth of amdgpu is a lot of it, then i915/xe have a few, a bunch of msm fixes, then some scattered driver fixes. I expect things will settle down for rc5. client: - Protect connector modes with mode_config mutex ast: - Fix soft lockup host1x: - Do not setup DMA for virtual addresses ivpu: - Fix deadlock in context_xa - PCI fixes - Fixes to error handling nouveau: - gsp: Fix OOB access - Fix casting panfrost: - Fix error path in MMU code qxl: - Revert "drm/qxl: simplify qxl_fence_wait" vmwgfx: - Enable DMA for SEV mappings i915: - Couple CDCLK programming fixes - HDCP related fix - 4 Bigjoiner related fixes - Fix for a circular locking around GuC on reset+wedged case xe: - Fix double display mutex initializations - Fix u32 -> u64 implicit conversions - Fix RING_CONTEXT_CONTROL not marked as masked msm: - DP refcount leak fix on disconnect - Add missing newlines to prints in msm_fb and msm_kms - fix dpu debugfs entry permissions - Fix the interface table for the catalog of X1E80100 - fix irq message printing - Bindings fix to add DP node as child of mdss for mdss node - Minor typo fix in DP driver API which handles port status change - fix CHRASHDUMP_READ() - fix HHB (highest bank bit) for a619 to fix UBWC corruption amdgpu: - GPU reset fixes - Fix some confusing logging - UMSCH fix - Aborted suspend fix - DCN 3.5 fixes - S4 fix - MES logging fixes - SMU 14 fixes - SDMA 4.4.2 fix - KASAN fix - SMU 13.0.10 fix - VCN partition fix - GFX11 fixes - DWB fixes - Plane handling fix - FAMS fix - DCN 3.1.6 fix - VSC SDP fixes - OLED panel fix - GFX 11.5 fix amdkfd: - GPU reset fixes - fix ioctl integer overflow" * tag 'drm-fixes-2024-04-12' of https://gitlab.freedesktop.org/drm/kernel: (65 commits) amdkfd: use calloc instead of kzalloc to avoid integer overflow drm/xe: Label RING_CONTEXT_CONTROL as masked drm/xe/xe_migrate: Cast to output precision before multiplying operands drm/xe/hwmon: Cast result to output precision on left shift of operand drm/xe/display: Fix double mutex initialization drm/amdgpu: differentiate external rev id for gfx 11.5.0 drm/amd/display: Adjust dprefclk by down spread percentage. drm/amd/display: Set VSC SDP Colorimetry same way for MST and SST drm/amd/display: Program VSC SDP colorimetry for all DP sinks >= 1.4 drm/amd/display: fix disable otg wa logic in DCN316 drm/amd/display: Do not recursively call manual trigger programming drm/amd/display: always reset ODM mode in context when adding first plane drm/amdgpu: fix incorrect number of active RBs for gfx11 drm/amd/display: Return max resolution supported by DWB amd/amdkfd: sync all devices to wait all processes being evicted drm/amdgpu: clear set_q_mode_offs when VM changed drm/amdgpu: Fix VCN allocation in CPX partition drm/amd/pm: fix the high voltage issue after unload drm/amd/display: Skip on writeback when it's not applicable drm/amdgpu: implement IRQ_STATE_ENABLE for SDMA v4.4.2 ...
2024-04-12genirq: Provide a snapshot mechanism for interrupt statisticsBitao Hu2-0/+12
The soft lockup detector lacks a mechanism to identify interrupt storms as root cause of a lockup. To enable this the detector needs a mechanism to snapshot the interrupt count statistics on a CPU when the detector observes a potential lockup scenario and compare that against the interrupt count when it warns about the lockup later on. The number of interrupts in that period give a hint whether the lockup might have been caused by an interrupt storm. Instead of having extra storage in the lockup detector and accessing the internals of the interrupt descriptor directly, add a snapshot member to the per CPU irq_desc::kstat_irq structure and provide interfaces to take a snapshot of all interrupts on the current CPU and to retrieve the delta of a specific interrupt later on. Originally-by: Thomas Gleixner <[email protected]> Signed-off-by: Bitao Hu <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-04-12genirq: Convert kstat_irqs to a structBitao Hu1-2/+10
The irq_desc::kstat_irqs member is a per-CPU variable of type int, which is only capable of counting. A snapshot mechanism for interrupt statistics will be added soon, which requires an additional variable to store the snapshot. To facilitate expansion, convert kstat_irqs here to a struct containing only the count. Originally-by: Thomas Gleixner <[email protected]> Signed-off-by: Bitao Hu <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-04-12ACPICA: Update acpixf.h for new ACPICA release 20240322Saket Dumbre1-1/+1
ACPICA commit 718374cd1bc21d08960b61069c8ac62b0cf67c0c Link: https://github.com/acpica/acpica/commit/718374cd Signed-off-by: Rafael J. Wysocki <[email protected]>
2024-04-12ACPICA: Fix CXL 3.0 structure (RDPAS) in the CEDT tableHojin Nam1-2/+0
ACPICA commit a0ad1ed5105fb8a15f6f8384b8ab0a2157efaf23 struct acpi_cedt_rdpas does not match with CXL r3.0 9.17.1.5 Table 9-24. reserved1 and length fields are already added by struct acpi_cedt_header. Link: https://github.com/acpica/acpica/commit/a0ad1ed5 Signed-off-by: Hojin Nam <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2024-04-12ACPICA: SRAT: Add dump and compiler support for RINTC affinity structureHaibo Xu1-1/+1
ACPICA commit b9423c1d35b072c8f2acf97a5842b9f144449eaa After adding RISC-V RINTC affinity structure definition, enable corresponding dump and compiler support. Link: https://github.com/acpica/acpica/commit/b9423c1d Signed-off-by: Haibo Xu <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2024-04-12ACPICA: SRAT: Add RISC-V RINTC affinity structureHaibo Xu1-1/+17
ACPICA commit 93caddbf2f620769052c59ec471f018281dc3a24 Add definition of RISC-V Interrupt Controller(RINTC) affinity structure which was approved by UEFI forum and will be part of next ACPI spec version(6.6). Link: https://github.com/acpica/acpica/commit/93caddbf Signed-off-by: Haibo Xu <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2024-04-12Merge drm/drm-next into drm-xe-nextThomas Hellström623-4826/+13085
Backmerging drm-next in order to get up-to-date and in particular to access commit 9ca5facd0400f610f3f7f71aeb7fc0b949a48c67. Signed-off-by: Thomas Hellström <[email protected]>
2024-04-12ACPICA: ACPI 6.5: RAS2: Add support for RAS2 tableShiju Jose1-0/+129
ACPICA commit c581606cf49b7574d29c02b1a3bc144650375e32 Add support for ACPI RAS2 feature table(RAS2) defined in the ACPI 6.5 Specification & upwards revision, section 5.2.21. The RAS2 table provides interfaces for platform RAS features. RAS2 offers the same services as RASF, but is more scalable than the latter. RAS2 supports independent RAS controls and capabilities for a given RAS feature for multiple instances of the same component in a given system. The platform can support either RAS2 or RASF but not both. Link: https://github.com/acpica/acpica/commit/c581606c Signed-off-by: Shiju Jose <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2024-04-12ACPICA: actbl1.h: Add EINJ CXL error typesBen Cheatham1-0/+6
ACPICA commit c7171588a9f684afafc83c6c18ed0bab9274e5e6 Add EINJ CXL error types added in ACPI v6.5. Link: https://github.com/acpica/acpica/commit/c7171588 Signed-off-by: Ben Cheatham <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2024-04-12Merge tag 'nf-24-04-11' of ↵David S. Miller2-1/+25
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf netfilter pull request 24-04-11 Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: Patches #1 and #2 add missing rcu read side lock when iterating over expression and object type list which could race with module removal. Patch #3 prevents promisc packet from visiting the bridge/input hook to amend a recent fix to address conntrack confirmation race in br_netfilter and nf_conntrack_bridge. Patch #4 adds and uses iterate decorator type to fetch the current pipapo set backend datastructure view when netlink dumps the set elements. Patch #5 fixes removal of duplicate elements in the pipapo set backend. Patch #6 flowtable validates pppoe header before accessing it. Patch #7 fixes flowtable datapath for pppoe packets, otherwise lookup fails and pppoe packets follow classic path. ==================== Signed-off-by: David S. Miller <[email protected]>
2024-04-12devlink: add a new info version tagFei Qin1-1/+3
Add definition and documentation for the new generic info "board.part_number". The new one is for part number specific use, and board.id is modified to match the documentation in devlink-info. Signed-off-by: Fei Qin <[email protected]> Signed-off-by: Louis Peens <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-04-12Merge patch series "convert SCSI to atomic queue limits, part 1 (v3)"Martin K. Petersen9-24/+37
Christoph Hellwig <[email protected]> says: Hi all, this series converts the SCSI midlayer and LLDDs to use atomic queue limits API. It is pretty straight forward, except for the mpt3mr driver which does really weird and probably already broken things by setting limits from unlocked device iteration callbacks. I will probably defer the (more complicated) ULD changes to the next merge window as they would heavily conflict with Damien's zone write plugging series. With that the series could go in through the SCSI tree if Jens' ACKs the core block layer bits. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2024-04-12scsi: block: Remove now unused queue limits helpersChristoph Hellwig2-15/+2
Signed-off-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Bart Van Assche <[email protected]> Reviewed-by: John Garry <[email protected]> Reviewed-by: Damien Le Moal <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2024-04-12iommu: Pass domain to remove_dev_pasid() opYi Liu1-1/+2
Existing remove_dev_pasid() callbacks of the underlying iommu drivers get the attached domain from the group->pasid_array. However, the domain stored in group->pasid_array is not always correct in all scenarios. A wrong domain may result in failure in remove_dev_pasid() callback. To avoid such problems, it is more reliable to pass the domain to the remove_dev_pasid() op. Suggested-by: Jason Gunthorpe <[email protected]> Signed-off-by: Yi Liu <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2024-04-12Merge branch 'x86/urgent' into x86/cpu, to resolve conflictIngo Molnar2-1/+3
There's a new conflict between this commit pending in x86/cpu: 63edbaa48a57 x86/cpu/topology: Add support for the AMD 0x80000026 leaf And these fixes in x86/urgent: c064b536a8f9 x86/cpu/amd: Make the NODEID_MSR union actually work 1b3108f6898e x86/cpu/amd: Make the CPUID 0x80000008 parser correct Resolve them. Conflicts: arch/x86/kernel/cpu/topology_amd.c Signed-off-by: Ingo Molnar <[email protected]>
2024-04-12tcp: increase the default TCP scaling ratioHechao Li1-3/+2
After commit dfa2f0483360 ("tcp: get rid of sysctl_tcp_adv_win_scale"), we noticed an application-level timeout due to reduced throughput. Before the commit, for a client that sets SO_RCVBUF to 65k, it takes around 22 seconds to transfer 10M data. After the commit, it takes 40 seconds. Because our application has a 30-second timeout, this regression broke the application. The reason that it takes longer to transfer data is that tp->scaling_ratio is initialized to a value that results in ~0.25 of rcvbuf. In our case, SO_RCVBUF is set to 65536 by the application, which translates to 2 * 65536 = 131,072 bytes in rcvbuf and hence a ~28k initial receive window. Later, even though the scaling_ratio is updated to a more accurate skb->len/skb->truesize, which is ~0.66 in our environment, the window stays at ~0.25 * rcvbuf. This is because tp->window_clamp does not change together with the tp->scaling_ratio update when autotuning is disabled due to SO_RCVBUF. As a result, the window size is capped at the initial window_clamp, which is also ~0.25 * rcvbuf, and never grows bigger. Most modern applications let the kernel do autotuning, and benefit from the increased scaling_ratio. But there are applications such as kafka that has a default setting of SO_RCVBUF=64k. This patch increases the initial scaling_ratio from ~25% to 50% in order to make it backward compatible with the original default sysctl_tcp_adv_win_scale for applications setting SO_RCVBUF. Fixes: dfa2f0483360 ("tcp: get rid of sysctl_tcp_adv_win_scale") Signed-off-by: Hechao Li <[email protected]> Reviewed-by: Tycho Andersen <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Link: https://lore.kernel.org/netdev/[email protected]/ Signed-off-by: David S. Miller <[email protected]>
2024-04-12perf/bpf: Remove unneeded uses_default_overflow_handler()Kyle Huey1-14/+3
Now that struct perf_event's orig_overflow_handler is gone, there's no need for the functions and macros to support looking past overflow_handler to orig_overflow_handler. This patch is solely a refactoring and results in no behavior change. Signed-off-by: Kyle Huey <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Acked-by: Will Deacon <[email protected]> Acked-by: Song Liu <[email protected]> Acked-by: Jiri Olsa <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-04-12perf/bpf: Call BPF handler directly, not through overflow machineryKyle Huey1-5/+1
To ultimately allow BPF programs attached to perf events to completely suppress all of the effects of a perf event overflow (rather than just the sample output, as they do today), call bpf_overflow_handler() from __perf_event_overflow() directly rather than modifying struct perf_event's overflow_handler. Return the BPF program's return value from bpf_overflow_handler() so that __perf_event_overflow() knows how to proceed. Remove the now unnecessary orig_overflow_handler from struct perf_event. This patch is solely a refactoring and results in no behavior change. Suggested-by: Namhyung Kim <[email protected]> Signed-off-by: Kyle Huey <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Acked-by: Song Liu <[email protected]> Acked-by: Jiri Olsa <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-04-12perf/bpf: Remove #ifdef CONFIG_BPF_SYSCALL from struct perf_event membersKyle Huey1-2/+0
This will allow __perf_event_overflow() (which is independent of CONFIG_BPF_SYSCALL) to use struct perf_event's prog to decide whether to call bpf_overflow_handler(). Suggested-by: Ingo Molnar <[email protected]> Signed-off-by: Kyle Huey <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-04-12dt-bindings: leds: Add LED_FUNCTION_SPEED_* for link speed on LAN/WANINAGAKI Hiroshi1-0/+2
Add LED_FUNCTION_SPEED_LAN and LED_FUNCTION_SPEED_WAN for LEDs that indicate link speed of ethernet ports on LAN/WAN. This is useful to distinguish those LEDs from LEDs that indicate link status (up/down). example: Fortinet FortiGate 30E/50E have LEDs that indicate link speed on each of the ethernet ports in addition to LEDs that indicate link status (up/down). - 1000 Mbps: green:speed-(lan|wan)-N - 100 Mbps: amber:speed-(lan|wan)-N - 10 Mbps: (none, turned off) Signed-off-by: INAGAKI Hiroshi <[email protected]> Reviewed-by: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lee Jones <[email protected]>
2024-04-12dt-bindings: leds: Add LED_FUNCTION_MOBILE for mobile networkINAGAKI Hiroshi1-0/+1
Add LED_FUNCTION_MOBILE for LEDs that indicate status of mobile network connection. This is useful to distinguish those LEDs from LEDs that indicates status of wired "wan" connection. example (on stock fw): IIJ SA-W2 has "Mobile" LEDs that indicate status (no signal, too low, low, good) of mobile network connection via dongle connected to USB port. - no signal: (none, turned off) - too low: green:mobile & red:mobile (amber, blink) - low: green:mobile & red:mobile (amber, turned on) - good: green:mobile (turned on) Suggested-by: Hauke Mehrtens <[email protected]> Signed-off-by: INAGAKI Hiroshi <[email protected]> Reviewed-by: Rob Herring <[email protected]> Acked-by: Krzysztof Kozlowski <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lee Jones <[email protected]>
2024-04-12Merge branches 'ib-leds-mips-sound-6.10' and 'ib-leds-locking-6.10' into ↵Lee Jones1-0/+27
ibs-for-leds-merged
2024-04-12mm: replace set_pte_at_notify() with just set_pte_at()Paolo Bonzini1-2/+0
With the demise of the .change_pte() MMU notifier callback, there is no notification happening in set_pte_at_notify(). It is a synonym of set_pte_at() and can be replaced with it. Signed-off-by: Paolo Bonzini <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Reviewed-by: Philippe Mathieu-Daudé <[email protected]> Message-ID: <[email protected]> Acked-by: Andrew Morton <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2024-04-12X.509: Introduce scope-based x509_certificate allocationLukas Wunner1-0/+2
Add a DEFINE_FREE() clause for x509_certificate structs and use it in x509_cert_parse() and x509_key_preparse(). These are the only functions where scope-based x509_certificate allocation currently makes sense. A third user will be introduced with the forthcoming SPDM library (Security Protocol and Data Model) for PCI device authentication. Unlike most other DEFINE_FREE() clauses, this one checks for IS_ERR() instead of NULL before calling x509_free_certificate() at end of scope. That's because the "constructor" of x509_certificate structs, x509_cert_parse(), returns a valid pointer or an ERR_PTR(), but never NULL. Comparing the Assembler output before/after has shown they are identical, save for the fact that gcc-12 always generates two return paths when __cleanup() is used, one for the success case and one for the error case. In x509_cert_parse(), add a hint for the compiler that kzalloc() never returns an ERR_PTR(). Otherwise the compiler adds a gratuitous IS_ERR() check on return. Introduce an assume() macro for this which can be re-used elsewhere in the kernel to provide hints for the compiler. Suggested-by: Jonathan Cameron <[email protected]> Link: https://lore.kernel.org/all/[email protected]/ Link: https://lwn.net/Articles/934679/ Signed-off-by: Lukas Wunner <[email protected]> Signed-off-by: Herbert Xu <[email protected]>
2024-04-12crypto: x509 - Add OID for NIST P521 and extend parser for itStefan Berger1-0/+1
Enable the x509 parser to accept NIST P521 certificates and add the OID for ansip521r1, which is the identifier for NIST P521. Cc: David Howells <[email protected]> Tested-by: Lukas Wunner <[email protected]> Reviewed-by: Jarkko Sakkinen <[email protected]> Signed-off-by: Stefan Berger <[email protected]> Signed-off-by: Herbert Xu <[email protected]>
2024-04-12crypto: ecc - Add NIST P521 curve parametersStefan Berger1-0/+1
Add the parameters for the NIST P521 curve and define a new curve ID for it. Make the curve available in ecc_get_curve. Tested-by: Lukas Wunner <[email protected]> Reviewed-by: Jarkko Sakkinen <[email protected]> Signed-off-by: Stefan Berger <[email protected]> Signed-off-by: Herbert Xu <[email protected]>
2024-04-12crypto: ecc - Implement vli_mmod_fast_521 for NIST p521Stefan Berger1-1/+2
Implement vli_mmod_fast_521 following the description for how to calculate the modulus for NIST P521 in the NIST publication "Recommendations for Discrete Logarithm-Based Cryptography: Elliptic Curve Domain Parameters" section G.1.4. NIST p521 requires 9 64bit digits, so increase the ECC_MAX_DIGITS so that the vli digit array provides enough elements to fit the larger integers required by this curve. Tested-by: Lukas Wunner <[email protected]> Reviewed-by: Jarkko Sakkinen <[email protected]> Signed-off-by: Stefan Berger <[email protected]> Signed-off-by: Herbert Xu <[email protected]>
2024-04-12crypto: ecc - Add nbits field to ecc_curve structureStefan Berger1-0/+2
Add the number of bits a curve has to the ecc_curve definition to be able to derive the number of bytes a curve requires for its coordinates from it. It also allows one to identify a curve by its particular size. Set the number of bits on all curve definitions. Tested-by: Lukas Wunner <[email protected]> Reviewed-by: Jarkko Sakkinen <[email protected]> Signed-off-by: Stefan Berger <[email protected]> Reviewed-by: Vitaly Chikunov <[email protected]> Signed-off-by: Herbert Xu <[email protected]>
2024-04-12crypto: ecdsa - Convert byte arrays with key coordinates to digitsStefan Berger1-0/+21
For NIST P192/256/384 the public key's x and y parameters could be copied directly from a given array since both parameters filled 'ndigits' of digits (a 'digit' is a u64). For support of NIST P521 the key parameters need to have leading zeros prepended to the most significant digit since only 2 bytes of the most significant digit are provided. Therefore, implement ecc_digits_from_bytes to convert a byte array into an array of digits and use this function in ecdsa_set_pub_key where an input byte array needs to be converted into digits. Suggested-by: Lukas Wunner <[email protected]> Tested-by: Lukas Wunner <[email protected]> Reviewed-by: Jarkko Sakkinen <[email protected]> Signed-off-by: Stefan Berger <[email protected]> Signed-off-by: Herbert Xu <[email protected]>
2024-04-11dt-bindings: clocks: stm32mp25: add description of all parentsGabriel Fernandez1-1/+1
RCC driver uses '.index' to define all parent clocks instead '.names' because the use of a name to define a parent clock is discouraged. This is an ABI change, but the RCC driver has not yet merged, unlike all others drivers besides Linux. Fixes: b5be49db3d47 ("dt-bindings: stm32: add clocks and reset binding for stm32mp25 platform") Signed-off-by: Gabriel Fernandez <[email protected]> Acked-by: Krzysztof Kozlowski <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Stephen Boyd <[email protected]>
2024-04-11net: dsa: allow DSA switch drivers to provide their own phylink mac opsRussell King (Oracle)1-0/+5
Rather than having a shim for each and every phylink MAC operation, allow DSA switch drivers to provide their own ops structure. When a DSA driver provides the phylink MAC operations, the shimmed ops must not be provided, so fail an attempt to register a switch with both the phylink_mac_ops in struct dsa_switch and the phylink_mac_* operations populated in dsa_switch_ops populated. Signed-off-by: Russell King (Oracle) <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-04-11net: dsa: introduce dsa_phylink_to_port()Russell King (Oracle)1-0/+6
We convert from a phylink_config struct to a dsa_port struct in many places, let's provide a helper for this. Reviewed-by: Andrew Lunn <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: Russell King (Oracle) <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-04-11ipv4: Remove RTO_ONLINK.Guillaume Nault1-2/+0
RTO_ONLINK was a flag used in ->flowi4_tos that allowed to alter the scope of an IPv4 route lookup. Setting this flag was equivalent to specifying RT_SCOPE_LINK in ->flowi4_scope. With commit ec20b2830093 ("ipv4: Set scope explicitly in ip_route_output()."), the last users of RTO_ONLINK have been removed. Therefore, we can now drop the code that checked this bit and stop modifying ->flowi4_scope in ip_route_output_key_hash(). Signed-off-by: Guillaume Nault <[email protected]> Reviewed-by: Przemek Kitszel <[email protected]> Reviewed-by: David Ahern <[email protected]> Link: https://lore.kernel.org/r/57de760565cab55df7b129f523530ac6475865b2.1712754146.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <[email protected]>
2024-04-11flow_offload: fix flow_offload_has_one_action() kdocAsbjørn Sloth Tønnesen1-1/+1
include/net/flow_offload.h:351: warning: No description found for return value of 'flow_offload_has_one_action' Signed-off-by: Asbjørn Sloth Tønnesen <[email protected]> Reviewed-by: Pieter Jansen van Vuuren <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-04-11net: mirror skb frag ref/unref helpersMina Almasry1-4/+35
Refactor some of the skb frag ref/unref helpers for improved clarity. Implement napi_pp_get_page() to be the mirror counterpart of napi_pp_put_page(). Implement skb_page_ref() to be the mirror of skb_page_unref(). Improve __skb_frag_ref() to become a mirror counterpart of __skb_frag_unref(). Previously unref could handle pp & non-pp pages, while the ref could only handle non-pp pages. Now both the ref & unref helpers can correctly handle both pp & non-pp pages. Now that __skb_frag_ref() can handle both pp & non-pp pages, remove skb_pp_frag_ref(), and use __skb_frag_ref() instead. This lets us remove pp specific handling from skb_try_coalesce. Additionally, since __skb_frag_ref() can now handle both pp & non-pp pages, a latent issue in skb_shift() should now be fixed. Previously this function would do a non-pp ref & pp unref on potential pp frags (fragfrom). After this patch, skb_shift() should correctly do a pp ref/unref on pp frags. Signed-off-by: Mina Almasry <[email protected]> Reviewed-by: Dragos Tatulea <[email protected]> Reviewed-by: Jacob Keller <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-04-11net: move skb ref helpers to new headerMina Almasry2-63/+75
Add a new header, linux/skbuff_ref.h, which contains all the skb_*_ref() helpers. Many of the consumers of skbuff.h do not actually use any of the skb ref helpers, and we can speed up compilation a bit by minimizing this header file. Additionally in the later patch in the series we add page_pool support to skb_frag_ref(), which requires some page_pool dependencies. We can now add these dependencies to skbuff_ref.h instead of a very ubiquitous skbuff.h Signed-off-by: Mina Almasry <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-04-11scsi: ufs: Remove support for old UFSHCI versionsAvri Altman2-14/+1
UFS spec version 2.1 was published more than 10 years ago. It is vanishingly unlikely that even there are out there platforms that uses earlier host controllers, let alone that those ancient platforms will ever run a V6.10 kernel. To be extra cautious, leave out removal of UFSHCI 2.0 support from this patch, and just remove support of host controllers prior to UFS2.0. This patch removes some legacy tuning calls that no longer apply. Signed-off-by: Avri Altman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Acked-by: Bean Huo <[email protected]> Reviewed-by: Bart Van Assche <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2024-04-11scsi: libata: Switch to using ->device_configureChristoph Hellwig2-6/+9
Switch to the ->device_configure method instead of ->slave_configure and update the block limits on the passed in queue_limits instead of using the per-limit accessors. Signed-off-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: John Garry <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Acked-by: Damien Le Moal <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2024-04-11scsi: core: Add a device_configure method to the host templateChristoph Hellwig1-0/+4
This is a version of ->slave_configure that also takes a queue_limits structure that the caller applies, and thus allows drivers to reconfigure the queue using the atomic queue limits API. In the long run it should also replace ->slave_configure entirely as there is no need to have two different methods here, and the slave name in addition to being politically charged also has no basis in the SCSI standards or the kernel code. Signed-off-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Bart Van Assche <[email protected]> Reviewed-by: Damien Le Moal <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2024-04-11scsi: ufs: ufs-exynos: Move setting the the DMA alignment to the init methodChristoph Hellwig1-1/+0
Use the SCSI host's dma_alignment field and set it in ->init and remove the now unused config_scsi_dev method. Signed-off-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Alim Akhtar <[email protected]> Reviewed-by: Bart Van Assche <[email protected]> Reviewed-by: Damien Le Moal <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2024-04-11scsi: core: Add a dma_alignment field to the host and host templateChristoph Hellwig1-0/+3
Get drivers out of the business of having to call the block layer DMA alignment limits helpers themselves. Signed-off-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Bart Van Assche <[email protected]> Reviewed-by: John Garry <[email protected]> Reviewed-by: Damien Le Moal <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Acked-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>