aboutsummaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2024-08-30fbdev: Introduce devm_register_framebuffer()Thomas Weißschuh1-0/+1
Introduce a device-managed variant of register_framebuffer() which automatically unregisters the framebuffer on device destruction. This can simplify the error handling and resource management in drivers. Signed-off-by: Thomas Weißschuh <[email protected]> Signed-off-by: Helge Deller <[email protected]>
2024-08-30arm64: smccc: Reserve block of KVM "vendor" services for pKVM hypercallsWill Deacon1-0/+60
pKVM relies on hypercalls to expose services such as memory sharing to protected guests. Tentatively allocate a block of 58 hypercalls (i.e. fill the remaining space in the first 64 function IDs) for pKVM usage, as future extensions such as pvIOMMU support, range-based memory sharing and validation of assigned devices will require additional services. Suggested-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Acked-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2024-08-30drivers/virt: pkvm: Intercept ioremap using pKVM MMIO_GUARD hypercallWill Deacon1-0/+7
Hook up pKVM's MMIO_GUARD hypercall so that ioremap() and friends will register the target physical address as MMIO with the hypervisor, allowing guest exits to that page to be emulated by the host with full syndrome information. Acked-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2024-08-30drivers/virt: pkvm: Hook up mem_encrypt API using pKVM hypercallsWill Deacon1-0/+14
If we detect the presence of pKVM's SHARE and UNSHARE hypercalls, then register a backend implementation of the mem_encrypt API so that things like DMA buffers can be shared appropriately with the host. Acked-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2024-08-30drivers/virt: pkvm: Add initial support for running as a protected guestWill Deacon1-0/+7
Implement a pKVM protected guest driver to probe the presence of pKVM and determine the memory protection granule using the HYP_MEMINFO hypercall. Acked-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2024-08-30regulator: core: Stub devm_regulator_bulk_get_const() if !CONFIG_REGULATORDouglas Anderson1-0/+8
When adding devm_regulator_bulk_get_const() I missed adding a stub for when CONFIG_REGULATOR is not enabled. Under certain conditions (like randconfig testing) this can cause the compiler to reports errors like: error: implicit declaration of function 'devm_regulator_bulk_get_const'; did you mean 'devm_regulator_bulk_get_enable'? Add the stub. Fixes: 1de452a0edda ("regulator: core: Allow drivers to define their init data as const") Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/ Cc: Neil Armstrong <[email protected]> Signed-off-by: Douglas Anderson <[email protected]> Link: https://patch.msgid.link/20240830073511.1.Ib733229a8a19fad8179213c05e1af01b51e42328@changeid Signed-off-by: Mark Brown <[email protected]>
2024-08-30fs: drop GFP_NOFAIL mode from alloc_page_buffersMichal Hocko1-2/+1
There is only one called of alloc_page_buffers and it doesn't require __GFP_NOFAIL so drop this allocation mode. Signed-off-by: Michal Hocko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Acked-by: Song Liu <[email protected]> Reviewed-by: Jan Kara <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
2024-08-30iommu: Allow ATS to work on VFs when the PF uses IDENTITYJason Gunthorpe1-0/+3
PCI ATS has a global Smallest Translation Unit field that is located in the PF but shared by all of the VFs. The expectation is that the STU will be set to the root port's global STU capability which is driven by the IO page table configuration of the iommu HW. Today it becomes set when the iommu driver first enables ATS. Thus, to enable ATS on the VF, the PF must have already had the correct STU programmed, even if ATS is off on the PF. Unfortunately the PF only programs the STU when the PF enables ATS. The iommu drivers tend to leave ATS disabled when IDENTITY translation is being used. Thus we can get into a state where the PF is setup to use IDENTITY with the DMA API while the VF would like to use VFIO with a PAGING domain and have ATS turned on. This fails because the PF never loaded a PAGING domain and so it never setup the STU, and the VF can't do it. The simplest solution is to have the iommu driver set the ATS STU when it probes the device. This way the ATS STU is loaded immediately at boot time to all PFs and there is no issue when a VF comes to use it. Add a new call pci_prepare_ats() which should be called by iommu drivers in their probe_device() op for every PCI device if the iommu driver supports ATS. This will setup the STU based on whatever page size capability the iommu HW has. Signed-off-by: Jason Gunthorpe <[email protected]> Acked-by: Bjorn Helgaas <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Reviewed-by: Lu Baolu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2024-08-30drivers/perf: arm_spe: Use perf_allow_kernel() for permissionsJames Clark1-7/+1
Use perf_allow_kernel() for 'pa_enable' (physical addresses), 'pct_enable' (physical timestamps) and context IDs. This means that perf_event_paranoid is now taken into account and LSM hooks can be used, which is more consistent with other perf_event_open calls. For example PERF_SAMPLE_PHYS_ADDR uses perf_allow_kernel() rather than just perfmon_capable(). This also indirectly fixes the following error message which is misleading because perf_event_paranoid is not taken into account by perfmon_capable(): $ perf record -e arm_spe/pa_enable/ Error: Access to performance monitoring and observability operations is limited. Consider adjusting /proc/sys/kernel/perf_event_paranoid setting ... Suggested-by: Al Grant <[email protected]> Signed-off-by: James Clark <[email protected]> Link: https://lore.kernel.org/r/[email protected] Link: https://lore.kernel.org/all/[email protected]/ Signed-off-by: Will Deacon <[email protected]>
2024-08-30mfd: ds1wm: Remove remaining header fileWilken Gottwalt1-29/+0
The driver was removed after kernel 6.2 rendering the header file unused. Signed-off-by: Wilken Gottwalt <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lee Jones <[email protected]>
2024-08-30mfd: 88pm80x: Constify read-only regmap structsJavier Carrasco1-1/+1
`pm800_irq`, `pm805_irq` and `pm805_irq_chip` are not modified and can be declared as const to move their data to a read-only section. In order to keep the const modifier for the regmap_irq_chip structures, the pointer used to reference them must be converted to const as well. Signed-off-by: Javier Carrasco <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lee Jones <[email protected]>
2024-08-30Merge branches 'ib-mfd-for-iio-power-6.12' and 'ib-mfd-gpio-pwm-6.12' into ↵Lee Jones1-0/+126
ibs-for-mfd-merged
2024-08-30writeback: Refine the show_inode_state() macro definitionJulian Sun1-1/+9
Currently, the show_inode_state() macro only prints part of the state of inode->i_state. Let’s improve it to display more of its state. Signed-off-by: Julian Sun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Jan Kara <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
2024-08-30inode: make i_state a u32Christian Brauner1-1/+2
Now that we use the wait var event mechanism make i_state a u32 and free up 4 bytes. This means we currently have two 4 byte holes in struct inode which we can pack. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Josef Bacik <[email protected]> Reviewed-by: Jan Kara <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
2024-08-30inode: port __I_NEW to var eventChristian Brauner1-1/+2
Port the __I_NEW mechanism to use the new var event mechanism. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Josef Bacik <[email protected]> Reviewed-by: Jan Kara <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
2024-08-30fs: reorder i_state bitsChristian Brauner1-17/+21
so that we can use the first bits to derive unique addresses from i_state. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Josef Bacik <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Reviewed-by: Jan Kara <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
2024-08-30fs: add i_state helpersChristian Brauner1-0/+15
The i_state member is an unsigned long so that it can be used with the wait bit infrastructure which expects unsigned long. This wastes 4 bytes which we're unlikely to ever use. Switch to using the var event wait mechanism using the address of the bit. Thanks to Linus for the address idea. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Josef Bacik <[email protected]> Reviewed-by: Jan Kara <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
2024-08-30fs: s/__u32/u32/ for s_fsnotify_maskChristian Brauner1-1/+1
The underscore variants are for uapi whereas the non-underscore variants are for in-kernel consumers. Link: https://lore.kernel.org/r/20240822-anwerben-nutzung-1cd6c82a565f@brauner Reviewed-by: Jan Kara <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
2024-08-30fs: remove unused path_put_init()Christian Brauner1-6/+0
This helper has been unused for a while now. Link: https://lore.kernel.org/r/20240822-bewuchs-werktag-46672b3c0606@brauner Reviewed-by: Jan Kara <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
2024-08-30inode: remove __I_DIO_WAKEUPChristian Brauner1-5/+3
Afaict, we can just rely on inode->i_dio_count for waiting instead of this awkward indirection through __I_DIO_WAKEUP. This survives LTP dio and xfstests dio tests. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Josef Bacik <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
2024-08-30autofs: add per dentry expire timeoutIan Kent1-1/+1
Add ability to set per-dentry mount expire timeout to autofs. There are two fairly well known automounter map formats, the autofs format and the amd format (more or less System V and Berkley). Some time ago Linux autofs added an amd map format parser that implemented a fair amount of the amd functionality. This was done within the autofs infrastructure and some functionality wasn't implemented because it either didn't make sense or required extra kernel changes. The idea was to restrict changes to be within the existing autofs functionality as much as possible and leave changes with a wider scope to be considered later. One of these changes is implementing the amd options: 1) "unmount", expire this mount according to a timeout (same as the current autofs default). 2) "nounmount", don't expire this mount (same as setting the autofs timeout to 0 except only for this specific mount) . 3) "utimeout=<seconds>", expire this mount using the specified timeout (again same as setting the autofs timeout but only for this mount). To implement these options per-dentry expire timeouts need to be implemented for autofs indirect mounts. This is because all map keys (mounts) for autofs indirect mounts use an expire timeout stored in the autofs mount super block info. structure and all indirect mounts use the same expire timeout. Now I have a request to add the "nounmount" option so I need to add the per-dentry expire handling to the kernel implementation to do this. The implementation uses the trailing path component to identify the mount (and is also used as the autofs map key) which is passed in the autofs_dev_ioctl structure path field. The expire timeout is passed in autofs_dev_ioctl timeout field (well, of the timeout union). If the passed in timeout is equal to -1 the per-dentry timeout and flag are cleared providing for the "unmount" option. If the timeout is greater than or equal to 0 the timeout is set to the value and the flag is also set. If the dentry timeout is 0 the dentry will not expire by timeout which enables the implementation of the "nounmount" option for the specific mount. When the dentry timeout is greater than zero it allows for the implementation of the "utimeout=<seconds>" option. Signed-off-by: Ian Kent <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Christian Brauner <[email protected]>
2024-08-30fs: move FMODE_UNSIGNED_OFFSET to fop_flagsChristian Brauner4-4/+8
This is another flag that is statically set and doesn't need to use up an FMODE_* bit. Move it to ->fop_flags and free up another FMODE_* bit. (1) mem_open() used from proc_mem_operations (2) adi_open() used from adi_fops (3) drm_open_helper(): (3.1) accel_open() used from DRM_ACCEL_FOPS (3.2) drm_open() used from (3.2.1) amdgpu_driver_kms_fops (3.2.2) psb_gem_fops (3.2.3) i915_driver_fops (3.2.4) nouveau_driver_fops (3.2.5) panthor_drm_driver_fops (3.2.6) radeon_driver_kms_fops (3.2.7) tegra_drm_fops (3.2.8) vmwgfx_driver_fops (3.2.9) xe_driver_fops (3.2.10) DRM_GEM_FOPS (3.2.11) DEFINE_DRM_GEM_DMA_FOPS (4) struct memdev sets fmode flags based on type of device opened. For devices using struct mem_fops unsigned offset is used. Mark all these file operations as FOP_UNSIGNED_OFFSET and add asserts into the open helper to ensure that the flag is always set. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Jeff Layton <[email protected]> Reviewed-by: Jan Kara <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
2024-08-30vfs: only read fops once in fops_get/putMateusz Guzik1-4/+11
In do_dentry_open() the usage is: f->f_op = fops_get(inode->i_fop); In generated asm the compiler emits 2 reads from inode->i_fop instead of just one. This popped up due to false-sharing where loads from that offset end up bouncing a cacheline during parallel open. While this is going to be fixed, the spurious load does not need to be there. This makes do_dentry_open() go down from 1177 to 1154 bytes. fops_put() is patched to maintain some consistency. No functional changes. Reviewed-by: Josef Bacik <[email protected]> Signed-off-by: Mateusz Guzik <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Christian Brauner <[email protected]>
2024-08-30vfs: dodge smp_mb in break_lease and break_deleg in the common caseMateusz Guzik1-2/+12
These inlines show up in the fast path (e.g., in do_dentry_open()) and induce said full barrier regarding i_flctx access when in most cases the pointer is NULL. The pointer can be safely checked before issuing the barrier, dodging it in most cases as a result. It is plausible the consume fence would be sufficient, but I don't want to go audit all callers regarding what they before calling here. Signed-off-by: Mateusz Guzik <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
2024-08-30Merge tag 'drm-intel-next-2024-08-29' of ↵Dave Airlie2-4/+32
https://gitlab.freedesktop.org/drm/i915/kernel into drm-next Cross-driver (xe-core) Changes: - Require BMG scanout buffers to be 64k physically aligned (Maarten) Core (drm) Changes: - Introducing Xe2 ccs modifiers for integrated and discrete graphics (Juha-Pekka) Driver Changes: - General cleanup and more work moving towards intel_display isolation (Jani) - New display workaround (Suraj) - Use correct cp_irq_count on HDCP (Suraj) - eDP PSR fix when CRC is enabled (Jouni) - Fix DP MST state after a sink reset (Imre) - Fix Arrow Lake GSC firmware version (John) - Use chained DSBs for LUT programming (Ville) Signed-off-by: Dave Airlie <[email protected]> From: Rodrigo Vivi <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-08-30Merge tag 'drm-xe-next-2024-08-28' of ↵Dave Airlie1-1/+53
https://gitlab.freedesktop.org/drm/xe/kernel into drm-next UAPI Changes: - Fix OA format masks which were breaking build with gcc-5 Cross-subsystem Changes: Driver Changes: - Use dma_fence_chain_free in chain fence unused as a sync (Matthew Brost) - Refactor hw engine lookup and mmio access to be used in more places (Dominik, Matt Auld, Mika Kuoppala) - Enable priority mem read for Xe2 and later (Pallavi Mishra) - Fix PL1 disable flow in xe_hwmon_power_max_write (Karthik) - Fix refcount and speedup devcoredump (Matthew Brost) - Add performance tuning changes to Xe2 (Akshata, Shekhar) - Fix OA sysfs entry (Ashutosh) - Add first GuC firmware support for BMG (Julia) - Bump minimum GuC firmware for platforms under force_probe to match LNL and BMG (Julia) - Fix access check on user fence creation (Nirmoy) - Add/document workarounds for Xe2 (Julia, Daniele, John, Tejas) - Document workaround and use proper WA infra (Matt Roper) - Fix VF configuration on media GT (Michal Wajdeczko) - Fix VM dma-resv lock (Matthew Brost) - Allow suspend/resume exec queue backend op to be called multiple times (Matthew Brost) - Add GT stats to debugfs (Nirmoy) - Add hwconfig to debugfs (Matt Roper) - Compile out all debugfs code with ONFIG_DEUBG_FS=n (Lucas) - Remove dead kunit code (Jani Nikula) - Refactor drvdata storing to help display (Jani Nikula) - Cleanup unsused xe parameter in pte handling (Himal) - Rename s/enable_display/probe_display/ for clarity (Lucas) - Fix missing MCR annotation in couple of registers (Tejas) - Fix DGFX display suspend/resume (Maarten) - Prepare exec_queue_kill for PXP handling (Daniele) - Fix devm/drmm issues (Daniele, Matthew Brost) - Fix tile and ggtt fini sequences (Matthew Brost) - Fix crashes when probing without firmware in place (Daniele, Matthew Brost) - Use xe_managed for kernel BOs (Daniele, Matthew Brost) - Future-proof dss_per_group calculation by using hwconfig (Matt Roper) - Use reserved copy engine for user binds on faulting devices (Matthew Brost) - Allow mixing dma-fence jobs and long-running faulting jobs (Francois) - Cleanup redundant arg when creating use BO (Nirmoy) - Prevent UAF around preempt fence (Auld) - Fix display suspend/resume (Maarten) - Use vma_pages() helper (Thorsten) - Calculate pagefault queue size (Stuart, Matthew Auld) - Fix missing pagefault wq destroy (Stuart) - Fix lifetime handling of HW fence ctx (Matthew Brost) - Fix order destroy order for jobs (Matthew Brost) - Fix TLB invalidation for media GT (Matthew Brost) - Document GGTT (Rodrigo Vivi) - Refactor GGTT layering and fix runtime outer protection (Rodrigo Vivi) - Handle HPD polling on display pm runtime suspend/resume (Imre, Vinod) - Drop unrequired NULL checks (Apoorva, Himal) - Use separate rpm lockdep map for non-d3cold-capable devices (Thomas Hellström) - Support "nomodeset" kernel command-line option (Thomas Zimmermann) - Drop force_probe requirement for LNL and BMG (Lucas, Balasubramani) Signed-off-by: Dave Airlie <[email protected]> From: Lucas De Marchi <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/wd42jsh4i3q5zlrmi2cljejohdsrqc6hvtxf76lbxsp3ibrgmz@y54fa7wwxgsd
2024-08-30Merge tag 'drm-misc-next-2024-08-29' of ↵Dave Airlie6-24/+30
https://gitlab.freedesktop.org/drm/misc/kernel into drm-next drm-misc-next for v6.12: UAPI Changes: devfs: - support device numbers up to MINORBITS limit Core Changes: ci: - increase job timeout devfs: - use XArray for minor ids displayport: - mst: GUID improvements docs: - add fixes and cleanups panic: - optionally display QR code Driver Changes: amdgpu: - faster vblank disabling - GUID improvements gm12u320 - convert to struct drm_edid host1x: - fix syncpoint IRQ during resume - use iommu_paging_domain_alloc() imx: - ipuv3: convert to struct drm_edid omapdrm: - improve error handling panel: - add support for BOE TV101WUM-LL2 plus DT bindings - novatek-nt35950: improve error handling - nv3051d: improve error handling - panel-edp: add support for BOE NE140WUM-N6G; revert support for SDC ATNA45AF01 - visionox-vtdr6130: improve error handling; use devm_regulator_bulk_get_const() renesas: - rz-du: add support for RZ/G2UL plus DT bindings sti: - convert to struct drm_edid tegra: - gr3d: improve PM domain handling - convert to struct drm_edid Signed-off-by: Dave Airlie <[email protected]> From: Thomas Zimmermann <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-08-30Merge tag 'drm-fixes-2024-08-30' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds3-10/+9
Pull drm fixes from Dave Airlie: "Another week, another set of GPU fixes. amdgpu and vmwgfx leading the charge, then i915 and xe changes along with v3d and some other bits. The TTM revert is due to some stuttering graphical apps probably due to longer stalls while prefaulting. Seems pretty much where I'd expect things, ttm: - revert prefault change, caused stutters aperture: - handle non-VGA devices bettter amdgpu: - SWSMU gaming stability fix - SMU 13.0.7 fix - SWSMU documentation alignment fix - SMU 14.0.x fixes - GC 12.x fix - Display fix - IP discovery fix - SMU 13.0.6 fix i915: - Fix #11195: The external display connect via USB type-C dock stays blank after re-connect the dock - Make DSI backlight work for 2G version of Lenovo Yoga Tab 3 X90F - Move ARL GuC firmware to correct version xe: - Invalidate media_gt TLBs - Fix HWMON i1 power setup write command vmwgfx: - prevent unmapping active read buffers - fix prime with external buffers - disable coherent dumb buffers without 3d v3d: - disable preemption while updating GPU stats" * tag 'drm-fixes-2024-08-30' of https://gitlab.freedesktop.org/drm/kernel: drm/xe/hwmon: Fix WRITE_I1 param from u32 to u16 drm/v3d: Disable preemption while updating GPU stats drm/amd/pm: Drop unsupported features on smu v14_0_2 drm/amd/pm: Add support for new P2S table revision drm/amdgpu: support for gc_info table v1.3 drm/amd/display: avoid using null object of framebuffer drm/amdgpu/gfx12: set UNORD_DISPATCH in compute MQDs drm/amd/pm: update message interface for smu v14.0.2/3 drm/amdgpu/swsmu: always force a state reprogram on init drm/amdgpu/smu13.0.7: print index for profiles drm/amdgpu: align pp_power_profile_mode with kernel docs drm/i915/dp_mst: Fix MST state after a sink reset drm/xe: Invalidate media_gt TLBs drm/i915: ARL requires a newer GSC firmware drm/i915/dsi: Make Lenovo Yoga Tab 3 X90F DMI match less strict video/aperture: optionally match the device in sysfb_disable() drm/vmwgfx: Disable coherent dumb buffers without 3d drm/vmwgfx: Fix prime with external buffers drm/vmwgfx: Prevent unmapping active read buffers Revert "drm/ttm: increase ttm pre-fault value to PMD size"
2024-08-30Merge tag 'drm-misc-fixes-2024-08-29' of ↵Dave Airlie2-6/+2
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes A revert for a previous TTM commit causing stuttering, 3 fixes for vmwgfx related to buffer operations, a fix for video/aperture with non-VGA primary devices, and a preemption status fix for v3d Signed-off-by: Dave Airlie <[email protected]> From: Maxime Ripard <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/20240829-efficient-swift-from-lemuria-f60c05@houat
2024-08-29bpf: Add gen_epilogue to bpf_verifier_opsMartin KaFai Lau3-0/+13
This patch adds a .gen_epilogue to the bpf_verifier_ops. It is similar to the existing .gen_prologue. Instead of allowing a subsystem to run code at the beginning of a bpf prog, it allows the subsystem to run code just before the bpf prog exit. One of the use case is to allow the upcoming bpf qdisc to ensure that the skb->dev is the same as the qdisc->dev_queue->dev. The bpf qdisc struct_ops implementation could either fix it up or drop the skb. Another use case could be in bpf_tcp_ca.c to enforce snd_cwnd has sane value (e.g. non zero). The epilogue can do the useful thing (like checking skb->dev) if it can access the bpf prog's ctx. Unlike prologue, r1 may not hold the ctx pointer. This patch saves the r1 in the stack if the .gen_epilogue has returned some instructions in the "epilogue_buf". The existing .gen_prologue is done in convert_ctx_accesses(). The new .gen_epilogue is done in the convert_ctx_accesses() also. When it sees the (BPF_JMP | BPF_EXIT) instruction, it will be patched with the earlier generated "epilogue_buf". The epilogue patching is only done for the main prog. Only one epilogue will be patched to the main program. When the bpf prog has multiple BPF_EXIT instructions, a BPF_JA is used to goto the earlier patched epilogue. Majority of the archs support (BPF_JMP32 | BPF_JA): x86, arm, s390, risv64, loongarch, powerpc and arc. This patch keeps it simple and always use (BPF_JMP32 | BPF_JA). A new macro BPF_JMP32_A is added to generate the (BPF_JMP32 | BPF_JA) insn. Acked-by: Eduard Zingerman <[email protected]> Signed-off-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-08-29bpf: Move insn_buf[16] to bpf_verifier_envMartin KaFai Lau1-0/+3
This patch moves the 'struct bpf_insn insn_buf[16]' stack usage to the bpf_verifier_env. A '#define INSN_BUF_SIZE 16' is also added to replace the ARRAY_SIZE(insn_buf) usages. Both convert_ctx_accesses() and do_misc_fixup() are changed to use the env->insn_buf. It is a refactoring work for adding the epilogue_buf[16] in a later patch. With this patch, the stack size usage decreased. Before: ./kernel/bpf/verifier.c:22133:5: warning: stack frame size (2584) After: ./kernel/bpf/verifier.c:22184:5: warning: stack frame size (2264) Reviewed-by: Eduard Zingerman <[email protected]> Signed-off-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-08-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski11-19/+63
Cross-merge networking fixes after downstream PR. Conflicts: drivers/net/ethernet/faraday/ftgmac100.c 4186c8d9e6af ("net: ftgmac100: Ensure tx descriptor updates are visible") e24a6c874601 ("net: ftgmac100: Get link speed and duplex for NC-SI") https://lore.kernel.org/[email protected] net/ipv4/tcp.c bac76cf89816 ("tcp: fix forever orphan socket caused by tcp_abort") edefba66d929 ("tcp: rstreason: introduce SK_RST_REASON_TCP_STATE for active reset") https://lore.kernel.org/[email protected] No adjacent changes. Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-08-30Merge tag 'net-6.11-rc6' of ↵Linus Torvalds4-8/+11
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from bluetooth, wireless and netfilter. No known outstanding regressions. Current release - regressions: - wifi: iwlwifi: fix hibernation - eth: ionic: prevent tx_timeout due to frequent doorbell ringing Previous releases - regressions: - sched: fix sch_fq incorrect behavior for small weights - wifi: - iwlwifi: take the mutex before running link selection - wfx: repair open network AP mode - netfilter: restore IP sanity checks for netdev/egress - tcp: fix forever orphan socket caused by tcp_abort - mptcp: close subflow when receiving TCP+FIN - bluetooth: fix random crash seen while removing btnxpuart driver Previous releases - always broken: - mptcp: more fixes for the in-kernel PM - eth: bonding: change ipsec_lock from spin lock to mutex - eth: mana: fix race of mana_hwc_post_rx_wqe and new hwc response Misc: - documentation: drop special comment style for net code" * tag 'net-6.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (57 commits) nfc: pn533: Add poll mod list filling check mailmap: update entry for Sriram Yagnaraman selftests: mptcp: join: check re-re-adding ID 0 signal mptcp: pm: ADD_ADDR 0 is not a new address selftests: mptcp: join: validate event numbers mptcp: avoid duplicated SUB_CLOSED events selftests: mptcp: join: check re-re-adding ID 0 endp mptcp: pm: fix ID 0 endp usage after multiple re-creations mptcp: pm: do not remove already closed subflows selftests: mptcp: join: no extra msg if no counter selftests: mptcp: join: check re-adding init endp with != id mptcp: pm: reset MPC endp ID when re-added mptcp: pm: skip connecting to already established sf mptcp: pm: send ACK on an active subflow selftests: mptcp: join: check removing ID 0 endpoint mptcp: pm: fix RM_ADDR ID for the initial subflow mptcp: pm: reuse ID 0 after delete and re-add net: busy-poll: use ktime_get_ns() instead of local_clock() sctp: fix association labeling in the duplicate COOKIE-ECHO case mptcp: pr_debug: add missing \n at the end ...
2024-08-29ACPICA: Setup for ACPICA release 20240827Saket Dumbre1-1/+1
ACPICA commit 86c762afe5bf915e8101a0455513368e2a60dd80 Link: https://github.com/acpica/acpica/commit/86c762af Signed-off-by: Saket Dumbre <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2024-08-29ACPICA: HMAT: Add extended linear address mode to MSCISDave Jiang1-1/+4
ACPICA commit aaa08569b81aa4d9ff59f91f00e589e98d499e6c Redefine the 2 reserved bytes at offset 28 of Memory Side Cache Information Structure as "Address Mode" and add defines of the new value. * 0 - Reserved (Unkown Address Mode) * 1 - Extended-linear (N direct-map aliases linearly mapped) * 2..65535 - Reserved (Unknown Address Mode) Link: https://github.com/acpica/acpica/commit/aaa08569 Signed-off-by: Dave Jiang <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2024-08-29ACPICA: Add support for Windows 11 22H2 _OSI stringArmin Wolf1-0/+1
ACPICA commit f56218c4e4dc1d1f699662d0726ad9e7a0d58548 See: https://github.com/microsoft_docs/windows-driver-docs/commit/be9d1c211adf8fabe5a43de71c034b1b6d7372de Link: https://github.com/acpica/acpica/commit/f56218c4 Signed-off-by: Armin Wolf <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2024-08-29dmaengine: idxd: Add new DSA and IAA device IDs for Diamond Rapids platformFenghua Yu1-0/+2
A new DSA device ID, 0x1212, and a new IAA device ID, 0x1216, are introduced for Diamond Rapids platform. Add the device IDs to the IDXD driver. The name "IAA" is used in new code instead of the old name "IAX". However, the "IAX" naming (e.g., IDXD_TYPE_IAX) is retained for legacy code compatibility. Signed-off-by: Fenghua Yu <[email protected]> Reviewed-by: Dave Jiang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Vinod Koul <[email protected]>
2024-08-29dmaengine: idxd: Add a new DSA device ID for Granite Rapids-D platformFenghua Yu1-0/+1
A new DSA device ID, 0x11fb, is introduced for the Granite Rapids-D platform. Add the device ID to the IDXD driver. Since a potential security issue has been fixed on the new device, it's secure to assign the device to virtual machines, and therefore, the new device ID will not be added to the VFIO denylist. Additionally, the new device ID may be useful in identifying and addressing any other potential issues with this specific device in the future. The same is also applied to any other new DSA/IAA devices with new device IDs. Signed-off-by: Fenghua Yu <[email protected]> Reviewed-by: Dave Jiang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Vinod Koul <[email protected]>
2024-08-29ACPICA: Implement ACPI_WARNING_ONCE and ACPI_ERROR_ONCEVasily Khoruzhick1-0/+5
ACPICA commit 2ad4e6e7c4118f4cdfcad321c930b836cec77406 In some cases it is not practical nor useful to nag user about some firmware errors that they cannot fix. Add a macro that will print a warning or error only once to be used in these cases. Link: https://github.com/acpica/acpica/commit/2ad4e6e7 Signed-off-by: Vasily Khoruzhick <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2024-08-29ACPICA: MPAM: Correct the typo in struct acpi_mpam_msc_node memberPunit Agrawal1-1/+1
ACPICA commit 3da3f7d776d17e9bfbb15de88317de8d7397ce38 A member of the struct acpi_mpam_msc_node that represents a Memory System Controller node structure - num_resource_nodes has a typo. Fix the typo No functional change. Link: https://github.com/acpica/acpica/commit/3da3f7d7 Signed-off-by: Punit Agrawal <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2024-08-29ACPICA: Headers: Add RISC-V SBI Subtype to DBG2Sia Jee Heng1-0/+1
ACPICA commit 6f4c900bcf9ca065129353f17a83773aa58095aa Include the RISC-V SBI debugging subtype as documented in DBG2 dated April 10, 2023 [1]. Link: https://learn.microsoft.com/en-us/windows-hardware/drivers/bringup/acpi-debug-port-table # [1] Link: https://github.com/acpica/acpica/commit/6f4c900b Signed-off-by: Sia Jee Heng <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2024-08-29ACPICA: SPCR: Update the SPCR table to version 4Sia Jee Heng1-4/+8
ACPICA commit 1eeff52124a45d5cd887ba5687bbad0116e4d211 The Microsoft Serial Port Console Redirection (SPCR) specification revision 1.09 comprises additional fields [1]. The newly added fields are: - RISC-V SBI - Precise Baud Rate - namespace_string_length - namespace_string_offset - namespace_string Additionaly, this code will support up to SPCR revision 1.10, as it includes only minor wording changes. Link: https://learn.microsoft.com/en-us/windows-hardware/drivers/serports/serial-port-console-redirection-table # [1] Link: https://github.com/acpica/acpica/commit/1eeff521 Signed-off-by: Sia Jee Heng <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2024-08-29ACPICA: Complete CXL 3.0 CXIMS structuresZhang Rui1-0/+4
ACPICA commit eb2a2ff303416fb3f6c425d519dbcd6988dbd91f Commit 2d8dc0383d3c9 ("Add CXL 3.0 structures (CXIMS & RDPAS) to the CEDT table") introduces basic support for CXL XOR Interleave Math Structure (CXIMS). Complete the CXIMS structures. No functional change. Link: https://github.com/acpica/acpica/commit/eb2a2ff3 Signed-off-by: Zhang Rui <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2024-08-29ACPICA: haiku: Fix invalid value used for semaphoresAdrien Destugues1-0/+6
ACPICA commit 49fe4f25483feec2f685b204ef19e28d92979e95 In Haiku, semaphores are represented by integers, not pointers. So, we can't use NULL as the invalid/destroyed value, the correct value is -1. Introduce a platform overridable define to allow this. Fixes #162 (which was closed after coming to the conclusion that this should be done, but the change was never done). Link: https://github.com/acpica/acpica/commit/49fe4f25 Signed-off-by: Rafael J. Wysocki <[email protected]>
2024-08-29io_uring/kbuf: add support for incremental buffer consumptionJens Axboe1-0/+18
By default, any recv/read operation that uses provided buffers will consume at least 1 buffer fully (and maybe more, in case of bundles). This adds support for incremental consumption, meaning that an application may add large buffers, and each read/recv will just consume the part of the buffer that it needs. For example, let's say an application registers 1MB buffers in a provided buffer ring, for streaming receives. If it gets a short recv, then the full 1MB buffer will be consumed and passed back to the application. With incremental consumption, only the part that was actually used is consumed, and the buffer remains the current one. This means that both the application and the kernel needs to keep track of what the current receive point is. Each recv will still pass back a buffer ID and the size consumed, the only difference is that before the next receive would always be the next buffer in the ring. Now the same buffer ID may return multiple receives, each at an offset into that buffer from where the previous receive left off. Example: Application registers a provided buffer ring, and adds two 32K buffers to the ring. Buffer1 address: 0x1000000 (buffer ID 0) Buffer2 address: 0x2000000 (buffer ID 1) A recv completion is received with the following values: cqe->res 0x1000 (4k bytes received) cqe->flags 0x11 (CQE_F_BUFFER|CQE_F_BUF_MORE set, buffer ID 0) and the application now knows that 4096b of data is available at 0x1000000, the start of that buffer, and that more data from this buffer will be coming. Now the next receive comes in: cqe->res 0x2010 (8k bytes received) cqe->flags 0x11 (CQE_F_BUFFER|CQE_F_BUF_MORE set, buffer ID 0) which tells the application that 8k is available where the last completion left off, at 0x1001000. Next completion is: cqe->res 0x5000 (20k bytes received) cqe->flags 0x1 (CQE_F_BUFFER set, buffer ID 0) and the application now knows that 20k of data is available at 0x1003000, which is where the previous receive ended. CQE_F_BUF_MORE isn't set, as no more data is available in this buffer ID. The next completion is then: cqe->res 0x1000 (4k bytes received) cqe->flags 0x10001 (CQE_F_BUFFER|CQE_F_BUF_MORE set, buffer ID 1) which tells the application that buffer ID 1 is now the current one, hence there's 4k of valid data at 0x2000000. 0x2001000 will be the next receive point for this buffer ID. When a buffer will be reused by future CQE completions, IORING_CQE_BUF_MORE will be set in cqe->flags. This tells the application that the kernel isn't done with the buffer yet, and that it should expect more completions for this buffer ID. Will only be set by provided buffer rings setup with IOU_PBUF_RING INC, as that's the only type of buffer that will see multiple consecutive completions for the same buffer ID. For any other provided buffer type, any completion that passes back a buffer to the application is final. Once a buffer has been fully consumed, the buffer ring head is incremented and the next receive will indicate the next buffer ID in the CQE cflags. On the send side, the application can manage how much data is sent from an existing buffer by setting sqe->len to the desired send length. An application can request incremental consumption by setting IOU_PBUF_RING_INC in the provided buffer ring registration. Outside of that, any provided buffer ring setup and buffer additions is done like before, no changes there. The only change is in how an application may see multiple completions for the same buffer ID, hence needing to know where the next receive will happen. Note that like existing provided buffer rings, this should not be used with IOSQE_ASYNC, as both really require the ring to remain locked over the duration of the buffer selection and the operation completion. It will consume a buffer otherwise regardless of the size of the IO done. Signed-off-by: Jens Axboe <[email protected]>
2024-08-29fs: use kmem_cache_create_rcu()Christian Brauner1-0/+2
Switch to the new kmem_cache_create_rcu() helper which allows us to use a custom free pointer offset avoiding the need to have an external free pointer which would grow struct file behind our backs. Link: https://lore.kernel.org/r/[email protected] Acked-by: Mike Rapoport (Microsoft) <[email protected]> Reviewed-by: Vlastimil Babka <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
2024-08-29mm: add kmem_cache_create_rcu()Christian Brauner1-0/+9
When a kmem cache is created with SLAB_TYPESAFE_BY_RCU the free pointer must be located outside of the object because we don't know what part of the memory can safely be overwritten as it may be needed to prevent object recycling. That has the consequence that SLAB_TYPESAFE_BY_RCU may end up adding a new cacheline. This is the case for e.g., struct file. After having it shrunk down by 40 bytes and having it fit in three cachelines we still have SLAB_TYPESAFE_BY_RCU adding a fourth cacheline because it needs to accommodate the free pointer. Add a new kmem_cache_create_rcu() function that allows the caller to specify an offset where the free pointer is supposed to be placed. Link: https://lore.kernel.org/r/[email protected] Acked-by: Mike Rapoport (Microsoft) <[email protected]> Reviewed-by: Vlastimil Babka <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
2024-08-29fs: pack struct fileChristian Brauner1-40/+51
Now that we shrunk struct file to 192 bytes aka 3 cachelines reorder struct file to not leave any holes or have members cross cachelines. Add a short comment to each of the fields and mark the cachelines. It's possible that we may have to tweak this based on profiling in the future. So far I had Jens test this comparing io_uring with non-fixed and fixed files and it improved performance. The layout is a combination of Jens' and my changes. Link: https: //lore.kernel.org/r/20240824-peinigen-hocken-7384b977c643@brauner Signed-off-by: Christian Brauner <[email protected]>
2024-08-29ACPICA: Allow setting waking vector on reduced hardware platformsJiaqing Zhao1-4/+4
Allow setting waking vector in FACS table on reduced hardware platforms to support S3 wakeup. Link: https://github.com/acpica/acpica/commit/ee53ed6b5452612bb44af542b68d605f8b2b1104 Signed-off-by: Jiaqing Zhao <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Rafael J. Wysocki <[email protected]>
2024-08-29ACPICA: Detect FACS in reduced hardware buildJiaqing Zhao1-1/+0
According to Section 5.2.10 of ACPI Specification, FACS is optional in reduced hardware model. Enable the detection for "Hardware-reduced ACPI support only" build (CONFIG_ACPI_REDUCED_HARDWARE_ONLY=y) also. Link: https://github.com/acpica/acpica/commit/ee53ed6b5452612bb44af542b68d605f8b2b1104 Signed-off-by: Jiaqing Zhao <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Rafael J. Wysocki <[email protected]>