aboutsummaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)AuthorFilesLines
2023-10-31drm/nouveau/sec2/tu102-: prepare for GSP-RMBen Skeggs5-2/+73
- add (initial) R535 implementation of SEC2, needed for boot Signed-off-by: Ben Skeggs <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-31drm/nouveau/nvenc/tu102-: prepare for GSP-RMBen Skeggs7-7/+46
- (temporarily) disable if GSP-RM detected, will be added later - provide empty class list for non-GSP paths - split tu102 from gm107, it will provide host classes later Signed-off-by: Ben Skeggs <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-31drm/nouveau/nvdec/tu102-: prepare for GSP-RMBen Skeggs8-12/+55
- (temporarily) disable if GSP-RM detected, will be added later - provide empty class list for non-GSP paths - split tu102- from gm107, they will provide host classes later - fixup HW engine instance masks Signed-off-by: Ben Skeggs <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-31drm/nouveau/gr/tu102-: prepare for GSP-RMBen Skeggs3-1/+14
- (temporarily) disable if GSP-RM detected, will be added later - make init() optional Signed-off-by: Ben Skeggs <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-31drm/nouveau/fifo/tu102-: prepare for GSP-RMBen Skeggs10-22/+67
- (temporarily) disable if GSP-RM detected, will be added later - add dtor() so GSP-RM paths can cleanup properly - add alternate engine context mapping interface for RM engines - add alternate chid interfaces to handle RM USERD oddities Signed-off-by: Ben Skeggs <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-31drm/nouveau/disp/tu102-: prepare for GSP-RMBen Skeggs7-9/+21
- (temporarily) disable if GSP-RM detected, will be added later - pass "suspend" flag down to chipset-specific DISP code Signed-off-by: Ben Skeggs <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-31drm/nouveau/ce/tu102-: prepare for GSP-RMBen Skeggs3-0/+14
- (temporarily) disable if GSP-RM detected, will be added later Signed-off-by: Ben Skeggs <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-31drm/nouveau/vfn/tu102-: prepare for GSP-RMBen Skeggs5-1/+68
- add R535 implementation of VFN, minus interrupt table Signed-off-by: Ben Skeggs <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-31drm/nouveau/top/tu102-: prepare for GSP-RMBen Skeggs2-0/+10
- disable TOP completely when GSP-RM detected Signed-off-by: Ben Skeggs <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-31drm/nouveau/therm/tu102-: prepare for GSP-RMBen Skeggs1-0/+5
- disable THERM completely when GSP-RM detected Signed-off-by: Ben Skeggs <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-31drm/nouveau/privring/tu102-: prepare for GSP-RMBen Skeggs1-0/+5
- disable PRIVRING completely when GSP-RM detected Signed-off-by: Ben Skeggs <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-31drm/nouveau/pmu/tu102-: prepare for GSP-RMBen Skeggs1-0/+5
- disable PMU completely when GSP-RM detected Signed-off-by: Ben Skeggs <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-31drm/nouveau/mmu/tu102-: prepare for GSP-RMBen Skeggs1-0/+4
- (temporarily) disable if GSP-RM detected, will be added later Signed-off-by: Ben Skeggs <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-31drm/nouveau/mc/tu102-: prepare for GSP-RMBen Skeggs2-0/+10
- disable MC completely when GSP-RM detected Signed-off-by: Ben Skeggs <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-31drm/nouveau/ltc/tu102-: prepare for GSP-RMBen Skeggs2-0/+10
- disable LTC completely when GSP-RM detected Signed-off-by: Ben Skeggs <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-31drm/nouveau/imem/tu102-: prepare for GSP-RMBen Skeggs6-30/+96
- move suspend/resume paths to HW-specific code - allow (future) RM paths to be based on nv50_instmem Signed-off-by: Ben Skeggs <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-31drm/nouveau/i2c/tu102-: prepare for GSP-RMBen Skeggs1-0/+5
- disable I2C completely when GSP-RM detected Signed-off-by: Ben Skeggs <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-31drm/nouveau/gpio/tu102-: prepare for GSP-RMBen Skeggs2-0/+10
- disable GPIO completely when GSP-RM detected Signed-off-by: Ben Skeggs <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-31drm/nouveau/fuse/tu102-: prepare for GSP-RMBen Skeggs1-0/+5
- disable FUSE completely when GSP-RM detected Signed-off-by: Ben Skeggs <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-31drm/nouveau/fb/tu102-: prepare for GSP-RMBen Skeggs8-8/+113
- add (initial) R535 implementation of FB, need VRAM size etc for boot - expose a way to "wrap" vram at a specific address/size as a standard nvkm_memory allocation, which will be used to write PTEs etc for RM- defined memory regions Signed-off-by: Ben Skeggs <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-31drm/nouveau/fault/tu102-: prepare for GSP-RMBen Skeggs1-1/+7
- disable FAULT completely when GSP-RM detected - SVM support will be disabled when running on RM because of this Signed-off-by: Ben Skeggs <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-31drm/nouveau/devinit/tu102-: prepare for GSP-RMBen Skeggs5-0/+75
- add R535 implementation of DEVINIT, we need some of this for boot - add display disable fuse for ga100- Signed-off-by: Ben Skeggs <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-31drm/nouveau/bus/tu102-: prepare for GSP-RMBen Skeggs1-0/+5
- disable BUS completely when GSP-RM detected Signed-off-by: Ben Skeggs <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-31drm/nouveau/bar/tu102-: prepare for GSP-RMBen Skeggs2-1/+13
- (temporarily) disable if GSP-RM detected, will be added later - move BAR2 teardown from dtor(), it doesn't belong there Signed-off-by: Ben Skeggs <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-31drm/nouveau/acr/tu102-: prepare for GSP-RMBen Skeggs2-0/+7
- disable ACR completely when GSP-RM detected Signed-off-by: Ben Skeggs <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-31drm/nouveau/gsp: prepare for GSP-RMBen Skeggs11-24/+182
- move TOP after GSP, so we can disable TOP if GSP is in use - provide plumbing to support falcon-only and GSP-RM paths - provide a method for subdevs to detect GSP-RM paths - split tu102/tu116/ga100 paths from gv100, which can't support GSP-RM Signed-off-by: Ben Skeggs <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-31drm/nouveau/nvkm: bump maximum number of NVJPGBen Skeggs3-3/+3
RM (and GH100) support 8 NVJPG instances. Signed-off-by: Ben Skeggs <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-31drm/nouveau/nvkm: bump maximum number of NVDECBen Skeggs1-1/+1
RM (and GH100) support 8 NVDEC instances. Signed-off-by: Ben Skeggs <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-31drm/nouveau/mmu/tu102-: remove write to 0x100e68 during tlb invalidateBen Skeggs1-1/+0
This was cargo-culted from traces of RM when the code was written, but we probably shouldn't be touching NV_PFB regs while GSP-RM is running. From traces, it looks like NVIDIA dropped this sometime between 510.54 and 515.48.07, so I guess we can too. Signed-off-by: Ben Skeggs <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-30Merge tag 'x86-core-2023-10-29-v2' of ↵Linus Torvalds4-12/+7
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 core updates from Thomas Gleixner: - Limit the hardcoded topology quirk for Hygon CPUs to those which have a model ID less than 4. The newer models have the topology CPUID leaf 0xB correctly implemented and are not affected. - Make SMT control more robust against enumeration failures SMT control was added to allow controlling SMT at boottime or runtime. The primary purpose was to provide a simple mechanism to disable SMT in the light of speculation attack vectors. It turned out that the code is sensible to enumeration failures and worked only by chance for XEN/PV. XEN/PV has no real APIC enumeration which means the primary thread mask is not set up correctly. By chance a XEN/PV boot ends up with smp_num_siblings == 2, which makes the hotplug control stay at its default value "enabled". So the mask is never evaluated. The ongoing rework of the topology evaluation caused XEN/PV to end up with smp_num_siblings == 1, which sets the SMT control to "not supported" and the empty primary thread mask causes the hotplug core to deny the bringup of the APS. Make the decision logic more robust and take 'not supported' and 'not implemented' into account for the decision whether a CPU should be booted or not. - Fake primary thread mask for XEN/PV Pretend that all XEN/PV vCPUs are primary threads, which makes the usage of the primary thread mask valid on XEN/PV. That is consistent with because all of the topology information on XEN/PV is fake or even non-existent. - Encapsulate topology information in cpuinfo_x86 Move the randomly scattered topology data into a separate data structure for readability and as a preparatory step for the topology evaluation overhaul. - Consolidate APIC ID data type to u32 It's fixed width hardware data and not randomly u16, int, unsigned long or whatever developers decided to use. - Cure the abuse of cpuinfo for persisting logical IDs. Per CPU cpuinfo is used to persist the logical package and die IDs. That's really not the right place simply because cpuinfo is subject to be reinitialized when a CPU goes through an offline/online cycle. Use separate per CPU data for the persisting to enable the further topology management rework. It will be removed once the new topology management is in place. - Provide a debug interface for inspecting topology information Useful in general and extremly helpful for validating the topology management rework in terms of correctness or "bug" compatibility. * tag 'x86-core-2023-10-29-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) x86/apic, x86/hyperv: Use u32 in hv_snp_boot_ap() too x86/cpu: Provide debug interface x86/cpu/topology: Cure the abuse of cpuinfo for persisting logical ids x86/apic: Use u32 for wakeup_secondary_cpu[_64]() x86/apic: Use u32 for [gs]et_apic_id() x86/apic: Use u32 for phys_pkg_id() x86/apic: Use u32 for cpu_present_to_apicid() x86/apic: Use u32 for check_apicid_used() x86/apic: Use u32 for APIC IDs in global data x86/apic: Use BAD_APICID consistently x86/cpu: Move cpu_l[l2]c_id into topology info x86/cpu: Move logical package and die IDs into topology info x86/cpu: Remove pointless evaluation of x86_coreid_bits x86/cpu: Move cu_id into topology info x86/cpu: Move cpu_core_id into topology info hwmon: (fam15h_power) Use topology_core_id() scsi: lpfc: Use topology_core_id() x86/cpu: Move cpu_die_id into topology info x86/cpu: Move phys_proc_id into topology info x86/cpu: Encapsulate topology information in cpuinfo_x86 ...
2023-10-30Merge tag 'timers-core-2023-10-29-v2' of ↵Linus Torvalds8-16/+249
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer updates from Thomas Gleixner: "Updates for time, timekeeping and timers: Core: - Avoid superfluous deactivation of the tick in the low resolution tick NOHZ interrupt handler as the deactivation is handled already in the idle loop and on interrupt exit. - Update stale comments in the tick NOHZ code and rename the tick handler functions to be self-explanatory. - Remove an unused function in the tick NOHZ code, which was forgotten when the last user went away. - Handle RTC alarms which exceed the maximum alarm time of the underlying RTC hardware gracefully. Setting RTC alarms which exceed the maximum alarm time of the RTC hardware failed so far and caused suspend operations to abort. Cure this by limiting the alarm to the maximum alarm time of the RTC hardware, which is provided by the driver. This causes early resume wakeups, but that's way better than not suspending at all. Drivers: - Add a proper clocksource/event driver for the ancient Cirrus Logic EP93xx SoC family, which is one of the last non device-tree holdouts in arch/arm. - The usual boring device tree bindings updates and small fixes and enhancements all over the place" * tag 'timers-core-2023-10-29-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: clocksource: ep93xx: Add driver for Cirrus Logic EP93xx dt-bindings: timers: Add Cirrus EP93xx clocksource/drivers/timer-atmel-tcb: Fix initialization on SAM9 hardware clocksource/timer-riscv: ACPI: Add timer_cannot_wakeup_cpu clocksource/drivers/sun5i: Remove surplus dev_err() when using platform_get_irq() drivers/clocksource/timer-ti-dm: Don't call clk_get_rate() in stop function clocksource/drivers/timer-imx-gpt: Fix potential memory leak dt-bindings: timer: renesas,rz-mtu3: Document RZ/{G2UL,Five} SoCs dt-bindings: timer: renesas,rz-mtu3: Improve documentation dt-bindings: timer: renesas,rz-mtu3: Fix overflow/underflow interrupt names alarmtimer: Use maximum alarm time for suspend rtc: Add API function to return alarm time bound by hardware limit tick/nohz: Update comments some more tick/nohz: Remove unused tick_nohz_idle_stop_tick_protected() tick/nohz: Don't shutdown the lowres tick from itself tick/nohz: Update obsolete comments tick/nohz: Rename the tick handlers to more self-explanatory names
2023-10-30Merge tag 'irq-core-2023-10-29-v2' of ↵Linus Torvalds3-10/+10
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: "Core: - Exclude managed interrupts in the calculation of interrupts which are targeted to a CPU which is about to be offlined to ensure that there are enough free vectors on the still online CPUs to migrate them over. Managed interrupts do not need to be accounted because they are either shut down on offline or migrated to an already reserved and guaranteed slot on a still online CPU in the interrupts affinity mask. Including managed interrupts is overaccounting and can result in needlessly aborting hibernation on large server machines. - The usual set of small improvements Drivers: - Make the generic interrupt chip implementation handle interrupt domains correctly and initialize the name pointers correctly - Add interrupt affinity setting support to the Renesas RZG2L chip driver. - Prevent registering syscore operations multiple times in the SiFive PLIC chip driver. - Update device tree handling in the NXP Layerscape MSI chip driver" * tag 'irq-core-2023-10-29-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/sifive-plic: Fix syscore registration for multi-socket systems irqchip/ls-scfg-msi: Use device_get_match_data() genirq/generic_chip: Make irq_remove_generic_chip() irqdomain aware genirq/matrix: Exclude managed interrupts in irq_matrix_allocated() PCI/MSI: Provide stubs for IMS functions irqchip/renesas-rzg2l: Enhance driver to support interrupt affinity setting genirq/generic-chip: Fix the irq_chip name for /proc/interrupts irqdomain: Annotate struct irq_domain with __counted_by
2023-10-31Merge tag 'amd-drm-next-6.7-2023-10-27' of ↵Dave Airlie137-698/+1474
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.7-2023-10-27: amdgpu: - RAS fixes - Seamless boot fixes - NBIO 7.7 fix - SMU 14.0 fixes - GC 11.5 fixes - DML2 fixes - ASPM fixes - VPE fixes - Misc code cleanups - SRIOV fixes - Add some missing copyright notices - DCN 3.5 fixes - FAMS fixes - Backlight fix - S/G display fix - fdinfo cleanups - EXT_COHERENT fixes for APU and NUMA systems amdkfd: - Misc fixes - Misc code cleanups - SVM fixes Signed-off-by: Dave Airlie <[email protected]> From: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-30Merge tag 'x86-mm-2023-10-28' of ↵Linus Torvalds1-3/+8
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 mm handling updates from Ingo Molnar: - Add new NX-stack self-test - Improve NUMA partial-CFMWS handling - Fix #VC handler bugs resulting in SEV-SNP boot failures - Drop the 4MB memory size restriction on minimal NUMA nodes - Reorganize headers a bit, in preparation to header dependency reduction efforts - Misc cleanups & fixes * tag 'x86-mm-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm: Drop the 4 MB restriction on minimal NUMA node memory size selftests/x86/lam: Zero out buffer for readlink() x86/sev: Drop unneeded #include x86/sev: Move sev_setup_arch() to mem_encrypt.c x86/tdx: Replace deprecated strncpy() with strtomem_pad() selftests/x86/mm: Add new test that userspace stack is in fact NX x86/sev: Make boot_ghcb_page[] static x86/boot: Move x86_cache_alignment initialization to correct spot x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach x86/sev-es: Allow copy_from_kernel_nofault() in earlier boot x86_64: Show CR4.PSE on auxiliaries like on BSP x86/iommu/docs: Update AMD IOMMU specification document URL x86/sev/docs: Update document URL in amd-memory-encryption.rst x86/mm: Move arch_memory_failure() and arch_is_platform_page() definitions from <asm/processor.h> to <asm/pgtable.h> ACPI/NUMA: Apply SRAT proximity domain to entire CFMWS window x86/numa: Introduce numa_fill_memblks()
2023-10-31Merge tag 'drm-misc-next-2023-10-27' of ↵Dave Airlie171-1631/+4017
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v6.7-rc1: drm-misc-next-2023-10-19 + following: UAPI Changes: Cross-subsystem Changes: - Convert fbdev drivers to use fbdev i/o mem helpers. Core Changes: - Use cross-references for macros in docs. - Make drm_client_buffer_addb use addfb2. - Add NV20 and NV30 YUV formats. - Documentation updates for create_dumb ioctl. - CI fixes. - Allow variable number of run-queues in scheduler. Driver Changes: - Rename drm/ast constants. - Make ili9882t its own driver. - Assorted fixes in ivpu, vc4, bridge/synopsis, amdgpu. - Add planar formats to rockchip. Signed-off-by: Dave Airlie <[email protected]> From: Maarten Lankhorst <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-30Merge tag 'x86-boot-2023-10-28' of ↵Linus Torvalds3-50/+7
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 boot updates from Ingo Molnar: - Rework PE header generation, primarily to generate a modern, 4k aligned kernel image view with narrower W^X permissions. - Further refine init-lifetime annotations - Misc cleanups & fixes * tag 'x86-boot-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) x86/boot: efistub: Assign global boot_params variable x86/boot: Rename conflicting 'boot_params' pointer to 'boot_params_ptr' x86/head/64: Move the __head definition to <asm/init.h> x86/head/64: Add missing __head annotation to startup_64_load_idt() x86/head/64: Mark 'startup_gdt[]' and 'startup_gdt_descr' as __initdata x86/boot: Harmonize the style of array-type parameter for fixup_pointer() calls x86/boot: Fix incorrect startup_gdt_descr.size x86/boot: Compile boot code with -std=gnu11 too x86/boot: Increase section and file alignment to 4k/512 x86/boot: Split off PE/COFF .data section x86/boot: Drop PE/COFF .reloc section x86/boot: Construct PE/COFF .text section from assembler x86/boot: Derive file size from _edata symbol x86/boot: Define setup size in linker script x86/boot: Set EFI handover offset directly in header asm x86/boot: Grab kernel_info offset from zoffset header directly x86/boot: Drop references to startup_64 x86/boot: Drop redundant code setting the root device x86/boot: Omit compression buffer from PE/COFF image memory footprint x86/boot: Remove the 'bugger off' message ...
2023-10-30Merge tag 'sched-core-2023-10-28' of ↵Linus Torvalds1-5/+13
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "Fair scheduler (SCHED_OTHER) improvements: - Remove the old and now unused SIS_PROP code & option - Scan cluster before LLC in the wake-up path - Use candidate prev/recent_used CPU if scanning failed for cluster wakeup NUMA scheduling improvements: - Improve the VMA access-PID code to better skip/scan VMAs - Extend tracing to cover VMA-skipping decisions - Improve/fix the recently introduced sched_numa_find_nth_cpu() code - Generalize numa_map_to_online_node() Energy scheduling improvements: - Remove the EM_MAX_COMPLEXITY limit - Add tracepoints to track energy computation - Make the behavior of the 'sched_energy_aware' sysctl more consistent - Consolidate and clean up access to a CPU's max compute capacity - Fix uclamp code corner cases RT scheduling improvements: - Drive dl_rq->overloaded with dl_rq->pushable_dl_tasks updates - Drive the ->rto_mask with rt_rq->pushable_tasks updates Scheduler scalability improvements: - Rate-limit updates to tg->load_avg - On x86 disable IBRS when CPU is offline to improve single-threaded performance - Micro-optimize in_task() and in_interrupt() - Micro-optimize the PSI code - Avoid updating PSI triggers and ->rtpoll_total when there are no state changes Core scheduler infrastructure improvements: - Use saved_state to reduce some spurious freezer wakeups - Bring in a handful of fast-headers improvements to scheduler headers - Make the scheduler UAPI headers more widely usable by user-space - Simplify the control flow of scheduler syscalls by using lock guards - Fix sched_setaffinity() vs. CPU hotplug race Scheduler debuggability improvements: - Disallow writing invalid values to sched_rt_period_us - Fix a race in the rq-clock debugging code triggering warnings - Fix a warning in the bandwidth distribution code - Micro-optimize in_atomic_preempt_off() checks - Enforce that the tasklist_lock is held in for_each_thread() - Print the TGID in sched_show_task() - Remove the /proc/sys/kernel/sched_child_runs_first sysctl ... and misc cleanups & fixes" * tag 'sched-core-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (82 commits) sched/fair: Remove SIS_PROP sched/fair: Use candidate prev/recent_used CPU if scanning failed for cluster wakeup sched/fair: Scan cluster before scanning LLC in wake-up path sched: Add cpus_share_resources API sched/core: Fix RQCF_ACT_SKIP leak sched/fair: Remove unused 'curr' argument from pick_next_entity() sched/nohz: Update comments about NEWILB_KICK sched/fair: Remove duplicate #include sched/psi: Update poll => rtpoll in relevant comments sched: Make PELT acronym definition searchable sched: Fix stop_one_cpu_nowait() vs hotplug sched/psi: Bail out early from irq time accounting sched/topology: Rename 'DIE' domain to 'PKG' sched/psi: Delete the 'update_total' function parameter from update_triggers() sched/psi: Avoid updating PSI triggers and ->rtpoll_total when there are no state changes sched/headers: Remove comment referring to rq::cpu_load, since this has been removed sched/numa: Complete scanning of inactive VMAs when there is no alternative sched/numa: Complete scanning of partial VMAs regardless of PID activity sched/numa: Move up the access pid reset logic sched/numa: Trace decisions related to skipping VMAs ...
2023-10-30Merge tag 'locking-core-2023-10-28' of ↵Linus Torvalds2-20/+16
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Info Molnar: "Futex improvements: - Add the 'futex2' syscall ABI, which is an attempt to get away from the multiplex syscall and adds a little room for extentions, while lifting some limitations. - Fix futex PI recursive rt_mutex waiter state bug - Fix inter-process shared futexes on no-MMU systems - Use folios instead of pages Micro-optimizations of locking primitives: - Improve arch_spin_value_unlocked() on asm-generic ticket spinlock architectures, to improve lockref code generation - Improve the x86-32 lockref_get_not_zero() main loop by adding build-time CMPXCHG8B support detection for the relevant lockref code, and by better interfacing the CMPXCHG8B assembly code with the compiler - Introduce arch_sync_try_cmpxchg() on x86 to improve sync_try_cmpxchg() code generation. Convert some sync_cmpxchg() users to sync_try_cmpxchg(). - Micro-optimize rcuref_put_slowpath() Locking debuggability improvements: - Improve CONFIG_DEBUG_RT_MUTEXES=y to have a fast-path as well - Enforce atomicity of sched_submit_work(), which is de-facto atomic but was un-enforced previously. - Extend <linux/cleanup.h>'s no_free_ptr() with __must_check semantics - Fix ww_mutex self-tests - Clean up const-propagation in <linux/seqlock.h> and simplify the API-instantiation macros a bit RT locking improvements: - Provide the rt_mutex_*_schedule() primitives/helpers and use them in the rtmutex code to avoid recursion vs. rtlock on the PI state. - Add nested blocking lockdep asserts to rt_mutex_lock(), rtlock_lock() and rwbase_read_lock() .. plus misc fixes & cleanups" * tag 'locking-core-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (39 commits) futex: Don't include process MM in futex key on no-MMU locking/seqlock: Fix grammar in comment alpha: Fix up new futex syscall numbers locking/seqlock: Propagate 'const' pointers within read-only methods, remove forced type casts locking/lockdep: Fix string sizing bug that triggers a format-truncation compiler-warning locking/seqlock: Change __seqprop() to return the function pointer locking/seqlock: Simplify SEQCOUNT_LOCKNAME() locking/atomics: Use atomic_try_cmpxchg_release() to micro-optimize rcuref_put_slowpath() locking/atomic, xen: Use sync_try_cmpxchg() instead of sync_cmpxchg() locking/atomic/x86: Introduce arch_sync_try_cmpxchg() locking/atomic: Add generic support for sync_try_cmpxchg() and its fallback locking/seqlock: Fix typo in comment futex/requeue: Remove unnecessary ‘NULL’ initialization from futex_proxy_trylock_atomic() locking/local, arch: Rewrite local_add_unless() as a static inline function locking/debug: Fix debugfs API return value checks to use IS_ERR() locking/ww_mutex/test: Make sure we bail out instead of livelock locking/ww_mutex/test: Fix potential workqueue corruption locking/ww_mutex/test: Use prng instead of rng to avoid hangs at bootup futex: Add sys_futex_requeue() futex: Add flags2 argument to futex_requeue() ...
2023-10-30Merge tag 'edac_updates_for_v6.7' of ↵Linus Torvalds3-0/+1082
git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull EDAC updates from Borislav Petkov: - A new EDAC driver for Xilinx's Versal integrated memory controller * tag 'edac_updates_for_v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: EDAC/versal: Add a Xilinx Versal memory controller driver dt-bindings: memory-controllers: Add support for Xilinx Versal EDAC for DDRMC
2023-10-30Merge branch 'clk-cleanup' into clk-nextStephen Boyd26-237/+161
* clk-cleanup: clk: si521xx: Increase stack based print buffer size in probe clk: Use device_get_match_data() clk: cdce925: Extend match support for OF tables clk: si570: Simplify probe clk: si5351: Simplify probe clk: rs9: Use i2c_get_match_data() instead of device_get_match_data() clk: clk-si544: Simplify probe() and is_valid_frequency() clk: si521xx: Use i2c_get_match_data() instead of device_get_match_data() clk: npcm7xx: Fix incorrect kfree clk: at91: remove unnecessary conditions clk: ti: fix double free in of_ti_divider_clk_setup() clk: keystone: pll: fix a couple NULL vs IS_ERR() checks clk: ralink: mtmips: quiet unused variable warning clk: gate: fix comment typo and grammar clk: asm9620: Remove 'hw' local variable that isn't checked
2023-10-30Merge branches 'clk-renesas', 'clk-kunit', 'clk-regmap' and ↵Stephen Boyd26-184/+1071
'clk-frac-divider' into clk-next - Make clk kunit tests work with lockdep - Fix clk gate kunit test for big-endian - Convert more than a handful of clk drivers to use regmap maple tree - Consider the CLK_FRAC_DIVIDER_ZERO_BASED in fractional divider clk implementation * clk-renesas: (23 commits) clk: renesas: r9a08g045: Add clock and reset support for SDHI1 and SDHI2 clk: renesas: rzg2l: Use %x format specifier to print CLK_ON_R() clk: renesas: Add minimal boot support for RZ/G3S SoC clk: renesas: rzg2l: Add divider clock for RZ/G3S clk: renesas: rzg2l: Refactor SD mux driver clk: renesas: rzg2l: Remove CPG_SDHI_DSEL from generic header clk: renesas: rzg2l: Add struct clk_hw_data clk: renesas: rzg2l: Add support for RZ/G3S PLL clk: renesas: rzg2l: Remove critical area clk: renesas: rzg2l: Fix computation formula clk: renesas: rzg2l: Trust value returned by hardware clk: renesas: rzg2l: Lock around writes to mux register clk: renesas: rzg2l: Wait for status bit of SD mux before continuing clk: renesas: rcar-gen3: Extend SDnH divider table dt-bindings: clock: renesas,rzg2l-cpg: Document RZ/G3S SoC clk: renesas: r8a7795: Constify r8a7795_*_clks clk: renesas: r9a06g032: Name anonymous structs clk: renesas: r9a06g032: Fix kerneldoc warning clk: renesas: rzg2l: Use u32 for flag and mux_flags clk: renesas: rzg2l: Use FIELD_GET() for PLL register fields ... * clk-kunit: clk: Fix clk gate kunit test on big-endian CPUs clk: Parameterize clk_leaf_mux_set_rate_parent clk: Drive clk_leaf_mux_set_rate_parent test from clk_ops * clk-regmap: clk: versaclock7: Convert to use maple tree register cache clk: versaclock5: Convert to use maple tree register cache clk: versaclock3: Convert to use maple tree register cache clk: versaclock3: Remove redundant _is_writeable() clk: si570: Convert to use maple tree register cache clk: si544: Convert to use maple tree register cache clk: si5351: Convert to use maple tree register cache clk: si5341: Convert to use maple tree register cache clk: si514: Convert to use maple tree register cache clk: cdce925: Convert to use maple tree register cache * clk-frac-divider: clk: fractional-divider: tests: Add test suite for edge cases clk: fractional-divider: Improve approximation when zero based and export
2023-10-30Merge branches 'clk-debugfs', 'clk-spreadtrum', 'clk-sifive', 'clk-counted' ↵Stephen Boyd39-363/+6915
and 'clk-qcom' into clk-next - Add consumer info to clk debugfs - Fix various clk drivers that have clk_hw_onecell_data not at the end of an allocation * clk-debugfs: clk: Allow phase adjustment from debugfs clk: Show active consumers of clocks in debugfs * clk-spreadtrum: clk: sprd: Composite driver support offset config * clk-sifive: clk: sifive: Allow building the driver as a module clk: analogbits: Allow building the library as a module * clk-counted: clk: socfpga: agilex: Add bounds-checking coverage for struct stratix10_clock_data clk: socfpga: Fix undefined behavior bug in struct stratix10_clock_data clk: visconti: Add bounds-checking coverage for struct visconti_pll_provider clk: visconti: Fix undefined behavior bug in struct visconti_pll_provider * clk-qcom: (36 commits) clk: qcom: apss-ipq6018: add the GPLL0 clock also as clock provider clk: qcom: ipq5332: drop the CLK_SET_RATE_PARENT flag from GPLL clocks clk: qcom: ipq9574: drop the CLK_SET_RATE_PARENT flag from GPLL clocks clk: qcom: ipq5018: drop the CLK_SET_RATE_PARENT flag from GPLL clocks clk: qcom: ipq6018: drop the CLK_SET_RATE_PARENT flag from PLL clocks clk: qcom: ipq8074: drop the CLK_SET_RATE_PARENT flag from PLL clocks clk: qcom: gcc-ipq6018: add QUP6 I2C clock clk: qcom: apss-ipq6018: ipq5332: add safe source switch for a53pll clk: qcom: apss-ipq-pll: Fix 'l' value for ipq5332_pll_config clk: qcom: apss-ipq-pll: Use stromer plus ops for stromer plus pll clk: qcom: clk-alpha-pll: introduce stromer plus ops clk: qcom: config IPQ_APSS_6018 should depend on QCOM_SMEM clk: qcom: videocc-sm8550: switch to clk_lucid_ole_pll_configure clk: qcom: gpucc-sm8550: switch to clk_lucid_ole_pll_configure clk: qcom: Replace of_device.h with explicit includes clk: qcom: smd-rpm: Move CPUSS_GNoC clock to interconnect clk: qcom: cbf-msm8996: Convert to platform remove callback returning void clk: qcom: gcc-sm8150: Fix gcc_sdcc2_apps_clk_src clk: qcom: Add GCC driver support for SM4450 dt-bindings: clock: qcom: Add GCC clocks for SM4450 ...
2023-10-30Merge branches 'clk-doc', 'clk-amlogic', 'clk-mediatek', 'clk-twl' and ↵Stephen Boyd26-38/+5121
'clk-imx' into clk-next - Add clock driver for TWL6032 * clk-doc: clk: linux/clk-provider.h: fix kernel-doc warnings and typos * clk-amlogic: clk: meson: S4: select CONFIG_COMMON_CLK_MESON_CLKC_UTILS clk: meson: S4: add support for Amlogic S4 SoC peripheral clock controller clk: meson: S4: add support for Amlogic S4 SoC PLL clock driver dt-bindings: clock: document Amlogic S4 SoC peripherals clock controller dt-bindings: clock: document Amlogic S4 SoC PLL clock controller * clk-mediatek: clk: mediatek: fix double free in mtk_clk_register_pllfh() clk: mediatek: clk-mt2701: Add check for mtk_alloc_clk_data clk: mediatek: clk-mt7629: Add check for mtk_alloc_clk_data clk: mediatek: clk-mt7629-eth: Add check for mtk_alloc_clk_data clk: mediatek: clk-mt6797: Add check for mtk_alloc_clk_data clk: mediatek: clk-mt6779: Add check for mtk_alloc_clk_data clk: mediatek: clk-mt6765: Add check for mtk_alloc_clk_data * clk-twl: clk: twl: add clock driver for TWL6032 * clk-imx: clk: imx: imx8qm/qxp: add more resources to whitelist clk: imx: scu: ignore clks not owned by Cortex-A partition clk: imx8: remove MLB support clk: imx: imx8qm-rsrc: drop VPU_UART/VPUCORE clk: imx: imx8qxp: correct the enet clocks for i.MX8DXL clk: imx: imx8qxp: Fix elcdif_pll clock clk: imx: imx8dxl-rsrc: keep sorted in the ascending order clk: imx: imx6sx: Allow a different LCDIF1 clock parent clk: imx: imx8mq: correct error handling path clk: imx8mp: Remove non-existent IMX8MP_CLK_AUDIOMIX_PDM_ROOT clk: imx: imx8: Simplify clk_imx_acm_detach_pm_domains() clk: imx: imx8: Add a message in case of devm_clk_hw_register_mux_parent_data_table() error clk: imx: imx8: Fix an error handling path in imx8_acm_clk_probe() clk: imx: imx8: Fix an error handling path if devm_clk_hw_register_mux_parent_data_table() fails clk: imx: imx8: Fix an error handling path in clk_imx_acm_attach_pm_domains() clk: imx: Select MXC_CLK for CLK_IMX8QXP
2023-10-30Merge tag 'bcachefs-2023-10-30' of https://evilpiepirate.org/git/bcachefsLinus Torvalds7-600/+5
Pull initial bcachefs updates from Kent Overstreet: "Here's the bcachefs filesystem pull request. One new patch since last week: the exportfs constants ended up conflicting with other filesystems that are also getting added to the global enum, so switched to new constants picked by Amir. The only new non fs/bcachefs/ patch is the objtool patch that adds bcachefs functions to the list of noreturns. The patch that exports osq_lock() has been dropped for now, per Ingo" * tag 'bcachefs-2023-10-30' of https://evilpiepirate.org/git/bcachefs: (2781 commits) exportfs: Change bcachefs fid_type enum to avoid conflicts bcachefs: Refactor memcpy into direct assignment bcachefs: Fix drop_alloc_keys() bcachefs: snapshot_create_lock bcachefs: Fix snapshot skiplists during snapshot deletion bcachefs: bch2_sb_field_get() refactoring bcachefs: KEY_TYPE_error now counts towards i_sectors bcachefs: Fix handling of unknown bkey types bcachefs: Switch to unsafe_memcpy() in a few places bcachefs: Use struct_size() bcachefs: Correctly initialize new buckets on device resize bcachefs: Fix another smatch complaint bcachefs: Use strsep() in split_devs() bcachefs: Add iops fields to bch_member bcachefs: Rename bch_sb_field_members -> bch_sb_field_members_v1 bcachefs: New superblock section members_v2 bcachefs: Add new helper to retrieve bch_member from sb bcachefs: bucket_lock() is now a sleepable lock bcachefs: fix crc32c checksum merge byte order problem bcachefs: Fix bch2_inode_delete_keys() ...
2023-10-30iommufd: Organize the mock domain alloc functions closer to Joerg's treeJason Gunthorpe1-19/+16
Patches in Joerg's iommu tree to convert the mock driver to use domain_alloc_paging() that clash badly with the way the selftest changes for nesting were structured. Massage the selftest so that it looks closer the code after the domain_alloc_paging() conversion to ease the merge. Change __mock_domain_alloc_paging() into mock_domain_alloc_paging() in the same way as the iommu tree. The merge resolution then trivially takes both and deletes mock_domain_alloc(). Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Nicolin Chen <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Reviewed-by: Yi Liu <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-10-30Merge tag 'vfs-6.7.ctime' of ↵Linus Torvalds10-25/+37
gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs Pull vfs inode time accessor updates from Christian Brauner: "This finishes the conversion of all inode time fields to accessor functions as discussed on list. Changing timestamps manually as we used to do before is error prone. Using accessors function makes this robust. It does not contain the switch of the time fields to discrete 64 bit integers to replace struct timespec and free up space in struct inode. But after this, the switch can be trivially made and the patch should only affect the vfs if we decide to do it" * tag 'vfs-6.7.ctime' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: (86 commits) fs: rename inode i_atime and i_mtime fields security: convert to new timestamp accessors selinux: convert to new timestamp accessors apparmor: convert to new timestamp accessors sunrpc: convert to new timestamp accessors mm: convert to new timestamp accessors bpf: convert to new timestamp accessors ipc: convert to new timestamp accessors linux: convert to new timestamp accessors zonefs: convert to new timestamp accessors xfs: convert to new timestamp accessors vboxsf: convert to new timestamp accessors ufs: convert to new timestamp accessors udf: convert to new timestamp accessors ubifs: convert to new timestamp accessors tracefs: convert to new timestamp accessors sysv: convert to new timestamp accessors squashfs: convert to new timestamp accessors server: convert to new timestamp accessors client: convert to new timestamp accessors ...
2023-10-30Merge tag 'vfs-6.7.iov_iter' of ↵Linus Torvalds2-2/+2
gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs Pull iov_iter updates from Christian Brauner: "This contain's David's iov_iter cleanup work to convert the iov_iter iteration macros to inline functions: - Remove last_offset from iov_iter as it was only used by ITER_PIPE - Add a __user tag on copy_mc_to_user()'s dst argument on x86 to match that on powerpc and get rid of a sparse warning - Convert iter->user_backed to user_backed_iter() in the sound PCM driver - Convert iter->user_backed to user_backed_iter() in a couple of infiniband drivers - Renumber the type enum so that the ITER_* constants match the order in iterate_and_advance*() - Since the preceding patch puts UBUF and IOVEC at 0 and 1, change user_backed_iter() to just use the type value and get rid of the extra flag - Convert the iov_iter iteration macros to always-inline functions to make the code easier to follow. It uses function pointers, but they get optimised away - Move the check for ->copy_mc to _copy_from_iter() and copy_page_from_iter_atomic() rather than in memcpy_from_iter_mc() where it gets repeated for every segment. Instead, we check once and invoke a side function that can use iterate_bvec() rather than iterate_and_advance() and supply a different step function - Move the copy-and-csum code to net/ where it can be in proximity with the code that uses it - Fold memcpy_and_csum() in to its two users - Move csum_and_copy_from_iter_full() out of line and merge in csum_and_copy_from_iter() since the former is the only caller of the latter - Move hash_and_copy_to_iter() to net/ where it can be with its only caller" * tag 'vfs-6.7.iov_iter' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: iov_iter, net: Move hash_and_copy_to_iter() to net/ iov_iter, net: Merge csum_and_copy_from_iter{,_full}() together iov_iter, net: Fold in csum_and_memcpy() iov_iter, net: Move csum_and_copy_to/from_iter() to net/ iov_iter: Don't deal with iter->copy_mc in memcpy_from_iter_mc() iov_iter: Convert iterate*() to inline funcs iov_iter: Derive user-backedness from the iterator type iov_iter: Renumber ITER_* constants infiniband: Use user_backed_iter() to see if iterator is UBUF/IOVEC sound: Fix snd_pcm_readv()/writev() to use iov access functions iov_iter, x86: Be consistent about the __user tag on copy_mc_to_user() iov_iter: Remove last_offset from iov_iter as it was for ITER_PIPE
2023-10-30Merge tag 'vfs-6.7.misc' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfsLinus Torvalds1-5/+1
Pull misc vfs updates from Christian Brauner: "This contains the usual miscellaneous features, cleanups, and fixes for vfs and individual fses. Features: - Rename and export helpers that get write access to a mount. They are used in overlayfs to get write access to the upper mount. - Print the pretty name of the root device on boot failure. This helps in scenarios where we would usually only print "unknown-block(1,2)". - Add an internal SB_I_NOUMASK flag. This is another part in the endless POSIX ACL saga in a way. When POSIX ACLs are enabled via SB_POSIXACL the vfs cannot strip the umask because if the relevant inode has POSIX ACLs set it might take the umask from there. But if the inode doesn't have any POSIX ACLs set then we apply the umask in the filesytem itself. So we end up with: (1) no SB_POSIXACL -> strip umask in vfs (2) SB_POSIXACL -> strip umask in filesystem The umask semantics associated with SB_POSIXACL allowed filesystems that don't even support POSIX ACLs at all to raise SB_POSIXACL purely to avoid umask stripping. That specifically means NFS v4 and Overlayfs. NFS v4 does it because it delegates this to the server and Overlayfs because it needs to delegate umask stripping to the upper filesystem, i.e., the filesystem used as the writable layer. This went so far that SB_POSIXACL is raised eve on kernels that don't even have POSIX ACL support at all. Stop this blatant abuse and add SB_I_NOUMASK which is an internal superblock flag that filesystems can raise to opt out of umask handling. That should really only be the two mentioned above. It's not that we want any filesystems to do this. Ideally we have all umask handling always in the vfs. - Make overlayfs use SB_I_NOUMASK too. - Now that we have SB_I_NOUMASK, stop checking for SB_POSIXACL in IS_POSIXACL() if the kernel doesn't have support for it. This is a very old patch but it's only possible to do this now with the wider cleanup that was done. - Follow-up work on fake path handling from last cycle. Citing mostly from Amir: When overlayfs was first merged, overlayfs files of regular files and directories, the ones that are installed in file table, had a "fake" path, namely, f_path is the overlayfs path and f_inode is the "real" inode on the underlying filesystem. In v6.5, we took another small step by introducing of the backing_file container and the file_real_path() helper. This change allowed vfs and filesystem code to get the "real" path of an overlayfs backing file. With this change, we were able to make fsnotify work correctly and report events on the "real" filesystem objects that were accessed via overlayfs. This method works fine, but it still leaves the vfs vulnerable to new code that is not aware of files with fake path. A recent example is commit db1d1e8b9867 ("IMA: use vfs_getattr_nosec to get the i_version"). This commit uses direct referencing to f_path in IMA code that otherwise uses file_inode() and file_dentry() to reference the filesystem objects that it is measuring. This contains work to switch things around: instead of having filesystem code opt-in to get the "real" path, have generic code opt-in for the "fake" path in the few places that it is needed. Is it far more likely that new filesystems code that does not use the file_dentry() and file_real_path() helpers will end up causing crashes or averting LSM/audit rules if we keep the "fake" path exposed by default. This change already makes file_dentry() moot, but for now we did not change this helper just added a WARN_ON() in ovl_d_real() to catch if we have made any wrong assumptions. After the dust settles on this change, we can make file_dentry() a plain accessor and we can drop the inode argument to ->d_real(). - Switch struct file to SLAB_TYPESAFE_BY_RCU. This looks like a small change but it really isn't and I would like to see everyone on their tippie toes for any possible bugs from this work. Essentially we've been doing most of what SLAB_TYPESAFE_BY_RCU for files since a very long time because of the nasty interactions between the SCM_RIGHTS file descriptor garbage collection. So extending it makes a lot of sense but it is a subtle change. There are almost no places that fiddle with file rcu semantics directly and the ones that did mess around with struct file internal under rcu have been made to stop doing that because it really was always dodgy. I forgot to put in the link tag for this change and the discussion in the commit so adding it into the merge message: https://lore.kernel.org/r/[email protected] Cleanups: - Various smaller pipe cleanups including the removal of a spin lock that was only used to protect against writes without pipe_lock() from O_NOTIFICATION_PIPE aka watch queues. As that was never implemented remove the additional locking from pipe_write(). - Annotate struct watch_filter with the new __counted_by attribute. - Clarify do_unlinkat() cleanup so that it doesn't look like an extra iput() is done that would cause issues. - Simplify file cleanup when the file has never been opened. - Use module helper instead of open-coding it. - Predict error unlikely for stale retry. - Use WRITE_ONCE() for mount expiry field instead of just commenting that one hopes the compiler doesn't get smart. Fixes: - Fix readahead on block devices. - Fix writeback when layztime is enabled and inodes whose timestamp is the only thing that changed reside on wb->b_dirty_time. This caused excessively large zombie memory cgroup when lazytime was enabled as such inodes weren't handled fast enough. - Convert BUG_ON() to WARN_ON_ONCE() in open_last_lookups()" * tag 'vfs-6.7.misc' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: (26 commits) file, i915: fix file reference for mmap_singleton() vfs: Convert BUG_ON to WARN_ON_ONCE in open_last_lookups writeback, cgroup: switch inodes with dirty timestamps to release dying cgwbs chardev: Simplify usage of try_module_get() ovl: rely on SB_I_NOUMASK fs: fix umask on NFS with CONFIG_FS_POSIX_ACL=n fs: store real path instead of fake path in backing file f_path fs: create helper file_user_path() for user displayed mapped file path fs: get mnt_writers count for an open backing file's real path vfs: stop counting on gcc not messing with mnt_expiry_mark if not asked vfs: predict the error in retry_estale as unlikely backing file: free directly vfs: fix readahead(2) on block devices io_uring: use files_lookup_fd_locked() file: convert to SLAB_TYPESAFE_BY_RCU vfs: shave work on failed file open fs: simplify misleading code to remove ambiguity regarding ihold()/iput() watch_queue: Annotate struct watch_filter with __counted_by fs/pipe: use spinlock in pipe_read() only if there is a watch_queue fs/pipe: remove unnecessary spinlock from pipe_write() ...
2023-10-30Merge tag 'vfs-6.7.super' of ↵Linus Torvalds28-282/+304
gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs Pull vfs superblock updates from Christian Brauner: "This contains the work to make block device opening functions return a struct bdev_handle instead of just a struct block_device. The same struct bdev_handle is then also passed to block device closing functions. This allows us to propagate context from opening to closing a block device without having to modify all users everytime. Sidenote, in the future we might even want to try and have block device opening functions return a struct file directly but that's a series on top of this. These are further preparatory changes to be able to count writable opens and blocking writes to mounted block devices. That's a separate piece of work for next cycle and for that we absolutely need the changes to btrfs that have been quietly dropped somehow. Originally the series contained a patch that removed the old blkdev_*() helpers. But since this would've caused needles churn in -next for bcachefs we ended up delaying it. The second piece of work addresses one of the major annoyances about the work last cycle, namely that we required dropping s_umount whenever we used the superblock and fs_holder_ops for a block device. The reason for that requirement had been that in some codepaths s_umount could've been taken under disk->open_mutex (that's always been the case, at least theoretically). For example, on surprise block device removal or media change. And opening and closing block devices required grabbing disk->open_mutex as well. So we did the work and went through the block layer and fixed all those places so that s_umount is never taken under disk->open_mutex. This means no more brittle games where we yield and reacquire s_umount during block device opening and closing and no more requirements where block devices need to be closed. Filesystems don't need to care about this. There's a bunch of other follow-up work such as moving block device freezing and thawing to holder operations which makes it work for all block devices and not just the main block device just as we did for surprise removal. But that is for next cycle. Tested with fstests for all major fses, blktests, LTP" * tag 'vfs-6.7.super' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: (37 commits) porting: update locking requirements fs: assert that open_mutex isn't held over holder ops block: assert that we're not holding open_mutex over blk_report_disk_dead block: move bdev_mark_dead out of disk_check_media_change block: WARN_ON_ONCE() when we remove active partitions block: simplify bdev_del_partition() fs: Avoid grabbing sb->s_umount under bdev->bd_holder_lock jfs: fix log->bdev_handle null ptr deref in lbmStartIO bcache: Fixup error handling in register_cache() xfs: Convert to bdev_open_by_path() reiserfs: Convert to bdev_open_by_dev/path() ocfs2: Convert to use bdev_open_by_dev() nfs/blocklayout: Convert to use bdev_open_by_dev/path() jfs: Convert to bdev_open_by_dev() f2fs: Convert to bdev_open_by_dev/path() ext4: Convert to bdev_open_by_dev() erofs: Convert to use bdev_open_by_path() btrfs: Convert to bdev_open_by_path() fs: Convert to bdev_open_by_dev() mm/swap: Convert to use bdev_open_by_dev() ...
2023-10-30iommufd/selftest: Fix page-size check in iommufd_test_dirty()Joao Martins1-2/+4
iommufd_test_dirty()/IOMMU_TEST_OP_DIRTY sets the dirty bits in the mock domain implementation that the userspace side validates against what it obtains via the UAPI. However in introducing iommufd_test_dirty() it forgot to validate page_size being 0 leading to two possible divide-by-zero problems: one at the beginning when calculating @max and while calculating the IOVA in the XArray PFN tracking list. While at it, validate the length to require non-zero value as well, as we can't be allocating a 0-sized bitmap. Link: https://lore.kernel.org/r/[email protected] Reported-by: [email protected] Closes: https://lore.kernel.org/linux-iommu/[email protected]/ Fixes: a9af47e382a4 ("iommufd/selftest: Test IOMMU_HWPT_GET_DIRTY_BITMAP") Signed-off-by: Joao Martins <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>