blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2023-04-06	ALSA: hda: add HDaudio Extended link definitions	Pierre-Louis Bossart	1	-2/+38
	Add new definitions for the HDaudio Extended link support, specifically new registers for SoundWire, Intel DMIC and INTEL SSP interfaces. Signed-off-by: Pierre-Louis Bossart <[email protected]> Reviewed-by: Rander Wang <[email protected]> Reviewed-by: Péter Ujfalusi <[email protected]> Reviewed-by: Ranjani Sridharan <[email protected]> Signed-off-by: Peter Ujfalusi <[email protected]> Reviewed-by: Takashi Iwai <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]>
2023-04-06	USB: core: Fix docs warning caused by wireless_status feature	Bastien Nocera	1	-2/+4
	Fix wrongly named 'dev' parameter in doc block, should have been iface: drivers/usb/core/message.c:1939: warning: Function parameter or member 'iface' not described in 'usb_set_wireless_status' drivers/usb/core/message.c:1939: warning: Excess function parameter 'dev' description in 'usb_set_wireless_status' And fix missing struct member doc in kernel API, and reorder to match struct: include/linux/usb.h:270: warning: Function parameter or member 'wireless_status_work' not described in 'usb_interface' Reported-by: Stephen Rothwell <[email protected]> Link: https://lore.kernel.org/linux-next/[email protected]/T/#t Fixes: 0a4db185f078 ("USB: core: Add API to change the wireless_status") Signed-off-by: Bastien Nocera <[email protected]> Acked-by: Greg Kroah-Hartman <[email protected]> Link: https://lore.kernel.org/r/[email protected] [bentiss: fix checkpatch warning] Signed-off-by: Benjamin Tissoires <[email protected]>
2023-04-06	ALSA: document that struct __snd_pcm_mmap_control64 is messed up	Oswald Buddenhagen	1	-1/+2
	I'm not the first one to run into this, see e.g. https://lore.kernel.org/all/[email protected]/ Signed-off-by: Oswald Buddenhagen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2023-04-06	pwm: stm32: Enforce settings for PWM capture	Olivier Moysan	1	-0/+1
	The PWM capture assumes that the input selector is set to default input and that the slave mode is disabled. Force reset state for TISEL and SMCR registers to match this requirement. Note that slave mode disabling is not a pre-requisite by itself for capture mode, as hardware supports it for PWM capture. However, the current implementation of the driver does not allow slave mode for PWM capture. Setting slave mode for PWM capture results in wrong capture values. Signed-off-by: Olivier Moysan <[email protected]> Acked-by: Lee Jones <[email protected]> Acked-by: Uwe Kleine-König <[email protected]> Signed-off-by: Thierry Reding <[email protected]>
2023-04-06	Merge tag 'drm-intel-next-2023-04-06' of ↵	Daniel Vetter	1	-0/+13
	git://anongit.freedesktop.org/drm/drm-intel into drm-next - Fix DPT+shmem combo and add i915.enable_dpt modparam (Ville) - i915.enable_sagv module parameter (Ville) - Correction to QGV related register addresses (Vinod) - IPS debugfs per-crtc and new file for false_color (Ville) - More clean-up and reorganization of Display code (Jani) - DP DSC related fixes and improvements (Stanislav, Ankit, Suraj, Swati) - Make utility pin asserts more accurate (Ville) - Meteor Lake enabling (Daniele) - High refresh rate PSR fixes (Jouni) - Cursor and Plane chicken register fixes (Ville) - Align the ADL-P TypeC sequences with hardware specification (Imre) - Documentation build fixes and improvements to catch bugs earlier (Lee, Jani) - PL1 power limit hwmon entry changed to use 0 as disabled state (Ashutosh) - DP aux sync fix and improvements (Ville) - DP MST fixes and w/a (Stanislav) - Limit PXP drm-errors or warning on firmware API failures (Alan) Signed-off-by: Daniel Vetter <[email protected]> From: Rodrigo Vivi <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-04-06	Merge tag 'drm/tegra/for-6.4-rc1' of ↵	Daniel Vetter	1	-1/+1
	https://gitlab.freedesktop.org/drm/tegra into drm-next drm/tegra: Changes for v6.4-rc1 The majority of this is minor cleanups and fixes. Other than those, this contains Uwe's conversion to the new driver remove callback and Thomas' fbdev DRM client conversion. The driver can now also be built on other architectures to easy compile coverage. Finally, this adds Mikko as a second maintainer for the driver. As a next step we also want Tegra DRM to move into drm-misc to streamline the maintenance process. Signed-off-by: Daniel Vetter <[email protected]> From: Thierry Reding <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-04-06	Merge tag 'drm-misc-next-2023-04-06' of ↵	Daniel Vetter	3	-4/+404
	git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v6.4-rc1: UAPI Changes: Cross-subsystem Changes: - Document port and rotation dt bindings better. - For panel timing DT bindings, document that vsync and hsync are first, rather than last in image. - Fix video/aperture typos. Core Changes: - Reject prime DMA-Buf attachment if get_sg_table is missing. (For self-importing dma-buf only.) - Add prime import/export to vram-helper. - Fix oops in drm/vblank when init is not called. - Fixup xres/yres_virtual and other fixes in fb helper. - Improve SCDC debugs. - Skip setting deadline on modesets. - Assorted TTM fixes. Driver Changes: - Add lima usage stats. - Assorted fixes to bridge/lt8192b, tc358767, ivpu, bridge/ti-sn65dsi83, ps8640. - Use pci aperture helpers in drm/ast lynxfb, radeonfb. - Revert some lima patches, as they required a commit that has been reverted upstream. - Add AUO NE135FBM-N41 v8.1 eDP panel. - Add QAIC accel driver. Signed-off-by: Daniel Vetter <[email protected]> From: Maarten Lankhorst <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-04-06	platform/x86: apple-gmux: Fix iomem_base __iomem annotation	Hans de Goede	1	-1/+1
	Fix the __iomem annotation of the iomem_base pointers in the apple-gmux code. The __iomem should go before the . This fixes a bunch of sparse warnings like this one: drivers/platform/x86/apple-gmux.c:224:48: sparse: expected void const [noderef] __iomem got unsigned char [usertype] * Fixes: 0c18184de990 ("platform/x86: apple-gmux: support MMIO gmux on T2 Macs") Reported-by: kernel test robot <[email protected]> Link: https://lore.kernel.org/oe-kbuild-all/[email protected]/ Suggested-by: Dan Carpenter <[email protected]> Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Orlando Chamberlain <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-04-06	Merge tag 'drm-intel-gt-next-2023-04-06' of ↵	Daniel Vetter	1	-1/+24
	git://anongit.freedesktop.org/drm/drm-intel into drm-next UAPI Changes: - (Build-time only, should not have any impact) drm/i915/uapi: Replace fake flex-array with flexible-array member "Zero-length arrays as fake flexible arrays are deprecated and we are moving towards adopting C99 flexible-array members instead." This is on core kernel request moving towards GCC 13. Driver Changes: - Fix context runtime accounting on sysfs fdinfo for heavy workloads (Tvrtko) - Add support for OA media units on MTL (Umesh) - Add new workarounds for Meteorlake (Daniele, Radhakrishna, Haridhar) - Fix sysfs to read actual frequency for MTL and Gen6 and earlier (Ashutosh) - Synchronize i915/BIOS on C6 enabling on MTL (Vinay) - Fix DMAR error noise due to GPU error capture (Andrej) - Fix forcewake during BAR resize on discrete (Andrzej) - Flush lmem contents after construction on discrete (Chris) - Fix GuC loading timeout on systems where IFWI programs low boot frequency (John) - Fix race condition UAF in i915_perf_add_config_ioctl (Min) - Sanitycheck MMIO access early in driver load and during forcewake (Matt) - Wakeref fixes for GuC RC error scenario and active VM tracking (Chris) - Cancel HuC delayed load timer on reset (Daniele) - Limit double GT reset to pre-MTL (Daniele) - Use i915 instead of dev_priv insied the file_priv structure (Andi) - Improve GuC load error reporting (John) - Simplify VCS/BSD engine selection logic (Tvrtko) - Perform uc late init after probe error injection (Andrzej) - Fix format for perf_limit_reasons in debugfs (Vinay) - Create per-gt debugfs files (Andi) - Documentation and kerneldoc fixes (Nirmoy, Lee) - Selftest improvements (Fei, Jonathan) Signed-off-by: Daniel Vetter <[email protected]> From: Joonas Lahtinen <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/ZC6APj/[email protected]
2023-04-06	netfilter: br_netfilter: fix recent physdev match breakage	Florian Westphal	1	-0/+1
	Recent attempt to ensure PREROUTING hook is executed again when a decrypted ipsec packet received on a bridge passes through the network stack a second time broke the physdev match in INPUT hook. We can't discard the nf_bridge info strct from sabotage_in hook, as this is needed by the physdev match. Keep the struct around and handle this with another conditional instead. Fixes: 2b272bb558f1 ("netfilter: br_netfilter: disable sabotage_in hook after first suppression") Reported-and-tested-by: Farid BENAMROUCHE <[email protected]> Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2023-04-06	crypto: hash - Remove maximum statesize limit	Herbert Xu	1	-2/+0
	Remove the HASH_MAX_STATESIZE limit now that it is unused. Signed-off-by: Herbert Xu <[email protected]>
2023-04-06	accel/qaic: Add uapi and core driver file	Jeffrey Hugo	1	-0/+397
	Add the QAIC driver uapi file and core driver file that binds to the PCIe device. The core driver file also creates the accel device and manages all the interconnections between the different parts of the driver. The driver can be built as a module. If so, it will be called "qaic.ko". Signed-off-by: Jeffrey Hugo <[email protected]> Reviewed-by: Carl Vanderlip <[email protected]> Reviewed-by: Pranjal Ramajor Asha Kanojiya <[email protected]> Reviewed-by: Stanislaw Gruszka <[email protected]> Reviewed-by: Jacek Lawrynowicz <[email protected]> Acked-by: Oded Gabbay <[email protected]> Signed-off-by: Jacek Lawrynowicz <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-04-05	kallsyms: move module-related functions under correct configs	Viktor Malik	1	-61/+74
	Functions for searching module kallsyms should have non-empty definitions only if CONFIG_MODULES=y and CONFIG_KALLSYMS=y. Until now, only CONFIG_MODULES check was used for many of these, which may have caused complilation errors on some configs. This patch moves all relevant functions under the correct configs. Fixes: bd5314f8dd2d ("kallsyms, bpf: Move find_kallsyms_symbol_value out of internal header") Signed-off-by: Viktor Malik <[email protected]> Reported-by: kernel test robot <[email protected]> Link: https://lore.kernel.org/oe-kbuild-all/[email protected]/ Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2023-04-05	rpmsg: qcom_smd: Make qcom_smd_unregister_edge() return void	Uwe Kleine-König	1	-3/+2
	This function returned zero unconditionally. Convert it to return no value instead. This makes it more obvious what happens in the callers. One caller is converted to return zero explicitly. The only other caller (smd_subdev_stop() in drivers/remoteproc/qcom_common.c) already ignored the return value before. Signed-off-by: Uwe Kleine-König <[email protected]> Signed-off-by: Bjorn Andersson <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-04-05	sched/numa: use hash_32 to mix up PIDs accessing VMA	Raghavendra K T	1	-1/+1
	before: last 6 bits of PID is used as index to store information about tasks accessing VMA's. after: hash_32 is used to take of cases where tasks are created over a period of time, and thus improve collision probability. Result: The patch series overall improves autonuma cost. Kernbench around more than 5% improvement and system time in mmtest autonuma showed more than 80% improvement Link: https://lkml.kernel.org/r/d5a9f75513300caed74e5c8570bba9317b963c2b.1677672277.git.raghavendra.kt@amd.com Signed-off-by: Raghavendra K T <[email protected]> Suggested-by: Peter Zijlstra <[email protected]> Cc: Bharata B Rao <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Disha Talreja <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Mike Rapoport <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-04-05	sched/numa: implement access PID reset logic	Raghavendra K T	2	-3/+4
	This helps to ensure that only recently accessed PIDs scan the VMAs. Current implementation: (idea supported by PeterZ) 1. Accessing PID information is maintained in two windows. access_pids[1] being newest. 2. Reset old access PID info i.e. access_pid[0] every (4 * sysctl_numa_balancing_scan_delay) interval after initial scan delay period expires. The above interval seemed to be experimentally optimum since it avoids frequent reset of access info as well as helps clearing the old access info regularly. The reset logic is implemented in scan path. Link: https://lkml.kernel.org/r/f7a675f66d1442d048b4216b2baf94515012c405.1677672277.git.raghavendra.kt@amd.com Signed-off-by: Raghavendra K T <[email protected]> Suggested-by: Mel Gorman <[email protected]> Cc: Bharata B Rao <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Disha Talreja <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Peter Zijlstra <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-04-05	sched/numa: enhance vma scanning logic	Raghavendra K T	2	-0/+15
	During Numa scanning make sure only relevant vmas of the tasks are scanned. Before: All the tasks of a process participate in scanning the vma even if they do not access vma in it's lifespan. Now: Except cases of first few unconditional scans, if a process do not touch vma (exluding false positive cases of PID collisions) tasks no longer scan all vma Logic used: 1) 6 bits of PID used to mark active bit in vma numab status during fault to remember PIDs accessing vma. (Thanks Mel) 2) Subsequently in scan path, vma scanning is skipped if current PID had not accessed vma. 3) First two times we do allow unconditional scan to preserve earlier behaviour of scanning. Acknowledgement to Bharata B Rao <[email protected]> for initial patch to store pid information and Peter Zijlstra <[email protected]> (Usage of test and set bit) Link: https://lkml.kernel.org/r/092f03105c7c1d3450f4636b1ea350407f07640e.1677672277.git.raghavendra.kt@amd.com Signed-off-by: Raghavendra K T <[email protected]> Suggested-by: Mel Gorman <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Disha Talreja <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Mike Rapoport <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-04-05	sched/numa: apply the scan delay to every new vma	Mel Gorman	2	-0/+23
	Pach series "sched/numa: Enhance vma scanning", v3. The patchset proposes one of the enhancements to numa vma scanning suggested by Mel. This is continuation of [3]. Reposting the rebased patchset to akpm mm-unstable tree (March 1) Existing mechanism of scan period involves, scan period derived from per-thread stats. Process Adaptive autoNUMA [1] proposed to gather NUMA fault stats at per-process level to capture aplication behaviour better. During that course of discussion, Mel proposed several ideas to enhance current numa balancing. One of the suggestion was below Track what threads access a VMA. The suggestion was to use an unsigned long pid_mask and use the lower bits to tag approximately what threads access a VMA. Skip VMAs that did not trap a fault. This would be approximate because of PID collisions but would reduce scanning of areas the thread is not interested in. The above suggestion intends not to penalize threads that has no interest in the vma, thus reduce scanning overhead. V3 changes are mostly based on PeterZ comments (details below in changes) Summary of patchset: Current patchset implements: 1. Delay the vma scanning logic for newly created VMA's so that additional overhead of scanning is not incurred for short lived tasks (implementation by Mel) 2. Store the information of tasks accessing VMA in 2 windows. It is regularly cleared in (4sysctl_numa_balancing_scan_delay) interval. The above time is derived from experimenting (Suggested by PeterZ) to balance between frequent clearing vs obsolete access data 3. hash_32 used to encode task index accessing VMA information 4. VMA's acess information is used to skip scanning for the tasks which had not accessed VMA Changes since V2: patch1: - Renaming of structure, macro to function, - Add explanation to heuristics - Adding more details from result (PeterZ) Patch2: - Usage of test and set bit (PeterZ) - Move storing access PID info to numa_migrate_prep() - Add a note on fainess among tasks allowed to scan (PeterZ) Patch3: - Maintain two windows of access PID information (PeterZ supported implementation and Gave idea to extend to N if needed) Patch4: - Apply hash_32 function to track VMA accessing PIDs (PeterZ) Changes since RFC V1: - Include Mel's vma scan delay patch - Change the accessing pid store logic (Thanks Mel) - Fencing structure / code to NUMA_BALANCING (David, Mel) - Adding clearing access PID logic (Mel) - Descriptive change log ( Mike Rapoport) Things to ponder over: ========================================== - Improvement to clearing accessing PIDs logic (discussed in-detail in patch3 itself (Done in this patchset by implementing 2 window history) - Current scan period is not changed in the patchset, so we do see frequent tries to scan. Relaxing scan period dynamically could improve results further. [1] sched/numa: Process Adaptive autoNUMA Link: https://lore.kernel.org/lkml/[email protected]/T/ [2] RFC V1 Link: https://lore.kernel.org/all/[email protected]/ [3] V2 Link: https://lore.kernel.org/lkml/[email protected]/ Results: Summary: Huge autonuma cost reduction seen in mmtest. Kernbench improvement is more than 5% and huge system time (80%+) improvement from mmtest autonuma. (dbench had huge std deviation to post) kernbench =========== 6.2.0-mmunstable-base 6.2.0-mmunstable-patched Amean user-256 22002.51 ( 0.00%) 22649.95 -2.94%* Amean syst-256 10162.78 ( 0.00%) 8214.13 * 19.17%* Amean elsp-256 160.74 ( 0.00%) 156.92 * 2.38%* Duration User 66017.43 67959.84 Duration System 30503.15 24657.03 Duration Elapsed 504.61 493.12 6.2.0-mmunstable-base 6.2.0-mmunstable-patched Ops NUMA alloc hit 1738835089.00 1738780310.00 Ops NUMA alloc local 1738834448.00 1738779711.00 Ops NUMA base-page range updates 477310.00 392566.00 Ops NUMA PTE updates 477310.00 392566.00 Ops NUMA hint faults 96817.00 87555.00 Ops NUMA hint local faults % 10150.00 2192.00 Ops NUMA hint local percent 10.48 2.50 Ops NUMA pages migrated 86660.00 85363.00 Ops AutoNUMA cost 489.07 442.14 autonumabench =============== 6.2.0-mmunstable-base 6.2.0-mmunstable-patched Amean syst-NUMA01 399.50 ( 0.00%) 52.05 * 86.97%* Amean syst-NUMA01_THREADLOCAL 0.21 ( 0.00%) 0.22 * -5.41%* Amean syst-NUMA02 0.80 ( 0.00%) 0.78 * 2.68%* Amean syst-NUMA02_SMT 0.65 ( 0.00%) 0.68 * -3.95%* Amean elsp-NUMA01 313.26 ( 0.00%) 313.11 * 0.05%* Amean elsp-NUMA01_THREADLOCAL 1.06 ( 0.00%) 1.08 * -1.76%* Amean elsp-NUMA02 3.19 ( 0.00%) 3.24 * -1.52%* Amean elsp-NUMA02_SMT 3.72 ( 0.00%) 3.61 * 2.92%* Duration User 396433.47 324835.96 Duration System 2808.70 376.66 Duration Elapsed 2258.61 2258.12 6.2.0-mmunstable-base 6.2.0-mmunstable-patched Ops NUMA alloc hit 59921806.00 49623489.00 Ops NUMA alloc miss 0.00 0.00 Ops NUMA interleave hit 0.00 0.00 Ops NUMA alloc local 59920880.00 49622594.00 Ops NUMA base-page range updates 152259275.00 50075.00 Ops NUMA PTE updates 152259275.00 50075.00 Ops NUMA PMD updates 0.00 0.00 Ops NUMA hint faults 154660352.00 39014.00 Ops NUMA hint local faults % 138550501.00 23139.00 Ops NUMA hint local percent 89.58 59.31 Ops NUMA pages migrated 8179067.00 14147.00 Ops AutoNUMA cost 774522.98 195.69 This patch (of 4): Currently whenever a new task is created we wait for sysctl_numa_balancing_scan_delay to avoid unnessary scanning overhead. Extend the same logic to new or very short-lived VMAs. [[email protected]: add initialization in vm_area_dup())] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/7a6fbba87c8b51e67efd3e74285bb4cb311a16ca.1677672277.git.raghavendra.kt@amd.com Signed-off-by: Mel Gorman <[email protected]> Signed-off-by: Raghavendra K T <[email protected]> Cc: Bharata B Rao <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Disha Talreja <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-04-05	mm: separate vma->lock from vm_area_struct	Suren Baghdasaryan	2	-14/+15
	vma->lock being part of the vm_area_struct causes performance regression during page faults because during contention its count and owner fields are constantly updated and having other parts of vm_area_struct used during page fault handling next to them causes constant cache line bouncing. Fix that by moving the lock outside of the vm_area_struct. All attempts to keep vma->lock inside vm_area_struct in a separate cache line still produce performance regression especially on NUMA machines. Smallest regression was achieved when lock is placed in the fourth cache line but that bloats vm_area_struct to 256 bytes. Considering performance and memory impact, separate lock looks like the best option. It increases memory footprint of each VMA but that can be optimized later if the new size causes issues. Note that after this change vma_init() does not allocate or initialize vma->lock anymore. A number of drivers allocate a pseudo VMA on the stack but they never use the VMA's lock, therefore it does not need to be allocated. The future drivers which might need the VMA lock should use vm_area_alloc()/vm_area_free() to allocate the VMA. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Suren Baghdasaryan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-04-05	mm/mmap: free vm_area_struct without call_rcu in exit_mmap	Suren Baghdasaryan	1	-0/+2
	call_rcu() can take a long time when callback offloading is enabled. Its use in the vm_area_free can cause regressions in the exit path when multiple VMAs are being freed. Because exit_mmap() is called only after the last mm user drops its refcount, the page fault handlers can't be racing with it. Any other possible user like oom-reaper or process_mrelease are already synchronized using mmap_lock. Therefore exit_mmap() can free VMAs directly, without the use of call_rcu(). Expose __vm_area_free() and use it from exit_mmap() to avoid possible call_rcu() floods and performance regressions caused by it. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Suren Baghdasaryan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-04-05	mm: introduce per-VMA lock statistics	Suren Baghdasaryan	2	-0/+12
	Add a new CONFIG_PER_VMA_LOCK_STATS config option to dump extra statistics about handling page fault under VMA lock. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Suren Baghdasaryan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-04-05	mm: add FAULT_FLAG_VMA_LOCK flag	Suren Baghdasaryan	2	-1/+4
	Add a new flag to distinguish page faults handled under protection of per-vma lock. [[email protected]: document FAULT_FLAG_VMA_LOCK flag] Link: https://lkml.kernel.org/r/[email protected] Link: https://lore.kernel.org/all/[email protected]/ Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Suren Baghdasaryan <[email protected]> Reviewed-by: Laurent Dufour <[email protected]> Cc: Dan Carpenter <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: David Howells <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Liam R. Howlett <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Yu Zhao <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-04-05	mm: introduce lock_vma_under_rcu to be used from arch-specific code	Suren Baghdasaryan	1	-0/+3
	Introduce lock_vma_under_rcu function to lookup and lock a VMA during page fault handling. When VMA is not found, can't be locked or changes after being locked, the function returns NULL. The lookup is performed under RCU protection to prevent the found VMA from being destroyed before the VMA lock is acquired. VMA lock statistics are updated according to the results. For now only anonymous VMAs can be searched this way. In other cases the function returns NULL. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Suren Baghdasaryan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-04-05	mm: introduce vma detached flag	Suren Baghdasaryan	2	-0/+14
	Per-vma locking mechanism will search for VMA under RCU protection and then after locking it, has to ensure it was not removed from the VMA tree after we found it. To make this check efficient, introduce a vma->detached flag to mark VMAs which were removed from the VMA tree. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Suren Baghdasaryan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-04-05	mm/khugepaged: write-lock VMA while collapsing a huge page	Suren Baghdasaryan	1	-11/+30
	Protect VMA from concurrent page fault handler while collapsing a huge page. Page fault handler needs a stable PMD to use PTL and relies on per-VMA lock to prevent concurrent PMD changes. pmdp_collapse_flush(), set_huge_pmd() and collapse_and_free_pmd() can modify a PMD, which will not be detected by a page fault handler without proper locking. Before this patch, page tables can be walked under any one of the mmap_lock, the mapping lock, and the anon_vma lock; so when khugepaged unlinks and frees page tables, it must ensure that all of those either are locked or don't exist. This patch adds a fourth lock under which page tables can be traversed, and so khugepaged must also lock out that one. [[email protected]: vm_lock/i_mmap_rwsem inversion in retract_page_tables] Link: https://lkml.kernel.org/r/[email protected] [[email protected]: build fix] Link: https://lkml.kernel.org/r/CAJuCfpFjWhtzRE1X=J+_JjgJzNKhq-=JT8yTBSTHthwp0pqWZw@mail.gmail.com Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Suren Baghdasaryan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-04-05	mm: mark VMA as being written when changing vm_flags	Suren Baghdasaryan	1	-5/+5
	Updates to vm_flags have to be done with VMA marked as being written for preventing concurrent page faults or other modifications. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Suren Baghdasaryan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-04-05	mm: add per-VMA lock and helper functions to control it	Suren Baghdasaryan	3	-0/+103
	Introduce per-VMA locking. The lock implementation relies on a per-vma and per-mm sequence counters to note exclusive locking: - read lock - (implemented by vma_start_read) requires the vma (vm_lock_seq) and mm (mm_lock_seq) sequence counters to differ. If they match then there must be a vma exclusive lock held somewhere. - read unlock - (implemented by vma_end_read) is a trivial vma->lock unlock. - write lock - (vma_start_write) requires the mmap_lock to be held exclusively and the current mm counter is assigned to the vma counter. This will allow multiple vmas to be locked under a single mmap_lock write lock (e.g. during vma merging). The vma counter is modified under exclusive vma lock. - write unlock - (vma_end_write_all) is a batch release of all vma locks held. It doesn't pair with a specific vma_start_write! It is done before exclusive mmap_lock is released by incrementing mm sequence counter (mm_lock_seq). - write downgrade - if the mmap_lock is downgraded to the read lock, all vma write locks are released as well (effectivelly same as write unlock). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Suren Baghdasaryan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-04-05	mm: move mmap_lock assert function definitions	Suren Baghdasaryan	1	-12/+12
	Move mmap_lock assert function definitions up so that they can be used by other mmap_lock routines. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Suren Baghdasaryan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-04-05	mm: rcu safe VMA freeing	Michel Lespinasse	1	-3/+10
	This prepares for page faults handling under VMA lock, looking up VMAs under protection of an rcu read lock, instead of the usual mmap read lock. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Michel Lespinasse <[email protected]> Signed-off-by: Suren Baghdasaryan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-04-05	trace: cma: remove unnecessary event class cma_alloc_class	Wenchao Hao	1	-33/+25
	After commit cb6c33d4dc09 ("cma: tracing: print alloc result in trace_cma_alloc_finish"), cma_alloc_class has only one event which is cma_alloc_busy_retry. So we can remove the cma_alloc_class. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Wenchao Hao <[email protected]> Acked-by: Steven Rostedt (Google) <[email protected]> Cc: Feilong Lin <[email protected]> Cc: Hongxiang Lou <[email protected]> Cc: Masami Hiramatsu <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-04-05	mm: vmalloc: convert vread() to vread_iter()	Lorenzo Stoakes	1	-1/+2
	Having previously laid the foundation for converting vread() to an iterator function, pull the trigger and do so. This patch attempts to provide minimal refactoring and to reflect the existing logic as best we can, for example we continue to zero portions of memory not read, as before. Overall, there should be no functional difference other than a performance improvement in /proc/kcore access to vmalloc regions. Now we have eliminated the need for a bounce buffer in read_kcore_iter(), we dispense with it, and try to write to user memory optimistically but with faults disabled via copy_page_to_iter_nofault(). We already have preemption disabled by holding a spin lock. We continue faulting in until the operation is complete. Additionally, we must account for the fact that at any point a copy may fail (most likely due to a fault not being able to occur), we exit indicating fewer bytes retrieved than expected. [[email protected]: fix sparc64 warning] Link: https://lkml.kernel.org/r/[email protected] [[email protected]: redo Stephen's sparc build fix] Link: https://lkml.kernel.org/r/8506cbc667c39205e65a323f750ff9c11a463798.1679566220.git.lstoakes@gmail.com [[email protected]: unbreak uio.h includes] Link: https://lkml.kernel.org/r/941f88bc5ab928e6656e1e2593b91bf0f8c81e1b.1679511146.git.lstoakes@gmail.com Signed-off-by: Lorenzo Stoakes <[email protected]> Signed-off-by: Stephen Rothwell <[email protected]> Reviewed-by: Baoquan He <[email protected]> Cc: Alexander Viro <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Liu Shixin <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Uladzislau Rezki (Sony) <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-04-05	iov_iter: add copy_page_to_iter_nofault()	Lorenzo Stoakes	1	-0/+2
	Provide a means to copy a page to user space from an iterator, aborting if a page fault would occur. This supports compound pages, but may be passed a tail page with an offset extending further into the compound page, so we cannot pass a folio. This allows for this function to be called from atomic context and _try_ to user pages if they are faulted in, aborting if not. The function does not use _copy_to_iter() in order to not specify might_fault(), this is similar to copy_page_from_iter_atomic(). This is being added in order that an iteratable form of vread() can be implemented while holding spinlocks. Link: https://lkml.kernel.org/r/19734729defb0f498a76bdec1bef3ac48a3af3e8.1679511146.git.lstoakes@gmail.com Signed-off-by: Lorenzo Stoakes <[email protected]> Reviewed-by: Baoquan He <[email protected]> Cc: Alexander Viro <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Liu Shixin <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Uladzislau Rezki (Sony) <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-04-05	mm/page_alloc: make deferred page init free pages in MAX_ORDER blocks	Kirill A. Shutemov	1	-0/+2
	Normal page init path frees pages during the boot in MAX_ORDER chunks, but deferred page init path does it in pageblock blocks. Change deferred page init path to work in MAX_ORDER blocks. For cases when MAX_ORDER is larger than pageblock, set migrate type to MIGRATE_MOVABLE for all pageblocks covered by the page. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Kirill A. Shutemov <[email protected]> Reviewed-by: Vlastimil Babka <[email protected]> Acked-by: David Hildenbrand <[email protected]> Acked-by: Mel Gorman <[email protected]> Acked-by: Mike Rapoport (IBM) <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-04-05	mm: remove vmf_insert_pfn_xxx_prot() for huge page-table entries	Lorenzo Stoakes	1	-37/+2
	This functionality's sole user, the drm ttm module, removed support for it in commit 0d979509539e ("drm/ttm: remove ttm_bo_vm_insert_huge()") as the whole approach is currently unworkable without a PMD/PUD special bit and updates to GUP. Link: https://lkml.kernel.org/r/604c2ad79659d4b8a6e3e1611c6219d5d3233988.1678661628.git.lstoakes@gmail.com Signed-off-by: Lorenzo Stoakes <[email protected]> Cc: Christian König <[email protected]> Cc: Dan Williams <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Kirill A. Shutemov <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Thomas Hellström <[email protected]> Cc: Aaron Tomlin <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Marcelo Tosatti <[email protected]> Cc: Peter Xu <[email protected]> Cc: "Russell King (Oracle)" <[email protected]> Cc: Vlastimil Babka <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-04-05	mm: remove unused vmf_insert_mixed_prot()	Lorenzo Stoakes	2	-8/+1
	Patch series "Remove drm/ttm-specific mm changes". Functionality was added specifically for the DRM TTM driver to support mapping memory for VM_MIXEDMAP VMAs with customised protection flags, however this has now been rolled back as issues were found with this approach. This series removes the mm changes too, retaining some of the useful comments. This patch (of 3): The sole user of vmf_insert_mixed_prot(), the drm ttm module, stopped using this in commit f91142c62161 ("drm/ttm: nuke VM_MIXEDMAP on BO mappings v3") citing use of VM_MIXEDMAP in this case being terribly broken. Remove this now-dead code and references to it, but retain the useful description of the prot != vma->vm_page_prot case, moving it to vmf_insert_pfn_prot() instead. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/a069644388e6f1593a7020d15840e6fc9f39bcaf.1678661628.git.lstoakes@gmail.com Signed-off-by: Lorenzo Stoakes <[email protected]> Cc: Christian König <[email protected]> Cc: Dan Williams <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Kirill A. Shutemov <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Thomas Hellström <[email protected]> Cc: Aaron Tomlin <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Marcelo Tosatti <[email protected]> Cc: Peter Xu <[email protected]> Cc: "Russell King (Oracle)" <[email protected]> Cc: Vlastimil Babka <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-04-05	mm/memtest: add results of early memtest to /proc/meminfo	Tomas Mudrunka	1	-0/+2
	Currently the memtest results were only presented in dmesg. When running a large fleet of devices without ECC RAM it's currently not easy to do bulk monitoring for memory corruption. You have to parse dmesg, but that's a ring buffer so the error might disappear after some time. In general I do not consider dmesg to be a great API to query RAM status. In several companies I've seen such errors remain undetected and cause issues for way too long. So I think it makes sense to provide a monitoring API, so that we can safely detect and act upon them. This adds /proc/meminfo entry which can be easily used by scripts. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Tomas Mudrunka <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Mike Rapoport (IBM) <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-04-05	mm: move vmalloc_init() declaration to mm/internal.h	Mike Rapoport (IBM)	1	-4/+0
	vmalloc_init() is called only from mm_core_init(), there is no need to declare it in include/linux/vmalloc.h Move vmalloc_init() declaration to mm/internal.h Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Mike Rapoport (IBM) <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Reviewed-by: Vlastimil Babka <[email protected]> Cc: Doug Berger <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-04-05	mm: move kmem_cache_init() declaration to mm/slab.h	Mike Rapoport (IBM)	1	-1/+0
	kmem_cache_init() is called only from mm_core_init(), there is no need to declare it in include/linux/slab.h Move kmem_cache_init() declaration to mm/slab.h Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Mike Rapoport (IBM) <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Reviewed-by: Vlastimil Babka <[email protected]> Cc: Doug Berger <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-04-05	mm: move mem_init_print_info() to mm_init.c	Mike Rapoport (IBM)	1	-1/+0
	mem_init_print_info() is only called from mm_core_init(). Move it close to the caller and make it static. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Mike Rapoport (IBM) <[email protected]> Acked-by: David Hildenbrand <[email protected]> Reviewed-by: Vlastimil Babka <[email protected]> Cc: Doug Berger <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-04-05	init,mm: fold late call to page_ext_init() to page_alloc_init_late()	Mike Rapoport (IBM)	1	-2/+0
	When deferred initialization of struct pages is enabled, page_ext_init() must be called after all the deferred initialization is done, but there is no point to keep it a separate call from kernel_init_freeable() right after page_alloc_init_late(). Fold the call to page_ext_init() into page_alloc_init_late() and localize deferred_struct_pages variable. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Mike Rapoport (IBM) <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Reviewed-by: Vlastimil Babka <[email protected]> Cc: Doug Berger <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-04-05	mm: move init_mem_debugging_and_hardening() to mm/mm_init.c	Mike Rapoport (IBM)	1	-1/+0
	init_mem_debugging_and_hardening() is only called from mm_core_init(). Move it close to the caller, make it static and rename it to mem_debugging_and_hardening_init() for consistency with surrounding convention. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Mike Rapoport (IBM) <[email protected]> Acked-by: David Hildenbrand <[email protected]> Reviewed-by: Vlastimil Babka <[email protected]> Cc: Doug Berger <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-04-05	mm: call {ptlock,pgtable}_cache_init() directly from mm_core_init()	Mike Rapoport (IBM)	1	-6/+0
	and drop pgtable_init() as it has no real value and its name is misleading. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Mike Rapoport (IBM) <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Reviewed-by: Vlastimil Babka <[email protected]> Cc: Doug Berger <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Sergei Shtylyov <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-04-05	init,mm: move mm_init() to mm/mm_init.c and rename it to mm_core_init()	Mike Rapoport (IBM)	1	-0/+1
	Make mm_init() a part of mm/ codebase. mm_core_init() better describes what the function does and does not clash with mm_init() in kernel/fork.c Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Mike Rapoport (IBM) <[email protected]> Acked-by: David Hildenbrand <[email protected]> Reviewed-by: Vlastimil Babka <[email protected]> Cc: Doug Berger <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-04-05	mm/page_alloc: rename page_alloc_init() to page_alloc_init_cpuhp()	Mike Rapoport (IBM)	1	-1/+1
	The page_alloc_init() name is really misleading because all this function does is sets up CPU hotplug callbacks for the page allocator. Rename it to page_alloc_init_cpuhp() so that name will reflect what the function does. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Mike Rapoport (IBM) <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Reviewed-by: Vlastimil Babka <[email protected]> Cc: Doug Berger <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-04-05	mm: move most of core MM initialization to mm/mm_init.c	Mike Rapoport (IBM)	1	-5/+0
	The bulk of memory management initialization code is spread all over mm/page_alloc.c and makes navigating through page allocator functionality difficult. Move most of the functions marked __init and __meminit to mm/mm_init.c to make it better localized and allow some more spare room before mm/page_alloc.c reaches 10k lines. No functional changes. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Mike Rapoport (IBM) <[email protected]> Acked-by: David Hildenbrand <[email protected]> Acked-by: Vlastimil Babka <[email protected]> Cc: Doug Berger <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-04-05	mm: move get_page_from_free_area() to mm/page_alloc.c	Mike Rapoport (IBM)	1	-7/+0
	The get_page_from_free_area() helper is only used in mm/page_alloc.c so move it there to reduce noise in include/linux/mmzone.h Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Mike Rapoport (IBM) <[email protected]> Reviewed-by: Lorenzo Stoakes <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Reviewed-by: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-04-05	mm: userfaultfd: add UFFDIO_CONTINUE_MODE_WP to install WP PTEs	Axel Rasmussen	2	-1/+9
	UFFDIO_COPY already has UFFDIO_COPY_MODE_WP, so when installing a new PTE to resolve a missing fault, one can install a write-protected one. This is useful when using UFFDIO_REGISTER_MODE_{MISSING,WP} in combination. This was motivated by testing HugeTLB HGM [1], and in particular its interaction with userfaultfd features. Existing userfaultfd code supports using WP and MINOR modes together (i.e. you can register an area with both enabled), but without this CONTINUE flag the combination is in practice unusable. So, add an analogous UFFDIO_CONTINUE_MODE_WP, which does the same thing as UFFDIO_COPY_MODE_WP, but for minor faults. Update the selftest to do some very basic exercising of the new flag. Update Documentation/ to describe how these flags are used (neither the COPY nor the new CONTINUE versions of this mode flag were described there before). [1]: https://patchwork.kernel.org/project/linux-mm/cover/[email protected]/ Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Axel Rasmussen <[email protected]> Acked-by: Peter Xu <[email protected]> Acked-by: Mike Rapoport (IBM) <[email protected]> Cc: Al Viro <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Jan Kara <[email protected]> Cc: Liam R. Howlett <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Muchun Song <[email protected]> Cc: Nadav Amit <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-04-05	mm: userfaultfd: combine 'mode' and 'wp_copy' arguments	Axel Rasmussen	3	-24/+37
	Many userfaultfd ioctl functions take both a 'mode' and a 'wp_copy' argument. In future commits we plan to plumb the flags through to more places, so we'd be proliferating the very long argument list even further. Let's take the time to simplify the argument list. Combine the two arguments into one - and generalize, so when we add more flags in the future, it doesn't imply more function arguments. Since the modes (copy, zeropage, continue) are mutually exclusive, store them as an integer value (0, 1, 2) in the low bits. Place combine-able flag bits in the high bits. This is quite similar to an earlier patch proposed by Nadav Amit ("userfaultfd: introduce uffd_flags" [1]). The main difference is that patch only handled flags, whereas this patch also combines the "mode" argument into the same type to shorten the argument list. [1]: https://lore.kernel.org/all/[email protected]/ Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Axel Rasmussen <[email protected]> Acked-by: James Houghton <[email protected]> Acked-by: Peter Xu <[email protected]> Acked-by: Mike Rapoport (IBM) <[email protected]> Cc: Al Viro <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Jan Kara <[email protected]> Cc: Liam R. Howlett <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Muchun Song <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-04-05	mm: userfaultfd: don't pass around both mm and vma	Axel Rasmussen	3	-7/+6
	Quite a few userfaultfd functions took both mm and vma pointers as arguments. Since the mm is trivially accessible via vma->vm_mm, there's no reason to pass both; it just needlessly extends the already long argument list. Get rid of the mm pointer, where possible, to shorten the argument list. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Axel Rasmussen <[email protected]> Acked-by: Peter Xu <[email protected]> Acked-by: Mike Rapoport (IBM) <[email protected]> Cc: Al Viro <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: James Houghton <[email protected]> Cc: Jan Kara <[email protected]> Cc: Liam R. Howlett <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Muchun Song <[email protected]> Cc: Nadav Amit <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-04-05	mm: userfaultfd: rename functions for clarity + consistency	Axel Rasmussen	2	-24/+24
	Patch series "mm: userfaultfd: refactor and add UFFDIO_CONTINUE_MODE_WP", v5. - Commits 1-3 refactor userfaultfd ioctl code without behavior changes, with the main goal of improving consistency and reducing the number of function args. - Commit 4 adds UFFDIO_CONTINUE_MODE_WP. This patch (of 4): The basic problem is, over time we've added new userfaultfd ioctls, and we've refactored the code so functions which used to handle only one case are now re-used to deal with several cases. While this happened, we didn't bother to rename the functions. Similarly, as we added new functions, we cargo-culted pieces of the now-inconsistent naming scheme, so those functions too ended up with names that don't make a lot of sense. A key point here is, "copy" in most userfaultfd code refers specifically to UFFDIO_COPY, where we allocate a new page and copy its contents from userspace. There are many functions with "copy" in the name that don't actually do this (at least in some cases). So, rename things into a consistent scheme. The high level idea is that the call stack for userfaultfd ioctls becomes: userfaultfd_ioctl -> userfaultfd_(particular ioctl) -> mfill_atomic_(particular kind of fill operation) -> mfill_atomic /* loops over pages in range / -> mfill_atomic_pte / deals with single pages */ -> mfill_atomic_pte_(particular kind of fill operation) -> mfill_atomic_install_pte There are of course some special cases (shmem, hugetlb), but this is the general structure which all function names now adhere to. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Axel Rasmussen <[email protected]> Acked-by: Peter Xu <[email protected]> Acked-by: Mike Rapoport (IBM) <[email protected]> Cc: Al Viro <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: James Houghton <[email protected]> Cc: Jan Kara <[email protected]> Cc: Liam R. Howlett <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Muchun Song <[email protected]> Cc: Nadav Amit <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>