aboutsummaryrefslogtreecommitdiff
path: root/arch/arc
AgeCommit message (Collapse)AuthorFilesLines
2023-08-18ARC: pt_regs: create seperate type for ecrVineet Gupta7-45/+33
Reduces duplication in each ISA specific pt_regs Tested-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202308151342.ROQ9Urvv-lkp@intel.com Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2023-08-18ARCv2: entry: rearrange pt_regs slightlyVineet Gupta3-13/+15
Instead of r26,fp,sp,r12,r30 order as fp,r30,r12,r26,sp - keeps SP at well known position (right abive hardware autosave) - r26,r12 saved specifically for ARCv2 (and not in ARCv3) kept closer for easy ifdef'ry later Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2023-08-18ARC: entry: replace 8 byte ADD.ne with 4 byte ADD2.neVineet Gupta1-1/+1
ARCv2 current ------------ 000007e0 <EV_Trap>: 7e0: 2482 3c01 sub sp,sp,112 7e4: 1c28 3006 std r0r1,[sp,40] 7e8: 1c30 3086 std r2r3,[sp,48] 7ec: 1c38 3106 std r4r5,[sp,56] 7f0: 1c40 3186 std r6r7,[sp,64] 7f4: 1c48 3206 std r8r9,[sp,72] 7f8: 1c50 3286 std r10r11,[sp,80] 7fc: 1c58 37c0 st blink,[sp,88] 800: 1c0c 36c0 st fp,[sp,12] 804: 1c18 3680 st gp,[sp,24] 808: 1c10 3780 st r30,[sp,16] 80c: 1c14 3300 st r12,[sp,20] 810: 226a 1340 lr r10,[aux_user_sp] 814: 22ca 1702 mov.ne r10,sp 818: 22c0 1f82 0000 0070 add.ne r10,r10,0x70 ^^^^^^^^^ With fix -------- 000007b4 <EV_Trap>: 7b4: 2482 3c01 sub sp,sp,112 7b8: 1c28 3006 std r0r1,[sp,40] 7bc: 1c30 3086 std r2r3,[sp,48] 7c0: 1c38 3106 std r4r5,[sp,56] 7c4: 1c40 3186 std r6r7,[sp,64] 7c8: 1c48 3206 std r8r9,[sp,72] 7cc: 1c50 3286 std r10r11,[sp,80] 7d0: 1c58 37c0 st blink,[sp,88] 7d4: 1c0c 36c0 st fp,[sp,12] 7d8: 1c18 3680 st gp,[sp,24] 7dc: 1c10 3780 st r30,[sp,16] 7e0: 1c14 3300 st r12,[sp,20] 7e4: 226a 1340 lr r10,[aux_user_sp] 7e8: 22ca 1702 mov.ne r10,sp 7ec: 22d5 1722 add2.ne r10,r10,0x1c Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2023-08-18ARC: entry: replace 8 byte OR with 4 byte BSETVineet Gupta1-2/+2
FAKE_RET_FROM_EXCEPTION drops down to pure kernel mode. It currently has an 8 byte instruction which can be replaced with 4 byte BSET This is applicable to both ARCv2 and ARCv3 entr code. ARCv2 current ------------ 00000804 <EV_Trap>: ... 874: 216a 1280 lr r9,[status32] 878: 2146 1809 bic r9,r9,0x20 87c: 2105 1f89 8000 0000 or r9,r9,0x80000000 ^^^^^^^^^ 884: 2029 8240 kflag r9 ARCv2 after ---------- 000007e0 <EV_Trap>: ... 850: 216a 1280 lr r9,[status32] 854: 2150 1149 bclr r9,r9,0x5 858: 214f 17c9 bset r9,r9,0x1f 85c: 2029 8240 kflag r9 Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2023-08-18ARC: entry: Add more common chores to EXCEPTION_PROLOGUEVineet Gupta5-47/+24
THe high level structure of most ARC exception handlers is 1. save regfile with EXCEPTION_PROLOGUE 2. setup r0: EFA (not part of pt_regs) 3. setup r1: pointer to pt_regs (SP) 4. drop down to pure kernel mode (from exception) 5. call the Linux "C" handler Remove the boiler plate code by moving #2, #3, #4 into #1. The exceptions to most exceptions are syscall Trap and Machine check which don't do some of above for various reasons, so call a newly introduced variant EXCEPTION_PROLOGUE_KEEP_AE (same as original EXCEPTION_PROLOGUE) Tested-by: Pavel Kozlov <Pavel.Kozlov@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2023-08-18arc: mm: convert to GENERIC_IOREMAPBaoquan He3-49/+8
By taking GENERIC_IOREMAP method, the generic generic_ioremap_prot(), generic_iounmap(), and their generic wrapper ioremap_prot(), ioremap() and iounmap() are all visible and available to arch. Arch needs to provide wrapper functions to override the generic versions if there's arch specific handling in its ioremap_prot(), ioremap() or iounmap(). This change will simplify implementation by removing duplicated code with generic_ioremap_prot() and generic_iounmap(), and has the equivalent functioality as before. Here, add wrapper functions ioremap_prot() and iounmap() for arc's special operation when ioremap_prot() and iounmap(). Link: https://lkml.kernel.org/r/20230706154520.11257-8-bhe@redhat.com Signed-off-by: Baoquan He <bhe@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Brian Cain <bcain@quicinc.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Chris Zankel <chris@zankel.net> Cc: David Laight <David.Laight@ACULAB.COM> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Jonas Bonn <jonas@southpole.se> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Niklas Schnelle <schnelle@linux.ibm.com> Cc: Rich Felker <dalias@libc.org> Cc: Stafford Horne <shorne@gmail.com> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-17ARC: entry: EV_MachineCheck dont re-read ECRVineet Gupta1-3/+2
Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2023-08-17ARC: entry: ARcompact EV_ProtV to use r10 directlyVineet Gupta1-4/+2
Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2023-08-17ARC: entry: rework (non-functional)Vineet Gupta6-37/+39
- comments update - rename syscall_trace_entry - use PT_xxx in entry code Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2023-08-17ARC: __switch_to: move ksp to thread_info from thread_structVineet Gupta5-23/+20
task's arch specific bits are carried in 2 places - embedded thread_struct in task_struct - associated thread_info (hoisted in task's stack page) and syntactically: (thread_info *)(task_struct->stack) ksp (dynamic kernel stack top) currently lives in thread_struct but given its deep location in task struct likely to cache miss when accessed from __switch_to(). Moving it to thread_info would be more efficient given proximity to frequently accessed items such as preempt_count thus very likely to be in cache, specially in schedular code. Note however that currently tsk.thread.ksp takes 1 memory access (off of tsk pointer) while new code tsk->stack.ksp would take 2, but likely to be in cache. Moreover if task is current the 2nd reference can be elided and instead derived from SP as (SP & ~(THREAD_SIZE - 1)) All of this also makes __switch_to() code simpler and we can see the 2 ways of retirving ksp (descrobed above) in new code. Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2023-08-17ARC: __switch_to: asm with dwarf ops (vs. inline asm)Vineet Gupta4-155/+61
__switch_to() is final step of context switch, swapping kernel modes stack (and callee regs) of outgoing task with next task. It is also the starting point of stack unwinging of a sleeping task and captures SP, FP, BLINK and the corresponding dwarf info. Back when dinosaurs still roamed around, ARC gas didn't support CFI pseudo ops and gcc was responsible for generating dwarf info. Thus it had to be written in "C" with inline asm to do the hand crafting of stack. The function prologue (and crucial saving of blink etc) was still gcc generated but not visible in code. Likewise dwarf info was missing. Now with modern tools, we can make things more obvious by writing the code in asm and adding approproate dwarf cfi pseudo ops. This is mostly non functional change, except for slight chnages to asm - ARCompact doesn't support MOV_S fp, sp, so we use MOV Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2023-08-17ARC: kernel stack: INIT_THREAD need not setup @init_stack in @kspVineet Gupta1-3/+1
There are 2 pointers to kernel mode stack of a task - task_struct.stack: base address of stack page (max possible stack top) - thread_info.ksp : runtime stack top in __switch_to INIT_THREAD was setting up ksp to stack base which was not really needed - it would get overwritten with dynamic value on first call to __switch_to when init is switched out for the very first time. - generic code already does init_task.stack = init_stack and ARC code uses that to retrieve task's stack base. Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2023-08-17ARC: entry: use gp to cache task pointer (vs. r25)Vineet Gupta13-154/+58
The motivation is eventual ABI considerations for ARCv3 but even without it this change us worthwhile as diffstat reduces 100 net lines r25 is a callee saved register, normally not saved by entry code in pt_regs. However because of its usage in CONFIG_ARC_CURR_IN_REG it needs to be. This in turn requires a whole bunch of special casing when we need to access r25. Then there is distinction between user mode r25 vs. kernel mode r25 - hence distinct SAVE_CALLEE_SAVED_{USER,KERNEL} Instead use gp which is a scratch register and thus saved already in entry code. This cleans things up significantly and much nocer on eyes: - SAVE_CALLEE_SAVED_{USER,KERNEL} are now exactly same - no special user_r25 slot in pt_reggs Note that typical global asm registers are callee-saved (r25), but gp is not callee-saved thus needs additional -ffixed-<reg> toggle Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2023-08-17ARC: boot log: eliminate struct cpuinfo_arc #4: boot log per ISAVineet Gupta5-343/+268
- boot log now clearly per ISA - global struct cpuinfo_arc[] elimiated - local struct struct arcinfo kept for passing info between functions Tested-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202308162101.Ve5jBg80-lkp@intel.com Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2023-08-17ARC: boot log: eliminate struct cpuinfo_arc #3: don't exportVineet Gupta2-4/+0
Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2023-08-17ARC: boot log: eliminate struct cpuinfo_arc #2: cacheVineet Gupta4-115/+97
Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2023-08-17ARC: boot log: eliminate struct cpuinfo_arc #1: mmVineet Gupta4-67/+58
This is first step in eliminating struct cpuinfo_arc[NR_CPUS] Back when we had just ARCompact ISA, the idea was to read/bit-fiddle the BCRs once and and cache decoded information in a global struct ready to use. With ARCv2 it was modified to contained abstract / ISA agnostic information. However with ARCv3 there 's too much disparity to abstract in common structures. So drop the entire decode once and store paradigm. Afterall there's only 2 users of this machinery anyways: boot printing and cat /proc/cpuinfo. None is performance critical to warrant locking away resident memory per cpu. This patch is first step in that direction - decouples struct cpuinfo_arc_mmu from global struct cpuinfo_arc - mmu code still has a trimmed down static version of struct cpuinfo_arc_mmu to cache information needed in performance critical code such as tlb flush routines - folds read_decode_mmu_bcr() into arc_mmu_mumbojumbo() - setup_processor() directly calls arc_mmu_init() and not via arc_cpu_init() Tested-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202308151213.qKZPMiyz-lkp@intel.com/ Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2023-08-17ARCv2: memset: don't prefetch for len == 0 which happens a alotVineet Gupta1-1/+2
This avoids potential "bleeding" when size == 0 as cache line would be dirtied (and possibly fetched from other cores) and due to the same reaons more optimal too. Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2023-08-17ARC: uaccess: elide unaliged handling if hardware supportsVineet Gupta1-4/+6
Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2023-08-17ARC: uaccess: use optimized generic __strnlen_user/__strncpy_from_userVineet Gupta1-0/+2
The existing ARC variants have 2 issues - Use ZOL which may not be present in forthcoming architecture - Byte loop based vs. generic version which is word loop based Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2023-08-17ARC: uaccess: remove arc specific out-of-line handles for -OsVineet Gupta2-20/+2
Everything is now out-of-line in lib/usercopy.c Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2023-08-15ARC: atomics: Add compiler barrier to atomic operations...Pavel Kozlov2-6/+6
... to avoid unwanted gcc optimizations SMP kernels fail to boot with commit 596ff4a09b89 ("cpumask: re-introduce constant-sized cpumask optimizations"). | | percpu: BUG: failure at mm/percpu.c:2981/pcpu_build_alloc_info()! | The write operation performed by the SCOND instruction in the atomic inline asm code is not properly passed to the compiler. The compiler cannot correctly optimize a nested loop that runs through the cpumask in the pcpu_build_alloc_info() function. Fix this by add a compiler barrier (memory clobber in inline asm). Apparently atomic ops used to have memory clobber implicitly via surrounding smp_mb(). However commit b64be6836993c431e ("ARC: atomics: implement relaxed variants") removed the smp_mb() for the relaxed variants, but failed to add the explicit compiler barrier. Link: https://github.com/foss-for-synopsys-dwc-arc-processors/linux/issues/135 Cc: <stable@vger.kernel.org> # v6.3+ Fixes: b64be6836993c43 ("ARC: atomics: implement relaxed variants") Signed-off-by: Pavel Kozlov <pavel.kozlov@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@kernel.org> [vgupta: tweaked the changelog and added Fixes tag]
2023-08-13ARC: -Wmissing-prototype warning fixesVineet Gupta17-10/+48
Anrd reported [1] new compiler warnings due to -Wmissing-protype. These are for non static functions mostly used in asm code hence not exported already. Fix this by adding the prototypes. [1] https://lore.kernel.org/lkml/20230810141947.1236730-1-arnd@kernel.org Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2023-08-10csky: Cast argument to virt_to_pfn() to (void *)Linus Walleij1-1/+1
The virt_to_pfn() function takes a (void *) as argument, fix this up to avoid exploiting the unintended polymorphism of virt_to_pfn. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
2023-07-11mm: Rename arch pte_mkwrite()'s to pte_mkwrite_novma()Rick Edgecombe2-2/+2
The x86 Shadow stack feature includes a new type of memory called shadow stack. This shadow stack memory has some unusual properties, which requires some core mm changes to function properly. One of these unusual properties is that shadow stack memory is writable, but only in limited ways. These limits are applied via a specific PTE bit combination. Nevertheless, the memory is writable, and core mm code will need to apply the writable permissions in the typical paths that call pte_mkwrite(). The goal is to make pte_mkwrite() take a VMA, so that the x86 implementation of it can know whether to create regular writable or shadow stack mappings. But there are a couple of challenges to this. Modifying the signatures of each arch pte_mkwrite() implementation would be error prone because some are generated with macros and would need to be re-implemented. Also, some pte_mkwrite() callers operate on kernel memory without a VMA. So this can be done in a three step process. First pte_mkwrite() can be renamed to pte_mkwrite_novma() in each arch, with a generic pte_mkwrite() added that just calls pte_mkwrite_novma(). Next callers without a VMA can be moved to pte_mkwrite_novma(). And lastly, pte_mkwrite() and all callers can be changed to take/pass a VMA. Start the process by renaming pte_mkwrite() to pte_mkwrite_novma() and adding the pte_mkwrite() wrapper in linux/pgtable.h. Apply the same pattern for pmd_mkwrite(). Since not all archs have a pmd_mkwrite_novma(), create a new arch config HAS_HUGE_PAGE that can be used to tell if pmd_mkwrite() should be defined. Otherwise in the !HAS_HUGE_PAGE cases the compiler would not be able to find pmd_mkwrite_novma(). No functional change. Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/lkml/CAHk-=wiZjSu7c9sFYZb3q04108stgHff2wfbokGCCgW7riz+8Q@mail.gmail.com/ Link: https://lore.kernel.org/all/20230613001108.3040476-2-rick.p.edgecombe%40intel.com
2023-07-06Merge tag 'asm-generic-6.5' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic updates from Arnd Bergmann: "These are cleanups for architecture specific header files: - the comments in include/linux/syscalls.h have gone out of sync and are really pointless, so these get removed - The asm/bitsperlong.h header no longer needs to be architecture specific on modern compilers, so use a generic version for newer architectures that use new enough userspace compilers - A cleanup for virt_to_pfn/virt_to_bus to have proper type checking, forcing the use of pointers" * tag 'asm-generic-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: syscalls: Remove file path comments from headers tools arch: Remove uapi bitsperlong.h of hexagon and microblaze asm-generic: Unify uapi bitsperlong.h for arm64, riscv and loongarch m68k/mm: Make pfn accessors static inlines arm64: memory: Make virt_to_pfn() a static inline ARM: mm: Make virt_to_pfn() a static inline asm-generic/page.h: Make pfn accessors static inlines xen/netback: Pass (void *) to virt_to_page() netfs: Pass a pointer to virt_to_page() cifs: Pass a pointer to virt_to_page() in cifsglob cifs: Pass a pointer to virt_to_page() riscv: mm: init: Pass a pointer to virt_to_page() ARC: init: Pass a pointer to virt_to_pfn() in init m68k: Pass a pointer to virt_to_pfn() virt_to_page() fs/proc/kcore.c: Pass a pointer to virt_addr_valid()
2023-07-01Merge tag 'kbuild-v6.5' of ↵Linus Torvalds1-4/+4
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Remove the deprecated rule to build *.dtbo from *.dts - Refactor section mismatch detection in modpost - Fix bogus ARM section mismatch detections - Fix error of 'make gtags' with O= option - Add Clang's target triple to KBUILD_CPPFLAGS to fix a build error with the latest LLVM version - Rebuild the built-in initrd when KBUILD_BUILD_TIMESTAMP is changed - Ignore more compiler-generated symbols for kallsyms - Fix 'make local*config' to handle the ${CONFIG_FOO} form in Makefiles - Enable more kernel-doc warnings with W=2 - Refactor <linux/export.h> by generating KSYMTAB data by modpost - Deprecate <asm/export.h> and <asm-generic/export.h> - Remove the EXPORT_DATA_SYMBOL macro - Move the check for static EXPORT_SYMBOL back to modpost, which makes the build faster - Re-implement CONFIG_TRIM_UNUSED_KSYMS with one-pass algorithm - Warn missing MODULE_DESCRIPTION when building modules with W=1 - Make 'make clean' robust against too long argument error - Exclude more objects from GCOV to fix CFI failures with GCOV - Allow 'make modules_install' to install modules.builtin and modules.builtin.modinfo even when CONFIG_MODULES is disabled - Include modules.builtin and modules.builtin.modinfo in the linux-image Debian package even when CONFIG_MODULES is disabled - Revive "Entering directory" logging for the latest Make version * tag 'kbuild-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (72 commits) modpost: define more R_ARM_* for old distributions kbuild: revive "Entering directory" for Make >= 4.4.1 kbuild: set correct abs_srctree and abs_objtree for package builds scripts/mksysmap: Ignore prefixed KCFI symbols kbuild: deb-pkg: remove the CONFIG_MODULES check in buildeb kbuild: builddeb: always make modules_install, to install modules.builtin* modpost: continue even with unknown relocation type modpost: factor out Elf_Sym pointer calculation to section_rel() modpost: factor out inst location calculation to section_rel() kbuild: Disable GCOV for *.mod.o kbuild: Fix CFI failures with GCOV kbuild: make clean rule robust against too long argument error script: modpost: emit a warning when the description is missing kbuild: make modules_install copy modules.builtin(.modinfo) linux/export.h: rename 'sec' argument to 'license' modpost: show offset from symbol for section mismatch warnings modpost: merge two similar section mismatch warnings kbuild: implement CONFIG_TRIM_UNUSED_KSYMS without recursion modpost: use null string instead of NULL pointer for default namespace modpost: squash sym_update_namespace() into sym_add_exported() ...
2023-06-29Merge tag 'slab-for-6.5' of ↵Linus Torvalds5-5/+0
git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab Pull slab updates from Vlastimil Babka: - SLAB deprecation: Following the discussion at LSF/MM 2023 [1] and no objections, the SLAB allocator is deprecated by renaming the config option (to make its users notice) to CONFIG_SLAB_DEPRECATED with updated help text. SLUB should be used instead. Existing defconfigs with CONFIG_SLAB are also updated. - SLAB_NO_MERGE kmem_cache flag (Jesper Dangaard Brouer): There are (very limited) cases where kmem_cache merging is undesirable, and existing ways to prevent it are hacky. Introduce a new flag to do that cleanly and convert the existing hacky users. Btrfs plans to use this for debug kernel builds (that use case is always fine), networking for performance reasons (that should be very rare). - Replace the usage of weak PRNGs (David Keisar Schmidt): In addition to using stronger RNGs for the security related features, the code is a bit cleaner. - Misc code cleanups (SeongJae Parki, Xiongwei Song, Zhen Lei, and zhaoxinchao) Link: https://lwn.net/Articles/932201/ [1] * tag 'slab-for-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: mm/slab_common: use SLAB_NO_MERGE instead of negative refcount mm/slab: break up RCU readers on SLAB_TYPESAFE_BY_RCU example code mm/slab: add a missing semicolon on SLAB_TYPESAFE_BY_RCU example code mm/slab_common: reduce an if statement in create_cache() mm/slab: introduce kmem_cache flag SLAB_NO_MERGE mm/slab: rename CONFIG_SLAB to CONFIG_SLAB_DEPRECATED mm/slab: remove HAVE_HARDENED_USERCOPY_ALLOCATOR mm/slab_common: Replace invocation of weak PRNG mm/slab: Replace invocation of weak PRNG slub: Don't read nr_slabs and total_objects directly slub: Remove slabs_node() function slub: Remove CONFIG_SMP defined check slub: Put objects_show() into CONFIG_SLUB_DEBUG enabled block slub: Correct the error code when slab_kset is NULL mm/slab: correct return values in comment for _kmem_cache_create()
2023-06-29Merge tag 'drm-next-2023-06-29' of git://anongit.freedesktop.org/drm/drmLinus Torvalds1-14/+2
Pull drm updates from Dave Airlie: "There is one set of patches to misc for a i915 gsc/mei proxy driver. Otherwise it's mostly amdgpu/i915/msm, lots of hw enablement and lots of refactoring. core: - replace strlcpy with strscpy - EDID changes to support further conversion to struct drm_edid - Move i915 DSC parameter code to common DRM helpers - Add Colorspace functionality aperture: - ignore framebuffers with non-primary devices fbdev: - use fbdev i/o helpers - add Kconfig options for fb_ops helpers - use new fb io helpers directly in drivers sysfs: - export DRM connector ID scheduler: - Avoid an infinite loop ttm: - store function table in .rodata - Add query for TTM mem limit - Add NUMA awareness to pools - Export ttm_pool_fini() bridge: - fsl-ldb: support i.MX6SX - lt9211, lt9611: remove blanking packets - tc358768: implement input bus formats, devm cleanups - ti-snd65dsi86: implement wait_hpd_asserted - analogix: fix endless probe loop - samsung-dsim: support swapped clock, fix enabling, support var clock - display-connector: Add support for external power supply - imx: Fix module linking - tc358762: Support reset GPIO panel: - nt36523: Support Lenovo J606F - st7703: Support Anbernic RG353V-V2 - InnoLux G070ACE-L01 support - boe-tv101wum-nl6: Improve initialization - sharp-ls043t1le001: Mode fixes - simple: BOE EV121WXM-N10-1850, S6D7AA0 - Ampire AM-800480L1TMQW-T00H - Rocktech RK043FN48H - Starry himax83102-j02 - Starry ili9882t amdgpu: - add new ctx query flag to handle reset better - add new query/set shadow buffer for rdna3 - DCN 3.2/3.1.x/3.0.x updates - Enable DC_FP on loongarch - PCIe fix for RDNA2 - improve DC FAMS/SubVP support for better power management - partition support for lots of engines - Take NUMA into account when allocating memory - Add new DRM_AMDGPU_WERROR config parameter to help with CI - Initial SMU13 overdrive support - Add support for new colorspace KMS API - W=1 fixes amdkfd: - Query TTM mem limit rather than hardcoding it - GC 9.4.3 partition support - Handle NUMA for partitions - Add debugger interface for enabling gdb - Add KFD event age tracking radeon: - Fix possible UAF i915: - new getparam for PXP support - GSC/MEI proxy driver - Meteorlake display enablement - avoid clearing preallocated framebuffers with TTM - implement framebuffer mmap support - Disable sampler indirect state in bindless heap - Enable fdinfo for GuC backends - GuC loading and firmware table handling fixes - Various refactors for multi-tile enablement - Define MOCS and PAT tables for MTL - GSC/MEI support for Meteorlake - PMU multi-tile support - Large driver kernel doc cleanup - Allow VRR toggling and arbitrary refresh rates - Support async flips on linear buffers on display ver 12+ - Expose CRTC CTM property on ILK/SNB/VLV - New debugfs for display clock frequencies - Hotplug refactoring - Display refactoring - I915_GEM_CREATE_EXT_SET_PAT for Mesa on Meteorlake - Use large rings for compute contexts - HuC loading for MTL - Allow user to set cache at BO creation - MTL powermanagement enhancements - Switch to dedicated workqueues to stop using flush_scheduled_work() - Move display runtime init under display/ - Remove 10bit gamma on desktop gen3 parts, they don't support it habanalabs: - uapi: return 0 for user queries if there was a h/w or f/w error - Add pci health check when we lose connection with the firmware. This can be used to distinguish between pci link down and firmware getting stuck. - Add more info to the error print when TPC interrupt occur. - Firmware fixes msm: - Adreno A660 bindings - SM8350 MDSS bindings fix - Added support for DPU on sm6350 and sm6375 platforms - Implemented tearcheck support to support vsync on SM150 and newer platforms - Enabled missing features (DSPP, DSC, split display) on sc8180x, sc8280xp, sm8450 - Added support for DSI and 28nm DSI PHY on MSM8226 platform - Added support for DSI on sm6350 and sm6375 platforms - Added support for display controller on MSM8226 platform - A690 GPU support - Move cmdstream dumping out of fence signaling path - a610 support - Support for a6xx devices without GMU nouveau: - NULL ptr before deref fixes armada: - implement fbdev emulation as client sun4i: - fix mipi-dsi dotclock - release clocks vc4: - rgb range toggle property - BT601 / BT2020 HDMI support vkms: - convert to drmm helpers - add reflection and rotation support - fix rgb565 conversion gma500: - fix iomem access shmobile: - support renesas soc platform - enable fbdev mxsfb: - Add support for i.MX93 LCDIF stm: - dsi: Use devm_ helper - ltdc: Fix potential invalid pointer deref renesas: - Group drivers in renesas subdirectory to prepare for new platform - Drop deprecated R-Car H3 ES1.x support meson: - Add support for MIPI DSI displays virtio: - add sync object support mediatek: - Add display binding document for MT6795" * tag 'drm-next-2023-06-29' of git://anongit.freedesktop.org/drm/drm: (1791 commits) drm/i915: Fix a NULL vs IS_ERR() bug drm/i915: make i915_drm_client_fdinfo() reference conditional again drm/i915/huc: Fix missing error code in intel_huc_init() drm/i915/gsc: take a wakeref for the proxy-init-completion check drm/msm/a6xx: Add A610 speedbin support drm/msm/a6xx: Add A619_holi speedbin support drm/msm/a6xx: Use adreno_is_aXYZ macros in speedbin matching drm/msm/a6xx: Use "else if" in GPU speedbin rev matching drm/msm/a6xx: Fix some A619 tunables drm/msm/a6xx: Add A610 support drm/msm/a6xx: Add support for A619_holi drm/msm/adreno: Disable has_cached_coherent in GMU wrapper configurations drm/msm/a6xx: Introduce GMU wrapper support drm/msm/a6xx: Move CX GMU power counter enablement to hw_init drm/msm/a6xx: Extend and explain UBWC config drm/msm/a6xx: Remove both GBIF and RBBM GBIF halt on hw init drm/msm/a6xx: Add a helper for software-resetting the GPU drm/msm/a6xx: Improve a6xx_bus_clear_pending_transactions() drm/msm/a6xx: Move a6xx_bus_clear_pending_transactions to a6xx_gpu drm/msm/a6xx: Move force keepalive vote removal to a6xx_gmu_force_off() ...
2023-06-28Merge branch 'expand-stack'Linus Torvalds2-8/+4
This modifies our user mode stack expansion code to always take the mmap_lock for writing before modifying the VM layout. It's actually something we always technically should have done, but because we didn't strictly need it, we were being lazy ("opportunistic" sounds so much better, doesn't it?) about things, and had this hack in place where we would extend the stack vma in-place without doing the proper locking. And it worked fine. We just needed to change vm_start (or, in the case of grow-up stacks, vm_end) and together with some special ad-hoc locking using the anon_vma lock and the mm->page_table_lock, it all was fairly straightforward. That is, it was all fine until Ruihan Li pointed out that now that the vma layout uses the maple tree code, we *really* don't just change vm_start and vm_end any more, and the locking really is broken. Oops. It's not actually all _that_ horrible to fix this once and for all, and do proper locking, but it's a bit painful. We have basically three different cases of stack expansion, and they all work just a bit differently: - the common and obvious case is the page fault handling. It's actually fairly simple and straightforward, except for the fact that we have something like 24 different versions of it, and you end up in a maze of twisty little passages, all alike. - the simplest case is the execve() code that creates a new stack. There are no real locking concerns because it's all in a private new VM that hasn't been exposed to anybody, but lockdep still can end up unhappy if you get it wrong. - and finally, we have GUP and page pinning, which shouldn't really be expanding the stack in the first place, but in addition to execve() we also use it for ptrace(). And debuggers do want to possibly access memory under the stack pointer and thus need to be able to expand the stack as a special case. None of these cases are exactly complicated, but the page fault case in particular is just repeated slightly differently many many times. And ia64 in particular has a fairly complicated situation where you can have both a regular grow-down stack _and_ a special grow-up stack for the register backing store. So to make this slightly more manageable, the bulk of this series is to first create a helper function for the most common page fault case, and convert all the straightforward architectures to it. Thus the new 'lock_mm_and_find_vma()' helper function, which ends up being used by x86, arm, powerpc, mips, riscv, alpha, arc, csky, hexagon, loongarch, nios2, sh, sparc32, and xtensa. So we not only convert more than half the architectures, we now have more shared code and avoid some of those twisty little passages. And largely due to this common helper function, the full diffstat of this series ends up deleting more lines than it adds. That still leaves eight architectures (ia64, m68k, microblaze, openrisc, parisc, s390, sparc64 and um) that end up doing 'expand_stack()' manually because they are doing something slightly different from the normal pattern. Along with the couple of special cases in execve() and GUP. So there's a couple of patches that first create 'locked' helper versions of the stack expansion functions, so that there's a obvious path forward in the conversion. The execve() case is then actually pretty simple, and is a nice cleanup from our old "grow-up stackls are special, because at execve time even they grow down". The #ifdef CONFIG_STACK_GROWSUP in that code just goes away, because it's just more straightforward to write out the stack expansion there manually, instead od having get_user_pages_remote() do it for us in some situations but not others and have to worry about locking rules for GUP. And the final step is then to just convert the remaining odd cases to a new world order where 'expand_stack()' is called with the mmap_lock held for reading, but where it might drop it and upgrade it to a write, only to return with it held for reading (in the success case) or with it completely dropped (in the failure case). In the process, we remove all the stack expansion from GUP (where dropping the lock wouldn't be ok without special rules anyway), and add it in manually to __access_remote_vm() for ptrace(). Thanks to Adrian Glaubitz and Frank Scheiner who tested the ia64 cases. Everything else here felt pretty straightforward, but the ia64 rules for stack expansion are really quite odd and very different from everything else. Also thanks to Vegard Nossum who caught me getting one of those odd conditions entirely the wrong way around. Anyway, I think I want to actually move all the stack expansion code to a whole new file of its own, rather than have it split up between mm/mmap.c and mm/memory.c, but since this will have to be backported to the initial maple tree vma introduction anyway, I tried to keep the patches _fairly_ minimal. Also, while I don't think it's valid to expand the stack from GUP, the final patch in here is a "warn if some crazy GUP user wants to try to expand the stack" patch. That one will be reverted before the final release, but it's left to catch any odd cases during the merge window and release candidates. Reported-by: Ruihan Li <lrh2000@pku.edu.cn> * branch 'expand-stack': gup: add warning if some caller would seem to want stack expansion mm: always expand the stack with the mmap write lock held execve: expand new process stack manually ahead of time mm: make find_extend_vma() fail if write lock not held powerpc/mm: convert coprocessor fault to lock_mm_and_find_vma() mm/fault: convert remaining simple cases to lock_mm_and_find_vma() arm/mm: Convert to using lock_mm_and_find_vma() riscv/mm: Convert to using lock_mm_and_find_vma() mips/mm: Convert to using lock_mm_and_find_vma() powerpc/mm: Convert to using lock_mm_and_find_vma() arm64/mm: Convert to using lock_mm_and_find_vma() mm: make the page fault mmap locking killable mm: introduce new 'lock_mm_and_find_vma()' page fault helper
2023-06-24mm/fault: convert remaining simple cases to lock_mm_and_find_vma()Linus Torvalds2-8/+4
This does the simple pattern conversion of alpha, arc, csky, hexagon, loongarch, nios2, sh, sparc32, and xtensa to the lock_mm_and_find_vma() helper. They all have the regular fault handling pattern without odd special cases. The remaining architectures all have something that keeps us from a straightforward conversion: ia64 and parisc have stacks that can grow both up as well as down (and ia64 has special address region checks). And m68k, microblaze, openrisc, sparc64, and um end up having extra rules about only expanding the stack down a limited amount below the user space stack pointer. That is something that x86 used to do too (long long ago), and it probably could just be skipped, but it still makes the conversion less than trivial. Note that this conversion was done manually and with the exception of alpha without any build testing, because I have a fairly limited cross- building environment. The cases are all simple, and I went through the changes several times, but... Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-06-15ARC: define ASM_NL and __ALIGN(_STR) outside #ifdef __ASSEMBLY__ guardMasahiro Yamada1-4/+4
ASM_NL is useful not only in *.S files but also in .c files for using inline assembler in C code. On ARC, however, ASM_NL is evaluated inconsistently. It is expanded to a backquote (`) in *.S files, but a semicolon (;) in *.c files because arch/arc/include/asm/linkage.h defines it inside #ifdef __ASSEMBLY__, so the definition for C code falls back to the default value defined in include/linux/linkage.h. If ASM_NL is used in inline assembler in .c files, it will result in wrong assembly code because a semicolon is not an instruction separator, but the start of a comment for ARC. Move ASM_NL (also __ALIGN and __ALIGN_STR) out of the #ifdef. Fixes: 9df62f054406 ("arch: use ASM_NL instead of ';' for assembler new line character in the macro") Fixes: 8d92e992a785 ("ARC: define __ALIGN_STR and __ALIGN symbols for ARC") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2023-06-05locking/atomic: treewide: delete arch_atomic_*() kerneldocMark Rutland1-17/+0
Currently several architectures have kerneldoc comments for arch_atomic_*(), which is unhelpful as these live in a shared namespace where they clash, and the arch_atomic_*() ops are now an implementation detail of the raw_atomic_*() ops, which no-one should use those directly. Delete the kerneldoc comments for arch_atomic_*(), along with pseudo-kerneldoc comments which are in the correct style but are missing the leading '/**' necessary to be true kerneldoc comments. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-28-mark.rutland@arm.com
2023-06-05locking/atomic: arc: add preprocessor symbolsMark Rutland1-0/+9
Some atomics can be implemented in several different ways, e.g. FULL/ACQUIRE/RELEASE ordered atomics can be implemented in terms of RELAXED atomics, and ACQUIRE/RELEASE/RELAXED can be implemented in terms of FULL ordered atomics. Other atomics are optional, and don't exist in some configurations (e.g. not all architectures implement the 128-bit cmpxchg ops). Subsequent patches will require that architectures define a preprocessor symbol for any atomic (or ordering variant) which is optional. This will make the fallback ifdeffery more robust, and simplify future changes. Add the required definitions to arch/arc. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-6-mark.rutland@arm.com
2023-06-05locking/atomic: make atomic*_{cmp,}xchg optionalMark Rutland2-24/+2
Most architectures define the atomic/atomic64 xchg and cmpxchg operations in terms of arch_xchg and arch_cmpxchg respectfully. Add fallbacks for these cases and remove the trivial cases from arch code. On some architectures the existing definitions are kept as these are used to build other arch_atomic*() operations. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-5-mark.rutland@arm.com
2023-05-29ARC: init: Pass a pointer to virt_to_pfn() in initLinus Walleij1-1/+1
Functions that work on a pointer to virtual memory such as virt_to_pfn() and users of that function such as virt_to_page() are supposed to pass a pointer to virtual memory, ideally a (void *) or other pointer. However since many architectures implement virt_to_pfn() as a macro, this function becomes polymorphic and accepts both a (unsigned long) and a (void *). Fix up the offending call in arch/arc with an explicit cast. Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2023-05-26mm/slab: rename CONFIG_SLAB to CONFIG_SLAB_DEPRECATEDVlastimil Babka5-5/+0
As discussed at LSF/MM [1] [2] and with no objections raised there, deprecate the SLAB allocator. Rename the user-visible option so that users with CONFIG_SLAB=y get a new prompt with explanation during make oldconfig, while make olddefconfig will just switch to SLUB. In all defconfigs with CONFIG_SLAB=y remove the line so those also switch to SLUB. Regressions due to the switch should be reported to linux-mm and slab maintainers. [1] https://lore.kernel.org/all/4b9fc9c6-b48c-198f-5f80-811a44737e5f@suse.cz/ [2] https://lwn.net/Articles/932201/ Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Acked-by: David Rientjes <rientjes@google.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k Acked-by: Helge Deller <deller@gmx.de> # parisc
2023-05-09Merge drm/drm-next into drm-misc-nextMaxime Ripard5-15/+12
Start the 6.5 release cycle. Signed-off-by: Maxime Ripard <maxime@cerno.tech>
2023-05-05Merge tag 'locking-core-2023-05-05' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: - Introduce local{,64}_try_cmpxchg() - a slightly more optimal primitive, which will be used in perf events ring-buffer code - Simplify/modify rwsems on PREEMPT_RT, to address writer starvation - Misc cleanups/fixes * tag 'locking-core-2023-05-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/atomic: Correct (cmp)xchg() instrumentation locking/x86: Define arch_try_cmpxchg_local() locking/arch: Wire up local_try_cmpxchg() locking/generic: Wire up local{,64}_try_cmpxchg() locking/atomic: Add generic try_cmpxchg{,64}_local() support locking/rwbase: Mitigate indefinite writer starvation locking/arch: Rename all internal __xchg() names to __arch_xchg()
2023-04-29locking/arch: Rename all internal __xchg() names to __arch_xchg()Andrzej Hajda1-2/+2
Decrease the probability of this internal facility to be used by driver code. Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k] Acked-by: Palmer Dabbelt <palmer@rivosinc.com> [riscv] Link: https://lore.kernel.org/r/20230118154450.73842-1-andrzej.hajda@intel.com Cc: Linus Torvalds <torvalds@linux-foundation.org>
2023-04-28Merge tag 'smp-core-2023-04-27' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull SMP cross-CPU function-call updates from Ingo Molnar: - Remove diagnostics and adjust config for CSD lock diagnostics - Add a generic IPI-sending tracepoint, as currently there's no easy way to instrument IPI origins: it's arch dependent and for some major architectures it's not even consistently available. * tag 'smp-core-2023-04-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: trace,smp: Trace all smp_function_call*() invocations trace: Add trace_ipi_send_cpu() sched, smp: Trace smp callback causing an IPI smp: reword smp call IPI comment treewide: Trace IPIs sent via smp_send_reschedule() irq_work: Trace self-IPIs sent via arch_irq_work_raise() smp: Trace IPIs sent via arch_send_call_function_ipi_mask() sched, smp: Trace IPIs sent via send_call_function_single_ipi() trace: Add trace_ipi_send_cpumask() kernel/smp: Make csdlock_debug= resettable locking/csd_lock: Remove per-CPU data indirection from CSD lock debugging locking/csd_lock: Remove added data from CSD lock debugging locking/csd_lock: Add Kconfig option for csd_debug default
2023-04-27Merge tag 'mm-stable-2023-04-27-15-30' of ↵Linus Torvalds2-7/+2
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of switching from a user process to a kernel thread. - More folio conversions from Kefeng Wang, Zhang Peng and Pankaj Raghav. - zsmalloc performance improvements from Sergey Senozhatsky. - Yue Zhao has found and fixed some data race issues around the alteration of memcg userspace tunables. - VFS rationalizations from Christoph Hellwig: - removal of most of the callers of write_one_page() - make __filemap_get_folio()'s return value more useful - Luis Chamberlain has changed tmpfs so it no longer requires swap backing. Use `mount -o noswap'. - Qi Zheng has made the slab shrinkers operate locklessly, providing some scalability benefits. - Keith Busch has improved dmapool's performance, making part of its operations O(1) rather than O(n). - Peter Xu adds the UFFD_FEATURE_WP_UNPOPULATED feature to userfaultd, permitting userspace to wr-protect anon memory unpopulated ptes. - Kirill Shutemov has changed MAX_ORDER's meaning to be inclusive rather than exclusive, and has fixed a bunch of errors which were caused by its unintuitive meaning. - Axel Rasmussen give userfaultfd the UFFDIO_CONTINUE_MODE_WP feature, which causes minor faults to install a write-protected pte. - Vlastimil Babka has done some maintenance work on vma_merge(): cleanups to the kernel code and improvements to our userspace test harness. - Cleanups to do_fault_around() by Lorenzo Stoakes. - Mike Rapoport has moved a lot of initialization code out of various mm/ files and into mm/mm_init.c. - Lorenzo Stoakes removd vmf_insert_mixed_prot(), which was added for DRM, but DRM doesn't use it any more. - Lorenzo has also coverted read_kcore() and vread() to use iterators and has thereby removed the use of bounce buffers in some cases. - Lorenzo has also contributed further cleanups of vma_merge(). - Chaitanya Prakash provides some fixes to the mmap selftesting code. - Matthew Wilcox changes xfs and afs so they no longer take sleeping locks in ->map_page(), a step towards RCUification of pagefaults. - Suren Baghdasaryan has improved mmap_lock scalability by switching to per-VMA locking. - Frederic Weisbecker has reworked the percpu cache draining so that it no longer causes latency glitches on cpu isolated workloads. - Mike Rapoport cleans up and corrects the ARCH_FORCE_MAX_ORDER Kconfig logic. - Liu Shixin has changed zswap's initialization so we no longer waste a chunk of memory if zswap is not being used. - Yosry Ahmed has improved the performance of memcg statistics flushing. - David Stevens has fixed several issues involving khugepaged, userfaultfd and shmem. - Christoph Hellwig has provided some cleanup work to zram's IO-related code paths. - David Hildenbrand has fixed up some issues in the selftest code's testing of our pte state changing. - Pankaj Raghav has made page_endio() unneeded and has removed it. - Peter Xu contributed some rationalizations of the userfaultfd selftests. - Yosry Ahmed has fixed an issue around memcg's page recalim accounting. - Chaitanya Prakash has fixed some arm-related issues in the selftests/mm code. - Longlong Xia has improved the way in which KSM handles hwpoisoned pages. - Peter Xu fixes a few issues with uffd-wp at fork() time. - Stefan Roesch has changed KSM so that it may now be used on a per-process and per-cgroup basis. * tag 'mm-stable-2023-04-27-15-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits) mm,unmap: avoid flushing TLB in batch if PTE is inaccessible shmem: restrict noswap option to initial user namespace mm/khugepaged: fix conflicting mods to collapse_file() sparse: remove unnecessary 0 values from rc mm: move 'mmap_min_addr' logic from callers into vm_unmapped_area() hugetlb: pte_alloc_huge() to replace huge pte_alloc_map() maple_tree: fix allocation in mas_sparse_area() mm: do not increment pgfault stats when page fault handler retries zsmalloc: allow only one active pool compaction context selftests/mm: add new selftests for KSM mm: add new KSM process and sysfs knobs mm: add new api to enable ksm per process mm: shrinkers: fix debugfs file permissions mm: don't check VMA write permissions if the PTE/PMD indicates write permissions migrate_pages_batch: fix statistics for longterm pin retry userfaultfd: use helper function range_in_vma() lib/show_mem.c: use for_each_populated_zone() simplify code mm: correct arg in reclaim_pages()/reclaim_clean_pages_from_list() fs/buffer: convert create_page_buffers to folio_create_buffers fs/buffer: add folio_create_empty_buffers helper ...
2023-04-20arch/arc: Implement <asm/fb.h> with generic helpersThomas Zimmermann1-14/+2
Replace the architecture's fbdev helpers with the generic ones from <asm-generic/fb.h>. On arc, pgprot_writecombine() and pgprot_noncached() are the same; hence no functional changes. v3: * use default implementation for fb_pgprotect() (Arnd) Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Cc: Vineet Gupta <vgupta@kernel.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Helge Deller <deller@gmx.de> Link: https://patchwork.freedesktop.org/patch/msgid/20230417125651.25126-3-tzimmermann@suse.de
2023-04-18mm: make arch_has_descending_max_zone_pfns() staticArnd Bergmann1-5/+0
clang produces a build failure on x86 for some randconfig builds after a change that moves around code to mm/mm_init.c: Cannot find symbol for section 2: .text. mm/mm_init.o: failed I have not been able to figure out why this happens, but the __weak annotation on arch_has_descending_max_zone_pfns() is the trigger here. Removing the weak function in favor of an open-coded Kconfig option check avoids the problem and becomes clearer as well as better to optimize by the compiler. [arnd@arndb.de: fix logic bug] Link: https://lkml.kernel.org/r/20230415081904.969049-1-arnd@kernel.org Link: https://lkml.kernel.org/r/20230414080418.110236-1-arnd@kernel.org Fixes: 9420f89db2dd ("mm: move most of core MM initialization to mm/mm_init.c") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Vlastimil Babka <vbabka@suse.cz> Tested-by: SeongJae Park <sj@kernel.org> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: kernel test robot <oliver.sang@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-05mm, treewide: redefine MAX_ORDER sanelyKirill A. Shutemov1-2/+2
MAX_ORDER currently defined as number of orders page allocator supports: user can ask buddy allocator for page order between 0 and MAX_ORDER-1. This definition is counter-intuitive and lead to number of bugs all over the kernel. Change the definition of MAX_ORDER to be inclusive: the range of orders user can ask from buddy allocator is 0..MAX_ORDER now. [kirill@shutemov.name: fix min() warning] Link: https://lkml.kernel.org/r/20230315153800.32wib3n5rickolvh@box [akpm@linux-foundation.org: fix another min_t warning] [kirill@shutemov.name: fixups per Zi Yan] Link: https://lkml.kernel.org/r/20230316232144.b7ic4cif4kjiabws@box.shutemov.name [akpm@linux-foundation.org: fix underlining in docs] Link: https://lore.kernel.org/oe-kbuild-all/202303191025.VRCTk6mP-lkp@intel.com/ Link: https://lkml.kernel.org/r/20230315113133.11326-11-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-03-24treewide: Trace IPIs sent via smp_send_reschedule()Valentin Schneider1-1/+1
To be able to trace invocations of smp_send_reschedule(), rename the arch-specific definitions of it to arch_smp_send_reschedule() and wrap it into an smp_send_reschedule() that contains a tracepoint. Changes to include the declaration of the tracepoint were driven by the following coccinelle script: @func_use@ @@ smp_send_reschedule(...); @include@ @@ #include <trace/events/ipi.h> @no_include depends on func_use && !include@ @@ #include <...> + + #include <trace/events/ipi.h> [csky bits] [riscv bits] Signed-off-by: Valentin Schneider <vschneid@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Guo Ren <guoren@kernel.org> Acked-by: Palmer Dabbelt <palmer@rivosinc.com> Link: https://lore.kernel.org/r/20230307143558.294354-6-vschneid@redhat.com
2023-03-09module: replace module_layout with module_memorySong Liu1-5/+7
module_layout manages different types of memory (text, data, rodata, etc.) in one allocation, which is problematic for some reasons: 1. It is hard to enable CONFIG_STRICT_MODULE_RWX. 2. It is hard to use huge pages in modules (and not break strict rwx). 3. Many archs uses module_layout for arch-specific data, but it is not obvious how these data are used (are they RO, RX, or RW?) Improve the scenario by replacing 2 (or 3) module_layout per module with up to 7 module_memory per module: MOD_TEXT, MOD_DATA, MOD_RODATA, MOD_RO_AFTER_INIT, MOD_INIT_TEXT, MOD_INIT_DATA, MOD_INIT_RODATA, and allocating them separately. This adds slightly more entries to mod_tree (from up to 3 entries per module, to up to 7 entries per module). However, this at most adds a small constant overhead to __module_address(), which is expected to be fast. Various archs use module_layout for different data. These data are put into different module_memory based on their location in module_layout. IOW, data that used to go with text is allocated with MOD_MEM_TYPE_TEXT; data that used to go with data is allocated with MOD_MEM_TYPE_DATA, etc. module_memory simplifies quite some of the module code. For example, ARCH_WANTS_MODULES_DATA_IN_VMALLOC is a lot cleaner, as it just uses a different allocator for the data. kernel/module/strict_rwx.c is also much cleaner with module_memory. Signed-off-by: Song Liu <song@kernel.org> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-02-23Merge tag 'mm-stable-2023-02-20-13-37' of ↵Linus Torvalds2-4/+23
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - Daniel Verkamp has contributed a memfd series ("mm/memfd: add F_SEAL_EXEC") which permits the setting of the memfd execute bit at memfd creation time, with the option of sealing the state of the X bit. - Peter Xu adds a patch series ("mm/hugetlb: Make huge_pte_offset() thread-safe for pmd unshare") which addresses a rare race condition related to PMD unsharing. - Several folioification patch serieses from Matthew Wilcox, Vishal Moola, Sidhartha Kumar and Lorenzo Stoakes - Johannes Weiner has a series ("mm: push down lock_page_memcg()") which does perform some memcg maintenance and cleanup work. - SeongJae Park has added DAMOS filtering to DAMON, with the series "mm/damon/core: implement damos filter". These filters provide users with finer-grained control over DAMOS's actions. SeongJae has also done some DAMON cleanup work. - Kairui Song adds a series ("Clean up and fixes for swap"). - Vernon Yang contributed the series "Clean up and refinement for maple tree". - Yu Zhao has contributed the "mm: multi-gen LRU: memcg LRU" series. It adds to MGLRU an LRU of memcgs, to improve the scalability of global reclaim. - David Hildenbrand has added some userfaultfd cleanup work in the series "mm: uffd-wp + change_protection() cleanups". - Christoph Hellwig has removed the generic_writepages() library function in the series "remove generic_writepages". - Baolin Wang has performed some maintenance on the compaction code in his series "Some small improvements for compaction". - Sidhartha Kumar is doing some maintenance work on struct page in his series "Get rid of tail page fields". - David Hildenbrand contributed some cleanup, bugfixing and generalization of pte management and of pte debugging in his series "mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all architectures with swap PTEs". - Mel Gorman and Neil Brown have removed the __GFP_ATOMIC allocation flag in the series "Discard __GFP_ATOMIC". - Sergey Senozhatsky has improved zsmalloc's memory utilization with his series "zsmalloc: make zspage chain size configurable". - Joey Gouly has added prctl() support for prohibiting the creation of writeable+executable mappings. The previous BPF-based approach had shortcomings. See "mm: In-kernel support for memory-deny-write-execute (MDWE)". - Waiman Long did some kmemleak cleanup and bugfixing in the series "mm/kmemleak: Simplify kmemleak_cond_resched() & fix UAF". - T.J. Alumbaugh has contributed some MGLRU cleanup work in his series "mm: multi-gen LRU: improve". - Jiaqi Yan has provided some enhancements to our memory error statistics reporting, mainly by presenting the statistics on a per-node basis. See the series "Introduce per NUMA node memory error statistics". - Mel Gorman has a second and hopefully final shot at fixing a CPU-hog regression in compaction via his series "Fix excessive CPU usage during compaction". - Christoph Hellwig does some vmalloc maintenance work in the series "cleanup vfree and vunmap". - Christoph Hellwig has removed block_device_operations.rw_page() in ths series "remove ->rw_page". - We get some maple_tree improvements and cleanups in Liam Howlett's series "VMA tree type safety and remove __vma_adjust()". - Suren Baghdasaryan has done some work on the maintainability of our vm_flags handling in the series "introduce vm_flags modifier functions". - Some pagemap cleanup and generalization work in Mike Rapoport's series "mm, arch: add generic implementation of pfn_valid() for FLATMEM" and "fixups for generic implementation of pfn_valid()" - Baoquan He has done some work to make /proc/vmallocinfo and /proc/kcore better represent the real state of things in his series "mm/vmalloc.c: allow vread() to read out vm_map_ram areas". - Jason Gunthorpe rationalized the GUP system's interface to the rest of the kernel in the series "Simplify the external interface for GUP". - SeongJae Park wishes to migrate people from DAMON's debugfs interface over to its sysfs interface. To support this, we'll temporarily be printing warnings when people use the debugfs interface. See the series "mm/damon: deprecate DAMON debugfs interface". - Andrey Konovalov provided the accurately named "lib/stackdepot: fixes and clean-ups" series. - Huang Ying has provided a dramatic reduction in migration's TLB flush IPI rates with the series "migrate_pages(): batch TLB flushing". - Arnd Bergmann has some objtool fixups in "objtool warning fixes". * tag 'mm-stable-2023-02-20-13-37' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (505 commits) include/linux/migrate.h: remove unneeded externs mm/memory_hotplug: cleanup return value handing in do_migrate_range() mm/uffd: fix comment in handling pte markers mm: change to return bool for isolate_movable_page() mm: hugetlb: change to return bool for isolate_hugetlb() mm: change to return bool for isolate_lru_page() mm: change to return bool for folio_isolate_lru() objtool: add UACCESS exceptions for __tsan_volatile_read/write kmsan: disable ftrace in kmsan core code kasan: mark addr_has_metadata __always_inline mm: memcontrol: rename memcg_kmem_enabled() sh: initialize max_mapnr m68k/nommu: add missing definition of ARCH_PFN_OFFSET mm: percpu: fix incorrect size in pcpu_obj_full_size() maple_tree: reduce stack usage with gcc-9 and earlier mm: page_alloc: call panic() when memoryless node allocation fails mm: multi-gen LRU: avoid futile retries migrate_pages: move THP/hugetlb migration support check to simplify code migrate_pages: batch flushing TLB migrate_pages: share more code between _unmap and _move ...
2023-02-09mm, arch: add generic implementation of pfn_valid() for FLATMEMMike Rapoport (IBM)1-1/+0
Every architecture that supports FLATMEM memory model defines its own version of pfn_valid() that essentially compares a pfn to max_mapnr. Use mips/powerpc version implemented as static inline as a generic implementation of pfn_valid() and drop its per-architecture definitions. [rppt@kernel.org: fix the generic pfn_valid()] Link: https://lkml.kernel.org/r/Y9lg7R1Yd931C+y5@kernel.org Link: https://lkml.kernel.org/r/20230129124235.209895-5-rppt@kernel.org Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Guo Ren <guoren@kernel.org> [csky] Acked-by: Huacai Chen <chenhuacai@loongson.cn> [LoongArch] Acked-by: Stafford Horne <shorne@gmail.com> [OpenRISC] Acked-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Reviewed-by: David Hildenbrand <david@redhat.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> Cc: Brian Cain <bcain@quicinc.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Helge Deller <deller@gmx.de> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Vineet Gupta <vgupta@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-02mm: remove __HAVE_ARCH_PTE_SWP_EXCLUSIVEDavid Hildenbrand1-1/+0
__HAVE_ARCH_PTE_SWP_EXCLUSIVE is now supported by all architectures that support swp PTEs, so let's drop it. Link: https://lkml.kernel.org/r/20230113171026.582290-27-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>