aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2022-01-20delayacct: fix incomplete disable operation when switch enable to disableYang Yang1-0/+18
When a task is created after delayacct is enabled, kernel will do all the delay accountings for that task. The problems is if user disables delayacct by set /proc/sys/kernel/task_delayacct to zero, only blkio delay accounting is disabled. Now disable all the kinds of delay accountings when /proc/sys/kernel/task_delayacct sets to zero. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Yang Yang <[email protected]> Reported-by: Zeal Robot <[email protected]> Cc: Balbir Singh <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Johannes Weiner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20delayacct: support swapin delay accounting for swapping without blkioYang Yang4-41/+43
Currently delayacct accounts swapin delay only for swapping that cause blkio. If we use zram for swapping, tools/accounting/getdelays can't get any SWAP delay. It's useful to get zram swapin delay information, for example to adjust compress algorithm or /proc/sys/vm/swappiness. Reference to PSI, it accounts any kind of swapping by doing its work in swap_readpage(), no matter whether swapping causes blkio. Let delayacct do the similar work. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Yang Yang <[email protected]> Reported-by: Zeal Robot <[email protected]> Cc: Balbir Singh <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Johannes Weiner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20panic: remove oops_idSebastian Andrzej Siewior1-18/+1
The oops id has been added as part of the end of trace marker for the kerneloops.org project. The id is used to automatically identify duplicate submissions of the same report. Identical looking reports with different a id can be considered as the same oops occurred again. The early initialisation of the oops_id can create a warning if the random core is not yet fully initialized. On PREEMPT_RT it is problematic if the id is initialized on demand from non preemptible context. The kernel oops project is not available since 2017. Remove the oops_id and use 0 in the output in case parser rely on it. Link: https://bugs.debian.org/953172 Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Cc: Arjan van de Ven <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20panic: use error_report_end tracepoint on warningsMarco Elver2-3/+7
Introduce the error detector "warning" to the error_report event and use the error_report_end tracepoint at the end of a warning report. This allows in-kernel tests but also userspace to more easily determine if a warning occurred without polling kernel logs. [[email protected]: add comma to enum list, per Andy] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Marco Elver <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Petr Mladek <[email protected]> Cc: Luis Chamberlain <[email protected]> Cc: Wei Liu <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: John Ogness <[email protected]> Cc: Andy Shevchenko <[email protected]> Cc: Alexander Popov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20fs/adfs: remove unneeded variable make code cleanerMinghao Chi1-3/+1
Return value directly instead of taking this in a variable. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Minghao Chi <[email protected]> Reported-by: Zeal Robot <[email protected]> Cc: Christian Brauner <[email protected]> Cc: Jan Kara <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20FAT: use io_schedule_timeout() instead of congestion_wait()NeilBrown1-2/+3
congestion_wait() in this context is just a sleep - block devices do not support congestion signalling any more. The goal for this wait, which was introduced in commit ae78bf9c4f5f ("[PATCH] add -o flush for fat") is to wait for any recently written data to get to storage. We currently have no direct mechanism to do this, so a simple wait that behaves identically to the current congestion_wait() is the best we can do. This is a step towards removing congestion_wait() Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: NeilBrown <[email protected]> Acked-by: OGAWA Hirofumi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20hfsplus: use struct_group_attr() for memcpy() regionKees Cook2-6/+10
In preparation for FORTIFY_SOURCE performing compile-time and run-time field bounds checking for memset(), avoid intentionally writing across neighboring fields. Add struct_group() to mark the "info" region (containing struct DInfo and struct DXInfo structs) in struct hfsplus_cat_folder and struct hfsplus_cat_file that are written into directly, so the compiler can correctly reason about the expected size of the writes. "pahole" shows no size nor member offset changes to struct hfsplus_cat_folder nor struct hfsplus_cat_file. "objdump -d" shows no object code changes. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Kees Cook <[email protected]> Acked-by: Christian Brauner <[email protected]> Cc: Zhen Lei <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20nilfs2: remove redundant pointer sbufsColin Ian King1-2/+2
Pointer sbufs is being assigned a value but it's not being used later on. The pointer is redundant and can be removed. Cleans up scan-build static analysis warning: fs/nilfs2/page.c:203:8: warning: Although the value stored to 'sbufs' is used in the enclosing expression, the value is never actually read from 'sbufs' [deadcode.DeadStores] sbh = sbufs = page_buffers(src); Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Ryusuke Konishi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20fs/binfmt_elf: use PT_LOAD p_align values for static PIEH.J. Lu1-2/+2
Extend commit ce81bb256a22 ("fs/binfmt_elf: use PT_LOAD p_align values for suitable start address") which fixed PIE binaries built with -Wl,-z,max-page-size=0x200000, to cover static PIE binaries. This fixes: https://bugzilla.kernel.org/show_bug.cgi?id=215275 Tested by verifying static PIE binaries with -Wl,-z,max-page-size=0x200000 loading. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: H.J. Lu <[email protected]> Cc: Chris Kennelly <[email protected]> Cc: Al Viro <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Song Liu <[email protected]> Cc: David Rientjes <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Suren Baghdasaryan <[email protected]> Cc: Sandeep Patil <[email protected]> Cc: Fangrui Song <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Kirill A. Shutemov <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20const_structs.checkpatch: add frequently used ops structsRikard Falkeborn1-0/+23
Add commonly used structs (>50 instances) which are always or almost always const. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Rikard Falkeborn <[email protected]> Cc: Joe Perches <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20checkpatch: improve Kconfig help testJoe Perches1-26/+26
The Kconfig help test erroneously counts patch context lines as part of the help text. Fix that and improve the message block output. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Joe Perches <[email protected]> Tested-by: Randy Dunlap <[email protected]> Acked-by: Randy Dunlap <[email protected]> Cc: Andy Whitcroft <[email protected]> Cc: Dwaipayan Ray <[email protected]> Cc: Lukas Bulwahn <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20checkpatch: relax regexp for COMMIT_LOG_LONG_LINEJerome Forissier1-1/+1
One exceptions to the COMMIT_LOG_LONG_LINE rule is a file path followed by ':'. That is typically some sort diagnostic message from a compiler or a build tool, in which case we don't want to wrap the lines but keep the message unmodified. The regular expression used to match this pattern currently doesn't accept absolute paths or + characters. This can result in false positives as in the following (out-of-tree) example: ... /home/jerome/work/optee_repo_qemu/build/../toolchains/aarch32/bin/arm-linux-gnueabihf-ld.bfd: /home/jerome/work/toolchains-gcc10.2/aarch32/bin/../lib/gcc/arm-none-linux-gnueabihf/10.2.1/../../../../arm-none-linux-gnueabihf/lib/libstdc++.a(eh_alloc.o): in function `__cxa_allocate_exception': /tmp/dgboter/bbs/build03--cen7x86_64/buildbot/cen7x86_64--arm-none-linux-gnueabihf/build/src/gcc/libstdc++-v3/libsupc++/eh_alloc.cc:284: undefined reference to `malloc' ... Update the regular expression to match the above paths. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Jerome Forissier <[email protected]> Acked-by: Joe Perches <[email protected]> Cc: Andy Whitcroft <[email protected]> Cc: Dwaipayan Ray <[email protected]> Cc: Lukas Bulwahn <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20lib/test_meminit: destroy cache in kmem_cache_alloc_bulk() testAndrey Konovalov1-0/+1
Make do_kmem_cache_size_bulk() destroy the cache it creates. Link: https://lkml.kernel.org/r/aced20a94bf04159a139f0846e41d38a1537debb.1640018297.git.andreyknvl@google.com Fixes: 03a9349ac0e0 ("lib/test_meminit: add a kmem_cache_alloc_bulk() test") Signed-off-by: Andrey Konovalov <[email protected]> Reviewed-by: Marco Elver <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Andrey Ryabinin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20uuid: remove licence boilerplate text from the headerAndy Shevchenko1-9/+0
Remove licence boilerplate text from the UAPI header. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Andy Shevchenko <[email protected]> Acked-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20uuid: discourage people from using UAPI header in new codeAndy Shevchenko1-0/+1
Discourage people from using UAPI header in new code by adding a note. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Andy Shevchenko <[email protected]> Acked-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20kunit: replace kernel.h with the necessary inclusionsAndy Shevchenko1-1/+1
When kernel.h is used in the headers it adds a lot into dependency hell, especially when there are circular dependencies are involved. Replace kernel.h inclusion with the list of what is really being used. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Andy Shevchenko <[email protected]> Reviewed-by: Brendan Higgins <[email protected]> Tested-by: Brendan Higgins <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20test_hash.c: refactor into kunitIsabella Basso3-143/+81
Use KUnit framework to make tests more easily integrable with CIs. Even though these tests are not yet properly written as unit tests this change should help in debugging. Also remove kernel messages (i.e. through pr_info) as KUnit handles all debugging output and let it handle module init and exit details. Link: https://lkml.kernel.org/r/[email protected] Reviewed-by: David Gow <[email protected]> Reported-by: kernel test robot <[email protected]> Tested-by: David Gow <[email protected]> Co-developed-by: Augusto Durães Camargo <[email protected]> Signed-off-by: Augusto Durães Camargo <[email protected]> Co-developed-by: Enzo Ferreira <[email protected]> Signed-off-by: Enzo Ferreira <[email protected]> Signed-off-by: Isabella Basso <[email protected]> Cc: Brendan Higgins <[email protected]> Cc: Daniel Latypov <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Rodrigo Siqueira <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20lib/Kconfig.debug: properly split hash test kernel entriesIsabella Basso2-4/+13
Split TEST_HASH so that each entry only has one file. Note that there's no stringhash test file, but actually <linux/stringhash.h> tests are performed in lib/test_hash.c. Link: https://lkml.kernel.org/r/[email protected] Reviewed-by: David Gow <[email protected]> Tested-by: David Gow <[email protected]> Signed-off-by: Isabella Basso <[email protected]> Cc: Augusto Durães Camargo <[email protected]> Cc: Brendan Higgins <[email protected]> Cc: Daniel Latypov <[email protected]> Cc: Enzo Ferreira <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: kernel test robot <[email protected]> Cc: Rodrigo Siqueira <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20test_hash.c: split test_hash_initIsabella Basso1-12/+54
Split up test_hash_init so that it calls each test more explicitly insofar it is possible without rewriting the entire file. This aims at improving readability. Split tests performed on string_or as they don't interfere with those performed in hash_or. Also separate pr_info calls about skipped tests as they're not part of the tests themselves, but only warn about (un)defined arch-specific hash functions. Link: https://lkml.kernel.org/r/[email protected] Reviewed-by: David Gow <[email protected]> Tested-by: David Gow <[email protected]> Signed-off-by: Isabella Basso <[email protected]> Cc: Augusto Durães Camargo <[email protected]> Cc: Brendan Higgins <[email protected]> Cc: Daniel Latypov <[email protected]> Cc: Enzo Ferreira <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: kernel test robot <[email protected]> Cc: Rodrigo Siqueira <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20test_hash.c: split test_int_hash into arch-specific functionsIsabella Basso1-29/+62
Split the test_int_hash function to keep its mainloop separate from arch-specific chunks, which are only compiled as needed. This aims at improving readability. Link: https://lkml.kernel.org/r/[email protected] Reviewed-by: David Gow <[email protected]> Tested-by: David Gow <[email protected]> Signed-off-by: Isabella Basso <[email protected]> Cc: Augusto Durães Camargo <[email protected]> Cc: Brendan Higgins <[email protected]> Cc: Daniel Latypov <[email protected]> Cc: Enzo Ferreira <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: kernel test robot <[email protected]> Cc: Rodrigo Siqueira <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20hash.h: remove unused define directiveIsabella Basso4-33/+4
Patch series "test_hash.c: refactor into KUnit", v3. We refactored the lib/test_hash.c file into KUnit as part of the student group LKCAMP [1] introductory hackathon for kernel development. This test was pointed to our group by Daniel Latypov [2], so its full conversion into a pure KUnit test was our goal in this patch series, but we ran into many problems relating to it not being split as unit tests, which complicated matters a bit, as the reasoning behind the original tests is quite cryptic for those unfamiliar with hash implementations. Some interesting developments we'd like to highlight are: - In patch 1/5 we noticed that there was an unused define directive that could be removed. - In patch 4/5 we noticed how stringhash and hash tests are all under the lib/test_hash.c file, which might cause some confusion, and we also broke those kernel config entries up. Overall KUnit developments have been made in the other patches in this series: In patches 2/5, 3/5 and 5/5 we refactored the lib/test_hash.c file so as to make it more compatible with the KUnit style, whilst preserving the original idea of the maintainer who designed it (i.e. George Spelvin), which might be undesirable for unit tests, but we assume it is enough for a first patch. This patch (of 5): Currently, there exist hash_32() and __hash_32() functions, which were introduced in a patch [1] targeting architecture specific optimizations. These functions can be overridden on a per-architecture basis to achieve such optimizations. They must set their corresponding define directive (HAVE_ARCH_HASH_32 and HAVE_ARCH__HASH_32, respectively) so that header files can deal with these overrides properly. As the supported 32-bit architectures that have their own hash function implementation (i.e. m68k, Microblaze, H8/300, pa-risc) have only been making use of the (more general) __hash_32() function (which only lacks a right shift operation when compared to the hash_32() function), remove the define directive corresponding to the arch-specific hash_32() implementation. [1] https://lore.kernel.org/lkml/[email protected]/ [[email protected]: hash_32_generic() becomes hash_32()] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Reviewed-by: David Gow <[email protected]> Tested-by: David Gow <[email protected]> Co-developed-by: Augusto Durães Camargo <[email protected]> Signed-off-by: Augusto Durães Camargo <[email protected]> Co-developed-by: Enzo Ferreira <[email protected]> Signed-off-by: Enzo Ferreira <[email protected]> Signed-off-by: Isabella Basso <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Brendan Higgins <[email protected]> Cc: Daniel Latypov <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Rodrigo Siqueira <[email protected]> Cc: kernel test robot <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20lib/list_debug.c: print more list debugging context in __list_del_entry_valid()Zhen Lei1-4/+4
Currently, the entry->prev and entry->next are considered to be valid as long as they are not LIST_POISON{1|2}. However, the memory may be corrupted. The prev->next is invalid probably because 'prev' is invalid, not because prev->next's content is illegal. Unfortunately, the printk and its subfunctions will modify the registers that hold the 'prev' and 'next', and we don't see this valuable information in the BUG context. So print the contents of 'entry->prev' and 'entry->next'. Here's an example: list_del corruption. prev->next should be c0ecbf74, but was c08410dc kernel BUG at lib/list_debug.c:53! ... ... PC is at __list_del_entry_valid+0x58/0x98 LR is at __list_del_entry_valid+0x58/0x98 psr: 60000093 sp : c0ecbf30 ip : 00000000 fp : 00000001 r10: c08410d0 r9 : 00000001 r8 : c0825e0c r7 : 20000013 r6 : c08410d0 r5 : c0ecbf74 r4 : c0ecbf74 r3 : c0825d08 r2 : 00000000 r1 : df7ce6f4 r0 : 00000044 ... ... Stack: (0xc0ecbf30 to 0xc0ecc000) bf20: c0ecbf74 c0164fd0 c0ecbf70 c0165170 bf40: c0eca000 c0840c00 c0840c00 c0824500 c0825e0c c0189bbc c088f404 60000013 bf60: 60000013 c0e85100 000004ec 00000000 c0ebcdc0 c0ecbf74 c0ecbf74 c0825d08 bf80: c0e807c0 c018965c 00000000 c013f2a0 c0e807c0 c013f154 00000000 00000000 bfa0: 00000000 00000000 00000000 c01001b0 00000000 00000000 00000000 00000000 bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 bfe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000 (__list_del_entry_valid) from (__list_del_entry+0xc/0x20) (__list_del_entry) from (finish_swait+0x60/0x7c) (finish_swait) from (rcu_gp_kthread+0x560/0xa20) (rcu_gp_kthread) from (kthread+0x14c/0x15c) (kthread) from (ret_from_fork+0x14/0x24) At first, I thought prev->next was overwritten. Later, I carefully analyzed the RCU code and the disassembly code. The error occurred when deleting a node from the list rcu_state.gp_wq. The System.map shows that the address of rcu_state is c0840c00. Then I use gdb to obtain the offset of rcu_state.gp_wq.task_list. (gdb) p &((struct rcu_state *)0)->gp_wq.task_list $1 = (struct list_head *) 0x4dc Again: list_del corruption. prev->next should be c0ecbf74, but was c08410dc c08410dc = c0840c00 + 0x4dc = &rcu_state.gp_wq.task_list Because rcu_state.gp_wq has at most one node, so I can guess that "prev = &rcu_state.gp_wq.task_list". But for other scenes, maybe I wasn't so lucky, I cannot figure out the value of 'prev'. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Zhen Lei <[email protected]> Cc: "Paul E . McKenney" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20list: introduce list_is_head() helper and re-use it in list.hAndy Shevchenko1-14/+22
Introduce list_is_head() in the similar (*) way as it's done for list_entry_is_head(). Make use of it in the list.h. *) it's done as inliner and not a macro to be aligned with other list_is_*() APIs; while at it, make all three to have the same style. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Andy Shevchenko <[email protected]> Cc: Heikki Krogerus <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20kstrtox: uninline everythingAlexey Dobriyan1-0/+12
I've made a mistake of looking into lib/kstrtox.o code generation. The only function remotely performance critical is _parse_integer() (via /proc/*/map_files/*), everything else is not. Uninline everything, shrink lib/kstrtox.o by ~20 % ! Space savings on x86_64: add/remove: 0/0 grow/shrink: 0/23 up/down: 0/-1269 (-1269 !!!) Function old new delta kstrtoull 16 13 -3 kstrtouint 59 48 -11 kstrtou8 60 49 -11 kstrtou16 61 50 -11 _kstrtoul 46 35 -11 kstrtoull_from_user 95 83 -12 kstrtoul_from_user 95 83 -12 kstrtoll 93 80 -13 kstrtouint_from_user 124 83 -41 kstrtou8_from_user 125 83 -42 kstrtou16_from_user 126 83 -43 kstrtos8 101 50 -51 kstrtos16 102 51 -51 kstrtoint 100 49 -51 _kstrtol 93 35 -58 kstrtobool_from_user 156 75 -81 kstrtoll_from_user 165 83 -82 kstrtol_from_user 165 83 -82 kstrtoint_from_user 172 83 -89 kstrtos8_from_user 173 83 -90 kstrtos16_from_user 174 83 -91 _parse_integer 136 10 -126 _kstrtoull 308 101 -207 Total: Before=3421236, After=3419967, chg -0.04% Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20get_maintainer: don't remind about no git repo when --nogit is usedRandy Dunlap1-1/+1
When --nogit is used with scripts/get_maintainer.pl, the script spews 4 lines of unnecessary information (noise). Do not print those lines when --nogit is specified. This change removes the printing of these 4 lines: ./scripts/get_maintainer.pl: No supported VCS found. Add --nogit to options? Using a git repository produces better results. Try Linus Torvalds' latest git repository using: git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Randy Dunlap <[email protected]> Cc: Joe Perches <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20kernel/sys.c: only take tasklist_lock for get/setpriority(PRIO_PGRP)Davidlohr Bueso1-8/+8
PRIO_PGRP needs the tasklist_lock mainly to serialize vs setpgid(2), to protect against any concurrent change_pid(PIDTYPE_PGID) that can move the task from one hlist to another while iterating. However, the remaining can only rely only on RCU: PRIO_PROCESS only does the task lookup and never iterates over tasklist and we already have an rcu-aware stable pointer. PRIO_USER is already racy vs setuid(2) so with creds being rcu protected, we can end up seeing stale data. When removing the tasklist_lock there can be a race with (i) fork but this is benign as the child's nice is inherited and the new task is not observable by the user yet either, hence the return semantics do not differ. And (ii) a race with exit, which is a small window and can cause us to miss a task which was removed from the list and it had the highest nice. Similarly change the buggy do_each_thread/while_each_thread combo in PRIO_USER for the rcu-safe for_each_process_thread flavor, which doesn't make use of next_thread/p->thread_group. [[email protected]: coding style fixes] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Davidlohr Bueso <[email protected]> Acked-by: Oleg Nesterov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20kthread: dynamically allocate memory to store kthread's full nameYafang Shao3-2/+34
When I was implementing a new per-cpu kthread cfs_migration, I found the comm of it "cfs_migration/%u" is truncated due to the limitation of TASK_COMM_LEN. For example, the comm of the percpu thread on CPU10~19 all have the same name "cfs_migration/1", which will confuse the user. This issue is not critical, because we can get the corresponding CPU from the task's Cpus_allowed. But for kthreads corresponding to other hardware devices, it is not easy to get the detailed device info from task comm, for example, jbd2/nvme0n1p2- xfs-reclaim/sdf Currently there are so many truncated kthreads: rcu_tasks_kthre rcu_tasks_rude_ rcu_tasks_trace poll_mpt3sas0_s ext4-rsv-conver xfs-reclaim/sd{a, b, c, ...} xfs-blockgc/sd{a, b, c, ...} xfs-inodegc/sd{a, b, c, ...} audit_send_repl ecryptfs-kthrea vfio-irqfd-clea jbd2/nvme0n1p2- ... We can shorten these names to work around this problem, but it may be not applied to all of the truncated kthreads. Take 'jbd2/nvme0n1p2-' for example, it is a nice name, and it is not a good idea to shorten it. One possible way to fix this issue is extending the task comm size, but as task->comm is used in lots of places, that may cause some potential buffer overflows. Another more conservative approach is introducing a new pointer to store kthread's full name if it is truncated, which won't introduce too much overhead as it is in the non-critical path. Finally we make a dicision to use the second approach. See also the discussions in this thread: https://lore.kernel.org/lkml/[email protected]/ After this change, the full name of these truncated kthreads will be displayed via /proc/[pid]/comm: rcu_tasks_kthread rcu_tasks_rude_kthread rcu_tasks_trace_kthread poll_mpt3sas0_statu ext4-rsv-conversion xfs-reclaim/sdf1 xfs-blockgc/sdf1 xfs-inodegc/sdf1 audit_send_reply ecryptfs-kthread vfio-irqfd-cleanup jbd2/nvme0n1p2-8 Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Yafang Shao <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Reviewed-by: Petr Mladek <[email protected]> Suggested-by: Petr Mladek <[email protected]> Suggested-by: Steven Rostedt <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Michal Miroslaw <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Al Viro <[email protected]> Cc: Kees Cook <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20tools/testing/selftests/bpf: replace open-coded 16 with TASK_COMM_LENYafang Shao3-8/+13
As the sched:sched_switch tracepoint args are derived from the kernel, we'd better make it same with the kernel. So the macro TASK_COMM_LEN is converted to type enum, then all the BPF programs can get it through BTF. The BPF program which wants to use TASK_COMM_LEN should include the header vmlinux.h. Regarding the test_stacktrace_map and test_tracepoint, as the type defined in linux/bpf.h are also defined in vmlinux.h, so we don't need to include linux/bpf.h again. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Yafang Shao <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Acked-by: David Hildenbrand <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Michal Miroslaw <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Al Viro <[email protected]> Cc: Kees Cook <[email protected]> Cc: Petr Mladek <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Dennis Dalessandro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20tools/bpf/bpftool/skeleton: replace bpf_probe_read_kernel with ↵Yafang Shao1-2/+2
bpf_probe_read_kernel_str to get task comm bpf_probe_read_kernel_str() will add a nul terminator to the dst, then we don't care about if the dst size is big enough. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Yafang Shao <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Michal Miroslaw <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Al Viro <[email protected]> Cc: Kees Cook <[email protected]> Cc: Petr Mladek <[email protected]> Cc: Dennis Dalessandro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20samples/bpf/test_overhead_kprobe_kern: replace bpf_probe_read_kernel with ↵Yafang Shao3-9/+11
bpf_probe_read_kernel_str to get task comm bpf_probe_read_kernel_str() will add a nul terminator to the dst, then we don't care about if the dst size is big enough. This patch also replaces the hard-coded 16 with TASK_COMM_LEN to make it grepable. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Yafang Shao <[email protected]> Reviewed-by: Kees Cook <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Michal Miroslaw <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Al Viro <[email protected]> Cc: Kees Cook <[email protected]> Cc: Petr Mladek <[email protected]> Cc: Dennis Dalessandro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20fs/binfmt_elf: replace open-coded string copy with get_task_commYafang Shao3-1/+11
It is better to use get_task_comm() instead of the open coded string copy as we do in other places. struct elf_prpsinfo is used to dump the task information in userspace coredump or kernel vmcore. Below is the verification of vmcore, crash> ps PID PPID CPU TASK ST %MEM VSZ RSS COMM 0 0 0 ffffffff9d21a940 RU 0.0 0 0 [swapper/0] > 0 0 1 ffffa09e40f85e80 RU 0.0 0 0 [swapper/1] > 0 0 2 ffffa09e40f81f80 RU 0.0 0 0 [swapper/2] > 0 0 3 ffffa09e40f83f00 RU 0.0 0 0 [swapper/3] > 0 0 4 ffffa09e40f80000 RU 0.0 0 0 [swapper/4] > 0 0 5 ffffa09e40f89f80 RU 0.0 0 0 [swapper/5] 0 0 6 ffffa09e40f8bf00 RU 0.0 0 0 [swapper/6] > 0 0 7 ffffa09e40f88000 RU 0.0 0 0 [swapper/7] > 0 0 8 ffffa09e40f8de80 RU 0.0 0 0 [swapper/8] > 0 0 9 ffffa09e40f95e80 RU 0.0 0 0 [swapper/9] > 0 0 10 ffffa09e40f91f80 RU 0.0 0 0 [swapper/10] > 0 0 11 ffffa09e40f93f00 RU 0.0 0 0 [swapper/11] > 0 0 12 ffffa09e40f90000 RU 0.0 0 0 [swapper/12] > 0 0 13 ffffa09e40f9bf00 RU 0.0 0 0 [swapper/13] > 0 0 14 ffffa09e40f98000 RU 0.0 0 0 [swapper/14] > 0 0 15 ffffa09e40f9de80 RU 0.0 0 0 [swapper/15] It works well as expected. Some comments are added to explain why we use the hard-coded 16. Link: https://lkml.kernel.org/r/[email protected] Suggested-by: Kees Cook <[email protected]> Signed-off-by: Yafang Shao <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Michal Miroslaw <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Al Viro <[email protected]> Cc: Kees Cook <[email protected]> Cc: Petr Mladek <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Dennis Dalessandro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20drivers/infiniband: replace open-coded string copy with get_task_commYafang Shao2-2/+2
We'd better use the helper get_task_comm() rather than the open-coded strlcpy() to get task comm. As the comment above the hard-coded 16, we can replace it with TASK_COMM_LEN. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Yafang Shao <[email protected]> Acked-by: Dennis Dalessandro <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Michal Miroslaw <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Al Viro <[email protected]> Cc: Kees Cook <[email protected]> Cc: Petr Mladek <[email protected]> Cc: Andrii Nakryiko <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20fs/exec: replace strncpy with strscpy_pad in __get_task_commYafang Shao1-1/+2
If the dest buffer size is smaller than sizeof(tsk->comm), the buffer will be without null ternimator, that may cause problem. Using strscpy_pad() instead of strncpy() in __get_task_comm() can make the string always nul ternimated and zero padded. Link: https://lkml.kernel.org/r/[email protected] Suggested-by: Kees Cook <[email protected]> Suggested-by: Steven Rostedt <[email protected]> Signed-off-by: Yafang Shao <[email protected]> Reviewed-by: Kees Cook <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Michal Miroslaw <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Al Viro <[email protected]> Cc: Kees Cook <[email protected]> Cc: Petr Mladek <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Dennis Dalessandro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20fs/exec: replace strlcpy with strscpy_pad in __set_task_commYafang Shao1-1/+1
Patch series "task comm cleanups", v2. This patchset is part of the patchset "extend task comm from 16 to 24"[1]. Now we have different opinion that dynamically allocates memory to store kthread's long name into a separate pointer, so I decide to take the useful cleanups apart from the original patchset and send it separately[2]. These useful cleanups can make the usage around task comm less error-prone. Furthermore, it will be useful if we want to extend task comm in the future. [1]. https://lore.kernel.org/lkml/[email protected]/ [2]. https://lore.kernel.org/lkml/CALOAHbAx55AUo3bm8ZepZSZnw7A08cvKPdPyNTf=E_tPqmw5hw@mail.gmail.com/ This patch (of 7): strlcpy() can trigger out-of-bound reads on the source string[1], we'd better use strscpy() instead. To make it be robust against full tsk->comm copies that got noticed in other places, we should make sure it's zero padded. [1] https://github.com/KSPP/linux/issues/89 Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Yafang Shao <[email protected]> Reviewed-by: Kees Cook <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Michal Miroslaw <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Al Viro <[email protected]> Cc: Kees Cook <[email protected]> Cc: Petr Mladek <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Dennis Dalessandro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20kernel.h: include a note to discourage people from including it in headersAndy Shevchenko1-0/+9
Include a note at the top to discourage people from including it in headers. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Andy Shevchenko <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20include/linux/unaligned: replace kernel.h with the necessary inclusionsAndy Shevchenko3-1/+5
When kernel.h is used in the headers it adds a lot into dependency hell, especially when there are circular dependencies are involved. Replace kernel.h inclusion with the list of what is really being used. The rest of the changes are induced by the above and may not be split. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Andy Shevchenko <[email protected]> Acked-by: Arend van Spriel <[email protected]> [brcmfmac] Acked-by: Kalle Valo <[email protected]> Cc: Arend van Spriel <[email protected]> Cc: Franky Lin <[email protected]> Cc: Hante Meuleman <[email protected]> Cc: Chi-hsien Lin <[email protected]> Cc: Wright Feng <[email protected]> Cc: Chung-hsien Hsu <[email protected]> Cc: Kalle Valo <[email protected]> Cc: David S. Miller <[email protected]> Cc: Jakub Kicinski <[email protected]> Cc: Heikki Krogerus <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20sysctl: remove redundant ret assignmentluo penghao1-1/+0
Subsequent if judgments will assign new values to ret, so the statement here should be deleted The clang_analyzer complains as follows: fs/proc/proc_sysctl.c: Value stored to 'ret' is never read Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: luo penghao <[email protected]> Reported-by: Zeal Robot <[email protected]> Acked-by: Luis Chamberlain <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20sysctl: fix duplicate path separator in printed entriesGeert Uytterhoeven1-4/+4
sysctl_print_dir() always terminates the printed path name with a slash, so printing a slash before the file part causes a duplicate like in sysctl duplicate entry: /kernel//perf_user_access Fix this by dropping the extra slash. Link: https://lkml.kernel.org/r/e3054d605dc56f83971e4b6d2f5fa63a978720ad.1641551872.git.geert+renesas@glider.be Signed-off-by: Geert Uytterhoeven <[email protected]> Acked-by: Christian Brauner <[email protected]> Acked-by: Luis Chamberlain <[email protected]> Cc: Iurii Zaikin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20proc: convert the return type of proc_fd_access_allowed() to be booleanQi Zheng1-2/+2
Convert return type of proc_fd_access_allowed() and the 'allowed' in it to be boolean since the return type of ptrace_may_access() is boolean. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Qi Zheng <[email protected]> Cc: Kees Cook <[email protected]> Cc: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20proc: make the proc_create[_data]() stubs static inlinesHans de Goede2-7/+13
Change the proc_create[_data]() stubs which are used when CONFIG_PROC_FS is not set from #defines to a static inline stubs. This should fix clang -Werror builds failing due to errors like this: drivers/platform/x86/thinkpad_acpi.c:918:30: error: unused variable 'dispatch_proc_ops' [-Werror,-Wunused-const-variable] Fixing this in include/linux/proc_fs.h should ensure that the same issue is also fixed in any other drivers hitting the same -Werror issue. [[email protected]: fix CONFIG_PROC_FS=n] [[email protected]: fix arch/sparc/kernel/led.c] [[email protected]: fix build] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Hans de Goede <[email protected]> Reported-by: kernel test robot <[email protected]> Acked-by: Christian Brauner <[email protected]> Cc: Alexander Viro <[email protected]> Cc: Hans de Goede <[email protected]> Cc: David Howells <[email protected]> Cc: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20proc/vmcore: don't fake reading zeroes on surprise vmcore_cb unregistrationDavid Hildenbrand1-8/+2
In commit cc5f2704c934 ("proc/vmcore: convert oldmem_pfn_is_ram callback to more generic vmcore callbacks"), we added detection of surprise vmcore_cb unregistration after the vmcore was already opened. Once detected, we warn the user and simulate reading zeroes from that point on when accessing the vmcore. The basic reason was that unexpected unregistration, for example, by manually unbinding a driver from a device after opening the vmcore, is not supported and could result in reading oldmem the vmcore_cb would have actually prohibited while registered. However, something like that can similarly be trigger by a user that's really looking for trouble simply by unbinding the relevant driver before opening the vmcore -- or by disallowing loading the driver in the first place. So it's actually of limited help. Currently, unregistration can only be triggered via virtio-mem when manually unbinding the driver from the device inside the VM; there is no way to trigger it from the hypervisor, as hypervisors don't allow for unplugging virtio-mem devices -- ripping out system RAM from a VM without coordination with the guest is usually not a good idea. The important part is that unbinding the driver and unregistering the vmcore_cb while concurrently reading the vmcore won't crash the system, and that is handled by the rwsem. To make the mechanism more future proof, let's remove the "read zero" part, but leave the warning in place. For example, we could have a future driver (like virtio-balloon) that will contact the hypervisor to figure out if we already populated a page for a given PFN. Hotunplugging such a device and consequently unregistering the vmcore_cb could be triggered from the hypervisor without harming the system even while kdump is running. In that case, we don't want to silently end up with a vmcore that contains wrong data, because the user inside the VM might be unaware of the hypervisor action and might easily miss the warning in the log. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Acked-by: Baoquan He <[email protected]> Cc: Dave Young <[email protected]> Cc: Vivek Goyal <[email protected]> Cc: Philipp Rudo <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20mm: percpu: add generic pcpu_populate_pte() functionKefeng Wang6-162/+78
With NEED_PER_CPU_PAGE_FIRST_CHUNK enabled, we need a function to populate pte, this patch adds a generic pcpu populate pte function, pcpu_populate_pte(), which is marked __weak and used on most architectures, but it is overridden on x86, which has its own implementation. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Kefeng Wang <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Dennis Zhou <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Albert Ou <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20mm: percpu: add generic pcpu_fc_alloc/free funcitonKefeng Wang7-228/+54
With the previous patch, we could add a generic pcpu first chunk allocate and free function to cleanup the duplicated definations on each architecture. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Kefeng Wang <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Dennis Zhou <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Albert Ou <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20mm: percpu: add pcpu_fc_cpu_to_node_fn_t typedefKefeng Wang7-25/+62
Add pcpu_fc_cpu_to_node_fn_t and pass it into pcpu_fc_alloc_fn_t, pcpu first chunk allocation will call it to alloc memblock on the corresponding node by it, this is prepare for the next patch. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Kefeng Wang <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Dennis Zhou <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Albert Ou <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-20mm: percpu: generalize percpu related configKefeng Wang8-74/+33
Patch series "mm: percpu: Cleanup percpu first chunk function". When supporting page mapping percpu first chunk allocator on arm64, we found there are lots of duplicated codes in percpu embed/page first chunk allocator. This patchset is aimed to cleanup them and should no function change. The currently supported status about 'embed' and 'page' in Archs shows below, embed: NEED_PER_CPU_PAGE_FIRST_CHUNK page: NEED_PER_CPU_EMBED_FIRST_CHUNK embed page ------------------------ arm64 Y Y mips Y N powerpc Y Y riscv Y N sparc Y Y x86 Y Y ------------------------ There are two interfaces about percpu first chunk allocator, extern int __init pcpu_embed_first_chunk(size_t reserved_size, size_t dyn_size, size_t atom_size, pcpu_fc_cpu_distance_fn_t cpu_distance_fn, - pcpu_fc_alloc_fn_t alloc_fn, - pcpu_fc_free_fn_t free_fn); + pcpu_fc_cpu_to_node_fn_t cpu_to_nd_fn); extern int __init pcpu_page_first_chunk(size_t reserved_size, - pcpu_fc_alloc_fn_t alloc_fn, - pcpu_fc_free_fn_t free_fn, - pcpu_fc_populate_pte_fn_t populate_pte_fn); + pcpu_fc_cpu_to_node_fn_t cpu_to_nd_fn); The pcpu_fc_alloc_fn_t/pcpu_fc_free_fn_t is killed, we provide generic pcpu_fc_alloc() and pcpu_fc_free() function, which are called in the pcpu_embed/page_first_chunk(). 1) For pcpu_embed_first_chunk(), pcpu_fc_cpu_to_node_fn_t is needed to be provided when archs supported NUMA. 2) For pcpu_page_first_chunk(), the pcpu_fc_populate_pte_fn_t is killed too, a generic pcpu_populate_pte() which marked '__weak' is provided, if you need a different function to populate pte on the arch(like x86), please provide its own implementation. [1] https://github.com/kevin78/linux.git percpu-cleanup This patch (of 4): The HAVE_SETUP_PER_CPU_AREA/NEED_PER_CPU_EMBED_FIRST_CHUNK/ NEED_PER_CPU_PAGE_FIRST_CHUNK/USE_PERCPU_NUMA_NODE_ID configs, which have duplicate definitions on platforms that subscribe it. Move them into mm, drop these redundant definitions and instead just select it on applicable platforms. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Kefeng Wang <[email protected]> Acked-by: Catalin Marinas <[email protected]> [arm64] Cc: Will Deacon <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Albert Ou <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Dennis Zhou <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Tejun Heo <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-19Merge branch 'ipv4-avoid-pathological-hash-tables'Jakub Kicinski1-33/+32
Eric Dumazet says: ==================== ipv4: avoid pathological hash tables This series speeds up netns dismantles on hosts having many active netns, by making sure two hash tables used for IPV4 fib contains uniformly spread items. v2: changed second patch to add fib_info_laddrhash_bucket() for consistency (David Ahern suggestion). ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-01-19ipv4: add net_hash_mix() dispersion to fib_info_laddrhash keysEric Dumazet1-14/+15
net/ipv4/fib_semantics.c uses a hash table (fib_info_laddrhash) in which fib_sync_down_addr() can locate fib_info based on IPv4 local address. This hash table is resized based on total number of hashed fib_info, but the hash function is only using the local address. For hosts having many active network namespaces, all fib_info for loopback devices (IPv4 address 127.0.0.1) are hashed into a single bucket, making netns dismantles very slow. Signed-off-by: Eric Dumazet <[email protected]> Reviewed-by: David Ahern <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-01-19ipv4: avoid quadratic behavior in netns dismantleEric Dumazet1-19/+17
net/ipv4/fib_semantics.c uses an hash table of 256 slots, keyed by device ifindexes: fib_info_devhash[DEVINDEX_HASHSIZE] Problem is that with network namespaces, devices tend to use the same ifindex. lo device for instance has a fixed ifindex of one, for all network namespaces. This means that hosts with thousands of netns spend a lot of time looking at some hash buckets with thousands of elements, notably at netns dismantle. Simply add a per netns perturbation (net_hash_mix()) to spread elements more uniformely. Also change fib_devindex_hashfn() to use more entropy. Fixes: aa79e66eee5d ("net: Make ifindex generation per-net namespace") Signed-off-by: Eric Dumazet <[email protected]> Reviewed-by: David Ahern <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-01-19Merge branch 'net-fsl-xgmac_mdio-add-workaround-for-erratum-a-009885'Jakub Kicinski3-7/+32
Tobias Waldekranz says: ==================== net/fsl: xgmac_mdio: Add workaround for erratum A-009885 The individual messages mostly speak for themselves. It is very possible that there are more chips out there that are impacted by this, but I only have access to the errata document for the T1024 family, so I've limited the DT changes to the exact FMan version used in that device. Hopefully someone from NXP can supply a follow-up if need be. The final commit is an unrelated fix that was brought to my attention by sparse. ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-01-19net/fsl: xgmac_mdio: Fix incorrect iounmap when removing moduleTobias Waldekranz1-1/+2
As reported by sparse: In the remove path, the driver would attempt to unmap its own priv pointer - instead of the io memory that it mapped in probe. Fixes: 9f35a7342cff ("net/fsl: introduce Freescale 10G MDIO driver") Signed-off-by: Tobias Waldekranz <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>