aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2017-11-17epoll: avoid calling ep_call_nested() from ep_poll_safewake()Jason Baron1-29/+18
ep_poll_safewake() is used to wakeup potentially nested epoll file descriptors. The function uses ep_call_nested() to prevent entering the same wake up queue more than once, and to prevent excessively deep wakeup paths (deeper than EP_MAX_NESTS). However, this is not necessary since we are already preventing these conditions during EPOLL_CTL_ADD. This saves extra function calls, and avoids taking a global lock during the ep_call_nested() calls. I have, however, left ep_call_nested() for the CONFIG_DEBUG_LOCK_ALLOC case, since ep_call_nested() keeps track of the nesting level, and this is required by the call to spin_lock_irqsave_nested(). It would be nice to remove the ep_call_nested() calls for the CONFIG_DEBUG_LOCK_ALLOC case as well, however its not clear how to simply pass the nesting level through multiple wake_up() levels without more surgery. In any case, I don't think CONFIG_DEBUG_LOCK_ALLOC is generally used for production. This patch, also apparently fixes a workload at Google that Salman Qazi reported by completely removing the poll_safewake_ncalls->lock from wakeup paths. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jason Baron <[email protected]> Acked-by: Davidlohr Bueso <[email protected]> Cc: Alexander Viro <[email protected]> Cc: Salman Qazi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17epoll: account epitem and eppoll_entry to kmemcgShakeel Butt1-2/+2
A userspace application can directly trigger the allocations from eventpoll_epi and eventpoll_pwq slabs. A buggy or malicious application can consume a significant amount of system memory by triggering such allocations. Indeed we have seen in production where a buggy application was leaking the epoll references and causing a burst of eventpoll_epi and eventpoll_pwq slab allocations. This patch opt-in the charging of eventpoll_epi and eventpoll_pwq slabs. There is a per-user limit (~4% of total memory if no highmem) on these caches. I think it is too generous particularly in the scenario where jobs of multiple users are running on the system and the administrator is reducing cost by overcomitting the memory. This is unaccounted kernel memory and will not be considered by the oom-killer. I think by accounting it to kmemcg, for systems with kmem accounting enabled, we can provide better isolation between jobs of different users. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Shakeel Butt <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Alexander Viro <[email protected]> Cc: Vladimir Davydov <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Greg Thelen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17checkpatch: do not check missing blank line before builtin_*_driverMasahiro Yamada1-0/+1
checkpatch.pl does not check missing blank line before module_*_driver. I want it to behave likewise for builtin_*_driver. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Masahiro Yamada <[email protected]> Acked-by: Joe Perches <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17checkpatch: add --strict test for lines ending in [ or (Joe Perches1-0/+6
Lines that end in an open bracket or open parenthesis are generally hard to follow. Lines following those ending with open parenthesis are also rarely aligned to that open parenthesis. Suggest not ending lines with '[' or '(' Link: http://lkml.kernel.org/r/8fd0b2b4a7482064254e37931eb9302a81d5aa2f.1508340786.git.joe@perches.com Signed-off-by: Joe Perches <[email protected]> Suggested-by: Vivien Didelot <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17checkpatch: add TP_printk to list of logging functionsJoe Perches1-0/+1
So the line length check can be bypassed by its callers. Link: http://lkml.kernel.org/r/7de542c08a6e79f2ebe7c1416c9f403c23fdcc09.1508282823.git.joe@perches.com Signed-off-by: Joe Perches <[email protected]> Reported-by: Song Liu <[email protected]> Tested-by: Song Liu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17checkpatch: allow DEFINE_PER_CPU definitions to exceed line lengthJoe Perches1-2/+3
Some of the definitions are very long and can't be split into multiple lines because ctags is limited. Exempt these lines from the line length checks. See commit 25528213fe9f ("tags: Fix DEFINE_PER_CPU expansions") for more details. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Joe Perches <[email protected]> Acked-by: Mark Rutland <[email protected]> Cc: Daniel Lezcano <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ard Biesheuvel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17checkpatch: printks always need a KERN_<LEVEL>Joe Perches1-22/+4
There was code in checkpatch that allowed continuation printks to be used without KERN_CONT. Remove the continuation check and always require a KERN_<LEVEL>. Link: http://lkml.kernel.org/r/61980ef41d5b9b6543da1c49055042e0ab74d308.1507047008.git.joe@perches.com Signed-off-by: Joe Perches <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17scripts/checkpatch.pl: avoid false warning missing breakHeinrich Schuchardt1-1/+1
void foo(int a) switch (a) { case 'h': fun1(); exit(1); default: } creates a warning "Possible switch case/default not preceded by break or fallthrough comment". exit( should be treated like return. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Heinrich Schuchardt <[email protected]> Acked-by: Joe Perches <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17checkpatch: support function pointers for unnamed function definition argumentsMiles Chen1-1/+1
Current unnamed function definition argument does not include function pointer cases and it reports something like: WARNING: function definition argument 'void' should also have an identifier name +unsigned int (*dummy)(void); Support function pointers for unnamed function arguments Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Miles Chen <[email protected]> Acked-by: Joe Perches <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17lib: test module for find_*_bit() functionsYury Norov3-0/+154
find_bit functions are widely used in the kernel, including hot paths. This module tests performance of those functions in 2 typical scenarios: randomly filled bitmap with relatively equal distribution of set and cleared bits, and sparse bitmap which has 1 set bit for 500 cleared bits. On ThunderX machine: Start testing find_bit() with random-filled bitmap find_next_bit: 240043 cycles, 164062 iterations find_next_zero_bit: 312848 cycles, 163619 iterations find_last_bit: 193748 cycles, 164062 iterations find_first_bit: 177720874 cycles, 164062 iterations Start testing find_bit() with sparse bitmap find_next_bit: 3633 cycles, 656 iterations find_next_zero_bit: 620399 cycles, 327025 iterations find_last_bit: 3038 cycles, 656 iterations find_first_bit: 691407 cycles, 656 iterations [[email protected]: use correct format string for find-bit tests] Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Yury Norov <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]> Reviewed-by: Clement Courbet <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Rasmus Villemoes <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17lib/rbtree-test: lower default paramsDavidlohr Bueso2-3/+3
Fengguang reported soft lockups while running the rbtree and interval tree test modules. The logic for these tests all occur in init phase, and we currently are pounding with the default values for number of nodes and number of iterations of each test. Reduce the latter by two orders of magnitude. This does not influence the value of the tests in that one thousand times by default is enough to get the picture. Link: http://lkml.kernel.org/r/20171109161715.xai2dtwqw2frhkcm@linux-n805 Signed-off-by: Davidlohr Bueso <[email protected]> Reported-by: Fengguang Wu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17tools/lib/traceevent/parse-filter.c: clean up clang build warningCheng Jian1-3/+3
The uniform structure filter_arg sets its union based on the difference of enum filter_arg_type, However, some functions use implicit type conversion obviously. warning: implicit conversion from enumeration type 'enum filter_exp_type' to different enumeration type 'enum filter_op_type' warning: implicit conversion from enumeration type 'enum filter_cmp_type' to different enumeration type 'enum filter_exp_type' Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Cheng Jian <[email protected]> Cc: Kate Stewart <[email protected]> Cc: Xie XiuQi <[email protected]> Cc: Li Bin <[email protected]> Cc: Steven Rostedt (Red Hat) <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17lib/nmi_backtrace.c: fix kernel text address leakLiu, Changcheng1-2/+2
Don't leak idle function address in NMI backtrace. Link: http://lkml.kernel.org/r/20171106165648.GA95243@sofia Signed-off-by: Liu Changcheng <[email protected]> Reviewed-by: Petr Mladek <[email protected]> Reviewed-by: Josh Poimboeuf <[email protected]> Reviewed-by: Sergey Senozhatsky <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17lib/genalloc.c: make the avail variable an atomic_long_tStephen Bates2-6/+7
If the amount of resources allocated to a gen_pool exceeds 2^32 then the avail atomic overflows and this causes problems when clients try and borrow resources from the pool. This is only expected to be an issue on 64 bit systems. Add the <linux/atomic.h> header to pull in atomic_long* operations. So that 32 bit systems continue to use atomic32_t but 64 bit systems can use atomic64_t. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Stephen Bates <[email protected]> Reviewed-by: Logan Gunthorpe <[email protected]> Reviewed-by: Mathieu Desnoyers <[email protected]> Reviewed-by: Daniel Mentz <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17lib/int_sqrt: adjust commentsPeter Zijlstra1-2/+2
Our current int_sqrt() is not rough nor any approximation; it calculates the exact value of: floor(sqrt()). Document this. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Linus Torvalds <[email protected]> Cc: Anshul Garg <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: David Miller <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Joe Perches <[email protected]> Cc: Kees Cook <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Michael Davidson <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17lib/int_sqrt: optimize initial value computePeter Zijlstra1-4/+2
The initial value (@m) compute is: m = 1UL << (BITS_PER_LONG - 2); while (m > x) m >>= 2; Which is a linear search for the highest even bit smaller or equal to @x We can implement this using a binary search using __fls() (or better when its hardware implemented). m = 1UL << (__fls(x) & ~1UL); Especially for small values of @x; which are the more common arguments when doing a CDF on idle times; the linear search is near to worst case, while the binary search of __fls() is a constant 6 (or 5 on 32bit) branches. cycles: branches: branch-misses: PRE: hot: 43.633557 +- 0.034373 45.333132 +- 0.002277 0.023529 +- 0.000681 cold: 207.438411 +- 0.125840 45.333132 +- 0.002277 6.976486 +- 0.004219 SOFTWARE FLS: hot: 29.576176 +- 0.028850 26.666730 +- 0.004511 0.019463 +- 0.000663 cold: 165.947136 +- 0.188406 26.666746 +- 0.004511 6.133897 +- 0.004386 HARDWARE FLS: hot: 24.720922 +- 0.025161 20.666784 +- 0.004509 0.020836 +- 0.000677 cold: 132.777197 +- 0.127471 20.666776 +- 0.004509 5.080285 +- 0.003874 Averages computed over all values <128k using a LFSR to generate order. Cold numbers have a LFSR based branch trace buffer 'confuser' ran between each int_sqrt() invocation. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Suggested-by: Joe Perches <[email protected]> Acked-by: Will Deacon <[email protected]> Acked-by: Linus Torvalds <[email protected]> Cc: Anshul Garg <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: David Miller <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kees Cook <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Michael Davidson <[email protected]> Cc: Thomas Gleixner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17lib/int_sqrt: optimize small argumentPeter Zijlstra1-0/+3
The current int_sqrt() computation is sub-optimal for the case of small @x. Which is the interesting case when we're going to do cumulative distribution functions on idle times, which we assume to be a random variable, where the target residency of the deepest idle state gives an upper bound on the variable (5e6ns on recent Intel chips). In the case of small @x, the compute loop: while (m != 0) { b = y + m; y >>= 1; if (x >= b) { x -= b; y += m; } m >>= 2; } can be reduced to: while (m > x) m >>= 2; Because y==0, b==m and until x>=m y will remain 0. And while this is computationally equivalent, it runs much faster because there's less code, in particular less branches. cycles: branches: branch-misses: OLD: hot: 45.109444 +- 0.044117 44.333392 +- 0.002254 0.018723 +- 0.000593 cold: 187.737379 +- 0.156678 44.333407 +- 0.002254 6.272844 +- 0.004305 PRE: hot: 67.937492 +- 0.064124 66.999535 +- 0.000488 0.066720 +- 0.001113 cold: 232.004379 +- 0.332811 66.999527 +- 0.000488 6.914634 +- 0.006568 POST: hot: 43.633557 +- 0.034373 45.333132 +- 0.002277 0.023529 +- 0.000681 cold: 207.438411 +- 0.125840 45.333132 +- 0.002277 6.976486 +- 0.004219 Averages computed over all values <128k using a LFSR to generate order. Cold numbers have a LFSR based branch trace buffer 'confuser' ran between each int_sqrt() invocation. Link: http://lkml.kernel.org/r/[email protected] Fixes: 30493cc9dddb ("lib/int_sqrt.c: optimize square root algorithm") Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Suggested-by: Anshul Garg <[email protected]> Acked-by: Linus Torvalds <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Will Deacon <[email protected]> Cc: Joe Perches <[email protected]> Cc: David Miller <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Kees Cook <[email protected]> Cc: Michael Davidson <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17lib/test: delete five error messages for failed memory allocationsMarkus Elfring3-15/+7
Omit extra messages for a memory allocation failure in these functions. This issue was detected by using the Coccinelle software. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Markus Elfring <[email protected]> Acked-by: Michal Hocko <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17lib: add module support to string testsGeert Uytterhoeven4-142/+143
Extract the string test code into its own source file, to allow compiling it either to a loadable module, or built into the kernel. Fixes: 03270c13c5ffaa6a ("lib/string.c: add testcases for memset16/32/64") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Geert Uytterhoeven <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17include/linux/radix-tree.h: remove unneeded #include <linux/bug.h>Masahiro Yamada1-1/+0
This include was added by commit 187f1882b5b0 ("BUG: headers with BUG/BUG_ON etc. need linux/bug.h") because BUG_ON() was used in this header at that time. Some time later, commit 6d75f366b924 ("lib: radix-tree: check accounting of existing slot replacement users") removed the use of BUG_ON() from this header. Since then, there is no reason to include <linux/bug.h>. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Masahiro Yamada <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Masahiro Yamada <[email protected]> Cc: Jan Kara <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Chris Mi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17include/linux/bitfield.h: include <linux/build_bug.h> instead of <linux/bug.h>Masahiro Yamada1-1/+1
Since commit bc6245e5efd7 ("bug: split BUILD_BUG stuff out into <linux/build_bug.h>"), #include <linux/build_bug.h> is better to pull minimal headers needed for BUILG_BUG() family. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Masahiro Yamada <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Cc: Dinan Gunawardena <[email protected]> Cc: Kalle Valo <[email protected]> Cc: Ian Abbott <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17get_maintainer: add more --self-test optionsJoe Perches1-17/+132
Add tests for duplicate section headers, missing section content, link and scm reachability. Miscellanea: o Add --self-test=<foo> options (a comma separated list of any of sections, patterns, links or scm) where the default without options is all tests o Rename check_maintainers_patterns to self_test o Rename self_test_pattern_info to self_test_info [[email protected]: improvements] Link: http://lkml.kernel.org/r/13e3986c374902fcf08ae947e36c5c608bbe3b79.1510075301.git.joe@perches.com Signed-off-by: Joe Perches <[email protected]> Reviewed-by: Tom Saeger <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17get_maintainer: add --self-test for internal consistency testsTom Saeger1-17/+77
Add "--self-test" option to get_maintainer.pl to show potential issues in MAINTAINERS file(s) content. Pattern check warnings are shown for "F" and "X" patterns found in MAINTAINERS file(s) which do not match any files known by git. Link: http://lkml.kernel.org/r/64994f911b3510d0f4c8ac2e113501dfcec1f3c9.1509559540.git.tom.saeger@oracle.com Signed-off-by: Tom Saeger <[email protected]> Acked-by: Joe Perches <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17dynamic_debug documentation: minor fixesRandy Dunlap1-3/+3
Fix minor typo. Fix missing words in explaining parsing of last line number. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Randy Dunlap <[email protected]> Cc: Jason Baron <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17dynamic-debug-howto: fix optional/omitted ending line number to be LARGE ↵Randy Dunlap1-0/+4
instead of 0 line-range is supposed to treat "1-" as "1-endoffile", so handle the special case by setting last_lineno to UINT_MAX. Fixes this error: dynamic_debug:ddebug_parse_query: last-line:0 < 1st-line:1 dynamic_debug:ddebug_exec_query: query parse failed Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Randy Dunlap <[email protected]> Acked-by: Jason Baron <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17kernel/umh.c: optimize 'proc_cap_handler()'Christophe JAILLET1-2/+2
If 'write' is 0, we can avoid a call to spin_lock/spin_unlock. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Christophe JAILLET <[email protected]> Acked-by: Luis R. Rodriguez <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17include/linux/compiler-clang.h: handle randomizable anonymous structsSandipan Das1-0/+3
The GCC randomize layout plugin can randomize the member offsets of sensitive kernel data structures. To use this feature, certain annotations and members are added to the structures which affect the member offsets even if this plugin is not used. All of these structures are completely randomized, except for task_struct which leaves out some of its members. All the other members are wrapped within an anonymous struct with the __randomize_layout attribute. This is done using the randomized_struct_fields_start and randomized_struct_fields_end defines. When the plugin is disabled, the behaviour of this attribute can vary based on the GCC version. For GCC 5.1+, this attribute maps to __designated_init otherwise it is just an empty define but the anonymous structure is still present. For other compilers, both randomized_struct_fields_start and randomized_struct_fields_end default to empty defines meaning the anonymous structure is not introduced at all. So, if a module compiled with Clang, such as a BPF program, needs to access task_struct fields such as pid and comm, the offsets of these members as recognized by Clang are different from those recognized by modules compiled with GCC. If GCC 4.6+ is used to build the kernel, this can be solved by introducing appropriate defines for Clang so that the anonymous structure is seen when determining the offsets for the members. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Sandipan Das <[email protected]> Cc: David Rientjes <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Kate Stewart <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: Alexei Starovoitov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17bug: fix "cut here" location for __WARN_TAINT architecturesKees Cook2-3/+18
Prior to v4.11, x86 used warn_slowpath_fmt() for handling WARN()s. After WARN() was moved to using UD0 on x86, the warning text started appearing _before_ the "cut here" line. This appears to have been a long-standing bug on architectures that used __WARN_TAINT, but it didn't get fixed. v4.11 and earlier on x86: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 2956 at drivers/misc/lkdtm_bugs.c:65 lkdtm_WARNING+0x21/0x30 This is a warning message Modules linked in: v4.12 and later on x86: This is a warning message ------------[ cut here ]------------ WARNING: CPU: 1 PID: 2982 at drivers/misc/lkdtm_bugs.c:68 lkdtm_WARNING+0x15/0x20 Modules linked in: With this fix: ------------[ cut here ]------------ This is a warning message WARNING: CPU: 3 PID: 3009 at drivers/misc/lkdtm_bugs.c:67 lkdtm_WARNING+0x15/0x20 Since the __FILE__ reporting happens as part of the UD0 handler, it isn't trivial to move the message to after the WARNING line, but at least we can fix the position of the "cut here" line so all the various logging tools will start including the actual runtime warning message again, when they follow the instruction and "cut here". Link: http://lkml.kernel.org/r/[email protected] Fixes: 9a93848fe787 ("x86/debug: Implement __WARN() using UD0") Signed-off-by: Kees Cook <[email protected]> Cc: Peter Zijlstra (Intel) <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Fengguang Wu <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17bug: define the "cut here" string in a single placeKees Cook4-3/+5
The "cut here" string is used in a few paths. Define it in a single place. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Kees Cook <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Fengguang Wu <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17lkdtm: include WARN format stringKees Cook1-1/+3
In order to test the ordering of WARN format strings, actually include one in LKDTM. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Kees Cook <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Fengguang Wu <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17iopoll: avoid -Wint-in-bool-context warningArnd Bergmann1-9/+15
When we pass the result of a multiplication as the timeout or the delay, we can get a warning from gcc-7: drivers/mmc/host/bcm2835.c:596:149: error: '*' in boolean context, suggest '&&' instead [-Werror=int-in-bool-context] drivers/mfd/arizona-core.c:247:195: error: '*' in boolean context, suggest '&&' instead [-Werror=int-in-bool-context] drivers/gpu/drm/sun4i/sun4i_hdmi_i2c.c:49:27: error: '*' in boolean context, suggest '&&' instead [-Werror=int-in-bool-context] The warning is a bit questionable inside of a macro, but this is intentional on the side of the gcc developers. It is also an indication of another problem: we evaluate the timeout and sleep arguments multiple times, which can have undesired side-effects when those are complex expressions. This changes the two iopoll variants to use local variables for storing copies of the timeouts. This adds some more type safety, and avoids both the double-evaluation and the gcc warning. Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81484 Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnd Bergmann <[email protected]> Reviewed-by: Mark Brown <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17parse-maintainers: add ability to specify filenamesJoe Perches1-5/+47
parse-maintainers.pl is convenient, but currently hard-codes the filenames that are used. Allow user-specified filenames to simplify the use of the script. Link: http://lkml.kernel.org/r/48703c068b3235223ffa3b2eb268fa0a125b25e0.1502251549.git.joe@perches.com Signed-off-by: Joe Perches <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17kernel debug: support resetting WARN_ONCE for all architecturesAndi Kleen3-1/+30
Some architectures store the WARN_ONCE state in the flags field of the bug_entry. Clear that one too when resetting once state through /sys/kernel/debug/clear_warn_once Pointed out by Michael Ellerman Improves the earlier patch that add clear_warn_once. [[email protected]: add a missing ifdef CONFIG_MODULES] Link: http://lkml.kernel.org/r/[email protected] [[email protected]: fix unused var warning] [[email protected]: Use 0200 for clear_warn_once file, per mpe] [[email protected]: clear BUGFLAG_DONE in clear_once_table(), per mpe] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Andi Kleen <[email protected]> Tested-by: Michael Ellerman <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17kernel debug: support resetting WARN*_ONCEAndi Kleen5-3/+42
I like _ONCE warnings because it's guaranteed that they don't flood the log. During testing I find it useful to reset the state of the once warnings, so that I can rerun tests and see if they trigger again, or can guarantee that a test run always hits the same warnings. This patch adds a debugfs interface to reset all the _ONCE warnings so that they appear again: echo 1 > /sys/kernel/debug/clear_warn_once This is implemented by putting all the warning booleans into a special section, and clearing it. [[email protected]: coding-style fixes] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Andi Kleen <[email protected]> Tested-by: Michael Ellerman <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17sh/boot: add static stack-protector to pre-kernelKees Cook1-0/+14
The sh decompressor code triggers stack-protector code generation when using CONFIG_CC_STACKPROTECTOR_STRONG. As done for arm and mips, add a simple static stack-protector canary. As this wasn't protected before, the risk of using a weak canary is minimized. Once the kernel is actually up, a better canary is chosen. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Kees Cook <[email protected]> Cc: Yoshinori Sato <[email protected]> Cc: Rich Felker <[email protected]> Cc: Al Viro <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Laura Abbott <[email protected]> Cc: Masahiro Yamada <[email protected]> Cc: Michal Marek <[email protected]> Cc: Nicholas Piggin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17spelling.txt: add "unnecessary" typo variantsJoe Perches1-0/+4
Add unnecessary typos by copying the necessary typos. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Joe Perches <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17proc: use do-while in name_to_int()Alexey Dobriyan1-2/+2
Gcc doesn't know that "len" is guaranteed to be >=1 by dcache and generates standard while-loop prologue duplicating loop condition. add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-27 (-27) function old new delta name_to_int 104 77 -27 Link: http://lkml.kernel.org/r/20170912195213.GB17730@avx2 Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17proc: : uninline name_to_int()Alexey Dobriyan3-22/+25
Save ~360 bytes. add/remove: 1/0 grow/shrink: 0/4 up/down: 104/-463 (-359) function old new delta name_to_int - 104 +104 proc_pid_lookup 217 126 -91 proc_lookupfd_common 212 121 -91 proc_task_lookup 289 194 -95 __proc_create 588 402 -186 Link: http://lkml.kernel.org/r/20170912194850.GA17730@avx2 Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17proc, coredump: add CoreDumping flag to /proc/pid/statusRoman Gushchin2-0/+9
Right now there is no convenient way to check if a process is being coredumped at the moment. It might be necessary to recognize such state to prevent killing the process and getting a broken coredump. Writing a large core might take significant time, and the process is unresponsive during it, so it might be killed by timeout, if another process is monitoring and killing/restarting hanging tasks. We're getting a significant number of corrupted coredump files on machines in our fleet, just because processes are being killed by timeout in the middle of the core writing process. We do have a process health check, and some agent is responsible for restarting processes which are not responding for health check requests. Writing a large coredump to the disk can easily exceed the reasonable timeout (especially on an overloaded machine). This flag will allow the agent to distinguish processes which are being coredumped, extend the timeout for them, and let them produce a full coredump file. To provide an ability to detect if a process is in the state of being coredumped, we can expose a boolean CoreDumping flag in /proc/pid/status. Example: $ cat core.sh #!/bin/sh echo "|/usr/bin/sleep 10" > /proc/sys/kernel/core_pattern sleep 1000 & PID=$! cat /proc/$PID/status | grep CoreDumping kill -ABRT $PID sleep 1 cat /proc/$PID/status | grep CoreDumping $ ./core.sh CoreDumping: 0 CoreDumping: 1 [[email protected]: document CoreDumping flag in /proc/<pid>/status] Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Roman Gushchin <[email protected]> Cc: Alexander Viro <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Konstantin Khlebnikov <[email protected]> Cc: Oleg Nesterov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17mm, compaction: remove unneeded pageblock_skip_persistent() checksVlastimil Babka1-15/+3
Commit f3c931633a59 ("mm, compaction: persistently skip hugetlbfs pageblocks") has introduced pageblock_skip_persistent() checks into migration and free scanners, to make sure pageblocks that should be persistently skipped are marked as such, regardless of the ignore_skip_hint flag. Since the previous patch introduced a new no_set_skip_hint flag, the ignore flag no longer prevents marking pageblocks as skipped. Therefore we can remove the special cases. The relevant pageblocks will be marked as skipped by the common logic which marks each pageblock where no page could be isolated. This makes the code simpler. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Vlastimil Babka <[email protected]> Cc: Mel Gorman <[email protected]> Cc: David Rientjes <[email protected]> Cc: Joonsoo Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17mm, compaction: split off flag for not updating skip hintsVlastimil Babka3-1/+3
Pageblock skip hints were added as a heuristic for compaction, which shares core code with CMA. Since CMA reliability would suffer from the heuristics, compact_control flag ignore_skip_hint was added for the CMA use case. Since 6815bf3f233e ("mm/compaction: respect ignore_skip_hint in update_pageblock_skip") the flag also means that CMA won't *update* the skip hints in addition to ignoring them. Today, direct compaction can also ignore the skip hints in the last resort attempt, but there's no reason not to set them when isolation fails in such case. Thus, this patch splits off a new no_set_skip_hint flag to avoid the updating, which only CMA sets. This should improve the heuristics a bit, and allow us to simplify the persistent skip bit handling as the next step. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Vlastimil Babka <[email protected]> Acked-by: Mel Gorman <[email protected]> Cc: David Rientjes <[email protected]> Cc: Joonsoo Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17mm, compaction: extend pageblock_skip_persistent() to all compound pagesVlastimil Babka1-11/+14
pageblock_skip_persistent() checks for HugeTLB pages of pageblock order. When clearing pageblock skip bits for compaction, the bits are not cleared for such pageblocks, because they cannot contain base pages suitable for migration, nor free pages to use as migration targets. This optimization can be simply extended to all compound pages of order equal or larger than pageblock order, because migrating such pages (if they support it) cannot help sub-pageblock fragmentation. This includes THP's and also gigantic HugeTLB pages, which the current implementation doesn't persistently skip due to a strict pageblock_order equality check and not recognizing tail pages. While THP pages are generally less "persistent" than HugeTLB, we can still expect that if a THP exists at the point of __reset_isolation_suitable(), it will exist also during the subsequent compaction run. The time difference here could be actually smaller than between a compaction run that sets a (non-persistent) skip bit on a THP, and the next compaction run that observes it. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Vlastimil Babka <[email protected]> Acked-by: Mel Gorman <[email protected]> Acked-by: David Rientjes <[email protected]> Cc: Joonsoo Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17mm, compaction: persistently skip hugetlbfs pageblocksDavid Rientjes2-12/+55
It is pointless to migrate hugetlb memory as part of memory compaction if the hugetlb size is equal to the pageblock order. No defragmentation is occurring in this condition. It is also pointless to for the freeing scanner to scan a pageblock where a hugetlb page is pinned. Unconditionally skip these pageblocks, and do so peristently so that they are not rescanned until it is observed that these hugepages are no longer pinned. It would also be possible to do this by involving the hugetlb subsystem in marking pageblocks to no longer be skipped when they hugetlb pages are freed. This is a simple solution that doesn't involve any additional subsystems in pageblock skip manipulation. [[email protected]: fix build] Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: David Rientjes <[email protected]> Tested-by: Michal Hocko <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Mel Gorman <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17mm, compaction: kcompactd should not ignore pageblock skipDavid Rientjes1-2/+1
Kcompactd is needlessly ignoring pageblock skip information. It is doing MIGRATE_SYNC_LIGHT compaction, which is no more powerful than MIGRATE_SYNC compaction. If compaction recently failed to isolate memory from a set of pageblocks, there is nothing to indicate that kcompactd will be able to do so, or that it is beneficial from attempting to isolate memory. Use the pageblock skip hint to avoid rescanning pageblocks needlessly until that information is reset. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: David Rientjes <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Mel Gorman <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17mm: shmem: remove unused info variableCorentin Labbe1-2/+0
Fix the following warning by removing the unused variable: mm/shmem.c:3205:27: warning: variable 'info' set but not used [-Wunused-but-set-variable] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Corentin Labbe <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Hugh Dickins <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17lib/dma-debug.c: fix incorrect pfn calculationMiles Chen1-2/+18
dma-debug reports the following warning: WARNING: CPU: 3 PID: 298 at kernel-4.4/lib/dma-debug.c:604 debug _dma_assert_idle+0x1a8/0x230() DMA-API: cpu touching an active dma mapped cacheline [cln=0x00000882300] CPU: 3 PID: 298 Comm: vold Tainted: G W O 4.4.22+ #1 Hardware name: MT6739 (DT) Call trace: debug_dma_assert_idle+0x1a8/0x230 wp_page_copy.isra.96+0x118/0x520 do_wp_page+0x4fc/0x534 handle_mm_fault+0xd4c/0x1310 do_page_fault+0x1c8/0x394 do_mem_abort+0x50/0xec I found that debug_dma_alloc_coherent() and debug_dma_free_coherent() assume that dma_alloc_coherent() always returns a linear address. However it's possible that dma_alloc_coherent() returns a non-linear address. In this case, page_to_pfn(virt_to_page(virt)) will return an incorrect pfn. If the pfn is valid and mapped as a COW page, we will hit the warning when doing wp_page_copy(). Fix this by calculating pfn for linear and non-linear addresses. [[email protected]: v4] Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Miles Chen <[email protected]> Reviewed-by: Robin Murphy <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Marek Szyprowski <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17mm/z3fold.c: use kref to prevent page free/compact raceVitaly Wool1-2/+8
There is a race in the current z3fold implementation between do_compact() called in a work queue context and the page release procedure when page's kref goes to 0. do_compact() may be waiting for page lock, which is released by release_z3fold_page_locked right before putting the page onto the "stale" list, and then the page may be freed as do_compact() modifies its contents. The mechanism currently implemented to handle that (checking the PAGE_STALE flag) is not reliable enough. Instead, we'll use page's kref counter to guarantee that the page is not released if its compaction is scheduled. It then becomes compaction function's responsibility to decrease the counter and quit immediately if the page was actually freed. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Vitaly Wool <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17mm: fix nodemask printingArnd Bergmann1-3/+10
The cleanup caused build warnings for constant mask pointers: mm/mempolicy.c: In function `mpol_to_str': ./include/linux/nodemask.h:108:11: warning: the comparison will always evaluate as `true' for the address of `nodes' will never be NULL [-Waddress] An earlier workaround I suggested was incorporated in the version that got merged, but that only solved the problem for gcc-7 and higher, while gcc-4.6 through gcc-6.x still warn. This changes the printing again to use inline functions that make it clear to the compiler that the line that does the NULL check has no idea whether the argument is a constant NULL. Link: http://lkml.kernel.org/r/[email protected] Fixes: 0205f75571e3 ("mm: simplify nodemask printing") Signed-off-by: Arnd Bergmann <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Stephen Rothwell <[email protected]> Cc: Zhangshaokun <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-17Merge tag 'trace-v4.15' of ↵Linus Torvalds32-518/+903
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from - allow module init functions to be traced - clean up some unused or not used by config events (saves space) - clean up of trace histogram code - add support for preempt and interrupt enabled/disable events - other various clean ups * tag 'trace-v4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (30 commits) tracing, thermal: Hide cpu cooling trace events when not in use tracing, thermal: Hide devfreq trace events when not in use ftrace: Kill FTRACE_OPS_FL_PER_CPU perf/ftrace: Small cleanup perf/ftrace: Fix function trace events perf/ftrace: Revert ("perf/ftrace: Fix double traces of perf on ftrace:function") tracing, dma-buf: Remove unused trace event dma_fence_annotate_wait_on tracing, memcg, vmscan: Hide trace events when not in use tracing/xen: Hide events that are not used when X86_PAE is not defined tracing: mark trace_test_buffer as __maybe_unused printk: Remove superfluous memory barriers from printk_safe ftrace: Clear hashes of stale ips of init memory tracing: Add support for preempt and irq enable/disable events tracing: Prepare to add preempt and irq trace events ftrace/kallsyms: Have /proc/kallsyms show saved mod init functions ftrace: Add freeing algorithm to free ftrace_mod_maps ftrace: Save module init functions kallsyms symbols for tracing ftrace: Allow module init functions to be traced ftrace: Add a ftrace_free_mem() function for modules to use tracing: Reimplement log2 ...
2017-11-17Merge tag 'linux-kselftest-4.15-rc1' of ↵Linus Torvalds55-71/+1303
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kselftest updates from Shuah Khan: "This update to Kselftest consists of cleanup patches, fixes, and a new test for ion buffer sharing. Fixes include changes to skip firmware tests on systems that aren't configured to support them, as opposed to failing them" * tag 'linux-kselftest-4.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests: firmware: skip unsupported custom firmware fallback tests selftests: firmware: skip unsupported async loading tests selftests: memfd_test.c: fix compilation warning. selftests/ftrace: Introduce exit_pass and exit_fail selftests: ftrace: add more config fragments android/ion: userspace test utility for ion buffer sharing selftests: remove obsolete kconfig fragment for cpu-hotplug selftests: vdso_test: support ARM64 targets selftests/ftrace: Do not use arch dependent do_IRQ as a target function selftests: breakpoints: fix compile error on breakpoint_test_arm64 selftests: add missing test result status in memory-hotplug test selftests/exec: include cwd in long path calculation selftests: seccomp: update .gitignore with newly added tests selftests: vm: Update .gitignore with newly added tests selftests: timers: Update .gitignore with newly added tests