aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2016-12-12Merge branch 'smp-hotplug-for-linus' of ↵Linus Torvalds52-1657/+894
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull smp hotplug updates from Thomas Gleixner: "This is the final round of converting the notifier mess to the state machine. The removal of the notifiers and the related infrastructure will happen around rc1, as there are conversions outstanding in other trees. The whole exercise removed about 2000 lines of code in total and in course of the conversion several dozen bugs got fixed. The new mechanism allows to test almost every hotplug step standalone, so usage sites can exercise all transitions extensively. There is more room for improvement, like integrating all the pointlessly different architecture mechanisms of synchronizing, setting cpus online etc into the core code" * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits) tracing/rb: Init the CPU mask on allocation soc/fsl/qbman: Convert to hotplug state machine soc/fsl/qbman: Convert to hotplug state machine zram: Convert to hotplug state machine KVM/PPC/Book3S HV: Convert to hotplug state machine arm64/cpuinfo: Convert to hotplug state machine arm64/cpuinfo: Make hotplug notifier symmetric mm/compaction: Convert to hotplug state machine iommu/vt-d: Convert to hotplug state machine mm/zswap: Convert pool to hotplug state machine mm/zswap: Convert dst-mem to hotplug state machine mm/zsmalloc: Convert to hotplug state machine mm/vmstat: Convert to hotplug state machine mm/vmstat: Avoid on each online CPU loops mm/vmstat: Drop get_online_cpus() from init_cpu_node_state/vmstat_cpu_dead() tracing/rb: Convert to hotplug state machine oprofile/nmi timer: Convert to hotplug state machine net/iucv: Use explicit clean up labels in iucv_init() x86/pci/amd-bus: Convert to hotplug state machine x86/oprofile/nmi: Convert to hotplug state machine ...
2016-12-13MAINTAINERS: Samsung: Update maintainer for PWM FAN and SAMSUNG THERMALLukasz Majewski1-2/+2
Since I leave Samsung, I would like to step down from maintenance duties. Bartek Zolnierkiewicz will replace. Signed-off-by: Lukasz Majewski <[email protected]> Acked-by: Guenter Roeck <[email protected]> Signed-off-by: Zhang Rui <[email protected]>
2016-12-12init: reduce rootwait polling interval time to 5msJungseung Lee1-1/+1
For several devices, the rootwait time is sensitive because it directly affects booting time. The polling interval of rootwait is currently 100ms. To save unnessesary waiting time, reduce the polling interval to 5 ms. [[email protected]: remove used-once #define] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jungseung Lee <[email protected]> Cc: Al Viro <[email protected]> Cc: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12binfmt_elf: use vmalloc() for allocation of vma_fileszJason Baron1-2/+4
We have observed page allocations failures of order 4 during core dump while trying to allocate vma_filesz. This results in a useless core file of size 0. To improve reliability use vmalloc(). Note that the vmalloc() allocation is bounded by sysctl_max_map_count, which is 65,530 by default. So with a 4k page size, and 8 bytes per seg, this is a max of 128 pages or an order 7 allocation. Other parts of the core dump path, such as fill_files_note() are already using vmalloc() for presumably similar reasons. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jason Baron <[email protected]> Cc: Al Viro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12checkpatch: don't emit unified-diff error for rename-only patchesAndrew Jeffery1-0/+1
I generated a patch with `git format-patch` which checkpatch thinks is invalid: $ ./scripts/checkpatch.pl lpc-dt/0006-mfd-dt-Move-syscon-bindings-to-syscon-subdirectory.patch WARNING: added, moved or deleted file(s), does MAINTAINERS need updating? Documentation/devicetree/bindings/mfd/{ => syscon}/aspeed-scu.txt | 0 ERROR: Does not appear to be a unified-diff format patch total: 1 errors, 1 warnings, 0 lines checked NOTE: For some of the reported defects, checkpatch may be able to mechanically convert to the typical style using --fix or --fix-inplace. lpc-dt/0006-mfd-dt-Move-syscon-bindings-to-syscon-subdirectory.patch has style problems, please review. NOTE: If any of the errors are false positives, please report them to the maintainer, see CHECKPATCH in MAINTAINERS. The patch in question was all renames with no edits, giving 100% similarity and thus no diff markers. Set '$is_patch = 1;' in the add/remove/rename detection to avoid generating spurious warnings. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Andrew Jeffery <[email protected]> Acked-by: Joe Perches <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12checkpatch: don't check c99 types like uint8_t under toolsTomas Winkler1-1/+2
Tools contains user space code so uintX_t types are just fine. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Tomas Winkler <[email protected]> Acked-by: Joe Perches <[email protected]> Cc: Andy Whitcroft <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12checkpatch: avoid multiple line dereferencesJoe Perches1-0/+12
Code that puts a single dereferencing identifier on multiple lines like: struct_identifier->member[index]. member = <foo>; is generally hard to follow. Prefer that dereferencing identifiers be single line. Link: http://lkml.kernel.org/r/e9c191ae3f41bedc8ffd5c0fbcc5a1cec1d1d2df.1478120869.git.joe@perches.com Signed-off-by: Joe Perches <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12checkpatch: don't check .pl files, improve absolute path commit log testJoe Perches1-15/+15
perl files (*.pl) are mostly inappropriate to check coding styles so exempt them from long line checks and various .[ch] file type tests. And as well, only scan absolute paths in the commit log, not in the patch. Link: http://lkml.kernel.org/r/85b101d50acafe6c0261d9f7df283c827da52c4a.1477340110.git.joe@perches.com Signed-off-by: Joe Perches <[email protected]> Cc: Andy Whitcroft <[email protected]> Cc: Heinrich Schuchardt <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12scripts/checkpatch.pl: fix spellingAndrew Morton1-1/+1
s/preceeded/preceded/ Cc: Joe Perches <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12checkpatch: don't try to get maintained status when --no-tree is givenJerome Forissier1-1/+1
Fixes the following warning: Use of uninitialized value $root in concatenation (.) or string at /path/to/checkpatch.pl line 764. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jerome Forissier <[email protected]> Reviewed-by: Brian Norris <[email protected]> Acked-by: Joe Perches <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12lib/ida: document locking requirements a bit betterDaniel Vetter1-0/+11
I wanted to wrap a bunch of ida_simple_get calls into their own locking, until I dug around and read the original commit message. Stuff like this should imo be added to the kernel doc, let's do that. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Daniel Vetter <[email protected]> Acked-by: Tejun Heo <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Vlastimil Babka <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12lib/rbtree.c: fix typo in comment of ____rb_erase_colorJie Chen1-4/+19
In Case 3 of `sibling == parent->rb_right': Right rotation will not change color of sl and S in the diagram (i.e. should not change "sl" to "Sl", "S" to "s") In Case 3 of `sibling == parent->rb_left': (p) (p) / \ / \ S N --> sr N / \ / Sl sr S / Sl This is actually left rotation at "S", not right rotation. In Case 4 of `sibling == parent->rb_left': (p) (s) / \ / \ S N --> Sl P / \ / \ sl (sr) (sr) N This is actually right rotation at "(p)" + color flips, not left rotation + color flips. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jie Chen <[email protected]> Cc: Wei Yang <[email protected]> Cc: Xiao Guangrong <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12lib/Kconfig.debug: make CONFIG_STRICT_DEVMEM depend on CONFIG_DEVMEMDave Young1-1/+1
With CONFIG_DEVMEM not set, CONFIG_STRICT_DEVMEM will be useless even if it is set =y, thus let's update the dependency in Kconfig. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Dave Young <[email protected]> Acked-by: Kees Cook <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Dan Williams <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Nikolay Aleksandrov <[email protected]> Cc: Dmitry Vyukov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12MAINTAINERS: add drm and drm/i915 irc channelsJani Nikula1-0/+2
Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jani Nikula <[email protected]> Reviewed-by: Andrew Donnellan <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Dave Airlie <[email protected]> Cc: Joe Perches <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12MAINTAINERS: add "C:" for URI for chat where developers hang outJani Nikula1-0/+2
Make it easier to find the developer chat for the subsystem or driver. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jani Nikula <[email protected]> Reviewed-by: Andrew Donnellan <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Dave Airlie <[email protected]> Cc: Joe Perches <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12MAINTAINERS: add drm and drm/i915 bug filing infoJani Nikula1-0/+2
Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jani Nikula <[email protected]> Reviewed-by: Andrew Donnellan <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Dave Airlie <[email protected]> Cc: Joe Perches <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12MAINTAINERS: add "B:" for URI where to file bugsJani Nikula1-0/+2
Different subsystems and drivers have different preferences for where to file bugs and what information to include. Add "B:" entry for specifying the URI for the bug tracker directly, a web page for detailed info on filing bugs, or a mailto: URI. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jani Nikula <[email protected]> Reviewed-by: Andrew Donnellan <[email protected]> Acked-by: Daniel Vetter <[email protected]> Cc: Joe Perches <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12get_maintainer: look for arbitrary letter prefixes in sectionsJoe Perches1-3/+9
Jani Nikula proposes patches to add a few new letter prefixes for "B:" bug reporting and "C:" maintainer chatting to the various sections of MAINTAINERS. Add a generic mechanism to get_maintainer.pl to find sections that have any combination of "[A-Z]" letter prefix types in a section. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Joe Perches <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Dave Airlie <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12printk: add Kconfig option to set default console loglevelOlof Johansson2-1/+25
Add a configuration option to set the default console loglevel. This is, as before, still possible to override at runtime through bootargs (loglevel=<x>), sysrq and /proc/printk. There are cases where adding additional arguments on the commandline is impractical, and changing the default for the kernel when being built makes more sense. Provide such a method here, for those who choose to do so. Also, while touching this code, clarify the difference between MESSAGE_LOGLEVEL_DEFAULT and CONSOLE_LOGLEVEL_DEFAULT. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Olof Johansson <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12printk/sound: handle more message headersPetr Mladek1-6/+14
Commit 4bcc595ccd80 ("printk: reinstate KERN_CONT for printing continuation lines") allows to define more message headers for a single message. The motivation is that continuous lines might get mixed. Therefore it make sense to define the right log level for every piece of a cont line. This patch allows to copy only the real message level. We should ignore KERN_CONT because <filename:line> is added for each message. By other words, we want to know where each piece of the line comes from. [[email protected]: fix a check of the valid message level] Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Petr Mladek <[email protected]> Cc: Joe Perches <[email protected]> Cc: Sergey Senozhatsky <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Jason Wessel <[email protected]> Cc: Jaroslav Kysela <[email protected]> Cc: Takashi Iwai <[email protected]> Cc: Chris Mason <[email protected]> Cc: Josef Bacik <[email protected]> Cc: David Sterba <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12printk/btrfs: handle more message headersPetr Mladek2-11/+17
Commit 4bcc595ccd80 ("printk: reinstate KERN_CONT for printing continuation lines") allows to define more message headers for a single message. The motivation is that continuous lines might get mixed. Therefore it make sense to define the right log level for every piece of a cont line. The current btrfs_printk() macros do not support continuous lines at the moment. But better be prepared for a custom messages and avoid potential "lvl" buffer overflow. This patch iterates over the entire message header. It is interested only into the message level like the original code. This patch also introduces PRINTK_MAX_SINGLE_HEADER_LEN. Three bytes are enough for the message level header at the moment. But it used to be three, see the commit 04d2c8c83d0e ("printk: convert the format for KERN_<LEVEL> to a 2 byte pattern"). Also I fixed the default ratelimit level. It looked very strange when it was different from the default log level. [[email protected]: Fix a check of the valid message level] Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Petr Mladek <[email protected]> Acked-by: David Sterba <[email protected]> Cc: Joe Perches <[email protected]> Cc: Sergey Senozhatsky <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Jason Wessel <[email protected]> Cc: Jaroslav Kysela <[email protected]> Cc: Takashi Iwai <[email protected]> Cc: Chris Mason <[email protected]> Cc: Josef Bacik <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12printk/kdb: handle more message headersPetr Mladek2-1/+9
Commit 4bcc595ccd80 ("printk: reinstate KERN_CONT for printing continuation lines") allows to define more message headers for a single message. The motivation is that continuous lines might get mixed. Therefore it make sense to define the right log level for every piece of a cont line. This patch introduces printk_skip_headers() that will skip all headers and uses it in the kdb code instead of printk_skip_level(). This approach helps to fix other printk_skip_level() users independently. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Petr Mladek <[email protected]> Cc: Joe Perches <[email protected]> Cc: Sergey Senozhatsky <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Jason Wessel <[email protected]> Cc: Jaroslav Kysela <[email protected]> Cc: Takashi Iwai <[email protected]> Cc: Chris Mason <[email protected]> Cc: Josef Bacik <[email protected]> Cc: David Sterba <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12printk/NMI: handle continuous lines and missing newlinePetr Mladek1-28/+50
Commit 4bcc595ccd80 ("printk: reinstate KERN_CONT for printing continuation lines") added back KERN_CONT message header. As a result it might appear in the middle of the line when the parts are squashed via the temporary NMI buffer. A reasonable solution seems to be to split the text in the NNI temporary not only by newlines but also by the message headers. Another solution would be to filter out KERN_CONT when writing to the temporary buffer. But this would complicate the lockless handling. Also it would not solve problems with a missing newline that was there even before the KERN_CONT stuff. This patch moves the temporary buffer handling into separate function. I played with it and it seems that using the char pointers make the code easier to read. Also it prints the final newline as a continuous line. Finally, it moves handling of the s->len overflow into the paranoid check. And allows to recover from the disaster. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Petr Mladek <[email protected]> Reviewed-by: Sergey Senozhatsky <[email protected]> Cc: Joe Perches <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Jason Wessel <[email protected]> Cc: Jaroslav Kysela <[email protected]> Cc: Takashi Iwai <[email protected]> Cc: Chris Mason <[email protected]> Cc: Josef Bacik <[email protected]> Cc: David Sterba <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12printk/NMI: fix up handling of the full nmi log bufferPetr Mladek1-2/+3
vsnprintf() adds the trailing '\0' but it does not count it into the number of printed characters. The result is that there is one byte less space for the real characters in the buffer. The broken check for the free space might cause that we will repeatedly try to print 1 character into the buffer, never reach the full buffer, and do not count the messages as missed. Also vsnprintf() returns the number of characters that would be printed if the buffer was big enough. As a result, s->len might be bigger than the size of the buffer[*]. And the printk() function might return bigger len than it really printed. Both problems are fixed by using vscnprintf() instead. Note that I though about increasing the number of missed messages even when the message was shrunken. But it made the code even more complicated. I think that it is not worth it. Shrunken messages are usually easy to recognize. And it should be a corner case. [*] The overflown s->len value is crazy and unexpected. I "made a mistake" and reported this situation as an internal error when fixed handling of PR_CONT headers in some other patch. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Petr Mladek <[email protected]> CcL Sergey Senozhatsky <[email protected]> Cc: Chris Mason <[email protected]> Cc: David Sterba <[email protected]> Cc: Jason Wessel <[email protected]> Cc: Josef Bacik <[email protected]> Cc: Joe Perches <[email protected]> Cc: Jaroslav Kysela <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Takashi Iwai <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12compiler-gcc.h: use "proved" instead of "proofed"Benjamin Peterson1-1/+1
Link: http://lkml.kernel.org/r/1477894241.1103202.772260161.1B0A5995@webmail.messagingengine.com Signed-off-by: Benjamin Peterson <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12hung_task: decrement sysctl_hung_task_warnings only if it is positiveTetsuo Handa1-1/+2
Since sysctl_hung_task_warnings == -1 is allowed (infinite warnings), commit 48a6d64edadb ("hung_task: allow hung_task_panic when hung_task_warnings is 0") should decrement it only when it is not -1. This prevents the kernel from ceasing warnings after the first 4294967295 ;) Signed-off-by: Tetsuo Handa <[email protected]> Cc: John Siddle <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12fs/proc: calculate /proc/* and /proc/*/task/* nlink at init timeAlexey Dobriyan3-6/+15
Runtime nlink calculation works but meh. I don't know how to do it at compile time, but I know how to do it at init time. Shift "2+" part into init time as a bonus. Link: http://lkml.kernel.org/r/20161122195549.GB29812@avx2 Signed-off-by: Alexey Dobriyan <[email protected]> Reviewed-by: Vegard Nossum <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12fs/proc/base.c: save decrement during lookup/readdir in /proc/$PIDAlexey Dobriyan1-4/+4
Comparison for "<" works equally well as comparison for "<=" but one SUB/LEA is saved (no, it is not optimised away, at least here). Link: http://lkml.kernel.org/r/20161122195143.GA29812@avx2 Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12fs/proc/array.c: slightly improve render_sigset_tRasmus Villemoes1-1/+1
format_decode and vsnprintf occasionally show up in perf top, so I went looking for places that might not need the full printf power. With the help of kprobes, I gathered some statistics on which format strings we mostly pass to vsnprintf. On a trivial desktop workload, I hit "%x" 25% of the time, so something apparently reads /proc/pid/status (which does 5*16 printf("%x") calls) a lot. With this patch, reading /proc/pid/status is 30% faster according to this microbenchmark: char buf[4096]; int i, fd; for (i = 0; i < 10000; ++i) { fd = open("/proc/self/status", O_RDONLY); read(fd, buf, sizeof(buf)); close(fd); } Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Rasmus Villemoes <[email protected]> Acked-by: Andrei Vagin <[email protected]> Acked-by: Kees Cook <[email protected]> Cc: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12proc: tweak comments about 2 stage open and everythingAlexey Dobriyan1-8/+21
Some comments were obsoleted since commit 05c0ae21c034 ("try a saner locking for pde_opener..."). Some new comments added. Some confusing comments replaced with equally confusing ones. Link: http://lkml.kernel.org/r/20161029160231.GD1246@avx2 Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12proc: kmalloc struct pde_openerAlexey Dobriyan1-1/+3
kzalloc is too much, half of the fields will be reinitialized anyway. If proc file doesn't have ->release hook (some still do not), clearing is unnecessary because it will be freed immediately. Link: http://lkml.kernel.org/r/20161029155747.GC1246@avx2 Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12proc: fix type of struct pde_opener::closing fieldAlexey Dobriyan2-2/+2
struct pde_opener::closing is boolean. Link: http://lkml.kernel.org/r/20161029155439.GB1246@avx2 Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12proc: just list_del() struct pde_openerAlexey Dobriyan1-1/+1
list_del_init() is too much, structure will be freed in three lines anyway. Link: http://lkml.kernel.org/r/20161029155313.GA1246@avx2 Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12proc: make struct struct map_files_info::len unsigned intAlexey Dobriyan1-1/+1
Linux doesn't support 4GB+ filenames in /proc, so unsigned long is too much. MOV r64, r/m64 is larger than MOV r32, r/m32. Link: http://lkml.kernel.org/r/20161029161123.GG1246@avx2 Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12proc: make struct pid_entry::len unsignedAlexey Dobriyan1-1/+1
"unsigned int" is better on x86_64 because it most of the time it autoexpands to 64-bit value while "int" requires MOVSX instruction. Link: http://lkml.kernel.org/r/20161029160810.GF1246@avx2 Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12proc: report no_new_privs stateKees Cook2-2/+5
Similar to being able to examine if a process has been correctly confined with seccomp, the state of no_new_privs is equally interesting, so this adds it to /proc/$pid/status. Link: http://lkml.kernel.org/r/20161103214041.GA58566@beast Signed-off-by: Kees Cook <[email protected]> Reviewed-by: Jann Horn <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Konstantin Khlebnikov <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: Rodrigo Freire <[email protected]> Cc: John Stultz <[email protected]> Cc: Ross Zwisler <[email protected]> Cc: Robert Ho <[email protected]> Cc: Jerome Marchand <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: "Richard W.M. Jones" <[email protected]> Cc: Joe Perches <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12mm/percpu.c: fix panic triggered by BUG_ON() falselyzijun_hu1-4/+12
As shown by pcpu_build_alloc_info(), the number of units within a percpu group is deduced by rounding up the number of CPUs within the group to @upa boundary/ Therefore, the number of CPUs isn't equal to the units's if it isn't aligned to @upa normally. However, pcpu_page_first_chunk() uses BUG_ON() to assert that one number is equal to the other roughly, so a panic is maybe triggered by the BUG_ON() incorrectly. In order to fix this issue, the number of CPUs is rounded up then compared with units's and the BUG_ON() is replaced with a warning and return of an error code as well, to keep system alive as much as possible. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: zijun_hu <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Christoph Lameter <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12kasan: turn on -fsanitize-address-use-after-scopeAndrey Ryabinin1-0/+2
In the upcoming gcc7 release, the -fsanitize=kernel-address option at first implied new -fsanitize-address-use-after-scope option. This would cause link errors on older kernels because they don't have two new functions required for use-after-scope support. Therefore, gcc7 changed default to -fno-sanitize-address-use-after-scope. Now the kernel has everything required for that feature since commit 828347f8f9a5 ("kasan: support use-after-scope detection"). So, to make it work, we just have to enable use-after-scope in CFLAGS. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Andrey Ryabinin <[email protected]> Acked-by: Dmitry Vyukov <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Andrey Konovalov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12kasan: eliminate long stalls during quarantine reductionDmitry Vyukov1-46/+48
Currently we dedicate 1/32 of RAM for quarantine and then reduce it by 1/4 of total quarantine size. This can be a significant amount of memory. For example, with 4GB of RAM total quarantine size is 128MB and it is reduced by 32MB at a time. With 128GB of RAM total quarantine size is 4GB and it is reduced by 1GB. This leads to several problems: - freeing 1GB can take tens of seconds, causes rcu stall warnings and just introduces unexpected long delays at random places - if kmalloc() is called under a mutex, other threads stall on that mutex while a thread reduces quarantine - threads wait on quarantine_lock while one thread grabs a large batch of objects to evict - we walk the uncached list of object to free twice which makes all of the above worse - when a thread frees objects, they are already not accounted against global_quarantine.bytes; as the result we can have quarantine_size bytes in quarantine + unbounded amount of memory in large batches in threads that are in process of freeing Reduce size of quarantine in smaller batches to reduce the delays. The only reason to reduce it in batches is amortization of overheads, the new batch size of 1MB should be well enough to amortize spinlock lock/unlock and few function calls. Plus organize quarantine as a FIFO array of batches. This allows to not walk the list in quarantine_reduce() under quarantine_lock, which in turn reduces contention and is just faster. This improves performance of heavy load (syzkaller fuzzing) by ~20% with 4 CPUs and 32GB of RAM. Also this eliminates frequent (every 5 sec) drops of CPU consumption from ~400% to ~100% (one thread reduces quarantine while others are waiting on a mutex). Some reference numbers: 1. Machine with 4 CPUs and 4GB of memory. Quarantine size 128MB. Currently we free 32MB at at time. With new code we free 1MB at a time (1024 batches, ~128 are used). 2. Machine with 32 CPUs and 128GB of memory. Quarantine size 4GB. Currently we free 1GB at at time. With new code we free 8MB at a time (1024 batches, ~512 are used). 3. Machine with 4096 CPUs and 1TB of memory. Quarantine size 32GB. Currently we free 8GB at at time. With new code we free 4MB at a time (16K batches, ~8K are used). Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Dmitry Vyukov <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Greg Thelen <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Andrey Konovalov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12kasan: support panic_on_warnDmitry Vyukov1-0/+2
If user sets panic_on_warn, he wants kernel to panic if there is anything barely wrong with the kernel. KASAN-detected errors are definitely not less benign than an arbitrary kernel WARNING. Panic after KASAN errors if panic_on_warn is set. We use this for continuous fuzzing where we want kernel to stop and reboot on any error. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Dmitry Vyukov <[email protected]> Acked-by: Andrey Ryabinin <[email protected]> Cc: Alexander Potapenko <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12mm: make transparent hugepage size publicHugh Dickins2-0/+15
Test programs want to know the size of a transparent hugepage. While it is commonly the same as the size of a hugetlbfs page (shown as Hugepagesize in /proc/meminfo), that is not always so: powerpc implements transparent hugepages in a different way from hugetlbfs pages, so it's coincidence when their sizes are the same; and x86 and others can support more than one hugetlbfs page size. Add /sys/kernel/mm/transparent_hugepage/hpage_pmd_size to show the THP size in bytes - it's the same for Anonymous and Shmem hugepages. Call it hpage_pmd_size (after HPAGE_PMD_SIZE) rather than hpage_size, in case some transparent support for pud and pgd pages is added later. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Hugh Dickins <[email protected]> Acked-by: Michal Hocko <[email protected]> Acked-by: Vlastimil Babka <[email protected]> Cc: Greg Thelen <[email protected]> Cc: David Rientjes <[email protected]> Cc: "Kirill A. Shutemov" <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: "Aneesh Kumar K.V" <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Dan Williams <[email protected]> Cc: Jan Kara <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12mm: add cond_resched() in gather_pte_stats()Hugh Dickins1-0/+1
The other pagetable walks in task_mmu.c have a cond_resched() after walking their ptes: add a cond_resched() in gather_pte_stats() too, for reading /proc/<id>/numa_maps. Only pagemap_pmd_range() has a cond_resched() in its (unusually expensive) pmd_trans_huge case: more should probably be added, but leave them unchanged for now. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Hugh Dickins <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: David Rientjes <[email protected]> Cc: Gerald Schaefer <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12mm: add three more cond_resched() in swapoffHugh Dickins1-7/+6
Add a cond_resched() in the unuse_pmd_range() loop (so as to call it even when pmd none or trans_huge, like zap_pmd_range() does); and in the unuse_mm() loop (since that might skip over many vmas). shmem_unuse() and radix_tree_locate_item() look good enough already. Those were the obvious places, but in fact the stalls came from find_next_to_unuse(), which sometimes scans through many unused entries. Apply scan_swap_map()'s LATENCY_LIMIT of 256 there too; and only go off to test frontswap_map when a used entry is found. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Hugh Dickins <[email protected]> Reported-by: Eric Dumazet <[email protected]> Cc: David Rientjes <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12mm, page_alloc: keep pcp count and list contents in sync if struct page is ↵Mel Gorman1-2/+10
corrupted Vlastimil Babka pointed out that commit 479f854a207c ("mm, page_alloc: defer debugging checks of pages allocated from the PCP") will allow the per-cpu list counter to be out of sync with the per-cpu list contents if a struct page is corrupted. The consequence is an infinite loop if the per-cpu lists get fully drained by free_pcppages_bulk because all the lists are empty but the count is positive. The infinite loop occurs here do { batch_free++; if (++migratetype == MIGRATE_PCPTYPES) migratetype = 0; list = &pcp->lists[migratetype]; } while (list_empty(list)); What the user sees is a bad page warning followed by a soft lockup with interrupts disabled in free_pcppages_bulk(). This patch keeps the accounting in sync. Fixes: 479f854a207c ("mm, page_alloc: defer debugging checks of pages allocated from the PCP") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Mel Gorman <[email protected]> Acked-by: Vlastimil Babka <[email protected]> Acked-by: Michal Hocko <[email protected]> Acked-by: Hillf Danton <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Jesper Dangaard Brouer <[email protected]> Cc: Joonsoo Kim <[email protected]> Cc: <[email protected]> [4.7+] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12mm, rmap: handle anon_vma_prepare() common case inlineVlastimil Babka2-36/+43
anon_vma_prepare() is mostly a large "if (unlikely(...))" block, as the expected common case is that an anon_vma already exists. We could turn the condition around and return 0, but it also makes sense to do it inline and avoid a call for the common case. Bloat-o-meter naturally shows that inlining the check has some code size costs: add/remove: 1/1 grow/shrink: 4/0 up/down: 475/-373 (102) function old new delta __anon_vma_prepare - 359 +359 handle_mm_fault 2744 2796 +52 hugetlb_cow 1146 1170 +24 hugetlb_fault 2123 2145 +22 wp_page_copy 1469 1487 +18 anon_vma_prepare 373 - -373 Checking the asm however confirms that the hot paths now avoid a call, which is moved away. [[email protected]: coding-style fixes] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Vlastimil Babka <[email protected]> Cc: "Kirill A. Shutemov" <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Konstantin Khlebnikov <[email protected]> Cc: Rik van Riel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12mm, debug: print raw struct page data in __dump_page()Vlastimil Babka1-0/+4
__dump_page() is used when a page metadata inconsistency is detected, either by standard runtime checks, or extra checks in CONFIG_DEBUG_VM builds. It prints some of the relevant metadata, but not the whole struct page, which is based on unions and interpretation is dependent on the context. This means that sometimes e.g. a VM_BUG_ON_PAGE() checks certain field, which is however not printed by __dump_page() and the resulting bug report may then lack clues that could help in determining the root cause. This patch solves the problem by simply printing the whole struct page word by word, so no part is missing, but the interpretation of the data is left to developers. This is similar to e.g. x86_64 raw stack dumps. Example output: page:ffffea00000475c0 count:1 mapcount:0 mapping: (null) index:0x0 flags: 0x100000000000400(reserved) raw: 0100000000000400 0000000000000000 0000000000000000 00000001ffffffff raw: ffffea00000475e0 ffffea00000475e0 0000000000000000 0000000000000000 page dumped because: VM_BUG_ON_PAGE(1) [[email protected]: suggested print_hex_dump()] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Vlastimil Babka <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Acked-by: Andrey Ryabinin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12mm: THP page cache support for ppc64Aneesh Kumar K.V6-17/+100
Add arch specific callback in the generic THP page cache code that will deposit and withdarw preallocated page table. Archs like ppc64 use this preallocated table to store the hash pte slot information. Testing: kernel build of the patch series on tmpfs mounted with option huge=always The related thp stat: thp_fault_alloc 72939 thp_fault_fallback 60547 thp_collapse_alloc 603 thp_collapse_alloc_failed 0 thp_file_alloc 253763 thp_file_mapped 4251 thp_split_page 51518 thp_split_page_failed 1 thp_deferred_split_page 73566 thp_split_pmd 665 thp_zero_page_alloc 3 thp_zero_page_alloc_failed 0 [[email protected]: remove unneeded parentheses, per Kirill] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Aneesh Kumar K.V <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Michael Neuling <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Balbir Singh <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12mm: move vma_is_anonymous check within pmd_move_must_withdrawAneesh Kumar K.V3-15/+18
Independent of whether the vma is for anonymous memory, some arches like ppc64 would like to override pmd_move_must_withdraw(). One option is to encapsulate the vma_is_anonymous() check for general architectures inside pmd_move_must_withdraw() so that is always called and architectures that need unconditional overriding can override this function. ppc64 needs to override the function when the MMU is configured to use hash PTE's. [[email protected]: reworked changelog] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Aneesh Kumar K.V <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Acked-by: Michael Ellerman <[email protected]> (powerpc) Cc: Benjamin Herrenschmidt <[email protected]> Cc: Michael Neuling <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Balbir Singh <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12mm: add preempt points into __purge_vmap_area_lazy()Joel Fernandes1-5/+9
Use cond_resched_lock to avoid holding the vmap_area_lock for a potentially long time and thus creating bad latencies for various workloads. [hch: split from a larger patch by Joel, wrote the crappy changelog] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Joel Fernandes <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Tested-by: Jisheng Zhang <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Chris Wilson <[email protected]> Cc: John Dias <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-12-12mm: turn vmap_purge_lock into a mutexChristoph Hellwig1-7/+7
The purge_lock spinlock causes high latencies with non RT kernel. This has been reported multiple times on lkml [1] [2] and affects applications like audio. This patch replaces it with a mutex to allow preemption while holding the lock. Thanks to Joel Fernandes for the detailed report and analysis as well as an earlier attempt at fixing this issue. [1] http://lists.openwall.net/linux-kernel/2016/03/23/29 [2] https://lkml.org/lkml/2016/10/9/59 Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Christoph Hellwig <[email protected]> Tested-by: Jisheng Zhang <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: Chris Wilson <[email protected]> Cc: John Dias <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>