aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2017-07-12scripts/gdb: lx-dmesg: cast log_buf to void* for addr fetchLeonard Crestez1-1/+1
In some cases it is possible for the str() conversion here to throw encoding errors because log_buf might not point to valid ascii. For example: (gdb) python print str(gdb.parse_and_eval("log_buf")) Traceback (most recent call last): File "<string>", line 1, in <module> UnicodeEncodeError: 'ascii' codec can't encode character u'\u0303' in position 24: ordinal not in range(128) Avoid this by explicitly casting to (void *) inside the gdb expression. Link: http://lkml.kernel.org/r/ba6f85dbb02ca980ebd0e2399b0649423399b565.1498481469.git.leonard.crestez@nxp.com Signed-off-by: Leonard Crestez <[email protected]> Reviewed-by: Jan Kiszka <[email protected]> Cc: Jason Wessel <[email protected]> Cc: Kieran Bingham <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12scripts/gdb: add lx-fdtdump commandPeter Griffin2-0/+80
lx-fdtdump dumps the flattened device tree passed to the kernel from the bootloader to the filename specified as the command argument. If no argument is provided it defaults to fdtdump.dtb. This then allows further post processing on the machine running GDB. The fdt header is also also printed in the GDB console. For example: (gdb) lx-fdtdump fdt_magic: 0xD00DFEED fdt_totalsize: 0xC108 off_dt_struct: 0x38 off_dt_strings: 0x3804 off_mem_rsvmap: 0x28 version: 17 last_comp_version: 16 Dumped fdt to fdtdump.dtb >fdtdump fdtdump.dtb | less This command is useful as the bootloader can often re-write parts of the device tree, and this can sometimes cause the kernel to not boot. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Peter Griffin <[email protected]> Signed-off-by: Kieran Bingham <[email protected]> Cc: Jason Wessel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12fs/Kconfig: kill CONFIG_PERCPU_RWSEM some moreDavidlohr Bueso1-1/+0
As of commit bf3eac84c42d ("percpu-rwsem: kill CONFIG_PERCPU_RWSEM") we unconditionally build pcpu-rwsems. Remove a leftover in for FILE_LOCKING. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Davidlohr Bueso <[email protected]> Cc: Oleg Nesterov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12bfs: fix sanity checks for empty filesRakesh Pandit1-1/+1
Mount fails if file system image has empty files because of sanity check while reading superblock. For empty files disk offset to end of file (i_eoffset) is cpu_to_le32(-1). Sanity check comparison, which compares disk offset with file system size isn't valid for this value and hence is ignored with this patch. Steps to reproduce: $ dd if=/dev/zero of=bfs-image count=204800 $ mkfs.bfs bfs-image $ mkdir bfs-mount-point $ sudo mount -t bfs -o loop bfs-image bfs-mount-point/ $ cd bfs-mount-point/ $ sudo touch a $ cd .. $ sudo umount bfs-mount-point/ $ sudo mount -t bfs -o loop bfs-image bfs-mount-point/ mount: /dev/loop0: can't read superblock $ dmesg [25526.689580] BFS-fs: bfs_fill_super(): Inode 0x00000003 corrupted Tigran said: "If you had created the filesystem with the proper mkfs under SCO UnixWare 7 you (probably) wouldn't encounter this issue. But since commercial Unix-es are now part of history and the only proper way is the Linux mkfs.bfs utility, your patch is fine" Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Rakesh Pandit <[email protected]> Acked-by: Tigran Aivazian <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12random: do not ignore early device randomnessKees Cook2-0/+6
The add_device_randomness() function would ignore incoming bytes if the crng wasn't ready. This additionally makes sure to make an early enough call to add_latent_entropy() to influence the initial stack canary, which is especially important on non-x86 systems where it stays the same through the life of the boot. Link: http://lkml.kernel.org/r/20170626233038.GA48751@beast Signed-off-by: Kees Cook <[email protected]> Cc: "Theodore Ts'o" <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jessica Yu <[email protected]> Cc: Steven Rostedt (VMware) <[email protected]> Cc: Viresh Kumar <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Prarit Bhargava <[email protected]> Cc: Lokesh Vutla <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: AKASHI Takahiro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12kernel/sysctl_binary.c: check name array length in deprecated_sysctl_warning()Mateusz Jurczyk1-1/+1
Prevent use of uninitialized memory (originating from the stack frame of do_sysctl()) by verifying that the name array is filled with sufficient input data before comparing its specific entries with integer constants. Through timing measurement or analyzing the kernel debug logs, a user-mode program could potentially infer the results of comparisons against the uninitialized memory, and acquire some (very limited) information about the state of the kernel stack. The change also eliminates possible future warnings by tools such as KMSAN and other code checkers / instrumentations. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Mateusz Jurczyk <[email protected]> Acked-by: Kees Cook <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Matthew Whitehead <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Tetsuo Handa <[email protected]> Cc: Alexander Potapenko <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12test_sysctl: test against int proc_dointvec() array supportLuis R. Rodriguez2-0/+102
Add a few initial respective tests for an array: o Echoing values separated by spaces works o Echoing only first elements will set first elements o Confirm PAGE_SIZE limit still applies even if an array is used Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Luis R. Rodriguez <[email protected]> Cc: Kees Cook <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12test_sysctl: add simple proc_douintvec() caseLuis R. Rodriguez2-0/+74
Test against a simple proc_douintvec() case. While at it, add a test against UINT_MAX. Make sure UINT_MAX works, and UINT_MAX+1 will fail and that negative values are not accepted. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Luis R. Rodriguez <[email protected]> Cc: Kees Cook <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12test_sysctl: add simple proc_dointvec() caseLuis R. Rodriguez2-0/+73
Test against a simple proc_dointvec() case. While at it, add a test against INT_MAX. Make sure INT_MAX works, and INT_MAX+1 will fail. Also test negative values work. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Luis R. Rodriguez <[email protected]> Cc: Kees Cook <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12test_sysctl: test against PAGE_SIZE for intLuis R. Rodriguez1-0/+66
Add the following tests to ensure we do not regress: o Test using a buffer full of space (PAGE_SIZE-1) followed by a single digit works o Test using a buffer full of spaces (PAGE_SIZE or over) will fail As tests increase instead of unloading the module and reloading it we can just do a shell reset_vals() with a reset to values we know are set at init on the driver. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Luis R. Rodriguez <[email protected]> Cc: Kees Cook <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12test_sysctl: add generic script to expand on testsLuis R. Rodriguez5-220/+495
This adds a generic script to let us more easily add more tests cases. Since we really have only two types of tests cases just fold them into the one file. Each test unit is now identified into its separate function: # ./sysctl.sh -l Test ID list: TEST_ID x NUM_TEST TEST_ID: Test ID NUM_TESTS: Number of recommended times to run the test 0001 x 1 - tests proc_dointvec_minmax() 0002 x 1 - tests proc_dostring() For now we start off with what we had before, and run only each test once. We can now watch a test case until it fails: ./sysctl.sh -w 0002 We can also run a test case x number of times, say we want to run a test case 100 times: ./sysctl.sh -c 0001 100 To run a test case only once, for example: ./sysctl.sh -s 0002 The default settings are specified at the top of sysctl.sh. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Luis R. Rodriguez <[email protected]> Cc: Kees Cook <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12test_sysctl: add dedicated proc sysctl test driverLuis R. Rodriguez6-4/+130
The existing tools/testing/selftests/sysctl/ tests include two test cases, but these use existing production kernel sysctl interfaces. We want to expand test coverage but we can't just be looking for random safe production values to poke at, that's just insane! Instead just dedicate a test driver for debugging purposes and port the existing scripts to use it. This will make it easier for further tests to be added. Subsequent patches will extend our test coverage for sysctl. The stress test driver uses a new license (GPL on Linux, copyleft-next outside of Linux). Linus was fine with this [0] and later due to Ted's and Alans's request ironed out an "or" language clause to use [1] which is already present upstream. [0] https://lkml.kernel.org/r/CA+55aFyhxcvD+q7tp+-yrSFDKfR0mOHgyEAe=f_94aKLsOu0Og@mail.gmail.com [1] https://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Luis R. Rodriguez <[email protected]> Acked-by: Kees Cook <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12sysctl: add unsigned int range supportLuis R. Rodriguez3-1/+72
To keep parity with regular int interfaces provide the an unsigned int proc_douintvec_minmax() which allows you to specify a range of allowed valid numbers. Adding proc_douintvec_minmax_sysadmin() is easy but we can wait for an actual user for that. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Luis R. Rodriguez <[email protected]> Acked-by: Kees Cook <[email protected]> Cc: Subash Abhinov Kasiviswanathan <[email protected]> Cc: Heinrich Schuchardt <[email protected]> Cc: Kees Cook <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Al Viro <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12sysctl: simplify unsigned int supportLuis R. Rodriguez2-7/+160
Commit e7d316a02f68 ("sysctl: handle error writing UINT_MAX to u32 fields") added proc_douintvec() to start help adding support for unsigned int, this however was only half the work needed. Two fixes have come in since then for the following issues: o Printing the values shows a negative value, this happens since do_proc_dointvec() and this uses proc_put_long() This was fixed by commit 5380e5644afbba9 ("sysctl: don't print negative flag for proc_douintvec"). o We can easily wrap around the int values: UINT_MAX is 4294967295, if we echo in 4294967295 + 1 we end up with 0, using 4294967295 + 2 we end up with 1. o We echo negative values in and they are accepted This was fixed by commit 425fffd886ba ("sysctl: report EINVAL if value is larger than UINT_MAX for proc_douintvec"). It still also failed to be added to sysctl_check_table()... instead of adding it with the current implementation just provide a proper and simplified unsigned int support without any array unsigned int support with no negative support at all. Historically sysctl proc helpers have supported arrays, due to the complexity this adds though we've taken a step back to evaluate array users to determine if its worth upkeeping for unsigned int. An evaluation using Coccinelle has been done to perform a grammatical search to ask ourselves: o How many sysctl proc_dointvec() (int) users exist which likely should be moved over to proc_douintvec() (unsigned int) ? Answer: about 8 - Of these how many are array users ? Answer: Probably only 1 o How many sysctl array users exist ? Answer: about 12 This last question gives us an idea just how popular arrays: they are not. Array support should probably just be kept for strings. The identified uint ports are: drivers/infiniband/core/ucma.c - max_backlog drivers/infiniband/core/iwcm.c - default_backlog net/core/sysctl_net_core.c - rps_sock_flow_sysctl() net/netfilter/nf_conntrack_timestamp.c - nf_conntrack_timestamp -- bool net/netfilter/nf_conntrack_acct.c nf_conntrack_acct -- bool net/netfilter/nf_conntrack_ecache.c - nf_conntrack_events -- bool net/netfilter/nf_conntrack_helper.c - nf_conntrack_helper -- bool net/phonet/sysctl.c proc_local_port_range() The only possible array users is proc_local_port_range() but it does not seem worth it to add array support just for this given the range support works just as well. Unsigned int support should be desirable more for when you *need* more than INT_MAX or using int min/max support then does not suffice for your ranges. If you forget and by mistake happen to register an unsigned int proc entry with an array, the driver will fail and you will get something as follows: sysctl table check failed: debug/test_sysctl//uint_0002 array now allowed CPU: 2 PID: 1342 Comm: modprobe Tainted: G W E <etc> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS <etc> Call Trace: dump_stack+0x63/0x81 __register_sysctl_table+0x350/0x650 ? kmem_cache_alloc_trace+0x107/0x240 __register_sysctl_paths+0x1b3/0x1e0 ? 0xffffffffc005f000 register_sysctl_table+0x1f/0x30 test_sysctl_init+0x10/0x1000 [test_sysctl] do_one_initcall+0x52/0x1a0 ? kmem_cache_alloc_trace+0x107/0x240 do_init_module+0x5f/0x200 load_module+0x1867/0x1bd0 ? __symbol_put+0x60/0x60 SYSC_finit_module+0xdf/0x110 SyS_finit_module+0xe/0x10 entry_SYSCALL_64_fastpath+0x1e/0xad RIP: 0033:0x7f042b22d119 <etc> Fixes: e7d316a02f68 ("sysctl: handle error writing UINT_MAX to u32 fields") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Luis R. Rodriguez <[email protected]> Suggested-by: Alexey Dobriyan <[email protected]> Cc: Subash Abhinov Kasiviswanathan <[email protected]> Cc: Liping Zhang <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Heinrich Schuchardt <[email protected]> Cc: Kees Cook <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Al Viro <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12sysctl: fold sysctl_writes_strict checks into helperLuis R. Rodriguez1-24/+32
The mode sysctl_writes_strict positional checks keep being copy and pasted as we add new proc handlers. Just add a helper to avoid code duplication. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Luis R. Rodriguez <[email protected]> Suggested-by: Kees Cook <[email protected]> Cc: Al Viro <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12sysctl: kdoc'ify sysctl_writes_strictLuis R. Rodriguez1-4/+25
Document the different sysctl_writes_strict modes in code. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Luis R. Rodriguez <[email protected]> Cc: Al Viro <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Kees Cook <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12sysctl: fix lax sysctl_check_table() sanity checkLuis R. Rodriguez1-5/+5
Patch series "sysctl: few fixes", v5. I've been working on making kmod more deterministic, and as I did that I couldn't help but notice a few issues with sysctl. My end goal was just to fix unsigned int support, which back then was completely broken. Liping Zhang has sent up small atomic fixes, however it still missed yet one more fix and Alexey Dobriyan had also suggested to just drop array support given its complexity. I have inspected array support using Coccinelle and indeed its not that popular, so if in fact we can avoid it for new interfaces, I agree its best. I did develop a sysctl stress driver but will hold that off for another series. This patch (of 5): Commit 7c60c48f58a7 ("sysctl: Improve the sysctl sanity checks") improved sanity checks considerbly, however the enhancements on sysctl_check_table() meant adding a functional change so that only the last table entry's sanity error is propagated. It also changed the way errors were propagated so that each new check reset the err value, this means only last sanity check computed is used for an error. This has been in the kernel since v3.4 days. Fix this by carrying on errors from previous checks and iterations as we traverse the table and ensuring we keep any error from previous checks. We keep iterating on the table even if an error is found so we can complain for all errors found in one shot. This works as -EINVAL is always returned on error anyway, and the check for error is any non-zero value. Fixes: 7c60c48f58a7 ("sysctl: Improve the sysctl sanity checks") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Luis R. Rodriguez <[email protected]> Cc: Al Viro <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Kees Cook <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12kexec/kdump: minor Documentation updates for arm64 and ImageBharat Bhushan1-3/+9
Minor updates in Documentation for arm64 as relocatable kernel. Also this patch updates documentation for using uncompressed image "Image" which is used for ARM64. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Bharat Bhushan <[email protected]> Cc: Dave Young <[email protected]> Cc: Baoquan He <[email protected]> Cc: Vivek Goyal <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: AKASHI Takahiro <[email protected]> Cc: Pratyush Anand <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12kdump: protect vmcoreinfo data under the crash memoryXunlei Pang6-2/+74
Currently vmcoreinfo data is updated at boot time subsys_initcall(), it has the risk of being modified by some wrong code during system is running. As a result, vmcore dumped may contain the wrong vmcoreinfo. Later on, when using "crash", "makedumpfile", etc utility to parse this vmcore, we probably will get "Segmentation fault" or other unexpected errors. E.g. 1) wrong code overwrites vmcoreinfo_data; 2) further crashes the system; 3) trigger kdump, then we obviously will fail to recognize the crash context correctly due to the corrupted vmcoreinfo. Now except for vmcoreinfo, all the crash data is well protected(including the cpu note which is fully updated in the crash path, thus its correctness is guaranteed). Given that vmcoreinfo data is a large chunk prepared for kdump, we better protect it as well. To solve this, we relocate and copy vmcoreinfo_data to the crash memory when kdump is loading via kexec syscalls. Because the whole crash memory will be protected by existing arch_kexec_protect_crashkres() mechanism, we naturally protect vmcoreinfo_data from write(even read) access under kernel direct mapping after kdump is loaded. Since kdump is usually loaded at the very early stage after boot, we can trust the correctness of the vmcoreinfo data copied. On the other hand, we still need to operate the vmcoreinfo safe copy when crash happens to generate vmcoreinfo_note again, we rely on vmap() to map out a new kernel virtual address and update to use this new one instead in the following crash_save_vmcoreinfo(). BTW, we do not touch vmcoreinfo_note, because it will be fully updated using the protected vmcoreinfo_data after crash which is surely correct just like the cpu crash note. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Xunlei Pang <[email protected]> Tested-by: Michael Holzheu <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Dave Young <[email protected]> Cc: Eric Biederman <[email protected]> Cc: Hari Bathini <[email protected]> Cc: Juergen Gross <[email protected]> Cc: Mahesh Salgaonkar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12powerpc/fadump: use the correct VMCOREINFO_NOTE_SIZE for phdrXunlei Pang3-5/+2
vmcoreinfo_max_size stands for the vmcoreinfo_data, the correct one we should use is vmcoreinfo_note whose total size is VMCOREINFO_NOTE_SIZE. Like explained in commit 77019967f06b ("kdump: fix exported size of vmcoreinfo note"), it should not affect the actual function, but we better fix it, also this change should be safe and backward compatible. After this, we can get rid of variable vmcoreinfo_max_size, let's use the corresponding macros directly, fewer variables means more safety for vmcoreinfo operation. [[email protected]: fix build warning] Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Xunlei Pang <[email protected]> Reviewed-by: Mahesh Salgaonkar <[email protected]> Reviewed-by: Dave Young <[email protected]> Cc: Hari Bathini <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Eric Biederman <[email protected]> Cc: Juergen Gross <[email protected]> Cc: Michael Holzheu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12kexec: move vmcoreinfo out of the kernel's .bss sectionXunlei Pang8-21/+29
As Eric said, "what we need to do is move the variable vmcoreinfo_note out of the kernel's .bss section. And modify the code to regenerate and keep this information in something like the control page. Definitely something like this needs a page all to itself, and ideally far away from any other kernel data structures. I clearly was not watching closely the data someone decided to keep this silly thing in the kernel's .bss section." This patch allocates extra pages for these vmcoreinfo_XXX variables, one advantage is that it enhances some safety of vmcoreinfo, because vmcoreinfo now is kept far away from other kernel data structures. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Xunlei Pang <[email protected]> Tested-by: Michael Holzheu <[email protected]> Reviewed-by: Juergen Gross <[email protected]> Suggested-by: Eric Biederman <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Dave Young <[email protected]> Cc: Hari Bathini <[email protected]> Cc: Mahesh Salgaonkar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12kernel/fork.c: virtually mapped stacks: do not disable interruptsChristoph Lameter1-11/+5
The reason to disable interrupts seems to be to avoid switching to a different processor while handling per cpu data using individual loads and stores. If we use per cpu RMV primitives we will not have to disable interrupts. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Christoph Lameter <[email protected]> Cc: Andy Lutomirski <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12mm/memory.c: mark create_huge_pmd() inline to prevent build failureGeert Uytterhoeven1-1/+1
With gcc 4.1.2: mm/memory.o: In function `create_huge_pmd': memory.c:(.text+0x93e): undefined reference to `do_huge_pmd_anonymous_page' Interestingly, create_huge_pmd() is emitted in the assembler output, but never called. Converting transparent_hugepage_enabled() from a macro to a static inline function reduced the ability of the compiler to remove unused code. Fix this by marking create_huge_pmd() inline. Fixes: 16981d763501c0e0 ("mm: improve readability of transparent_hugepage_enabled()") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Geert Uytterhoeven <[email protected]> Acked-by: Arnd Bergmann <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12kernel.h: handle pointers to arrays better in container_of()Ian Abbott1-3/+7
If the first parameter of container_of() is a pointer to a non-const-qualified array type (and the third parameter names a non-const-qualified array member), the local variable __mptr will be defined with a const-qualified array type. In ISO C, these types are incompatible. They work as expected in GNU C, but some versions will issue warnings. For example, GCC 4.9 produces the warning "initialization from incompatible pointer type". Here is an example of where the problem occurs: ------------------------------------------------------- #include <linux/kernel.h> #include <linux/module.h> MODULE_LICENSE("GPL"); struct st { int a; char b[16]; }; static int __init example_init(void) { struct st t = { .a = 101, .b = "hello" }; char (*p)[16] = &t.b; struct st *x = container_of(p, struct st, b); printk(KERN_DEBUG "%p %p\n", (void *)&t, (void *)x); return 0; } static void __exit example_exit(void) { } module_init(example_init); module_exit(example_exit); ------------------------------------------------------- Building the module with gcc-4.9 results in these warnings (where '{m}' is the module source and '{k}' is the kernel source): ------------------------------------------------------- In file included from {m}/example.c:1:0: {m}/example.c: In function `example_init': {k}/include/linux/kernel.h:854:48: warning: initialization from incompatible pointer type const typeof( ((type *)0)->member ) *__mptr = (ptr); \ ^ {m}/example.c:14:17: note: in expansion of macro `container_of' struct st *x = container_of(p, struct st, b); ^ {k}/include/linux/kernel.h:854:48: warning: (near initialization for `x') const typeof( ((type *)0)->member ) *__mptr = (ptr); \ ^ {m}/example.c:14:17: note: in expansion of macro `container_of' struct st *x = container_of(p, struct st, b); ^ ------------------------------------------------------- Replace the type checking performed by the macro to avoid these warnings. Make sure `*(ptr)` either has type compatible with the member, or has type compatible with `void`, ignoring qualifiers. Raise compiler errors if this is not true. This is stronger than the previous behaviour, which only resulted in compiler warnings for a type mismatch. [[email protected]: fix new warnings for container_of()] Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ian Abbott <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]> Acked-by: Michal Nazarewicz <[email protected]> Acked-by: Kees Cook <[email protected]> Cc: Hidehiro Kawai <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Rasmus Villemoes <[email protected]> Cc: Johannes Berg <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Alexander Potapenko <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12include/linux/dcache.h: use unsigned chars in struct name_snapshotStephen Rothwell1-2/+2
"kernel.h: handle pointers to arrays better in container_of()" triggers: In file included from include/uapi/linux/stddef.h:1:0, from include/linux/stddef.h:4, from include/uapi/linux/posix_types.h:4, from include/uapi/linux/types.h:13, from include/linux/types.h:5, from include/linux/syscalls.h:71, from fs/dcache.c:17: fs/dcache.c: In function 'release_dentry_name_snapshot': include/linux/compiler.h:542:38: error: call to '__compiletime_assert_305' declared with attribute error: pointer type mismatch in container_of() _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__) ^ include/linux/compiler.h:525:4: note: in definition of macro '__compiletime_assert' prefix ## suffix(); \ ^ include/linux/compiler.h:542:2: note: in expansion of macro '_compiletime_assert' _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__) ^ include/linux/build_bug.h:46:37: note: in expansion of macro 'compiletime_assert' #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg) ^ include/linux/kernel.h:860:2: note: in expansion of macro 'BUILD_BUG_ON_MSG' BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) && \ ^ fs/dcache.c:305:7: note: in expansion of macro 'container_of' p = container_of(name->name, struct external_name, name[0]); Switch name_snapshot to use unsigned chars, matching struct qstr and struct external_name. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Stephen Rothwell <[email protected]> Cc: Al Viro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12kokr/memory-barriers.txt: Fix obsolete link to atomic_ops.txtSeongJae Park1-7/+7
Obsolete links to atomic_ops.txt exist in ko_KR/memory-barriers.txt though the file has moved to core-api/atomic_ops.rst. This commit fixes the obsolete links. Signed-off-by: SeongJae Park <[email protected]> Signed-off-by: Jonathan Corbet <[email protected]>
2017-07-12memory-barriers.txt: Fix broken link to atomic_ops.txtSeongJae Park1-3/+3
Few obsolete links to atomic_ops.txt exist in memory-barriers.txt though the file has moved to core-api/atomic_ops.rst. This commit fixes the obsolete links. Signed-off-by: SeongJae Park <[email protected]> Signed-off-by: Jonathan Corbet <[email protected]>
2017-07-12docs: Turn off section numbering for the input docsJonathan Corbet1-1/+0
The input docs enable section numbering at multiple levels, leading to a lot of bright-red "nested numbered toctree" warnings in newer Sphinx versions. Just take that directive out for now to help alleviate the global red-pixel shortage. Signed-off-by: Jonathan Corbet <[email protected]>
2017-07-12docs: Include uaccess docs from the right fileJonathan Corbet1-1/+1
Documentation/core-api/kernel-api.rst was including kerneldoc comments from arch/x86/include/asm/uaccess_32.h, but the relevant comments moved to .../uaccess.h some time ago. Correct the include to pick up the comments and eliminate a warning. Signed-off-by: Jonathan Corbet <[email protected]>
2017-07-12net: stmmac: revert "support future possible different internal phy mode"LABBE Corentin1-7/+3
Since internal phy-mode is reserved for non-xMII protocol we cannot use it with dwmac-sun8i. Furthermore, all DT patchs which comes with this patch were cleaned, so the current state is broken. This reverts commit 1c2fa5f84683 ("net: stmmac: support future possible different internal phy mode") Fixes: 1c2fa5f84683 ("net: stmmac: support future possible different internal phy mode") Signed-off-by: Corentin Labbe <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-07-12sfc: don't read beyond unicast address listBert Kenward1-5/+3
If we have more than 32 unicast MAC addresses assigned to an interface we will read beyond the end of the address table in the driver when adding filters. The next 256 entries store multicast addresses, so we will end up attempting to insert duplicate filters, which is mostly harmless. If we add more than 288 unicast addresses we will then read past the multicast address table, which is likely to be more exciting. Fixes: 12fb0da45c9a ("sfc: clean fallbacks between promisc/normal in efx_ef10_filter_sync_rx_mode") Signed-off-by: Bert Kenward <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-07-12Merge branch 'net-doc-fixes'David S. Miller2-3/+6
Stephen Hemminger says: ==================== minor net kernel-doc fixes Fix a couple of small errors in kernel-doc for networking ==================== Signed-off-by: David S. Miller <[email protected]>
2017-07-12datagram: fix kernel-doc commentsstephen hemminger1-3/+3
An underscore in the kernel-doc comment section has special meaning and mis-use generates an errors. ./net/core/datagram.c:207: ERROR: Unknown target name: "msg". ./net/core/datagram.c:379: ERROR: Unknown target name: "msg". ./net/core/datagram.c:816: ERROR: Unknown target name: "t". Signed-off-by: Stephen Hemminger <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-07-12socket: add documentation for missing elementsstephen hemminger1-0/+3
Fill in missing kernel-doc for missing elements in struct sock. Signed-off-by: Stephen Hemminger <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-07-12smsc911x: Add check for ioremap_nocache() return codeAlexey Khoroshilov1-0/+5
There is no check for return code of smsc911x_drv_probe() in smsc911x_drv_probe(). The patch adds one. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Alexey Khoroshilov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-07-12Merge branch 'for-rc' of ↵Rafael J. Wysocki3-6/+7
https://git.kernel.org/pub/scm/linux/kernel/git/mzx/devfreq Pull devfreq changes for v4.13 from MyungJoo Ham. * 'for-rc' of https://git.kernel.org/pub/scm/linux/kernel/git/mzx/devfreq: PM / devfreq: constify attribute_group structures. PM / devfreq: tegra: fix error return code in tegra_devfreq_probe() PM / devfreq: rk3399_dmc: fix error return code in rk3399_dmcfreq_probe()
2017-07-12Merge branch 'next' into for-linusDmitry Torokhov3-44/+204
Prepare second round of input updates for 4.13 merge window.
2017-07-12rtc: Remove wrong deprecation commentAlexandre Belloni1-6/+0
rtc_time_to_tm and rtc_tm_to_time are not deprecated and make perfect sense for RTCs that are simple 32bit counters. Signed-off-by: Alexandre Belloni <[email protected]>
2017-07-12PCI / PM: Restore PME Enable after config space restorationRafael J. Wysocki3-8/+11
Commit dc15e71eefc7 (PCI / PM: Restore PME Enable if skipping wakeup setup) introduced a mechanism by which the PME Enable bit can be restored by pci_enable_wake() if dev->wakeup_prepared is set in case it has been overwritten by PCI config space restoration. However, that commit overlooked the fact that on some systems (Dell XPS13 9360 in particular) the AML handling wakeup events checks PME Status and PME Enable and it won't trigger a Notify() for devices where those bits are not set while it is running. That happens during resume from suspend-to-idle when pci_restore_state() invoked by pci_pm_default_resume_early() clears PME Enable before the wakeup events are processed by AML, effectively causing those wakeup events to be ignored. Fix this issue by restoring the PME Enable configuration right after pci_restore_state() has been called instead of doing that in pci_enable_wake(). Fixes: dc15e71eefc7 (PCI / PM: Restore PME Enable if skipping wakeup setup) Signed-off-by: Rafael J. Wysocki <[email protected]> Acked-by: Bjorn Helgaas <[email protected]>
2017-07-12platform/x86: silead_dmi: Add entry for Ployer Momo7w tablet touchscreenHans de Goede1-0/+10
This Ployer Momo7w revision has the same hardware as the Trekstor ST70416-6, so we re-use the surftab_wintron70_st70416_6_data. Signed-off-by: Hans de Goede <[email protected]> Signed-off-by: Darren Hart (VMware) <[email protected]>
2017-07-12KVM: trigger uevents when creating or destroying a VMClaudio Imbrenda1-0/+69
This patch adds a few lines to the KVM common code to fire a KOBJ_CHANGE uevent whenever a KVM VM is created or destroyed. The event carries five environment variables: CREATED indicates how many times a new VM has been created. It is useful for example to trigger specific actions when the first VM is started COUNT indicates how many VMs are currently active. This can be used for logging or monitoring purposes PID has the pid of the KVM process that has been started or stopped. This can be used to perform process-specific tuning. STATS_PATH contains the path in debugfs to the directory with all the runtime statistics for this VM. This is useful for performance monitoring and profiling. EVENT described the type of event, its value can be either "create" or "destroy" Specific udev rules can be then set up in userspace to deal with the creation or destruction of VMs as needed. Signed-off-by: Claudio Imbrenda <[email protected]> Signed-off-by: Radim Krčmář <[email protected]>
2017-07-12KVM: SVM: Enable Virtual VMLOAD VMSAVE featureJanakarajan Natarajan2-0/+25
Enable the Virtual VMLOAD VMSAVE feature. This is done by setting bit 1 at position B8h in the vmcb. The processor must have nested paging enabled, be in 64-bit mode and have support for the Virtual VMLOAD VMSAVE feature for the bit to be set in the vmcb. Signed-off-by: Janakarajan Natarajan <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]> Signed-off-by: Radim Krčmář <[email protected]>
2017-07-12KVM: SVM: Add Virtual VMLOAD VMSAVE feature definitionJanakarajan Natarajan1-0/+1
Define a new cpufeature definition for Virtual VMLOAD VMSAVE. Signed-off-by: Janakarajan Natarajan <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]> Signed-off-by: Radim Krčmář <[email protected]>
2017-07-12KVM: SVM: Rename lbr_ctl field in the vmcb control areaJanakarajan Natarajan2-6/+6
Rename the lbr_ctl variable to better reflect the purpose of the field - provide support for virtualization extensions. Signed-off-by: Janakarajan Natarajan <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]> Signed-off-by: Radim Krčmář <[email protected]>
2017-07-12KVM: SVM: Prepare for new bit definition in lbr_ctlJanakarajan Natarajan2-2/+4
The lbr_ctl variable in the vmcb control area is used to enable or disable Last Branch Record (LBR) virtualization. However, this is to be done using only bit 0 of the variable. To correct this and to prepare for a new feature, change the current usage to work only on a particular bit. Signed-off-by: Janakarajan Natarajan <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]> Signed-off-by: Radim Krčmář <[email protected]>
2017-07-12KVM: SVM: handle singlestep exception when skipping emulated instructionsLadi Prosek1-26/+33
kvm_skip_emulated_instruction handles the singlestep debug exception which is something we almost always want. This commit (specifically the change in rdmsr_interception) makes the debug.flat KVM unit test pass on AMD. Two call sites still call skip_emulated_instruction directly: * In svm_queue_exception where it's used only for moving the rip forward * In task_switch_interception which is analogous to handle_task_switch in VMX Signed-off-by: Ladi Prosek <[email protected]> Signed-off-by: Radim Krčmář <[email protected]>
2017-07-12KVM: x86: take slots_lock in kvm_free_pitRadim Krčmář1-0/+2
kvm_vm_release() did not have slots_lock when calling kvm_io_bus_unregister_dev() and this went unnoticed until 4a12f9517728 ("KVM: mark kvm->busses as rcu protected") added dynamic checks. Luckily, there should be no race at that point: ============================= WARNING: suspicious RCU usage 4.12.0.kvm+ #0 Not tainted ----------------------------- ./include/linux/kvm_host.h:479 suspicious rcu_dereference_check() usage! lockdep_rcu_suspicious+0xc5/0x100 kvm_io_bus_unregister_dev+0x173/0x190 [kvm] kvm_free_pit+0x28/0x80 [kvm] kvm_arch_sync_events+0x2d/0x30 [kvm] kvm_put_kvm+0xa7/0x2a0 [kvm] kvm_vm_release+0x21/0x30 [kvm] Reviewed-by: David Hildenbrand <[email protected]> Signed-off-by: Radim Krčmář <[email protected]>
2017-07-12KVM: s390: Fix KVM_S390_GET_CMMA_BITS ioctl definitionGleb Fotengauer-Malinovskiy1-1/+1
In case of KVM_S390_GET_CMMA_BITS, the kernel does not only read struct kvm_s390_cmma_log passed from userspace (which constitutes _IOC_WRITE), it also writes back a return value (which constitutes _IOC_READ) making this an _IOWR ioctl instead of _IOW. Fixes: 4036e387 ("KVM: s390: ioctls to get and set guest storage attributes") Signed-off-by: Gleb Fotengauer-Malinovskiy <[email protected]> Acked-by: Christian Borntraeger <[email protected]> Signed-off-by: Radim Krčmář <[email protected]>
2017-07-12kvm: vmx: Properly handle machine check during VM-entryJim Mattson1-6/+9
vmx_complete_atomic_exit should call kvm_machine_check for any VM-entry failure due to a machine-check event. Such an exit should be recognized solely by its basic exit reason (i.e. the low 16 bits of the VMCS exit reason field). None of the other VMCS exit information fields contain valid information when the VM-exit is due to "VM-entry failure due to machine-check event". Signed-off-by: Jim Mattson <[email protected]> Reviewed-by: Xiao Guangrong <[email protected]> [Changed VM_EXIT_INTR_INFO condition to better describe its reason.] Signed-off-by: Radim Krčmář <[email protected]>
2017-07-12KVM: x86: update master clock before computing kvmclock_offsetRadim Krčmář1-1/+7
kvm master clock usually has a different frequency than the kernel boot clock. This is not a problem until the master clock is updated; update uses the current kernel boot clock to compute new kvm clock, which erases any kvm clock cycles that might have built up due to frequency difference over a long period. KVM_SET_CLOCK is one of places where we can safely update master clock as the guest-visible clock is going to be shifted anyway. The problem with current code is that it updates the kvm master clock after updating the offset. If the master clock was enabled before calling KVM_SET_CLOCK, then it might have built up a significant delta from kernel boot clock. In the worst case, the time set by userspace would be shifted by so much that it couldn't have been set at any point during KVM_SET_CLOCK. To fix this, move kvm_gen_update_masterclock() before computing kvmclock_offset, which means that the master clock and kernel boot clock will be sufficiently close together. Another solution would be to replace get_kvmclock_ns() with "ktime_get_boot_ns() + ka->kvmclock_offset", which is marginally more accurate, but would break symmetry with KVM_GET_CLOCK. Signed-off-by: Radim Krčmář <[email protected]>