aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2017-07-12fs/Kconfig: kill CONFIG_PERCPU_RWSEM some moreDavidlohr Bueso1-1/+0
As of commit bf3eac84c42d ("percpu-rwsem: kill CONFIG_PERCPU_RWSEM") we unconditionally build pcpu-rwsems. Remove a leftover in for FILE_LOCKING. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Davidlohr Bueso <[email protected]> Cc: Oleg Nesterov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12bfs: fix sanity checks for empty filesRakesh Pandit1-1/+1
Mount fails if file system image has empty files because of sanity check while reading superblock. For empty files disk offset to end of file (i_eoffset) is cpu_to_le32(-1). Sanity check comparison, which compares disk offset with file system size isn't valid for this value and hence is ignored with this patch. Steps to reproduce: $ dd if=/dev/zero of=bfs-image count=204800 $ mkfs.bfs bfs-image $ mkdir bfs-mount-point $ sudo mount -t bfs -o loop bfs-image bfs-mount-point/ $ cd bfs-mount-point/ $ sudo touch a $ cd .. $ sudo umount bfs-mount-point/ $ sudo mount -t bfs -o loop bfs-image bfs-mount-point/ mount: /dev/loop0: can't read superblock $ dmesg [25526.689580] BFS-fs: bfs_fill_super(): Inode 0x00000003 corrupted Tigran said: "If you had created the filesystem with the proper mkfs under SCO UnixWare 7 you (probably) wouldn't encounter this issue. But since commercial Unix-es are now part of history and the only proper way is the Linux mkfs.bfs utility, your patch is fine" Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Rakesh Pandit <[email protected]> Acked-by: Tigran Aivazian <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12random: do not ignore early device randomnessKees Cook2-0/+6
The add_device_randomness() function would ignore incoming bytes if the crng wasn't ready. This additionally makes sure to make an early enough call to add_latent_entropy() to influence the initial stack canary, which is especially important on non-x86 systems where it stays the same through the life of the boot. Link: http://lkml.kernel.org/r/20170626233038.GA48751@beast Signed-off-by: Kees Cook <[email protected]> Cc: "Theodore Ts'o" <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jessica Yu <[email protected]> Cc: Steven Rostedt (VMware) <[email protected]> Cc: Viresh Kumar <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Prarit Bhargava <[email protected]> Cc: Lokesh Vutla <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: AKASHI Takahiro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12kernel/sysctl_binary.c: check name array length in deprecated_sysctl_warning()Mateusz Jurczyk1-1/+1
Prevent use of uninitialized memory (originating from the stack frame of do_sysctl()) by verifying that the name array is filled with sufficient input data before comparing its specific entries with integer constants. Through timing measurement or analyzing the kernel debug logs, a user-mode program could potentially infer the results of comparisons against the uninitialized memory, and acquire some (very limited) information about the state of the kernel stack. The change also eliminates possible future warnings by tools such as KMSAN and other code checkers / instrumentations. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Mateusz Jurczyk <[email protected]> Acked-by: Kees Cook <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Matthew Whitehead <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Tetsuo Handa <[email protected]> Cc: Alexander Potapenko <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12test_sysctl: test against int proc_dointvec() array supportLuis R. Rodriguez2-0/+102
Add a few initial respective tests for an array: o Echoing values separated by spaces works o Echoing only first elements will set first elements o Confirm PAGE_SIZE limit still applies even if an array is used Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Luis R. Rodriguez <[email protected]> Cc: Kees Cook <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12test_sysctl: add simple proc_douintvec() caseLuis R. Rodriguez2-0/+74
Test against a simple proc_douintvec() case. While at it, add a test against UINT_MAX. Make sure UINT_MAX works, and UINT_MAX+1 will fail and that negative values are not accepted. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Luis R. Rodriguez <[email protected]> Cc: Kees Cook <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12test_sysctl: add simple proc_dointvec() caseLuis R. Rodriguez2-0/+73
Test against a simple proc_dointvec() case. While at it, add a test against INT_MAX. Make sure INT_MAX works, and INT_MAX+1 will fail. Also test negative values work. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Luis R. Rodriguez <[email protected]> Cc: Kees Cook <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12test_sysctl: test against PAGE_SIZE for intLuis R. Rodriguez1-0/+66
Add the following tests to ensure we do not regress: o Test using a buffer full of space (PAGE_SIZE-1) followed by a single digit works o Test using a buffer full of spaces (PAGE_SIZE or over) will fail As tests increase instead of unloading the module and reloading it we can just do a shell reset_vals() with a reset to values we know are set at init on the driver. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Luis R. Rodriguez <[email protected]> Cc: Kees Cook <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12test_sysctl: add generic script to expand on testsLuis R. Rodriguez5-220/+495
This adds a generic script to let us more easily add more tests cases. Since we really have only two types of tests cases just fold them into the one file. Each test unit is now identified into its separate function: # ./sysctl.sh -l Test ID list: TEST_ID x NUM_TEST TEST_ID: Test ID NUM_TESTS: Number of recommended times to run the test 0001 x 1 - tests proc_dointvec_minmax() 0002 x 1 - tests proc_dostring() For now we start off with what we had before, and run only each test once. We can now watch a test case until it fails: ./sysctl.sh -w 0002 We can also run a test case x number of times, say we want to run a test case 100 times: ./sysctl.sh -c 0001 100 To run a test case only once, for example: ./sysctl.sh -s 0002 The default settings are specified at the top of sysctl.sh. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Luis R. Rodriguez <[email protected]> Cc: Kees Cook <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12test_sysctl: add dedicated proc sysctl test driverLuis R. Rodriguez6-4/+130
The existing tools/testing/selftests/sysctl/ tests include two test cases, but these use existing production kernel sysctl interfaces. We want to expand test coverage but we can't just be looking for random safe production values to poke at, that's just insane! Instead just dedicate a test driver for debugging purposes and port the existing scripts to use it. This will make it easier for further tests to be added. Subsequent patches will extend our test coverage for sysctl. The stress test driver uses a new license (GPL on Linux, copyleft-next outside of Linux). Linus was fine with this [0] and later due to Ted's and Alans's request ironed out an "or" language clause to use [1] which is already present upstream. [0] https://lkml.kernel.org/r/CA+55aFyhxcvD+q7tp+-yrSFDKfR0mOHgyEAe=f_94aKLsOu0Og@mail.gmail.com [1] https://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Luis R. Rodriguez <[email protected]> Acked-by: Kees Cook <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12sysctl: add unsigned int range supportLuis R. Rodriguez3-1/+72
To keep parity with regular int interfaces provide the an unsigned int proc_douintvec_minmax() which allows you to specify a range of allowed valid numbers. Adding proc_douintvec_minmax_sysadmin() is easy but we can wait for an actual user for that. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Luis R. Rodriguez <[email protected]> Acked-by: Kees Cook <[email protected]> Cc: Subash Abhinov Kasiviswanathan <[email protected]> Cc: Heinrich Schuchardt <[email protected]> Cc: Kees Cook <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Al Viro <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12sysctl: simplify unsigned int supportLuis R. Rodriguez2-7/+160
Commit e7d316a02f68 ("sysctl: handle error writing UINT_MAX to u32 fields") added proc_douintvec() to start help adding support for unsigned int, this however was only half the work needed. Two fixes have come in since then for the following issues: o Printing the values shows a negative value, this happens since do_proc_dointvec() and this uses proc_put_long() This was fixed by commit 5380e5644afbba9 ("sysctl: don't print negative flag for proc_douintvec"). o We can easily wrap around the int values: UINT_MAX is 4294967295, if we echo in 4294967295 + 1 we end up with 0, using 4294967295 + 2 we end up with 1. o We echo negative values in and they are accepted This was fixed by commit 425fffd886ba ("sysctl: report EINVAL if value is larger than UINT_MAX for proc_douintvec"). It still also failed to be added to sysctl_check_table()... instead of adding it with the current implementation just provide a proper and simplified unsigned int support without any array unsigned int support with no negative support at all. Historically sysctl proc helpers have supported arrays, due to the complexity this adds though we've taken a step back to evaluate array users to determine if its worth upkeeping for unsigned int. An evaluation using Coccinelle has been done to perform a grammatical search to ask ourselves: o How many sysctl proc_dointvec() (int) users exist which likely should be moved over to proc_douintvec() (unsigned int) ? Answer: about 8 - Of these how many are array users ? Answer: Probably only 1 o How many sysctl array users exist ? Answer: about 12 This last question gives us an idea just how popular arrays: they are not. Array support should probably just be kept for strings. The identified uint ports are: drivers/infiniband/core/ucma.c - max_backlog drivers/infiniband/core/iwcm.c - default_backlog net/core/sysctl_net_core.c - rps_sock_flow_sysctl() net/netfilter/nf_conntrack_timestamp.c - nf_conntrack_timestamp -- bool net/netfilter/nf_conntrack_acct.c nf_conntrack_acct -- bool net/netfilter/nf_conntrack_ecache.c - nf_conntrack_events -- bool net/netfilter/nf_conntrack_helper.c - nf_conntrack_helper -- bool net/phonet/sysctl.c proc_local_port_range() The only possible array users is proc_local_port_range() but it does not seem worth it to add array support just for this given the range support works just as well. Unsigned int support should be desirable more for when you *need* more than INT_MAX or using int min/max support then does not suffice for your ranges. If you forget and by mistake happen to register an unsigned int proc entry with an array, the driver will fail and you will get something as follows: sysctl table check failed: debug/test_sysctl//uint_0002 array now allowed CPU: 2 PID: 1342 Comm: modprobe Tainted: G W E <etc> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS <etc> Call Trace: dump_stack+0x63/0x81 __register_sysctl_table+0x350/0x650 ? kmem_cache_alloc_trace+0x107/0x240 __register_sysctl_paths+0x1b3/0x1e0 ? 0xffffffffc005f000 register_sysctl_table+0x1f/0x30 test_sysctl_init+0x10/0x1000 [test_sysctl] do_one_initcall+0x52/0x1a0 ? kmem_cache_alloc_trace+0x107/0x240 do_init_module+0x5f/0x200 load_module+0x1867/0x1bd0 ? __symbol_put+0x60/0x60 SYSC_finit_module+0xdf/0x110 SyS_finit_module+0xe/0x10 entry_SYSCALL_64_fastpath+0x1e/0xad RIP: 0033:0x7f042b22d119 <etc> Fixes: e7d316a02f68 ("sysctl: handle error writing UINT_MAX to u32 fields") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Luis R. Rodriguez <[email protected]> Suggested-by: Alexey Dobriyan <[email protected]> Cc: Subash Abhinov Kasiviswanathan <[email protected]> Cc: Liping Zhang <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Heinrich Schuchardt <[email protected]> Cc: Kees Cook <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Al Viro <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12sysctl: fold sysctl_writes_strict checks into helperLuis R. Rodriguez1-24/+32
The mode sysctl_writes_strict positional checks keep being copy and pasted as we add new proc handlers. Just add a helper to avoid code duplication. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Luis R. Rodriguez <[email protected]> Suggested-by: Kees Cook <[email protected]> Cc: Al Viro <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12sysctl: kdoc'ify sysctl_writes_strictLuis R. Rodriguez1-4/+25
Document the different sysctl_writes_strict modes in code. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Luis R. Rodriguez <[email protected]> Cc: Al Viro <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Kees Cook <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12sysctl: fix lax sysctl_check_table() sanity checkLuis R. Rodriguez1-5/+5
Patch series "sysctl: few fixes", v5. I've been working on making kmod more deterministic, and as I did that I couldn't help but notice a few issues with sysctl. My end goal was just to fix unsigned int support, which back then was completely broken. Liping Zhang has sent up small atomic fixes, however it still missed yet one more fix and Alexey Dobriyan had also suggested to just drop array support given its complexity. I have inspected array support using Coccinelle and indeed its not that popular, so if in fact we can avoid it for new interfaces, I agree its best. I did develop a sysctl stress driver but will hold that off for another series. This patch (of 5): Commit 7c60c48f58a7 ("sysctl: Improve the sysctl sanity checks") improved sanity checks considerbly, however the enhancements on sysctl_check_table() meant adding a functional change so that only the last table entry's sanity error is propagated. It also changed the way errors were propagated so that each new check reset the err value, this means only last sanity check computed is used for an error. This has been in the kernel since v3.4 days. Fix this by carrying on errors from previous checks and iterations as we traverse the table and ensuring we keep any error from previous checks. We keep iterating on the table even if an error is found so we can complain for all errors found in one shot. This works as -EINVAL is always returned on error anyway, and the check for error is any non-zero value. Fixes: 7c60c48f58a7 ("sysctl: Improve the sysctl sanity checks") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Luis R. Rodriguez <[email protected]> Cc: Al Viro <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Kees Cook <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12kexec/kdump: minor Documentation updates for arm64 and ImageBharat Bhushan1-3/+9
Minor updates in Documentation for arm64 as relocatable kernel. Also this patch updates documentation for using uncompressed image "Image" which is used for ARM64. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Bharat Bhushan <[email protected]> Cc: Dave Young <[email protected]> Cc: Baoquan He <[email protected]> Cc: Vivek Goyal <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: AKASHI Takahiro <[email protected]> Cc: Pratyush Anand <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12kdump: protect vmcoreinfo data under the crash memoryXunlei Pang6-2/+74
Currently vmcoreinfo data is updated at boot time subsys_initcall(), it has the risk of being modified by some wrong code during system is running. As a result, vmcore dumped may contain the wrong vmcoreinfo. Later on, when using "crash", "makedumpfile", etc utility to parse this vmcore, we probably will get "Segmentation fault" or other unexpected errors. E.g. 1) wrong code overwrites vmcoreinfo_data; 2) further crashes the system; 3) trigger kdump, then we obviously will fail to recognize the crash context correctly due to the corrupted vmcoreinfo. Now except for vmcoreinfo, all the crash data is well protected(including the cpu note which is fully updated in the crash path, thus its correctness is guaranteed). Given that vmcoreinfo data is a large chunk prepared for kdump, we better protect it as well. To solve this, we relocate and copy vmcoreinfo_data to the crash memory when kdump is loading via kexec syscalls. Because the whole crash memory will be protected by existing arch_kexec_protect_crashkres() mechanism, we naturally protect vmcoreinfo_data from write(even read) access under kernel direct mapping after kdump is loaded. Since kdump is usually loaded at the very early stage after boot, we can trust the correctness of the vmcoreinfo data copied. On the other hand, we still need to operate the vmcoreinfo safe copy when crash happens to generate vmcoreinfo_note again, we rely on vmap() to map out a new kernel virtual address and update to use this new one instead in the following crash_save_vmcoreinfo(). BTW, we do not touch vmcoreinfo_note, because it will be fully updated using the protected vmcoreinfo_data after crash which is surely correct just like the cpu crash note. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Xunlei Pang <[email protected]> Tested-by: Michael Holzheu <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Dave Young <[email protected]> Cc: Eric Biederman <[email protected]> Cc: Hari Bathini <[email protected]> Cc: Juergen Gross <[email protected]> Cc: Mahesh Salgaonkar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12powerpc/fadump: use the correct VMCOREINFO_NOTE_SIZE for phdrXunlei Pang3-5/+2
vmcoreinfo_max_size stands for the vmcoreinfo_data, the correct one we should use is vmcoreinfo_note whose total size is VMCOREINFO_NOTE_SIZE. Like explained in commit 77019967f06b ("kdump: fix exported size of vmcoreinfo note"), it should not affect the actual function, but we better fix it, also this change should be safe and backward compatible. After this, we can get rid of variable vmcoreinfo_max_size, let's use the corresponding macros directly, fewer variables means more safety for vmcoreinfo operation. [[email protected]: fix build warning] Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Xunlei Pang <[email protected]> Reviewed-by: Mahesh Salgaonkar <[email protected]> Reviewed-by: Dave Young <[email protected]> Cc: Hari Bathini <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Eric Biederman <[email protected]> Cc: Juergen Gross <[email protected]> Cc: Michael Holzheu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12kexec: move vmcoreinfo out of the kernel's .bss sectionXunlei Pang8-21/+29
As Eric said, "what we need to do is move the variable vmcoreinfo_note out of the kernel's .bss section. And modify the code to regenerate and keep this information in something like the control page. Definitely something like this needs a page all to itself, and ideally far away from any other kernel data structures. I clearly was not watching closely the data someone decided to keep this silly thing in the kernel's .bss section." This patch allocates extra pages for these vmcoreinfo_XXX variables, one advantage is that it enhances some safety of vmcoreinfo, because vmcoreinfo now is kept far away from other kernel data structures. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Xunlei Pang <[email protected]> Tested-by: Michael Holzheu <[email protected]> Reviewed-by: Juergen Gross <[email protected]> Suggested-by: Eric Biederman <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Dave Young <[email protected]> Cc: Hari Bathini <[email protected]> Cc: Mahesh Salgaonkar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12kernel/fork.c: virtually mapped stacks: do not disable interruptsChristoph Lameter1-11/+5
The reason to disable interrupts seems to be to avoid switching to a different processor while handling per cpu data using individual loads and stores. If we use per cpu RMV primitives we will not have to disable interrupts. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Christoph Lameter <[email protected]> Cc: Andy Lutomirski <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12mm/memory.c: mark create_huge_pmd() inline to prevent build failureGeert Uytterhoeven1-1/+1
With gcc 4.1.2: mm/memory.o: In function `create_huge_pmd': memory.c:(.text+0x93e): undefined reference to `do_huge_pmd_anonymous_page' Interestingly, create_huge_pmd() is emitted in the assembler output, but never called. Converting transparent_hugepage_enabled() from a macro to a static inline function reduced the ability of the compiler to remove unused code. Fix this by marking create_huge_pmd() inline. Fixes: 16981d763501c0e0 ("mm: improve readability of transparent_hugepage_enabled()") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Geert Uytterhoeven <[email protected]> Acked-by: Arnd Bergmann <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12kernel.h: handle pointers to arrays better in container_of()Ian Abbott1-3/+7
If the first parameter of container_of() is a pointer to a non-const-qualified array type (and the third parameter names a non-const-qualified array member), the local variable __mptr will be defined with a const-qualified array type. In ISO C, these types are incompatible. They work as expected in GNU C, but some versions will issue warnings. For example, GCC 4.9 produces the warning "initialization from incompatible pointer type". Here is an example of where the problem occurs: ------------------------------------------------------- #include <linux/kernel.h> #include <linux/module.h> MODULE_LICENSE("GPL"); struct st { int a; char b[16]; }; static int __init example_init(void) { struct st t = { .a = 101, .b = "hello" }; char (*p)[16] = &t.b; struct st *x = container_of(p, struct st, b); printk(KERN_DEBUG "%p %p\n", (void *)&t, (void *)x); return 0; } static void __exit example_exit(void) { } module_init(example_init); module_exit(example_exit); ------------------------------------------------------- Building the module with gcc-4.9 results in these warnings (where '{m}' is the module source and '{k}' is the kernel source): ------------------------------------------------------- In file included from {m}/example.c:1:0: {m}/example.c: In function `example_init': {k}/include/linux/kernel.h:854:48: warning: initialization from incompatible pointer type const typeof( ((type *)0)->member ) *__mptr = (ptr); \ ^ {m}/example.c:14:17: note: in expansion of macro `container_of' struct st *x = container_of(p, struct st, b); ^ {k}/include/linux/kernel.h:854:48: warning: (near initialization for `x') const typeof( ((type *)0)->member ) *__mptr = (ptr); \ ^ {m}/example.c:14:17: note: in expansion of macro `container_of' struct st *x = container_of(p, struct st, b); ^ ------------------------------------------------------- Replace the type checking performed by the macro to avoid these warnings. Make sure `*(ptr)` either has type compatible with the member, or has type compatible with `void`, ignoring qualifiers. Raise compiler errors if this is not true. This is stronger than the previous behaviour, which only resulted in compiler warnings for a type mismatch. [[email protected]: fix new warnings for container_of()] Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ian Abbott <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]> Acked-by: Michal Nazarewicz <[email protected]> Acked-by: Kees Cook <[email protected]> Cc: Hidehiro Kawai <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Rasmus Villemoes <[email protected]> Cc: Johannes Berg <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Alexander Potapenko <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12include/linux/dcache.h: use unsigned chars in struct name_snapshotStephen Rothwell1-2/+2
"kernel.h: handle pointers to arrays better in container_of()" triggers: In file included from include/uapi/linux/stddef.h:1:0, from include/linux/stddef.h:4, from include/uapi/linux/posix_types.h:4, from include/uapi/linux/types.h:13, from include/linux/types.h:5, from include/linux/syscalls.h:71, from fs/dcache.c:17: fs/dcache.c: In function 'release_dentry_name_snapshot': include/linux/compiler.h:542:38: error: call to '__compiletime_assert_305' declared with attribute error: pointer type mismatch in container_of() _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__) ^ include/linux/compiler.h:525:4: note: in definition of macro '__compiletime_assert' prefix ## suffix(); \ ^ include/linux/compiler.h:542:2: note: in expansion of macro '_compiletime_assert' _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__) ^ include/linux/build_bug.h:46:37: note: in expansion of macro 'compiletime_assert' #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg) ^ include/linux/kernel.h:860:2: note: in expansion of macro 'BUILD_BUG_ON_MSG' BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) && \ ^ fs/dcache.c:305:7: note: in expansion of macro 'container_of' p = container_of(name->name, struct external_name, name[0]); Switch name_snapshot to use unsigned chars, matching struct qstr and struct external_name. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Stephen Rothwell <[email protected]> Cc: Al Viro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-07-12kokr/memory-barriers.txt: Fix obsolete link to atomic_ops.txtSeongJae Park1-7/+7
Obsolete links to atomic_ops.txt exist in ko_KR/memory-barriers.txt though the file has moved to core-api/atomic_ops.rst. This commit fixes the obsolete links. Signed-off-by: SeongJae Park <[email protected]> Signed-off-by: Jonathan Corbet <[email protected]>
2017-07-12memory-barriers.txt: Fix broken link to atomic_ops.txtSeongJae Park1-3/+3
Few obsolete links to atomic_ops.txt exist in memory-barriers.txt though the file has moved to core-api/atomic_ops.rst. This commit fixes the obsolete links. Signed-off-by: SeongJae Park <[email protected]> Signed-off-by: Jonathan Corbet <[email protected]>
2017-07-12docs: Turn off section numbering for the input docsJonathan Corbet1-1/+0
The input docs enable section numbering at multiple levels, leading to a lot of bright-red "nested numbered toctree" warnings in newer Sphinx versions. Just take that directive out for now to help alleviate the global red-pixel shortage. Signed-off-by: Jonathan Corbet <[email protected]>
2017-07-12docs: Include uaccess docs from the right fileJonathan Corbet1-1/+1
Documentation/core-api/kernel-api.rst was including kerneldoc comments from arch/x86/include/asm/uaccess_32.h, but the relevant comments moved to .../uaccess.h some time ago. Correct the include to pick up the comments and eliminate a warning. Signed-off-by: Jonathan Corbet <[email protected]>
2017-07-12net: stmmac: revert "support future possible different internal phy mode"LABBE Corentin1-7/+3
Since internal phy-mode is reserved for non-xMII protocol we cannot use it with dwmac-sun8i. Furthermore, all DT patchs which comes with this patch were cleaned, so the current state is broken. This reverts commit 1c2fa5f84683 ("net: stmmac: support future possible different internal phy mode") Fixes: 1c2fa5f84683 ("net: stmmac: support future possible different internal phy mode") Signed-off-by: Corentin Labbe <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-07-12sfc: don't read beyond unicast address listBert Kenward1-5/+3
If we have more than 32 unicast MAC addresses assigned to an interface we will read beyond the end of the address table in the driver when adding filters. The next 256 entries store multicast addresses, so we will end up attempting to insert duplicate filters, which is mostly harmless. If we add more than 288 unicast addresses we will then read past the multicast address table, which is likely to be more exciting. Fixes: 12fb0da45c9a ("sfc: clean fallbacks between promisc/normal in efx_ef10_filter_sync_rx_mode") Signed-off-by: Bert Kenward <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-07-12Merge branch 'net-doc-fixes'David S. Miller2-3/+6
Stephen Hemminger says: ==================== minor net kernel-doc fixes Fix a couple of small errors in kernel-doc for networking ==================== Signed-off-by: David S. Miller <[email protected]>
2017-07-12datagram: fix kernel-doc commentsstephen hemminger1-3/+3
An underscore in the kernel-doc comment section has special meaning and mis-use generates an errors. ./net/core/datagram.c:207: ERROR: Unknown target name: "msg". ./net/core/datagram.c:379: ERROR: Unknown target name: "msg". ./net/core/datagram.c:816: ERROR: Unknown target name: "t". Signed-off-by: Stephen Hemminger <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-07-12socket: add documentation for missing elementsstephen hemminger1-0/+3
Fill in missing kernel-doc for missing elements in struct sock. Signed-off-by: Stephen Hemminger <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-07-12smsc911x: Add check for ioremap_nocache() return codeAlexey Khoroshilov1-0/+5
There is no check for return code of smsc911x_drv_probe() in smsc911x_drv_probe(). The patch adds one. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Alexey Khoroshilov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-07-12rtc: Remove wrong deprecation commentAlexandre Belloni1-6/+0
rtc_time_to_tm and rtc_tm_to_time are not deprecated and make perfect sense for RTCs that are simple 32bit counters. Signed-off-by: Alexandre Belloni <[email protected]>
2017-07-12platform/x86: silead_dmi: Add entry for Ployer Momo7w tablet touchscreenHans de Goede1-0/+10
This Ployer Momo7w revision has the same hardware as the Trekstor ST70416-6, so we re-use the surftab_wintron70_st70416_6_data. Signed-off-by: Hans de Goede <[email protected]> Signed-off-by: Darren Hart (VMware) <[email protected]>
2017-07-12nfsd4: factor ctime into change attributeJ. Bruce Fields3-3/+25
Factoring ctime into the nfsv4 change attribute gives us better properties than just i_version alone. Eventually we'll likely also expose this (as opposed to raw i_version) to userspace, at which point we'll want to move it to a common helper, called from either userspace or individual filesystems. For now, nfsd is the only user. Signed-off-by: J. Bruce Fields <[email protected]>
2017-07-12svcrdma: Remove svc_rdma_chunk_ctxt::cc_dir fieldChuck Lever1-10/+8
Clean up: No need to save the I/O direction. The functions that release svc_rdma_chunk_ctxt already know what direction to use. Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2017-07-12svcrdma: use offset_in_page() macroChuck Lever1-2/+3
Clean up: Use offset_in_page() macro instead of open-coding. Reported-by: Geliang Tang <[email protected]> Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2017-07-12svcrdma: Clean up after converting svc_rdma_recvfrom to rdma_rw APIChuck Lever2-39/+4
Clean up: Registration mode details are now handled by the rdma_rw API, and thus can be removed from svcrdma. Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2017-07-12svcrdma: Clean-up svc_rdma_unmap_dmaChuck Lever1-14/+5
There's no longer a need to compare each SGE's lkey with the PD's local_dma_lkey. Now that FRWR is gone, all DMA mappings are for pages that were registered with this key. Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2017-07-12svcrdma: Remove frmr cacheChuck Lever2-104/+0
Clean up: Now that the svc_rdma_recvfrom path uses the rdma_rw API, the details of Read sink buffer registration are dealt with by the kernel's RDMA core. This cache is no longer used, and can be removed. Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2017-07-12svcrdma: Remove unused Read completion handlersChuck Lever2-87/+10
Clean up: The generic RDMA R/W API conversion of svc_rdma_recvfrom replaced the Register, Read, and Invalidate completion handlers. Remove the old ones, which are no longer used. These handlers shared some helper code with svc_rdma_wc_send. Fold the wc_common helper back into the one remaining completion handler. Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2017-07-12svcrdma: Properly compute .len and .buflen for received RPC CallsChuck Lever2-11/+5
When an RPC-over-RDMA request is received, the Receive buffer contains a Transport Header possibly followed by an RPC message. Even though rq_arg.head[0] (as passed to NFSD) does not contain the Transport Header header, currently rq_arg.len includes the size of the Transport Header. That violates the intent of the xdr_buf API contract. .buflen should include everything, but .len should be exactly the length of the RPC message in the buffer. The rq_arg fields are summed together at the end of svc_rdma_recvfrom to obtain the correct return value. rq_arg.len really ought to contain the correct number of bytes already, but it currently doesn't due to the above misbehavior. Let's instead ensure that .buflen includes the length of the transport header, and that .len is always equal to head.iov_len + .page_len + tail.iov_len . Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2017-07-12svcrdma: Use generic RDMA R/W API in RPC Call pathChuck Lever3-468/+106
The current svcrdma recvfrom code path has a lot of detail about registration mode and the type of port (iWARP, IB, etc). Instead, use the RDMA core's generic R/W API. This shares code with other RDMA-enabled ULPs that manages the gory details of buffer registration and the posting of RDMA Read Work Requests. Since the Read list marshaling code is being replaced, I took the opportunity to replace C structure-based XDR encoding code with more portable code that uses pointer arithmetic. Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2017-07-12svcrdma: Add recvfrom helpers to svc_rdma_rw.cChuck Lever2-1/+427
svc_rdma_rw.c already contains helpers for the sendto path. Introduce helpers for the recvfrom path. The plan is to replace the local NFSD bespoke code that constructs and posts RDMA Read Work Requests with calls to the rdma_rw API. This shares code with other RDMA-enabled ULPs that manages the gory details of buffer registration and posting Work Requests. This new code also puts all RDMA_NOMSG-specific logic in one place. Lastly, the use of rqstp->rq_arg.pages is deprecated in favor of using rqstp->rq_pages directly, for clarity. Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2017-07-12sunrpc: Allocate up to RPCSVC_MAXPAGES per svc_rqstChuck Lever2-5/+7
svcrdma needs 259 pages allocated to receive 1MB NFSv4.0 WRITE requests: - 1 page for the transport header and head iovec - 256 pages for the data payload - 1 page for the trailing GETATTR request (since NFSD XDR decoding does not look for a tail iovec, the GETATTR is stuck at the end of the rqstp->rq_arg.pages list) - 1 page for building the reply xdr_buf But RPCSVC_MAXPAGES is already 259 (on x86_64). The problem is that svc_alloc_arg never allocates that many pages. To address this: 1. The final element of rq_pages always points to NULL. To accommodate up to 259 pages in rq_pages, add an extra element to rq_pages for the array termination sentinel. 2. Adjust the calculation of "pages" to match how RPCSVC_MAXPAGES is calculated, so it can go up to 259. Bruce noted that the calculation assumes sv_max_mesg is a multiple of PAGE_SIZE, which might not always be true. I didn't change this assumption. 3. Change the loop boundaries to allow 259 pages to be allocated. Additional clean-up: WARN_ON_ONCE adds an extra conditional branch, which is basically never taken. And there's no need to dump the stack here because svc_alloc_arg has only one caller. Keeping that NULL "array termination sentinel"; there doesn't appear to be any code that depends on it, only code in nfsd_splice_actor() which needs the 259th element to be initialized to *something*. So it's possible we could just keep the array at 259 elements and drop that final NULL, but we're being conservative for now. Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2017-07-12Merge branch 'i2c/for-4.13' of ↵Linus Torvalds41-2483/+4944
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c updates from Wolfram Sang: "This pull request contains: - i2c core reorganization. One source file became too monolithic. It is now split up, yet we still have the same named object as the final output. This should ease maintenance. - new drivers: ZTE ZX2967 family, ASPEED 24XX/25XX - designware driver gained slave mode support - xgene-slimpro driver gained ACPI support - bigger overhaul for pca-platform driver - the algo-bit module now supports messages with enforced STOP - slightly bigger than usual set of driver updates and improvements and with much appreciated quality assurance from Andy Shevchenko" * 'i2c/for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (51 commits) i2c: Provide a stub for i2c_detect_slave_mode() i2c: designware: Let slave adapter support be optional i2c: designware: Make HW init functions static i2c: designware: fix spelling mistakes i2c: pca-platform: propagate error from i2c_pca_add_numbered_bus i2c: pca-platform: correctly set algo_data.reset_chip i2c: acpi: Do not create i2c-clients for LNXVIDEO ACPI devices i2c: designware: enable SLAVE in platform module i2c: designware: add SLAVE mode functions i2c: zx2967: drop COMPILE_TEST dependency i2c: zx2967: always use the same device when printing errors i2c: pca-platform: use dev_warn/dev_info instead of printk i2c: pca-platform: use device managed allocations i2c: pca-platform: add devicetree awareness i2c: pca-platform: switch to struct gpio_desc dt-bindings: add bindings for i2c-pca-platform i2c: cadance: fix ctrl/addr reg write order i2c: zx2967: add i2c controller driver for ZTE's zx2967 family dt: bindings: add documentation for zx2967 family i2c controller i2c: algo-bit: add support for I2C_M_STOP ...
2017-07-12Merge tag 'iommu-updates-v4.13' of ↵Linus Torvalds22-548/+1233
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU updates from Joerg Roedel: "This update comes with: - Support for lockless operation in the ARM io-pgtable code. This is an important step to solve the scalability problems in the common dma-iommu code for ARM - Some Errata workarounds for ARM SMMU implemenations - Rewrite of the deferred IO/TLB flush code in the AMD IOMMU driver. The code suffered from very high flush rates, with the new implementation the flush rate is down to ~1% of what it was before - Support for amd_iommu=off when booting with kexec. The problem here was that the IOMMU driver bailed out early without disabling the iommu hardware, if it was enabled in the old kernel - The Rockchip IOMMU driver is now available on ARM64 - Align the return value of the iommu_ops->device_group call-backs to not miss error values - Preempt-disable optimizations in the Intel VT-d and common IOVA code to help Linux-RT - Various other small cleanups and fixes" * tag 'iommu-updates-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (60 commits) iommu/vt-d: Constify intel_dma_ops iommu: Warn once when device_group callback returns NULL iommu/omap: Return ERR_PTR in device_group call-back iommu: Return ERR_PTR() values from device_group call-backs iommu/s390: Use iommu_group_get_for_dev() in s390_iommu_add_device() iommu/vt-d: Don't disable preemption while accessing deferred_flush() iommu/iova: Don't disable preempt around this_cpu_ptr() iommu/arm-smmu-v3: Add workaround for Cavium ThunderX2 erratum #126 iommu/arm-smmu-v3: Enable ACPI based HiSilicon CMD_PREFETCH quirk(erratum 161010701) iommu/arm-smmu-v3: Add workaround for Cavium ThunderX2 erratum #74 ACPI/IORT: Fixup SMMUv3 resource size for Cavium ThunderX2 SMMUv3 model iommu/arm-smmu-v3, acpi: Add temporary Cavium SMMU-V3 IORT model number definitions iommu/io-pgtable-arm: Use dma_wmb() instead of wmb() when publishing table iommu/io-pgtable: depend on !GENERIC_ATOMIC64 when using COMPILE_TEST with LPAE iommu/arm-smmu-v3: Remove io-pgtable spinlock iommu/arm-smmu: Remove io-pgtable spinlock iommu/io-pgtable-arm-v7s: Support lockless operation iommu/io-pgtable-arm: Support lockless operation iommu/io-pgtable: Introduce explicit coherency iommu/io-pgtable-arm-v7s: Refactor split_blk_unmap ...
2017-07-12Merge branch 'overlayfs-linus' of ↵Linus Torvalds12-383/+1456
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs Pull overlayfs updates from Miklos Szeredi: "This work from Amir introduces the inodes index feature, which provides: - hardlinks are not broken on copy up - infrastructure for overlayfs NFS export This also fixes constant st_ino for samefs case for lower hardlinks" * 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: (33 commits) ovl: mark parent impure and restore timestamp on ovl_link_up() ovl: document copying layers restrictions with inodes index ovl: cleanup orphan index entries ovl: persistent overlay inode nlink for indexed inodes ovl: implement index dir copy up ovl: move copy up lock out ovl: rearrange copy up ovl: add flag for upper in ovl_entry ovl: use struct copy_up_ctx as function argument ovl: base tmpfile in workdir too ovl: factor out ovl_copy_up_inode() helper ovl: extract helper to get temp file in copy up ovl: defer upper dir lock to tempfile link ovl: hash overlay non-dir inodes by copy up origin ovl: cleanup bad and stale index entries on mount ovl: lookup index entry for copy up origin ovl: verify index dir matches upper dir ovl: verify upper root dir matches lower root dir ovl: introduce the inodes index dir feature ovl: generalize ovl_create_workdir() ...
2017-07-12fbdev: make get_fb_unmapped_area depends of !MMUBenjamin Gaignard1-2/+3
Even if CONFIG_FB_PROVIDE_GET_FB_UNMAPPED_AREA flag is selected do not compile and use get_fb_unmapped_area() if CONFIG_MMU is also set. This will avoid mmap errors when compiling multi architectures at same time. Signed-off-by: Benjamin Gaignard <[email protected]> Reviewed-by: Daniel Vetter <[email protected]> Cc: Noralf Trønnes <[email protected]> Cc: Emil Velikov <[email protected]> Cc: Yannick Fertre <[email protected]> Cc: Eric Engestrom <[email protected]> Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]>