blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2020-08-13	libbpf: Enforce 64-bitness of BTF for BPF object files	Andrii Nakryiko	1	-0/+4
	BPF object files are always targeting 64-bit BPF target architecture, so enforce that at BTF level as well. Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-08-13	selftests/bpf: Fix btf_dump test cases on 32-bit arches	Andrii Nakryiko	1	-7/+20
	Fix btf_dump test cases by hard-coding BPF's pointer size of 8 bytes for cases where it's impossible to deterimne the pointer size (no long type in BTF). In cases where it's known, validate libbpf correctly determines it as 8. Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-08-13	libbpf: Handle BTF pointer sizes more carefully	Andrii Nakryiko	4	-4/+87
	With libbpf and BTF it is pretty common to have libbpf built for one architecture, while BTF information was generated for a different architecture (typically, but not always, BPF). In such case, the size of a pointer might differ betweem architectures. libbpf previously was always making an assumption that pointer size for BTF is the same as native architecture pointer size, but that breaks for cases where libbpf is built as 32-bit library, while BTF is for 64-bit architecture. To solve this, add heuristic to determine pointer size by searching for `long` or `unsigned long` integer type and using its size as a pointer size. Also, allow to override the pointer size with a new API btf__set_pointer_size(), for cases where application knows which pointer size should be used. User application can check what libbpf "guessed" by looking at the result of btf__pointer_size(). If it's not 0, then libbpf successfully determined a pointer size, otherwise native arch pointer size will be used. For cases where BTF is parsed from ELF file, use ELF's class (32-bit or 64-bit) to determine pointer size. Fixes: 8a138aed4a80 ("bpf: btf: Add BTF support to libbpf") Fixes: 351131b51c7a ("libbpf: add btf_dump API for BTF-to-C conversion") Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-08-13	libbpf: Fix BTF-defined map-in-map initialization on 32-bit host arches	Andrii Nakryiko	1	-6/+10
	Libbpf built in 32-bit mode should be careful about not conflating 64-bit BPF pointers in BPF ELF file and host architecture pointers. This patch fixes issue of incorrect initializating of map-in-map inner map slots due to such difference. Fixes: 646f02ffdd49 ("libbpf: Add BTF-defined map-in-map support") Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-08-13	selftest/bpf: Fix compilation warnings in 32-bit mode	Andrii Nakryiko	9	-19/+24
	Fix compilation warnings emitted when compiling selftests for 32-bit platform (x86 in my case). Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-08-13	tools/bpftool: Fix compilation warnings in 32-bit mode	Andrii Nakryiko	4	-12/+20
	Fix few compilation warnings in bpftool when compiling in 32-bit mode. Abstract away u64 to pointer conversion into a helper function. Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-08-13	bpf, selftests: Add tests to sock_ops for loading sk	John Fastabend	1	-0/+21
	Add tests to directly accesse sock_ops sk field. Then use it to ensure a bad pointer access will fault if something goes wrong. We do three tests: The first test ensures when we read sock_ops sk pointer into the same register that we don't fault as described earlier. Here r9 is chosen as the temp register. The xlated code is, 36: (7b) (u64 )(r1 +32) = r9 37: (61) r9 = (u32 )(r1 +28) 38: (15) if r9 == 0x0 goto pc+3 39: (79) r9 = (u64 )(r1 +32) 40: (79) r1 = (u64 )(r1 +0) 41: (05) goto pc+1 42: (79) r9 = (u64 )(r1 +32) The second test ensures the temp register selection does not collide with in-use register r9. Shown here r8 is chosen because r9 is the sock_ops pointer. The xlated code is as follows, 46: (7b) (u64 )(r9 +32) = r8 47: (61) r8 = (u32 )(r9 +28) 48: (15) if r8 == 0x0 goto pc+3 49: (79) r8 = (u64 )(r9 +32) 50: (79) r9 = (u64 )(r9 +0) 51: (05) goto pc+1 52: (79) r8 = (u64 )(r9 +32) And finally, ensure we didn't break the base case where dst_reg does not equal the source register, 56: (61) r2 = (u32 )(r1 +28) 57: (15) if r2 == 0x0 goto pc+1 58: (79) r2 = (u64 )(r1 +0) Notice it takes us an extra four instructions when src reg is the same as dst reg. One to save the reg, two to restore depending on the branch taken and a goto to jump over the second restore. Signed-off-by: John Fastabend <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Song Liu <[email protected]> Acked-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/bpf/159718355325.4728.4163036953345999636.stgit@john-Precision-5820-Tower
2020-08-13	bpf, selftests: Add tests for sock_ops load with r9, r8.r7 registers	John Fastabend	1	-0/+7
	Loads in sock_ops case when using high registers requires extra logic to ensure the correct temporary value is used. We need to ensure the temp register does not use either the src_reg or dst_reg. Lets add an asm test to force the logic is triggered. The xlated code is here, 30: (7b) (u64 )(r9 +32) = r7 31: (61) r7 = (u32 )(r9 +28) 32: (15) if r7 == 0x0 goto pc+2 33: (79) r7 = (u64 )(r9 +0) 34: (63) (u32 )(r7 +916) = r8 35: (79) r7 = (u64 )(r9 +32) Notice r9 and r8 are not used for temp registers and r7 is chosen. Signed-off-by: John Fastabend <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Song Liu <[email protected]> Acked-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/bpf/159718353345.4728.8805043614257933227.stgit@john-Precision-5820-Tower
2020-08-13	bpf, selftests: Add tests for ctx access in sock_ops with single register	John Fastabend	1	-0/+13
	To verify fix ("bpf: sock_ops ctx access may stomp registers in corner case") we want to force compiler to generate the following code when accessing a field with BPF_TCP_SOCK_GET_COMMON, r1 = (u32 )(r1 + 96) // r1 is skops ptr Rather than depend on clang to do this we add the test with inline asm to the tcpbpf test. This saves us from having to create another runner and ensures that if we break this again test_tcpbpf will crash. With above code we get the xlated code, 11: (7b) (u64 )(r1 +32) = r9 12: (61) r9 = (u32 )(r1 +28) 13: (15) if r9 == 0x0 goto pc+4 14: (79) r9 = (u64 )(r1 +32) 15: (79) r1 = (u64 )(r1 +0) 16: (61) r1 = (u32 )(r1 +2348) 17: (05) goto pc+1 18: (79) r9 = (u64 )(r1 +32) We also add the normal case where src_reg != dst_reg so we can compare code generation easily from llvm-objdump and ensure that case continues to work correctly. The normal code is xlated to, 20: (b7) r1 = 0 21: (61) r1 = (u32 )(r3 +28) 22: (15) if r1 == 0x0 goto pc+2 23: (79) r1 = (u64 )(r3 +0) 24: (61) r1 = (u32 )(r1 +2348) Where the temp variable is not used. Signed-off-by: John Fastabend <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Song Liu <[email protected]> Acked-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/bpf/159718351457.4728.3295119261717842496.stgit@john-Precision-5820-Tower
2020-08-13	libbpf: Prevent overriding errno when logging errors	Toke Høiland-Jørgensen	1	-5/+7
	Turns out there were a few more instances where libbpf didn't save the errno before writing an error message, causing errno to be overridden by the printf() return and the error disappearing if logging is enabled. Signed-off-by: Toke Høiland-Jørgensen <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-08-12	libbpf: Handle GCC built-in types for Arm NEON	Jean-Philippe Brucker	1	-1/+34
	When building Arm NEON (SIMD) code from lib/raid6/neon.uc, GCC emits DWARF information using a base type "__Poly8_t", which is internal to GCC and not recognized by Clang. This causes build failures when building with Clang a vmlinux.h generated from an arm64 kernel that was built with GCC. vmlinux.h:47284:9: error: unknown type name '__Poly8_t' typedef __Poly8_t poly8x16_t[16]; ^~~~~~~~~ The polyX_t types are defined as unsigned integers in the "Arm C Language Extension" document (101028_Q220_00_en). Emit typedefs based on standard integer types for the GCC internal types, similar to those emitted by Clang. Including linux/kernel.h to use ARRAY_SIZE() incidentally redefined max(), causing a build bug due to different types, hence the seemingly unrelated change. Reported-by: Jakov Petrina <[email protected]> Signed-off-by: Jean-Philippe Brucker <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-08-12	tools/bpftool: Make skeleton code C++17-friendly by dropping typeof()	Andrii Nakryiko	1	-4/+4
	Seems like C++17 standard mode doesn't recognize typeof() anymore. This can be tested by compiling test_cpp test with -std=c++17 or -std=c++1z options. The use of typeof in skeleton generated code is unnecessary, all types are well-known at the time of code generation, so remove all typeof()'s to make skeleton code more future-proof when interacting with C++ compilers. Fixes: 985ead416df3 ("bpftool: Add skeleton codegen command") Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Song Liu <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-08-11	selftests/bpf: Fix v4_to_v6 in sk_lookup	Stanislav Fomichev	1	-0/+1
	I'm getting some garbage in bytes 8 and 9 when doing conversion from sockaddr_in to sockaddr_in6 (leftover from AF_INET?). Let's explicitly clear the higher bytes. Fixes: 0ab5539f8584 ("selftests/bpf: Tests for BPF_SK_LOOKUP attach point") Signed-off-by: Stanislav Fomichev <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Jakub Sitnicki <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-08-11	selftests/bpf: Fix segmentation fault in test_progs	Jianlin Lv	1	-6/+13
	test_progs reports the segmentation fault as below: $ sudo ./test_progs -t mmap --verbose test_mmap:PASS:skel_open_and_load 0 nsec [...] test_mmap:PASS:adv_mmap1 0 nsec test_mmap:PASS:adv_mmap2 0 nsec test_mmap:PASS:adv_mmap3 0 nsec test_mmap:PASS:adv_mmap4 0 nsec Segmentation fault This issue was triggered because mmap() and munmap() used inconsistent length parameters; mmap() creates a new mapping of 3 * page_size, but the length parameter set in the subsequent re-map and munmap() functions is 4 * page_size; this leads to the destruction of the process space. To fix this issue, first create 4 pages of anonymous mapping, then do all the mmap() with MAP_FIXED. Another issue is that when unmap the second page fails, the length parameter to delete tmp1 mappings should be 4 * page_size. Signed-off-by: Jianlin Lv <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-08-11	libbpf: Do not use __builtin_offsetof for offsetof	Yonghong Song	1	-1/+1
	Commit 5fbc220862fc ("tools/libpf: Add offsetof/container_of macro in bpf_helpers.h") added a macro offsetof() to get the offset of a structure member: #define offsetof(TYPE, MEMBER) ((size_t)&((TYPE *)0)->MEMBER) In certain use cases, size_t type may not be available so Commit da7a35062bcc ("libbpf bpf_helpers: Use __builtin_offsetof for offsetof") changed to use __builtin_offsetof which removed the dependency on type size_t, which I suggested. But using __builtin_offsetof will prevent CO-RE relocation generation in case that, e.g., TYPE is annotated with "preserve_access_info" where a relocation is desirable in case the member offset is changed in a different kernel version. So this patch reverted back to the original macro but using "unsigned long" instead of "site_t". Fixes: da7a35062bcc ("libbpf bpf_helpers: Use __builtin_offsetof for offsetof") Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Acked-by: Ian Rogers <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-08-07	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf	David S. Miller	14	-64/+111
	Daniel Borkmann says: ==================== pull-request: bpf 2020-08-08 The following pull-request contains BPF updates for your net tree. We've added 11 non-merge commits during the last 2 day(s) which contain a total of 24 files changed, 216 insertions(+), 135 deletions(-). The main changes are: 1) Fix UAPI for BPF map iterator before it gets frozen to allow for more extensions/customization in future, from Yonghong Song. 2) Fix selftests build to undo verbose build output, from Andrii Nakryiko. 3) Fix inlining compilation error on bpf_do_trace_printk() due to variable argument lists, from Stanislav Fomichev. 4) Fix an uninitialized pointer warning at btf__parse_raw() in libbpf, from Daniel T. Lee. 5) Fix several compilation warnings in selftests with regards to ignoring return value, from Jianlin Lv. 6) Fix interruptions by switching off timeout for BPF tests, from Jiri Benc. ==================== Signed-off-by: David S. Miller <[email protected]>
2020-08-07	mptcp: more stable diag self-tests	Paolo Abeni	1	-4/+5
	During diag self-tests we introduce long wait in the mptcp test program to give the script enough time to access the sockets dump. Such wait is introduced after shutting down one sockets end. Since commit 43b54c6ee382 ("mptcp: Use full MPTCP-level disconnect state machine") if both sides shutdown the socket is correctly transitioned into CLOSED status. As a side effect some sockets are not dumped via the diag interface, because the socket state (CLOSED) does not match the default filter, and this cause self-tests instability. Address the issue moving the above mentioned wait before shutting down the socket. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/68 Fixes: df62f2ec3df6 ("selftests/mptcp: add diag interface tests") Tested-and-acked-by: Matthieu Baerts <[email protected]> Signed-off-by: Paolo Abeni <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-08-07	selftests: mptcp: fix dependecies	Paolo Abeni	1	-0/+2
	Since commit df62f2ec3df6 ("selftests/mptcp: add diag interface tests") the MPTCP selftests relies on the MPTCP diag interface which is enabled by a specific kconfig knob: be sure to include it. Fixes: df62f2ec3df6 ("selftests/mptcp: add diag interface tests") Signed-off-by: Paolo Abeni <[email protected]> Reviewed-by: Matthieu Baerts <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-08-07	selftests/bpf: Fix silent Makefile output	Andrii Nakryiko	1	-22/+26
	99aacebecb75 ("selftests: do not use .ONESHELL") removed .ONESHELL, which changes how Makefile "silences" multi-command target recipes. selftests/bpf's Makefile relied (a somewhat unknowingly) on .ONESHELL behavior of silencing all commands within the recipe if the first command contains @ symbol. Removing .ONESHELL exposed this hack. This patch fixes the issue by explicitly silencing each command with $(Q). Also explicitly define fallback rule for building .o from .c, instead of relying on non-silent inherited rule. This was causing a non-silent output for bench.o object file. Fixes: 92f7440ecc93 ("selftests/bpf: More succinct Makefile output") Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-08-06	bpf: Fix compilation warning of selftests	Jianlin Lv	3	-14/+21
	Clang compiler version: 12.0.0 The following warning appears during the selftests/bpf compilation: prog_tests/send_signal.c:51:3: warning: ignoring return value of ‘write’, declared with attribute warn_unused_result [-Wunused-result] 51 \| write(pipe_c2p[1], buf, 1); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~ prog_tests/send_signal.c:54:3: warning: ignoring return value of ‘read’, declared with attribute warn_unused_result [-Wunused-result] 54 \| read(pipe_p2c[0], buf, 1); \| ^~~~~~~~~~~~~~~~~~~~~~~~~ ...... prog_tests/stacktrace_build_id_nmi.c:13:2: warning: ignoring return value of ‘fscanf’,declared with attribute warn_unused_result [-Wunused-resul] 13 \| fscanf(f, "%llu", &sample_freq); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ test_tcpnotify_user.c:133:2: warning:ignoring return value of ‘system’, declared with attribute warn_unused_result [-Wunused-result] 133 \| system(test_script); \| ^~~~~~~~~~~~~~~~~~~ test_tcpnotify_user.c:138:2: warning:ignoring return value of ‘system’, declared with attribute warn_unused_result [-Wunused-result] 138 \| system(test_script); \| ^~~~~~~~~~~~~~~~~~~ test_tcpnotify_user.c:143:2: warning:ignoring return value of ‘system’, declared with attribute warn_unused_result [-Wunused-result] 143 \| system(test_script); \| ^~~~~~~~~~~~~~~~~~~ Add code that fix compilation warning about ignoring return value and handles any errors; Check return value of library`s API make the code more secure. Signed-off-by: Jianlin Lv <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-08-06	selftests: bpf: Switch off timeout	Jiri Benc	1	-0/+1
	Several bpf tests are interrupted by the default timeout of 45 seconds added by commit 852c8cbf34d3 ("selftests/kselftest/runner.sh: Add 45 second timeout per test"). In my case it was test_progs, test_tunnel.sh, test_lwt_ip_encap.sh and test_xdping.sh. There's not much value in having a timeout for bpf tests, switch it off. Signed-off-by: Jiri Benc <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/7a9198ed10917f4ecab4a3dd74bcda1200791efd.1596739059.git.jbenc@redhat.com
2020-08-06	bpf: Add missing return to resolve_btfids	Stanislav Fomichev	1	-0/+1
	int sets_patch(struct object *obj) doesn't have a 'return 0' at the end. Fixes: fbbb68de80a4 ("bpf: Add resolve_btfids tool to resolve BTF IDs in ELF object") Signed-off-by: Stanislav Fomichev <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-08-06	libbf: Fix uninitialized pointer at btf__parse_raw()	Daniel T. Lee	1	-1/+1
	Recently, from commit 94a1fedd63ed ("libbpf: Add btf__parse_raw() and generic btf__parse() APIs"), new API has been added to libbpf that allows to parse BTF from raw data file (btf__parse_raw()). The commit derives build failure of samples/bpf due to improper access of uninitialized pointer at btf_parse_raw(). btf.c: In function btf__parse_raw: btf.c:625:28: error: btf may be used uninitialized in this function 625 \| return err ? ERR_PTR(err) : btf; \| ~~~~~~~~~~~~~~~~~~~^~~~~ This commit fixes the build failure of samples/bpf by adding code of initializing btf pointer as NULL. Fixes: 94a1fedd63ed ("libbpf: Add btf__parse_raw() and generic btf__parse() APIs") Signed-off-by: Daniel T. Lee <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: John Fastabend <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-08-06	tools/bpf: Support new uapi for map element bpf iterator	Yonghong Song	7	-25/+58
	Previous commit adjusted kernel uapi for map element bpf iterator. This patch adjusted libbpf API due to uapi change. bpftool and bpf_iter selftests are also changed accordingly. Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Acked-by: John Fastabend <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-08-06	selftests/bpf: Prevent runqslower from racing on building bpftool	Andrii Nakryiko	1	-2/+3
	runqslower's Makefile is building/installing bpftool into $(OUTPUT)/sbin/bpftool, which coincides with $(DEFAULT_BPFTOOL). In practice this means that often when building selftests from scratch (after `make clean`), selftests are racing with runqslower to simultaneously build bpftool and one of the two processes fail due to file being busy. Prevent this race by explicitly order-depending on $(BPFTOOL_DEFAULT). Fixes: a2c9652f751e ("selftests: Refactor build to remove tools/lib/bpf from include path") Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: John Fastabend <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-08-06	Merge tag 'csky-for-linus-5.9-rc1' of https://github.com/c-sky/csky-linux	Linus Torvalds	1	-1/+11
	Pull arch/csky updates from Guo Ren: "New features: - seccomp-filter - err-injection - top-down&random mmap-layout - irq_work - show_ipi - context-tracking Fixes & Optimizations: - kprobe_on_ftrace - optimize panic print" * tag 'csky-for-linus-5.9-rc1' of https://github.com/c-sky/csky-linux: csky: Add context tracking support csky: Add arch_show_interrupts for IPI interrupts csky: Add irq_work support csky: Fixup warning by EXPORT_SYMBOL(kmap) csky: Set CONFIG_NR_CPU 4 as default csky: Use top-down mmap layout csky: Optimize the trap processing flow csky: Add support for function error injection csky: Fixup kprobes handler couldn't change pc csky: Fixup duplicated restore sp in RESTORE_REGS_FTRACE csky: Add cpu feature register hint for smp csky: Add SECCOMP_FILTER supported csky: remove unusued thread_saved_pc and *_segments functions/macros
2020-08-06	Merge tag 'xtensa-20200805' of git://github.com/jcmvbkbc/linux-xtensa	Linus Torvalds	1	-1/+15
	Pull Xtensa updates from Max Filippov: - add syscall audit support - add seccomp filter support - clean up make rules under arch/xtensa/boot - fix state management for exclusive access opcodes - fix build with PMU enabled * tag 'xtensa-20200805' of git://github.com/jcmvbkbc/linux-xtensa: xtensa: add missing exclusive access state management xtensa: fix xtensa_pmu_setup prototype xtensa: add boot subdirectories build artifacts to 'targets' xtensa: add uImage and xipImage to targets xtensa: move vmlinux.bin[.gz] to boot subdirectory xtensa: initialize_mmu.h: fix a duplicated word selftests/seccomp: add xtensa support xtensa: add seccomp support xtensa: expose syscall through user_pt_regs xtensa: add audit support
2020-08-06	Merge tag 'hyperv-next-signed' of ↵	Linus Torvalds	1	-1/+1
	git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv updates from Wei Liu: - A patch series from Andrea to improve vmbus code - Two clean-up patches from Alexander and Randy * tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: hyperv: hyperv.h: drop a duplicated word tools: hv: change http to https in hv_kvp_daemon.c Drivers: hv: vmbus: Remove the lock field from the vmbus_channel struct scsi: storvsc: Introduce the per-storvsc_device spinlock Drivers: hv: vmbus: Remove unnecessary channel->lock critical sections (sc_list updaters) Drivers: hv: vmbus: Use channel_mutex in channel_vp_mapping_show() Drivers: hv: vmbus: Remove unnecessary channel->lock critical sections (sc_list readers) Drivers: hv: vmbus: Replace cpumask_test_cpu(, cpu_online_mask) with cpu_online() Drivers: hv: vmbus: Remove the numa_node field from the vmbus_channel struct Drivers: hv: vmbus: Remove the target_vp field from the vmbus_channel struct
2020-08-05	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next	Linus Torvalds	198	-1663/+14723
	Pull networking updates from David Miller: 1) Support 6Ghz band in ath11k driver, from Rajkumar Manoharan. 2) Support UDP segmentation in code TSO code, from Eric Dumazet. 3) Allow flashing different flash images in cxgb4 driver, from Vishal Kulkarni. 4) Add drop frames counter and flow status to tc flower offloading, from Po Liu. 5) Support n-tuple filters in cxgb4, from Vishal Kulkarni. 6) Various new indirect call avoidance, from Eric Dumazet and Brian Vazquez. 7) Fix BPF verifier failures on 32-bit pointer arithmetic, from Yonghong Song. 8) Support querying and setting hardware address of a port function via devlink, use this in mlx5, from Parav Pandit. 9) Support hw ipsec offload on bonding slaves, from Jarod Wilson. 10) Switch qca8k driver over to phylink, from Jonathan McDowell. 11) In bpftool, show list of processes holding BPF FD references to maps, programs, links, and btf objects. From Andrii Nakryiko. 12) Several conversions over to generic power management, from Vaibhav Gupta. 13) Add support for SO_KEEPALIVE et al. to bpf_setsockopt(), from Dmitry Yakunin. 14) Various https url conversions, from Alexander A. Klimov. 15) Timestamping and PHC support for mscc PHY driver, from Antoine Tenart. 16) Support bpf iterating over tcp and udp sockets, from Yonghong Song. 17) Support 5GBASE-T i40e NICs, from Aleksandr Loktionov. 18) Add kTLS RX HW offload support to mlx5e, from Tariq Toukan. 19) Fix the ->ndo_start_xmit() return type to be netdev_tx_t in several drivers. From Luc Van Oostenryck. 20) XDP support for xen-netfront, from Denis Kirjanov. 21) Support receive buffer autotuning in MPTCP, from Florian Westphal. 22) Support EF100 chip in sfc driver, from Edward Cree. 23) Add XDP support to mvpp2 driver, from Matteo Croce. 24) Support MPTCP in sock_diag, from Paolo Abeni. 25) Commonize UDP tunnel offloading code by creating udp_tunnel_nic infrastructure, from Jakub Kicinski. 26) Several pci_ --> dma_ API conversions, from Christophe JAILLET. 27) Add FLOW_ACTION_POLICE support to mlxsw, from Ido Schimmel. 28) Add SK_LOOKUP bpf program type, from Jakub Sitnicki. 29) Refactor a lot of networking socket option handling code in order to avoid set_fs() calls, from Christoph Hellwig. 30) Add rfc4884 support to icmp code, from Willem de Bruijn. 31) Support TBF offload in dpaa2-eth driver, from Ioana Ciornei. 32) Support XDP_REDIRECT in qede driver, from Alexander Lobakin. 33) Support PCI relaxed ordering in mlx5 driver, from Aya Levin. 34) Support TCP syncookies in MPTCP, from Flowian Westphal. 35) Fix several tricky cases of PMTU handling wrt. briding, from Stefano Brivio. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2056 commits) net: thunderx: initialize VF's mailbox mutex before first usage usb: hso: remove bogus check for EINPROGRESS usb: hso: no complaint about kmalloc failure hso: fix bailout in error case of probe ip_tunnel_core: Fix build for archs without _HAVE_ARCH_IPV6_CSUM selftests/net: relax cpu affinity requirement in msg_zerocopy test mptcp: be careful on subflow creation selftests: rtnetlink: make kci_test_encap() return sub-test result selftests: rtnetlink: correct the final return value for the test net: dsa: sja1105: use detected device id instead of DT one on mismatch tipc: set ub->ifindex for local ipv6 address ipv6: add ipv6_dev_find() net: openvswitch: silence suspicious RCU usage warning Revert "vxlan: fix tos value before xmit" ptp: only allow phase values lower than 1 period farsync: switch from 'pci_' to 'dma_' API wan: wanxl: switch from 'pci_' to 'dma_' API hv_netvsc: do not use VF device if link is down dpaa2-eth: Fix passing zero to 'PTR_ERR' warning net: macb: Properly handle phylink on at91sam9x ...
2020-08-05	Merge tag 'for-linus-hmm' of ↵	Linus Torvalds	1	-4/+90
	git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Pull hmm updates from Jason Gunthorpe: "Ralph has been working on nouveau's use of hmm_range_fault() and migrate_vma() which resulted in this small series. It adds reporting of the page table order from hmm_range_fault() and some optimization of migrate_vma(): - Report the size of the page table mapping out of hmm_range_fault(). This makes it easier to establish a large/huge/etc mapping in the device's page table. - Allow devices to ignore the invalidations during migration in cases where the migration is not going to change pages. For instance migrating pages to a device does not require the device to invalidate pages already in the device. - Update nouveau and hmm_tests to use the above" * tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: mm/hmm/test: use the new migration invalidation nouveau/svm: use the new migration invalidation mm/notifier: add migration invalidation type mm/migrate: add a flags parameter to migrate_vma nouveau: fix storing invalid ptes nouveau/hmm: support mapping large sysmem pages nouveau: fix mapping 2MB sysmem pages nouveau/hmm: fault one page at a time mm/hmm: add tests for hmm_pfn_to_map_order() mm/hmm: provide the page mapping order in hmm_range_fault()
2020-08-05	Merge tag 'gpio-v5.9-1' of ↵	Linus Torvalds	3	-4/+6
	git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO updates from Linus Walleij: "This is the bulk of GPIO changes for the v5.9 kernel cycle. There is nothing too exciting in it, but a new macro that fixes a build failure on a minor ARM32 platform that appeared yesterday is part of it so we better merge it. Core changes: - Introduce the for_each_requested_gpio() macro to help in dependent code all over the place. Also patch a few locations to use it while we are at it. - Split out the sysfs code into its own file. - Split out the character device code into its own file, then make a set of refactorings and improvements to this code. We are setting the stage to revamp the userspace API a bit in the next cycle. - Fix a whole slew of kerneldoc that was wrong or missing. New drivers: - The PCA953x driver now supports the PCAL9535. Driver improvements: - A host of incremental modernizations and improvements to the PCA953x driver. - Incremental improvements to the Xilinx Zynq driver. - Some improvements to the GPIO aggregator driver. - I ran all over the place switching all threaded and other drivers requesting their own IRQ while using the core GPIO IRQ helpers to pass the GPIO irq chip as a template instead of calling the explicit set-up functions. Next merge window we may retire the old code altogether" * tag 'gpio-v5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (97 commits) gpio: wcove: Request IRQ after all initialisation done gpio: crystalcove: Free IRQ on error path gpio: pca953x: Request IRQ after all initialisation done gpio: don't use same lockdep class for all devm_gpiochip_add_data users gpio: max732x: Use irqchip template gpio: stmpe: Move chip registration gpio: rcar: Use irqchip template gpio: regmap: fix type clash gpio: Correct kernel-doc inconsistency gpio: pci-idio-16: Use irqchip template gpio: pcie-idio-24: Use irqchip template gpio: 104-idio-16: Use irqchip template gpio: 104-idi-48: Use irqchip template gpio: 104-dio-48e: Use irqchip template gpio: ws16c48: Use irqchip template gpio: omap: improve coding style for pin config flags gpio: dln2: Use irqchip template gpio: sch: Add a blank line between declaration and code gpio: sch: changed every 'unsigned' to 'unsigned int' gpio: ich: changed every 'unsigned' to 'unsigned int' ...
2020-08-05	selftests/net: relax cpu affinity requirement in msg_zerocopy test	Willem de Bruijn	1	-3/+2
	The msg_zerocopy test pins the sender and receiver threads to separate cores to reduce variance between runs. But it hardcodes the cores and skips core 0, so it fails on machines with the selected cores offline, or simply fewer cores. The test mainly gives code coverage in automated runs. The throughput of zerocopy ('-z') and non-zerocopy runs is logged for manual inspection. Continue even when sched_setaffinity fails. Just log to warn anyone interpreting the data. Fixes: 07b65c5b31ce ("test: add msg_zerocopy test") Reported-by: Colin Ian King <[email protected]> Signed-off-by: Willem de Bruijn <[email protected]> Acked-by: Colin Ian King <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-08-05	selftests: rtnetlink: make kci_test_encap() return sub-test result	Po-Hsu Lin	1	-0/+3
	kci_test_encap() is actually composed by two different sub-tests, kci_test_encap_vxlan() and kci_test_encap_fou() Therefore we should check the test result of these two in kci_test_encap() to let the script be aware of the pass / fail status. Otherwise it will generate false-negative result like below: $ sudo ./test.sh PASS: policy routing PASS: route get PASS: preferred_lft addresses have expired PASS: promote_secondaries complete PASS: tc htb hierarchy PASS: gre tunnel endpoint PASS: gretap PASS: ip6gretap PASS: erspan PASS: ip6erspan PASS: bridge setup PASS: ipv6 addrlabel PASS: set ifalias 5b193daf-0a08-46d7-af2c-e7aadd422ded for test-dummy0 PASS: vrf PASS: vxlan FAIL: can't add fou port 7777, skipping test PASS: macsec PASS: bridge fdb get PASS: neigh get $ echo $? 0 Signed-off-by: Po-Hsu Lin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-08-05	selftests: rtnetlink: correct the final return value for the test	Po-Hsu Lin	1	-22/+43
	The return value "ret" will be reset to 0 from the beginning of each sub-test in rtnetlink.sh, therefore this test will always pass if the last sub-test has passed: $ sudo ./rtnetlink.sh PASS: policy routing PASS: route get PASS: preferred_lft addresses have expired PASS: promote_secondaries complete PASS: tc htb hierarchy PASS: gre tunnel endpoint PASS: gretap PASS: ip6gretap PASS: erspan PASS: ip6erspan PASS: bridge setup PASS: ipv6 addrlabel PASS: set ifalias a39ee707-e36b-41d3-802f-63179ed4d580 for test-dummy0 PASS: vrf PASS: vxlan FAIL: can't add fou port 7777, skipping test PASS: macsec PASS: ipsec 3,7c3,7 < sa[0] spi=0x00000009 proto=0x32 salt=0x64636261 crypt=1 < sa[0] key=0x31323334 35363738 39303132 33343536 < sa[1] rx ipaddr=0x00000000 00000000 00000000 c0a87b03 < sa[1] spi=0x00000009 proto=0x32 salt=0x64636261 crypt=1 < sa[1] key=0x31323334 35363738 39303132 33343536 --- > sa[0] spi=0x00000009 proto=0x32 salt=0x61626364 crypt=1 > sa[0] key=0x34333231 38373635 32313039 36353433 > sa[1] rx ipaddr=0x00000000 00000000 00000000 037ba8c0 > sa[1] spi=0x00000009 proto=0x32 salt=0x61626364 crypt=1 > sa[1] key=0x34333231 38373635 32313039 36353433 FAIL: ipsec_offload incorrect driver data FAIL: ipsec_offload PASS: bridge fdb get PASS: neigh get $ echo $? 0 Make "ret" become a local variable for all sub-tests. Also, check the sub-test results in kci_test_rtnl() and return the final result for this test. Signed-off-by: Po-Hsu Lin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-08-05	Merge tag 'usb-5.9-rc1' of ↵	Linus Torvalds	4	-4/+6
	git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB/Thunderbolt updates from Greg KH: "Here is the large set of USB and Thunderbolt patches for 5.9-rc1. Nothing really magic/major in here, just lots of little changes and updates: - clean up language usages in USB core and some drivers - Thunderbolt driver updates and additions - USB Gadget driver updates - dwc3 driver updates (like always...) - build with "W=1" warning fixups - mtu3 driver updates - usb-serial driver updates and device ids - typec additions and updates for new hardware - xhci debug code updates for future platforms - cdns3 driver updates - lots of other minor driver updates and fixes and cleanups All of these have been in linux-next for a while with no reported issues" * tag 'usb-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (330 commits) usb: common: usb-conn-gpio: Register charger usb: mtu3: simplify mtu3_req_complete() usb: mtu3: clear dual mode of u3port when disable device usb: mtu3: use MTU3_EP_WEDGE flag usb: mtu3: remove useless member @busy in mtu3_ep struct usb: mtu3: remove repeated error log usb: mtu3: add ->udc_set_speed() usb: mtu3: introduce a funtion to check maximum speed usb: mtu3: clear interrupts status when disable interrupts usb: mtu3: reinitialize CSR registers usb: mtu3: fix macro for maximum number of packets usb: mtu3: remove unnecessary pointer checks usb: xhci: Fix ASMedia ASM1142 DMA addressing usb: xhci: define IDs for various ASMedia host controllers usb: musb: convert to devm_platform_ioremap_resource_byname usb: gadget: tegra-xudc: convert to devm_platform_ioremap_resource_byname usb: gadget: r8a66597: convert to devm_platform_ioremap_resource_byname usb: dwc3: convert to devm_platform_ioremap_resource_byname usb: cdns3: convert to devm_platform_ioremap_resource_byname usb: phy: am335x: convert to devm_platform_ioremap_resource_byname ...
2020-08-05	Merge tag 'driver-core-5.9-rc1' of ↵	Linus Torvalds	2	-1/+13
	git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the "big" set of changes to the driver core, and some drivers using the changes, for 5.9-rc1. "Biggest" thing in here is the device link exposure in sysfs, to help to tame the madness that is SoC device tree representations and driver interactions with it. Other stuff in here that is interesting is: - device probe log helper so that drivers can report problems in a unified way easier. - devres functions added - DEVICE_ATTR_ADMIN_* macro added to make it harder to write incorrect sysfs file permissions - documentation cleanups - ability for debugfs to be present in the kernel, yet not exposed to userspace. Needed for systems that want it enabled, but do not trust users, so they can still use some kernel functions that were otherwise disabled. - other minor fixes and cleanups The patches outside of drivers/base/ all have acks from the respective subsystem maintainers to go through this tree instead of theirs. All of these have been in linux-next with no reported issues" * tag 'driver-core-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (39 commits) drm/bridge: lvds-codec: simplify error handling drm/bridge/sii8620: fix resource acquisition error handling driver core: add deferring probe reason to devices_deferred property driver core: add device probe log helper driver core: Avoid binding drivers to dead devices Revert "test_firmware: Test platform fw loading on non-EFI systems" firmware_loader: EFI firmware loader must handle pre-allocated buffer selftest/firmware: Add selftest timeout in settings test_firmware: Test platform fw loading on non-EFI systems driver core: Change delimiter in devlink device's name to "--" debugfs: Add access restriction option tracefs: Remove unnecessary debug_fs checks. driver core: Fix probe_count imbalance in really_probe() kobject: remove unused KOBJ_MAX action driver core: Fix sleeping in invalid context during device link deletion driver core: Add waiting_for_supplier sysfs file for devices driver core: Add state_synced sysfs file for devices that support it driver core: Expose device link details in sysfs driver core: Drop mention of obsolete bus rwsem from kernel-doc debugfs: file: Remove unnecessary cast in kfree() ...
2020-08-05	Merge tag 'char-misc-5.9-rc1' of ↵	Linus Torvalds	2	-0/+7
	git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver updates from Greg KH: "Here is the large set of char and misc and other driver subsystem patches for 5.9-rc1. Lots of new driver submissions in here, and cleanups and features for existing drivers. Highlights are: - habanalabs driver updates - coresight driver updates - nvmem driver updates - huge number of "W=1" build warning cleanups from Lee Jones - dyndbg updates - virtbox driver fixes and updates - soundwire driver updates - mei driver updates - phy driver updates - fpga driver updates - lots of smaller individual misc/char driver cleanups and fixes Full details are in the shortlog. All of these have been in linux-next with no reported issues" * tag 'char-misc-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (322 commits) habanalabs: remove unused but set variable 'ctx_asid' nvmem: qcom-spmi-sdam: Enable multiple devices dt-bindings: nvmem: SID: add binding for A100's SID controller nvmem: update Kconfig description nvmem: qfprom: Add fuse blowing support dt-bindings: nvmem: Add properties needed for blowing fuses dt-bindings: nvmem: qfprom: Convert to yaml nvmem: qfprom: use NVMEM_DEVID_AUTO for multiple instances nvmem: core: add support to auto devid nvmem: core: Add nvmem_cell_read_u8() nvmem: core: Grammar fixes for help text nvmem: sc27xx: add sc2730 efuse support nvmem: Enforce nvmem stride in the sysfs interface MAINTAINERS: Add git tree for NVMEM FRAMEWORK nvmem: sprd: Fix return value of sprd_efuse_probe() drivers: android: Fix the SPDX comment style drivers: android: Fix a variable declaration coding style issue drivers: android: Remove braces for a single statement if-else block drivers: android: Remove the use of else after return drivers: android: Fix a variable declaration coding style issue ...
2020-08-05	Merge tag 'linux-kselftest-5.9-rc1' of ↵	Linus Torvalds	17	-247/+458
	git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kselftest updates form Shuah Khan: - TAP output reporting related fixes from Paolo Bonzini and Kees Cook. These fixes make it skip reporting consistent with TAP format. - Cleanup fixes to framework run_tests from Yauheni Kaliuta * tag 'linux-kselftest-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (23 commits) selftests/harness: Limit step counter reporting selftests/seccomp: Check ENOSYS under tracing selftests/seccomp: Refactor to use fixture variants selftests/harness: Clean up kern-doc for fixtures selftests: kmod: Add module address visibility test Replace HTTP links with HTTPS ones: KMOD KERNEL MODULE LOADER - USERMODE HELPER selftests: fix condition in run_tests selftests: do not use .ONESHELL selftests: pidfd: skip test if unshare fails with EPERM selftests: pidfd: do not use ksft_exit_skip after ksft_set_plan selftests/harness: Report skip reason selftests/harness: Display signed values correctly selftests/harness: Refactor XFAIL into SKIP selftests/harness: Switch to TAP output selftests: Add header documentation and helpers selftests/binderfs: Fix harness API usage selftests: Remove unneeded selftest API headers selftests/clone3: Reorder reporting output selftests: sync_test: do not use ksft_exit_skip after ksft_set_plan selftests: sigaltstack: do not use ksft_exit_skip after ksft_set_plan ...
2020-08-05	Merge tag 'linux-kselftest-kunit-5.9-rc1' of ↵	Linus Torvalds	3	-34/+10
	git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kunit updates from Shuah Khan: - Add a generic kunit_resource API extending it to support resources that are passed in to kunit in addition kunit allocated resources. In addition, KUnit resources are now refcounted to avoid passed in resources being released while in use by kunit. - Add support for named resources. - Important bug fixes from Brendan Higgins and Will Chen * tag 'linux-kselftest-kunit-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: kunit: tool: fix improper treatment of file location kunit: tool: fix broken default args in unit tests kunit: capture stderr on all make subprocess calls Documentation: kunit: Remove references to --defconfig kunit: add support for named resources kunit: generalize kunit_resource API beyond allocated resources
2020-08-04	Merge tag 'docs-5.9' of git://git.lwn.net/linux	Linus Torvalds	1	-1/+1
	Pull documentation updates from Jonathan Corbet: "It's been a busy cycle for documentation - hopefully the busiest for a while to come. Changes include: - Some new Chinese translations - Progress on the battle against double words words and non-HTTPS URLs - Some block-mq documentation - More RST conversions from Mauro. At this point, that task is essentially complete, so we shouldn't see this kind of churn again for a while. Unless we decide to switch to asciidoc or something...:) - Lots of typo fixes, warning fixes, and more" * tag 'docs-5.9' of git://git.lwn.net/linux: (195 commits) scripts/kernel-doc: optionally treat warnings as errors docs: ia64: correct typo mailmap: add entry for <[email protected]> doc/zh_CN: add cpu-load Chinese version Documentation/admin-guide: tainted-kernels: fix spelling mistake MAINTAINERS: adjust kprobes.rst entry to new location devices.txt: document rfkill allocation PCI: correct flag name docs: filesystems: vfs: correct flag name docs: filesystems: vfs: correct sync_mode flag names docs: path-lookup: markup fixes for emphasis docs: path-lookup: more markup fixes docs: path-lookup: fix HTML entity mojibake CREDITS: Replace HTTP links with HTTPS ones docs: process: Add an example for creating a fixes tag doc/zh_CN: add Chinese translation prefer section doc/zh_CN: add clearing-warn-once Chinese version doc/zh_CN: add admin-guide index doc:it_IT: process: coding-style.rst: Correct __maybe_unused compiler label futex: MAINTAINERS: Re-add selftests directory ...
2020-08-04	Merge tag 'x86-fsgsbase-2020-08-04' of ↵	Linus Torvalds	4	-6/+295
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fsgsbase from Thomas Gleixner: "Support for FSGSBASE. Almost 5 years after the first RFC to support it, this has been brought into a shape which is maintainable and actually works. This final version was done by Sasha Levin who took it up after Intel dropped the ball. Sasha discovered that the SGX (sic!) offerings out there ship rogue kernel modules enabling FSGSBASE behind the kernels back which opens an instantanious unpriviledged root hole. The FSGSBASE instructions provide a considerable speedup of the context switch path and enable user space to write GSBASE without kernel interaction. This enablement requires careful handling of the exception entries which go through the paranoid entry path as they can no longer rely on the assumption that user GSBASE is positive (as enforced via prctl() on non FSGSBASE enabled systemn). All other entries (syscalls, interrupts and exceptions) can still just utilize SWAPGS unconditionally when the entry comes from user space. Converting these entries to use FSGSBASE has no benefit as SWAPGS is only marginally slower than WRGSBASE and locating and retrieving the kernel GSBASE value is not a free operation either. The real benefit of RD/WRGSBASE is the avoidance of the MSR reads and writes. The changes come with appropriate selftests and have held up in field testing against the (sanitized) Graphene-SGX driver" * tag 'x86-fsgsbase-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits) x86/fsgsbase: Fix Xen PV support x86/ptrace: Fix 32-bit PTRACE_SETREGS vs fsbase and gsbase selftests/x86/fsgsbase: Add a missing memory constraint selftests/x86/fsgsbase: Fix a comment in the ptrace_write_gsbase test selftests/x86: Add a syscall_arg_fault_64 test for negative GSBASE selftests/x86/fsgsbase: Test ptracer-induced GS base write with FSGSBASE selftests/x86/fsgsbase: Test GS selector on ptracer-induced GS base write Documentation/x86/64: Add documentation for GS/FS addressing mode x86/elf: Enumerate kernel FSGSBASE capability in AT_HWCAP2 x86/cpu: Enable FSGSBASE on 64bit by default and add a chicken bit x86/entry/64: Handle FSGSBASE enabled paranoid entry/exit x86/entry/64: Introduce the FIND_PERCPU_BASE macro x86/entry/64: Switch CR3 before SWAPGS in paranoid entry x86/speculation/swapgs: Check FSGSBASE in enabling SWAPGS mitigation x86/process/64: Use FSGSBASE instructions on thread copy and ptrace x86/process/64: Use FSBSBASE in switch_to() if available x86/process/64: Make save_fsgs_for_kvm() ready for FSGSBASE x86/fsgsbase/64: Enable FSGSBASE instructions in helper functions x86/fsgsbase/64: Add intrinsics for FSGSBASE instructions x86/cpu: Add 'unsafe_fsgsbase' to enable CR4.FSGSBASE ...
2020-08-04	Merge tag 'close-range-v5.9' of ↵	Linus Torvalds	4	-0/+236
	git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull close_range() implementation from Christian Brauner: "This adds the close_range() syscall. It allows to efficiently close a range of file descriptors up to all file descriptors of a calling task. This is coordinated with the FreeBSD folks which have copied our version of this syscall and in the meantime have already merged it in April 2019: https://reviews.freebsd.org/D21627 https://svnweb.freebsd.org/base?view=revision&revision=359836 The syscall originally came up in a discussion around the new mount API and making new file descriptor types cloexec by default. During this discussion, Al suggested the close_range() syscall. First, it helps to close all file descriptors of an exec()ing task. This can be done safely via (quoting Al's example from [1] verbatim): /* that exec is sensitive / unshare(CLONE_FILES); / we don't want anything past stderr here / close_range(3, ~0U); execve(....); The code snippet above is one way of working around the problem that file descriptors are not cloexec by default. This is aggravated by the fact that we can't just switch them over without massively regressing userspace. For a whole class of programs having an in-kernel method of closing all file descriptors is very helpful (e.g. demons, service managers, programming language standard libraries, container managers etc.). Second, it allows userspace to avoid implementing closing all file descriptors by parsing through /proc/<pid>/fd/ and calling close() on each file descriptor and other hacks. From looking at various large(ish) userspace code bases this or similar patterns are very common in service managers, container runtimes, and programming language runtimes/standard libraries such as Python or Rust. In addition, the syscall will also work for tasks that do not have procfs mounted and on kernels that do not have procfs support compiled in. In such situations the only way to make sure that all file descriptors are closed is to call close() on each file descriptor up to UINT_MAX or RLIMIT_NOFILE, OPEN_MAX trickery. Based on Linus' suggestion close_range() also comes with a new flag CLOSE_RANGE_UNSHARE to more elegantly handle file descriptor dropping right before exec. This would usually be expressed in the sequence: unshare(CLONE_FILES); close_range(3, ~0U); as pointed out by Linus it might be desirable to have this be a part of close_range() itself under a new flag CLOSE_RANGE_UNSHARE which gets especially handy when we're closing all file descriptors above a certain threshold. Test-suite as always included" * tag 'close-range-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: tests: add CLOSE_RANGE_UNSHARE tests close_range: add CLOSE_RANGE_UNSHARE tests: add close_range() tests arch: wire-up close_range() open: add close_range()
2020-08-04	Merge tag 'cap-checkpoint-restore-v5.9' of ↵	Linus Torvalds	3	-1/+186
	git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull checkpoint-restore updates from Christian Brauner: "This enables unprivileged checkpoint/restore of processes. Given that this work has been going on for quite some time the first sentence in this summary is hopefully more exciting than the actual final code changes required. Unprivileged checkpoint/restore has seen a frequent increase in interest over the last two years and has thus been one of the main topics for the combined containers & checkpoint/restore microconference since at least 2018 (cf. [1]). Here are just the three most frequent use-cases that were brought forward: - The JVM developers are integrating checkpoint/restore into a Java VM to significantly decrease the startup time. - In high-performance computing environment a resource manager will typically be distributing jobs where users are always running as non-root. Long-running and "large" processes with significant startup times are supposed to be checkpointed and restored with CRIU. - Container migration as a non-root user. In all of these scenarios it is either desirable or required to run without CAP_SYS_ADMIN. The userspace implementation of checkpoint/restore CRIU already has the pull request for supporting unprivileged checkpoint/restore up (cf. [2]). To enable unprivileged checkpoint/restore a new dedicated capability CAP_CHECKPOINT_RESTORE is introduced. This solution has last been discussed in 2019 in a talk by Google at Linux Plumbers (cf. [1] "Update on Task Migration at Google Using CRIU") with Adrian and Nicolas providing the implementation now over the last months. In essence, this allows the CRIU binary to be installed with the CAP_CHECKPOINT_RESTORE vfs capability set thereby enabling unprivileged users to restore processes. To make this possible the following permissions are altered: - Selecting a specific PID via clone3() set_tid relaxed from userns CAP_SYS_ADMIN to CAP_CHECKPOINT_RESTORE. - Selecting a specific PID via /proc/sys/kernel/ns_last_pid relaxed from userns CAP_SYS_ADMIN to CAP_CHECKPOINT_RESTORE. - Accessing /proc/pid/map_files relaxed from init userns CAP_SYS_ADMIN to init userns CAP_CHECKPOINT_RESTORE. - Changing /proc/self/exe from userns CAP_SYS_ADMIN to userns CAP_CHECKPOINT_RESTORE. Of these four changes the /proc/self/exe change deserves a few words because the reasoning behind even restricting /proc/self/exe changes in the first place is just full of historical quirks and tracking this down was a questionable version of fun that I'd like to spare others. In short, it is trivial to change /proc/self/exe as an unprivileged user, i.e. without userns CAP_SYS_ADMIN right now. Either via ptrace() or by simply intercepting the elf loader in userspace during exec. Nicolas was nice enough to even provide a POC for the latter (cf. [3]) to illustrate this fact. The original patchset which introduced PR_SET_MM_MAP had no permissions around changing the exe link. They too argued that it is trivial to spoof the exe link already which is true. The argument brought up against this was that the Tomoyo LSM uses the exe link in tomoyo_manager() to detect whether the calling process is a policy manager. This caused changing the exe links to be guarded by userns CAP_SYS_ADMIN. All in all this rather seems like a "better guard it with something rather than nothing" argument which imho doesn't qualify as a great security policy. Again, because spoofing the exe link is possible for the calling process so even if this were security relevant it was broken back then and would be broken today. So technically, dropping all permissions around changing the exe link would probably be possible and would send a clearer message to any userspace that relies on /proc/self/exe for security reasons that they should stop doing this but for now we're only relaxing the exe link permissions from userns CAP_SYS_ADMIN to userns CAP_CHECKPOINT_RESTORE. There's a final uapi change in here. Changing the exe link used to accidently return EINVAL when the caller lacked the necessary permissions instead of the more correct EPERM. This pr contains a commit fixing this. I assume that userspace won't notice or care and if they do I will revert this commit. But since we are changing the permissions anyway it seems like a good opportunity to try this fix. With these changes merged unprivileged checkpoint/restore will be possible and has already been tested by various users" [1] LPC 2018 1. "Task Migration at Google Using CRIU" https://www.youtube.com/watch?v=yI_1cuhoDgA&t=12095 2. "Securely Migrating Untrusted Workloads with CRIU" https://www.youtube.com/watch?v=yI_1cuhoDgA&t=14400 LPC 2019 1. "CRIU and the PID dance" https://www.youtube.com/watch?v=LN2CUgp8deo&list=PLVsQ_xZBEyN30ZA3Pc9MZMFzdjwyz26dO&index=9&t=2m48s 2. "Update on Task Migration at Google Using CRIU" https://www.youtube.com/watch?v=LN2CUgp8deo&list=PLVsQ_xZBEyN30ZA3Pc9MZMFzdjwyz26dO&index=9&t=1h2m8s [2] https://github.com/checkpoint-restore/criu/pull/1155 [3] https://github.com/nviennot/run_as_exe * tag 'cap-checkpoint-restore-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: selftests: add clone3() CAP_CHECKPOINT_RESTORE test prctl: exe link permission error changed from -EINVAL to -EPERM prctl: Allow local CAP_CHECKPOINT_RESTORE to change /proc/self/exe proc: allow access in init userns for map_files with CAP_CHECKPOINT_RESTORE pid_namespace: use checkpoint_restore_ns_capable() for ns_last_pid pid: use checkpoint_restore_ns_capable() for set_tid capabilities: Introduce CAP_CHECKPOINT_RESTORE
2020-08-04	Merge tag 'threads-v5.9' of ↵	Linus Torvalds	2	-0/+80
	git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull thread updates from Christian Brauner: "This contains the changes to add the missing support for attaching to time namespaces via pidfds. Last cycle setns() was changed to support attaching to multiple namespaces atomically. This requires all namespaces to have a point of no return where they can't fail anymore. Specifically, <namespace-type>_install() is allowed to perform permission checks and install the namespace into the new struct nsset that it has been given but it is not allowed to make visible changes to the affected task. Once <namespace-type>_install() returns, anything that the given namespace type additionally requires to be setup needs to ideally be done in a function that can't fail or if it fails the failure must be non-fatal. For time namespaces the relevant functions that fell into this category were timens_set_vvar_page() and vdso_join_timens(). The latter could still fail although it didn't need to. This function is only implemented for vdso_join_timens() in current mainline. As discussed on-list (cf. [1]), in order to make setns() support time namespaces when attaching to multiple namespaces at once properly we changed vdso_join_timens() to always succeed. So vdso_join_timens() replaces the mmap_write_lock_killable() with mmap_read_lock(). Please note that arm is about to grow vdso support for time namespaces (possibly this merge window). We've synced on this change and arm64 also uses mmap_read_lock(), i.e. makes vdso_join_timens() a function that can't fail. Once the changes here and the arm64 changes have landed, vdso_join_timens() should be turned into a void function so it's obvious to callers and implementers on other architectures that the expectation is that it can't fail. We didn't do this right away because it would've introduced unnecessary merge conflicts between the two trees for no major gain. As always, tests included" [1]: https://lore.kernel.org/lkml/20200611110221.pgd3r5qkjrjmfqa2@wittgenstein * tag 'threads-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: tests: add CLONE_NEWTIME setns tests nsproxy: support CLONE_NEWTIME with setns() timens: add timens_commit() helper timens: make vdso_join_timens() always succeed
2020-08-04	Merge tag 'seccomp-v5.9-rc1' of ↵	Linus Torvalds	5	-232/+573
	git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull seccomp updates from Kees Cook: "There are a bunch of clean ups and selftest improvements along with two major updates to the SECCOMP_RET_USER_NOTIF filter return: EPOLLHUP support to more easily detect the death of a monitored process, and being able to inject fds when intercepting syscalls that expect an fd-opening side-effect (needed by both container folks and Chrome). The latter continued the refactoring of __scm_install_fd() started by Christoph, and in the process found and fixed a handful of bugs in various callers. - Improved selftest coverage, timeouts, and reporting - Add EPOLLHUP support for SECCOMP_RET_USER_NOTIF (Christian Brauner) - Refactor __scm_install_fd() into __receive_fd() and fix buggy callers - Introduce 'addfd' command for SECCOMP_RET_USER_NOTIF (Sargun Dhillon)" * tag 'seccomp-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (30 commits) selftests/seccomp: Test SECCOMP_IOCTL_NOTIF_ADDFD seccomp: Introduce addfd ioctl to seccomp user notifier fs: Expand __receive_fd() to accept existing fd pidfd: Replace open-coded receive_fd() fs: Add receive_fd() wrapper for __receive_fd() fs: Move __scm_install_fd() to __receive_fd() net/scm: Regularize compat handling of scm_detach_fds() pidfd: Add missing sock updates for pidfd_getfd() net/compat: Add missing sock updates for SCM_RIGHTS selftests/seccomp: Check ENOSYS under tracing selftests/seccomp: Refactor to use fixture variants selftests/harness: Clean up kern-doc for fixtures seccomp: Use -1 marker for end of mode 1 syscall list seccomp: Fix ioctl number for SECCOMP_IOCTL_NOTIF_ID_VALID selftests/seccomp: Rename user_trap_syscall() to user_notif_syscall() selftests/seccomp: Make kcmp() less required seccomp: Use pr_fmt selftests/seccomp: Improve calibration loop selftests/seccomp: use 90s as timeout selftests/seccomp: Expand benchmark to per-filter measurements ...
2020-08-04	Merge tag 'uninit-macro-v5.9-rc1' of ↵	Linus Torvalds	2	-4/+0
	git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull uninitialized_var() macro removal from Kees Cook: "This is long overdue, and has hidden too many bugs over the years. The series has several "by hand" fixes, and then a trivial treewide replacement. - Clean up non-trivial uses of uninitialized_var() - Update documentation and checkpatch for uninitialized_var() removal - Treewide removal of uninitialized_var()" * tag 'uninit-macro-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: compiler: Remove uninitialized_var() macro treewide: Remove uninitialized_var() usage checkpatch: Remove awareness of uninitialized_var() macro mm/debug_vm_pgtable: Remove uninitialized_var() usage f2fs: Eliminate usage of uninitialized_var() macro media: sur40: Remove uninitialized_var() usage KVM: PPC: Book3S PR: Remove uninitialized_var() usage clk: spear: Remove uninitialized_var() usage clk: st: Remove uninitialized_var() usage spi: davinci: Remove uninitialized_var() usage ide: Remove uninitialized_var() usage rtlwifi: rtl8192cu: Remove uninitialized_var() usage b43: Remove uninitialized_var() usage drbd: Remove uninitialized_var() usage x86/mm/numa: Remove uninitialized_var() usage docs: deprecated.rst: Add uninitialized_var()
2020-08-04	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf	David S. Miller	2	-1/+125
	Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Flush the cleanup xtables worker to make sure destructors have completed, from Florian Westphal. 2) iifgroup is matching erroneously, also from Florian. 3) Add selftest for meta interface matching, from Florian Westphal. 4) Move nf_ct_offload_timeout() to header, from Roi Dayan. 5) Call nf_ct_offload_timeout() from flow_offload_add() to make sure garbage collection does not evict offloaded flow, from Roi Dayan. ==================== Signed-off-by: David S. Miller <[email protected]>
2020-08-04	selftests: pmtu.sh: Add tests for UDP tunnels handled by Open vSwitch	Stefano Brivio	1	-0/+180
	The new tests check that IP and IPv6 packets exceeding the local PMTU estimate, forwarded by an Open vSwitch instance from another node, result in the correct route exceptions being created, and that communication with end-to-end fragmentation, over GENEVE and VXLAN Open vSwitch ports, is now possible as a result of PMTU discovery. Signed-off-by: Stefano Brivio <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-08-04	selftests: pmtu.sh: Add tests for bridged UDP tunnels	Stefano Brivio	1	-7/+159
	The new tests check that IP and IPv6 packets exceeding the local PMTU estimate, both locally generated and forwarded by a bridge from another node, result in the correct route exceptions being created, and that communication with end-to-end fragmentation over VXLAN and GENEVE tunnels is now possible as a result of PMTU discovery. Part of the existing setup functions aren't generic enough to simply add a namespace and a bridge to the existing routing setup. This rework is in progress and we can easily shrink this once more generic topology functions are available. Signed-off-by: Stefano Brivio <[email protected]> Reviewed-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-08-03	Merge tag 'acpi-5.9-rc1' of ↵	Linus Torvalds	1	-1/+1
	git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI updates from Rafael Wysocki: "These eliminate significant AML processing overhead related to using operation regions in system memory, update the ACPICA code in the kernel to upstream revision 20200717 (including a fix to prevent operation region reference counts from overflowing in some cases), remove the last bits of the (long deprecated) ACPI procfs interface and do some assorted cleanups. Specifics: - Eliminate significant AML processing overhead related to using operation regions in system memory by reworking the management of memory mappings in the ACPI code to defer unmap operations (to do them outside of the ACPICA locks, among other things) and making the memory operation reagion handler avoid releasing memory mappings created by it too early (Rafael Wysocki). - Update the ACPICA code in the kernel to upstream revision 20200717: * Prevent operation region reference counts from overflowing in some cases (Erik Kaneda). * Replace one-element array with flexible-array (Gustavo A. R. Silva). - Fix ACPI PCI hotplug reference counting (Rafael Wysocki). - Drop last bits of the ACPI procfs interface (Thomas Renninger). - Drop some redundant checks from the code parsing ACPI tables related to NUMA (Hanjun Guo). - Avoid redundant object evaluation in the ACPI device properties handling code (Heikki Krogerus). - Avoid unecessary memory overhead related to storing the signatures of the ACPI tables recognized by the kernel (Ard Biesheuvel). - Add missing newline characters when printing module parameter values in some places (Xiongfeng Wang). - Update the link to the ACPI specifications in some places (Tiezhu Yang). - Use the fallthrough pseudo-keyword in the ACPI code (Gustavo A. R. Silva). - Drop redundant variable initialization from the APEI code (Colin Ian King). - Drop uninitialized_var() from the ACPI PAD driver (Jason Yan). - Replace HTTP links with HTTPS ones in the ACPI code (Alexander A. Klimov)" * tag 'acpi-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (22 commits) ACPI: APEI: remove redundant assignment to variable rc ACPI: NUMA: Remove the useless 'node >= MAX_NUMNODES' check ACPI: NUMA: Remove the useless sub table pointer check ACPI: tables: Remove the duplicated checks for acpi_parse_entries_array() ACPICA: Update version to 20200717 ACPICA: Do not increment operation_region reference counts for field units ACPICA: Replace one-element array with flexible-array ACPI: Replace HTTP links with HTTPS ones ACPI: Use valid link to the ACPI specification ACPI: OSL: Clean up the removal of unused memory mappings ACPI: OSL: Use deferred unmapping in acpi_os_unmap_iomem() ACPI: OSL: Use deferred unmapping in acpi_os_unmap_generic_address() ACPICA: Preserve memory opregion mappings ACPI: OSL: Implement deferred unmapping of ACPI memory ACPI: Use fallthrough pseudo-keyword PCI: hotplug: ACPI: Fix context refcounting in acpiphp_grab_context() ACPI: tables: avoid relocations for table signature array ACPI: PAD: Eliminate usage of uninitialized_var() macro ACPI: sysfs: add newlines when printing module parameters ACPI: EC: add newline when printing 'ec_event_clearing' module parameter ...