blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2020-08-13	bpf: sock_ops sk access may stomp registers when dst_reg = src_reg	John Fastabend	1	-11/+38
	Similar to patch ("bpf: sock_ops ctx access may stomp registers") if the src_reg = dst_reg when reading the sk field of a sock_ops struct we generate xlated code, 53: (61) r9 = (u32 )(r9 +28) 54: (15) if r9 == 0x0 goto pc+3 56: (79) r9 = (u64 )(r9 +0) This stomps on the r9 reg to do the sk_fullsock check and then when reading the skops->sk field instead of the sk pointer we get the sk_fullsock. To fix use similar pattern noted in the previous fix and use the temp field to save/restore a register used to do sk_fullsock check. After the fix the generated xlated code reads, 52: (7b) (u64 )(r9 +32) = r8 53: (61) r8 = (u32 )(r9 +28) 54: (15) if r9 == 0x0 goto pc+3 55: (79) r8 = (u64 )(r9 +32) 56: (79) r9 = (u64 )(r9 +0) 57: (05) goto pc+1 58: (79) r8 = (u64 )(r9 +32) Here r9 register was in-use so r8 is chosen as the temporary register. In line 52 r8 is saved in temp variable and at line 54 restored in case fullsock != 0. Finally we handle fullsock == 0 case by restoring at line 58. This adds a new macro SOCK_OPS_GET_SK it is almost possible to merge this with SOCK_OPS_GET_FIELD, but I found the extra branch logic a bit more confusing than just adding a new macro despite a bit of duplicating code. Fixes: 1314ef561102e ("bpf: export bpf_sock for BPF_PROG_TYPE_SOCK_OPS prog type") Signed-off-by: John Fastabend <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Song Liu <[email protected]> Acked-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/bpf/159718349653.4728.6559437186853473612.stgit@john-Precision-5820-Tower
2020-08-13	bpf: sock_ops ctx access may stomp registers in corner case	John Fastabend	1	-2/+24
	I had a sockmap program that after doing some refactoring started spewing this splat at me: [18610.807284] BUG: unable to handle kernel NULL pointer dereference at 0000000000000001 [...] [18610.807359] Call Trace: [18610.807370] ? 0xffffffffc114d0d5 [18610.807382] __cgroup_bpf_run_filter_sock_ops+0x7d/0xb0 [18610.807391] tcp_connect+0x895/0xd50 [18610.807400] tcp_v4_connect+0x465/0x4e0 [18610.807407] __inet_stream_connect+0xd6/0x3a0 [18610.807412] ? __inet_stream_connect+0x5/0x3a0 [18610.807417] inet_stream_connect+0x3b/0x60 [18610.807425] __sys_connect+0xed/0x120 After some debugging I was able to build this simple reproducer, __section("sockops/reproducer_bad") int bpf_reproducer_bad(struct bpf_sock_ops skops) { volatile __maybe_unused __u32 i = skops->snd_ssthresh; return 0; } And along the way noticed that below program ran without splat, __section("sockops/reproducer_good") int bpf_reproducer_good(struct bpf_sock_ops skops) { volatile __maybe_unused __u32 i = skops->snd_ssthresh; volatile __maybe_unused __u32 family; compiler_barrier(); family = skops->family; return 0; } So I decided to check out the code we generate for the above two programs and noticed each generates the BPF code you would expect, 0000000000000000 <bpf_reproducer_bad>: ; volatile __maybe_unused __u32 i = skops->snd_ssthresh; 0: r1 = (u32 )(r1 + 96) 1: (u32 )(r10 - 4) = r1 ; return 0; 2: r0 = 0 3: exit 0000000000000000 <bpf_reproducer_good>: ; volatile __maybe_unused __u32 i = skops->snd_ssthresh; 0: r2 = (u32 )(r1 + 96) 1: (u32 )(r10 - 4) = r2 ; family = skops->family; 2: r1 = (u32 )(r1 + 20) 3: (u32 )(r10 - 8) = r1 ; return 0; 4: r0 = 0 5: exit So we get reasonable assembly, but still something was causing the null pointer dereference. So, we load the programs and dump the xlated version observing that line 0 above 'r* = (u32 )(r1 +96)' is going to be translated by the skops access helpers. int bpf_reproducer_bad(struct bpf_sock_ops * skops): ; volatile __maybe_unused __u32 i = skops->snd_ssthresh; 0: (61) r1 = (u32 )(r1 +28) 1: (15) if r1 == 0x0 goto pc+2 2: (79) r1 = (u64 )(r1 +0) 3: (61) r1 = (u32 )(r1 +2340) ; volatile __maybe_unused __u32 i = skops->snd_ssthresh; 4: (63) (u32 )(r10 -4) = r1 ; return 0; 5: (b7) r0 = 0 6: (95) exit int bpf_reproducer_good(struct bpf_sock_ops * skops): ; volatile __maybe_unused __u32 i = skops->snd_ssthresh; 0: (61) r2 = (u32 )(r1 +28) 1: (15) if r2 == 0x0 goto pc+2 2: (79) r2 = (u64 )(r1 +0) 3: (61) r2 = (u32 )(r2 +2340) ; volatile __maybe_unused __u32 i = skops->snd_ssthresh; 4: (63) (u32 )(r10 -4) = r2 ; family = skops->family; 5: (79) r1 = (u64 )(r1 +0) 6: (69) r1 = (u16 )(r1 +16) ; family = skops->family; 7: (63) (u32 )(r10 -8) = r1 ; return 0; 8: (b7) r0 = 0 9: (95) exit Then we look at lines 0 and 2 above. In the good case we do the zero check in r2 and then load 'r1 + 0' at line 2. Do a quick cross-check into the bpf_sock_ops check and we can confirm that is the 'struct sock sk' pointer field. But, in the bad case, 0: (61) r1 = (u32 )(r1 +28) 1: (15) if r1 == 0x0 goto pc+2 2: (79) r1 = (u64 )(r1 +0) Oh no, we read 'r1 +28' into r1, this is skops->fullsock and then in line 2 we read the 'r1 +0' as a pointer. Now jumping back to our spat, [18610.807284] BUG: unable to handle kernel NULL pointer dereference at 0000000000000001 The 0x01 makes sense because that is exactly the fullsock value. And its not a valid dereference so we splat. To fix we need to guard the case when a program is doing a sock_ops field access with src_reg == dst_reg. This is already handled in the load case where the ctx_access handler uses a tmp register being careful to store the old value and restore it. To fix the get case test if src_reg == dst_reg and in this case do the is_fullsock test in the temporary register. Remembering to restore the temporary register before writing to either dst_reg or src_reg to avoid smashing the pointer into the struct holding the tmp variable. Adding this inline code to test_tcpbpf_kern will now be generated correctly from, 9: r2 = (u32 )(r2 + 96) to xlated code, 12: (7b) (u64 )(r2 +32) = r9 13: (61) r9 = (u32 )(r2 +28) 14: (15) if r9 == 0x0 goto pc+4 15: (79) r9 = (u64 )(r2 +32) 16: (79) r2 = (u64 )(r2 +0) 17: (61) r2 = (u32 )(r2 +2348) 18: (05) goto pc+1 19: (79) r9 = (u64 )(r2 +32) And in the normal case we keep the original code, because really this is an edge case. From this, 9: r2 = (u32 )(r6 + 96) to xlated code, 22: (61) r2 = (u32 )(r6 +28) 23: (15) if r2 == 0x0 goto pc+2 24: (79) r2 = (u64 )(r6 +0) 25: (61) r2 = (u32 *)(r2 +2348) So three additional instructions if dst == src register, but I scanned my current code base and did not see this pattern anywhere so should not be a big deal. Further, it seems no one else has hit this or at least reported it so it must a fairly rare pattern. Fixes: 9b1f3d6e5af29 ("bpf: Refactor sock_ops_convert_ctx_access") Signed-off-by: John Fastabend <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Song Liu <[email protected]> Acked-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/bpf/159718347772.4728.2781381670567919577.stgit@john-Precision-5820-Tower
2020-08-13	libbpf: Prevent overriding errno when logging errors	Toke Høiland-Jørgensen	1	-5/+7
	Turns out there were a few more instances where libbpf didn't save the errno before writing an error message, causing errno to be overridden by the printf() return and the error disappearing if logging is enabled. Signed-off-by: Toke Høiland-Jørgensen <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-08-13	Merge tag 's390-5.9-2' of ↵	Linus Torvalds	12	-46/+59
	git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull more s390 updates from Heiko Carstens: - Allow s390 debug feature to handle finally more than 256 CPU numbers, instead of truncating the most significant bits. - Improve THP splitting required by qemu processes by making use of walk_page_vma() instead of calling follow_page() for every single page within each vma. - Add missing ZCRYPT dependency to VFIO_AP to fix potential compile problems. - Remove not required select CLOCKSOURCE_VALIDATE_LAST_CYCLE again. - Set node distance to LOCAL_DISTANCE instead of 0, since e.g. libnuma translates a node distance of 0 to "no NUMA support available". - Couple of other minor fixes and improvements. * tag 's390-5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/numa: move code to arch/s390/kernel s390/time: remove select CLOCKSOURCE_VALIDATE_LAST_CYCLE again s390/debug: debug feature version 3 s390/Kconfig: add missing ZCRYPT dependency to VFIO_AP s390/numa: set node distance to LOCAL_DISTANCE s390/pkey: remove redundant variable initialization s390/test_unwind: fix possible memleak in test_unwind() s390/gmap: improve THP splitting s390/atomic: circumvent gcc 10 build regression
2020-08-13	Merge tag 'for-5.9-tag' of ↵	Linus Torvalds	9	-25/+65
	git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull more btrfs updates from David Sterba: "One minor update, the rest are fixes that have arrived a bit late for the first batch. There are also some recent fixes for bugs that were discovered during the merge window and pop up during testing. User visible change: - show correct subvolume path in /proc/mounts for bind mounts Fixes: - fix compression messages when remounting with different level or compression algorithm - tree-log: fix some memory leaks on error handling paths - restore I_VERSION on remount - fix return values and error code mixups - fix umount crash with quotas enabled when removing sysfs files - fix trim range on a shrunk device" * tag 'for-5.9-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: trim: fix underflow in trim length to prevent access beyond device boundary btrfs: fix return value mixup in btrfs_get_extent btrfs: sysfs: fix NULL pointer dereference at btrfs_sysfs_del_qgroups() btrfs: check correct variable after allocation in btrfs_backref_iter_alloc btrfs: make sure SB_I_VERSION doesn't get unset by remount btrfs: fix memory leaks after failure to lookup checksums during inode logging btrfs: don't show full path of bind mounts in subvol= btrfs: fix messages after changing compression level by remount btrfs: only search for left_info if there is no right_info in try_merge_free_space btrfs: inode: fix NULL pointer dereference if inode doesn't need compression
2020-08-13	Merge tag 'xfs-5.9-merge-8' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux	Linus Torvalds	15	-19/+21
	Pull xfs fixes from Darrick Wong: "Two small fixes that have come in during the past week: - Fix duplicated words in comments - Fix an ubsan complaint about null pointer arithmetic" * tag 'xfs-5.9-merge-8' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: Fix UBSAN null-ptr-deref in xfs_sysfs_init xfs: delete duplicated words + other fixes
2020-08-13	Merge tag 'exfat-for-5.9-rc1' of ↵	Linus Torvalds	10	-116/+121
	git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat Pull exfat updates from Namjae Jeon: - don't clear MediaFailure and VolumeDirty bit in volume flags if these were already set before mounting - write multiple dirty buffers at once in sync mode - remove unneeded EXFAT_SB_DIRTY bit set * tag 'exfat-for-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat: exfat: retain 'VolumeFlags' properly exfat: optimize exfat_zeroed_cluster() exfat: add error check when updating dir-entries exfat: write multiple sectors at once exfat: remove EXFAT_SB_DIRTY flag
2020-08-13	mm: memcontrol: fix warning when allocating the root cgroup	Johannes Weiner	1	-6/+0
	Commit 3e38e0aaca9e ("mm: memcg: charge memcg percpu memory to the parent cgroup") adds memory tracking to the memcg kernel structures themselves to make cgroups liable for the memory they are consuming through the allocation of child groups (which can be significant). This code is a bit awkward as it's spread out through several functions: The outermost function does memalloc_use_memcg(parent) to set up current->active_memcg, which designates which cgroup to charge, and the inner functions pass GFP_ACCOUNT to request charging for specific allocations. To make sure this dependency is satisfied at all times - to make sure we don't randomly charge whoever is calling the functions - the inner functions warn on !current->active_memcg. However, this triggers a false warning when the root memcg itself is allocated. No parent exists in this case, and so current->active_memcg is rightfully NULL. It's a false positive, not indicative of a bug. Delete the warnings for now, we can revisit this later. Fixes: 3e38e0aaca9e ("mm: memcg: charge memcg percpu memory to the parent cgroup") Signed-off-by: Johannes Weiner <[email protected]> Reported-by: Stephen Rothwell <[email protected]> Acked-by: Roman Gushchin <[email protected]> Cc: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2020-08-13	drm/xen-front: Pass dumb buffer data offset to the backend	Oleksandr Andrushchenko	3	-4/+7
	While importing a dmabuf it is possible that the data of the buffer is put with offset which is indicated by the SGT offset. Respect the offset value and forward it to the backend. Signed-off-by: Oleksandr Andrushchenko <[email protected]> Acked-by: Noralf Trønnes <[email protected]> Signed-off-by: Juergen Gross <[email protected]>
2020-08-13	xen: Sync up with the canonical protocol definition in Xen	Oleksandr Andrushchenko	1	-3/+88
	This is the sync up with the canonical definition of the display protocol in Xen. 1. Add protocol version as an integer Version string, which is in fact an integer, is hard to handle in the code that supports different protocol versions. To simplify that also add the version as an integer. 2. Pass buffer offset with XENDISPL_OP_DBUF_CREATE There are cases when display data buffer is created with non-zero offset to the data start. Handle such cases and provide that offset while creating a display buffer. 3. Add XENDISPL_OP_GET_EDID command Add an optional request for reading Extended Display Identification Data (EDID) structure which allows better configuration of the display connectors over the configuration set in XenStore. With this change connectors may have multiple resolutions defined with respect to detailed timing definitions and additional properties normally provided by displays. If this request is not supported by the backend then visible area is defined by the relevant XenStore's "resolution" property. If backend provides extended display identification data (EDID) with XENDISPL_OP_GET_EDID request then EDID values must take precedence over the resolutions defined in XenStore. 4. Bump protocol version to 2. Signed-off-by: Oleksandr Andrushchenko <[email protected]> Reviewed-by: Juergen Gross <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Juergen Gross <[email protected]>
2020-08-13	drm/xen-front: Add YUYV to supported formats	Oleksandr Andrushchenko	1	-0/+1
	Add YUYV to supported formats, so the frontend can work with the formats used by cameras and other HW. Signed-off-by: Oleksandr Andrushchenko <[email protected]> Acked-by: Noralf Trønnes <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Juergen Gross <[email protected]>
2020-08-13	drm/xen-front: Fix misused IS_ERR_OR_NULL checks	Oleksandr Andrushchenko	3	-7/+7
	The patch c575b7eeb89f: "drm/xen-front: Add support for Xen PV display frontend" from Apr 3, 2018, leads to the following static checker warning: drivers/gpu/drm/xen/xen_drm_front_gem.c:140 xen_drm_front_gem_create() warn: passing zero to 'ERR_CAST' drivers/gpu/drm/xen/xen_drm_front_gem.c 133 struct drm_gem_object xen_drm_front_gem_create(struct drm_device dev, 134 size_t size) 135 { 136 struct xen_gem_object *xen_obj; 137 138 xen_obj = gem_create(dev, size); 139 if (IS_ERR_OR_NULL(xen_obj)) 140 return ERR_CAST(xen_obj); Fix this and the rest of misused places with IS_ERR_OR_NULL in the driver. Fixes: c575b7eeb89f: "drm/xen-front: Add support for Xen PV display frontend" Signed-off-by: Oleksandr Andrushchenko <[email protected]> Reported-by: Dan Carpenter <[email protected]> Reviewed-by: Dan Carpenter <[email protected]> Cc: <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Juergen Gross <[email protected]>
2020-08-13	xen/gntdev: Fix dmabuf import with non-zero sgt offset	Oleksandr Andrushchenko	1	-0/+8
	It is possible that the scatter-gather table during dmabuf import has non-zero offset of the data, but user-space doesn't expect that. Fix this by failing the import, so user-space doesn't access wrong data. Fixes: bf8dc55b1358 ("xen/gntdev: Implement dma-buf import functionality") Signed-off-by: Oleksandr Andrushchenko <[email protected]> Acked-by: Juergen Gross <[email protected]> Cc: <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Juergen Gross <[email protected]>
2020-08-13	crypto: algif_aead - fix uninitialized ctx->init	Ondrej Mosnacek	2	-12/+1
	In skcipher_accept_parent_nokey() the whole af_alg_ctx structure is cleared by memset() after allocation, so add such memset() also to aead_accept_parent_nokey() so that the new "init" field is also initialized to zero. Without that the initial ctx->init checks might randomly return true and cause errors. While there, also remove the redundant zero assignments in both functions. Found via libkcapi testsuite. Cc: Stephan Mueller <[email protected]> Fixes: f3c802a1f300 ("crypto: algif_aead - Only wake up when ctx->more is zero") Suggested-by: Herbert Xu <[email protected]> Signed-off-by: Ondrej Mosnacek <[email protected]> Signed-off-by: Herbert Xu <[email protected]>
2020-08-13	selftests: netfilter: kill running process only	Fabian Frederick	1	-2/+8
	Avoid noise like the following: nft_flowtable.sh: line 250: kill: (4691) - No such process Signed-off-by: Fabian Frederick <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2020-08-13	selftests: netfilter: add MTU arguments to flowtables	Fabian Frederick	1	-6/+24
	Add some documentation, default values defined in original script and Originator/Link/Responder arguments using getopts like in tools/power/cpupower/bench/cpufreq-bench_plot.sh Signed-off-by: Fabian Frederick <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2020-08-13	selftests: netfilter: add checktool function	Fabian Frederick	1	-22/+11
	avoid repeating the same test for different toolcheck Signed-off-by: Fabian Frederick <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2020-08-13	netfilter: nf_tables: free chain context when BINDING flag is missing	Florian Westphal	1	-2/+4
	syzbot found a memory leak in nf_tables_addchain() because the chain object is not free'd correctly on error. Fixes: d0e2c7de92c7 ("netfilter: nf_tables: add NFT_CHAIN_BINDING") Reported-by: [email protected] Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2020-08-13	netfilter: avoid ipv6 -> nf_defrag_ipv6 module dependency	Florian Westphal	3	-23/+6
	nf_ct_frag6_gather is part of nf_defrag_ipv6.ko, not ipv6 core. The current use of the netfilter ipv6 stub indirections causes a module dependency between ipv6 and nf_defrag_ipv6. This prevents nf_defrag_ipv6 module from being removed because ipv6 can't be unloaded. Remove the indirection and always use a direct call. This creates a depency from nf_conntrack_bridge to nf_defrag_ipv6 instead: modinfo nf_conntrack depends: nf_conntrack,nf_defrag_ipv6,bridge .. and nf_conntrack already depends on nf_defrag_ipv6 anyway. Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2020-08-12	bpf: Iterate through all PT_NOTE sections when looking for build id	Jiri Olsa	1	-10/+14
	Currently when we look for build id within bpf_get_stackid helper call, we check the first NOTE section and we fail if build id is not there. However on some system (Fedora) there can be multiple NOTE sections in binaries and build id data is not always the first one, like: $ readelf -a /usr/bin/ls ... [ 2] .note.gnu.propert NOTE 0000000000000338 00000338 0000000000000020 0000000000000000 A 0 0 8358 [ 3] .note.gnu.build-i NOTE 0000000000000358 00000358 0000000000000024 0000000000000000 A 0 0 437c [ 4] .note.ABI-tag NOTE 000000000000037c 0000037c ... So the stack_map_get_build_id function will fail on build id retrieval and fallback to BPF_STACK_BUILD_ID_IP. This patch is changing the stack_map_get_build_id code to iterate through all the NOTE sections and try to get build id data from each of them. When tracing on sched_switch tracepoint that does bpf_get_stackid helper call kernel build, I can see about 60% increase of successful build id retrieval. The rest seems fails on -EFAULT. Signed-off-by: Jiri Olsa <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Song Liu <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-08-12	libbpf: Handle GCC built-in types for Arm NEON	Jean-Philippe Brucker	1	-1/+34
	When building Arm NEON (SIMD) code from lib/raid6/neon.uc, GCC emits DWARF information using a base type "__Poly8_t", which is internal to GCC and not recognized by Clang. This causes build failures when building with Clang a vmlinux.h generated from an arm64 kernel that was built with GCC. vmlinux.h:47284:9: error: unknown type name '__Poly8_t' typedef __Poly8_t poly8x16_t[16]; ^~~~~~~~~ The polyX_t types are defined as unsigned integers in the "Arm C Language Extension" document (101028_Q220_00_en). Emit typedefs based on standard integer types for the GCC internal types, similar to those emitted by Clang. Including linux/kernel.h to use ARRAY_SIZE() incidentally redefined max(), causing a build bug due to different types, hence the seemingly unrelated change. Reported-by: Jakov Petrina <[email protected]> Signed-off-by: Jean-Philippe Brucker <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-08-12	tools/bpftool: Make skeleton code C++17-friendly by dropping typeof()	Andrii Nakryiko	1	-4/+4
	Seems like C++17 standard mode doesn't recognize typeof() anymore. This can be tested by compiling test_cpp test with -std=c++17 or -std=c++1z options. The use of typeof in skeleton generated code is unnecessary, all types are well-known at the time of code generation, so remove all typeof()'s to make skeleton code more future-proof when interacting with C++ compilers. Fixes: 985ead416df3 ("bpftool: Add skeleton codegen command") Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Song Liu <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-08-12	bpf: Fix XDP FD-based attach/detach logic around XDP_FLAGS_UPDATE_IF_NOEXIST	Andrii Nakryiko	1	-4/+4
	Enforce XDP_FLAGS_UPDATE_IF_NOEXIST only if new BPF program to be attached is non-NULL (i.e., we are not detaching a BPF program). Fixes: d4baa9368a5e ("bpf, xdp: Extract common XDP program attachment logic") Reported-by: Stanislav Fomichev <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Tested-by: Stanislav Fomichev <[email protected]> Acked-by: Toke Høiland-Jørgensen <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-08-12	Merge tag 'rtc-5.9' of ↵	Linus Torvalds	15	-218/+244
	git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux Pull RTC updates from Alexandre Belloni: "Not much this cycle - mostly non urgent driver fixes: - ds1374: use watchdog core - pcf2127: add alarm and pcf2129 support" * tag 'rtc-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: rtc: pcf2127: fix alarm handling rtc: pcf2127: add alarm support rtc: pcf2127: add pca2129 device id rtc: max77686: Fix wake-ups for max77620 rtc: ds1307: provide an indication that the watchdog has fired rtc: ds1374: remove unused define rtc: ds1374: fix RTC_DRV_DS1374_WDT dependencies rtc: cleanup obsolete comment about struct rtc_class_ops rtc: pl031: fix set_alarm by adding back call to alarm_irq_enable rtc: ds1374: wdt: Use watchdog core for watchdog part rtc: Replace HTTP links with HTTPS ones rtc: goldfish: Enable interrupt in set_alarm() when necessary rtc: max77686: Do not allow interrupt to fire before system resume rtc: imxdi: fix trivial typos rtc: cpcap: fix range
2020-08-12	Revert "ipv4: tunnel: fix compilation on ARCH=um"	David S. Miller	1	-1/+0
	This reverts commit 06a7a37be55e29961c9ba2abec4d07c8e0e21861. The bug was already fixed, this added a dup include. Reported-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-08-12	net: accept an empty mask in /sys/class/net//queues/rx-/rps_cpus	Eric Dumazet	1	-5/+7
	We must accept an empty mask in store_rps_map(), or we are not able to disable RPS on a queue. Fixes: 07bbecb34106 ("net: Restrict receive packets queuing to housekeeping CPUs") Signed-off-by: Eric Dumazet <[email protected]> Reported-by: Maciej Żenczykowski <[email protected]> Cc: Alex Belits <[email protected]> Cc: Nitesh Narayan Lal <[email protected]> Cc: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Maciej Żenczykowski <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Nitesh Narayan Lal <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-08-12	Merge branch 'net-stmmac-Fix-multicast-filter-on-IPQ806x'	David S. Miller	2	-0/+4
	Jonathan McDowell says: ==================== net: stmmac: Fix multicast filter on IPQ806x This pair of patches are the result of discovering a failure to correctly receive IPv6 multicast packets on such a device (in particular DHCPv6 requests and RA solicitations). Putting the device into promiscuous mode, or allmulti, both resulted in such packets correctly being received. Examination of the vendor driver (nss-gmac from the qsdk) shows that it does not enable the multicast filter and instead falls back to allmulti. Extend the base dwmac1000 driver to fall back when there's no suitable hardware filter, and update the ipq806x platform to request this. ==================== Signed-off-by: David S. Miller <[email protected]>
2020-08-12	net: ethernet: stmmac: Disable hardware multicast filter	Jonathan McDowell	1	-0/+1
	The IPQ806x does not appear to have a functional multicast ethernet address filter. This was observed as a failure to correctly receive IPv6 packets on a LAN to the all stations address. Checking the vendor driver shows that it does not attempt to enable the multicast filter and instead falls back to receiving all multicast packets, internally setting ALLMULTI. Use the new fallback support in the dwmac1000 driver to correctly achieve the same with the mainline IPQ806x driver. Confirmed to fix IPv6 functionality on an RB3011 router. Cc: [email protected] Signed-off-by: Jonathan McDowell <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-08-12	net: stmmac: dwmac1000: provide multicast filter fallback	Jonathan McDowell	1	-0/+3
	If we don't have a hardware multicast filter available then instead of silently failing to listen for the requested ethernet broadcast addresses fall back to receiving all multicast packets, in a similar fashion to other drivers with no multicast filter. Cc: [email protected] Signed-off-by: Jonathan McDowell <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-08-12	i2c: iproc: fix race between client unreg and isr	Dhananjay Phadke	1	-1/+12
	When i2c client unregisters, synchronize irq before setting iproc_i2c->slave to NULL. (1) disable_irq() (2) Mask event enable bits in control reg (3) Erase slave address (avoid further writes to rx fifo) (4) Flush tx and rx FIFOs (5) Clear pending event (interrupt) bits in status reg (6) enable_irq() (7) Set client pointer to NULL Unable to handle kernel NULL pointer dereference at virtual address 0000000000000318 [ 371.020421] pc : bcm_iproc_i2c_isr+0x530/0x11f0 [ 371.025098] lr : __handle_irq_event_percpu+0x6c/0x170 [ 371.030309] sp : ffff800010003e40 [ 371.033727] x29: ffff800010003e40 x28: 0000000000000060 [ 371.039206] x27: ffff800010ca9de0 x26: ffff800010f895df [ 371.044686] x25: ffff800010f18888 x24: ffff0008f7ff3600 [ 371.050165] x23: 0000000000000003 x22: 0000000001600000 [ 371.055645] x21: ffff800010f18888 x20: 0000000001600000 [ 371.061124] x19: ffff0008f726f080 x18: 0000000000000000 [ 371.066603] x17: 0000000000000000 x16: 0000000000000000 [ 371.072082] x15: 0000000000000000 x14: 0000000000000000 [ 371.077561] x13: 0000000000000000 x12: 0000000000000001 [ 371.083040] x11: 0000000000000000 x10: 0000000000000040 [ 371.088519] x9 : ffff800010f317c8 x8 : ffff800010f317c0 [ 371.093999] x7 : ffff0008f805b3b0 x6 : 0000000000000000 [ 371.099478] x5 : ffff0008f7ff36a4 x4 : ffff8008ee43d000 [ 371.104957] x3 : 0000000000000000 x2 : ffff8000107d64c0 [ 371.110436] x1 : 00000000c00000af x0 : 0000000000000000 [ 371.115916] Call trace: [ 371.118439] bcm_iproc_i2c_isr+0x530/0x11f0 [ 371.122754] __handle_irq_event_percpu+0x6c/0x170 [ 371.127606] handle_irq_event_percpu+0x34/0x88 [ 371.132189] handle_irq_event+0x40/0x120 [ 371.136234] handle_fasteoi_irq+0xcc/0x1a0 [ 371.140459] generic_handle_irq+0x24/0x38 [ 371.144594] __handle_domain_irq+0x60/0xb8 [ 371.148820] gic_handle_irq+0xc0/0x158 [ 371.152687] el1_irq+0xb8/0x140 [ 371.155927] arch_cpu_idle+0x10/0x18 [ 371.159615] do_idle+0x204/0x290 [ 371.162943] cpu_startup_entry+0x24/0x60 [ 371.166990] rest_init+0xb0/0xbc [ 371.170322] arch_call_rest_init+0xc/0x14 [ 371.174458] start_kernel+0x404/0x430 Fixes: c245d94ed106 ("i2c: iproc: Add multi byte read-write support for slave mode") Signed-off-by: Dhananjay Phadke <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Acked-by: Ray Jui <[email protected]> Signed-off-by: Wolfram Sang <[email protected]>
2020-08-12	ipv4: tunnel: fix compilation on ARCH=um	Johannes Berg	1	-0/+1
	With certain configurations, a 64-bit ARCH=um errors out here with an unknown csum_ipv6_magic() function. Include the right header file to always have it. Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-08-12	vsock: fix potential null pointer dereference in vsock_poll()	Stefano Garzarella	1	-1/+1
	syzbot reported this issue where in the vsock_poll() we find the socket state at TCP_ESTABLISHED, but 'transport' is null: general protection fault, probably for non-canonical address 0xdffffc0000000012: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000090-0x0000000000000097] CPU: 0 PID: 8227 Comm: syz-executor.2 Not tainted 5.8.0-rc7-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:vsock_poll+0x75a/0x8e0 net/vmw_vsock/af_vsock.c:1038 Call Trace: sock_poll+0x159/0x460 net/socket.c:1266 vfs_poll include/linux/poll.h:90 [inline] do_pollfd fs/select.c:869 [inline] do_poll fs/select.c:917 [inline] do_sys_poll+0x607/0xd40 fs/select.c:1011 __do_sys_poll fs/select.c:1069 [inline] __se_sys_poll fs/select.c:1057 [inline] __x64_sys_poll+0x18c/0x440 fs/select.c:1057 do_syscall_64+0x60/0xe0 arch/x86/entry/common.c:384 entry_SYSCALL_64_after_hwframe+0x44/0xa9 This issue can happen if the TCP_ESTABLISHED state is set after we read the vsk->transport in the vsock_poll(). We could put barriers to synchronize, but this can only happen during connection setup, so we can simply check that 'transport' is valid. Fixes: c0cfa2d8a788 ("vsock: add multi-transports support") Reported-and-tested-by: [email protected] Signed-off-by: Stefano Garzarella <[email protected]> Reviewed-by: Jorgen Hansen <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-08-12	sfc: fix ef100 design-param checking	Edward Cree	1	-1/+2
	The handling of the RXQ/TXQ size granularity design-params had two problems: it had a 64-bit divide that didn't build on 32-bit platforms, and it could divide by zero if the NIC supplied 0 as the value of the design-param. Fix both by checking for 0 and for a granularity bigger than our min-size; if the granularity <= EFX_MIN_DMAQ_SIZE then it fits in 32 bits, so we can cast it to u32 for the divide. Reported-by: kernel test robot <[email protected]> Signed-off-by: Edward Cree <[email protected]> Reviewed-by: Guenter Roeck <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-08-12	Merge tag 'ceph-for-5.9-rc1' of git://github.com/ceph/ceph-client	Linus Torvalds	25	-136/+511
	Pull ceph updates from Ilya Dryomov: "Xiubo has completed his work on filesystem client metrics, they are sent to all available MDSes once per second now. Other than that, we have a lot of fixes and cleanups all around the filesystem, including a tweak to cut down on MDS request resends in multi-MDS setups from Yanhu and fixups for SELinux symlink labeling and MClientSession message decoding from Jeff" * tag 'ceph-for-5.9-rc1' of git://github.com/ceph/ceph-client: (22 commits) ceph: handle zero-length feature mask in session messages ceph: use frag's MDS in either mode ceph: move sb->wb_pagevec_pool to be a global mempool ceph: set sec_context xattr on symlink creation ceph: remove redundant initialization of variable mds ceph: fix use-after-free for fsc->mdsc ceph: remove unused variables in ceph_mdsmap_decode() ceph: delete repeated words in fs/ceph/ ceph: send client provided metric flags in client metadata ceph: periodically send perf metrics to MDSes ceph: check the sesion state and return false in case it is closed libceph: replace HTTP links with HTTPS ones ceph: remove unnecessary cast in kfree() libceph: just have osd_req_op_init() return a pointer ceph: do not access the kiocb after aio requests ceph: clean up and optimize ceph_check_delayed_caps() ceph: fix potential mdsc use-after-free crash ceph: switch to WARN_ON_ONCE in encode_supported_features() ceph: add global total_caps to count the mdsc's total caps number ceph: add check_session_state() helper and make it global ...
2020-08-12	Merge branch 'parisc-5.9-2' of ↵	Linus Torvalds	5	-8/+70
	git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull more parisc updates from Helge Deller: - Oscar Carter contributed a patch which fixes parisc's usage of dereference_function_descriptor() and thus will allow using the -Wcast-function-type compiler option in the top-level Makefile - Sven Schnelle fixed a bug in the SBA code to prevent crashes during kexec - John David Anglin provided implementations for __smp_store_release() and __smp_load_acquire barriers() which avoids using the sync assembler instruction and thus speeds up barrier paths - Some whitespace cleanups in parisc's atomic.h header file * 'parisc-5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Implement __smp_store_release and __smp_load_acquire barriers parisc: mask out enable and reserved bits from sba imask parisc: Whitespace cleanups in atomic.h parisc/kernel/ftrace: Remove function callback casts sections.h: dereference_function_descriptor() returns void pointer
2020-08-12	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm	Linus Torvalds	83	-2636/+3479
	Pull more KVM updates from Paolo Bonzini: "PPC: - Improvements and bugfixes for secure VM support, giving reduced startup time and memory hotplug support. - Locking fixes in nested KVM code - Increase number of guests supported by HV KVM to 4094 - Preliminary POWER10 support ARM: - Split the VHE and nVHE hypervisor code bases, build the EL2 code separately, allowing for the VHE code to now be built with instrumentation - Level-based TLB invalidation support - Restructure of the vcpu register storage to accomodate the NV code - Pointer Authentication available for guests on nVHE hosts - Simplification of the system register table parsing - MMU cleanups and fixes - A number of post-32bit cleanups and other fixes MIPS: - compilation fixes x86: - bugfixes - support for the SERIALIZE instruction" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (70 commits) KVM: MIPS/VZ: Fix build error caused by 'kvm_run' cleanup x86/kvm/hyper-v: Synic default SCONTROL MSR needs to be enabled MIPS: KVM: Convert a fallthrough comment to fallthrough MIPS: VZ: Only include loongson_regs.h for CPU_LOONGSON64 x86: Expose SERIALIZE for supported cpuid KVM: x86: Don't attempt to load PDPTRs when 64-bit mode is enabled KVM: arm64: Move S1PTW S2 fault logic out of io_mem_abort() KVM: arm64: Don't skip cache maintenance for read-only memslots KVM: arm64: Handle data and instruction external aborts the same way KVM: arm64: Rename kvm_vcpu_dabt_isextabt() KVM: arm: Add trace name for ARM_NISV KVM: arm64: Ensure that all nVHE hyp code is in .hyp.text KVM: arm64: Substitute RANDOMIZE_BASE for HARDEN_EL2_VECTORS KVM: arm64: Make nVHE ASLR conditional on RANDOMIZE_BASE KVM: PPC: Book3S HV: Rework secure mem slot dropping KVM: PPC: Book3S HV: Move kvmppc_svm_page_out up KVM: PPC: Book3S HV: Migrate hot plugged memory KVM: PPC: Book3S HV: In H_SVM_INIT_DONE, migrate remaining normal-GFNs to secure-GFNs KVM: PPC: Book3S HV: Track the state GFNs associated with secure VMs KVM: PPC: Book3S HV: Disable page merging in H_SVM_INIT_START ...
2020-08-12	Merge tag 'clk-for-linus' of ↵	Linus Torvalds	82	-871/+4778
	git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull more clk updates from Stephen Boyd: "Here's some more updates that missed the last pull request because I happened to tag the tree at an earlier point in the history of clk-next. I must have fat fingered it and checked out an older version of clk-next on this second computer I'm using. This time it actually includes more code for Qualcomm SoCs, the AT91 major updates, and some Rockchip SoC clk driver updates as well. I've corrected this flow so this shouldn't happen again" * tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (83 commits) clk: bcm2835: Do not use prediv with bcm2711's PLLs clk: drop unused function __clk_get_flags clk: hsdk: Fix bad dependency on IOMEM dt-bindings: clock: Fix YAML schemas for LPASS clocks on SC7180 clk: mmp: avoid missing prototype warning clk: sparx5: Add Sparx5 SoC DPLL clock driver dt-bindings: clock: sparx5: Add bindings include file clk: qoriq: add LS1021A core pll mux options clk: clk-atlas6: fix return value check in atlas6_clk_init() clk: tegra: pll: Improve PLLM enable-state detection clk: X1000: Add support for calculat REFCLK of USB PHY. clk: JZ4780: Reformat the code to align it. clk: JZ4780: Add functions for enable and disable USB PHY. clk: Ingenic: Add RTC related clocks for Ingenic SoCs. dt-bindings: clock: Add tabs to align code. dt-bindings: clock: Add RTC related clocks for Ingenic SoCs. clk: davinci: Use fallthrough pseudo-keyword clk: imx: Use fallthrough pseudo-keyword clk: qcom: gcc-sdm660: Fix up gcc_mss_mnoc_bimc_axi_clk clk: qcom: gcc-sdm660: Add missing modem reset ...
2020-08-12	Merge tag 'linux-watchdog-5.9-rc1' of ↵	Linus Torvalds	62	-210/+1032
	git://www.linux-watchdog.org/linux-watchdog Pull watchdog updates from Wim Van Sebroeck: - f71808e_wdt imporvements - dw_wdt improvements - mlx-wdt: support new watchdog type with longer timeout period - fallthrough pseudo-keyword replacements - overall small fixes and improvements * tag 'linux-watchdog-5.9-rc1' of git://www.linux-watchdog.org/linux-watchdog: (35 commits) watchdog: rti-wdt: balance pm runtime enable calls watchdog: rti-wdt: attach to running watchdog during probe watchdog: add support for adjusting last known HW keepalive time watchdog: use __watchdog_ping in startup watchdog: softdog: Add options 'soft_reboot_cmd' and 'soft_active_on_boot' watchdog: pcwd_usb: remove needless check before usb_free_coherent() watchdog: Replace HTTP links with HTTPS ones dt-bindings: watchdog: renesas,wdt: Document r8a774e1 support watchdog: initialize device before misc_register watchdog: booke_wdt: Add common nowayout parameter driver watchdog: scx200_wdt: Use fallthrough pseudo-keyword watchdog: Use fallthrough pseudo-keyword watchdog: f71808e_wdt: do stricter parameter validation watchdog: f71808e_wdt: clear watchdog timeout occurred flag watchdog: f71808e_wdt: remove use of wrong watchdog_info option watchdog: f71808e_wdt: indicate WDIOF_CARDRESET support in watchdog_info.options docs: watchdog: codify ident.options as superset of possible status flags dt-bindings: watchdog: Add compatible for QCS404, SC7180, SDM845, SM8150 dt-bindings: watchdog: Convert QCOM watchdog timer bindings to YAML watchdog: dw_wdt: Add DebugFS files ...
2020-08-12	Merge tag 'vfio-v5.9-rc1' of git://github.com/awilliam/linux-vfio	Linus Torvalds	5	-193/+282
	Pull VFIO updates from Alex Williamson: - Inclusive naming updates (Alex Williamson) - Intel X550 INTx quirk (Alex Williamson) - Error path resched between unmaps (Xiang Zheng) - SPAPR IOMMU pin_user_pages() conversion (John Hubbard) - Trivial mutex simplification (Alex Williamson) - QAT device denylist (Giovanni Cabiddu) - type1 IOMMU ioctl refactor (Liu Yi L) * tag 'vfio-v5.9-rc1' of git://github.com/awilliam/linux-vfio: vfio/type1: Refactor vfio_iommu_type1_ioctl() vfio/pci: Add QAT devices to denylist vfio/pci: Add device denylist PCI: Add Intel QuickAssist device IDs vfio/pci: Hold igate across releasing eventfd contexts vfio/spapr_tce: convert get_user_pages() --> pin_user_pages() vfio/type1: Add conditional rescheduling after iommu map failed vfio/pci: Add Intel X550 to hidden INTx devices vfio: Cleanup allowed driver naming
2020-08-12	Merge tag 'drm-next-2020-08-12' of git://anongit.freedesktop.org/drm/drm	Linus Torvalds	84	-382/+903
	Pull drm fixes from Dave Airlie: "This has a few vmwgfx regression fixes we hit from the merge window (one in TTM), it also has a bunch of amdgpu fixes along with a scattering everywhere else. core: - Fix drm_dp_mst_port refcount leaks in drm_dp_mst_allocate_vcpi - Remove null check for kfree in drm_dev_release. - Fix DRM_FORMAT_MOD_AMLOGIC_FBC definition. - re-added docs for drm_gem_flink_ioctl() - add orientation quirk for ASUS T103HAF ttm: - ttm: fix page-offset calculation within TTM - revert patch causing vmwgfx regressions fbcon: - Fix a fbcon OOB read in fbdev, found by syzbot. vga: - Mark vga_tryget static as it's not used elsewhere. amdgpu: - Re-add spelling typo fix - Sienna Cichlid fixes - Navy Flounder fixes - DC fixes - SMU i2c fix - Power fixes vmwgfx: - regression fixes for modesetting crashes - misc fixes xlnx: - Small fixes to xlnx. omap: - Fix mode initialization in omap_connector_mode_valid(). - force runtime PM suspend on system suspend tidss: - fix modeset init for DPI panels" * tag 'drm-next-2020-08-12' of git://anongit.freedesktop.org/drm/drm: (70 commits) drm/ttm: revert "drm/ttm: make TT creation purely optional v3" drm/vmwgfx: fix spelling mistake "Cant" -> "Can't" drm/vmwgfx: fix spelling mistake "Cound" -> "Could" drm/vmwgfx/ldu: Use drm_mode_config_reset drm/vmwgfx/sou: Use drm_mode_config_reset drm/vmwgfx/stdu: Use drm_mode_config_reset drm/vmwgfx: Fix two list_for_each loop exit tests drm/vmwgfx: Use correct vmw_legacy_display_unit pointer drm/vmwgfx: Use struct_size() helper drm/amdgpu: Fix bug where DPM is not enabled after hibernate and resume drm/amd/powerplay: put VCN/JPEG into PG ungate state before dpm table setup(V3) drm/amd/powerplay: update swSMU VCN/JPEG PG logics drm/amdgpu: use mode1 reset by default for sienna_cichlid drm/amdgpu/smu: rework i2c adpater registration drm/amd/display: Display goes blank after inst drm/amd/display: Change null plane state swizzle mode to 4kb_s drm/amd/display: Use helper function to check for HDMI signal drm/amd/display: AMD OUI (DPCD 0x00300) skipped on some sink drm/amd/display: Fix logger context drm/amd/display: populate new dml variable ...
2020-08-12	Merge branch 'akpm' (patches from Andrew)	Linus Torvalds	264	-1437/+2357
	Merge more updates from Andrew Morton: - most of the rest of MM (memcg, hugetlb, vmscan, proc, compaction, mempolicy, oom-kill, hugetlbfs, migration, thp, cma, util, memory-hotplug, cleanups, uaccess, migration, gup, pagemap), - various other subsystems (alpha, misc, sparse, bitmap, lib, bitops, checkpatch, autofs, minix, nilfs, ufs, fat, signals, kmod, coredump, exec, kdump, rapidio, panic, kcov, kgdb, ipc). * emailed patches from Andrew Morton <[email protected]>: (164 commits) mm/gup: remove task_struct pointer for all gup code mm: clean up the last pieces of page fault accountings mm/xtensa: use general page fault accounting mm/x86: use general page fault accounting mm/sparc64: use general page fault accounting mm/sparc32: use general page fault accounting mm/sh: use general page fault accounting mm/s390: use general page fault accounting mm/riscv: use general page fault accounting mm/powerpc: use general page fault accounting mm/parisc: use general page fault accounting mm/openrisc: use general page fault accounting mm/nios2: use general page fault accounting mm/nds32: use general page fault accounting mm/mips: use general page fault accounting mm/microblaze: use general page fault accounting mm/m68k: use general page fault accounting mm/ia64: use general page fault accounting mm/hexagon: use general page fault accounting mm/csky: use general page fault accounting ...
2020-08-12	mm/gup: remove task_struct pointer for all gup code	Peter Xu	18	-87/+69
	After the cleanup of page fault accounting, gup does not need to pass task_struct around any more. Remove that parameter in the whole gup stack. Signed-off-by: Peter Xu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: John Hubbard <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-08-12	mm: clean up the last pieces of page fault accountings	Peter Xu	4	-29/+10
	Here're the last pieces of page fault accounting that were still done outside handle_mm_fault() where we still have regs==NULL when calling handle_mm_fault(): arch/powerpc/mm/copro_fault.c: copro_handle_mm_fault arch/sparc/mm/fault_32.c: force_user_fault arch/um/kernel/trap.c: handle_page_fault mm/gup.c: faultin_page fixup_user_fault mm/hmm.c: hmm_vma_fault mm/ksm.c: break_ksm Some of them has the issue of duplicated accounting for page fault retries. Some of them didn't do the accounting at all. This patch cleans all these up by letting handle_mm_fault() to do per-task page fault accounting even if regs==NULL (though we'll still skip the perf event accountings). With that, we can safely remove all the outliers now. There's another functional change in that now we account the page faults to the caller of gup, rather than the task_struct that passed into the gup code. More information of this can be found at [1]. After this patch, below things should never be touched again outside handle_mm_fault(): - task_struct.[maj\|min]_flt - PERF_COUNT_SW_PAGE_FAULTS_[MAJ\|MIN] [1] https://lore.kernel.org/lkml/CAHk-=wj_V2Tps2QrMn20_W0OJF9xqNh52XSGA42s-ZJ8Y+GyKw@mail.gmail.com/ Signed-off-by: Peter Xu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Cain <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Chris Zankel <[email protected]> Cc: Dave Hansen <[email protected]> Cc: David S. Miller <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Gerald Schaefer <[email protected]> Cc: Greentime Hu <[email protected]> Cc: Guo Ren <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Helge Deller <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ivan Kokshaysky <[email protected]> Cc: James E.J. Bottomley <[email protected]> Cc: John Hubbard <[email protected]> Cc: Jonas Bonn <[email protected]> Cc: Ley Foon Tan <[email protected]> Cc: "Luck, Tony" <[email protected]> Cc: Matt Turner <[email protected]> Cc: Max Filippov <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Michal Simek <[email protected]> Cc: Nick Hu <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Richard Henderson <[email protected]> Cc: Rich Felker <[email protected]> Cc: Russell King <[email protected]> Cc: Stafford Horne <[email protected]> Cc: Stefan Kristiansson <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Vincent Chen <[email protected]> Cc: Vineet Gupta <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yoshinori Sato <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-08-12	mm/xtensa: use general page fault accounting	Peter Xu	1	-11/+4
	Use the general page fault accounting by passing regs into handle_mm_fault(). It naturally solve the issue of multiple page fault accounting when page fault retry happened. Remove the PERF_COUNT_SW_PAGE_FAULTS_[MAJ\|MIN] perf events because it's now also done in handle_mm_fault(). Move the PERF_COUNT_SW_PAGE_FAULTS event higher before taking mmap_sem for the fault, then it'll match with the rest of the archs. Signed-off-by: Peter Xu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Max Filippov <[email protected]> Cc: Chris Zankel <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-08-12	mm/x86: use general page fault accounting	Peter Xu	1	-15/+2
	Use the general page fault accounting by passing regs into handle_mm_fault(). Signed-off-by: Peter Xu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: H. Peter Anvin <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-08-12	mm/sparc64: use general page fault accounting	Peter Xu	1	-10/+1
	Use the general page fault accounting by passing regs into handle_mm_fault(). It naturally solve the issue of multiple page fault accounting when page fault retry happened. Signed-off-by: Peter Xu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: David S. Miller <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-08-12	mm/sparc32: use general page fault accounting	Peter Xu	1	-10/+1
	Use the general page fault accounting by passing regs into handle_mm_fault(). It naturally solve the issue of multiple page fault accounting when page fault retry happened. Signed-off-by: Peter Xu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: David S. Miller <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-08-12	mm/sh: use general page fault accounting	Peter Xu	1	-10/+1
	Use the general page fault accounting by passing regs into handle_mm_fault(). It naturally solve the issue of multiple page fault accounting when page fault retry happened. Signed-off-by: Peter Xu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Yoshinori Sato <[email protected]> Cc: Rich Felker <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-08-12	mm/s390: use general page fault accounting	Peter Xu	1	-15/+1
	Use the general page fault accounting by passing regs into handle_mm_fault(). It naturally solve the issue of multiple page fault accounting when page fault retry happened. Signed-off-by: Peter Xu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Gerald Schaefer <[email protected]> Acked-by: Gerald Schaefer <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Christian Borntraeger <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-08-12	mm/riscv: use general page fault accounting	Peter Xu	1	-15/+1
	Use the general page fault accounting by passing regs into handle_mm_fault(). It naturally solve the issue of multiple page fault accounting when page fault retry happened. Signed-off-by: Peter Xu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Pekka Enberg <[email protected]> Acked-by: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Albert Ou <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>