aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2020-05-21net: ipip: fix wrong address family in init error pathVadim Fedorenko1-1/+1
In case of error with MPLS support the code is misusing AF_INET instead of AF_MPLS. Fixes: 1b69e7e6c4da ("ipip: support MPLS over IPv4") Signed-off-by: Vadim Fedorenko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-05-21Merge branch 'net-tls-fix-encryption-error-path'David S. Miller1-7/+10
Vadim Fedorenko says: ==================== net/tls: fix encryption error path The problem with data stream corruption was found in KTLS transmit path with small socket send buffers and large amount of data. bpf_exec_tx_verdict() frees open record on any type of error including EAGAIN, ENOMEM and ENOSPC while callers are able to recover this transient errors. Also wrong error code was returned to user space in that case. This patchset fixes the problems. ==================== Acked-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-05-21net/tls: free record only on encryption errorVadim Fedorenko1-2/+4
We cannot free record on any transient error because it leads to losing previos data. Check socket error to know whether record must be freed or not. Fixes: d10523d0b3d7 ("net/tls: free the record on encryption error") Signed-off-by: Vadim Fedorenko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-05-21net/tls: fix encryption error checkingVadim Fedorenko1-5/+6
bpf_exec_tx_verdict() can return negative value for copied variable. In that case this value will be pushed back to caller and the real error code will be lost. Fix it using signed type and checking for positive value. Fixes: d10523d0b3d7 ("net/tls: free the record on encryption error") Fixes: d3b18ad31f93 ("tls: add bpf support to sk_msg handling") Signed-off-by: Vadim Fedorenko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-05-21Merge branch 'net-ethernet-ti-fix-some-return-value-check'David S. Miller4-6/+7
Wei Yongjun says: ==================== net: ethernet: ti: fix some return value check This patchset convert cpsw_ale_create() to return PTR_ERR() only, and changed all the caller to check IS_ERR() instead of NULL. Since v2: 1) rebased on net.git, as Jakub's suggest 2) split am65-cpsw-nuss.c changes, as Grygorii's suggest ==================== Signed-off-by: David S. Miller <[email protected]>
2020-05-21net: ethernet: ti: am65-cpsw-nuss: fix error handling of am65_cpsw_nuss_probeWei Yongjun1-1/+2
Convert to using IS_ERR() instead of NULL test for cpsw_ale_create() error handling. Also fix to return negative error code from this error handling case instead of 0 in. Fixes: 93a76530316a ("net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver") Reported-by: Hulk Robot <[email protected]> Signed-off-by: Wei Yongjun <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-05-21net: ethernet: ti: fix some return value check of cpsw_ale_create()Wei Yongjun3-5/+5
cpsw_ale_create() can return both NULL and PTR_ERR(), but all of the caller only check NULL for error handling. This patch convert it to only return PTR_ERR() in all error cases, and the caller using IS_ERR() instead of NULL test. Fixes: 4b41d3436796 ("net: ethernet: ti: cpsw: allow untagged traffic on host port") Reported-by: Hulk Robot <[email protected]> Signed-off-by: Wei Yongjun <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-05-21net: qrtr: Fix passing invalid reference to qrtr_local_enqueue()Manivannan Sadhasivam1-1/+1
Once the traversal of the list is completed with list_for_each_entry(), the iterator (node) will point to an invalid object. So passing this to qrtr_local_enqueue() which is outside of the iterator block is erroneous eventhough the object is not used. So fix this by passing NULL to qrtr_local_enqueue(). Fixes: bdabad3e363d ("net: Add Qualcomm IPC router") Reported-by: kbuild test robot <[email protected]> Reported-by: Julia Lawall <[email protected]> Signed-off-by: Manivannan Sadhasivam <[email protected]> Reviewed-by: Bjorn Andersson <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-05-21ethtool: count header size in reply size estimateMichal Kubecek2-3/+2
As ethnl_request_ops::reply_size handlers do not include common header size into calculated/estimated reply size, it needs to be added in ethnl_default_doit() and ethnl_default_notify() before allocating the message. On the other hand, strset_reply_size() should not add common header size. Fixes: 728480f12442 ("ethtool: default handlers for GET requests") Reported-by: Oleksij Rempel <[email protected]> Signed-off-by: Michal Kubecek <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-05-21Merge tag 'apparmor-pr-2020-05-21' of ↵Linus Torvalds3-4/+5
git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor Pull apparmor bug fixes from John Johansen: - Fix use-after-free in aa_audit_rule_init - Fix refcnt leak in policy_update - Fix potential label refcnt leak in aa_change_profile * tag 'apparmor-pr-2020-05-21' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor: apparmor: Fix use-after-free in aa_audit_rule_init apparmor: Fix aa_label refcnt leak in policy_update apparmor: fix potential label refcnt leak in aa_change_profile
2020-05-21exfat: add the dummy mount options to be backward compatible with staging/exfatNamjae Jeon1-0/+19
As Ubuntu and Fedora release new version used kernel version equal to or higher than v5.4, They started to support kernel exfat filesystem. Linus reported a mount error with new version of exfat on Fedora: exfat: Unknown parameter 'namecase' This is because there is a difference in mount option between old staging/exfat and new exfat. And utf8, debug, and codepage options as well as namecase have been removed from new exfat. This patch add the dummy mount options as deprecated option to be backward compatible with old one. Reported-by: Linus Torvalds <[email protected]> Signed-off-by: Namjae Jeon <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Al Viro <[email protected]> Cc: Eric Sandeen <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2020-05-21apparmor: Fix use-after-free in aa_audit_rule_initNavid Emamdoost1-1/+2
In the implementation of aa_audit_rule_init(), when aa_label_parse() fails the allocated memory for rule is released using aa_audit_rule_free(). But after this release, the return statement tries to access the label field of the rule which results in use-after-free. Before releasing the rule, copy errNo and return it after release. Fixes: 52e8c38001d8 ("apparmor: Fix memory leak of rule on error exit path") Signed-off-by: Navid Emamdoost <[email protected]> Signed-off-by: John Johansen <[email protected]>
2020-05-21apparmor: Fix aa_label refcnt leak in policy_updateXiyu Yang1-1/+2
policy_update() invokes begin_current_label_crit_section(), which returns a reference of the updated aa_label object to "label" with increased refcount. When policy_update() returns, "label" becomes invalid, so the refcount should be decreased to keep refcount balanced. The reference counting issue happens in one exception handling path of policy_update(). When aa_may_manage_policy() returns not NULL, the refcnt increased by begin_current_label_crit_section() is not decreased, causing a refcnt leak. Fix this issue by jumping to "end_section" label when aa_may_manage_policy() returns not NULL. Fixes: 5ac8c355ae00 ("apparmor: allow introspecting the loaded policy pre internal transform") Signed-off-by: Xiyu Yang <[email protected]> Signed-off-by: Xin Tan <[email protected]> Signed-off-by: John Johansen <[email protected]>
2020-05-21apparmor: fix potential label refcnt leak in aa_change_profileXiyu Yang1-2/+1
aa_change_profile() invokes aa_get_current_label(), which returns a reference of the current task's label. According to the comment of aa_get_current_label(), the returned reference must be put with aa_put_label(). However, when the original object pointed by "label" becomes unreachable because aa_change_profile() returns or a new object is assigned to "label", reference count increased by aa_get_current_label() is not decreased, causing a refcnt leak. Fix this by calling aa_put_label() before aa_change_profile() return and dropping unnecessary aa_get_current_label(). Fixes: 9fcf78cca198 ("apparmor: update domain transitions that are subsets of confinement at nnp") Signed-off-by: Xiyu Yang <[email protected]> Signed-off-by: Xin Tan <[email protected]> Signed-off-by: John Johansen <[email protected]>
2020-05-21RISC-V: gp_in_global needs register keywordPalmer Dabbelt1-1/+1
The Intel kernel build robot recently pointed out that I missed the register keyword on this one when I refactored the code to remove local register variables (which aren't supported by LLVM). GCC's manual indicates that global register variables must have the register keyword, As far as I can tell lacking the register keyword causes GCC to ignore the __asm__ and treat this as a regular variable, but I'm not sure how that didn't show up as some sort of failure. Fixes: 52e7c52d2ded ("RISC-V: Stop relying on GCC's register allocator's hueristics") Signed-off-by: Palmer Dabbelt <[email protected]>
2020-05-21Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds2-10/+9
Pull virtio fixes from Michael Tsirkin: "Fix a couple of build warnings" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vhost: missing __user tags vdpasim: remove unused variable 'ret'
2020-05-21Merge tag 'dmaengine-fix-5.7-rc7' of ↵Linus Torvalds7-21/+40
git://git.infradead.org/users/vkoul/slave-dma Pull dmaengine fixes from Vinod Koul: "Some driver fixes: - dmatest restoration of defaults - tegra210-adma probe handling fix - k3-udma flags fixed for slave_sg and memcpy - list fix for zynqmp_dma - idxd interrupt completion fix - lock fix for owl" * tag 'dmaengine-fix-5.7-rc7' of git://git.infradead.org/users/vkoul/slave-dma: dmaengine: tegra210-adma: Fix an error handling path in 'tegra_adma_probe()' dmaengine: ti: k3-udma: Fix TR mode flags for slave_sg and memcpy dmaengine: zynqmp_dma: Move list_del inside zynqmp_dma_free_descriptor. dmaengine: dmatest: Restore default for channel dmaengine: idxd: fix interrupt completion after unmasking dmaengine: owl: Use correct lock in owl_dma_get_pchan()
2020-05-21Merge tag 'fiemap-regression-fix' of ↵Linus Torvalds3-32/+34
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "Fix regression in ext4's FIEMAP handling introduced in v5.7-rc1" * tag 'fiemap-regression-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: fix fiemap size checks for bitmap files ext4: fix EXT4_MAX_LOGICAL_BLOCK macro
2020-05-21null_blk: don't allow discard for zoned modeChaitanya Kulkarni1-0/+7
Zoned block device specification do not define the behavior of discard/trim command as this command is generally replaced by the reset write pointer (zone reset) command. Emulate this in null_blk by making zoned and discard options mutually exclusive. Suggested-by: Damien Le Moal <[email protected]> Signed-off-by: Chaitanya Kulkarni <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2020-05-21null_blk: return error for invalid zone sizeChaitanya Kulkarni1-0/+4
In null_init_zone_dev() check if the zone size is larger than device capacity, return error if needed. This also fixes the following oops :- null_blk: changed the number of conventional zones to 4294967295 BUG: kernel NULL pointer dereference, address: 0000000000000010 PGD 7d76c5067 P4D 7d76c5067 PUD 7d240c067 PMD 0 Oops: 0002 [#1] SMP NOPTI CPU: 4 PID: 5508 Comm: nullbtests.sh Tainted: G OE 5.7.0-rc4lblk-fnext0 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e4 RIP: 0010:null_init_zoned_dev+0x17a/0x27f [null_blk] RSP: 0018:ffffc90007007e00 EFLAGS: 00010246 RAX: 0000000000000020 RBX: ffff8887fb3f3c00 RCX: 0000000000000007 RDX: 0000000000000000 RSI: ffff8887ca09d688 RDI: ffff888810fea510 RBP: 0000000000000010 R08: ffff8887ca09d688 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffff8887c26e8000 R13: ffffffffa05e9390 R14: 0000000000000000 R15: 0000000000000001 FS: 00007fcb5256f740(0000) GS:ffff888810e00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000010 CR3: 000000081e8fe000 CR4: 00000000003406e0 Call Trace: null_add_dev+0x534/0x71b [null_blk] nullb_device_power_store.cold.41+0x8/0x2e [null_blk] configfs_write_file+0xe6/0x150 vfs_write+0xba/0x1e0 ksys_write+0x5f/0xe0 do_syscall_64+0x60/0x250 entry_SYSCALL_64_after_hwframe+0x49/0xb3 RIP: 0033:0x7fcb51c71840 Signed-off-by: Chaitanya Kulkarni <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2020-05-22powerpc/64s: Disable STRICT_KERNEL_RWXMichael Ellerman1-1/+1
Several strange crashes have been eventually traced back to STRICT_KERNEL_RWX and its interaction with code patching. Various paths in our ftrace, kprobes and other patching code need to be hardened against patching failures, otherwise we can end up running with partially/incorrectly patched ftrace paths, kprobes or jump labels, which can then cause strange crashes. Although fixes for those are in development, they're not -rc material. There also seem to be problems with the underlying strict RWX logic, which needs further debugging. So for now disable STRICT_KERNEL_RWX on 64-bit to prevent people from enabling the option and tripping over the bugs. Fixes: 1e0fc9d1eb2b ("powerpc/Kconfig: Enable STRICT_KERNEL_RWX for some configs") Cc: [email protected] # v4.13+ Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-05-21kobject: Make sure the parent does not get released before its childrenHeikki Krogerus1-10/+20
In the function kobject_cleanup(), kobject_del(kobj) is called before the kobj->release(). That makes it possible to release the parent of the kobject before the kobject itself. To fix that, adding function __kboject_del() that does everything that kobject_del() does except release the parent reference. kobject_cleanup() then calls __kobject_del() instead of kobject_del(), and separately decrements the reference count of the parent kobject after kobj->release() has been called. Reported-by: Naresh Kamboju <[email protected]> Reported-by: kernel test robot <[email protected]> Fixes: 7589238a8cf3 ("Revert "software node: Simplify software_node_release() function"") Suggested-by: "Rafael J. Wysocki" <[email protected]> Signed-off-by: Heikki Krogerus <[email protected]> Reviewed-by: Rafael J. Wysocki <[email protected]> Reviewed-by: Brendan Higgins <[email protected]> Tested-by: Brendan Higgins <[email protected]> Acked-by: Randy Dunlap <[email protected]> Link: https://lore.kernel.org/r/[email protected] Cc: stable <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2020-05-21driver core: Fix handling of SYNC_STATE_ONLY + STATELESS device linksSaravana Kannan1-3/+5
Commit 21c27f06587d ("driver core: Fix SYNC_STATE_ONLY device link implementation") didn't completely fix STATELESS + SYNC_STATE_ONLY handling. What looks like an optimization in that commit is actually a bug that causes an if condition to always take the else path. This prevents reordering of devices in the dpm_list when a DL_FLAG_STATELESS device link is create on top of an existing DL_FLAG_SYNC_STATE_ONLY device link. Fixes: 21c27f06587d ("driver core: Fix SYNC_STATE_ONLY device link implementation") Signed-off-by: Saravana Kannan <[email protected]> Cc: stable <[email protected]> Reviewed-by: Rafael J. Wysocki <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2020-05-20net: nlmsg_cancel() if put fails for nhmsgStephen Worley1-0/+1
Fixes data remnant seen when we fail to reserve space for a nexthop group during a larger dump. If we fail the reservation, we goto nla_put_failure and cancel the message. Reproduce with the following iproute2 commands: ===================== ip link add dummy1 type dummy ip link add dummy2 type dummy ip link add dummy3 type dummy ip link add dummy4 type dummy ip link add dummy5 type dummy ip link add dummy6 type dummy ip link add dummy7 type dummy ip link add dummy8 type dummy ip link add dummy9 type dummy ip link add dummy10 type dummy ip link add dummy11 type dummy ip link add dummy12 type dummy ip link add dummy13 type dummy ip link add dummy14 type dummy ip link add dummy15 type dummy ip link add dummy16 type dummy ip link add dummy17 type dummy ip link add dummy18 type dummy ip link add dummy19 type dummy ip link add dummy20 type dummy ip link add dummy21 type dummy ip link add dummy22 type dummy ip link add dummy23 type dummy ip link add dummy24 type dummy ip link add dummy25 type dummy ip link add dummy26 type dummy ip link add dummy27 type dummy ip link add dummy28 type dummy ip link add dummy29 type dummy ip link add dummy30 type dummy ip link add dummy31 type dummy ip link add dummy32 type dummy ip link set dummy1 up ip link set dummy2 up ip link set dummy3 up ip link set dummy4 up ip link set dummy5 up ip link set dummy6 up ip link set dummy7 up ip link set dummy8 up ip link set dummy9 up ip link set dummy10 up ip link set dummy11 up ip link set dummy12 up ip link set dummy13 up ip link set dummy14 up ip link set dummy15 up ip link set dummy16 up ip link set dummy17 up ip link set dummy18 up ip link set dummy19 up ip link set dummy20 up ip link set dummy21 up ip link set dummy22 up ip link set dummy23 up ip link set dummy24 up ip link set dummy25 up ip link set dummy26 up ip link set dummy27 up ip link set dummy28 up ip link set dummy29 up ip link set dummy30 up ip link set dummy31 up ip link set dummy32 up ip link set dummy33 up ip link set dummy34 up ip link set vrf-red up ip link set vrf-blue up ip link set dummyVRFred up ip link set dummyVRFblue up ip ro add 1.1.1.1/32 dev dummy1 ip ro add 1.1.1.2/32 dev dummy2 ip ro add 1.1.1.3/32 dev dummy3 ip ro add 1.1.1.4/32 dev dummy4 ip ro add 1.1.1.5/32 dev dummy5 ip ro add 1.1.1.6/32 dev dummy6 ip ro add 1.1.1.7/32 dev dummy7 ip ro add 1.1.1.8/32 dev dummy8 ip ro add 1.1.1.9/32 dev dummy9 ip ro add 1.1.1.10/32 dev dummy10 ip ro add 1.1.1.11/32 dev dummy11 ip ro add 1.1.1.12/32 dev dummy12 ip ro add 1.1.1.13/32 dev dummy13 ip ro add 1.1.1.14/32 dev dummy14 ip ro add 1.1.1.15/32 dev dummy15 ip ro add 1.1.1.16/32 dev dummy16 ip ro add 1.1.1.17/32 dev dummy17 ip ro add 1.1.1.18/32 dev dummy18 ip ro add 1.1.1.19/32 dev dummy19 ip ro add 1.1.1.20/32 dev dummy20 ip ro add 1.1.1.21/32 dev dummy21 ip ro add 1.1.1.22/32 dev dummy22 ip ro add 1.1.1.23/32 dev dummy23 ip ro add 1.1.1.24/32 dev dummy24 ip ro add 1.1.1.25/32 dev dummy25 ip ro add 1.1.1.26/32 dev dummy26 ip ro add 1.1.1.27/32 dev dummy27 ip ro add 1.1.1.28/32 dev dummy28 ip ro add 1.1.1.29/32 dev dummy29 ip ro add 1.1.1.30/32 dev dummy30 ip ro add 1.1.1.31/32 dev dummy31 ip ro add 1.1.1.32/32 dev dummy32 ip next add id 1 via 1.1.1.1 dev dummy1 ip next add id 2 via 1.1.1.2 dev dummy2 ip next add id 3 via 1.1.1.3 dev dummy3 ip next add id 4 via 1.1.1.4 dev dummy4 ip next add id 5 via 1.1.1.5 dev dummy5 ip next add id 6 via 1.1.1.6 dev dummy6 ip next add id 7 via 1.1.1.7 dev dummy7 ip next add id 8 via 1.1.1.8 dev dummy8 ip next add id 9 via 1.1.1.9 dev dummy9 ip next add id 10 via 1.1.1.10 dev dummy10 ip next add id 11 via 1.1.1.11 dev dummy11 ip next add id 12 via 1.1.1.12 dev dummy12 ip next add id 13 via 1.1.1.13 dev dummy13 ip next add id 14 via 1.1.1.14 dev dummy14 ip next add id 15 via 1.1.1.15 dev dummy15 ip next add id 16 via 1.1.1.16 dev dummy16 ip next add id 17 via 1.1.1.17 dev dummy17 ip next add id 18 via 1.1.1.18 dev dummy18 ip next add id 19 via 1.1.1.19 dev dummy19 ip next add id 20 via 1.1.1.20 dev dummy20 ip next add id 21 via 1.1.1.21 dev dummy21 ip next add id 22 via 1.1.1.22 dev dummy22 ip next add id 23 via 1.1.1.23 dev dummy23 ip next add id 24 via 1.1.1.24 dev dummy24 ip next add id 25 via 1.1.1.25 dev dummy25 ip next add id 26 via 1.1.1.26 dev dummy26 ip next add id 27 via 1.1.1.27 dev dummy27 ip next add id 28 via 1.1.1.28 dev dummy28 ip next add id 29 via 1.1.1.29 dev dummy29 ip next add id 30 via 1.1.1.30 dev dummy30 ip next add id 31 via 1.1.1.31 dev dummy31 ip next add id 32 via 1.1.1.32 dev dummy32 i=100 while [ $i -le 200 ] do ip next add id $i group 1/2/3/4/5/6/7/8/9/10/11/12/13/14/15/16/17/18/19 echo $i ((i++)) done ip next add id 999 group 1/2/3/4/5/6 ip next ls ======================== Fixes: ab84be7e54fc ("net: Initial nexthop code") Signed-off-by: Stephen Worley <[email protected]> Reviewed-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-05-20ax25: fix setsockopt(SO_BINDTODEVICE)Eric Dumazet1-2/+4
syzbot was able to trigger this trace [1], probably by using a zero optlen. While we are at it, cap optlen to IFNAMSIZ - 1 instead of IFNAMSIZ. [1] BUG: KMSAN: uninit-value in strnlen+0xf9/0x170 lib/string.c:569 CPU: 0 PID: 8807 Comm: syz-executor483 Not tainted 5.7.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c9/0x220 lib/dump_stack.c:118 kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:121 __msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:215 strnlen+0xf9/0x170 lib/string.c:569 dev_name_hash net/core/dev.c:207 [inline] netdev_name_node_lookup net/core/dev.c:277 [inline] __dev_get_by_name+0x75/0x2b0 net/core/dev.c:778 ax25_setsockopt+0xfa3/0x1170 net/ax25/af_ax25.c:654 __compat_sys_setsockopt+0x4ed/0x910 net/compat.c:403 __do_compat_sys_setsockopt net/compat.c:413 [inline] __se_compat_sys_setsockopt+0xdd/0x100 net/compat.c:410 __ia32_compat_sys_setsockopt+0x62/0x80 net/compat.c:410 do_syscall_32_irqs_on arch/x86/entry/common.c:339 [inline] do_fast_syscall_32+0x3bf/0x6d0 arch/x86/entry/common.c:398 entry_SYSENTER_compat+0x68/0x77 arch/x86/entry/entry_64_compat.S:139 RIP: 0023:0xf7f57dd9 Code: 90 e8 0b 00 00 00 f3 90 0f ae e8 eb f9 8d 74 26 00 89 3c 24 c3 90 90 90 90 90 90 90 90 90 90 90 90 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 eb 0d 90 90 90 90 90 90 90 90 90 90 90 90 RSP: 002b:00000000ffae8c1c EFLAGS: 00000217 ORIG_RAX: 000000000000016e RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000000101 RDX: 0000000000000019 RSI: 0000000020000000 RDI: 0000000000000004 RBP: 0000000000000012 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 Local variable ----devname@ax25_setsockopt created at: ax25_setsockopt+0xe6/0x1170 net/ax25/af_ax25.c:536 ax25_setsockopt+0xe6/0x1170 net/ax25/af_ax25.c:536 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <[email protected]> Reported-by: syzbot <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-05-20Merge branch 'wireguard-fixes'David S. Miller8-59/+71
Jason A. Donenfeld says: ==================== wireguard fixes for 5.7-rc7 Hopefully these are the last fixes for 5.7: 1) A trivial bump in the selftest harness to support gcc-10. build.wireguard.com is still on gcc-9 but I'll probably switch to gcc-10 in the coming weeks. 2) A concurrency fix regarding userspace modifying the pre-shared key at the same time as packets are being processed, reported by Matt Dunwoodie. 3) We were previously clearing skb->hash on egress, which broke fq_codel, cake, and other things that actually make use of the flow hash for queueing, reported by Dave Taht and Toke Høiland-Jørgensen. 4) A fix for the increased memory usage caused by (3). This can be thought of as part of patch (3), but because of the separate reasoning and breadth of it I thought made it a bit cleaner to put in a standalone commit. Fixes (2), (3), and (4) are -stable material. ==================== Signed-off-by: David S. Miller <[email protected]>
2020-05-20wireguard: noise: separate receive counter from send counterJason A. Donenfeld5-53/+48
In "wireguard: queueing: preserve flow hash across packet scrubbing", we were required to slightly increase the size of the receive replay counter to something still fairly small, but an increase nonetheless. It turns out that we can recoup some of the additional memory overhead by splitting up the prior union type into two distinct types. Before, we used the same "noise_counter" union for both sending and receiving, with sending just using a simple atomic64_t, while receiving used the full replay counter checker. This meant that most of the memory being allocated for the sending counter was being wasted. Since the old "noise_counter" type increased in size in the prior commit, now is a good time to split up that union type into a distinct "noise_replay_ counter" for receiving and a boring atomic64_t for sending, each using neither more nor less memory than required. Also, since sometimes the replay counter is accessed without necessitating additional accesses to the bitmap, we can reduce cache misses by hoisting the always-necessary lock above the bitmap in the struct layout. We also change a "noise_replay_counter" stack allocation to kmalloc in a -DDEBUG selftest so that KASAN doesn't trigger a stack frame warning. All and all, removing a bit of abstraction in this commit makes the code simpler and smaller, in addition to the motivating memory usage recuperation. For example, passing around raw "noise_symmetric_key" structs is something that really only makes sense within noise.c, in the one place where the sending and receiving keys can safely be thought of as the same type of object; subsequent to that, it's important that we uniformly access these through keypair->{sending,receiving}, where their distinct roles are always made explicit. So this patch allows us to draw that distinction clearly as well. Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Signed-off-by: Jason A. Donenfeld <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-05-20wireguard: queueing: preserve flow hash across packet scrubbingJason A. Donenfeld4-4/+17
It's important that we clear most header fields during encapsulation and decapsulation, because the packet is substantially changed, and we don't want any info leak or logic bug due to an accidental correlation. But, for encapsulation, it's wrong to clear skb->hash, since it's used by fq_codel and flow dissection in general. Without it, classification does not proceed as usual. This change might make it easier to estimate the number of innerflows by examining clustering of out of order packets, but this shouldn't open up anything that can't already be inferred otherwise (e.g. syn packet size inference), and fq_codel can be disabled anyway. Furthermore, it might be the case that the hash isn't used or queried at all until after wireguard transmits the encrypted UDP packet, which means skb->hash might still be zero at this point, and thus no hash taken over the inner packet data. In order to address this situation, we force a calculation of skb->hash before encrypting packet data. Of course this means that fq_codel might transmit packets slightly more out of order than usual. Toke did some testing on beefy machines with high quantities of parallel flows and found that increasing the reply-attack counter to 8192 takes care of the most pathological cases pretty well. Reported-by: Dave Taht <[email protected]> Reviewed-and-tested-by: Toke Høiland-Jørgensen <[email protected]> Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Signed-off-by: Jason A. Donenfeld <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-05-20wireguard: noise: read preshared key while taking lockJason A. Donenfeld1-1/+5
Prior we read the preshared key after dropping the handshake lock, which isn't an actual crypto issue if it races, but it's still not quite correct. So copy that part of the state into a temporary like we do with the rest of the handshake state variables. Then we can release the lock, operate on the temporary, and zero it out at the end of the function. In performance tests, the impact of this was entirely unnoticable, probably because those bytes are coming from the same cacheline as other things that are being copied out in the same manner. Reported-by: Matt Dunwoodie <[email protected]> Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Signed-off-by: Jason A. Donenfeld <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-05-20wireguard: selftests: use newer iproute2 for gcc-10Jason A. Donenfeld1-1/+1
gcc-10 switched to defaulting to -fno-common, which broke iproute2-5.4. This was fixed in iproute-5.6, so switch to that. Because we're after a stable testing surface, we generally don't like to bump these unnecessarily, but in this case, being able to actually build is a basic necessity. Signed-off-by: Jason A. Donenfeld <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-05-20bpf: Prevent mmap()'ing read-only maps as writableAndrii Nakryiko3-4/+34
As discussed in [0], it's dangerous to allow mapping BPF map, that's meant to be frozen and is read-only on BPF program side, because that allows user-space to actually store a writable view to the page even after it is frozen. This is exacerbated by BPF verifier making a strong assumption that contents of such frozen map will remain unchanged. To prevent this, disallow mapping BPF_F_RDONLY_PROG mmap()'able BPF maps as writable, ever. [0] https://lore.kernel.org/bpf/CAEf4BzYGWYhXdp6BJ7_=9OQPJxQpgug080MMjdSB72i9R+5c6g@mail.gmail.com/ Fixes: fc9702273e2e ("bpf: Add mmap() support for BPF_MAP_TYPE_ARRAY") Suggested-by: Jann Horn <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Reviewed-by: Jann Horn <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-05-20security: Fix hook iteration for secid_to_secctxKP Singh1-2/+14
secid_to_secctx is not stackable, and since the BPF LSM registers this hook by default, the call_int_hook logic is not suitable which "bails-on-fail" and casues issues when other LSMs register this hook and eventually breaks Audit. In order to fix this, directly iterate over the security hooks instead of using call_int_hook as suggested in: https: //lore.kernel.org/bpf/[email protected]/#t Fixes: 98e828a0650f ("security: Refactor declaration of LSM hooks") Fixes: 625236ba3832 ("security: Fix the default value of secid_to_secctx hook") Reported-by: Alexei Starovoitov <[email protected]> Signed-off-by: KP Singh <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: James Morris <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-05-20riscv: Fix print_vm_layout build error if NOMMUKefeng Wang1-1/+1
arch/riscv/mm/init.c: In function ‘print_vm_layout’: arch/riscv/mm/init.c:68:37: error: ‘FIXADDR_START’ undeclared (first use in this function); arch/riscv/mm/init.c:69:20: error: ‘FIXADDR_TOP’ undeclared arch/riscv/mm/init.c:70:37: error: ‘PCI_IO_START’ undeclared arch/riscv/mm/init.c:71:20: error: ‘PCI_IO_END’ undeclared arch/riscv/mm/init.c:72:38: error: ‘VMEMMAP_START’ undeclared arch/riscv/mm/init.c:73:20: error: ‘VMEMMAP_END’ undeclared (first use in this function); Reported-by: Hulk Robot <[email protected]> Signed-off-by: Kefeng Wang <[email protected]> Signed-off-by: Palmer Dabbelt <[email protected]>
2020-05-20drm/amd/display: Defer cursor lock until after VUPDATENicholas Kazlauskas6-1/+81
[Why] We dropped the delay after changed the cursor functions locking the entire pipe to locking just the CURSOR registers to fix page flip stuttering - this introduced cursor stuttering instead, and an underflow issue. The cursor update can be delayed indefinitely if the cursor update repeatedly happens right around VUPDATE. The underflow issue can happen if we do a viewport update on a pipe on the same frame where a cursor update happens around VUPDATE - the old cursor registers are retained which can be in an invalid position. This can cause a pipe hang and indefinite underflow. [How] The complex, ideal solution to the problem would be a software triple buffering mechanism from the DM layer to program only one cursor update per frame just before VUPDATE. The simple workaround until we have that infrastructure in place is this change - bring back the delay until VUPDATE before locking, but with some corrections to the calculations. This didn't work for all timings before because the calculation for VUPDATE was wrong - it was using the offset from VSTARTUP instead and didn't correctly handle the case where VUPDATE could be in the back porch. Add a new hardware sequencer function to use the existing helper to calculate the real VUPDATE start and VUPDATE end - VUPDATE can last multiple lines after all. Change the udelay to incorporate the width of VUPDATE as well. Signed-off-by: Nicholas Kazlauskas <[email protected]> Reviewed-by: Aric Cyr <[email protected]> Acked-by: Rodrigo Siqueira <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-05-20drm/amd/display: Remove dml_common_def fileRodrigo Siqueira11-94/+18
During the rework for removing the FPU issues, I found the following warning: [..] dml_common_defs.o: warning: objtool: dml_round()+0x9: FPU instruction outside of kernel_fpu_{begin,end}() This file has a single function that does not need to be in a specific file. This commit drop dml_common_defs file, and move dml_round function to dml_inline_defs. CC: Christian König <[email protected]> CC: Alexander Deucher <[email protected]> CC: Peter Zijlstra <[email protected]> CC: Tony Cheng <[email protected]> CC: Harry Wentland <[email protected]> Signed-off-by: Rodrigo Siqueira <[email protected]> Reviewed-by: Dmytro Laktyushkin <[email protected]> Acked-by: Rodrigo Siqueira <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-05-20drm/amd/display: DP training to set properly SCRAMBLING_DISABLEVladimir Stempen1-0/+27
[Why] DP training sequence to set SCRAMBLING_DISABLE bit properly based on training pattern - per DP Spec. [How] Update dpcd_pattern.v1_4.SCRAMBLING_DISABLE with 1 for TPS1, TPS2, TPS3, but not for TPS4. Signed-off-by: Vladimir Stempen <[email protected]> Reviewed-by: Wenjing Liu <[email protected]> Acked-by: Rodrigo Siqueira <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-05-20Merge tag 'fixes-for-5.7-rc6' of ↵Linus Torvalds3-3/+6
git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux Pull MTD fixes from Richard Weinberger: - Fix a PM regression in brcmnand driver - Propagate ECC information correctly on SPI-NAND - Make sure no MTD name is used multiple time in nvmem * tag 'fixes-for-5.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: mtd:rawnand: brcmnand: Fix PM resume crash mtd: Fix mtd not registered due to nvmem name collision mtd: spinand: Propagate ECC information to the MTD structure
2020-05-20Merge tag 'for-linus-5.7-rc6' of ↵Linus Torvalds4-39/+9
git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs Pull UBI and UBIFS fixes from Richard Weinberger: - Correctly set next cursor for detailed_erase_block_info debugfs file - Don't use crypto_shash_descsize() for digest size in UBIFS - Remove broken lazytime support from UBIFS * tag 'for-linus-5.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs: ubi: Fix seq_file usage in detailed_erase_block_info debugfs file ubifs: fix wrong use of crypto_shash_descsize() ubifs: remove broken lazytime support
2020-05-20Merge tag 'for-linus-5.7-rc6' of ↵Linus Torvalds3-2/+3
git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml Pull UML fixes from Richard Weinberger: - Two missing includes which caused build issues on recent systems - Correctly set TRANS_GRE_LEN in our vector network driver * tag 'for-linus-5.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml: um: Fix typo in vector driver transport option definition um: syscall.c: include <asm/unistd.h> um: Fix xor.h include
2020-05-20Merge tag 'pm-5.7-rc7' of ↵Linus Torvalds2-12/+9
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fix from Rafael Wysocki: "This makes a recently introduced suspend-to-idle wakeup issue on Dell XPS13 9360 go away" * tag 'pm-5.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: EC: PM: Avoid flushing EC work when EC GPE is inactive
2020-05-20Merge tag 'ovl-fixes-5.7-rc7' of ↵Linus Torvalds2-0/+21
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs Pull overlayfs fixes from Miklos Szeredi: "Fix two bugs introduced in this cycle and one introduced in v5.5" * tag 'ovl-fixes-5.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: ovl: potential crash in ovl_fid_to_fh() ovl: clear ATTR_OPEN from attr->ia_valid ovl: clear ATTR_FILE from attr->ia_valid
2020-05-20pipe: Fix pipe_full() test in opipe_prep().Tetsuo Handa1-1/+1
syzbot is reporting that splice()ing from non-empty read side to already-full write side causes unkillable task, for opipe_prep() is by error not inverting pipe_full() test. CPU: 0 PID: 9460 Comm: syz-executor.5 Not tainted 5.6.0-rc3-next-20200228-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:rol32 include/linux/bitops.h:105 [inline] RIP: 0010:iterate_chain_key kernel/locking/lockdep.c:369 [inline] RIP: 0010:__lock_acquire+0x6a3/0x5270 kernel/locking/lockdep.c:4178 Call Trace: lock_acquire+0x197/0x420 kernel/locking/lockdep.c:4720 __mutex_lock_common kernel/locking/mutex.c:956 [inline] __mutex_lock+0x156/0x13c0 kernel/locking/mutex.c:1103 pipe_lock_nested fs/pipe.c:66 [inline] pipe_double_lock+0x1a0/0x1e0 fs/pipe.c:104 splice_pipe_to_pipe fs/splice.c:1562 [inline] do_splice+0x35f/0x1520 fs/splice.c:1141 __do_sys_splice fs/splice.c:1447 [inline] __se_sys_splice fs/splice.c:1427 [inline] __x64_sys_splice+0x2b5/0x320 fs/splice.c:1427 do_syscall_64+0xf6/0x790 arch/x86/entry/common.c:295 entry_SYSCALL_64_after_hwframe+0x49/0xbe Reported-by: [email protected] Link: https://syzkaller.appspot.com/bug?id=9386d051e11e09973d5a4cf79af5e8cedf79386d Fixes: 8cefc107ca54c8b0 ("pipe: Use head and tail pointers for the ring, not cursor and length") Cc: [email protected] # 5.5+ Signed-off-by: Tetsuo Handa <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2020-05-20rxrpc: Fix ack discardDavid Howells1-4/+26
The Rx protocol has a "previousPacket" field in it that is not handled in the same way by all protocol implementations. Sometimes it contains the serial number of the last DATA packet received, sometimes the sequence number of the last DATA packet received and sometimes the highest sequence number so far received. AF_RXRPC is using this to weed out ACKs that are out of date (it's possible for ACK packets to get reordered on the wire), but this does not work with OpenAFS which will just stick the sequence number of the last packet seen into previousPacket. The issue being seen is that big AFS FS.StoreData RPC (eg. of ~256MiB) are timing out when partly sent. A trace was captured, with an additional tracepoint to show ACKs being discarded in rxrpc_input_ack(). Here's an excerpt showing the problem. 52873.203230: rxrpc_tx_data: c=000004ae DATA ed1a3584:00000002 0002449c q=00024499 fl=09 A DATA packet with sequence number 00024499 has been transmitted (the "q=" field). ... 52873.243296: rxrpc_rx_ack: c=000004ae 00012a2b DLY r=00024499 f=00024497 p=00024496 n=0 52873.243376: rxrpc_rx_ack: c=000004ae 00012a2c IDL r=0002449b f=00024499 p=00024498 n=0 52873.243383: rxrpc_rx_ack: c=000004ae 00012a2d OOS r=0002449d f=00024499 p=0002449a n=2 The Out-Of-Sequence ACK indicates that the server didn't see DATA sequence number 00024499, but did see seq 0002449a (previousPacket, shown as "p=", skipped the number, but firstPacket, "f=", which shows the bottom of the window is set at that point). 52873.252663: rxrpc_retransmit: c=000004ae q=24499 a=02 xp=14581537 52873.252664: rxrpc_tx_data: c=000004ae DATA ed1a3584:00000002 000244bc q=00024499 fl=0b *RETRANS* The packet has been retransmitted. Retransmission recurs until the peer says it got the packet. 52873.271013: rxrpc_rx_ack: c=000004ae 00012a31 OOS r=000244a1 f=00024499 p=0002449e n=6 More OOS ACKs indicate that the other packets that are already in the transmission pipeline are being received. The specific-ACK list is up to 6 ACKs and NAKs. ... 52873.284792: rxrpc_rx_ack: c=000004ae 00012a49 OOS r=000244b9 f=00024499 p=000244b6 n=30 52873.284802: rxrpc_retransmit: c=000004ae q=24499 a=0a xp=63505500 52873.284804: rxrpc_tx_data: c=000004ae DATA ed1a3584:00000002 000244c2 q=00024499 fl=0b *RETRANS* 52873.287468: rxrpc_rx_ack: c=000004ae 00012a4a OOS r=000244ba f=00024499 p=000244b7 n=31 52873.287478: rxrpc_rx_ack: c=000004ae 00012a4b OOS r=000244bb f=00024499 p=000244b8 n=32 At this point, the server's receive window is full (n=32) with presumably 1 NAK'd packet and 31 ACK'd packets. We can't transmit any more packets. 52873.287488: rxrpc_retransmit: c=000004ae q=24499 a=0a xp=61327980 52873.287489: rxrpc_tx_data: c=000004ae DATA ed1a3584:00000002 000244c3 q=00024499 fl=0b *RETRANS* 52873.293850: rxrpc_rx_ack: c=000004ae 00012a4c DLY r=000244bc f=000244a0 p=00024499 n=25 And now we've received an ACK indicating that a DATA retransmission was received. 7 packets have been processed (the occupied part of the window moved, as indicated by f= and n=). 52873.293853: rxrpc_rx_discard_ack: c=000004ae r=00012a4c 000244a0<00024499 00024499<000244b8 However, the DLY ACK gets discarded because its previousPacket has gone backwards (from p=000244b8, in the ACK at 52873.287478 to p=00024499 in the ACK at 52873.293850). We then end up in a continuous cycle of retransmit/discard. kafs fails to update its window because it's discarding the ACKs and can't transmit an extra packet that would clear the issue because the window is full. OpenAFS doesn't change the previousPacket value in the ACKs because no new DATA packets are received with a different previousPacket number. Fix this by altering the discard check to only discard an ACK based on previousPacket if there was no advance in the firstPacket. This allows us to transmit a new packet which will cause previousPacket to advance in the next ACK. The check, however, needs to allow for the possibility that previousPacket may actually have had the serial number placed in it instead - in which case it will go outside the window and we should ignore it. Fixes: 1a2391c30c0b ("rxrpc: Fix detection of out of order acks") Reported-by: Dave Botsch <[email protected]> Signed-off-by: David Howells <[email protected]>
2020-05-20rxrpc: Trace discarded ACKsDavid Howells2-2/+45
Add a tracepoint to track received ACKs that are discarded due to being outside of the Tx window. Signed-off-by: David Howells <[email protected]>
2020-05-20io_uring: reset -EBUSY error when io sq thread is waken upXiaoguang Wang1-0/+1
In io_sq_thread(), currently if we get an -EBUSY error and go to sleep, we will won't clear it again, which will result in io_sq_thread() will never have a chance to submit sqes again. Below test program test.c can reveal this bug: int main(int argc, char *argv[]) { struct io_uring ring; int i, fd, ret; struct io_uring_sqe *sqe; struct io_uring_cqe *cqe; struct iovec *iovecs; void *buf; struct io_uring_params p; if (argc < 2) { printf("%s: file\n", argv[0]); return 1; } memset(&p, 0, sizeof(p)); p.flags = IORING_SETUP_SQPOLL; ret = io_uring_queue_init_params(4, &ring, &p); if (ret < 0) { fprintf(stderr, "queue_init: %s\n", strerror(-ret)); return 1; } fd = open(argv[1], O_RDONLY | O_DIRECT); if (fd < 0) { perror("open"); return 1; } iovecs = calloc(10, sizeof(struct iovec)); for (i = 0; i < 10; i++) { if (posix_memalign(&buf, 4096, 4096)) return 1; iovecs[i].iov_base = buf; iovecs[i].iov_len = 4096; } ret = io_uring_register_files(&ring, &fd, 1); if (ret < 0) { fprintf(stderr, "%s: register %d\n", __FUNCTION__, ret); return ret; } for (i = 0; i < 10; i++) { sqe = io_uring_get_sqe(&ring); if (!sqe) break; io_uring_prep_readv(sqe, 0, &iovecs[i], 1, 0); sqe->flags |= IOSQE_FIXED_FILE; ret = io_uring_submit(&ring); sleep(1); printf("submit %d\n", i); } for (i = 0; i < 10; i++) { io_uring_wait_cqe(&ring, &cqe); printf("receive: %d\n", i); if (cqe->res != 4096) { fprintf(stderr, "ret=%d, wanted 4096\n", cqe->res); ret = 1; } io_uring_cqe_seen(&ring, cqe); } close(fd); io_uring_queue_exit(&ring); return 0; } sudo ./test testfile above command will hang on the tenth request, to fix this bug, when io sq_thread is waken up, we reset the variable 'ret' to be zero. Suggested-by: Jens Axboe <[email protected]> Signed-off-by: Xiaoguang Wang <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2020-05-20Revert "powerpc/32s: reorder Linux PTE bits to better match Hash PTE bits."Christophe Leroy3-13/+18
This reverts commit 697ece78f8f749aeea40f2711389901f0974017a. The implementation of SWAP on powerpc requires page protection bits to not be one of the least significant PTE bits. Until the SWAP implementation is changed and this requirement voids, we have to keep at least _PAGE_RW outside of the 3 last bits. For now, revert to previous PTE bits order. A further rework may come later. Fixes: 697ece78f8f7 ("powerpc/32s: reorder Linux PTE bits to better match Hash PTE bits.") Reported-by: Rui Salvaterra <[email protected]> Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/b34706f8de87f84d135abb5f3ede6b6f16fb1f41.1589969799.git.christophe.leroy@csgroup.eu
2020-05-20arm64: Fix PTRACE_SYSEMU semanticsKeno Fischer1-3/+4
Quoth the man page: ``` If the tracee was restarted by PTRACE_SYSCALL or PTRACE_SYSEMU, the tracee enters syscall-enter-stop just prior to entering any system call (which will not be executed if the restart was using PTRACE_SYSEMU, regardless of any change made to registers at this point or how the tracee is restarted after this stop). ``` The parenthetical comment is currently true on x86 and powerpc, but not currently true on arm64. arm64 re-checks the _TIF_SYSCALL_EMU flag after the syscall entry ptrace stop. However, at this point, it reflects which method was used to re-start the syscall at the entry stop, rather than the method that was used to reach it. Fix that by recording the original flag before performing the ptrace stop, bringing the behavior in line with documentation and x86/powerpc. Fixes: f086f67485c5 ("arm64: ptrace: add support for syscall emulation") Cc: <[email protected]> # 5.3.x- Signed-off-by: Keno Fischer <[email protected]> Acked-by: Will Deacon <[email protected]> Tested-by: Sudeep Holla <[email protected]> Tested-by: Bin Lu <[email protected]> [[email protected]: moved 'flags' bit masking] [[email protected]: changed 'flags' type to unsigned long] Signed-off-by: Catalin Marinas <[email protected]>
2020-05-20s390/kaslr: add support for R_390_JMP_SLOT relocation typeGerald Schaefer1-0/+1
With certain kernel configurations, the R_390_JMP_SLOT relocation type might be generated, which is not expected by the KASLR relocation code, and the kernel stops with the message "Unknown relocation type". This was found with a zfcpdump kernel config, where CONFIG_MODULES=n and CONFIG_VFIO=n. In that case, symbol_get() is used on undefined __weak symbols in virt/kvm/vfio.c, which results in the generation of R_390_JMP_SLOT relocation types. Fix this by handling R_390_JMP_SLOT similar to R_390_GLOB_DAT. Fixes: 805bc0bc238f ("s390/kernel: build a relocatable kernel") Cc: <[email protected]> # v5.2+ Signed-off-by: Gerald Schaefer <[email protected]> Reviewed-by: Philipp Rudo <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2020-05-20s390/mm: fix set_huge_pte_at() for empty ptesGerald Schaefer1-3/+6
On s390, the layout of normal and large ptes (i.e. pmds/puds) differs. Therefore, set_huge_pte_at() does a conversion from a normal pte to the corresponding large pmd/pud. So, when converting an empty pte, this should result in an empty pmd/pud, which would return true for pmd/pud_none(). However, after conversion we also mark the pmd/pud as large, and therefore present. For empty ptes, this will result in an empty pmd/pud that is also marked as large, and pmd/pud_none() would not return true. There is currently no issue with this behaviour, as set_huge_pte_at() does not seem to be called for empty ptes. It would be valid though, so let's fix this by not marking empty ptes as large in set_huge_pte_at(). This was found by testing a patch from from Anshuman Khandual, which is currently discussed on LKML ("mm/debug: Add more arch page table helper tests"). Signed-off-by: Gerald Schaefer <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2020-05-19io_uring: don't add non-IO requests to iopoll pending listJens Axboe1-1/+2
We normally disable any commands that aren't specifically poll commands for a ring that is setup for polling, but we do allow buffer provide and remove commands to support buffer selection for polled IO. Once a request is issued, we add it to the poll list to poll for completion. But we should not do that for non-IO commands, as those request complete inline immediately and aren't pollable. If we do, we can leave requests on the iopoll list after they are freed. Fixes: ddf0322db79c ("io_uring: add IORING_OP_PROVIDE_BUFFERS") Signed-off-by: Jens Axboe <[email protected]>