aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2021-08-31Merge tag 'gfs2-v5.14-rc2-fixes' of ↵Linus Torvalds13-141/+139
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 updates from Andreas Gruenbacher: - Various withdraw related fixes (freeze glock recursion, thread initialization / destruction order, journal recovery, glock cleanup, withdraw under journal lock). - Some error message improvements. - Various minor cleanups. * tag 'gfs2-v5.14-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: gfs2: Remove redundant check from gfs2_glock_dq gfs2: Delay withdraw from atomic context gfs2: Don't call dlm after protocol is unmounted gfs2: don't stop reads while withdraw in progress gfs2: Mark journal inodes as "don't cache" gfs2: nit: gfs2_drop_inode shouldn't return bool gfs2: Eliminate vestigial HIF_FIRST gfs2: Make recovery error more readable gfs2: Don't release and reacquire local statfs bh gfs2: init system threads before freeze lock gfs2: tiny cleanup in gfs2_log_reserve gfs2: trivial clean up of gfs2_ail_error gfs2: be more verbose replaying invalid rgrp blocks gfs2: Fix glock recursion in freeze_go_xmote_bh gfs2: Fix memory leak of object lsi on error return path
2021-08-31Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscryptLinus Torvalds7-58/+260
Pull fscrypt updates from Eric Biggers: "Some small fixes and cleanups for fs/crypto/: - Fix ->getattr() for ext4, f2fs, and ubifs to report the correct st_size for encrypted symlinks - Use base64url instead of a custom Base64 variant - Document struct fscrypt_operations" * tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt: fscrypt: document struct fscrypt_operations fscrypt: align Base64 encoding with RFC 4648 base64url fscrypt: remove mention of symlink st_size quirk from documentation ubifs: report correct st_size for encrypted symlinks f2fs: report correct st_size for encrypted symlinks ext4: report correct st_size for encrypted symlinks fscrypt: add fscrypt_symlink_getattr() for computing st_size
2021-08-31Merge tag 'for-5.15-tag' of ↵Linus Torvalds58-1425/+2769
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs updates from David Sterba: "The highlights of this round are integrations with fs-verity and idmapped mounts, the rest is usual mix of minor improvements, speedups and cleanups. There are some patches outside of btrfs, namely updating some VFS interfaces, all straightforward and acked. Features: - fs-verity support, using standard ioctls, backward compatible with read-only limitation on inodes with previously enabled fs-verity - idmapped mount support - make mount with rescue=ibadroots more tolerant to partially damaged trees - allow raid0 on a single device and raid10 on two devices, degenerate cases but might be useful as an intermediate step during conversion to other profiles - zoned mode block group auto reclaim can be disabled via sysfs knob Performance improvements: - continue readahead of node siblings even if target node is in memory, could speed up full send (on sample test +11%) - batching of delayed items can speed up creating many files - fsync/tree-log speedups - avoid unnecessary work (gains +2% throughput, -2% run time on sample load) - reduced lock contention on renames (on dbench +4% throughput, up to -30% latency) Fixes: - various zoned mode fixes - preemptive flushing threshold tuning, avoid excessive work on almost full filesystems Core: - continued subpage support, preparation for implementing remaining features like compression and defragmentation; with some limitations, write is now enabled on 64K page systems with 4K sectors, still considered experimental - no readahead on compressed reads - inline extents disabled - disabled raid56 profile conversion and mount - improved flushing logic, fixing early ENOSPC on some workloads - inode flags have been internally split to read-only and read-write incompat bit parts, used by fs-verity - new tree items for fs-verity - descriptor item - Merkle tree item - inode operations extended to be namespace-aware - cleanups and refactoring Generic code changes: - fs: new export filemap_fdatawrite_wbc - fs: removed sync_inode - block: bio_trim argument type fixups - vfs: add namespace-aware lookup" * tag 'for-5.15-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (114 commits) btrfs: reset replace target device to allocation state on close btrfs: zoned: fix ordered extent boundary calculation btrfs: do not do preemptive flushing if the majority is global rsv btrfs: reduce the preemptive flushing threshold to 90% btrfs: tree-log: check btrfs_lookup_data_extent return value btrfs: avoid unnecessarily logging directories that had no changes btrfs: allow idmapped mount btrfs: handle ACLs on idmapped mounts btrfs: allow idmapped INO_LOOKUP_USER ioctl btrfs: allow idmapped SUBVOL_SETFLAGS ioctl btrfs: allow idmapped SET_RECEIVED_SUBVOL ioctls btrfs: relax restrictions for SNAP_DESTROY_V2 with subvolids btrfs: allow idmapped SNAP_DESTROY ioctls btrfs: allow idmapped SNAP_CREATE/SUBVOL_CREATE ioctls btrfs: check whether fsgid/fsuid are mapped during subvolume creation btrfs: allow idmapped permission inode op btrfs: allow idmapped setattr inode op btrfs: allow idmapped tmpfile inode op btrfs: allow idmapped symlink inode op btrfs: allow idmapped mkdir inode op ...
2021-08-31Merge tag '5.15-rc-smb3-fixes-part1' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds30-763/+485
Pull cifs client updates from Steve French: "Eleven cifs/smb3 client fixes: - mostly restructuring to allow disabling less secure algorithms (this will allow eventual removing rc4 and md4 from general use in the kernel) - four fixes, including two for stable - enable r/w support with fscache and cifs.ko I am working on a larger set of changes (the usual ... multichannel, auth and signing improvements), but wanted to get these in earlier to reduce chance of merge conflicts later in the merge window" * tag '5.15-rc-smb3-fixes-part1' of git://git.samba.org/sfrench/cifs-2.6: cifs: Do not leak EDEADLK to dgetents64 for STATUS_USER_SESSION_DELETED cifs: add cifs_common directory to MAINTAINERS file cifs: cifs_md4 convert to SPDX identifier cifs: create a MD4 module and switch cifs.ko to use it cifs: fork arc4 and create a separate module for it for cifs and other users cifs: remove support for NTLM and weaker authentication algorithms cifs: enable fscache usage even for files opened as rw oid_registry: Add OIDs for missing Spnego auth mechanisms to Macs smb3: fix posix extensions mount option cifs: fix wrong release in sess_alloc_buffer() failed path CIFS: Fix a potencially linear read overflow
2021-08-31Merge tag '5.15-rc-first-ksmbd-merge' of git://git.samba.org/ksmbdLinus Torvalds66-2/+32254
Pull initial ksmbd implementation from Steve French: "Initial merge of kernel smb3 file server, ksmbd. The SMB family of protocols is the most widely deployed network filesystem protocol, the default on Windows and Macs (and even on many phones and tablets), with clients and servers on all major operating systems, but lacked a kernel server for Linux. For many cases the current userspace server choices were suboptimal either due to memory footprint, performance or difficulty integrating well with advanced Linux features. ksmbd is a new kernel module which implements the server-side of the SMB3 protocol. The target is to provide optimized performance, GPLv2 SMB server, and better lease handling (distributed caching). The bigger goal is to add new features more rapidly (e.g. RDMA aka "smbdirect", and recent encryption and signing improvements to the protocol) which are easier to develop on a smaller, more tightly optimized kernel server than for example in Samba. The Samba project is much broader in scope (tools, security services, LDAP, Active Directory Domain Controller, and a cross platform file server for a wider variety of purposes) but the user space file server portion of Samba has proved hard to optimize for some Linux workloads, including for smaller devices. This is not meant to replace Samba, but rather be an extension to allow better optimizing for Linux, and will continue to integrate well with Samba user space tools and libraries where appropriate. Working with the Samba team we have already made sure that the configuration files and xattrs are in a compatible format between the kernel and user space server. Various types of functional and regression tests are regularly run against it. One example is the automated 'buildbot' regression tests which use the Linux client to test against ksmbd, e.g. http://smb3-test-rhel-75.southcentralus.cloudapp.azure.com/#/builders/8/builds/56 but other test suites, including Samba's smbtorture functional test suite are also used regularly" * tag '5.15-rc-first-ksmbd-merge' of git://git.samba.org/ksmbd: (219 commits) ksmbd: fix __write_overflow warning in ndr_read_string MAINTAINERS: ksmbd: add cifs_common directory to ksmbd entry MAINTAINERS: ksmbd: update my email address ksmbd: fix permission check issue on chown and chmod ksmbd: don't set FILE DELETE and FILE_DELETE_CHILD in access mask by default MAINTAINERS: add git adddress of ksmbd ksmbd: update SMB3 multi-channel support in ksmbd.rst ksmbd: smbd: fix kernel oops during server shutdown ksmbd: remove select FS_POSIX_ACL in Kconfig ksmbd: use proper errno instead of -1 in smb2_get_ksmbd_tcon() ksmbd: update the comment for smb2_get_ksmbd_tcon() ksmbd: change int data type to boolean ksmbd: Fix multi-protocol negotiation ksmbd: fix an oops in error handling in smb2_open() ksmbd: add ipv6_addr_v4mapped check to know if connection from client is ipv4 ksmbd: fix missing error code in smb2_lock ksmbd: use channel signingkey for binding SMB2 session setup ksmbd: don't set RSS capable in FSCTL_QUERY_NETWORK_INTERFACE_INFO ksmbd: Return STATUS_OBJECT_PATH_NOT_FOUND if smb2_creat() returns ENOENT ksmbd: fix -Wstringop-truncation warnings ...
2021-08-31Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski14-20/+117
include/linux/netdevice.h net/socket.c d0efb16294d1 ("net: don't unconditionally copy_from_user a struct ifreq for socket ioctls") 876f0bf9d0d5 ("net: socket: simplify dev_ifconf handling") 29c4964822aa ("net: socket: rework compat_ifreq_ioctl()") Signed-off-by: Jakub Kicinski <[email protected]>
2021-08-31net: Add depends on OF_NET for LiteX's LiteETHSlark Xiao1-0/+1
Current settings may produce a build error when CONFIG_OF_NET is disabled. The CONFIG_OF_NET controls a headfile <linux/of.h> and some functions in <linux/of_net.h>. Signed-off-by: Slark Xiao <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2021-08-31dt-bindings: display: remove zte,vou.txt binding docZenghui Yu1-120/+0
The zte zx platform was removed in commit 89d4f98ae90d ("ARM: remove zte zx platform") and the zxdrm driver is going to be removed in v5.15 via drm tree. Let's remove the now obsolete binding doc. Cc: Arnd Bergmann <[email protected]> Cc: Jun Nie <[email protected]> Cc: Shawn Guo <[email protected]> Signed-off-by: Zenghui Yu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Rob Herring <[email protected]>
2021-08-31dt-bindings: hwmon: merge max1619 into trivial devicesKrzysztof Kozlowski2-12/+2
Ther Maxim max1619 bindings are trivial, so simply merge it into trivial-devices.yaml. Signed-off-by: Krzysztof Kozlowski <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Rob Herring <[email protected]>
2021-08-31dt-bindings: mtd-physmap: Add 'arm,vexpress-flash' compatibleRob Herring1-0/+1
The 'arm,vexpress-flash' compatible is in use, but has never been documented, so add it now. Cc: Miquel Raynal <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Vignesh Raghavendra <[email protected]> Cc: [email protected] Signed-off-by: Rob Herring <[email protected]> Acked-by: Miquel Raynal <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Rob Herring <[email protected]>
2021-08-31ipv6: seg6: remove duplicated includeLv Ruyi1-1/+0
Remove all but the first include of net/lwtunnel.h from 'seg6_local.c. Reported-by: Zeal Robot <[email protected]> Signed-off-by: Lv Ruyi <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-08-31net: hns3: remove unnecessary spacesHao Chen2-2/+2
This patch removes some unnecessary spaces for cleanup. Signed-off-by: Hao Chen <[email protected]> Signed-off-by: Guangbin Huang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-08-31net: hns3: add some required spacesHao Chen5-25/+25
Add some required spaces to improve readability. Signed-off-by: Hao Chen <[email protected]> Signed-off-by: Guangbin Huang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-08-31net: hns3: clean up a type mismatch warningGuojia Liao1-1/+8
abs() returns signed long, which could not convert the type as unsigned, and it may cause a mismatch type warning from static tools. To fix it, this patch uses an variable to save the abs()'s result and does a explicit conversion. Signed-off-by: Guojia Liao <[email protected]> Signed-off-by: Guangbin Huang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-08-31net: hns3: refine function hns3_set_default_feature()Jian Shen1-46/+16
Currently, the driver sets default feature for netdev->features, netdev->hw_features, netdev->vlan_features and netdev->hw_enc_features separately. It's fussy, because most of the feature bits are same. So refine it by copy value from netdev->features. Signed-off-by: Jian Shen <[email protected]> Signed-off-by: Guangbin Huang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-08-31ipv6: remove duplicated 'net/lwtunnel.h' includeLv Ruyi1-1/+0
Remove all but the first include of net/lwtunnel.h from seg6_iptunnel.c. Reported-by: Zeal Robot <[email protected]> Signed-off-by: Lv Ruyi <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-08-31net: w5100: check return value after calling platform_get_resource()Yang Yingliang1-0/+2
It will cause null-ptr-deref if platform_get_resource() returns NULL, we need check the return value. Signed-off-by: Yang Yingliang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-08-31net/mlxbf_gige: Make use of devm_platform_ioremap_resourcexxx()Cai Huoqing4-36/+9
Use the devm_platform_ioremap_resource_byname() helper instead of calling platform_get_resource_byname() and devm_ioremap_resource() separately Use the devm_platform_ioremap_resource() helper instead of calling platform_get_resource() and devm_ioremap_resource() separately Signed-off-by: Cai Huoqing <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-08-31net: mdio: mscc-miim: Make use of the helper function ↵Cai Huoqing1-8/+4
devm_platform_ioremap_resource() Use the devm_platform_ioremap_resource() helper instead of calling platform_get_resource() and devm_ioremap_resource() separately Signed-off-by: Cai Huoqing <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-08-31net: mdio-ipq4019: Make use of devm_platform_ioremap_resource()Cai Huoqing1-4/+1
Use the devm_platform_ioremap_resource() helper instead of calling platform_get_resource() and devm_ioremap_resource() separately Signed-off-by: Cai Huoqing <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-08-31fou: remove sparse errorsEric Dumazet2-6/+6
We need to add __rcu qualifier to avoid these errors: net/ipv4/fou.c:250:18: warning: incorrect type in assignment (different address spaces) net/ipv4/fou.c:250:18: expected struct net_offload const **offloads net/ipv4/fou.c:250:18: got struct net_offload const [noderef] __rcu ** net/ipv4/fou.c:251:15: error: incompatible types in comparison expression (different address spaces): net/ipv4/fou.c:251:15: struct net_offload const [noderef] __rcu * net/ipv4/fou.c:251:15: struct net_offload const * net/ipv4/fou.c:272:18: warning: incorrect type in assignment (different address spaces) net/ipv4/fou.c:272:18: expected struct net_offload const **offloads net/ipv4/fou.c:272:18: got struct net_offload const [noderef] __rcu ** net/ipv4/fou.c:273:15: error: incompatible types in comparison expression (different address spaces): net/ipv4/fou.c:273:15: struct net_offload const [noderef] __rcu * net/ipv4/fou.c:273:15: struct net_offload const * net/ipv4/fou.c:442:18: warning: incorrect type in assignment (different address spaces) net/ipv4/fou.c:442:18: expected struct net_offload const **offloads net/ipv4/fou.c:442:18: got struct net_offload const [noderef] __rcu ** net/ipv4/fou.c:443:15: error: incompatible types in comparison expression (different address spaces): net/ipv4/fou.c:443:15: struct net_offload const [noderef] __rcu * net/ipv4/fou.c:443:15: struct net_offload const * net/ipv4/fou.c:489:18: warning: incorrect type in assignment (different address spaces) net/ipv4/fou.c:489:18: expected struct net_offload const **offloads net/ipv4/fou.c:489:18: got struct net_offload const [noderef] __rcu ** net/ipv4/fou.c:490:15: error: incompatible types in comparison expression (different address spaces): net/ipv4/fou.c:490:15: struct net_offload const [noderef] __rcu * net/ipv4/fou.c:490:15: struct net_offload const * net/ipv4/udp_offload.c:170:26: warning: incorrect type in assignment (different address spaces) net/ipv4/udp_offload.c:170:26: expected struct net_offload const **offloads net/ipv4/udp_offload.c:170:26: got struct net_offload const [noderef] __rcu ** net/ipv4/udp_offload.c:171:23: error: incompatible types in comparison expression (different address spaces): net/ipv4/udp_offload.c:171:23: struct net_offload const [noderef] __rcu * net/ipv4/udp_offload.c:171:23: struct net_offload const * Fixes: efc98d08e1ec ("fou: eliminate IPv4,v6 specific GRO functions") Fixes: 8bce6d7d0d1e ("udp: Generalize skb_udp_segment") Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-08-31ipv4: fix endianness issue in inet_rtm_getroute_build_skb()Eric Dumazet1-1/+1
The UDP length field should be in network order. This removes the following sparse error: net/ipv4/route.c:3173:27: warning: incorrect type in assignment (different base types) net/ipv4/route.c:3173:27: expected restricted __be16 [usertype] len net/ipv4/route.c:3173:27: got unsigned long Fixes: 404eb77ea766 ("ipv4: support sport, dport and ip_proto in RTM_GETROUTE") Signed-off-by: Eric Dumazet <[email protected]> Cc: Roopa Prabhu <[email protected]> Cc: David Ahern <[email protected]> Reviewed-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-08-31Merge branch 'octeon-npc-fixes'David S. Miller1-10/+12
Subbaraya Sundeep says: ==================== octeontx2-af: Miscellaneous fixes in npc This patchset consists of consolidated fixes in rvu_npc file. Two of the patches prevent infinite loop which can happen in corner cases. One patch is for fixing static code analyzer reported issues. And the last patch is a minor change in npc protocol checker hardware block to report ipv4 checksum errors as a known error codes. ==================== Signed-off-by: David S. Miller <[email protected]>
2021-08-31octeontx2-af: Set proper errorcode for IPv4 checksum errorsSunil Goutham1-3/+4
With current config, for packets with IPv4 checksum errors, errorcode is being set to UNKNOWN. Hence added a separate errorcodes for outer and inner IPv4 checksum and changed NPC configuration accordingly. Also turn on L2 multicast address check in NPC protocol check block. Fixes: 6b3321bacc5a ("octeontx2-af: Enable packet length and csum validation") Signed-off-by: Sunil Goutham <[email protected]> Signed-off-by: Subbaraya Sundeep <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-08-31octeontx2-af: Fix static code analyzer reported issuesSubbaraya Sundeep1-3/+3
This patch fixes the static code analyzer reported issues in rvu_npc.c. The reported errors are different sizes of operands in bitops and returning uninitialized values. Fixes: 651cd2652339 ("octeontx2-af: MCAM entry installation support") Signed-off-by: Subbaraya Sundeep <[email protected]> Signed-off-by: Sunil Goutham <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-08-31octeontx2-af: Fix mailbox errors in nix_rss_flowkey_cfgSubbaraya Sundeep1-3/+3
In npc_update_vf_flow_entry function the loop cursor 'index' is being changed inside the loop causing the loop to spin forever. This in turn hogs the kworker thread forever and no other mbox message is processed by AF driver after that. Fix this by using another variable in the loop. Fixes: 55307fcb9258 ("octeontx2-af: Add mbox messages to install and delete MCAM rules") Signed-off-by: Subbaraya Sundeep <[email protected]> Signed-off-by: Sunil Goutham <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-08-31octeontx2-af: Fix loop in free and unmap counterSubbaraya Sundeep1-1/+2
When the given counter does not belong to the entry then code ends up in infinite loop because the loop cursor, entry is not getting updated further. This patch fixes that by updating entry for every iteration. Fixes: a958dd59f9ce ("octeontx2-af: Map or unmap NPC MCAM entry and counter") Signed-off-by: Subbaraya Sundeep <[email protected]> Signed-off-by: Sunil Goutham <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-08-31af_unix: fix potential NULL deref in unix_dgram_connect()Eric Dumazet1-3/+6
syzbot was able to trigger NULL deref in unix_dgram_connect() [1] This happens in if (unix_peer(sk)) sk->sk_state = other->sk_state = TCP_ESTABLISHED; // crash because @other is NULL Because locks have been dropped, unix_peer() might be non NULL, while @other is NULL (AF_UNSPEC case) We need to move code around, so that we no longer access unix_peer() and sk_state while locks have been released. [1] general protection fault, probably for non-canonical address 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017] CPU: 0 PID: 10341 Comm: syz-executor239 Not tainted 5.14.0-rc7-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:unix_dgram_connect+0x32a/0xc60 net/unix/af_unix.c:1226 Code: 00 00 45 31 ed 49 83 bc 24 f8 05 00 00 00 74 69 e8 eb 5b a6 f9 48 8d 7d 12 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 48 89 fa 83 e2 07 38 d0 7f 08 84 c0 0f 85 e0 07 00 00 RSP: 0018:ffffc9000a89fcd8 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: 0000000000000004 RCX: 0000000000000000 RDX: 0000000000000002 RSI: ffffffff87cf4ef5 RDI: 0000000000000012 RBP: 0000000000000000 R08: 0000000000000000 R09: ffff88802e1917c3 R10: ffffffff87cf4eba R11: 0000000000000001 R12: ffff88802e191740 R13: 0000000000000000 R14: ffff88802e191d38 R15: ffff88802e1917c0 FS: 00007f3eb0052700(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000004787d0 CR3: 0000000029c0a000 CR4: 00000000001506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __sys_connect_file+0x155/0x1a0 net/socket.c:1890 __sys_connect+0x161/0x190 net/socket.c:1907 __do_sys_connect net/socket.c:1917 [inline] __se_sys_connect net/socket.c:1914 [inline] __x64_sys_connect+0x6f/0xb0 net/socket.c:1914 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x446a89 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 a1 15 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f3eb0052208 EFLAGS: 00000246 ORIG_RAX: 000000000000002a RAX: ffffffffffffffda RBX: 00000000004cc4d8 RCX: 0000000000446a89 RDX: 000000000000006e RSI: 0000000020000180 RDI: 0000000000000003 RBP: 00000000004cc4d0 R08: 00007f3eb0052700 R09: 0000000000000000 R10: 00007f3eb0052700 R11: 0000000000000246 R12: 00000000004cc4dc R13: 00007ffd791e79cf R14: 00007f3eb0052300 R15: 0000000000022000 Modules linked in: ---[ end trace 4eb809357514968c ]--- RIP: 0010:unix_dgram_connect+0x32a/0xc60 net/unix/af_unix.c:1226 Code: 00 00 45 31 ed 49 83 bc 24 f8 05 00 00 00 74 69 e8 eb 5b a6 f9 48 8d 7d 12 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 48 89 fa 83 e2 07 38 d0 7f 08 84 c0 0f 85 e0 07 00 00 RSP: 0018:ffffc9000a89fcd8 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: 0000000000000004 RCX: 0000000000000000 RDX: 0000000000000002 RSI: ffffffff87cf4ef5 RDI: 0000000000000012 RBP: 0000000000000000 R08: 0000000000000000 R09: ffff88802e1917c3 R10: ffffffff87cf4eba R11: 0000000000000001 R12: ffff88802e191740 R13: 0000000000000000 R14: ffff88802e191d38 R15: ffff88802e1917c0 FS: 00007f3eb0052700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffd791fe960 CR3: 0000000029c0a000 CR4: 00000000001506e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Fixes: 83301b5367a9 ("af_unix: Set TCP_ESTABLISHED for datagram sockets too") Signed-off-by: Eric Dumazet <[email protected]> Cc: Cong Wang <[email protected]> Cc: Alexei Starovoitov <[email protected]> Reported-by: syzbot <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-08-31dpaa2-eth: Replace strlcpy with strscpyJason Wang1-4/+4
The strlcpy should not be used because it doesn't limit the source length. As linus says, it's a completely useless function if you can't implicitly trust the source string - but that is almost always why people think they should use it! All in all the BSD function will lead some potential bugs. But the strscpy doesn't require reading memory from the src string beyond the specified "count" bytes, and since the return value is easier to error-check than strlcpy()'s. In addition, the implementation is robust to the string changing out from underneath it, unlike the current strlcpy() implementation. Thus, We prefer using strscpy instead of strlcpy. Signed-off-by: Jason Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-08-31octeontx2-af: Use NDC TX for transmit packet dataGeetha sowjanya2-0/+4
For better performance set hardware to use NDC TX for reading packet data specified NIX_SEND_SG_S. Signed-off-by: Geetha sowjanya <[email protected]> Signed-off-by: Sunil Goutham <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-08-31net: bridge: use mld2r_ngrec instead of icmpv6_dataunMichelleJin1-5/+5
br_ip6_multicast_mld2_report function uses icmp6h to parse mld2_report packet. mld2r_ngrec defines mld2r_hdr.icmp6_dataun.un_data16[1] in include/net/mld.h. So, it is more compact to use mld2r rather than icmp6h. By doing printk test, it is confirmed that icmp6h->icmp6_dataun.un_data16[1] and mld2r->mld2r_ngrec are indeed equivalent. Also, sizeof(*mld2r) and sizeof(*icmp6h) are equivalent, too. Signed-off-by: MichelleJin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-08-31net: qualcomm: fix QCA7000 checksum handlingStefan Wahren2-2/+2
Based on tests the QCA7000 doesn't support checksum offloading. So assume ip_summed is CHECKSUM_NONE and let the kernel take care of the checksum handling. This fixes data transfer issues in noisy environments. Reported-by: Michael Heimpold <[email protected]> Fixes: 291ab06ecf67 ("net: qualcomm: new Ethernet over SPI driver for QCA7000") Signed-off-by: Stefan Wahren <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-08-31Merge remote-tracking branch 'tip/sched/arm64' into for-next/coreCatalin Marinas787-4082/+8629
* tip/sched/arm64: (785 commits) Documentation: arm64: describe asymmetric 32-bit support arm64: Remove logic to kill 32-bit tasks on 64-bit-only cores arm64: Hook up cmdline parameter to allow mismatched 32-bit EL0 arm64: Advertise CPUs capable of running 32-bit applications in sysfs arm64: Prevent offlining first CPU with 32-bit EL0 on mismatched system arm64: exec: Adjust affinity for compat tasks with mismatched 32-bit EL0 arm64: Implement task_cpu_possible_mask() sched: Introduce dl_task_check_affinity() to check proposed affinity sched: Allow task CPU affinity to be restricted on asymmetric systems sched: Split the guts of sched_setaffinity() into a helper function sched: Introduce task_struct::user_cpus_ptr to track requested affinity sched: Reject CPU affinity changes based on task_cpu_possible_mask() cpuset: Cleanup cpuset_cpus_allowed_fallback() use in select_fallback_rq() cpuset: Honour task_cpu_possible_mask() in guarantee_online_cpus() cpuset: Don't use the cpu_possible_mask as a last resort for cgroup v1 sched: Introduce task_cpu_possible_mask() to limit fallback rq selection sched: Cgroup SCHED_IDLE support sched/topology: Skip updating masks for non-online nodes Linux 5.14-rc6 lib: use PFN_PHYS() in devmem_is_allowed() ...
2021-08-30ext4: make the updating inode data procedure atomicZhang Yi1-16/+28
Now that ext4_do_update_inode() return error before filling the whole inode data if we fail to set inode blocks in ext4_inode_blocks_set(). This error should never happen in theory since sb->s_maxbytes should not have allowed this, we have already init sb->s_maxbytes according to this feature in ext4_fill_super(). So even through that could only happen due to the filesystem corruption, we'd better to return after we finish updating the inode because it may left an uninitialized buffer and we could read this buffer later in "errors=continue" mode. This patch make the updating inode data procedure atomic, call EXT4_ERROR_INODE() after we dropping i_raw_lock after something bad happened, make sure that the inode is integrated, and also drop a BUG_ON and do some small cleanups. Signed-off-by: Zhang Yi <[email protected]> Reviewed-by: Jan Kara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-08-30ext4: remove an unnecessary if statement in __ext4_get_inode_loc()Zhang Yi1-84/+78
The "if (!buffer_uptodate(bh))" hunk covered almost the whole code after getting buffer in __ext4_get_inode_loc() which seems unnecessary, remove it and switch to check ext4_buffer_uptodate(), it simplify code and make it more readable. Signed-off-by: Zhang Yi <[email protected]> Reviewed-by: Jan Kara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-08-30ext4: move inode eio simulation behind io completeionZhang Yi1-3/+1
No EIO simulation is required if the buffer is uptodate, so move the simulation behind read bio completeion just like inode/block bitmap simulation does. Signed-off-by: Zhang Yi <[email protected]> Reviewed-by: Jan Kara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-08-30ext4: Improve scalability of ext4 orphan file handlingJan Kara2-27/+53
Even though the length of the critical section when adding / removing orphaned inodes was significantly reduced by using orphan file, the contention of lock protecting orphan file still appears high in profiles for truncate / unlink intensive workloads with high number of threads. This patch makes handling of orphan file completely lockless. Also to reduce conflicts between CPUs different CPUs start searching for empty slot in orphan file in different blocks. Performance comparison of locked orphan file handling, lockless orphan file handling, and completely disabled orphan inode handling from 80 CPU Xeon Server with 526 GB of RAM, filesystem located on SAS SSD disk, average of 5 runs: stress-orphan (microbenchmark truncating files byte-by-byte from N processes in parallel) Threads Time Time Time Orphan locked Orphan lockless No orphan 1 0.945600 0.939400 0.891200 2 1.331800 1.246600 1.174400 4 1.995000 1.780600 1.713200 8 6.424200 4.900000 4.106000 16 14.937600 8.516400 8.138000 32 33.038200 24.565600 24.002200 64 60.823600 39.844600 38.440200 128 122.941400 70.950400 69.315000 So we can see that with lockless orphan file handling, addition / deletion of orphaned inodes got almost completely out of picture even for a microbenchmark stressing it. For reaim creat_clo workload on ramdisk there are also noticeable gains (average of 5 runs): Clients Vanilla (ops/s) Patched (ops/s) creat_clo-1 14705.88 ( 0.00%) 14354.07 * -2.39%* creat_clo-3 27108.43 ( 0.00%) 28301.89 ( 4.40%) creat_clo-5 37406.48 ( 0.00%) 45180.73 * 20.78%* creat_clo-7 41338.58 ( 0.00%) 54687.50 * 32.29%* creat_clo-9 45226.13 ( 0.00%) 62937.07 * 39.16%* creat_clo-11 44000.00 ( 0.00%) 65088.76 * 47.93%* creat_clo-13 36516.85 ( 0.00%) 68661.97 * 88.03%* creat_clo-15 30864.20 ( 0.00%) 69551.78 * 125.35%* creat_clo-17 27478.45 ( 0.00%) 67729.08 * 146.48%* creat_clo-19 25000.00 ( 0.00%) 61621.62 * 146.49%* creat_clo-21 18772.35 ( 0.00%) 63829.79 * 240.02%* creat_clo-23 16698.94 ( 0.00%) 61938.96 * 270.92%* creat_clo-25 14973.05 ( 0.00%) 56947.61 * 280.33%* creat_clo-27 16436.69 ( 0.00%) 65008.03 * 295.51%* creat_clo-29 13949.01 ( 0.00%) 69047.62 * 395.00%* creat_clo-31 14283.52 ( 0.00%) 67982.45 * 375.95%* Reviewed-by: Theodore Ts'o <[email protected]> Reviewed-by: Lukas Czerner <[email protected]> Signed-off-by: Jan Kara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-08-30ext4: Orphan file documentationJan Kara5-6/+89
Add documentation about the orphan file feature. Reviewed-by: Theodore Ts'o <[email protected]> Signed-off-by: Jan Kara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-08-30ext4: Speedup ext4 orphan inode handlingJan Kara4-52/+394
Ext4 orphan inode handling is a bottleneck for workloads which heavily truncate / unlink small files since it contends on the global s_orphan_mutex lock (and generally it's difficult to improve scalability of the ondisk linked list of orphaned inodes). This patch implements new way of handling orphan inodes. Instead of linking orphaned inode into a linked list, we store it's inode number in a new special file which we call "orphan file". Only if there's no more space in the orphan file (too many inodes are currently orphaned) we fall back to using old style linked list. Currently we protect operations in the orphan file with a spinlock for simplicity but even in this setting we can substantially reduce the length of the critical section and thus speedup some workloads. In the next patch we improve this by making orphan handling lockless. Note that the change is backwards compatible when the filesystem is clean - the existence of the orphan file is a compat feature, we set another ro-compat feature indicating orphan file needs scanning for orphaned inodes when mounting filesystem read-write. This ro-compat feature gets cleared on unmount / remount read-only. Some performance data from 80 CPU Xeon Server with 512 GB of RAM, filesystem located on SSD, average of 5 runs: stress-orphan (microbenchmark truncating files byte-by-byte from N processes in parallel) Threads Time Time Vanilla Patched 1 1.057200 0.945600 2 1.680400 1.331800 4 2.547000 1.995000 8 7.049400 6.424200 16 14.827800 14.937600 32 40.948200 33.038200 64 87.787400 60.823600 128 206.504000 122.941400 So we can see significant wins all over the board. Reviewed-by: Theodore Ts'o <[email protected]> Signed-off-by: Jan Kara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-08-30ext4: Move orphan inode handling into a separate fileJan Kara5-363/+375
Move functions for handling orphan inodes into a new file fs/ext4/orphan.c to have them in one place and somewhat reduce size of other files. No code changes. Reviewed-by: Andreas Dilger <[email protected]> Reviewed-by: Theodore Ts'o <[email protected]> Signed-off-by: Jan Kara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-08-30ext4: Support for checksumming from journal triggersJan Kara16-128/+259
JBD2 layer support triggers which are called when journaling layer moves buffer to a certain state. We can use the frozen trigger, which gets called when buffer data is frozen and about to be written out to the journal, to compute block checksums for some buffer types (similarly as does ocfs2). This avoids unnecessary repeated recomputation of the checksum (at the cost of larger window where memory corruption won't be caught by checksumming) and is even necessary when there are unsynchronized updaters of the checksummed data. So add superblock and journal trigger type arguments to ext4_journal_get_write_access() and ext4_journal_get_create_access() so that frozen triggers can be set accordingly. Also add inode argument to ext4_walk_page_buffers() and all the callbacks used with that function for the same purpose. This patch is mostly only a change of prototype of the above mentioned functions and a few small helpers. Real checksumming will come later. Reviewed-by: Theodore Ts'o <[email protected]> Signed-off-by: Jan Kara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-08-30ext4: fix race writing to an inline_data file while its xattrs are changingTheodore Ts'o1-0/+6
The location of the system.data extended attribute can change whenever xattr_sem is not taken. So we need to recalculate the i_inline_off field since it mgiht have changed between ext4_write_begin() and ext4_write_end(). This means that caching i_inline_off is probably not helpful, so in the long run we should probably get rid of it and shrink the in-memory ext4 inode slightly, but let's fix the race the simple way for now. Cc: [email protected] Fixes: f19d5870cbf72 ("ext4: add normal write support for inline data") Reported-by: [email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-08-30jbd2: add sparse annotations for add_transaction_credits()Theodore Ts'o1-1/+18
Signed-off-by: Theodore Ts'o <[email protected]>
2021-08-30ext4: fix sparse warningsTheodore Ts'o2-6/+25
Add sparse annotations to suppress false positive context imbalance warnings, and use NULL instead of 0 in EXT_MAX_{EXTENT,INDEX}. Signed-off-by: Theodore Ts'o <[email protected]>
2021-08-30ext4: Make sure quota files are not grabbed accidentallyJan Kara1-2/+6
If ext4 filesystem is corrupted so that quota files are linked from directory hirerarchy, bad things can happen. E.g. quota files can get corrupted or deleted. Make sure we are not grabbing quota file inodes when we expect normal inodes. Signed-off-by: Jan Kara <[email protected]> Signed-off-by: Theodore Ts'o <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-08-30ext4: fix e2fsprogs checksum failure for mounted filesystemJan Kara1-0/+8
Commit 81414b4dd48 ("ext4: remove redundant sb checksum recomputation") removed checksum recalculation after updating superblock free space / inode counters in ext4_fill_super() based on the fact that we will recalculate the checksum on superblock writeout. That is correct assumption but until the writeout happens (which can take a long time) the checksum is incorrect in the buffer cache and if programs such as tune2fs or resize2fs is called shortly after a file system is mounted can fail. So return back the checksum recalculation and add a comment explaining why. Fixes: 81414b4dd48f ("ext4: remove redundant sb checksum recomputation") Cc: [email protected] Reported-by: Boyang Xue <[email protected]> Signed-off-by: Jan Kara <[email protected]> Signed-off-by: Theodore Ts'o <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-08-30ext4: if zeroout fails fall back to splitting the extent nodeTheodore Ts'o1-2/+3
If the underlying storage device is using thin-provisioning, it's possible for a zeroout operation to return ENOSPC. Commit df22291ff0fd ("ext4: Retry block allocation if we have free blocks left") added logic to retry block allocation since we might get free block after we commit a transaction. But the ENOSPC from thin-provisioning will confuse ext4, and lead to an infinite loop. Since using zeroout instead of splitting the extent node is an optimization, if it fails, we might as well fall back to splitting the extent node. Reported-by: yangerkun <[email protected]> Signed-off-by: Theodore Ts'o <[email protected]>
2021-08-30ext4: reduce arguments of ext4_fc_add_dentry_tlvGuoqing Jiang1-18/+9
Let's pass fc_dentry directly since those arguments (tag, parent_ino and ino etc) can be deferenced from it. Signed-off-by: Guoqing Jiang <[email protected]> Reviewed-by: Harshad Shirwadkar <[email protected]> Signed-off-by: Theodore Ts'o <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-08-30ext4: flush background discard kwork when retry allocationWang Jianchao3-3/+13
The background discard kwork tries to mark blocks used and issue discard. This can make filesystem suffer from NOSPC error, xfstest generic/371 can fail due to it. Fix it by flushing discard kwork in ext4_should_retry_alloc. At the same time, give up discard at the moment. Signed-off-by: Wang Jianchao <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2021-08-30ext4: get discard out of jbd2 commit kthread contexWang Jianchao2-25/+78
Right now, discard is issued and waited to be completed in jbd2 commit kthread context after the logs are committed. When large amount of files are deleted and discard is flooding, jbd2 commit kthread can be blocked for long time. Then all of the metadata operations can be blocked to wait the log space. One case is the page fault path with read mm->mmap_sem held, which wants to update the file time but has to wait for the log space. When other threads in the task wants to do mmap, then write mmap_sem is blocked. Finally all of the following read mmap_sem requirements are blocked, even the ps command which need to read the /proc/pid/ -cmdline. Our monitor service which needs to read /proc/pid/cmdline used to be blocked for 5 mins. This patch frees the blocks back to buddy after commit and then do discard in a async kworker context in fstrim fashion, namely, - mark blocks to be discarded as used if they have not been allocated - do discard - mark them free After this, jbd2 commit kthread won't be blocked any more by discard and we won't get NOSPC even if the discard is slow or throttled. Link: https://marc.info/?l=linux-kernel&m=162143690731901&w=2 Suggested-by: Theodore Ts'o <[email protected]> Reviewed-by: Jan Kara <[email protected]> Signed-off-by: Wang Jianchao <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>