aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2022-03-24tools/vm/page_owner_sort.c: fix the instructions for useYixuan Cao1-1/+1
I noticed a discrepancy between the usage method and the code logic. If we enable the -f option, it should be "Filter out the information of blocks whose memory has been released". Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Yixuan Cao <[email protected]> Cc: Stephen Rothwell <[email protected]> Cc: Sean Anderson <[email protected]> Cc: Muchun Song <[email protected]> Cc: Zhenliang Wei <[email protected]> Cc: Tang Bin <[email protected]> Cc: Yinan Zhang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-03-24mm/page_owner.c: record tgidYixuan Cao1-6/+9
In a single-threaded process, the pid in kernel task_struct is the same as the tgid, which can mark the process of page allocation. But in a multithreaded process, only the task_struct of the thread leader has the same pid as tgid, and the pids of other threads are different from tgid. Therefore, tgid is recorded to provide effective information for debugging and data statistics of multithreaded programs. This can also be achieved by observing the task name (executable file name) for a specific process. However, when the same program is started multiple times, the task name is the same and the tgid is different. Therefore, in the debugging of multi-threaded programs, combined with the task name and tgid, more accurate runtime information of a certain run of the program can be obtained. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Yixuan Cao <[email protected]> Cc: Waiman Long <[email protected]> Cc: Rafael Aquini <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-03-24mm/page_owner: record task command nameWaiman Long1-4/+10
The page_owner information currently includes the pid of the calling task. That is useful as long as the task is still running. Otherwise, the number is meaningless. To have more information about the allocating tasks that had exited by the time the page_owner information is retrieved, we need to store the command name of the task. Add a new comm field into page_owner structure to store the command name and display it when the page_owner information is retrieved. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Waiman Long <[email protected]> Acked-by: Rafael Aquini <[email protected]> Cc: Andy Shevchenko <[email protected]> Cc: David Rientjes <[email protected]> Cc: Ira Weiny <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Petr Mladek <[email protected]> Cc: Rasmus Villemoes <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: Sergey Senozhatsky <[email protected]> Cc: Steven Rostedt (Google) <[email protected]> Cc: Vladimir Davydov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-03-24mm/page_owner: print memcg informationWaiman Long1-0/+42
It was found that a number of offline memcgs were not freed because they were pinned by some charged pages that were present. Even "echo 1 > /proc/sys/vm/drop_caches" wasn't able to free those pages. These offline but not freed memcgs tend to increase in number over time with the side effect that percpu memory consumption as shown in /proc/meminfo also increases over time. In order to find out more information about those pages that pin offline memcgs, the page_owner feature is extended to print memory cgroup information especially whether the cgroup is offline or not. RCU read lock is taken when memcg is being accessed to make sure that it won't be freed. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Waiman Long <[email protected]> Acked-by: David Rientjes <[email protected]> Acked-by: Roman Gushchin <[email protected]> Acked-by: Rafael Aquini <[email protected]> Acked-by: Mike Rapoport <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: Andy Shevchenko <[email protected]> Cc: Ira Weiny <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Petr Mladek <[email protected]> Cc: Rasmus Villemoes <[email protected]> Cc: Sergey Senozhatsky <[email protected]> Cc: Steven Rostedt (Google) <[email protected]> Cc: Vladimir Davydov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-03-24mm/page_owner: use scnprintf() to avoid excessive buffer overrun checkWaiman Long1-11/+3
The snprintf() function can return a length greater than the given input size. That will require a check for buffer overrun after each invocation of snprintf(). scnprintf(), on the other hand, will never return a greater length. By using scnprintf() in selected places, we can avoid some buffer overrun checks except after stack_depot_snprint() and after the last snprintf(). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Waiman Long <[email protected]> Acked-by: David Rientjes <[email protected]> Reviewed-by: Sergey Senozhatsky <[email protected]> Acked-by: Rafael Aquini <[email protected]> Acked-by: Mike Rapoport <[email protected]> Cc: Andy Shevchenko <[email protected]> Cc: Ira Weiny <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Petr Mladek <[email protected]> Cc: Rasmus Villemoes <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: Steven Rostedt (Google) <[email protected]> Cc: Vladimir Davydov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-03-24lib/vsprintf: avoid redundant work with 0 sizeWaiman Long1-3/+5
Patch series "mm/page_owner: Extend page_owner to show memcg information", v4. While debugging the constant increase in percpu memory consumption on a system that spawned large number of containers, it was found that a lot of offline mem_cgroup structures remained in place without being freed. Further investigation indicated that those mem_cgroup structures were pinned by some pages. In order to find out what those pages are, the existing page_owner debugging tool is extended to show memory cgroup information and whether those memcgs are offline or not. With the enhanced page_owner tool, the following is a typical page that pinned the mem_cgroup structure in my test case: Page allocated via order 0, mask 0x1100cca(GFP_HIGHUSER_MOVABLE), pid 162970 (podman), ts 1097761405537 ns, free_ts 1097760838089 ns PFN 1925700 type Movable Block 3761 type Movable Flags 0x17ffffc00c001c(uptodate|dirty|lru|reclaim|swapbacked|node=0|zone=2|lastcpupid=0x1fffff) prep_new_page+0xac/0xe0 get_page_from_freelist+0x1327/0x14d0 __alloc_pages+0x191/0x340 alloc_pages_vma+0x84/0x250 shmem_alloc_page+0x3f/0x90 shmem_alloc_and_acct_page+0x76/0x1c0 shmem_getpage_gfp+0x281/0x940 shmem_write_begin+0x36/0xe0 generic_perform_write+0xed/0x1d0 __generic_file_write_iter+0xdc/0x1b0 generic_file_write_iter+0x5d/0xb0 new_sync_write+0x11f/0x1b0 vfs_write+0x1ba/0x2a0 ksys_write+0x59/0xd0 do_syscall_64+0x37/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae Charged to offline memcg libpod-conmon-15e4f9c758422306b73b2dd99f9d50a5ea53cbb16b4a13a2c2308a4253cc0ec8. So the page was not freed because it was part of a shmem segment. That is useful information that can help users to diagnose similar problems. With cgroup v1, /proc/cgroups can be read to find out the total number of memory cgroups (online + offline). With cgroup v2, the cgroup.stat of the root cgroup can be read to find the number of dying cgroups (most likely pinned by dying memcgs). The page_owner feature is not supposed to be enabled for production system due to its memory overhead. However, if it is suspected that dying memcgs are increasing over time, a test environment with page_owner enabled can then be set up with appropriate workload for further analysis on what may be causing the increasing number of dying memcgs. This patch (of 4): For *scnprintf(), vsnprintf() is always called even if the input size is 0. That is a waste of time, so just return 0 in this case. Note that vsnprintf() will never return -1 to indicate an error. So skipping the call to vsnprintf() when size is 0 will have no functional impact at all. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Waiman Long <[email protected]> Acked-by: David Rientjes <[email protected]> Reviewed-by: Sergey Senozhatsky <[email protected]> Acked-by: Roman Gushchin <[email protected]> Acked-by: Rafael Aquini <[email protected]> Acked-by: Mike Rapoport <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Vladimir Davydov <[email protected]> Cc: Petr Mladek <[email protected]> Cc: Steven Rostedt (Google) <[email protected]> Cc: Andy Shevchenko <[email protected]> Cc: Rasmus Villemoes <[email protected]> Cc: Ira Weiny <[email protected]> Cc: David Rientjes <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-03-24Documentation/vm/page_owner.rst: fix unexpected indentation warnsShuah Khan1-3/+3
Fix Unexpected indentation warns in page_owner: Documentation/vm/page_owner.rst:92: WARNING: Unexpected indentation. Documentation/vm/page_owner.rst:96: WARNING: Unexpected indentation. Documentation/vm/page_owner.rst:107: WARNING: Unexpected indentation. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Shuah Khan <[email protected]> Cc: Jonathan Corbet <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-03-24Documentation/vm/page_owner.rst: update the documentationShenghong Han1-2/+21
Update the documentation of ``page_owner``. [[email protected]: small grammatical tweaks] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Shenghong Han <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Georgi Djakov <[email protected]> Cc: Liam Mark <[email protected]> Cc: Tang Bin <[email protected]> Cc: Zhang Shengju <[email protected]> Cc: Zhenliang Wei <[email protected]> Cc: Xiaoming Ni <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-03-24tools/vm/page_owner_sort.c: delete invalid duplicate codeYixuan Cao1-2/+0
I noticed that there is two invalid lines of duplicate code. It's better to delete it. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Yixuan Cao <[email protected]> Cc: Mark Brown <[email protected]> Cc: Sean Anderson <[email protected]> Cc: Zhenliang Wei <[email protected]> Cc: Tang Bin <[email protected]> Cc: Yinan Zhang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-03-24tools/vm/page_owner_sort.c: two trivial fixesShenghong Han1-3/+2
1) There is an unused variable. It's better to delete it. 2) One case is missing in the usage(). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Shenghong Han <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-03-24tools/vm/page_owner_sort.c: support sorting pid and timeChongxi Zhao1-29/+148
When viewing the page owner information, we expect that the information can be sorted by PID, so that we can quickly combine PID with the program to check the information together. We also expect that the information can be sorted by time. Time sorting helps to view the running status of the program according to the time interval when the program hangs up. Finally, we hope to pass the page_ owner_ Sort. C can reduce part of the output and only output the plate information whose memory has not been released, which can make us locate the problem of the program faster. Therefore, the following adjustments have been made: 1. Add the static functions search_pattern and check_regcomp to improve the cleanliness. 2. Add member attributes and their corresponding sorting methods. In terms of comparison time, int will overflow because the data of ull is too large, so the ternary operator is used 3. Add the -f parameter to filter out the information of blocks whose memory has not been released Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Chongxi Zhao <[email protected]> Reviewed-by: Sean Anderson <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-03-24tools/vm/page_owner_sort.c: add switch between culling by stacktrace and txtYinan Zhang1-3/+20
Culling by comparing stacktrace would casue loss of some information. For example, if there exists 2 blocks which have the same stacktrace and the different head info Page allocated via order 0, mask 0x108c48(...), pid 73696, ts 1578829190639010 ns, free_ts 1576583851324450 ns prep_new_page+0x80/0xb8 get_page_from_freelist+0x924/0xee8 __alloc_pages+0x138/0xc18 alloc_pages+0x80/0xf0 __page_cache_alloc+0x90/0xc8 Page allocated via order 0, mask 0x108c48(...), pid 61806, ts 1354113726046100 ns, free_ts 1354104926841400 ns prep_new_page+0x80/0xb8 get_page_from_freelist+0x924/0xee8 __alloc_pages+0x138/0xc18 alloc_pages+0x80/0xf0 __page_cache_alloc+0x90/0xc8 After culling, it would be like this 2 times, 2 pages: Page allocated via order 0, mask 0x108c48(...), pid 73696, ts 1578829190639010 ns, free_ts 1576583851324450 ns prep_new_page+0x80/0xb8 get_page_from_freelist+0x924/0xee8 __alloc_pages+0x138/0xc18 alloc_pages+0x80/0xf0 __page_cache_alloc+0x90/0xc8 The info of second block missed. So, add -c to turn on culling by stacktrace. By default, it will cull by txt. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Yinan Zhang <[email protected]> Cc: Changhee Han <[email protected]> Cc: Sean Anderson <[email protected]> Cc: Stephen Rothwell <[email protected]> Cc: Tang Bin <[email protected]> Cc: Zhang Shengju <[email protected]> Cc: Zhenliang Wei <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-03-24tools/vm/page_owner_sort.c: support sorting by stack traceSean Anderson1-9/+14
This adds the ability to sort by stacktraces. This is helpful when comparing multiple dumps of page_owner taken at different times, since blocks will not be reordered if they were allocated/free'd. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sean Anderson <[email protected]> Cc: Zhenliang Wei <[email protected]> Cc: Changhee Han <[email protected]> Cc: Tang Bin <[email protected]> Cc: Zhang Shengju <[email protected]> Cc: Stephen Rothwell <[email protected]> Cc: Yinan Zhang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-03-24tools/vm/page_owner_sort.c: sort by stacktrace before cullingSean Anderson1-4/+6
The contents of page_owner have changed to include more information than the stack trace. On a modern kernel, the blocks look like Page allocated via order 0, mask 0x0(), pid 1, ts 165564237 ns, free_ts 0 ns register_early_stack+0x4b/0x90 init_page_owner+0x39/0x250 kernel_init_freeable+0x11e/0x242 kernel_init+0x16/0x130 Sorting by the contents of .txt will result in almost no repeated pages, as the pid, ts, and free_ts will almost never be the same. Instead, sort by the contents of the stack trace, which we assume to be whatever is after the first line. [[email protected]: fix NULL-pointer dereference when comparing stack traces] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sean Anderson <[email protected]> Cc: Changhee Han <[email protected]> Cc: Tang Bin <[email protected]> Cc: Zhang Shengju <[email protected]> Cc: Zhenliang Wei <[email protected]> Cc: Stephen Rothwell <[email protected]> Cc: Yinan Zhang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-03-24Merge branch ↵Jakub Kicinski1-4/+7
'vsock-virtio-enable-vqs-early-on-probe-and-finish-the-setup-before-using-them' Stefano Garzarella says: ==================== vsock/virtio: enable VQs early on probe and finish the setup before using them The first patch fixes a virtio-spec violation. The other two patches complete the driver configuration before using the VQs in the probe. The patch order should simplify backporting in stable branches. v2: https://lore.kernel.org/netdev/[email protected]/ v1: https://lore.kernel.org/netdev/[email protected]/ ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-03-24vsock/virtio: enable VQs early on probeStefano Garzarella1-0/+2
virtio spec requires drivers to set DRIVER_OK before using VQs. This is set automatically after probe returns, but virtio-vsock driver uses VQs in the probe function to fill rx and event VQs with new buffers. Let's fix this, calling virtio_device_ready() before using VQs in the probe function. Fixes: 0ea9e1d3a9e3 ("VSOCK: Introduce virtio_transport.ko") Signed-off-by: Stefano Garzarella <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]> Reviewed-by: Stefan Hajnoczi <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-03-24vsock/virtio: read the negotiated features before using VQsStefano Garzarella1-3/+3
Complete the driver configuration, reading the negotiated features, before using the VQs in the virtio_vsock_probe(). Fixes: 53efbba12cc7 ("virtio/vsock: enable SEQPACKET for transport") Suggested-by: Michael S. Tsirkin <[email protected]> Reviewed-by: Stefan Hajnoczi <[email protected]> Signed-off-by: Stefano Garzarella <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-03-24vsock/virtio: initialize vdev->priv before using VQsStefano Garzarella1-1/+2
When we fill VQs with empty buffers and kick the host, it may send an interrupt. `vdev->priv` must be initialized before this since it is used in the virtqueue callbacks. Fixes: 0deab087b16a ("vsock/virtio: use RCU to avoid use-after-free on the_virtio_vsock") Suggested-by: Michael S. Tsirkin <[email protected]> Signed-off-by: Stefano Garzarella <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]> Reviewed-by: Stefan Hajnoczi <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-03-24Merge tag 'ceph-for-5.18-rc1' of https://github.com/ceph/ceph-clientLinus Torvalds20-376/+577
Pull ceph updates from Ilya Dryomov: "The highlights are: - several changes to how snap context and snap realms are tracked (Xiubo Li). In particular, this should resolve a long-standing issue of high kworker CPU usage and various stalls caused by needless iteration over all inodes in the snap realm. - async create fixes to address hangs in some edge cases (Jeff Layton) - support for getvxattr MDS op for querying server-side xattrs, such as file/directory layouts and ephemeral pins (Milind Changire) - average latency is now maintained for all metrics (Venky Shankar) - some tweaks around handling inline data to make it fit better with netfs helper library (David Howells) Also a couple of memory leaks got plugged along with a few assorted fixups. Last but not least, Xiubo has stepped up to serve as a CephFS co-maintainer" * tag 'ceph-for-5.18-rc1' of https://github.com/ceph/ceph-client: (27 commits) ceph: fix memory leak in ceph_readdir when note_last_dentry returns error ceph: uninitialized variable in debug output ceph: use tracked average r/w/m latencies to display metrics in debugfs ceph: include average/stdev r/w/m latency in mds metrics ceph: track average r/w/m latency ceph: use ktime_to_timespec64() rather than jiffies_to_timespec64() ceph: assign the ci only when the inode isn't NULL ceph: fix inode reference leakage in ceph_get_snapdir() ceph: misc fix for code style and logs ceph: allocate capsnap memory outside of ceph_queue_cap_snap() ceph: do not release the global snaprealm until unmounting ceph: remove incorrect and unused CEPH_INO_DOTDOT macro MAINTAINERS: add Xiubo Li as cephfs co-maintainer ceph: eliminate the recursion when rebuilding the snap context ceph: do not update snapshot context when there is no new snapshot ceph: zero the dir_entries memory when allocating it ceph: move to a dedicated slabcache for ceph_cap_snap ceph: add getvxattr op libceph: drop else branches in prepare_read_data{,_cont} ceph: fix comments mentioning i_mutex ...
2022-03-24net: usb: ax88179_178a: add Allied Telesis AT-UMCsGreg Jesionowski1-0/+51
Adds the driver_info and IDs for the AX88179 based Allied Telesis AT-UMC family of devices. Signed-off-by: Greg Jesionowski <[email protected]> Link: https://lore.kernel.org/r/20220323220024.GA36800@test-HP-EliteDesk-800-G1-SFF Signed-off-by: Jakub Kicinski <[email protected]>
2022-03-24Merge tag 'xfs-5.18-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds27-211/+344
Pull xfs updates from Darrick Wong: "The biggest change this cycle is bringing XFS' inode attribute setting code back towards alignment with what the VFS does. IOWs, setgid bit handling should be a closer match with ext4 and btrfs behavior. The rest of the branch is bug fixes around the filesystem -- patching gaps in quota enforcement, removing bogus selinux audit messages, and fixing log corruption and problems with log recovery. There will be a second pull request later on in the merge window with more bug fixes. Dave Chinner will be taking over as XFS maintainer for one release cycle, starting from the day 5.18-rc1 drops until 5.19-rc1 is tagged so that I can focus on starting a massive design review for the (feature complete after five years) online repair feature. Summary: - Fix some incorrect mapping state being passed to iomap during COW - Don't create bogus selinux audit messages when deciding to degrade gracefully due to lack of privilege - Fix setattr implementation to use VFS helpers so that we drop setgid consistently with the other filesystems - Fix link/unlink/rename to check quota limits - Constify xfs_name_dotdot to prevent abuse of in-kernel symbols - Fix log livelock between the AIL and inodegc threads during recovery - Fix a log stall when the AIL races with pushers - Fix stalls in CIL flushes due to pinned inode cluster buffers during recovery - Fix log corruption due to incorrect usage of xfs_is_shutdown vs xlog_is_shutdown because during an induced fs shutdown, AIL writeback must continue until the log is shut down, even if the filesystem has already shut down" * tag 'xfs-5.18-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: xfs_is_shutdown vs xlog_is_shutdown cage fight xfs: AIL should be log centric xfs: log items should have a xlog pointer, not a mount xfs: async CIL flushes need pending pushes to be made stable xfs: xfs_ail_push_all_sync() stalls when racing with updates xfs: check buffer pin state after locking in delwri_submit xfs: log worker needs to start before intent/unlink recovery xfs: constify xfs_name_dotdot xfs: constify the name argument to various directory functions xfs: reserve quota for target dir expansion when renaming files xfs: reserve quota for dir expansion when linking/unlinking files xfs: refactor user/group quota chown in xfs_setattr_nonsize xfs: use setattr_copy to set vfs inode attributes xfs: don't generate selinux audit messages for capability testing xfs: add missing cmap->br_state = XFS_EXT_NORM update
2022-03-24Merge tag 'dax-for-5.18' of ↵Linus Torvalds2-1/+3
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull DAX updates from Dan Williams: "Andrew has been shepherding major dax features that touch the core -mm through his tree, but I still collect the dax updates that are core-mm independent. - Fix a crash due to a missing rcu_barrier() in dax_fs_exit() - Fix two miscellaneous doc issues" * tag 'dax-for-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: dax: Fix missing kdoc for dax_device dax: make sure inodes are flushed before destroy cache fsdax: fix function description
2022-03-24Merge tag 'cxl-for-5.18' of ↵Linus Torvalds32-1225/+3725
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull CXL (Compute Express Link) updates from Dan Williams: "This development cycle extends the subsystem to discover CXL resources throughout a CXL/PCIe switch topology and respond to hot add/remove events anywhere in that topology. This is more foundational infrastructure in preparation for dynamic memory region provisioning support. Recall that CXL memory regions, as the new "Theory of Operation" section of Documentation/driver-api/cxl/memory-devices.rst describes, bring storage volume striping semantics to memory. The hot add/remove behavior is validated with extensions to the cxl_test unit test environment and this test in the cxl-cli test suite: https://github.com/pmem/ndctl/blob/djbw/for-74/cxl/test/cxl-topology.sh Summary: - Add a driver for 'struct cxl_memdev' objects responsible for CXL.mem operation as distinct from 'cxl_pci' mailbox operations. Its primary responsibility is enumerating an endpoint 'struct cxl_port' and all the 'struct cxl_port' instances between an endpoint and the CXL platform root. - Add a driver for 'struct cxl_port' objects responsible for enumerating and operating all Host-managed Device Memory (HDM) decoder resources between the platform-level CXL memory description, all intervening host bridges / switches, and the HDM resources in endpoints. - Update the cxl_pci driver to validate CXL.mem operation precursors to HDM decoder operation like ready-polling, and legacy CXL 1.1 DVSEC based CXL.mem configuration. - Add basic lockdep coverage for usage of device_lock() on CXL subsystem objects similar to what exists for LIBNVDIMM. Include a compile-time switch for which subsystem to validate at run-time. - Update cxl_test to emulate a one level switch topology. - Document a "Theory of Operation" for the subsystem. - Add 'numa_node' and 'serial' attributes to cxl_memdev sysfs - Include miscellaneous fixes for spec / QEMU CXL emulation compatibility and static analysis reports" * tag 'cxl-for-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (48 commits) cxl/core/port: Fix NULL but dereferenced coccicheck error cxl/port: Hold port reference until decoder release cxl/port: Fix endpoint refcount leak cxl/core: Fix cxl_device_lock() class detection cxl/core/port: Fix unregister_port() lock assertion cxl/regs: Fix size of CXL Capability Header Register cxl/core/port: Handle invalid decoders cxl/core/port: Fix / relax decoder target enumeration tools/testing/cxl: Add a physical_node link tools/testing/cxl: Enumerate mock decoders tools/testing/cxl: Mock one level of switches tools/testing/cxl: Fix root port to host bridge assignment tools/testing/cxl: Mock dvsec_ranges() cxl/core/port: Add endpoint decoders cxl/core: Move target_list out of base decoder attributes cxl/mem: Add the cxl_mem driver cxl/core/port: Add switch port enumeration cxl/memdev: Add numa_node attribute cxl/pci: Emit device serial number cxl/pci: Implement wait for media active ...
2022-03-24clk: qcom: gcc-msm8994: Fix gpll4 widthKonrad Dybcio1-0/+1
The gpll4 postdiv is actually a div4, so make sure that Linux is aware of this. This fixes the following error messages: mmc1: Card appears overclocked; req 200000000 Hz, actual 343999999 Hz mmc1: Card appears overclocked; req 400000000 Hz, actual 687999999 Hz Fixes: aec89f78cf01 ("clk: qcom: Add support for msm8994 global clock controller") Signed-off-by: Konrad Dybcio <[email protected]> Link: https://lore.kernel.org/r/[email protected] Tested-by: Petr Vorel <[email protected]> Reviewed-by: Petr Vorel <[email protected]> Signed-off-by: Stephen Boyd <[email protected]>
2022-03-24net: dsa: realtek: make interface drivers depend on OFAlvin Šipraga1-0/+2
The kernel test robot reported build warnings with a randconfig that built realtek-{smi,mdio} without CONFIG_OF set. Since both interface drivers are using OF and will not probe without, add the corresponding dependency to Kconfig. Link: https://lore.kernel.org/all/[email protected]/ Link: https://lore.kernel.org/all/[email protected]/ Fixes: aac94001067d ("net: dsa: realtek: add new mdio interface for drivers") Fixes: 765c39a4fafe ("net: dsa: realtek: convert subdrivers into modules") Signed-off-by: Alvin Šipraga <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Acked-by: Luiz Angelo Daros de Luca <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-03-24dt-bindings: clock: fix dt_binding_check error for qcom,gcc-other.yamlAnsuel Smith1-1/+1
qcom,gcc-other Documentation lacks a '|' for the description. This cause dt_binding_check to incorrectly parse "See also:" as a new value. Add the missing '|' to correctly parse the description. Fixes: a03965ed1310 ("dt-bindings: clock: split qcom,gcc.yaml to common and specific schema") Signed-off-by: Ansuel Smith <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Krzysztof Kozlowski <[email protected]> Reported-by: Rob Herring <[email protected]> Signed-off-by: Stephen Boyd <[email protected]>
2022-03-24net: stmmac: dwmac-qcom-ethqos: Enable RGMII functional clock on resumeBjorn Andersson1-0/+7
When the Qualcomm ethqos driver is properly described in its associated GDSC power-domain, the hardware will be powered down and loose its state between qcom_ethqos_probe() and stmmac_init_dma_engine(). The result of this is that the functional clock from the RGMII IO macro is no longer provides and the DMA software reset in dwmac4_dma_reset() will time out, due to lacking clock signal. Re-enable the functional clock, as part of the Qualcomm specific clock enablement sequence to avoid this problem. The final clock configuration will be adjusted by ethqos_fix_mac_speed() once the link is being brought up. Fixes: a7c30e62d4b8 ("net: stmmac: Add driver for Qualcomm ethqos") Signed-off-by: Bjorn Andersson <[email protected]> Tested-and-reviewed-by: Bhupesh Sharma <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-03-25fbdev: Fix cfb_imageblit() for arbitrary image widthsThomas Zimmermann1-4/+24
Commit 0d03011894d2 ("fbdev: Improve performance of cfb_imageblit()") broke cfb_imageblit() for image widths that are not aligned to 8-bit boundaries. Fix this by handling the trailing pixels on each line separately. The performance improvements in the original commit do not regress by this change. Signed-off-by: Thomas Zimmermann <[email protected]> Fixes: 0d03011894d2 ("fbdev: Improve performance of cfb_imageblit()") Reported-by: Marek Szyprowski <[email protected]> Cc: Thomas Zimmermann <[email protected]> Cc: Javier Martinez Canillas <[email protected]> Cc: Sam Ravnborg <[email protected]> Tested-by: Marek Szyprowski <[email protected]> Acked-by: Daniel Vetter <[email protected]> Reviewed-by: Javier Martinez Canillas <[email protected]> Tested-by: Guenter Roeck <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2022-03-25fbdev: Fix sys_imageblit() for arbitrary image widthsThomas Zimmermann1-4/+25
Commit 6f29e04938bf ("fbdev: Improve performance of sys_imageblit()") broke sys_imageblit() for image width that are not aligned to 8-bit boundaries. Fix this by handling the trailing pixels on each line separately. The performance improvements in the original commit do not regress by this change. Signed-off-by: Thomas Zimmermann <[email protected]> Fixes: 6f29e04938bf ("fbdev: Improve performance of sys_imageblit()") Cc: Thomas Zimmermann <[email protected]> Cc: Javier Martinez Canillas <[email protected]> Cc: Sam Ravnborg <[email protected]> Tested-by: Geert Uytterhoeven <[email protected]> Reviewed-by: Javier Martinez Canillas <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2022-03-25Merge tag 'drm-misc-next-fixes-2022-03-24-1' of ↵Dave Airlie3-10/+15
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next-fixes for v5.18-rc1: - Make audio and color plane support checking only happen when a CEA extension block is found. - Fix a small regression from ttm_resource_fini() - Small selftest fix. Signed-off-by: Dave Airlie <[email protected]> From: Maarten Lankhorst <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2022-03-25Merge tag 'drm-intel-next-fixes-2022-03-24' of ↵Dave Airlie4-6/+18
git://anongit.freedesktop.org/drm/drm-intel into drm-next - Reject unsupported TMDS rates on ICL+ (Ville Syrjälä) - Treat SAGV block time 0 as SAGV disabled (Ville Syrjälä) - Fix PSF GV point mask when SAGV is not possible (Ville Syrjälä) - Fix renamed INTEL_INFO->media.arch/ver field (Lucas De Marchi) Signed-off-by: Dave Airlie <[email protected]> From: Tvrtko Ursulin <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/YjwvgGzYNAX5rxHN@tursulin-mobl2
2022-03-24Merge tag 'drm-next-2022-03-24' of git://anongit.freedesktop.org/drm/drmLinus Torvalds1466-33826/+487487
Pull drm updates from Dave Airlie: "Lots of work all over, Intel improving DG2 support, amdkfd CRIU support, msm new hw support, and faster fbdev support. dma-buf: - rename dma-buf-map to iosys-map core: - move buddy allocator to core - add pci/platform init macros - improve EDID parser deep color handling - EDID timing type 7 support - add GPD Win Max quirk - add yes/no helpers to string_helpers - flatten syncobj chains - add nomodeset support to lots of drivers - improve fb-helper clipping support - add default property value interface fbdev: - improve fbdev ops speed ttm: - add a backpointer from ttm bo->ttm resource dp: - move displayport headers - add a dp helper module bridge: - anx7625 atomic support, HDCP support panel: - split out panel-lvds and lvds bindings - find panels in OF subnodes privacy: - add chromeos privacy screen support fb: - hot unplug fw fb on forced removal simpledrm: - request region instead of marking ioresource busy - add panel oreintation property udmabuf: - fix oops with 0 pages amdgpu: - power management code cleanup - Enable freesync video mode by default - RAS code cleanup - Improve VRAM access for debug using SDMA - SR-IOV rework special register access and fixes - profiling power state request ioctl - expose IP discovery via sysfs - Cyan skillfish updates - GC 10.3.7, SDMA 5.2.7, DCN 3.1.6 updates - expose benchmark tests via debugfs - add module param to disable XGMI for testing - GPU reset debugfs register dumping support amdkfd: - CRIU support - SDMA queue fixes radeon: - UVD suspend fix - iMac backlight fix i915: - minimal parallel submission for execlists - DG2-G12 subplatform added - DG2 programming workarounds - DG2 accelerated migration support - flat CCS and CCS engine support for XeHP - initial small BAR support - drop fake LMEM support - ADL-N PCH support - bigjoiner updates - introduce VMA resources and async unbinding - register definitions cleanups - multi-FBC refactoring - DG1 OPROM over SPI support - ADL-N platform enabling - opregion mailbox #5 support - DP MST ESI improvements - drm device based logging - async flip optimisation for DG2 - CPU arch abstraction fixes - improve GuC ADS init to work on aarch64 - tweak TTM LRU priority hint - GuC 69.0.3 support - remove short term execbuf pins nouveau: - higher DP/eDP bitrates - backlight fixes msm: - dpu + dp support for sc8180x - dp support for sm8350 - dpu + dsi support for qcm2290 - 10nm dsi phy tuning support - bridge support for dp encoder - gpu support for additional 7c3 SKUs ingenic: - HDMI support for JZ4780 - aux channel EDID support ast: - AST2600 support - add wide screen support - create DP/DVI connectors omapdrm: - fix implicit dma_buf fencing vc4: - add CSC + full range support - better display firmware handoff panfrost: - add initial dual-core GPU support stm: - new revision support - fb handover support mediatek: - transfer display binding document to yaml format. - add mt8195 display device binding. - allow commands to be sent during video mode. - add wait_for_event for crtc disable by cmdq. tegra: - YUV format support rcar-du: - LVDS support for M3-W+ (R8A77961) exynos: - BGR pixel format for FIMD device" * tag 'drm-next-2022-03-24' of git://anongit.freedesktop.org/drm/drm: (1529 commits) drm/i915/display: Do not re-enable PSR after it was marked as not reliable drm/i915/display: Fix HPD short pulse handling for eDP drm/amdgpu: Use drm_mode_copy() drm/radeon: Use drm_mode_copy() drm/amdgpu: Use ternary operator in `vcn_v1_0_start()` drm/amdgpu: Remove pointless on stack mode copies drm/amd/pm: fix indenting in __smu_cmn_reg_print_error() drm/amdgpu/dc: fix typos in comments drm/amdgpu: fix typos in comments drm/amd/pm: fix typos in comments drm/amdgpu: Add stolen reserved memory for MI25 SRIOV. drm/amdgpu: Merge get_reserved_allocation to get_vbios_allocations. drm/amdkfd: evict svm bo worker handle error drm/amdgpu/vcn: fix vcn ring test failure in igt reload test drm/amdgpu: only allow secure submission on rings which support that drm/amdgpu: fixed the warnings reported by kernel test robot drm/amd/display: 3.2.177 drm/amd/display: [FW Promotion] Release 0.0.108.0 drm/amd/display: Add save/restore PANEL_PWRSEQ_REF_DIV2 drm/amd/display: Wait for hubp read line for Pollock ...
2022-03-24pinctrl: qcom-pmic-gpio: Add support for pm8450Dmitry Baryshkov2-0/+2
PM8450 provides 4 GPIOs. Add a compatible entry for this GPIO block. Signed-off-by: Dmitry Baryshkov <[email protected]> Acked-by: Krzysztof Kozlowski <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Linus Walleij <[email protected]>
2022-03-24dt-bindings: pinctrl: aspeed: Update gfx node in exampleJoel Stanley1-0/+16
The example needs updating to match the to be added yaml bindings for the gfx node. Signed-off-by: Joel Stanley <[email protected]> Reviewed-by: Linus Walleij <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Linus Walleij <[email protected]>
2022-03-24Merge branch 'akpm' (patches from Andrew)Linus Torvalds34-303/+311
Merge more updates from Andrew Morton: "Various misc subsystems, before getting into the post-linux-next material. 41 patches. Subsystems affected by this patch series: procfs, misc, core-kernel, lib, checkpatch, init, pipe, minix, fat, cgroups, kexec, kdump, taskstats, panic, kcov, resource, and ubsan" * emailed patches from Andrew Morton <[email protected]>: (41 commits) Revert "ubsan, kcsan: Don't combine sanitizer with kcov on clang" kernel/resource: fix kfree() of bootmem memory again kcov: properly handle subsequent mmap calls kcov: split ioctl handling into locked and unlocked parts panic: move panic_print before kmsg dumpers panic: add option to dump all CPUs backtraces in panic_print docs: sysctl/kernel: add missing bit to panic_print taskstats: remove unneeded dead assignment kasan: no need to unset panic_on_warn in end_report() ubsan: no need to unset panic_on_warn in ubsan_epilogue() panic: unset panic_on_warn inside panic() docs: kdump: add scp example to write out the dump file docs: kdump: update description about sysfs file system support arm64: mm: use IS_ENABLED(CONFIG_KEXEC_CORE) instead of #ifdef x86/setup: use IS_ENABLED(CONFIG_KEXEC_CORE) instead of #ifdef riscv: mm: init: use IS_ENABLED(CONFIG_KEXEC_CORE) instead of #ifdef kexec: make crashk_res, crashk_low_res and crash_notes symbols always visible cgroup: use irqsave in cgroup_rstat_flush_locked(). fat: use pointer to simple type in put_user() minix: fix bug when opening a file with O_DIRECT ...
2022-03-24tools headers cpufeatures: Sync with the kernel sourcesArnaldo Carvalho de Melo1-3/+4
To pick the changes from: fa31a4d669bd471e ("x86/cpufeatures: Put the AMX macros in the word 18 block") 7b8f40b3de75c971 ("x86/cpu: Add definitions for the Intel Hardware Feedback Interface") This only causes these perf files to be rebuilt: CC /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o CC /tmp/build/perf/bench/mem-memset-x86-64-asm.o And addresses this perf build warning: Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h' diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h Cc: Borislav Petkov <[email protected]> Cc: Jim Mattson <[email protected]> Cc: Rafael J. Wysocki <[email protected]> Cc: Ricardo Neri <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-03-24tools headers cpufeatures: Sync with the kernel sourcesArnaldo Carvalho de Melo1-2/+5
To pick the changes in: 7c1ef59145f1c8bf ("x86/cpufeatures: Re-enable ENQCMD") That causes only these 'perf bench' objects to rebuild: CC /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o CC /tmp/build/perf/bench/mem-memset-x86-64-asm.o And addresses these perf build warnings: Warning: Kernel ABI header at 'tools/arch/x86/include/asm/disabled-features.h' differs from latest version at 'arch/x86/include/asm/disabled-features.h' diff -u tools/arch/x86/include/asm/disabled-features.h arch/x86/include/asm/disabled-features.h Cc: Borislav Petkov <[email protected]> Cc: Fenghua Yu <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-03-24perf stat: Fix forked applications enablement of countersThomas Richter1-1/+1
I have run into the following issue: # perf stat -a -e new_pmu/INSTRUCTION_7/ -- mytest -c1 7 Performance counter stats for 'system wide': 0 new_pmu/INSTRUCTION_7/ 0.000366428 seconds time elapsed # The new PMU for s390 counts the execution of certain CPU instructions. The root cause is the extremely small run time of the mytest program. It just executes some assembly instructions and then exits. In above invocation the instruction is executed exactly one time (-c1 option). The PMU is expected to report this one time execution by a counter value of one, but fails to do so in some cases, not all. Debugging reveals the invocation of the child process is done *before* the counter events are installed and enabled. Tracing reveals that sometimes the child process starts and exits before the event is installed on all CPUs. The more CPUs the machine has, the more often this miscount happens. Fix this by reversing the start of the work load after the events have been installed on the specified CPUs. Now the comment also matches the code. Output after: # perf stat -a -e new_pmu/INSTRUCTION_7/ -- mytest -c1 7 Performance counter stats for 'system wide': 1 new_pmu/INSTRUCTION_7/ 0.000366428 seconds time elapsed # Now the correct result is reported rock solid all the time regardless how many CPUs are online. Reviewers notes: Jiri: Right, without -a the event has enable_on_exec so the race does not matter, but it's a problem for system wide with fork. Namhyung: Agreed. Also we may move the enable_counters() and the clock code out of the if block to be shared with the else block. Fixes: acf2892270dcc428 ("perf stat: Use perf_evlist__prepare/start_workload()") Signed-off-by: Thomas Richter <[email protected]> Acked-by: Jiri Olsa <[email protected]> Acked-by: Namhyung Kim <[email protected]> Acked-by: Sumanth Korikkar <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Vasily Gorbik <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-03-24tools arch x86: Sync the msr-index.h copy with the kernel sourcesArnaldo Carvalho de Melo1-0/+6
To pick up the changes from these csets: 7b8f40b3de75c971 ("x86/cpu: Add definitions for the Intel Hardware Feedback Interface") That cause no changes to tooling: $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after $ diff -u before after $ Just silences this perf build warning: Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h' diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h Cc: Rafael J. Wysocki <[email protected]> Cc: Ricardo Neri <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-03-24Merge tag 'net-next-5.18' of ↵Linus Torvalds2020-36565/+120737
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "The sprinkling of SPI drivers is because we added a new one and Mark sent us a SPI driver interface conversion pull request. Core ---- - Introduce XDP multi-buffer support, allowing the use of XDP with jumbo frame MTUs and combination with Rx coalescing offloads (LRO). - Speed up netns dismantling (5x) and lower the memory cost a little. Remove unnecessary per-netns sockets. Scope some lists to a netns. Cut down RCU syncing. Use batch methods. Allow netdev registration to complete out of order. - Support distinguishing timestamp types (ingress vs egress) and maintaining them across packet scrubbing points (e.g. redirect). - Continue the work of annotating packet drop reasons throughout the stack. - Switch netdev error counters from an atomic to dynamically allocated per-CPU counters. - Rework a few preempt_disable(), local_irq_save() and busy waiting sections problematic on PREEMPT_RT. - Extend the ref_tracker to allow catching use-after-free bugs. BPF --- - Introduce "packing allocator" for BPF JIT images. JITed code is marked read only, and used to be allocated at page granularity. Custom allocator allows for more efficient memory use, lower iTLB pressure and prevents identity mapping huge pages from getting split. - Make use of BTF type annotations (e.g. __user, __percpu) to enforce the correct probe read access method, add appropriate helpers. - Convert the BPF preload to use light skeleton and drop the user-mode-driver dependency. - Allow XDP BPF_PROG_RUN test infra to send real packets, enabling its use as a packet generator. - Allow local storage memory to be allocated with GFP_KERNEL if called from a hook allowed to sleep. - Introduce fprobe (multi kprobe) to speed up mass attachment (arch bits to come later). - Add unstable conntrack lookup helpers for BPF by using the BPF kfunc infra. - Allow cgroup BPF progs to return custom errors to user space. - Add support for AF_UNIX iterator batching. - Allow iterator programs to use sleepable helpers. - Support JIT of add, and, or, xor and xchg atomic ops on arm64. - Add BTFGen support to bpftool which allows to use CO-RE in kernels without BTF info. - Large number of libbpf API improvements, cleanups and deprecations. Protocols --------- - Micro-optimize UDPv6 Tx, gaining up to 5% in test on dummy netdev. - Adjust TSO packet sizes based on min_rtt, allowing very low latency links (data centers) to always send full-sized TSO super-frames. - Make IPv6 flow label changes (AKA hash rethink) more configurable, via sysctl and setsockopt. Distinguish between server and client behavior. - VxLAN support to "collect metadata" devices to terminate only configured VNIs. This is similar to VLAN filtering in the bridge. - Support inserting IPv6 IOAM information to a fraction of frames. - Add protocol attribute to IP addresses to allow identifying where given address comes from (kernel-generated, DHCP etc.) - Support setting socket and IPv6 options via cmsg on ping6 sockets. - Reject mis-use of ECN bits in IP headers as part of DSCP/TOS. Define dscp_t and stop taking ECN bits into account in fib-rules. - Add support for locked bridge ports (for 802.1X). - tun: support NAPI for packets received from batched XDP buffs, doubling the performance in some scenarios. - IPv6 extension header handling in Open vSwitch. - Support IPv6 control message load balancing in bonding, prevent neighbor solicitation and advertisement from using the wrong port. Support NS/NA monitor selection similar to existing ARP monitor. - SMC - improve performance with TCP_CORK and sendfile() - support auto-corking - support TCP_NODELAY - MCTP (Management Component Transport Protocol) - add user space tag control interface - I2C binding driver (as specified by DMTF DSP0237) - Multi-BSSID beacon handling in AP mode for WiFi. - Bluetooth: - handle MSFT Monitor Device Event - add MGMT Adv Monitor Device Found/Lost events - Multi-Path TCP: - add support for the SO_SNDTIMEO socket option - lots of selftest cleanups and improvements - Increase the max PDU size in CAN ISOTP to 64 kB. Driver API ---------- - Add HW counters for SW netdevs, a mechanism for devices which offload packet forwarding to report packet statistics back to software interfaces such as tunnels. - Select the default NIC queue count as a fraction of number of physical CPU cores, instead of hard-coding to 8. - Expose devlink instance locks to drivers. Allow device layer of drivers to use that lock directly instead of creating their own which always runs into ordering issues in devlink callbacks. - Add header/data split indication to guide user space enabling of TCP zero-copy Rx. - Allow configuring completion queue event size. - Refactor page_pool to enable fragmenting after allocation. - Add allocation and page reuse statistics to page_pool. - Improve Multiple Spanning Trees support in the bridge to allow reuse of topologies across VLANs, saving HW resources in switches. - DSA (Distributed Switch Architecture): - replay and offload of host VLAN entries - offload of static and local FDB entries on LAG interfaces - FDB isolation and unicast filtering New hardware / drivers ---------------------- - Ethernet: - LAN937x T1 PHYs - Davicom DM9051 SPI NIC driver - Realtek RTL8367S, RTL8367RB-VB switch and MDIO - Microchip ksz8563 switches - Netronome NFP3800 SmartNICs - Fungible SmartNICs - MediaTek MT8195 switches - WiFi: - mt76: MediaTek mt7916 - mt76: MediaTek mt7921u USB adapters - brcmfmac: Broadcom BCM43454/6 - Mobile: - iosm: Intel M.2 7360 WWAN card Drivers ------- - Convert many drivers to the new phylink API built for split PCS designs but also simplifying other cases. - Intel Ethernet NICs: - add TTY for GNSS module for E810T device - improve AF_XDP performance - GTP-C and GTP-U filter offload - QinQ VLAN support - Mellanox Ethernet NICs (mlx5): - support xdp->data_meta - multi-buffer XDP - offload tc push_eth and pop_eth actions - Netronome Ethernet NICs (nfp): - flow-independent tc action hardware offload (police / meter) - AF_XDP - Other Ethernet NICs: - at803x: fiber and SFP support - xgmac: mdio: preamble suppression and custom MDC frequencies - r8169: enable ASPM L1.2 if system vendor flags it as safe - macb/gem: ZynqMP SGMII - hns3: add TX push mode - dpaa2-eth: software TSO - lan743x: multi-queue, mdio, SGMII, PTP - axienet: NAPI and GRO support - Mellanox Ethernet switches (mlxsw): - source and dest IP address rewrites - RJ45 ports - Marvell Ethernet switches (prestera): - basic routing offload - multi-chain TC ACL offload - NXP embedded Ethernet switches (ocelot & felix): - PTP over UDP with the ocelot-8021q DSA tagging protocol - basic QoS classification on Felix DSA switch using dcbnl - port mirroring for ocelot switches - Microchip high-speed industrial Ethernet (sparx5): - offloading of bridge port flooding flags - PTP Hardware Clock - Other embedded switches: - lan966x: PTP Hardward Clock - qca8k: mdio read/write operations via crafted Ethernet packets - Qualcomm 802.11ax WiFi (ath11k): - add LDPC FEC type and 802.11ax High Efficiency data in radiotap - enable RX PPDU stats in monitor co-exist mode - Intel WiFi (iwlwifi): - UHB TAS enablement via BIOS - band disablement via BIOS - channel switch offload - 32 Rx AMPDU sessions in newer devices - MediaTek WiFi (mt76): - background radar detection - thermal management improvements on mt7915 - SAR support for more mt76 platforms - MBSSID and 6 GHz band on mt7915 - RealTek WiFi: - rtw89: AP mode - rtw89: 160 MHz channels and 6 GHz band - rtw89: hardware scan - Bluetooth: - mt7921s: wake on Bluetooth, SCO over I2S, wide-band-speed (WBS) - Microchip CAN (mcp251xfd): - multiple RX-FIFOs and runtime configurable RX/TX rings - internal PLL, runtime PM handling simplification - improve chip detection and error handling after wakeup" * tag 'net-next-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2521 commits) llc: fix netdevice reference leaks in llc_ui_bind() drivers: ethernet: cpsw: fix panic when interrupt coaleceing is set via ethtool ice: don't allow to run ice_send_event_to_aux() in atomic ctx ice: fix 'scheduling while atomic' on aux critical err interrupt net/sched: fix incorrect vlan_push_eth dest field net: bridge: mst: Restrict info size queries to bridge ports net: marvell: prestera: add missing destroy_workqueue() in prestera_module_init() drivers: net: xgene: Fix regression in CRC stripping net: geneve: add missing netlink policy and size for IFLA_GENEVE_INNER_PROTO_INHERIT net: dsa: fix missing host-filtered multicast addresses net/mlx5e: Fix build warning, detected write beyond size of field iwlwifi: mvm: Don't fail if PPAG isn't supported selftests/bpf: Fix kprobe_multi test. Revert "rethook: x86: Add rethook x86 implementation" Revert "arm64: rethook: Add arm64 rethook implementation" Revert "powerpc: Add rethook support" Revert "ARM: rethook: Add rethook arm implementation" netdevice: add missing dm_private kdoc net: bridge: mst: prevent NULL deref in br_mst_info_size() selftests: forwarding: Use same VRF for port and VLAN upper ...
2022-03-24Merge tag 'vfio-v5.18-rc1' of https://github.com/awilliam/linux-vfioLinus Torvalds40-358/+3558
Pull VFIO updates from Alex Williamson: - Introduce new device migration uAPI and implement device specific mlx5 vfio-pci variant driver supporting new protocol (Jason Gunthorpe, Yishai Hadas, Leon Romanovsky) - New HiSilicon acc vfio-pci variant driver, also supporting migration interface (Shameer Kolothum, Longfang Liu) - D3hot fixes for vfio-pci-core (Abhishek Sahu) - Document new vfio-pci variant driver acceptance criteria (Alex Williamson) - Fix UML build unresolved ioport_{un}map() functions (Alex Williamson) - Fix MAINTAINERS due to header movement (Lukas Bulwahn) * tag 'vfio-v5.18-rc1' of https://github.com/awilliam/linux-vfio: (31 commits) vfio-pci: Provide reviewers and acceptance criteria for variant drivers MAINTAINERS: adjust entry for header movement in hisilicon qm driver hisi_acc_vfio_pci: Use its own PCI reset_done error handler hisi_acc_vfio_pci: Add support for VFIO live migration crypto: hisilicon/qm: Set the VF QM state register hisi_acc_vfio_pci: Add helper to retrieve the struct pci_driver hisi_acc_vfio_pci: Restrict access to VF dev BAR2 migration region hisi_acc_vfio_pci: add new vfio_pci driver for HiSilicon ACC devices hisi_acc_qm: Move VF PCI device IDs to common header crypto: hisilicon/qm: Move few definitions to common header crypto: hisilicon/qm: Move the QM header to include/linux vfio/mlx5: Fix to not use 0 as NULL pointer PCI/IOV: Fix wrong kernel-doc identifier vfio/mlx5: Use its own PCI reset_done error handler vfio/pci: Expose vfio_pci_core_aer_err_detected() vfio/mlx5: Implement vfio_pci driver for mlx5 devices vfio/mlx5: Expose migration commands over mlx5 device vfio: Remove migration protocol v1 documentation vfio: Extend the device migration protocol with RUNNING_P2P vfio: Define device migration protocol v2 ...
2022-03-24Merge tag 'hyperv-next-signed-20220322' of ↵Linus Torvalds8-27/+42
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv updates from Wei Liu: "Minor patches from various people" * tag 'hyperv-next-signed-20220322' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: x86/hyperv: Output host build info as normal Windows version number hv_balloon: rate-limit "Unhandled message" warning drivers: hv: log when enabling crash_kexec_post_notifiers hv_utils: Add comment about max VMbus packet size in VSS driver Drivers: hv: Compare cpumasks and not their weights in init_vp_index() Drivers: hv: Rename 'alloced' to 'allocated' Drivers: hv: vmbus: Use struct_size() helper in kmalloc()
2022-03-24dt-bindings: pinctrl: rt2880: add missing pin groups and functionsArınç ÜNAL1-5/+6
Add the missing pin groups: jtag, wdt Add the missing functions: i2s, jtag, pcie refclk, pcie rst, pcm, spdif2, spdif3, wdt refclk, wdt rst Sort pin groups and functions in alphabetical order. Fix a typo. Signed-off-by: Arınç ÜNAL <[email protected]> Acked-by: Rob Herring <[email protected]> Acked-by: Sergio Paracuellos <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Linus Walleij <[email protected]>
2022-03-24pinctrl: ingenic: Fix regmap on X series SoCsAidan MacDonald1-1/+45
The X series Ingenic SoCs have a shadow GPIO group which is at a higher offset than the other groups, and is used for all GPIO configuration. The regmap did not take this offset into account and set max_register too low, so the regmap API blocked writes to the shadow group, which made the pinctrl driver unable to configure any pins. Fix this by adding regmap access tables to the chip info. The way that max_register was computed was also off by one, since max_register is an inclusive bound, not an exclusive bound; this has been fixed. Cc: [email protected] Signed-off-by: Aidan MacDonald <[email protected]> Fixes: 6626a76ef857 ("pinctrl: ingenic: Add .max_register in regmap_config") Reviewed-by: Paul Cercueil <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Linus Walleij <[email protected]>
2022-03-24pinctrl: nuvoton: Fix return value check in wpcm450_gpio_register()Jialin Zhang1-2/+3
In case of error, the function devm_platform_ioremap_resource() returns ERR_PTR() and never returns NULL. The NULL test in the return value check should be replaced with IS_ERR(). Fixes: a1d1e0e3d80a ("pinctrl: nuvoton: Add driver for WPCM450") Reported-by: Hulk Robot <[email protected]> Signed-off-by: Jialin Zhang <[email protected]> Reviewed-by: Jonathan Neuschäfer <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Linus Walleij <[email protected]>
2022-03-24pinctrl: nuvoton: wpcm450: off by one in wpcm450_gpio_register()Dan Carpenter1-1/+1
The > WPCM450_NUM_BANKS should be >= or it leads to an out of bounds access on the next line. Fixes: a1d1e0e3d80a ("pinctrl: nuvoton: Add driver for WPCM450") Signed-off-by: Dan Carpenter <[email protected]> Reviewed-by: Jonathan Neuschäfer <[email protected]> Link: https://lore.kernel.org/r/20220318071131.GA29472@kili Signed-off-by: Linus Walleij <[email protected]>
2022-03-24pinctrl: nuvoton: wpcm450: select GENERIC_PINCTRL_GROUPSJonathan Neuschäfer1-0/+1
CONFIG_GENERIC_PINCTRL_GROUPS must be selected in order for struct group_desc to be defined in pinctrl/core.h. Add the missing select line to CONFIG_PINCTRL_WPCM450. Reported-by: kernel test robot <[email protected]> Fixes: a1d1e0e3d80a ("pinctrl: nuvoton: Add driver for WPCM450") Signed-off-by: Jonathan Neuschäfer <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Linus Walleij <[email protected]>
2022-03-24pinctrl: nuvoton: Fix sparse warningLinus Walleij2-3/+3
Sparse complains: drivers/pinctrl/nuvoton/pinctrl-wpcm450.c:626:9: sparse: sparse: obsolete array initializer, use C99 syntax This is because no equal sign is between the array index and the assignments, in the macro. Fix it up. Reviewed-by: Jonathan Neuschäfer <[email protected]> Reported-by: kernel test robot <[email protected]> Signed-off-by: Linus Walleij <[email protected]>
2022-03-24pinctrl: mediatek: mt8186: Account for probe refactoringLinus Walleij1-6/+2
The new MT8186 drive came in and the probe calls were refactored at the same time. Fix it up. Fixes a build issue. Cc: Guodong Liu <[email protected]> Cc: AngeloGioacchino Del Regno <[email protected]> Reported-by: kernel test robot <[email protected]> Signed-off-by: Linus Walleij <[email protected]>
2022-03-24Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds143-2516/+6279
Pull kvm updates from Paolo Bonzini: "ARM: - Proper emulation of the OSLock feature of the debug architecture - Scalibility improvements for the MMU lock when dirty logging is on - New VMID allocator, which will eventually help with SVA in VMs - Better support for PMUs in heterogenous systems - PSCI 1.1 support, enabling support for SYSTEM_RESET2 - Implement CONFIG_DEBUG_LIST at EL2 - Make CONFIG_ARM64_ERRATUM_2077057 default y - Reduce the overhead of VM exit when no interrupt is pending - Remove traces of 32bit ARM host support from the documentation - Updated vgic selftests - Various cleanups, doc updates and spelling fixes RISC-V: - Prevent KVM_COMPAT from being selected - Optimize __kvm_riscv_switch_to() implementation - RISC-V SBI v0.3 support s390: - memop selftest - fix SCK locking - adapter interruptions virtualization for secure guests - add Claudio Imbrenda as maintainer - first step to do proper storage key checking x86: - Continue switching kvm_x86_ops to static_call(); introduce static_call_cond() and __static_call_ret0 when applicable. - Cleanup unused arguments in several functions - Synthesize AMD 0x80000021 leaf - Fixes and optimization for Hyper-V sparse-bank hypercalls - Implement Hyper-V's enlightened MSR bitmap for nested SVM - Remove MMU auditing - Eager splitting of page tables (new aka "TDP" MMU only) when dirty page tracking is enabled - Cleanup the implementation of the guest PGD cache - Preparation for the implementation of Intel IPI virtualization - Fix some segment descriptor checks in the emulator - Allow AMD AVIC support on systems with physical APIC ID above 255 - Better API to disable virtualization quirks - Fixes and optimizations for the zapping of page tables: - Zap roots in two passes, avoiding RCU read-side critical sections that last too long for very large guests backed by 4 KiB SPTEs. - Zap invalid and defunct roots asynchronously via concurrency-managed work queue. - Allowing yielding when zapping TDP MMU roots in response to the root's last reference being put. - Batch more TLB flushes with an RCU trick. Whoever frees the paging structure now holds RCU as a proxy for all vCPUs running in the guest, i.e. to prolongs the grace period on their behalf. It then kicks the the vCPUs out of guest mode before doing rcu_read_unlock(). Generic: - Introduce __vcalloc and use it for very large allocations that need memcg accounting" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (246 commits) KVM: use kvcalloc for array allocations KVM: x86: Introduce KVM_CAP_DISABLE_QUIRKS2 kvm: x86: Require const tsc for RT KVM: x86: synthesize CPUID leaf 0x80000021h if useful KVM: x86: add support for CPUID leaf 0x80000021 KVM: x86: do not use KVM_X86_OP_OPTIONAL_RET0 for get_mt_mask Revert "KVM: x86/mmu: Zap only TDP MMU leafs in kvm_zap_gfn_range()" kvm: x86/mmu: Flush TLB before zap_gfn_range releases RCU KVM: arm64: fix typos in comments KVM: arm64: Generalise VM features into a set of flags KVM: s390: selftests: Add error memop tests KVM: s390: selftests: Add more copy memop tests KVM: s390: selftests: Add named stages for memop test KVM: s390: selftests: Add macro as abstraction for MEM_OP KVM: s390: selftests: Split memop tests KVM: s390x: fix SCK locking RISC-V: KVM: Implement SBI HSM suspend call RISC-V: KVM: Add common kvm_riscv_vcpu_wfi() function RISC-V: Add SBI HSM suspend related defines RISC-V: KVM: Implement SBI v0.3 SRST extension ...