aboutsummaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)AuthorFilesLines
2024-03-06mm: add __dump_folio()Matthew Wilcox (Oracle)2-0/+10
Turn __dump_page() into a wrapper around __dump_folio(). Snapshot the page & folio into a stack variable so we don't hit BUG_ON() if an allocation is freed under us and what was a folio pointer becomes a pointer to a tail page. [[email protected]: fix build issue] Link: https://lkml.kernel.org/r/[email protected] [[email protected]: fix __dump_folio] Link: https://lkml.kernel.org/r/[email protected] [[email protected]: fix pointer confusion] Link: https://lkml.kernel.org/r/[email protected] [[email protected]: s/printk/pr_warn/] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-03-06mm: remove PageYoung and PageIdle definitionsMatthew Wilcox (Oracle)1-4/+4
All callers have been converted to use folios, so remove the various set/clear/test functions defined on pages. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-03-06mm: remove PageWaiters, PageSetWaiters and PageClearWaitersMatthew Wilcox (Oracle)1-9/+1
All callers have been converted to use folios. This was the only user of PF_ONLY_HEAD, so remove that too. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-03-06mm: separate out FOLIO_FLAGS from PAGEFLAGSMatthew Wilcox (Oracle)1-20/+43
Patch series "PageFlags cleanups". We have now successfully removed all of the uses of some of the PageFlags from the kernel, but there's nothing to stop somebody reintroducing them. By splitting out FOLIO_FLAGS from PAGEFLAGS, we can stop defining the old flags; and we do that in some of the later patches. After doing this, I realised that dump_page() was living dangerously; we could end up calling folio_test_foo() on a pointer which no longer pointed to a folio (as dump_page() is not necessarily called when the caller has a reference to the page). So I fixed that up. And then I realised that this was the key to making dump_page() take a const argument, which means we can constify the page flags testing, which means we can remove more cast-away-the-const bad code. And here's where I ended up. This patch (of 8): We've progressed far enough with the folio transition that some flags are now no longer checked on pages, but only on folios. To prevent new users appearing, prepare to only define the folio versions of the flag test/set/clear. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-03-06hugetlb: parallelize 1G hugetlb initializationGang Li1-1/+1
Optimizing the initialization speed of 1G huge pages through parallelization. 1G hugetlbs are allocated from bootmem, a process that is already very fast and does not currently require optimization. Therefore, we focus on parallelizing only the initialization phase in `gather_bootmem_prealloc`. Here are some test results: test case no patch(ms) patched(ms) saved ------------------- -------------- ------------- -------- 256c2T(4 node) 1G 4745 2024 57.34% 128c1T(2 node) 1G 3358 1712 49.02% 12T 1G 77000 18300 76.23% [[email protected]: s/initialied/initialized/, per Alexey] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Gang Li <[email protected]> Tested-by: David Rientjes <[email protected]> Reviewed-by: Muchun Song <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Daniel Jordan <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Jane Chu <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Steffen Klassert <[email protected]> Cc: Tim Chen <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-03-06padata: downgrade padata_do_multithreaded to serial execution for non-SMPGang Li1-4/+8
hugetlb parallelization depends on PADATA, and PADATA depends on SMP. PADATA consists of two distinct functionality: One part is padata_do_multithreaded which disregards order and simply divides tasks into several groups for parallel execution. Hugetlb init parallelization depends on padata_do_multithreaded. The other part is composed of a set of APIs that, while handling data in an out-of-order parallel manner, can eventually return the data with ordered sequence. Currently Only `crypto/pcrypt.c` use them. All users of PADATA of non-SMP case currently only use padata_do_multithreaded. It is easy to implement a serial one in include/linux/padata.h. And it is not necessary to implement another functionality unless the only user of crypto/pcrypt.c does not depend on SMP in the future. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Gang Li <[email protected]> Tested-by: Paul E. McKenney <[email protected]> Acked-by: Daniel Jordan <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: David Rientjes <[email protected]> Cc: Jane Chu <[email protected]> Cc: Muchun Song <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Steffen Klassert <[email protected]> Cc: Tim Chen <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Mike Kravetz <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-03-06Author: Gang Li padata: dispatch works onGang Li Subject: padata: dispatch works on1-0/+2
different nodes Date: Thu, 22 Feb 2024 22:04:17 +0800 When a group of tasks that access different nodes are scheduled on the same node, they may encounter bandwidth bottlenecks and access latency. Thus, numa_aware flag is introduced here, allowing tasks to be distributed across different nodes to fully utilize the advantage of multi-node systems. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Gang Li <[email protected]> Tested-by: David Rientjes <[email protected]> Reviewed-by: Muchun Song <[email protected]> Reviewed-by: Tim Chen <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Daniel Jordan <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Jane Chu <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Steffen Klassert <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-03-06tracing: Limit trace_seq size to just 8K and not depend on architecture ↵Steven Rostedt (Google)1-1/+7
PAGE_SIZE The trace_seq buffer is used to print out entire events. It's typically set to PAGE_SIZE * 2 as there's some events that can be quite large. As a side effect, writes to trace_marker is limited by both the size of the trace_seq buffer as well as the ring buffer's sub-buffer size (which is a power of PAGE_SIZE). By limiting the trace_seq size, it also limits the size of the largest string written to trace_marker. trace_seq does not need to be dependent on PAGE_SIZE like the ring buffer sub-buffers need to be. Hard code it to 8K which is PAGE_SIZE * 2 on most architectures. This will also limit the size of trace_marker on those architectures with greater than 4K PAGE_SIZE. Link: https://lore.kernel.org/all/[email protected]/ Link: https://lore.kernel.org/linux-trace-kernel/[email protected] Cc: Masami Hiramatsu <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Sachin Sant <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
2024-03-06Merge tag 'md-6.9-20240306' of ↵Jens Axboe1-2/+0
https://git.kernel.org/pub/scm/linux/kernel/git/song/md into for-6.9/block Pull MD atomic queue limits changes from Song. * tag 'md-6.9-20240306' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md: block: remove disk_stack_limits md: remove mddev->queue md: don't initialize queue limits md/raid10: use the atomic queue limit update APIs md/raid5: use the atomic queue limit update APIs md/raid1: use the atomic queue limit update APIs md/raid0: use the atomic queue limit update APIs md: add queue limit helpers md: add a mddev_is_dm helper md: add a mddev_add_trace_msg helper md: add a mddev_trace_remap helper
2024-03-06block: remove disk_stack_limitsChristoph Hellwig1-2/+0
disk_stack_limits is unused now, remove it. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed--by: Song Liu <[email protected]> Tested-by: Song Liu <[email protected]> Signed-off-by: Song Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-03-06iommu: Add static iommu_ops->release_domainLu Baolu1-0/+1
The current device_release callback for individual iommu drivers does the following: 1) Silent IOMMU DMA translation: It detaches any existing domain from the device and puts it into a blocking state (some drivers might use the identity state). 2) Resource release: It releases resources allocated during the device_probe callback and restores the device to its pre-probe state. Step 1 is challenging for individual iommu drivers because each must check if a domain is already attached to the device. Additionally, if a deferred attach never occurred, the device_release should avoid modifying hardware configuration regardless of the reason for its call. To simplify this process, introduce a static release_domain within the iommu_ops structure. It can be either a blocking or identity domain depending on the iommu hardware. The iommu core will decide whether to attach this domain before the device_release callback, eliminating the need for repetitive code in various drivers. Consequently, the device_release callback can focus solely on the opposite operations of device_probe, including releasing all resources allocated during that callback. Co-developed-by: Jason Gunthorpe <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> Signed-off-by: Lu Baolu <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2024-03-06PCI: Make pci_dev_is_disconnected() helper public for other driversEthan Zhao1-0/+5
Make pci_dev_is_disconnected() public so that it can be called from Intel VT-d driver to quickly fix/workaround the surprise removal unplug hang issue for those ATS capable devices on PCIe switch downstream hotplug capable ports. Beside pci_device_is_present() function, this one has no config space space access, so is light enough to optimize the normal pure surprise removal and safe removal flow. Acked-by: Bjorn Helgaas <[email protected]> Reviewed-by: Dan Carpenter <[email protected]> Tested-by: Haorong Ye <[email protected]> Signed-off-by: Ethan Zhao <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lu Baolu <[email protected]> Signed-off-by: Joerg Roedel <[email protected]>
2024-03-06Merge tag 'vfs-6.8-release.fixes' of ↵Linus Torvalds1-16/+0
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - Get rid of copy_mc flag in iov_iter which really only makes sense for the core dumping code so move it out of the generic iov iter code and make it coredump's problem. See the detailed commit description. - Revert fs/aio: Make io_cancel() generate completions again The initial fix here was predicated on the assumption that calling ki_cancel() didn't complete aio requests. However, that turned out to be wrong since the two drivers that actually make use of this set a cancellation function that performs the cancellation correctly. So revert this change. - Ensure that the test for IOCB_AIO_RW always happens before the read from ki_ctx. * tag 'vfs-6.8-release.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: iov_iter: get rid of 'copy_mc' flag fs/aio: Check IOCB_AIO_RW before the struct aio_kiocb conversion Revert "fs/aio: Make io_cancel() generate completions again"
2024-03-06block: make block_class constantRicardo B. Marliere1-1/+1
Since commit 43a7206b0963 ("driver core: class: make class_register() take a const *"), the driver core allows for struct class to be in read-only memory, so move the block_class structure to be declared at build time placing it into read-only memory, instead of having to be dynamically allocated at boot time. Cc: Greg Kroah-Hartman <[email protected]> Suggested-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Ricardo B. Marliere <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2024-03-06greybus: constify the struct device_type usageRicardo B. Marliere1-6/+6
Since commit aed65af1cc2f ("drivers: make device_type const"), the driver core can properly handle constant struct device_type. Move the greybus_hd_type, greybus_module_type, greybus_interface_type, greybus_control_type, greybus_bundle_type and greybus_svc_type variables to be constant structures as well, placing it into read-only memory which can not be modified at runtime. Signed-off-by: "Ricardo B. Marliere" <[email protected]> Reviewed-by: Alex Elder <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2024-03-06greybus: make greybus_bus_type constGreg Kroah-Hartman1-1/+1
Now that the driver core can properly handle constant struct bus_type, move the greybus_bus_type variable to be a constant structure as well, placing it into read-only memory which can not be modified at runtime. Cc: Alex Elder <[email protected]> Cc: [email protected] Reviewed-by: Johan Hovold <[email protected]> Link: https://lore.kernel.org/r/2024010517-handgun-scoreless-05e7@gregkh Signed-off-by: Greg Kroah-Hartman <[email protected]>
2024-03-06Merge tag 'icc-6.9-rc1' of ↵Greg Kroah-Hartman1-5/+6
git://git.kernel.org/pub/scm/linux/kernel/git/djakov/icc into char-misc-next Georgi writes: interconnect changes for 6.9 This pull request contains the interconnect changes for the 6.9-rc1 merge window. The highlights are below: Core changes: - Constify the of_phandle_args in xlate functions. Driver changes: - New interconnect driver for the MSM8909 platform. - New interconnect driver for the SM7150 platform. - Clean-up and removal of unused resources in drivers. - Constify some pointers to structs. Signed-off-by: Georgi Djakov <[email protected]> * tag 'icc-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/djakov/icc: interconnect: qcom: Add SM7150 driver support dt-bindings: interconnect: Add Qualcomm SM7150 DT bindings interconnect: constify of_phandle_args in xlate dt-bindings: interconnect: qcom,rpmh: Fix bouncing @codeaurora address interconnect: qcom: x1e80100: constify pointer to qcom_icc_bcm interconnect: qcom: sa8775p: constify pointer to qcom_icc_bcm interconnect: qcom: sm6115: constify pointer to qcom_icc_node interconnect: qcom: sm8250: constify pointer to qcom_icc_node interconnect: qcom: sa8775p: constify pointer to qcom_icc_node interconnect: qcom: msm8909: constify pointer to qcom_icc_node interconnect: qcom: x1e80100: Remove bogus per-RSC BCMs and nodes dt-bindings: interconnect: Remove bogus interconnect nodes interconnect: qcom: sm8550: Remove bogus per-RSC BCMs and nodes interconnect: qcom: Add MSM8909 interconnect provider driver dt-bindings: interconnect: Add Qualcomm MSM8909 DT bindings
2024-03-06fsnotify: Fix misspelling of "writable"Vicki Pfau1-2/+2
Several file system notification system headers have "writable" misspelled as "writtable" in the comments. This patch fixes it in the fsnotify header. Signed-off-by: Vicki Pfau <[email protected]> Signed-off-by: Jan Kara <[email protected]> Message-Id: <[email protected]>
2024-03-06iov_iter: get rid of 'copy_mc' flagLinus Torvalds1-16/+0
This flag is only set by one single user: the magical core dumping code that looks up user pages one by one, and then writes them out using their kernel addresses (by using a BVEC_ITER). That actually ends up being a huge problem, because while we do use copy_mc_to_kernel() for this case and it is able to handle the possible machine checks involved, nothing else is really ready to handle the failures caused by the machine check. In particular, as reported by Tong Tiangen, we don't actually support fault_in_iov_iter_readable() on a machine check area. As a result, the usual logic for writing things to a file under a filesystem lock, which involves doing a copy with page faults disabled and then if that fails trying to fault pages in without holding the locks with fault_in_iov_iter_readable() does not work at all. We could decide to always just make the MC copy "succeed" (and filling the destination with zeroes), and that would then create a core dump file that just ignores any machine checks. But honestly, this single special case has been problematic before, and means that all the normal iov_iter code ends up slightly more complex and slower. See for example commit c9eec08bac96 ("iov_iter: Don't deal with iter->copy_mc in memcpy_from_iter_mc()") where David Howells re-organized the code just to avoid having to check the 'copy_mc' flags inside the inner iov_iter loops. So considering that we have exactly one user, and that one user is a non-critical special case that doesn't actually ever trigger in real life (Tong found this with manual error injection), the sane solution is to just decide that the onus on handling the machine check lines on that user instead. Ergo, do the copy_mc_to_kernel() in the core dump logic itself, copying the user data to a stable kernel page before writing it out. Fixes: f1982740f5e7 ("iov_iter: Convert iterate*() to inline funcs") Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Tong Tiangen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Link: https://lore.kernel.org/all/[email protected]/ Tested-by: David Howells <[email protected]> Reviewed-by: David Howells <[email protected]> Reviewed-by: Jens Axboe <[email protected]> Reported-by: Tong Tiangen <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
2024-03-06firmware: arm_scmi: Populate fast channel rate_limitPierre Gondois1-0/+4
Arm SCMI spec. v3.2, s4.5.3.12 PERFORMANCE_DESCRIBE_FASTCHANNEL defines a per-domain rate_limit for performance requests: """ Rate Limit in microseconds, indicating the minimum time required between successive requests. A value of 0 indicates that this field is not applicable or supported on the platform. """" The field is first defined in SCMI v2.0. Add support to fetch this value and advertise it through a fast_switch_rate_limit() callback. Signed-off-by: Pierre Gondois <[email protected]> Reviewed-by: Cristian Marussi <[email protected]> Acked-by: Sudeep Holla <[email protected]> Signed-off-by: Viresh Kumar <[email protected]>
2024-03-06firmware: arm_scmi: Populate perf commands rate_limitPierre Gondois1-0/+4
Arm SCMI spec. v3.2, s4.5.3.4 PERFORMANCE_DOMAIN_ATTRIBUTES defines a per-domain rate_limit for performance requests: """ Rate Limit in microseconds, indicating the minimum time required between successive requests. A value of 0 indicates that this field is not supported by the platform. This field does not apply to FastChannels. """" The field is first defined in SCMI v1.0. Add support to fetch this value and advertise it through a rate_limit_get() callback. Signed-off-by: Pierre Gondois <[email protected]> Reviewed-by: Cristian Marussi <[email protected]> Acked-by: Sudeep Holla <[email protected]> Signed-off-by: Viresh Kumar <[email protected]>
2024-03-05net: phy: Add phy_support_eee() indicating MAC support EEEAndrew Lunn1-1/+2
In order for EEE to operate, both the MAC and the PHY need to support it, similar to how pause works. With some exception - a number of PHYs have SmartEEE or AutoGrEEEn support in order to provide some EEE-like power savings with non-EEE capable MACs. Copy the pause concept and add the call phy_support_eee() which the MAC makes after connecting the PHY to indicate it supports EEE. phylib will then advertise EEE when auto-neg is performed. Signed-off-by: Andrew Lunn <[email protected]> Signed-off-by: Oleksij Rempel <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-03-05net: phy: Keep track of EEE configurationAndrew Lunn1-0/+3
Have phylib keep track of the EEE configuration. This simplifies the MAC drivers, in that they don't need to store it. Future patches to phylib will also make use of this information to further simplify the MAC drivers. Reviewed-by: Russell King (Oracle) <[email protected]> Signed-off-by: Andrew Lunn <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: Oleksij Rempel <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-03-05net: phy: Add phydev->enable_tx_lpi to simplify adjust link callbacksAndrew Lunn1-0/+2
MAC drivers which support EEE need to know the results of the EEE auto-neg in order to program the hardware to perform EEE or not. The oddly named phy_init_eee() can be used to determine this, it returns 0 if EEE should be used, or a negative error code, e.g. -EOPPROTONOTSUPPORT if the PHY does not support EEE or negotiate resulted in it not being used. However, many MAC drivers get this wrong. Add phydev->enable_tx_lpi which indicates the result of the autoneg for EEE, including if EEE is administratively disabled with ethtool. The MAC driver can then access this in the same way as link speed and duplex in the adjust link callback. If enable_tx_lpi is true, the MAC should send low power indications and does not need to consider anything else with respect to EEE. Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: Andrew Lunn <[email protected]> Signed-off-by: Oleksij Rempel <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-03-05dpll: move all dpll<>netdev helpers to dpll codeJakub Kicinski2-17/+13
Older versions of GCC really want to know the full definition of the type involved in rcu_assign_pointer(). struct dpll_pin is defined in a local header, net/core can't reach it. Move all the netdev <> dpll code into dpll, where the type is known. Otherwise we'd need multiple function calls to jump between the compilation units. This is the same problem the commit under fixes was trying to address, but with rcu_assign_pointer() not rcu_dereference(). Some of the exports are not needed, networking core can't be a module, we only need exports for the helpers used by drivers. Reported-by: Geert Uytterhoeven <[email protected]> Link: https://lore.kernel.org/all/[email protected]/ Fixes: 640f41ed33b5 ("dpll: fix build failure due to rcu_dereference_check() on unknown type") Reviewed-by: Jiri Pirko <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-03-06power: supply: core: fix charge_behaviour formattingThomas Weißschuh1-0/+1
This property is documented to have a special format which exposes all available behaviours and the currently active one at the same time. For this special format some helpers are provided. When the charge_behaviour property was added in 1b0b6cc8030d ("power: supply: add charge_behaviour attributes"), it did not update the default logic in in power_supply_sysfs.c to use the format helpers. Thus by default only the currently active behaviour is printed. This fixes the default logic to follow the documented format. There is currently only one in-tree drivers exposing charge behaviours - thinkpad_acpi, which is not affected by the change, as it directly uses the helpers and does not use the power_supply_sysfs.c logic. Signed-off-by: Thomas Weißschuh <[email protected]> Link: https://lore.kernel.org/r/20240303-power_supply-charge_behaviour_prop-v2-3-8ebb0a7c2409@weissschuh.net Signed-off-by: Sebastian Reichel <[email protected]>
2024-03-06power: supply: core: add power_supply_for_each_device()Sebastian Reichel1-2/+1
Introduce power_supply_for_each_device(), which is a wrapper for class_for_each_device() using the power_supply_class and going through all devices. This allows making the power_supply_class itself a local variable, so that drivers cannot mess with it and simplifies the code slightly. Reviewed-by: Ricardo B. Marliere <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sebastian Reichel <[email protected]>
2024-03-05Merge tag 'hyperv-fixes-signed-20240303' of ↵Linus Torvalds1-1/+21
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv fixes from Wei Liu: - Multiple fixes, cleanups and documentations for Hyper-V core code and drivers * tag 'hyperv-fixes-signed-20240303' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: Drivers: hv: vmbus: make hv_bus const x86/hyperv: Allow 15-bit APIC IDs for VTL platforms x86/hyperv: Make encrypted/decrypted changes safe for load_unaligned_zeropad() x86/mm: Regularize set_memory_p() parameters and make non-static x86/hyperv: Use slow_virt_to_phys() in page transition hypervisor callback Documentation: hyperv: Add overview of PCI pass-thru device support Drivers: hv: vmbus: Update indentation in create_gpadl_header() Drivers: hv: vmbus: Remove duplication and cleanup code in create_gpadl_header() fbdev/hyperv_fb: Fix logic error for Gen2 VMs in hvfb_getmem() Drivers: hv: vmbus: Calculate ring buffer size for more efficient use of memory hv_utils: Allow implicit ICTIMESYNCFLAG_SYNC
2024-03-05Merge tag 'v6.8-rc7' into gpio/for-nextBartosz Golaszewski27-63/+125
Linux 6.8-rc7
2024-03-05greybus: Avoid fake flexible array for response dataKees Cook1-6/+2
FORTIFY_SOURCE has been ignoring 0-sized destinations while the kernel code base has been converted to flexible arrays. In order to enforce the 0-sized destinations (e.g. with __counted_by), the remaining 0-sized destinations need to be handled. Instead of converting an empty struct into using a flexible array, just directly use a pointer without any additional indirection. Remove struct gb_bootrom_get_firmware_response and struct gb_fw_download_fetch_firmware_response. Signed-off-by: Kees Cook <[email protected]> Reviewed-by: Alex Elder <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2024-03-05serial: port: Introduce a common helper to read propertiesAndy Shevchenko1-0/+2
Several serial drivers want to read the same or similar set of the port properties. Make a common helper for them. Signed-off-by: Andy Shevchenko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2024-03-05serial: core: Add UPIO_UNKNOWN constant for unknown port typeAndy Shevchenko1-0/+1
In some APIs we would like to assign the special value to iotype and compare against it in another places. Introduce UPIO_UNKNOWN for this purpose. Note, we can't use 0, because it's a valid value for IO port access. Signed-off-by: Andy Shevchenko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2024-03-05serial: core: Move struct uart_port::quirks closer to possible valuesAndy Shevchenko1-2/+4
Currently it's not crystal clear what UPIO_* and UPQ_* definitions belong to. Reindent the code, so it will be easy to read and understand. No functional changes intended. Reviewed-by: Andi Shyti <[email protected]> Signed-off-by: Andy Shevchenko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2024-03-05serial: core: only stop transmit when HW fifo is emptyJonas Gorski1-1/+2
If the circular buffer is empty, it just means we fit all characters to send into the HW fifo, but not that the hardware finished transmitting them. So if we immediately call stop_tx() after that, this may abort any pending characters in the HW fifo, and cause dropped characters on the console. Fix this by only stopping tx when the tx HW fifo is actually empty. Fixes: 8275b48b2780 ("tty: serial: introduce transmit helpers") Cc: [email protected] Signed-off-by: Jonas Gorski <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2024-03-05usb: typec: tcpm: add support to set tcpc connector orientatitionMarco Felsch1-0/+2
This adds the support to set the connector orientation value accordingly. This is part of the optional CONFIG_STANDARD_OUTPUT register 0x18, specified within the USB port controller spsicification rev. 2.0 [1]. [1] https://www.usb.org/sites/default/files/documents/usb-port_controller_specification_rev2.0_v1.0_0.pdf Signed-off-by: Marco Felsch <[email protected]> Reviewed-by: Heikki Krogerus <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2024-03-05usb: core: Set connect_type of ports based on DT nodeStephen Boyd1-0/+7
When a USB hub is described in DT, such as any device that matches the onboard-hub driver, the connect_type is set to "unknown" or USB_PORT_CONNECT_TYPE_UNKNOWN. This makes any device plugged into that USB port report their 'removable' device attribute as "unknown". ChromeOS userspace would like to know if the USB device is actually removable or not so that security policies can be applied. Improve the connect_type attribute for ports, and in turn the removable attribute for USB devices, by looking for child devices with a reg property or an OF graph when the device is described in DT. If the graph exists, endpoints that are connected to a remote node must be something like a usb-{a,b,c}-connector compatible node, or an intermediate node like a redriver, and not a hardwired USB device on the board. Set the connect_type to USB_PORT_CONNECT_TYPE_HOT_PLUG in this case because the device is going to be plugged in. Set the connect_type to USB_PORT_CONNECT_TYPE_HARD_WIRED if there's a child node for the port like 'device@2' for port2. Set the connect_type to USB_PORT_NOT_USED if there isn't an endpoint or child node corresponding to the port number. To make sure things don't change, only set the port to not used if there are child nodes. This way an onboard hub connect_type doesn't change until ports are added or child nodes are added to describe hardwired devices. It's assumed that all ports or no ports will be described for a device. Cc: Matthias Kaehlcke <[email protected]> Cc: [email protected] Cc: [email protected] Cc: Pin-yen Lin <[email protected]> Cc: maciek swiech <[email protected]> Signed-off-by: Stephen Boyd <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2024-03-05usb: typec: pd: no opencoding of FIELD_GETOliver Neukum2-7/+9
If we have a neat macro, at least new code should use it. Signed-off-by: Oliver Neukum <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2024-03-05net: Re-use and set mono_delivery_time bit for userspace tstamp packetsAbhishek Chauhan1-3/+3
Bridge driver today has no support to forward the userspace timestamp packets and ends up resetting the timestamp. ETF qdisc checks the packet coming from userspace and encounters to be 0 thereby dropping time sensitive packets. These changes will allow userspace timestamps packets to be forwarded from the bridge to NIC drivers. Setting the same bit (mono_delivery_time) to avoid dropping of userspace tstamp packets in the forwarding path. Existing functionality of mono_delivery_time remains unaltered here, instead just extended with userspace tstamp support for bridge forwarding path. Signed-off-by: Abhishek Chauhan <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2024-03-05mmc: core: Use a struct device* as in-param to mmc_of_parse_clk_phase()Yang Xiwen1-1/+1
Parsing dt usually happens very early, sometimes even before the struct mmc_host has been allocated (e.g. dw_mci_probe() and dw_mci_parse_dt() in dw_mmc.c). Looking at the source of mmc_of_parse_clk_phase(), it's actually not needed to have an initialized mmc_host, let's therefore pass a struct device* to it instead. Also update the only current user, sdhci-of-aspeed. Reviewed-by: Paul Menzel <[email protected]> Acked-by: Andrew Jeffery <[email protected]> Signed-off-by: Yang Xiwen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Ulf Hansson <[email protected]>
2024-03-05net: introduce page_frag_cache_drain()Yunsheng Lin1-0/+1
When draining a page_frag_cache, most user are doing the similar steps, so introduce an API to avoid code duplication. Signed-off-by: Yunsheng Lin <[email protected]> Acked-by: Jason Wang <[email protected]> Reviewed-by: Alexander Duyck <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2024-03-05mm/page_alloc: modify page_frag_alloc_align() to accept align as an argumentYunsheng Lin1-4/+11
napi_alloc_frag_align() and netdev_alloc_frag_align() accept align as an argument, and they are thin wrappers around the __napi_alloc_frag_align() and __netdev_alloc_frag_align() APIs doing the alignment checking and align mask conversion, in order to call page_frag_alloc_align() directly. The intention here is to keep the alignment checking and the alignmask conversion in in-line wrapper to avoid those kind of operations during execution time since it can usually be handled during compile time. We are going to use page_frag_alloc_align() in vhost_net.c, it need the same kind of alignment checking and alignmask conversion, so split up page_frag_alloc_align into an inline wrapper doing the above operation, and add __page_frag_alloc_align() which is passed with the align mask the original function expected as suggested by Alexander. Signed-off-by: Yunsheng Lin <[email protected]> CC: Alexander Duyck <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2024-03-04tcp: align tcp_sock_write_rx groupEric Dumazet1-1/+1
Stephen Rothwell and kernel test robot reported that some arches (parisc, hexagon) and/or compilers would not like blamed commit. Lets make sure tcp_sock_write_rx group does not start with a hole. While we are at it, correct tcp_sock_write_tx CACHELINE_ASSERT_GROUP_SIZE() since after the blamed commit, we went to 105 bytes. Fixes: 99123622050f ("tcp: remove some holes in struct tcp_sock") Reported-by: Stephen Rothwell <[email protected]> Reported-by: kernel test robot <[email protected]> Link: https://lore.kernel.org/netdev/[email protected]/ Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/ Signed-off-by: Eric Dumazet <[email protected]> Reviewed-by: Simon Horman <[email protected]> Tested-by: Simon Horman <[email protected]> # build-tested Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-03-04modules: wait do_free_init correctlyChangbin Du1-0/+8
The synchronization here is to ensure the ordering of freeing of a module init so that it happens before W+X checking. It is worth noting it is not that the freeing was not happening, it is just that our sanity checkers raced against the permission checkers which assume init memory is already gone. Commit 1a7b7d922081 ("modules: Use vmalloc special flag") moved calling do_free_init() into a global workqueue instead of relying on it being called through call_rcu(..., do_free_init), which used to allowed us call do_free_init() asynchronously after the end of a subsequent grace period. The move to a global workqueue broke the gaurantees for code which needed to be sure the do_free_init() would complete with rcu_barrier(). To fix this callers which used to rely on rcu_barrier() must now instead use flush_work(&init_free_wq). Without this fix, we still could encounter false positive reports in W+X checking since the rcu_barrier() here can not ensure the ordering now. Even worse, the rcu_barrier() can introduce significant delay. Eric Chanudet reported that the rcu_barrier introduces ~0.1s delay on a PREEMPT_RT kernel. [ 0.291444] Freeing unused kernel memory: 5568K [ 0.402442] Run /sbin/init as init process With this fix, the above delay can be eliminated. Link: https://lkml.kernel.org/r/[email protected] Fixes: 1a7b7d922081 ("modules: Use vmalloc special flag") Signed-off-by: Changbin Du <[email protected]> Tested-by: Eric Chanudet <[email protected]> Acked-by: Luis Chamberlain <[email protected]> Cc: Xiaoyi Su <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-03-04mm: convert free_swap_cache() to take a folioMatthew Wilcox (Oracle)1-4/+4
All but one caller already has a folio, so convert free_page_and_swap_cache() to have a folio and remove the call to page_folio(). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Ryan Roberts <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Cc: Mel Gorman <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-03-04mm: remove lru_to_page()Matthew Wilcox (Oracle)1-1/+0
The last user was removed over a year ago; remove the definition. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Ryan Roberts <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-03-04memcg: remove mem_cgroup_uncharge_list()Matthew Wilcox (Oracle)1-12/+0
All users have been converted to mem_cgroup_uncharge_folios() so we can remove this API. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Ryan Roberts <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Mel Gorman <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-03-04mm: use __page_cache_release() in folios_put()Matthew Wilcox (Oracle)1-8/+8
Pass a pointer to the lruvec so we can take advantage of the folio_lruvec_relock_irqsave(). Adjust the calling convention of folio_lruvec_relock_irqsave() to suit and add a page_cache_release() wrapper. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Ryan Roberts <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-03-04memcg: add mem_cgroup_uncharge_folios()Matthew Wilcox (Oracle)1-2/+12
Almost identical to mem_cgroup_uncharge_list(), except it takes a folio_batch instead of a list_head. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Ryan Roberts <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Mel Gorman <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-03-04mm: make folios_put() the basis of release_pages()Matthew Wilcox (Oracle)1-6/+10
Patch series "Rearrange batched folio freeing", v3. Other than the obvious "remove calls to compound_head" changes, the fundamental belief here is that iterating a linked list is much slower than iterating an array (5-15x slower in my testing). There's also an associated belief that since we iterate the batch of folios three times, we do better when the array is small (ie 15 entries) than we do with a batch that is hundreds of entries long, which only gives us the opportunity for the first pages to fall out of cache by the time we get to the end. It is possible we should increase the size of folio_batch. Hopefully the bots let us know if this introduces any performance regressions. This patch (of 3): By making release_pages() call folios_put(), we can get rid of the calls to compound_head() for the callers that already know they have folios. We can also get rid of the lock_batch tracking as we know the size of the batch is limited by folio_batch. This does reduce the maximum number of pages for which the lruvec lock is held, from SWAP_CLUSTER_MAX (32) to PAGEVEC_SIZE (15). I do not expect this to make a significant difference, but if it does, we can increase PAGEVEC_SIZE to 31. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Ryan Roberts <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-03-04mm: remove total_mapcount()David Hildenbrand1-8/+1
All users of total_mapcount() are gone, let's remove it. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Reviewed-by: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Andrew Morton <[email protected]>