blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2023-10-16	scsi: ufs: core: Add OPP support for scaling clocks and regulators	Manivannan Sadhasivam	1	-0/+4
	UFS core is only scaling the clocks during devfreq scaling and initialization. But for an optimum power saving, regulators should also be scaled along with the clocks. So let's use the OPP framework which supports scaling clocks, regulators, and performance state using OPP table defined in devicetree. For accomodating the OPP support, the existing APIs (ufshcd_scale_clks, ufshcd_is_devfreq_scaling_required and ufshcd_devfreq_scale) are modified to accept "freq" as an argument which in turn used by the OPP helpers. The OPP support is added along with the old freq-table based clock scaling so that the existing platforms work as expected. Co-developed-by: Krzysztof Kozlowski <[email protected]> Signed-off-by: Krzysztof Kozlowski <[email protected]> Signed-off-by: Manivannan Sadhasivam <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-10-16	net: stub tcp_gro_complete if CONFIG_INET=n	Jacob Keller	1	-0/+4
	A few networking drivers including bnx2x, bnxt, qede, and idpf call tcp_gro_complete as part of offloading TCP GRO. The function is only defined if CONFIG_INET is true, since its TCP specific and is meaningless if the kernel lacks IP networking support. The combination of trying to use the complex network drivers with CONFIG_NET but not CONFIG_INET is rather unlikely in practice: most use cases are going to need IP networking. The tcp_gro_complete function just sets some data in the socket buffer for use in processing the TCP packet in the event that the GRO was offloaded to the device. If the kernel lacks TCP support, such setup will simply go unused. The bnx2x, bnxt, and qede drivers wrap their TCP offload support in CONFIG_INET checks and skip handling on such kernels. The idpf driver did not check CONFIG_INET and thus fails to link if the kernel is configured with CONFIG_NET=y, CONFIG_IDPF=(m\|y), and CONFIG_INET=n. While checking CONFIG_INET does allow the driver to bypass significantly more instructions in the event that we know TCP networking isn't supported, the configuration is unlikely to be used widely. Rather than require driver authors to care about this, stub the tcp_gro_complete function when CONFIG_INET=n. This allows drivers to be left as-is. It does mean the idpf driver will perform slightly more work than strictly necessary when CONFIG_INET=n, since it will still execute some of the skb setup in idpf_rx_rsc. However, that work would be performed in the case where CONFIG_INET=y anyways. I did not change the existing drivers, since they appear to wrap a significant portion of code when CONFIG_INET=n. There is little benefit in trashing these drivers just to unwrap and remove the CONFIG_INET check. Using a stub for tcp_gro_complete is still beneficial, as it means future drivers no longer need to worry about this case of CONFIG_NET=y and CONFIG_INET=n, which should reduce noise from buildbots that check such a configuration. Signed-off-by: Jacob Keller <[email protected]> Acked-by: Randy Dunlap <[email protected]> Tested-by: Randy Dunlap <[email protected]> # build-tested Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-10-16	dax, kmem: calculate abstract distance with general interface	Huang Ying	1	-0/+2
	Previously, a fixed abstract distance MEMTIER_DEFAULT_DAX_ADISTANCE is used for slow memory type in kmem driver. This limits the usage of kmem driver, for example, it cannot be used for HBM (high bandwidth memory). So, we use the general abstract distance calculation mechanism in kmem drivers to get more accurate abstract distance on systems with proper support. The original MEMTIER_DEFAULT_DAX_ADISTANCE is used as fallback only. Now, multiple memory types may be managed by kmem. These memory types are put into the "kmem_memory_types" list and protected by kmem_memory_type_lock. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: "Huang, Ying" <[email protected]> Tested-by: Bharata B Rao <[email protected]> Reviewed-by: Dave Jiang <[email protected]> Reviewed-by: Alistair Popple <[email protected]> Cc: Aneesh Kumar K.V <[email protected]> Cc: Wei Xu <[email protected]> Cc: Dan Williams <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Yang Shi <[email protected]> Cc: Rafael J Wysocki <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-10-16	acpi, hmat: calculate abstract distance with HMAT	Huang Ying	1	-0/+18
	A memory tiering abstract distance calculation algorithm based on ACPI HMAT is implemented. The basic idea is as follows. The performance attributes of system default DRAM nodes are recorded as the base line. Whose abstract distance is MEMTIER_ADISTANCE_DRAM. Then, the ratio of the abstract distance of a memory node (target) to MEMTIER_ADISTANCE_DRAM is scaled based on the ratio of the performance attributes of the node to that of the default DRAM nodes. The functions to record the read/write latency/bandwidth of the default DRAM nodes and calculate abstract distance according to read/write latency/bandwidth ratio will be used by CXL CDAT (Coherent Device Attribute Table) and other memory device drivers. So, they are put in memory-tiers.c. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: "Huang, Ying" <[email protected]> Tested-by: Bharata B Rao <[email protected]> Reviewed-by: Dave Jiang <[email protected]> Reviewed-by: Alistair Popple <[email protected]> Cc: Aneesh Kumar K.V <[email protected]> Cc: Wei Xu <[email protected]> Cc: Dan Williams <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Yang Shi <[email protected]> Cc: Rafael J Wysocki <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-10-16	memory tiering: add abstract distance calculation algorithms management	Huang Ying	1	-0/+19
	Patch series "memory tiering: calculate abstract distance based on ACPI HMAT", v4. We have the explicit memory tiers framework to manage systems with multiple types of memory, e.g., DRAM in DIMM slots and CXL memory devices. Where, same kind of memory devices will be grouped into memory types, then put into memory tiers. To describe the performance of a memory type, abstract distance is defined. Which is in direct proportion to the memory latency and inversely proportional to the memory bandwidth. To keep the code as simple as possible, fixed abstract distance is used in dax/kmem to describe slow memory such as Optane DCPMM. To support more memory types, in this series, we added the abstract distance calculation algorithm management mechanism, provided a algorithm implementation based on ACPI HMAT, and used the general abstract distance calculation interface in dax/kmem driver. So, dax/kmem can support HBM (high bandwidth memory) in addition to the original Optane DCPMM. This patch (of 4): The abstract distance may be calculated by various drivers, such as ACPI HMAT, CXL CDAT, etc. While it may be used by various code which hot-add memory node, such as dax/kmem etc. To decouple the algorithm users and the providers, the abstract distance calculation algorithms management mechanism is implemented in this patch. It provides interface for the providers to register the implementation, and interface for the users. Multiple algorithm implementations can cooperate via calculating abstract distance for different memory nodes. The preference of algorithm implementations can be specified via priority (notifier_block.priority). Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: "Huang, Ying" <[email protected]> Tested-by: Bharata B Rao <[email protected]> Reviewed-by: Alistair Popple <[email protected]> Reviewed-by: Dave Jiang <[email protected]> Cc: Aneesh Kumar K.V <[email protected]> Cc: Wei Xu <[email protected]> Cc: Dan Williams <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Yang Shi <[email protected]> Cc: Rafael J Wysocki <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-10-16	mm/filemap: remove hugetlb special casing in filemap.c	Sidhartha Kumar	2	-30/+14
	Remove special cased hugetlb handling code within the page cache by changing the granularity of ->index to the base page size rather than the huge page size. The motivation of this patch is to reduce complexity within the filemap code while also increasing performance by removing branches that are evaluated on every page cache lookup. To support the change in index, new wrappers for hugetlb page cache interactions are added. These wrappers perform the conversion to a linear index which is now expected by the page cache for huge pages. ========================= PERFORMANCE ====================================== Perf was used to check the performance differences after the patch. Overall the performance is similar to mainline with a very small larger overhead that occurs in __filemap_add_folio() and hugetlb_add_to_page_cache(). This is because of the larger overhead that occurs in xa_load() and xa_store() as the xarray is now using more entries to store hugetlb folios in the page cache. Timing aarch64 2MB Page Size 6.5-rc3 + this patch: [root@sidhakum-ol9-1 hugepages]# time fallocate -l 700GB test.txt real 1m49.568s user 0m0.000s sys 1m49.461s 6.5-rc3: [root]# time fallocate -l 700GB test.txt real 1m47.495s user 0m0.000s sys 1m47.370s 1GB Page Size 6.5-rc3 + this patch: [root@sidhakum-ol9-1 hugepages1G]# time fallocate -l 700GB test.txt real 1m47.024s user 0m0.000s sys 1m46.921s 6.5-rc3: [root@sidhakum-ol9-1 hugepages1G]# time fallocate -l 700GB test.txt real 1m44.551s user 0m0.000s sys 1m44.438s x86 2MB Page Size 6.5-rc3 + this patch: [root@sidhakum-ol9-2 hugepages]# time fallocate -l 100GB test.txt real 0m22.383s user 0m0.000s sys 0m22.255s 6.5-rc3: [opc@sidhakum-ol9-2 hugepages]$ time sudo fallocate -l 100GB /dev/hugepages/test.txt real 0m22.735s user 0m0.038s sys 0m22.567s 1GB Page Size 6.5-rc3 + this patch: [root@sidhakum-ol9-2 hugepages1GB]# time fallocate -l 100GB test.txt real 0m25.786s user 0m0.001s sys 0m25.589s 6.5-rc3: [root@sidhakum-ol9-2 hugepages1G]# time fallocate -l 100GB test.txt real 0m33.454s user 0m0.001s sys 0m33.193s aarch64: workload - fallocate a 700GB file backed by huge pages 6.5-rc3 + this patch: 2MB Page Size: --100.00%--__arm64_sys_fallocate ksys_fallocate vfs_fallocate hugetlbfs_fallocate \| \|--95.04%--__pi_clear_page \| \|--3.57%--clear_huge_page \| \| \| \|--2.63%--rcu_all_qs \| \| \| --0.91%--__cond_resched \| --0.67%--__cond_resched 0.17% 0.00% 0 fallocate [kernel.vmlinux] [k] hugetlb_add_to_page_cache 0.14% 0.10% 11 fallocate [kernel.vmlinux] [k] __filemap_add_folio 6.5-rc3 2MB Page Size: --100.00%--__arm64_sys_fallocate ksys_fallocate vfs_fallocate hugetlbfs_fallocate \| \|--94.91%--__pi_clear_page \| \|--4.11%--clear_huge_page \| \| \| \|--3.00%--rcu_all_qs \| \| \| --1.10%--__cond_resched \| --0.59%--__cond_resched 0.08% 0.01% 1 fallocate [kernel.kallsyms] [k] hugetlb_add_to_page_cache 0.05% 0.03% 3 fallocate [kernel.kallsyms] [k] __filemap_add_folio x86 workload - fallocate a 100GB file backed by huge pages 6.5-rc3 + this patch: 2MB Page Size: hugetlbfs_fallocate \| --99.57%--clear_huge_page \| --98.47%--clear_page_erms \| --0.53%--asm_sysvec_apic_timer_interrupt 0.04% 0.04% 1 fallocate [kernel.kallsyms] [k] xa_load 0.04% 0.00% 0 fallocate [kernel.kallsyms] [k] hugetlb_add_to_page_cache 0.04% 0.00% 0 fallocate [kernel.kallsyms] [k] __filemap_add_folio 0.04% 0.00% 0 fallocate [kernel.kallsyms] [k] xas_store 6.5-rc3 2MB Page Size: --99.93%--__x64_sys_fallocate vfs_fallocate hugetlbfs_fallocate \| --99.38%--clear_huge_page \| \|--98.40%--clear_page_erms \| --0.59%--__cond_resched 0.03% 0.03% 1 fallocate [kernel.kallsyms] [k] __filemap_add_folio ========================= TESTING ====================================== This patch passes libhugetlbfs tests and LTP hugetlb tests ********** TEST SUMMARY * 2M * 32-bit 64-bit * Total testcases: 110 113 * Skipped: 0 0 * PASS: 107 113 * FAIL: 0 0 * Killed by signal: 3 0 * Bad configuration: 0 0 * Expected FAIL: 0 0 * Unexpected PASS: 0 0 * Test not present: 0 0 * Strange test result: 0 0 ********** Done executing testcases. LTP Version: 20220527-178-g2761a81c4 page migration was also tested using Mike Kravetz's test program.[8] [[email protected]: fix an NULL vs IS_ERR() bug] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sidhartha Kumar <[email protected]> Signed-off-by: Dan Carpenter <[email protected]> Reported-and-tested-by: [email protected] Closes: https://syzkaller.appspot.com/bug?extid=c225dea486da4d5592bd Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Muchun Song <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-10-16	mm/ksm: support fork/exec for prctl	Stefan Roesch	1	-5/+8
	Patch series "mm/ksm: add fork-exec support for prctl", v4. A process can enable KSM with the prctl system call. When the process is forked the KSM flag is inherited by the child process. However if the process is executing an exec system call directly after the fork, the KSM setting is cleared. This patch series addresses this problem. 1) Change the mask in coredump.h for execing a new process 2) Add a new test case in ksm_functional_tests This patch (of 2): Today we have two ways to enable KSM: 1) madvise system call This allows to enable KSM for a memory region for a long time. 2) prctl system call This is a recent addition to enable KSM for the complete process. In addition when a process is forked, the KSM setting is inherited. This change only affects the second case. One of the use cases for (2) was to support the ability to enable KSM for cgroups. This allows systemd to enable KSM for the seed process. By enabling it in the seed process all child processes inherit the setting. This works correctly when the process is forked. However it doesn't support fork/exec workflow. From the previous cover letter: .... Use case 3: With the madvise call sharing opportunities are only enabled for the current process: it is a workload-local decision. A considerable number of sharing opportunities may exist across multiple workloads or jobs (if they are part of the same security domain). Only a higler level entity like a job scheduler or container can know for certain if its running one or more instances of a job. That job scheduler however doesn't have the necessary internal workload knowledge to make targeted madvise calls. .... In addition it can also be a bit surprising that fork keeps the KSM setting and fork/exec does not. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Stefan Roesch <[email protected]> Fixes: d7597f59d1d3 ("mm: add new api to enable ksm per process") Reviewed-by: David Hildenbrand <[email protected]> Reported-by: Carl Klemm <[email protected]> Tested-by: Carl Klemm <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Rik van Riel <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-10-16	sched/numa, mm: make numa migrate functions to take a folio	Kefeng Wang	1	-3/+3
	The cpupid (or access time) is stored in the head page for THP, so it is safely to make should_numa_migrate_memory() and numa_hint_fault_latency() to take a folio. This is in preparation for large folio numa balancing. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Kefeng Wang <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: "Huang, Ying" <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Zi Yan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-10-16	mm: mempolicy: make mpol_misplaced() to take a folio	Kefeng Wang	1	-2/+3
	In preparation for large folio numa balancing, make mpol_misplaced() to take a folio, no functional change intended. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Kefeng Wang <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: "Huang, Ying" <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Zi Yan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-10-16	mm: memory: add vm_normal_folio_pmd()	Kefeng Wang	1	-0/+2
	Patch series "mm: convert numa balancing functions to use a folio", v2. do_numa_pages() only handles non-compound pages, and only PMD-mapped THPs are handled in do_huge_pmd_numa_page(). But a large, PTE-mapped folio will be supported so let's convert more numa balancing functions to use/take a folio in preparation for that, no functional change intended for now. This patch (of 6): The new vm_normal_folio_pmd() wrapper is similar to vm_normal_folio(), which allow them to completely replace the struct page variables with struct folio variables. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Kefeng Wang <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: "Huang, Ying" <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Zi Yan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-10-16	tcp: Set pingpong threshold via sysctl	Haiyang Zhang	2	-4/+14
	TCP pingpong threshold is 1 by default. But some applications, like SQL DB may prefer a higher pingpong threshold to activate delayed acks in quick ack mode for better performance. The pingpong threshold and related code were changed to 3 in the year 2019 in: commit 4a41f453bedf ("tcp: change pingpong threshold to 3") And reverted to 1 in the year 2022 in: commit 4d8f24eeedc5 ("Revert "tcp: change pingpong threshold to 3"") There is no single value that fits all applications. Add net.ipv4.tcp_pingpong_thresh sysctl tunable, so it can be tuned for optimal performance based on the application needs. Signed-off-by: Haiyang Zhang <[email protected]> Reviewed-by: Simon Horman <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Acked-by: Neal Cardwell <[email protected]> Reviewed-by: Kuniyuki Iwashima <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-10-16	fbdev: uvesafb: Remove uvesafb_exec() prototype from include/video/uvesafb.h	Jorge Maidana	1	-2/+0
	uvesafb_exec() is a static function defined and called only in drivers/video/fbdev/uvesafb.c, remove the prototype from include/video/uvesafb.h. Fixes the warning: ./include/video/uvesafb.h:112:12: warning: 'uvesafb_exec' declared 'static' but never defined [-Wunused-function] when including '<video/uvesafb.h>' in an external program. Signed-off-by: Jorge Maidana <[email protected]> Signed-off-by: Helge Deller <[email protected]>
2023-10-16	Merge tag 'amlogic-drivers-for-v6.7' of ↵	Arnd Bergmann	1	-1/+1
	https://git.kernel.org/pub/scm/linux/kernel/git/amlogic/linux into soc/drivers Amlogic drivers changes for v6.7: - correct meson_sm_* API retval handling - Use device_get_match_data() in meson SM * tag 'amlogic-drivers-for-v6.7' of https://git.kernel.org/pub/scm/linux/kernel/git/amlogic/linux: firmware: meson: Use device_get_match_data() drivers: meson: sm: correct meson_sm_* API retval handling Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnd Bergmann <[email protected]>
2023-10-16	Merge tag 'tegra-for-6.7-firmware' of ↵	Arnd Bergmann	2	-1/+7
	git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux into soc/drivers firmware: tegra: Changes for v6.7-rc1 Contains a typofix and a new mechanism to help fix an issue that can seemingly hang the system during early resume. * tag 'tegra-for-6.7-firmware' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux: firmware: tegra: Add suspend hook and reset BPMP IPC early on resume firmware: tegra: Fix a typo Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnd Bergmann <[email protected]>
2023-10-16	Merge tag 'ffa-updates-6.7' of ↵	Arnd Bergmann	1	-12/+67
	git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into soc/drivers Arm FF-A updates for v6.7 The main addition is the initial support for the notifications and memory transaction descriptor changes added in FF-A v1.1 specification. The notification mechanism enables a requester/sender endpoint to notify a service provider/receiver endpoint about an event with non-blocking semantics. A notification is akin to the doorbell between two endpoints in a communication protocol that is based upon the doorbell/mailbox mechanism. The framework is responsible for the delivery of the notification from the ender to the receiver without blocking the sender. The receiver endpoint relies on the OS scheduler for allocation of CPU cycles to handle a notification. OS is referred as the receiver’s scheduler in the context of notifications. The framework is responsible for informing the receiver’s scheduler that the receiver must be run since it has a pending notification. The series also includes support for the new format of memory transaction descriptors introduced in v1.1 specification. Apart from the main additions, it includes minor fixes to re-enable FF-A drivers usage of 32bit mode of messaging and kernel warning due to the missing assignment of IDR allocation ID to the FFA device. It also adds emitting 'modalias' to the base attribute of FF-A devices. * tag 'ffa-updates-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux: firmware: arm_ffa: Upgrade the driver version to v1.1 firmware: arm_ffa: Update memory descriptor to support v1.1 format firmware: arm_ffa: Switch to using ffa_mem_desc_offset() accessor KVM: arm64: FFA: Remove access of endpoint memory access descriptor array firmware: arm_ffa: Simplify the computation of transmit and fragment length firmware: arm_ffa: Add notification handling mechanism firmware: arm_ffa: Add interface to send a notification to a given partition firmware: arm_ffa: Add interfaces to request notification callbacks firmware: arm_ffa: Add schedule receiver callback mechanism firmware: arm_ffa: Initial support for scheduler receiver interrupt firmware: arm_ffa: Implement the NOTIFICATION_INFO_GET interface firmware: arm_ffa: Implement the FFA_NOTIFICATION_GET interface firmware: arm_ffa: Implement the FFA_NOTIFICATION_SET interface firmware: arm_ffa: Implement the FFA_RUN interface firmware: arm_ffa: Implement the notification bind and unbind interface firmware: arm_ffa: Implement notification bitmap create and destroy interfaces firmware: arm_ffa: Update the FF-A command list with v1.1 additions firmware: arm_ffa: Emit modalias for FF-A devices firmware: arm_ffa: Allow the FF-A drivers to use 32bit mode of messaging firmware: arm_ffa: Assign the missing IDR allocation ID to the FFA device Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnd Bergmann <[email protected]>
2023-10-16	Merge tag 'scmi-updates-6.7' of ↵	Arnd Bergmann	4	-14/+73
	git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into soc/drivers Arm SCMI updates for v6.7 Main additions this time include: 1. SCMI v3.2 clock configuration support: This helps to retrieve the enabled state of a clock as well as allow to set OEM specific clock configurations. 2. Support for generic performance scaling(DVFS): The current SCMI DVFS support is limited to the CPUs in the kernel. This extension enables it to used for all kind of devices and not only for the CPUs. It updates the SCMI cpufreq to utilize the power domain bindings. It also adds a more generic SCMI performance domain based on the genpd framework that as be used for all the non-CPU devices. 3. Extend the generic performance scaling(DVFS) support for firmware driver OPPs: Consumer drivers for devices that are attached to the SCMI performance domain can't make use of the current OPP library to scale performance as the OPPs are firmware driven and often obtained from the firmware rather than the device tree. These changes extend the generic OPP and genpd PM domain frameworks to identify and utilise these firmware driven OPPs. 4. SCMI v3.2 clock parent support: This enables the support for discovering and changing parent clocks and extending the SCMI clk driver to use the same. 5. Qualcom SMC/HVC transport support: The Qualcomm virtual platforms require capability id in the hypervisor call to identify which doorbell to assert when supporting multiple SMC/HVC based SCMI transport channels. Extra parameter is added to support the same and the same is obtained at the fixed address in the shared memory which is initialised by the firmware. 6. Move the existing SCMI power domain driver under drivers/pmdomain Apart from the above main changes, it also include couple of minor fixes and cosmetic reworks. * tag 'scmi-updates-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux: (37 commits) firmware: arm_scmi: Add qcom smc/hvc transport support dt-bindings: arm: Add new compatible for smc/hvc transport for SCMI firmware: arm_scmi: Convert u32 to unsigned long to align with arm_smccc_1_1_invoke() clk: scmi: Add support for clock {set,get}_parent firmware: arm_scmi: Add support for clock parents clk: scmi: Free scmi_clk allocated when the clocks with invalid info are skipped firmware: arm_scpi: Use device_get_match_data() firmware: arm_scmi: Add generic OPP support to the SCMI performance domain firmware: arm_scmi: Specify the performance level when adding an OPP firmware: arm_scmi: Simplify error path in scmi_dvfs_device_opps_add() OPP: Extend support for the opp-level beyond required-opps OPP: Switch to use dev_pm_domain_set_performance_state() OPP: Extend dev_pm_opp_data with a level OPP: Add dev_pm_opp_add_dynamic() to allow more flexibility PM: domains: Implement the ->set_performance_state() callback for genpd PM: domains: Introduce dev_pm_domain_set_performance_state() firmware: arm_scmi: Rename scmi_{msg_,}clock_config_{get,set}_{2,21} firmware: arm_scmi: Do not use !! on boolean when setting msg->flags firmware: arm_scmi: Move power-domain driver to the pmdomain dir pmdomain: arm: Add the SCMI performance domain ... Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnd Bergmann <[email protected]>
2023-10-16	net, sched: Add tcf_set_drop_reason for {__,}tcf_classify	Daniel Borkmann	1	-0/+3
	Add an initial user for the newly added tcf_set_drop_reason() helper to set the drop reason for internal errors leading to TC_ACT_SHOT inside {__,}tcf_classify(). Right now this only adds a very basic SKB_DROP_REASON_TC_ERROR as a generic fallback indicator to mark drop locations. Where needed, such locations can be converted to more specific codes, for example, when hitting the reclassification limit, etc. Signed-off-by: Daniel Borkmann <[email protected]> Cc: Jamal Hadi Salim <[email protected]> Cc: Victor Nogueira <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-10-16	net, sched: Make tc-related drop reason more flexible	Daniel Borkmann	2	-2/+7
	Currently, the kfree_skb_reason() in sch_handle_{ingress,egress}() can only express a basic SKB_DROP_REASON_TC_INGRESS or SKB_DROP_REASON_TC_EGRESS reason. Victor kicked-off an initial proposal to make this more flexible by disambiguating verdict from return code by moving the verdict into struct tcf_result and letting tcf_classify() return a negative error. If hit, then two new drop reasons were added in the proposal, that is SKB_DROP_REASON_TC_INGRESS_ERROR as well as SKB_DROP_REASON_TC_EGRESS_ERROR. Further analysis of the actual error codes would have required to attach to tcf_classify via kprobe/kretprobe to more deeply debug skb and the returned error. In order to make the kfree_skb_reason() in sch_handle_{ingress,egress}() more extensible, it can be addressed in a more straight forward way, that is: Instead of placing the verdict into struct tcf_result, we can just put the drop reason in there, which does not require changes throughout various classful schedulers given the existing verdict logic can stay as is. Then, SKB_DROP_REASON_TC_ERROR{,_} can be added to the enum skb_drop_reason to disambiguate between an error or an intentional drop. New drop reason error codes can be added successively to the tc code base. For internal error locations which have not yet been annotated with a SKB_DROP_REASON_TC_ERROR{,_}, the fallback is SKB_DROP_REASON_TC_INGRESS and SKB_DROP_REASON_TC_EGRESS, respectively. Generic errors could be marked with a SKB_DROP_REASON_TC_ERROR code until they are converted to more specific ones if it is found that they would be useful for troubleshooting. While drop reasons have infrastructure for subsystem specific error codes which are currently used by mac80211 and ovs, Jakub mentioned that it is preferred for tc to use the enum skb_drop_reason core codes given it is a better fit and currently the tooling support is better, too. With regards to the latter: [...] I think Alastair (bpftrace) is working on auto-prettifying enums when bpftrace outputs maps. So we can do something like: $ bpftrace -e 'tracepoint:skb:kfree_skb { @[args->reason] = count(); }' Attaching 1 probe... ^C @[SKB_DROP_REASON_TC_INGRESS]: 2 @[SKB_CONSUMED]: 34 ^^^^^^^^^^^^ names!! Auto-magically. [...] Add a small helper tcf_set_drop_reason() which can be used to set the drop reason into the tcf_result. Signed-off-by: Daniel Borkmann <[email protected]> Cc: Jamal Hadi Salim <[email protected]> Cc: Victor Nogueira <[email protected]> Link: https://lore.kernel.org/netdev/[email protected] Reviewed-by: Jakub Kicinski <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-10-16	svcrdma: Fix tracepoint printk format	Chuck Lever	1	-5/+5
	Other tracepoints use "cq.id=" rather than "cq_id=". Let's make it more reliable to grep for the CQ restracker ID. Reviewed-by: Benjamin Coddington <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2023-10-16	NFSD: Copy FATTR4 bit number definitions from RFCs	Chuck Lever	1	-68/+192
	I'd like to convert nfsd4_encode_fattr() to rotate through the attrmask using for_each_bit() instead of explicitly testing the bitmask for each bit value. This means I need the bit numbers, as defined in the specs, instead of our internal bitmask constants. As a clean up, use the new spec-derived values to define the WORD# bitmask constants. Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2023-10-16	NFSD: Add nfsd4_encode_fattr4_change()	Chuck Lever	1	-1/+1
	Refactor the encoder for FATTR4_CHANGE into a helper. In a subsequent patch, this helper will be called from a bitmask loop. The code is restructured a bit to use the modern xdr_stream flow, and the encoded cinfo value is made const so that callers of the encoders can be passed a const cinfo. Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2023-10-16	nfs: fix the typo of rfc number about xattr in NFSv4	Kinglong Mee	1	-1/+1
	Signed-off-by: Kinglong Mee <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2023-10-16	NFSD: add rpc_status netlink support	Lorenzo Bianconi	1	-0/+1
	Introduce rpc_status netlink support for NFSD in order to dump pending RPC requests debugging information from userspace. Closes: https://bugzilla.linux-nfs.org/show_bug.cgi?id=366 Tested-by: Jeff Layton <[email protected]> Signed-off-by: Lorenzo Bianconi <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2023-10-16	NFSD: introduce netlink stubs	Lorenzo Bianconi	1	-0/+39
	Generate stubs and uAPI for nfsd netlink protocol. For the moment, the new protocol has one operation: rpc_status. The generated header and source files are created by running: tools/net/ynl/ynl-regen.sh Tested-by: Jeff Layton <[email protected]> Signed-off-by: Lorenzo Bianconi <[email protected]> Acked-by: Jeff Layton <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2023-10-16	SUNRPC: change the back-channel queue to lwq	NeilBrown	2	-3/+3
	This removes the need to store and update back-links in the list. It also remove the need for the _bh version of spin_lock(). Signed-off-by: NeilBrown <[email protected]> Cc: Trond Myklebust <[email protected]> Cc: Anna Schumaker <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2023-10-16	SUNRPC: discard sp_lock	NeilBrown	1	-1/+0
	sp_lock is now only used to protect sp_all_threads. This isn't needed as sp_all_threads is only manipulated through svc_set_num_threads(), which is already serialized. Read-acccess only requires rcu_read_lock(). So no more locking is needed. Signed-off-by: NeilBrown <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2023-10-16	SUNRPC: change sp_nrthreads to atomic_t	NeilBrown	1	-1/+1
	Using an atomic_t avoids the need to take a spinlock (which can soon be removed). Choosing a thread to kill needs to be careful as we cannot set the "die now" bit atomically with the test on the count. Instead we temporarily increase the count. Signed-off-by: NeilBrown <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2023-10-16	SUNRPC: use lwq for sp_sockets - renamed to sp_xprts	NeilBrown	2	-2/+3
	lwq avoids using back pointers in lists, and uses less locking. This introduces a new spinlock, but the other one will be removed in a future patch. For svc_clean_up_xprts(), we now dequeue the entire queue, walk it to remove and process the xprts that need cleaning up, then re-enqueue the remaining queue. Signed-off-by: NeilBrown <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2023-10-16	SUNRPC: only have one thread waking up at a time	NeilBrown	1	-11/+0
	Currently if several items of work become available in quick succession, that number of threads (if available) will be woken. By the time some of them wake up another thread that was already cache-warm might have come along and completed the work. Anecdotal evidence suggests as many as 15% of wakes find nothing to do once they get to the point of looking. This patch changes svc_pool_wake_idle_thread() to wake the first thread on the queue but NOT remove it. Subsequent calls will wake the same thread. Once that thread starts it will dequeue itself and after dequeueing some work to do, it will wake the next thread if there is more work ready. This results in a more orderly increase in the number of busy threads. As a bonus, this allows us to reduce locking around the idle queue. svc_pool_wake_idle_thread() no longer needs to take a lock (beyond rcu_read_lock()) as it doesn't manipulate the queue, it just looks at the first item. The thread itself can avoid locking by using the new llist_del_first_this() interface. This will safely remove the thread itself if it is the head. If it isn't the head, it will do nothing. If multiple threads call this concurrently only one will succeed. The others will do nothing, so no corruption can result. If a thread wakes up and finds that it cannot dequeue itself that means either - that it wasn't woken because it was the head of the queue. Maybe the freezer woke it. In that case it can go back to sleep (after trying to freeze of course). - some other thread found there was nothing to do very recently, and placed itself on the head of the queue in front of this thread. It must check again after placing itself there, so it can be deemed to be responsible for any pending work, and this thread can go back to sleep until woken. No code ever tests for busy threads any more. Only each thread itself cares if it is busy. So svc_thread_busy() is no longer needed. Signed-off-by: NeilBrown <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2023-10-16	lib: add light-weight queuing mechanism.	NeilBrown	1	-0/+124
	lwq is a FIFO single-linked queue that only requires a spinlock for dequeueing, which happens in process context. Enqueueing is atomic with no spinlock and can happen in any context. This is particularly useful when work items are queued from BH or IRQ context, and when they are handled one at a time by dedicated threads. Avoiding any locking when enqueueing means there is no need to disable BH or interrupts, which is generally best avoided (particularly when there are any RT tasks on the machine). This solution is superior to using "list_head" links because we need half as many pointers in the data structures, and because list_head lists would need locking to add items to the queue. This solution is superior to a bespoke solution as all locking and container_of casting is integrated, so the interface is simple. Despite the similar name, this solution meets a distinctly different need to kfifo. kfifo provides a fixed sized circular buffer to which data can be added at one end and removed at the other, and does not provide any locking. lwq does not have any size limit and works with data structures (objects?) rather than data (bytes). A unit test for basic functionality, which runs at boot time, is included. Signed-off-by: NeilBrown <[email protected]> Cc: Andrew Morton <[email protected]> Cc: "Liam R. Howlett" <[email protected]> Cc: Kees Cook <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: David Gow <[email protected]> Cc: [email protected] Message-Id: <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2023-10-16	llist: add llist_del_first_this()	NeilBrown	1	-0/+4
	llist_del_first_this() deletes a specific entry from an llist, providing it is at the head of the list. Multiple threads can call this concurrently providing they each offer a different entry. This can be uses for a set of worker threads which are on the llist when they are idle. The head can always be woken, and when it is woken it can remove itself, and possibly wake the next if there is an excess of work to do. Signed-off-by: NeilBrown <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2023-10-16	SUNRPC: change service idle list to be an llist	NeilBrown	1	-16/+5
	With an llist we don't need to take a lock to add a thread to the list, though we still need a lock to remove it. That will go in the next patch. Unlike double-linked lists, a thread cannot reliably remove itself from the list. Only the first thread can be removed, and that can change asynchronously. So some care is needed. We already check if there is pending work to do, so we are unlikely to add ourselves to the idle list and then want to remove ourselves again. If we DO find something needs to be done after adding ourselves to the list, we simply wake up the first thread on the list. If that was us, we successfully removed ourselves and can continue. If it was some other thread, they will do the work that needs to be done. We can safely sleep until woken. We also remove the test on freezing() from rqst_should_sleep(). Instead we set TASK_FREEZABLE before scheduling. This makes is safe to schedule() when a freeze is pending. As we now loop waiting to be removed from the idle queue, this is a cleaner way to handle freezing. Signed-off-by: NeilBrown <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2023-10-16	llist: add interface to check if a node is on a list.	NeilBrown	1	-0/+42
	With list.h lists, it is easy to test if a node is on a list, providing it was initialised and that it is removed with list_del_init(). This patch provides similar functionality for llist.h lists. init_llist_node() marks a node as being not-on-any-list be setting the ->next pointer to the node itself. llist_on_list() tests if the node is on any list. llist_del_first_init() remove the first element from a llist, and marks it as being off-list. Signed-off-by: NeilBrown <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2023-10-16	SUNRPC: discard SP_CONGESTED	NeilBrown	1	-1/+0
	We can tell if a pool is congested by checking if the idle list is empty. We don't need a separate flag. Signed-off-by: NeilBrown <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2023-10-16	SUNRPC: add list of idle threads	NeilBrown	2	-2/+24
	Rather than searching a list of threads to find an idle one, having a list of idle threads allows an idle thread to be found immediately. This adds some spin_lock calls which is not ideal, but as the hold-time is tiny it is still faster than searching a list. A future patch will remove them using llist.h. This involves some subtlety and so is left to a separate patch. This removes the need for the RQ_BUSY flag. The rqst is "busy" precisely when it is not on the "idle" list. Signed-off-by: NeilBrown <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2023-10-16	SUNRPC: change how svc threads are asked to exit.	NeilBrown	2	-2/+26
	svc threads are currently stopped using kthread_stop(). This requires identifying a specific thread. However we don't care which thread stops, just as long as one does. So instead, set a flag in the svc_pool to say that a thread needs to die, and have each thread check this flag instead of calling kthread_should_stop(). The first thread to find and clear this flag then moves towards exiting. This removes an explicit dependency on sp_all_threads which will make a future patch simpler. Signed-off-by: NeilBrown <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2023-10-16	SUNRPC: integrate back-channel processing with svc_recv()	NeilBrown	1	-2/+0
	Using svc_recv() for (NFSv4.1) back-channel handling means we have just one mechanism for waking threads. Also change kthread_freezable_should_stop() in nfs4_callback_svc() to kthread_should_stop() as used elsewhere. kthread_freezable_should_stop() effectively adds a try_to_freeze() call, and svc_recv() already contains that at an appropriate place. Signed-off-by: NeilBrown <[email protected]> Cc: Trond Myklebust <[email protected]> Cc: Anna Schumaker <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2023-10-16	SUNRPC: Clean up bc_svc_process()	Chuck Lever	1	-2/+1
	The test robot complained that, in some build configurations, the @error variable in bc_svc_process's only caller is set but never used. This happens because dprintk() is the only consumer of that value. - Remove the dprintk() call sites in favor of the svc_process tracepoint - The @error variable and the return value of bc_svc_process() are now unused, so get rid of them. - The @serv parameter is set to rqstp->rq_serv by the only caller, and bc_svc_process() then uses it only to set rqstp->rq_serv. It can be removed. - Rename bc_svc_process() according to the convention that globally-visible RPC server functions have names that begin with "svc_"; and because it is globally-visible, give it a proper kdoc comment. Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/ Signed-off-by: Chuck Lever <[email protected]>
2023-10-16	lockd: introduce safe async lock op	Alexander Aring	1	-0/+14
	This patch reverts mostly commit 40595cdc93ed ("nfs: block notification on fs with its own ->lock") and introduces an EXPORT_OP_ASYNC_LOCK export flag to signal that the "own ->lock" implementation supports async lock requests. The only main user is DLM that is used by GFS2 and OCFS2 filesystem. Those implement their own lock() implementation and return FILE_LOCK_DEFERRED as return value. Since commit 40595cdc93ed ("nfs: block notification on fs with its own ->lock") the DLM implementation were never updated. This patch should prepare for DLM to set the EXPORT_OP_ASYNC_LOCK export flag and update the DLM plock implementation regarding to it. Acked-by: Jeff Layton <[email protected]> Signed-off-by: Alexander Aring <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2023-10-16	spi: Export acpi_spi_find_controller_by_adev()	Hans de Goede	1	-0/+1
	Export acpi_spi_find_controller_by_adev() so that ACPI glue code which wants to dynamically create a spi_device using acpi_spi_device_alloc() or spi_new_device() on a controller, to which the code does not already have a reference, can find the controller. Signed-off-by: Hans de Goede <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]>
2023-10-16	firmware: qcom: qseecom: add missing include guards	Bartosz Golaszewski	1	-0/+6
	The qseecom header does not contain ifdef guards against multiple inclusion. Add them. Fixes: 00b1248606ba ("firmware: qcom_scm: Add support for Qualcomm Secure Execution Environment SCM interface") Signed-off-by: Bartosz Golaszewski <[email protected]> Reviewed-by: Maximilian Luz <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Bjorn Andersson <[email protected]>
2023-10-16	xen: privcmd: Add support for ioeventfd	Viresh Kumar	2	-0/+69
	Virtio guests send VIRTIO_MMIO_QUEUE_NOTIFY notification when they need to notify the backend of an update to the status of the virtqueue. The backend or another entity, polls the MMIO address for updates to know when the notification is sent. It works well if the backend does this polling by itself. But as we move towards generic backend implementations, we end up implementing this in a separate user-space program. Generally, the Virtio backends are implemented to work with the Eventfd based mechanism. In order to make such backends work with Xen, another software layer needs to do the polling and send an event via eventfd to the backend once the notification from guest is received. This results in an extra context switch. This is not a new problem in Linux though. It is present with other hypervisors like KVM, etc. as well. The generic solution implemented in the kernel for them is to provide an IOCTL call to pass the address to poll and eventfd, which lets the kernel take care of polling and raise an event on the eventfd, instead of handling this in user space (which involves an extra context switch). This patch adds similar support for xen. Inspired by existing implementations for KVM, etc.. This also copies ioreq.h header file (only struct ioreq and related macros) from Xen's source tree (Top commit 5d84f07fe6bf ("xen/pci: drop remaining uses of bool_t")). Signed-off-by: Viresh Kumar <[email protected]> Reviewed-by: Juergen Gross <[email protected]> Link: https://lore.kernel.org/r/b20d83efba6453037d0c099912813c79c81f7714.1697439990.git.viresh.kumar@linaro.org Signed-off-by: Juergen Gross <[email protected]>
2023-10-16	xen: irqfd: Use _IOW instead of the internal _IOC() macro	Viresh Kumar	1	-1/+1
	_IOC() an internal helper that we should not use in driver code. In particular, we got the data direction wrong here, which breaks a number of tools, as having "_IOC_NONE" should never be paired with a nonzero size. Use _IOW() instead. Fixes: f8941e6c4c71 ("xen: privcmd: Add support for irqfd") Reported-by: Arnd Bergmann <[email protected]> Closes: https://lore.kernel.org/all/[email protected]/ Signed-off-by: Viresh Kumar <[email protected]> Reviewed-by: Juergen Gross <[email protected]> Link: https://lore.kernel.org/r/599ca6f1b9dd2f0e6247ea37bee3ea6827404b6d.1697439990.git.viresh.kumar@linaro.org Signed-off-by: Juergen Gross <[email protected]>
2023-10-16	xen: Make struct privcmd_irqfd's layout architecture independent	Viresh Kumar	1	-1/+1
	Using indirect pointers in an ioctl command argument means that the layout is architecture specific, in particular we can't use the same one from 32-bit compat tasks. The general recommendation is to have __u64 members and use u64_to_user_ptr() to access it from the kernel if we are unable to avoid the pointers altogether. Fixes: f8941e6c4c71 ("xen: privcmd: Add support for irqfd") Reported-by: Arnd Bergmann <[email protected]> Closes: https://lore.kernel.org/all/[email protected]/ Signed-off-by: Viresh Kumar <[email protected]> Reviewed-by: Juergen Gross <[email protected]> Link: https://lore.kernel.org/r/a4ef0d4a68fc858b34a81fd3f9877d9b6898eb77.1697439990.git.viresh.kumar@linaro.org Signed-off-by: Juergen Gross <[email protected]>
2023-10-16	arm64/arm: xen: enlighten: Fix KPTI checks	Mark Rutland	1	-0/+1
	When KPTI is in use, we cannot register a runstate region as XEN requires that this is always a valid VA, which we cannot guarantee. Due to this, xen_starting_cpu() must avoid registering each CPU's runstate region, and xen_guest_init() must avoid setting up features that depend upon it. We tried to ensure that in commit: f88af7229f6f22ce (" xen/arm: do not setup the runstate info page if kpti is enabled") ... where we added checks for xen_kernel_unmapped_at_usr(), which wraps arm64_kernel_unmapped_at_el0() on arm64 and is always false on 32-bit arm. Unfortunately, as xen_guest_init() is an early_initcall, this happens before secondary CPUs are booted and arm64 has finalized the ARM64_UNMAP_KERNEL_AT_EL0 cpucap which backs arm64_kernel_unmapped_at_el0(), and so this can subsequently be set as secondary CPUs are onlined. On a big.LITTLE system where the boot CPU does not require KPTI but some secondary CPUs do, this will result in xen_guest_init() intializing features that depend on the runstate region, and xen_starting_cpu() registering the runstate region on some CPUs before KPTI is subsequent enabled, resulting the the problems the aforementioned commit tried to avoid. Handle this more robsutly by deferring the initialization of the runstate region until secondary CPUs have been initialized and the ARM64_UNMAP_KERNEL_AT_EL0 cpucap has been finalized. The per-cpu work is moved into a new hotplug starting function which is registered later when we're certain that KPTI will not be used. Fixes: f88af7229f6f ("xen/arm: do not setup the runstate info page if kpti is enabled") Signed-off-by: Mark Rutland <[email protected]> Cc: Bertrand Marquis <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Juergen Gross <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Suzuki K Poulose <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2023-10-16	clocksource/drivers/arm_arch_timer: Initialize evtstrm after finalizing cpucaps	Mark Rutland	1	-0/+1
	We attempt to initialize each CPU's arch_timer event stream in arch_timer_evtstrm_enable(), which we call from the arch_timer_starting_cpu() cpu hotplug callback which is registered early in boot. As this is registered before we initialize the system cpucaps, the test for ARM64_HAS_ECV will always be false for CPUs present at boot time, and will only be taken into account for CPUs onlined late (including those which are hotplugged out and in again). Due to this, CPUs present and boot time may not use the intended divider and scale factor to generate the event stream, and may differ from other CPUs. Correct this by only initializing the event stream after cpucaps have been finalized, registering a separate CPU hotplug callback for the event stream configuration. Since the caps must be finalized by this point, use cpus_have_final_cap() to verify this. Signed-off-by: Mark Rutland <[email protected]> Acked-by: Marc Zyngier <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Cc: Daniel Lezcano <[email protected]> Cc: Suzuki K Poulose <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2023-10-16	drm/bridge: synopsys: dw-mipi-dsi: Add mode fixup support	Liu Ying	1	-0/+3
	Vendor drivers may need to fixup mode due to pixel clock tree limitation, so introduce the ->mode_fixup() callcack to struct dw_mipi_dsi_plat_data and call it at atomic check stage if available. Signed-off-by: Liu Ying <[email protected]> Reviewed-by: Robert Foss <[email protected]> Signed-off-by: Robert Foss <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-16	drm/bridge: synopsys: dw-mipi-dsi: Add input bus format negotiation support	Liu Ying	1	-0/+11
	Introduce ->get_input_bus_fmts() callback to struct dw_mipi_dsi_plat_data so that vendor drivers can implement specific methods to get input bus formats for Synopsys DW MIPI DSI. While at it, implement a generic callback for ->atomic_get_input_bus_fmts(), where we try to get the input bus formats through pdata->get_input_bus_fmts() first. If it's unavailable, fall back to the only format - MEDIA_BUS_FMT_FIXED, which matches the default behavior if ->atomic_get_input_bus_fmts() is not implemented as ->atomic_get_input_bus_fmts()'s kerneldoc indicates. Signed-off-by: Liu Ying <[email protected]> Reviewed-by: Robert Foss <[email protected]> Signed-off-by: Robert Foss <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-16	drm/bridge: synopsys: dw-mipi-dsi: Add dw_mipi_dsi_get_bridge() helper	Liu Ying	1	-0/+2
	Add dw_mipi_dsi_get_bridge() helper so that it can be used by vendor drivers which implement vendor specific extensions to Synopsys DW MIPI DSI. Signed-off-by: Liu Ying <[email protected]> Reviewed-by: Robert Foss <[email protected]> Signed-off-by: Robert Foss <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-16	ipv4: add new arguments to udp_tunnel_dst_lookup()	Beniamino Galvani	1	-3/+5
	We want to make the function more generic so that it can be used by other UDP tunnel implementations such as geneve and vxlan. To do that, add the following arguments: - source and destination UDP port; - ifindex of the output interface, needed by vxlan; - the tos, because in some cases it is not taken from struct ip_tunnel_info (for example, when it's inherited from the inner packet); - the dst cache, because not all tunnel types (e.g. vxlan) want to use the one from struct ip_tunnel_info. With these parameters, the function no longer needs the full struct ip_tunnel_info as argument and we can pass only the relevant part of it (struct ip_tunnel_key). Suggested-by: Guillaume Nault <[email protected]> Signed-off-by: Beniamino Galvani <[email protected]> Reviewed-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>