aboutsummaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2023-10-25drm/doc: document DRM_IOCTL_MODE_CREATE_DUMBSimon Ser2-2/+34
The main motivation is to repeat that dumb buffers should not be abused for anything else than basic software rendering with KMS. User-space devs are more likely to look at the IOCTL docs than to actively search for the driver-oriented "Dumb Buffer Objects" section. v2: reference DRM_CAP_DUMB_BUFFER, DRM_CAP_DUMB_PREFERRED_DEPTH and DRM_CAP_DUMB_PREFER_SHADOW (Pekka) Signed-off-by: Simon Ser <[email protected]> Acked-by: Daniel Vetter <[email protected]> Acked-by: Pekka Paalanen <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-25sh: Remove superhyway bus supportArnd Bergmann1-107/+0
The superhyway bus driver was only referenced on SH4-202, which is now gone, so remove it all as well. I could find no trace of anything ever calling superhyway_register_driver(), not in the git history but also not on the web, so I assume this has never served any purpose on mainline kernels. Signed-off-by: Arnd Bergmann <[email protected]> Reviewed-by: John Paul Adrian Glaubitz <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: John Paul Adrian Glaubitz <[email protected]>
2023-10-25Merge tag 'devfreq-next-for-6.7' of ↵Rafael J. Wysocki4-6/+52
git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux Merge devfreq updates for v6.7 from Chanwoo Choi: " Detailed description for this pull request: 1. Update devfreq core - Switch to dev_pm_opp_find_freq_(ceil/floor)_indexed() APIs to support the specific device like UFS which handle the multiple clocks through OPP (Operationg Performance Point) framework. 2. Update the devfreq / devfreq-event drivers - Add perf support to the Rockchip DFI(DDR Monitor Module) devfreq-event driver. : Generalize rockchip-dfi.c to support new RK3568/RK3588 using different DDR type. : Covert devicetree bidning document format to yaml. : DFI is a unit which is suitable for measuring DDR utilization for the DDR frequency scaling driver. Add perf support feature to rockchip-dfi.c to extend DFI usage. The perf support has been tested on a RK3568 and a RK3399. - Protect the OPP handling code in critical section because the voltage of shared OPP might be changed by multiple drivers on Mediatek CCI devfreq driver. - Use device_get_match_data() on Samsung Exynos PPMU devfreq-event driver." * tag 'devfreq-next-for-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux: (26 commits) dt-bindings: devfreq: event: rockchip,dfi: Add rk3588 support dt-bindings: devfreq: event: rockchip,dfi: Add rk3568 support dt-bindings: devfreq: event: convert Rockchip DFI binding to yaml PM / devfreq: rockchip-dfi: add support for RK3588 PM / devfreq: rockchip-dfi: account for multiple DDRMON_CTRL registers PM / devfreq: rockchip-dfi: make register stride SoC specific PM / devfreq: rockchip-dfi: Add perf support PM / devfreq: rockchip-dfi: give variable a better name PM / devfreq: rockchip-dfi: Prepare for multiple users PM / devfreq: rockchip-dfi: Pass private data struct to internal functions PM / devfreq: rockchip-dfi: Handle LPDDR4X PM / devfreq: rockchip-dfi: Handle LPDDR2 correctly PM / devfreq: rockchip-dfi: Add RK3568 support PM / devfreq: rockchip-dfi: Clean up DDR type register defines PM / devfreq: rk3399_dmc,dfi: generalize DDRTYPE defines PM / devfreq: rockchip-dfi: introduce channel mask PM / devfreq: rockchip-dfi: Use free running counter PM / devfreq: mediatek: unlock on error in mtk_ccifreq_target() PM / devfreq: exynos-ppmu: Use device_get_match_data() PM / devfreq: rockchip-dfi: dfi store raw values in counter struct ...
2023-10-25Merge tag 'opp-updates-6.7' of ↵Rafael J. Wysocki3-4/+44
git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm Merge OPP (operating performance points) updates for 6.7 from Viresh Kumar: "- Extend support for the opp-level beyond required-opps (Ulf Hansson). - Add dev_pm_opp_find_level_floor() (Krishna chaitanya chundru). - dt-bindings: Allow opp-peak-kBpsfor kryo CPUs, support Qualcomm Krait SoCs and document named opp-microvolt property (Bjorn Andersson, Dmitry Baryshkov and Christian Marangi). - Fix -Wunsequenced warning (Nathan Chancellor). - General cleanup (Viresh Kumar)." * tag 'opp-updates-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: dt-bindings: opp: opp-v2-kryo-cpu: Document named opp-microvolt property OPP: No need to defer probe from _opp_attach_genpd() OPP: Remove genpd_virt_dev_lock OPP: Reorder code in _opp_set_required_opps_genpd() OPP: Add _link_required_opps() to avoid code duplication OPP: Fix formatting of if/else block dt-bindings: opp: opp-v2-kryo-cpu: support Qualcomm Krait SoCs OPP: Fix -Wunsequenced in _of_add_opp_table_v1() dt-bindings: opp: opp-v2-kryo-cpu: Allow opp-peak-kBps OPP: debugfs: Fix warning with W=1 builds OPP: Remove doc style comments for internal routines OPP: Add dev_pm_opp_find_level_floor() OPP: Extend support for the opp-level beyond required-opps OPP: Switch to use dev_pm_domain_set_performance_state() OPP: Extend dev_pm_opp_data with a level OPP: Add dev_pm_opp_add_dynamic() to allow more flexibility PM: domains: Implement the ->set_performance_state() callback for genpd PM: domains: Introduce dev_pm_domain_set_performance_state()
2023-10-25Merge tag 'thermal-v6.7-rc1' of ↵Rafael J. Wysocki1-0/+28
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux Merge thermal control (ARM drivers mostly) updates for 6.7-rc1 from Daniel Lezcano: "- Add support for Mediatek LVTS MT8192 driver along with the suspend/resume routines (Balsam Chihi) - Fix probe for THERMAL_V2 for the Mediatek LVTS driver (Markus Schneider-Pargmann) - Remove duplicate error message in the max76620 driver when thermal_of_zone_register() fails as the sub routine already show one (Thierry Reding) - Add i.MX7D compatible bindings to fix a warning from dtbs_check for the imx6ul platform (Alexander Stein) - Add sa8775p compatible for the QCom tsens driver (Priyansh Jain) - Fix error check in lvts_debugfs_init() which is checking against NULL instead of PTR_ERR() on the LVTS Mediatek driver (Minjie Du) - Remove unused variable in the thermal/tools (Kuan-Wei Chiu) - Document the imx8dl thermal sensor (Fabio Estevam) - Add variable names in callback prototypes to prevent warning from checkpatch.pl for the imx8mm driver (Bragatheswaran Manickavel) - Add missing unevaluatedProperties on child node schemas for tegra124 (Rob Herring) - Add mt7988 support for the Mediatek LVTS driver (Frank Wunderlich)" * tag 'thermal-v6.7-rc1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux: thermal/qcom/tsens: Drop ops_v0_1 thermal/drivers/mediatek/lvts_thermal: Update calibration data documentation thermal/drivers/mediatek/lvts_thermal: Add mt8192 support thermal/drivers/mediatek/lvts_thermal: Add suspend and resume dt-bindings: thermal: mediatek: Add LVTS thermal controller definition for mt8192 thermal/drivers/mediatek: Fix probe for THERMAL_V2 thermal/drivers/max77620: Remove duplicate error message dt-bindings: timer: add imx7d compatible dt-bindings: net: microchip: Allow nvmem-cell usage dt-bindings: imx-thermal: Add #thermal-sensor-cells property dt-bindings: thermal: tsens: Add sa8775p compatible thermal/drivers/mediatek/lvts_thermal: Fix error check in lvts_debugfs_init() tools/thermal: Remove unused 'mds' and 'nrhandler' variables dt-bindings: thermal: fsl,scu-thermal: Document imx8dl thermal/drivers/imx8mm_thermal: Fix function pointer declaration by adding identifier name dt-bindings: thermal: nvidia,tegra124-soctherm: Add missing unevaluatedProperties on child node schemas thermal/drivers/mediatek/lvts_thermal: Add mt7988 support thermal/drivers/mediatek/lvts_thermal: Make coeff configurable dt-bindings: thermal: mediatek: Add LVTS thermal sensors for mt7988 dt-bindings: thermal: mediatek: Add mt7988 lvts compatible
2023-10-25netfilter: flowtable: GC pushes back packets to classic pathPablo Neira Ayuso1-0/+1
Since 41f2c7c342d3 ("net/sched: act_ct: Fix promotion of offloaded unreplied tuple"), flowtable GC pushes back flows with IPS_SEEN_REPLY back to classic path in every run, ie. every second. This is because of a new check for NF_FLOW_HW_ESTABLISHED which is specific of sched/act_ct. In Netfilter's flowtable case, NF_FLOW_HW_ESTABLISHED never gets set on and IPS_SEEN_REPLY is unreliable since users decide when to offload the flow before, such bit might be set on at a later stage. Fix it by adding a custom .gc handler that sched/act_ct can use to deal with its NF_FLOW_HW_ESTABLISHED bit. Fixes: 41f2c7c342d3 ("net/sched: act_ct: Fix promotion of offloaded unreplied tuple") Reported-by: Vladimir Smelhaus <[email protected]> Reviewed-by: Paul Blakey <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2023-10-25sched: act_ct: switch to per-action label countingFlorian Westphal1-0/+1
net->ct.labels_used was meant to convey 'number of ip/nftables rules that need the label extension allocated'. act_ct enables this for each net namespace, which voids all attempts to avoid ct->ext allocation when possible. Move this increment to the control plane to request label extension space allocation only when its needed. Signed-off-by: Florian Westphal <[email protected]> Reviewed-by: Pedro Tammela <[email protected]> Reviewed-by: Jamal Hadi Salim <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-10-24scsi: core: Add comment to target_destroy in scsi_host_templateWenchao Hao1-0/+3
Add comment to indicate that the callback function target_destroy in the scsi_host_template must not sleep. Signed-off-by: Wenchao Hao <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Bart Van Assche <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-10-24netkit, bpf: Add bpf programmable net deviceDaniel Borkmann3-0/+76
This work adds a new, minimal BPF-programmable device called "netkit" (former PoC code-name "meta") we recently presented at LSF/MM/BPF. The core idea is that BPF programs are executed within the drivers xmit routine and therefore e.g. in case of containers/Pods moving BPF processing closer to the source. One of the goals was that in case of Pod egress traffic, this allows to move BPF programs from hostns tcx ingress into the device itself, providing earlier drop or forward mechanisms, for example, if the BPF program determines that the skb must be sent out of the node, then a redirect to the physical device can take place directly without going through per-CPU backlog queue. This helps to shift processing for such traffic from softirq to process context, leading to better scheduling decisions/performance (see measurements in the slides). In this initial version, the netkit device ships as a pair, but we plan to extend this further so it can also operate in single device mode. The pair comes with a primary and a peer device. Only the primary device, typically residing in hostns, can manage BPF programs for itself and its peer. The peer device is designated for containers/Pods and cannot attach/detach BPF programs. Upon the device creation, the user can set the default policy to 'pass' or 'drop' for the case when no BPF program is attached. Additionally, the device can be operated in L3 (default) or L2 mode. The management of BPF programs is done via bpf_mprog, so that multi-attach is supported right from the beginning with similar API and dependency controls as tcx. For details on the latter see commit 053c8e1f235d ("bpf: Add generic attach/detach/query API for multi-progs"). tc BPF compatibility is provided, so that existing programs can be easily migrated. Going forward, we plan to use netkit devices in Cilium as the main device type for connecting Pods. They will be operated in L3 mode in order to simplify a Pod's neighbor management and the peer will operate in default drop mode, so that no traffic is leaving between the time when a Pod is brought up by the CNI plugin and programs attached by the agent. Additionally, the programs we attach via tcx on the physical devices are using bpf_redirect_peer() for inbound traffic into netkit device, hence the latter is also supporting the ndo_get_peer_dev callback. Similarly, we use bpf_redirect_neigh() for the way out, pushing from netkit peer to phys device directly. Also, BIG TCP is supported on netkit device. For the follow-up work in single device mode, we plan to convert Cilium's cilium_host/_net devices into a single one. An extensive test suite for checking device operations and the BPF program and link management API comes as BPF selftests in this series. Co-developed-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Toke Høiland-Jørgensen <[email protected]> Acked-by: Stanislav Fomichev <[email protected]> Acked-by: Martin KaFai Lau <[email protected]> Link: https://github.com/borkmann/iproute2/tree/pr/netkit Link: http://vger.kernel.org/bpfconf2023_material/tcx_meta_netdev_borkmann.pdf (24ff.) Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin KaFai Lau <[email protected]>
2023-10-24KVM: arm64: Sanitize PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR} before first runRaghavendra Rao Ananta1-1/+2
For unimplemented counters, the registers PM{C,I}NTEN{SET,CLR} and PMOVS{SET,CLR} are expected to have the corresponding bits RAZ. Hence to ensure correct KVM's PMU emulation, mask out the RES0 bits. Defer this work to the point that userspace can no longer change the number of advertised PMCs. Signed-off-by: Raghavendra Rao Ananta <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Oliver Upton <[email protected]>
2023-10-24KVM: arm64: PMU: Set PMCR_EL0.N for vCPU based on the associated PMURaghavendra Rao Ananta1-0/+6
The number of PMU event counters is indicated in PMCR_EL0.N. For a vCPU with PMUv3 configured, the value is set to the same value as the current PE on every vCPU reset. Unless the vCPU is pinned to PEs that has the PMU associated to the guest from the initial vCPU reset, the value might be different from the PMU's PMCR_EL0.N on heterogeneous PMU systems. Fix this by setting the vCPU's PMCR_EL0.N to the PMU's PMCR_EL0.N value. Track the PMCR_EL0.N per guest, as only one PMU can be set for the guest (PMCR_EL0.N must be the same for all vCPUs of the guest), and it is convenient for updating the value. To achieve this, the patch introduces a helper, kvm_arm_pmu_get_max_counters(), that reads the maximum number of counters from the arm_pmu associated to the VM. Make the function global as upcoming patches will be interested to know the value while setting the PMCR.N of the guest from userspace. KVM does not yet support userspace modifying PMCR_EL0.N. The following patch will add support for that. Reviewed-by: Sebastian Ott <[email protected]> Co-developed-by: Marc Zyngier <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Signed-off-by: Reiji Watanabe <[email protected]> Signed-off-by: Raghavendra Rao Ananta <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Oliver Upton <[email protected]>
2023-10-24KVM: arm64: PMU: Add a helper to read a vCPU's PMCR_EL0Reiji Watanabe1-0/+6
Add a helper to read a vCPU's PMCR_EL0, and use it whenever KVM reads a vCPU's PMCR_EL0. Currently, the PMCR_EL0 value is tracked per vCPU. The following patches will make (only) PMCR_EL0.N track per guest. Having the new helper will be useful to combine the PMCR_EL0.N field (tracked per guest) and the other fields (tracked per vCPU) to provide the value of PMCR_EL0. No functional change intended. Reviewed-by: Sebastian Ott <[email protected]> Signed-off-by: Reiji Watanabe <[email protected]> Signed-off-by: Raghavendra Rao Ananta <[email protected]> Reviewed-by: Eric Auger <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Oliver Upton <[email protected]>
2023-10-24KVM: arm64: Select default PMU in KVM_ARM_VCPU_INIT handlerReiji Watanabe1-0/+6
Future changes to KVM's sysreg emulation will rely on having a valid PMU instance to determine the number of implemented counters (PMCR_EL0.N). This is earlier than when userspace is expected to modify the vPMU device attributes, where the default is selected today. Select the default PMU when handling KVM_ARM_VCPU_INIT such that it is available in time for sysreg emulation. Reviewed-by: Sebastian Ott <[email protected]> Co-developed-by: Marc Zyngier <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Signed-off-by: Reiji Watanabe <[email protected]> Signed-off-by: Raghavendra Rao Ananta <[email protected]> Link: https://lore.kernel.org/r/[email protected] [Oliver: rewrite changelog] Signed-off-by: Oliver Upton <[email protected]>
2023-10-24PCI/PME: Use FIELD_GET()Bjorn Helgaas1-0/+1
Use FIELD_GET() to remove dependences on the field position, i.e., the shift value. No functional change intended. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Ilpo Järvinen <[email protected]> Reviewed-by: Jonathan Cameron <[email protected]> Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]>
2023-10-24PCI/ATS: Use FIELD_GET()Bjorn Helgaas1-0/+1
Use FIELD_GET() to remove dependences on the field position, i.e., the shift value. No functional change intended. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Jonathan Cameron <[email protected]> Reviewed-by: Ilpo Järvinen <[email protected]> Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]>
2023-10-24PCI/ATS: Show PASID Capability register width in bitmasksBjorn Helgaas1-5/+5
The PASID Capability and Control registers are both 16 bits wide. Use 16-bit wide constants in field names to match the register width. No functional change intended. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Ilpo Järvinen <[email protected]> Reviewed-by: Jonathan Cameron <[email protected]> Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]>
2023-10-24kexec: Annotate struct crash_mem with __counted_byKees Cook1-1/+1
Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct crash_mem. [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci Cc: Eric Biederman <[email protected]> Cc: [email protected] Acked-by: Baoquan He <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Kees Cook <[email protected]>
2023-10-24Merge tag 'wireless-2023-10-24' of ↵Jakub Kicinski1-0/+29
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Johannes Berg says: ==================== Three more fixes: - don't drop all unprotected public action frames since some don't have a protected dual - fix pointer confusion in scanning code - fix warning in some connections with multiple links * tag 'wireless-2023-10-24' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: wifi: mac80211: don't drop all unprotected public action frames wifi: cfg80211: fix assoc response warning on failed links wifi: cfg80211: pass correct pointer to rdev_inform_bss() ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-10-24net: dsa: Rename IFLA_DSA_MASTER to IFLA_DSA_CONDUITFlorian Fainelli1-1/+3
This preserves the existing IFLA_DSA_MASTER which is part of the uAPI and creates an alias named IFLA_DSA_CONDUIT. Reviewed-by: Andrew Lunn <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Signed-off-by: Florian Fainelli <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-10-24net: dsa: Use conduit and user termsFlorian Fainelli3-40/+40
Use more inclusive terms throughout the DSA subsystem by moving away from "master" which is replaced by "conduit" and "slave" which is replaced by "user". No functional changes. Acked-by: Rob Herring <[email protected]> Acked-by: Stephen Hemminger <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Signed-off-by: Florian Fainelli <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-10-24uapi: mptcp: use header file generated from YAML specDavide Caratti2-172/+160
generated with: $ ./tools/net/ynl/ynl-gen-c.py --mode uapi \ > --spec Documentation/netlink/specs/mptcp.yaml \ > --header -o include/uapi/linux/mptcp_pm.h Link: https://github.com/multipath-tcp/mptcp_net-next/issues/340 Acked-by: Paolo Abeni <[email protected]> Signed-off-by: Davide Caratti <[email protected]> Signed-off-by: Mat Martineau <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-10-24net: mptcp: convert netlink from small_ops to opsDavide Caratti1-0/+8
in the current MPTCP control plane, all operations use a netlink attribute of the same type "MPTCP_PM_ATTR". However, add/del/get/flush operations only parse the first element in the message _ the one that describes MPTCP endpoints (that was named MPTCP_PM_ATTR_ADDR and mostly used in ADD_ADDR operations _ probably the similarity of "attr", "addr" and "add" might cause some confusion to human readers). Convert MPTCP from 'small_ops' to 'ops', thus allowing different attributes for each single operation, hopefully makes all this clearer to human readers. - use a separate attribute set for add/del/get/flush address operation, binary compatible with the existing one, to store the endpoint address. MPTCP_PM_ENDPOINT_ADDR is added to the uAPI (with the same value as MPTCP_PM_ATTR_ADDR) for these operations. - convert mptcp_pm_ops[] and add policy files accordingly. this prepares MPTCP control plane to be described as YAML spec. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/340 Acked-by: Paolo Abeni <[email protected]> Signed-off-by: Davide Caratti <[email protected]> Signed-off-by: Mat Martineau <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-10-24Merge tag 'mm-hotfixes-stable-2023-10-24-09-40' of ↵Linus Torvalds2-5/+42
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "20 hotfixes. 12 are cc:stable and the remainder address post-6.5 issues or aren't considered necessary for earlier kernel versions" * tag 'mm-hotfixes-stable-2023-10-24-09-40' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: maple_tree: add GFP_KERNEL to allocations in mas_expected_entries() selftests/mm: include mman header to access MREMAP_DONTUNMAP identifier mailmap: correct email aliasing for Oleksij Rempel mailmap: map Bartosz's old address to the current one mm/damon/sysfs: check DAMOS regions update progress from before_terminate() MAINTAINERS: Ondrej has moved kasan: disable kasan_non_canonical_hook() for HW tags kasan: print the original fault addr when access invalid shadow hugetlbfs: close race between MADV_DONTNEED and page fault hugetlbfs: extend hugetlb_vma_lock to private VMAs hugetlbfs: clear resv_map pointer if mmap fails mm: zswap: fix pool refcount bug around shrink_worker() mm/migrate: fix do_pages_move for compat pointers riscv: fix set_huge_pte_at() for NAPOT mappings when a swap entry is set riscv: handle VM_FAULT_[HWPOISON|HWPOISON_LARGE] faults instead of panicking mmap: fix error paths with dup_anon_vma() mmap: fix vma_iterator in error path of vma_merge() mm: fix vm_brk_flags() to not bail out while holding lock mm/mempolicy: fix set_mempolicy_home_node() previous VMA pointer mm/page_alloc: correct start page when guard page debug is enabled
2023-10-24ACPI: utils: Introduce acpi_dev_uid_match() for matching _UIDRaag Jadav2-0/+6
Introduce acpi_dev_uid_match() helper that matches the device with supplied _UID string. Signed-off-by: Raag Jadav <[email protected]> Reviewed-by: Mika Westerberg <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2023-10-24drm/fourcc: Add NV20 and NV30 YUV formatsJonas Karlman1-0/+2
DRM_FORMAT_NV20 and DRM_FORMAT_NV30 formats is the 2x1 and non-subsampled variant of NV15, a 10-bit 2-plane YUV format that has no padding between components. Instead, luminance and chrominance samples are grouped into 4s so that each group is packed into an integer number of bytes: YYYY = UVUV = 4 * 10 bits = 40 bits = 5 bytes The '20' and '30' suffix refers to the optimum effective bits per pixel which is achieved when the total number of luminance samples is a multiple of 4. V2: Added NV30 format Signed-off-by: Jonas Karlman <[email protected]> Reviewed-by: Sandy Huang <[email protected]> Reviewed-by: Christopher Obbard <[email protected]> Tested-by: Christopher Obbard <[email protected]> Signed-off-by: Heiko Stuebner <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-24KVM: arm64: Add PMU event filter bits required if EL3 is implementedOliver Upton1-3/+6
Suzuki noticed that KVM's PMU emulation is oblivious to the NSU and NSK event filter bits. On systems that have EL3 these bits modify the filter behavior in non-secure EL0 and EL1, respectively. Even though the kernel doesn't use these bits, it is entirely possible some other guest OS does. Additionally, it would appear that these and the M bit are required by the architecture if EL3 is implemented. Allow the EL3 event filter bits to be set if EL3 is advertised in the guest's ID register. Implement the behavior of NSU and NSK according to the pseudocode, and entirely ignore the M bit for perf event creation. Reported-by: Suzuki K Poulose <[email protected]> Reviewed-by: Suzuki K Poulose <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Oliver Upton <[email protected]>
2023-10-24KVM: arm64: Make PMEVTYPER<n>_EL0.NSH RES0 if EL2 isn't advertisedOliver Upton1-0/+5
The NSH bit, which filters event counting at EL2, is required by the architecture if an implementation has EL2. Even though KVM doesn't support nested virt yet, it makes no effort to hide the existence of EL2 from the ID registers. Userspace can, however, change the value of PFR0 to hide EL2. Align KVM's sysreg emulation with the architecture and make NSH RES0 if EL2 isn't advertised. Keep in mind the bit is ignored when constructing the backing perf event. While at it, build the event type mask using explicit field definitions instead of relying on ARMV8_PMU_EVTYPE_MASK. KVM probably should've been doing this in the first place, as it avoids changes to the aforementioned mask affecting sysreg emulation. Reviewed-by: Suzuki K Poulose <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Oliver Upton <[email protected]>
2023-10-24exportfs: add helpers to check if filesystem can encode/decode file handlesAmir Goldstein1-0/+27
The logic of whether filesystem can encode/decode file handles is open coded in many places. In preparation to changing the logic, move the open coded logic into inline helpers. Reviewed-by: Jan Kara <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Amir Goldstein <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Christian Brauner <[email protected]>
2023-10-24PCI/DPC: Use defines with DPC reason fieldsIlpo Järvinen1-0/+6
Add new defines for DPC reason fields and use them instead of literals. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Ilpo Järvinen <[email protected]> [bhelgaas: shorten comments] Signed-off-by: Bjorn Helgaas <[email protected]>
2023-10-24PCI/DPC: Use FIELD_GET()Bjorn Helgaas1-0/+1
Use FIELD_GET() to remove dependencies on the field position, i.e., the shift value. No functional change intended. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Ilpo Järvinen <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]>
2023-10-24PCI: dwc: Use FIELD_GET/PREP()Ilpo Järvinen1-0/+2
Convert open-coded variants of PCI field access into FIELD_GET/PREP() to make the code easier to understand. Add two missing defines into pci_regs.h. Logically, the Max No-Snoop Latency Register is a separate word sized register in the PCIe spec, but the pre-existing LTR defines in pci_regs.h with dword long values seem to consider the registers together (the same goes for the only user). Thus, follow the custom and make the new values also take both word long LTR registers as a joint dword register. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Ilpo Järvinen <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]>
2023-10-24iommufd: Add a flag to skip clearing of IOPTE dirtyJoao Martins1-1/+14
VFIO has an operation where it unmaps an IOVA while returning a bitmap with the dirty data. In reality the operation doesn't quite query the IO pagetables that the PTE was dirty or not. Instead it marks as dirty on anything that was mapped, and doing so in one syscall. In IOMMUFD the equivalent is done in two operations by querying with GET_DIRTY_IOVA followed by UNMAP_IOVA. However, this would incur two TLB flushes given that after clearing dirty bits IOMMU implementations require invalidating their IOTLB, plus another invalidation needed for the UNMAP. To allow dirty bits to be queried faster, add a flag (IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR) that requests to not clear the dirty bits from the PTE (but just reading them), under the expectation that the next operation is the unmap. An alternative is to unmap and just perpectually mark as dirty as that's the same behaviour as today. So here equivalent functionally can be provided with unmap alone, and if real dirty info is required it will amortize the cost while querying. There's still a race against DMA where in theory the unmap of the IOVA (when the guest invalidates the IOTLB via emulated iommu) would race against the VF performing DMA on the same IOVA. As discussed in [0], we are accepting to resolve this race as throwing away the DMA and it doesn't matter if it hit physical DRAM or not, the VM can't tell if we threw it away because the DMA was blocked or because we failed to copy the DRAM. [0] https://lore.kernel.org/linux-iommu/[email protected]/ Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joao Martins <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-10-24iommufd: Add capabilities to IOMMU_GET_HW_INFOJoao Martins1-0/+17
Extend IOMMUFD_CMD_GET_HW_INFO op to query generic iommu capabilities for a given device. Capabilities are IOMMU agnostic and use device_iommu_capable() API passing one of the IOMMU_CAP_*. Enumerate IOMMU_CAP_DIRTY_TRACKING for now in the out_capabilities field returned back to userspace. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joao Martins <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-10-24iommufd: Add IOMMU_HWPT_GET_DIRTY_BITMAPJoao Martins1-0/+35
Connect a hw_pagetable to the IOMMU core dirty tracking read_and_clear_dirty iommu domain op. It exposes all of the functionality for the UAPI that read the dirtied IOVAs while clearing the Dirty bits from the PTEs. In doing so, add an IO pagetable API iopt_read_and_clear_dirty_data() that performs the reading of dirty IOPTEs for a given IOVA range and then copying back to userspace bitmap. Underneath it uses the IOMMU domain kernel API which will read the dirty bits, as well as atomically clearing the IOPTE dirty bit and flushing the IOTLB at the end. The IOVA bitmaps usage takes care of the iteration of the bitmaps user pages efficiently and without copies. Within the iterator function we iterate over io-pagetable contigous areas that have been mapped. Contrary to past incantation of a similar interface in VFIO the IOVA range to be scanned is tied in to the bitmap size, thus the application needs to pass a appropriately sized bitmap address taking into account the iova range being passed *and* page size ... as opposed to allowing bitmap-iova != iova. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joao Martins <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-10-24iommufd: Add IOMMU_HWPT_SET_DIRTY_TRACKINGJoao Martins1-0/+28
Every IOMMU driver should be able to implement the needed iommu domain ops to control dirty tracking. Connect a hw_pagetable to the IOMMU core dirty tracking ops, specifically the ability to enable/disable dirty tracking on an IOMMU domain (hw_pagetable id). To that end add an io_pagetable kernel API to toggle dirty tracking: * iopt_set_dirty_tracking(iopt, [domain], state) The intended caller of this is via the hw_pagetable object that is created. Internally it will ensure the leftover dirty state is cleared /right before/ dirty tracking starts. This is also useful for iommu drivers which may decide that dirty tracking is always-enabled at boot without wanting to toggle dynamically via corresponding iommu domain op. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joao Martins <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-10-24iommufd: Add a flag to enforce dirty tracking on attachJoao Martins1-0/+3
Throughout IOMMU domain lifetime that wants to use dirty tracking, some guarantees are needed such that any device attached to the iommu_domain supports dirty tracking. The idea is to handle a case where IOMMU in the system are assymetric feature-wise and thus the capability may not be supported for all devices. The enforcement is done by adding a flag into HWPT_ALLOC namely: IOMMU_HWPT_ALLOC_DIRTY_TRACKING .. Passed in HWPT_ALLOC ioctl() flags. The enforcement is done by creating a iommu_domain via domain_alloc_user() and validating the requested flags with what the device IOMMU supports (and failing accordingly) advertised). Advertising the new IOMMU domain feature flag requires that the individual iommu driver capability is supported when a future device attachment happens. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joao Martins <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-10-24iommu: Add iommu_domain ops for dirty trackingJoao Martins2-0/+74
Add to iommu domain operations a set of callbacks to perform dirty tracking, particulary to start and stop tracking and to read and clear the dirty data. Drivers are generally expected to dynamically change its translation structures to toggle the tracking and flush some form of control state structure that stands in the IOVA translation path. Though it's not mandatory, as drivers can also enable dirty tracking at boot, and just clear the dirty bits before setting dirty tracking. For each of the newly added IOMMU core APIs: iommu_cap::IOMMU_CAP_DIRTY_TRACKING: new device iommu_capable value when probing for capabilities of the device. .set_dirty_tracking(): an iommu driver is expected to change its translation structures and enable dirty tracking for the devices in the iommu_domain. For drivers making dirty tracking always-enabled, it should just return 0. .read_and_clear_dirty(): an iommu driver is expected to walk the pagetables for the iova range passed in and use iommu_dirty_bitmap_record() to record dirty info per IOVA. When detecting that a given IOVA is dirty it should also clear its dirty state from the PTE, *unless* the flag IOMMU_DIRTY_NO_CLEAR is passed in -- flushing is steered from the caller of the domain_op via iotlb_gather. The iommu core APIs use the same data structure in use for dirty tracking for VFIO device dirty (struct iova_bitmap) abstracted by iommu_dirty_bitmap_record() helper function. domain::dirty_ops: IOMMU domains will store the dirty ops depending on whether the iommu device supports dirty tracking or not. iommu drivers can then use this field to figure if the dirty tracking is supported+enforced on attach. The enforcement is enable via domain_alloc_user() which is done via IOMMUFD hwpt flag introduced later. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joao Martins <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Reviewed-by: Lu Baolu <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-10-24vfio: Move iova_bitmap into iommufdJoao Martins1-0/+26
Both VFIO and IOMMUFD will need iova bitmap for storing dirties and walking the user bitmaps, so move to the common dependency into IOMMUFD. In doing so, create the symbol IOMMUFD_DRIVER which designates the builtin code that will be used by drivers when selected. Today this means MLX5_VFIO_PCI and PDS_VFIO_PCI. IOMMU drivers will do the same (in future patches) when supporting dirty tracking and select IOMMUFD_DRIVER accordingly. Given that the symbol maybe be disabled, add header definitions in iova_bitmap.h for when IOMMUFD_DRIVER=n Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joao Martins <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Reviewed-by: Brett Creeley <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Reviewed-by: Alex Williamson <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-10-24arm64, irqchip/gic-v3, ACPI: Move MADT GICC enabled check into a helperJames Morse1-0/+5
ACPI, irqchip and the architecture code all inspect the MADT enabled bit for a GICC entry in the MADT. The addition of an 'online capable' bit means all these sites need updating. Move the current checks behind a helper to make future updates easier. Signed-off-by: James Morse <[email protected]> Reviewed-by: Jonathan Cameron <[email protected]> Reviewed-by: Gavin Shan <[email protected]> Signed-off-by: "Russell King (Oracle)" <[email protected]> Acked-by: "Rafael J. Wysocki" <[email protected]> Reviewed-by: Sudeep Holla <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]>
2023-10-24netfilter: nf_tables: set->ops->insert returns opaque set element in case of ↵Pablo Neira Ayuso1-1/+1
EEXIST Return struct nft_elem_priv instead of struct nft_set_ext for consistency with ("netfilter: nf_tables: expose opaque set element as struct nft_elem_priv") and to prepare the introduction of element timeout updates from control path. Signed-off-by: Pablo Neira Ayuso <[email protected]>
2023-10-24netfilter: nf_tables: shrink memory consumption of set elementsPablo Neira Ayuso1-9/+9
Instead of copying struct nft_set_elem into struct nft_trans_elem, store the pointer to the opaque set element object in the transaction. Adapt set backend API (and set backend implementations) to take the pointer to opaque set element representation whenever required. This patch deconstifies .remove() and .activate() set backend API since these modify the set element opaque object. And it also constify nft_set_elem_ext() this provides access to the nft_set_ext struct without updating the object. According to pahole on x86_64, this patch shrinks struct nft_trans_elem size from 216 to 24 bytes. This patch also reduces stack memory consumption by removing the template struct nft_set_elem object, using the opaque set element object instead such as from the set iterator API, catchall elements and the get element command. Signed-off-by: Pablo Neira Ayuso <[email protected]>
2023-10-24netfilter: nf_tables: expose opaque set element as struct nft_elem_privPablo Neira Ayuso1-13/+25
Add placeholder structure and place it at the beginning of each struct nft_*_elem for each existing set backend, instead of exposing elements as void type to the frontend which defeats compiler type checks. Use this pointer to this new type to replace void *. This patch updates the following set backend API to use this new struct nft_elem_priv placeholder structure: - update - deactivate - flush - get as well as the following helper functions: - nft_set_elem_ext() - nft_set_elem_init() - nft_set_elem_destroy() - nf_tables_set_elem_destroy() This patch adds nft_elem_priv_cast() to cast struct nft_elem_priv to native element representation from the corresponding set backend. BUILD_BUG_ON() makes sure this .priv placeholder is always at the top of the opaque set element representation. Suggested-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2023-10-24netfilter: nf_tables: set backend .flush always succeedsPablo Neira Ayuso1-1/+1
.flush is always successful since this results from iterating over the set elements to toggle mark the element as inactive in the next generation. Signed-off-by: Pablo Neira Ayuso <[email protected]>
2023-10-24netfilter: conntrack: switch connlabels to atomic_tFlorian Westphal2-2/+2
The spinlock is back from the day when connabels did not have a fixed size and reallocation had to be supported. Remove it. This change also allows to call the helpers from softirq or timers without deadlocks. Also add WARN()s to catch refcounting imbalances. Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2023-10-24pmdomain: Merge branch genpd_dt into nextUlf Hansson1-0/+21
Merge the immutable branch genpd_dt into next, to allow the DT bindings to be tested together with new pmdomain changes that are targeted for v6.7. Signed-off-by: Ulf Hansson <[email protected]>
2023-10-24dt-bindings: power: rpmpd: Add MSM8917, MSM8937 and QM215Otto Pflüger1-0/+21
The MSM8917, MSM8937 and QM215 SoCs have VDDCX and VDDMX power domains controlled in voltage level mode. Define the MSM8937 and QM215 power domains as aliases because these SoCs are similar to MSM8917 and may share some parts of the device tree. Also add the compatibles for these SoCs to the documentation, with qcom,msm8937-rpmpd using qcom,msm8917-rpmpd as a fallback compatible because there are no known differences. QM215 is not compatible with these because it uses different regulators. Signed-off-by: Otto Pflüger <[email protected]> Reviewed-by: Krzysztof Kozlowski <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Ulf Hansson <[email protected]>
2023-10-24gtp: uapi: fix GTPA_MAXPablo Neira Ayuso1-1/+1
Subtract one to __GTPA_MAX, otherwise GTPA_MAX is off by 2. Fixes: 459aa660eb1d ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)") Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2023-10-24xsk: Avoid starving the xsk further down the listAlbert Huang1-0/+7
In the previous implementation, when multiple xsk sockets were associated with a single xsk_buff_pool, a situation could arise where the xsk_tx_list maintained data at the front for one xsk socket while starving the xsk sockets at the back of the list. This could result in issues such as the inability to transmit packets, increased latency, and jitter. To address this problem, we introduce a new variable called tx_budget_spent, which limits each xsk to transmit a maximum of MAX_PER_SOCKET_BUDGET tx descriptors. This allocation ensures equitable opportunities for subsequent xsk sockets to send tx descriptors. The value of MAX_PER_SOCKET_BUDGET is set to 32. Signed-off-by: Albert Huang <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Magnus Karlsson <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2023-10-24sched/fair: Remove SIS_PROPPeter Zijlstra1-2/+0
SIS_UTIL seems to work well, lets remove the old thing. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Vincent Guittot <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2023-10-24sched: Add cpus_share_resources APIBarry Song2-1/+14
Add cpus_share_resources() API. This is the preparation for the optimization of select_idle_cpu() on platforms with cluster scheduler level. On a machine with clusters cpus_share_resources() will test whether two cpus are within the same cluster. On a non-cluster machine it will behaves the same as cpus_share_cache(). So we use "resources" here for cache resources. Signed-off-by: Barry Song <[email protected]> Signed-off-by: Yicong Yang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Gautham R. Shenoy <[email protected]> Reviewed-by: Tim Chen <[email protected]> Reviewed-by: Vincent Guittot <[email protected]> Tested-and-reviewed-by: Chen Yu <[email protected]> Tested-by: K Prateek Nayak <[email protected]> Link: https://lkml.kernel.org/r/[email protected]