path: root/include
Age  Commit message  Author  Files  Lines
2023-12-11  soc: mediatek: Support MT8188 VDOSYS1 Padding in mtk-mmsys  (Hsiao Chien Sung, 1 file, -0/+8)
- Add Padding components - Add Mutex module definitions for Padding Reviewed-by: AngeloGioacchino Del Regno <[email protected]> Signed-off-by: Hsiao Chien Sung <[email protected]> Signed-off-by: AngeloGioacchino Del Regno <[email protected]>
2023-12-11  platform/x86/amd: Add support for AMD ACPI based Wifi band RFI mitigation feature  (Ma Jun, 1 file, -0/+91)
Due to electrical and mechanical constraints in certain platform designs there may be likely interference of relatively high-powered harmonics of the (G-)DDR memory clocks with local radio module frequency bands used by Wifi 6/6e/7. To mitigate this, AMD has introduced a mechanism that devices can use to notify active use of particular frequencies so that other devices can make relative internal adjustments as necessary to avoid this resonance. Co-developed-by: Evan Quan <[email protected]> Signed-off-by: Evan Quan <[email protected]> Signed-off-by: Ma Jun <[email protected]> Reviewed-by: Mario Limonciello <[email protected]> Reviewed-by: Hans de Goede <[email protected]> Signed-off-by: Hans de Goede <[email protected]>
2023-12-11  platform/x86: wmi: Remove chardev interface  (Armin Wolf, 1 file, -8/+0)
The design of the WMI chardev interface is broken: - it assumes that WMI drivers are not instantiated twice - it offers next to no abstractions, the WMI driver gets a raw byte buffer - it is only used by a single driver, something which is unlikely to change Since the only user (dell-smbios-wmi) has been migrated to its own ioctl interface, remove it. Reviewed-by: Ilpo Järvinen <[email protected]> Signed-off-by: Armin Wolf <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Hans de Goede <[email protected]> Signed-off-by: Hans de Goede <[email protected]>
2023-12-11  Merge tag 'platform-drivers-x86-v6.7-3' into pdx86/for-next  (Hans de Goede, 1 file, -0/+3)
Back merge pdx86 fixes into pdx86/for-next for further WMI work depending on some of the fixes. platform-drivers-x86 for v6.7-3 Highlights: - asus-wmi: Solve i8042 filter resource handling, input, and suspend issues - wmi: Skip zero instance WMI blocks to avoid issues with some laptops - mlxbf-bootctl: Differentiate dev/production keys - platform/surface: Correct serdev related return value to avoid leaking errno into userspace - Error checking fixes The following is an automated shortlog grouped by driver: asus-wmi: - Change q500a_i8042_filter() into a generic i8042-filter - disable USB0 hub on ROG Ally before suspend - Filter Volume key presses if also reported via atkbd - Move i8042 filter install to shared asus-wmi code mellanox: - Add null pointer checks for devm_kasprintf() - Check devm_hwmon_device_register_with_groups() return value mlxbf-bootctl: - correctly identify secure boot with development keys surface: aggregator: - fix recv_buf() return value wmi: - Skip blocks with zero instances
2023-12-11  efivarfs: automatically update super block flag  (Masahisa Kojima, 1 file, -0/+8)
efivar operation is updated when the tee_stmm_efi module is probed. tee_stmm_efi module supports SetVariable runtime service, but user needs to manually remount the efivarfs as RW to enable the write access if the previous efivar operation does not support SetVariable and efivarfs is mounted as read-only. This commit notifies the update of efivar operation to efivarfs subsystem, then drops SB_RDONLY flag if the efivar operation supports SetVariable. Signed-off-by: Masahisa Kojima <[email protected]> [ardb: use per-superblock instance of the notifier block] Signed-off-by: Ard Biesheuvel <[email protected]>
2023-12-11  efi: Add EFI_ACCESS_DENIED status code  (Masahisa Kojima, 1 file, -0/+1)
This commit adds the EFI_ACCESS_DENIED status code. Acked-by: Sumit Garg <[email protected]> Co-developed-by: Ilias Apalodimas <[email protected]> Signed-off-by: Ilias Apalodimas <[email protected]> Signed-off-by: Masahisa Kojima <[email protected]> Signed-off-by: Ard Biesheuvel <[email protected]>
2023-12-11  efi: expose efivar generic ops register function  (Masahisa Kojima, 1 file, -0/+3)
This is a preparation for supporting efivar operations provided by other than efi subsystem. Both register and unregister functions are exposed so that non-efi subsystem can revert the efi generic operation. Acked-by: Sumit Garg <[email protected]> Co-developed-by: Ilias Apalodimas <[email protected]> Signed-off-by: Ilias Apalodimas <[email protected]> Signed-off-by: Masahisa Kojima <[email protected]> Signed-off-by: Ard Biesheuvel <[email protected]>
2023-12-11  dt-bindings: reset: mt8188: Add VDOSYS reset control bits  (Hsiao Chien Sung, 1 file, -0/+75)
Add MT8188 VDOSYS0 and VDOSYS1 reset control bits. Reviewed-by: AngeloGioacchino Del Regno <[email protected]> Acked-by: Krzysztof Kozlowski <[email protected]> Signed-off-by: Hsiao Chien Sung <[email protected]> Signed-off-by: AngeloGioacchino Del Regno <[email protected]>
2023-12-11  platform/x86/intel/tpmi: Move TPMI ID definition  (Srinivas Pandruvada, 1 file, -0/+13)
Move TPMI ID definitions to common include file. In this way other feature drivers don't have to redefine. Signed-off-by: Srinivas Pandruvada <[email protected]> Reviewed-by: Ilpo Järvinen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Hans de Goede <[email protected]>
2023-12-11  platform/x86/intel/tpmi: Modify external interface to get read/write state  (Srinivas Pandruvada, 1 file, -3/+2)
Modify the external interface tpmi_get_feature_status() to get read and write blocked instead of locked and disabled. Since auxiliary device is not created when disabled, no use of returning disabled state. Also locked state is not useful as feature driver can't use locked state in a meaningful way. Using read and write state, feature driver can decide which operations to restrict for that feature. Signed-off-by: Srinivas Pandruvada <[email protected]> Reviewed-by: Ilpo Järvinen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Hans de Goede <[email protected]>
2023-12-11  clk: x86: lpss-atom: Drop unneeded 'extern' in the header  (Andy Shevchenko, 1 file, -1/+1)
'extern' for the functions is not needed, drop it. Signed-off-by: Andy Shevchenko <[email protected]> Reviewed-by: Ilpo Järvinen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Hans de Goede <[email protected]> Signed-off-by: Hans de Goede <[email protected]>
2023-12-11  net/sched: act_ct: Take per-cb reference to tcf_ct_flow_table  (Vlad Buslov, 1 file, -0/+10)
The referenced change added custom cleanup code to act_ct to delete any callbacks registered on the parent block when deleting the tcf_ct_flow_table instance. However, the underlying issue is that the drivers don't obtain the reference to the tcf_ct_flow_table instance when registering callbacks which means that not only driver callbacks may still be on the table when deleting it but also that the driver can still have pointers to its internal nf_flowtable and can use it concurrently which results either warning in netfilter[0] or use-after-free. Fix the issue by taking a reference to the underlying struct tcf_ct_flow_table instance when registering the callback and release the reference when unregistering. Expose new API required for such reference counting by adding two new callbacks to nf_flowtable_type and implementing them for act_ct flowtable_ct type. This fixes the issue by extending the lifetime of nf_flowtable until all users have unregistered. [0]: [106170.938634] ------------[ cut here ]------------ [106170.939111] WARNING: CPU: 21 PID: 3688 at include/net/netfilter/nf_flow_table.h:262 mlx5_tc_ct_del_ft_cb+0x267/0x2b0 [mlx5_core] [106170.940108] Modules linked in: act_ct nf_flow_table act_mirred act_skbedit act_tunnel_key vxlan cls_matchall nfnetlink_cttimeout act_gact cls_flower sch_ingress mlx5_vdpa vringh vhost_iotlb vdpa bonding openvswitch nsh rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm mlx5_ib ib_uverbs ib_core xt_MASQUERADE nf_conntrack_netlink nfnetlink iptable_nat xt_addrtype xt_conntrack nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_regis try overlay mlx5_core [106170.943496] CPU: 21 PID: 3688 Comm: kworker/u48:0 Not tainted 6.6.0-rc7_for_upstream_min_debug_2023_11_01_13_02 #1 [106170.944361] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 [106170.945292] Workqueue: mlx5e mlx5e_rep_neigh_update [mlx5_core] [106170.945846] RIP: 0010:mlx5_tc_ct_del_ft_cb+0x267/0x2b0 [mlx5_core] [106170.946413] Code: 89 ef 48 83 05 71 a4 14 00 01 e8 f4 06 04 e1 48 83 05 6c a4 14 00 01 48 83 c4 28 5b 5d 41 5c 41 5d c3 48 83 05 d1 8b 14 00 01 <0f> 0b 48 83 05 d7 8b 14 00 01 e9 96 fe ff ff 48 83 05 a2 90 14 00 [106170.947924] RSP: 0018:ffff88813ff0fcb8 EFLAGS: 00010202 [106170.948397] RAX: 0000000000000000 RBX: ffff88811eabac40 RCX: ffff88811eabad48 [106170.949040] RDX: ffff88811eab8000 RSI: ffffffffa02cd560 RDI: 0000000000000000 [106170.949679] RBP: ffff88811eab8000 R08: 0000000000000001 R09: ffffffffa0229700 [106170.950317] R10: ffff888103538fc0 R11: 0000000000000001 R12: ffff88811eabad58 [106170.950969] R13: ffff888110c01c00 R14: ffff888106b40000 R15: 0000000000000000 [106170.951616] FS: 0000000000000000(0000) GS:ffff88885fd40000(0000) knlGS:0000000000000000 [106170.952329] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [106170.952834] CR2: 00007f1cefd28cb0 CR3: 000000012181b006 CR4: 0000000000370ea0 [106170.953482] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [106170.954121] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [106170.954766] Call Trace: [106170.955057] <TASK> [106170.955315] ? __warn+0x79/0x120 [106170.955648] ? mlx5_tc_ct_del_ft_cb+0x267/0x2b0 [mlx5_core] [106170.956172] ? report_bug+0x17c/0x190 [106170.956537] ? handle_bug+0x3c/0x60 [106170.956891] ? exc_invalid_op+0x14/0x70 [106170.957264] ? asm_exc_invalid_op+0x16/0x20 [106170.957666] ? mlx5_del_flow_rules+0x10/0x310 [mlx5_core] [106170.958172] ? 
mlx5_tc_ct_block_flow_offload_add+0x1240/0x1240 [mlx5_core] [106170.958788] ? mlx5_tc_ct_del_ft_cb+0x267/0x2b0 [mlx5_core] [106170.959339] ? mlx5_tc_ct_del_ft_cb+0xc6/0x2b0 [mlx5_core] [106170.959854] ? mapping_remove+0x154/0x1d0 [mlx5_core] [106170.960342] ? mlx5e_tc_action_miss_mapping_put+0x4f/0x80 [mlx5_core] [106170.960927] mlx5_tc_ct_delete_flow+0x76/0xc0 [mlx5_core] [106170.961441] mlx5_free_flow_attr_actions+0x13b/0x220 [mlx5_core] [106170.962001] mlx5e_tc_del_fdb_flow+0x22c/0x3b0 [mlx5_core] [106170.962524] mlx5e_tc_del_flow+0x95/0x3c0 [mlx5_core] [106170.963034] mlx5e_flow_put+0x73/0xe0 [mlx5_core] [106170.963506] mlx5e_put_flow_list+0x38/0x70 [mlx5_core] [106170.964002] mlx5e_rep_update_flows+0xec/0x290 [mlx5_core] [106170.964525] mlx5e_rep_neigh_update+0x1da/0x310 [mlx5_core] [106170.965056] process_one_work+0x13a/0x2c0 [106170.965443] worker_thread+0x2e5/0x3f0 [106170.965808] ? rescuer_thread+0x410/0x410 [106170.966192] kthread+0xc6/0xf0 [106170.966515] ? kthread_complete_and_exit+0x20/0x20 [106170.966970] ret_from_fork+0x2d/0x50 [106170.967332] ? kthread_complete_and_exit+0x20/0x20 [106170.967774] ret_from_fork_asm+0x11/0x20 [106170.970466] </TASK> [106170.970726] ---[ end trace 0000000000000000 ]--- Fixes: 77ac5e40c44e ("net/sched: act_ct: remove and free nf_table callbacks") Signed-off-by: Vlad Buslov <[email protected]> Reviewed-by: Paul Blakey <[email protected]> Acked-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: David S. Miller <[email protected]>
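A minimal sketch of the refcounting pattern described above; the struct and function names here are illustrative assumptions, not the actual symbols added by the patch. Each callback registration pins the flow table, and the final put frees it only after every driver has unregistered.

#include <linux/refcount.h>
#include <linux/slab.h>

struct example_ct_flow_table {
	refcount_t ref;
	/* ... the embedded nf_flowtable and other members ... */
};

/* taken when a driver registers a block callback on the table */
static void example_ct_flow_table_get(struct example_ct_flow_table *ct_ft)
{
	refcount_inc(&ct_ft->ref);
}

/* dropped when the driver unregisters; the last put frees the table */
static void example_ct_flow_table_put(struct example_ct_flow_table *ct_ft)
{
	if (refcount_dec_and_test(&ct_ft->ref))
		kfree(ct_ft);
}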
2023-12-11  Merge 6.7-rc5 into tty-next  (Greg Kroah-Hartman, 49 files, -76/+262)
We need the serial fixes in here as well to build off of. Signed-off-by: Greg Kroah-Hartman <[email protected]>
2023-12-11  RDMA/bnxt_re: Adds MSN table capability for Gen P7 adapters  (Selvin Xavier, 1 file, -0/+1)
GenP7 HW expects an MSN table instead of PSN table. Check for the HW retransmission capability and populate the MSN table if HW retransmission is supported. Signed-off-by: Damodharam Ammepalli <[email protected]> Signed-off-by: Selvin Xavier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Leon Romanovsky <[email protected]>
2023-12-11  Merge 6.7-rc5 into usb-next  (Greg Kroah-Hartman, 38 files, -56/+188)
We need the USB fixes in here as well to build off of. Signed-off-by: Greg Kroah-Hartman <[email protected]>
2023-12-11  Merge 6.7-rc5 into char-misc-next  (Greg Kroah-Hartman, 38 files, -56/+188)
We need the char/misc fixes in here as well for testing and to build off of. Signed-off-by: Greg Kroah-Hartman <[email protected]>
2023-12-10  resource: add walk_system_ram_res_rev()  (Baoquan He, 1 file, -0/+3)
This function, being a variant of walk_system_ram_res() introduced in commit 8c86e70acead ("resource: provide new functions to walk through resources"), walks through a list of all the resources of System RAM in reversed order, i.e., from higher to lower. It will be used in kexec_file code to load kernel, initrd etc when preparing kexec reboot. Link: https://lkml.kernel.org/r/ZVTA6z/06cLnWKUz@MiWiFi-R3L-srv Signed-off-by: AKASHI Takahiro <[email protected]> Signed-off-by: Baoquan He <[email protected]> Cc: Eric Biederman <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
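A usage sketch, assuming the reversed variant mirrors the existing walk_system_ram_res() signature (start, end, arg, callback); the callback and the top-of-RAM search below are purely illustrative and not taken from the kexec_file code.

#include <linux/ioport.h>
#include <linux/limits.h>

/* called once per System RAM resource, highest range first */
static int note_highest(struct resource *res, void *arg)
{
	u64 *top = arg;

	if (res->end > *top)
		*top = res->end;
	return 1;	/* non-zero stops the walk; the first range is already the highest */
}

static u64 find_top_of_ram(void)
{
	u64 top = 0;

	walk_system_ram_res_rev(0, ULLONG_MAX, &top, note_highest);
	return top;
}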
2023-12-10  kernel/signal.c: simplify force_sig_info_to_task(), kill recalc_sigpending_and_wake()  (Oleg Nesterov, 1 file, -1/+0)
The purpose of recalc_sigpending_and_wake() is not clear, it looks "obviously unneeded" because we are going to send the signal which can't be blocked or ignored. Add the comment to explain why we can't rely on send_signal_locked() and make this logic more simple/explicit. recalc_sigpending_and_wake() has no other users, it can die. In fact I think we don't even need signal_wake_up(), the target task must be either current or a TASK_TRACED child, otherwise the usage of siglock is not safe. But this needs another change. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Oleg Nesterov <[email protected]> Cc: Eric Biederman <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-12-10  arch: remove ARCH_TASK_STRUCT_ON_STACK  (Heiko Carstens, 2 files, -9/+0)
IA-64 was the only architecture which selected ARCH_TASK_STRUCT_ON_STACK. IA-64 was removed with commit cf8e8658100d ("arch: Remove Itanium (IA-64) architecture"). Therefore remove support for ARCH_TASK_STRUCT_ON_STACK as well. Note: this also reveals a potential bug in powerpc code, which makes use of __init_task_data without selecting ARCH_TASK_STRUCT_ON_STACK which makes __init_task_data a no-op. This is broken since commit d11ed3ab3166 ("Expand INIT_TASK() in init/init_task.c and remove") from 2018 and needs to be addressed separately. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Heiko Carstens <[email protected]> Reviewed-by: Arnd Bergmann <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Nicholas Piggin <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-12-10  introduce for_other_threads(p, t)  (Oleg Nesterov, 1 file, -0/+3)
Cosmetic, but imho it makes the usage look more clear and simple, the new helper doesn't require to initialize "t". After this change while_each_thread() has only 3 users, and it is only used in the do/while loops. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Oleg Nesterov <[email protected]> Reviewed-by: Christian Brauner <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
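A sketch of what such a helper can look like; the actual definition in <linux/sched/signal.h> may differ in detail, so the macro is given an example_ prefix here. The point is that the loop variable t needs no prior initialization, which is the simplification the commit message mentions.

#include <linux/sched/signal.h>

/* roughly what for_other_threads(p, t) can expand to: visit every
 * thread in p's group except p, with no need to pre-initialize t */
#define example_for_other_threads(p, t) \
	for ((t) = (p); ((t) = next_thread(t)) != (p); )

/* illustrative caller: count p's sibling threads */
static int count_other_threads(struct task_struct *p)
{
	struct task_struct *t;
	int n = 0;

	example_for_other_threads(p, t)	/* caller holds RCU / tasklist protection */
		n++;
	return n;
}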
2023-12-10  mm: list_lru: Update kernel documentation to follow the requirements  (Andy Shevchenko, 1 file, -17/+19)
kernel-doc is not happy about documentation in list_lru.h: list_lru.h:90: warning: Function parameter or member 'lru' not described in 'list_lru_add' list_lru.h:90: warning: Excess function parameter 'list_lru' description in 'list_lru_add' list_lru.h:90: warning: No description found for return value of 'list_lru_add' list_lru.h:103: warning: Function parameter or member 'lru' not described in 'list_lru_del' list_lru.h:103: warning: Excess function parameter 'list_lru' description in 'list_lru_del' list_lru.h:103: warning: No description found for return value of 'list_lru_del' list_lru.h:116: warning: No description found for return value of 'list_lru_count_one' list_lru.h:168: warning: No description found for return value of 'list_lru_walk_one' list_lru.h:185: warning: No description found for return value of 'list_lru_walk_one_irq' Fix the documentation accordingly. While at it, fix the references to the parameters in functions inside the long descriptions, on which the above script is not complaining (yet?). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Andy Shevchenko <[email protected]> Cc: Rasmus Villemoes <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
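For illustration, the shape of a kernel-doc comment that satisfies those warnings: the parameter names match the prototype and there is an explicit Return: section. The wording below is paraphrased rather than quoted from list_lru.h.

#include <linux/list_lru.h>

/**
 * list_lru_add - add an element to the LRU list
 * @lru:  the lru pointer (must be named @lru, not @list_lru, to match
 *        the prototype)
 * @item: the item to be added
 *
 * The long description should also refer to @lru and @item by these
 * exact names.
 *
 * Return: true if the list was updated, false otherwise.
 */
bool list_lru_add(struct list_lru *lru, struct list_head *item);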
2023-12-10  pgtable: rename ptdesc _refcount field to __page_refcount  (Alexander Gordeev, 1 file, -3/+3)
Rename ptdesc _refcount field to __page_refcount similar to the other unused page fields. Link: https://lkml.kernel.org/r/982bdc652ba79a606c3d01c905766e7e076b3315.1700594815.git.agordeev@linux.ibm.com Signed-off-by: Alexander Gordeev <[email protected]> Suggested-by: Vishal Moola <[email protected]> Cc: Gerald Schaefer <[email protected]> Cc: Heiko Carstens <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-12-10  pgtable: fix s390 ptdesc field comments  (Alexander Gordeev, 1 file, -2/+2)
Patch series "minor ptdesc updates", v3. This patch (of 2): Since commit d08d4e7cd6bf ("s390/mm: use full 4KB page for 2KB PTE") there is no fragmented page tracking on s390. Fix the corresponding comments. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/2eead241f3a45bed26c7911cf66bded1e35670b8.1700594815.git.agordeev@linux.ibm.com Signed-off-by: Alexander Gordeev <[email protected]> Suggested-by: Heiko Carstens <[email protected]> Cc: Gerald Schaefer <[email protected]> Cc: Vishal Moola (Oracle) <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-12-10  mm: use vmem_altmap code without CONFIG_ZONE_DEVICE  (Sumanth Korikkar, 2 files, -12/+26)
vmem_altmap_free() and vmem_altmap_offset() could be utilized without CONFIG_ZONE_DEVICE enabled. For example, mm/memory_hotplug.c:__add_pages() relies on that. The altmap is no longer restricted to ZONE_DEVICE handling, but instead depends on CONFIG_SPARSEMEM_VMEMMAP. When CONFIG_SPARSEMEM_VMEMMAP is disabled, these functions are defined as inline stubs, ensuring compatibility with configurations that do not use sparsemem vmemmap. Without it, lkp reported the following: ld: arch/x86/mm/init_64.o: in function `remove_pagetable': init_64.c:(.meminit.text+0xfc7): undefined reference to `vmem_altmap_free' Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sumanth Korikkar <[email protected]> Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/ Reviewed-by: Gerald Schaefer <[email protected]> Acked-by: David Hildenbrand <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Aneesh Kumar K.V <[email protected]> Cc: Anshuman Khandual <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Vasily Gorbik <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-12-10  lib/stackdepot: allow users to evict stack traces  (Andrey Konovalov, 1 file, -0/+14)
Add stack_depot_put, a function that decrements the reference counter on a stack record and removes it from the stack depot once the counter reaches 0. Internally, when removing a stack record, the function unlinks it from the hash table bucket and returns to the freelist. With this change, the users of stack depot can call stack_depot_put when keeping a stack trace in the stack depot is not needed anymore. This allows avoiding polluting the stack depot with irrelevant stack traces and thus have more space to store the relevant ones before the stack depot reaches its capacity. Link: https://lkml.kernel.org/r/1d1ad5692ee43d4fc2b3fd9d221331d30b36123f.1700502145.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Evgenii Stepanov <[email protected]> Cc: Marco Elver <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Vlastimil Babka <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-12-10  lib/stackdepot: add refcount for records  (Andrey Konovalov, 1 file, -3/+10)
Add a reference counter for how many times a stack records has been added to stack depot. Add a new STACK_DEPOT_FLAG_GET flag to stack_depot_save_flags that instructs the stack depot to increment the refcount. Do not yet decrement the refcount; this is implemented in one of the following patches. Do not yet enable any users to use the flag to avoid overflowing the refcount. This is preparatory patch for implementing the eviction of stack records from the stack depot. Link: https://lkml.kernel.org/r/a3fc14a2359d019d2a008d4ff8b46a665371ffee.1700502145.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <[email protected]> Reviewed-by: Alexander Potapenko <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Evgenii Stepanov <[email protected]> Cc: Marco Elver <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Vlastimil Babka <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-12-10  lib/stackdepot, kasan: add flags to __stack_depot_save and rename  (Andrey Konovalov, 1 file, -11/+25)
Change the bool can_alloc argument of __stack_depot_save to a u32 argument that accepts a set of flags. The following patch will add another flag to stack_depot_save_flags besides the existing STACK_DEPOT_FLAG_CAN_ALLOC. Also rename the function to stack_depot_save_flags, as __stack_depot_save is a cryptic name, Link: https://lkml.kernel.org/r/645fa15239621eebbd3a10331e5864b718839512.1700502145.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <[email protected]> Reviewed-by: Alexander Potapenko <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Evgenii Stepanov <[email protected]> Cc: Marco Elver <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Vlastimil Babka <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-12-10  fs: convert error_remove_page to error_remove_folio  (Matthew Wilcox (Oracle), 2 files, -2/+3)
There were already assertions that we were not passing a tail page to error_remove_page(), so make the compiler enforce that by converting everything to pass and use a folio. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Cc: Naoya Horiguchi <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-12-10  gfp: include __GFP_NOWARN in GFP_NOWAIT  (Matthew Wilcox (Oracle), 1 file, -2/+3)
GFP_NOWAIT callers are always prepared for their allocations to fail because they fail so frequently. Forcing the callers to remember to add __GFP_NOWARN is just annoying and leads to an endless stream of patches for the places where we forgot to add it. We can now remove __GFP_NOWARN from all the callers which specify GFP_NOWAIT, but I'd rather wait a cycle and send patches to each maintainer instead of creating a big pile of merge conflicts. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
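A sketch of what the change amounts to in the gfp flag definitions (the exact pre-existing flag set is an assumption here), plus a caller that no longer needs to spell out __GFP_NOWARN.

#include <linux/slab.h>

/*
 * Assumed before/after of the definition:
 *   before: #define GFP_NOWAIT (__GFP_KSWAPD_RECLAIM)
 *   after:  #define GFP_NOWAIT (__GFP_KSWAPD_RECLAIM | __GFP_NOWARN)
 */
static void *try_alloc(size_t size)
{
	/* failure is frequent and expected; no warning splat is wanted */
	return kmalloc(size, GFP_NOWAIT);
}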
2023-12-10  mm: return void from folio_start_writeback() and related functions  (Matthew Wilcox (Oracle), 1 file, -2/+2)
Nobody now checks the return value from any of these functions, so add an assertion at the beginning of the function and return void. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Josef Bacik <[email protected]> Cc: David Howells <[email protected]> Cc: Steve French <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-12-10  mm: remove test_set_page_writeback()  (Matthew Wilcox (Oracle), 1 file, -5/+0)
Patch series "Make folio_start_writeback return void". Most of the folio flag-setting functions return void. folio_start_writeback is gratuitously different; the only two filesystems that do anything with the return value emit debug messages if it's already set, and we can (and should) do that internally without bothering the filesystem to do it. This patch (of 4): There are no more callers of this wrapper. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Cc: David Howells <[email protected]> Cc: Steve French <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-12-10  mm: add folio_fill_tail() and use it in iomap  (Matthew Wilcox (Oracle), 1 file, -0/+38)
The iomap code was limited to PAGE_SIZE bytes; generalise it to cover an arbitrary-sized folio, and move it to be a common helper. [[email protected]: fix folio_fill_tail(), per Andreas Gruenbacher] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Andreas Gruenbacher <[email protected]> Cc: Andreas Dilger <[email protected]> Cc: Darrick J. Wong <[email protected]> Cc: Theodore Ts'o <[email protected]> Cc: Andreas Gruenbacher <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
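A usage sketch for an inline-data read path; the signature is assumed to be roughly folio_fill_tail(folio, offset, src, len), copying len bytes into the folio at offset and zeroing whatever follows, and the helper below is hypothetical rather than taken from iomap.

#include <linux/highmem.h>
#include <linux/pagemap.h>

/* hypothetical filesystem helper, assuming the signature above */
static void example_read_inline(struct folio *folio,
				const char *inline_data, size_t inline_len)
{
	/* copy the stuffed data and zero the rest of the folio */
	folio_fill_tail(folio, 0, inline_data, inline_len);
	folio_mark_uptodate(folio);
}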
2023-12-10  mm: add folio_zero_tail() and use it in ext4  (Matthew Wilcox (Oracle), 1 file, -0/+38)
Patch series "Add folio_zero_tail() and folio_fill_tail()". I'm trying to make it easier for filesystems with tailpacking / stuffing / inline data to use folios. The primary function here is folio_fill_tail(). You give it a pointer to memory where the data currently is, and it takes care of copying it into the folio at that offset. That works for gfs2 & iomap. Then There's Ext4. Rather than gin up some kind of specialist "Here's a two pointers to two blocks of memory" routine, just let it do its current thing, and let it call folio_zero_tail(), which is also called by folio_fill_tail(). Other filesystems can be converted later; these ones seemed like good examples as they're already partly or completely converted to folios. This patch (of 3): Instead of unmapping the folio after copying the data to it, then mapping it again to zero the tail, provide folio_zero_tail() to zero the tail of an already-mapped folio. [[email protected]: fix kerneldoc argument ordering] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Andreas Gruenbacher <[email protected]> Cc: Darrick J. Wong <[email protected]> Cc: Theodore Ts'o <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-12-10  fs/proc/task_mmu: report SOFT_DIRTY bits through the PAGEMAP_SCAN ioctl  (Andrei Vagin, 1 file, -0/+1)
The PAGEMAP_SCAN ioctl returns information regarding page table entries. It is more efficient compared to reading pagemap files. CRIU can start to utilize this ioctl, but it needs info about soft-dirty bits to track memory changes. We are aware of a new method for tracking memory changes implemented in the PAGEMAP_SCAN ioctl. For CRIU, the primary advantage of this method is its usability by unprivileged users. However, it is not feasible to transparently replace the soft-dirty tracker with the new one. The main problem here is userfault descriptors that have to be preserved between pre-dump iterations. It means criu continues supporting the soft-dirty method to avoid breakage for current users. The new method will be implemented as a separate feature. [[email protected]: update tools/include/uapi/linux/fs.h] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Andrei Vagin <[email protected]> Reviewed-by: Muhammad Usama Anjum <[email protected]> Cc: Michał Mirosław <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-12-10  NUMA: optimize detection of memory with no node id assigned by firmware  (Liam Ni, 1 file, -0/+1)
Sanity check that makes sure the nodes cover all memory loops over numa_meminfo to count the pages that have node id assigned by the firmware, then loops again over memblock.memory to find the total amount of memory and in the end checks that the difference between the total memory and memory that covered by nodes is less than some threshold. Worse, the loop over numa_meminfo calls __absent_pages_in_range() that also partially traverses memblock.memory. It's much simpler and more efficient to have a single traversal of memblock.memory that verifies that amount of memory not covered by nodes is less than a threshold. Introduce memblock_validate_numa_coverage() that does exactly that and use it instead of numa_meminfo_cover_memory(). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Liam Ni <[email protected]> Reviewed-by: Mike Rapoport (IBM) <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Bibo Mao <[email protected]> Cc: Binbin Zhou <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Feiyang Chen <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: WANG Xuerui <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
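A sketch of the single-traversal check described above (assumed shape, not the exact implementation): count pages whose memblock region carries no node id and compare against a threshold.

#include <linux/memblock.h>
#include <linux/numa.h>

static bool example_validate_numa_coverage(u64 threshold_bytes)
{
	unsigned long nr_pages = 0;
	unsigned long start_pfn, end_pfn;
	int i, nid;

	/* one pass over memblock.memory instead of looping over numa_meminfo */
	for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid) {
		if (nid == NUMA_NO_NODE)
			nr_pages += end_pfn - start_pfn;
	}

	return ((u64)nr_pages << PAGE_SHIFT) < threshold_bytes;
}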
2023-12-10  fork: use __mt_dup() to duplicate maple tree in dup_mmap()  (Peng Zhang, 1 file, -0/+11)
In dup_mmap(), using __mt_dup() to duplicate the old maple tree and then directly replacing the entries of VMAs in the new maple tree can result in better performance. __mt_dup() uses DFS pre-order to duplicate the maple tree, so it is efficient. The average time complexity of __mt_dup() is O(n), where n is the number of VMAs. The proof of the time complexity is provided in the commit log that introduces __mt_dup(). After duplicating the maple tree, each element is traversed and replaced (ignoring the cases of deletion, which are rare). Since it is only a replacement operation for each element, this process is also O(n). Analyzing the exact time complexity of the previous algorithm is challenging because each insertion can involve appending to a node, pushing data to adjacent nodes, or even splitting nodes. The frequency of each action is difficult to calculate. The worst-case scenario for a single insertion is when the tree undergoes splitting at every level. If we consider each insertion as the worst-case scenario, we can determine that the upper bound of the time complexity is O(n*log(n)), although this is a loose upper bound. However, based on the test data, it appears that the actual time complexity is likely to be O(n). As the entire maple tree is duplicated using __mt_dup(), if dup_mmap() fails, there will be a portion of VMAs that have not been duplicated in the maple tree. To handle this, we mark the failure point with XA_ZERO_ENTRY. In exit_mmap(), if this marker is encountered, stop releasing VMAs that have not been duplicated after this point. There is a "spawn" in byte-unixbench[1], which can be used to test the performance of fork(). I modified it slightly to make it work with different number of VMAs. Below are the test results. The first row shows the number of VMAs. The second and third rows show the number of fork() calls per ten seconds, corresponding to next-20231006 and the this patchset, respectively. The test results were obtained with CPU binding to avoid scheduler load balancing that could cause unstable results. There are still some fluctuations in the test results, but at least they are better than the original performance. 21 121 221 421 821 1621 3221 6421 12821 25621 51221 112100 76261 54227 34035 20195 11112 6017 3161 1606 802 393 114558 83067 65008 45824 28751 16072 8922 4747 2436 1233 599 2.19% 8.92% 19.88% 34.64% 42.37% 44.64% 48.28% 50.17% 51.68% 53.74% 52.42% [1] https://github.com/kdlucas/byte-unixbench/tree/master Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Peng Zhang <[email protected]> Suggested-by: Liam R. Howlett <[email protected]> Reviewed-by: Liam R. Howlett <[email protected]> Cc: Christian Brauner <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Mateusz Guzik <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Michael S. Tsirkin <[email protected]> Cc: Mike Christie <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Suren Baghdasaryan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-12-10  maple_tree: introduce interfaces __mt_dup() and mtree_dup()  (Peng Zhang, 1 file, -0/+3)
Introduce interfaces __mt_dup() and mtree_dup(), which are used to duplicate a maple tree. They duplicate a maple tree in Depth-First Search (DFS) pre-order traversal. It uses memcopy() to copy nodes in the source tree and allocate new child nodes in non-leaf nodes. The new node is exactly the same as the source node except for all the addresses stored in it. It will be faster than traversing all elements in the source tree and inserting them one by one into the new tree. The time complexity of these two functions is O(n). The difference between __mt_dup() and mtree_dup() is that mtree_dup() handles locks internally. Analysis of the average time complexity of this algorithm: For simplicity, let's assume that the maximum branching factor of all non-leaf nodes is 16 (in allocation mode, it is 10), and the tree is a full tree. Under the given conditions, if there is a maple tree with n elements, the number of its leaves is n/16. From bottom to top, the number of nodes in each level is 1/16 of the number of nodes in the level below. So the total number of nodes in the entire tree is given by the sum of n/16 + n/16^2 + n/16^3 + ... + 1. This is a geometric series, and it has log(n) terms with base 16. According to the formula for the sum of a geometric series, the sum of this series can be calculated as (n-1)/15. Each node has only one parent node pointer, which can be considered as an edge. In total, there are (n-1)/15-1 edges. This algorithm consists of two operations: 1. Traversing all nodes in DFS order. 2. For each node, making a copy and performing necessary modifications to create a new node. For the first part, DFS traversal will visit each edge twice. Let T(ascend) represent the cost of taking one step downwards, and T(descend) represent the cost of taking one step upwards. And both of them are constants (although mas_ascend() may not be, as it contains a loop, but here we ignore it and treat it as a constant). So the time spent on the first part can be represented as ((n-1)/15-1) * (T(ascend) + T(descend)). For the second part, each node will be copied, and the cost of copying a node is denoted as T(copy_node). For each non-leaf node, it is necessary to reallocate all child nodes, and the cost of this operation is denoted as T(dup_alloc). The behavior behind memory allocation is complex and not specific to the maple tree operation. Here, we assume that the time required for a single allocation is constant. Since the size of a node is fixed, both of these symbols are also constants. We can calculate that the time spent on the second part is ((n-1)/15) * T(copy_node) + ((n-1)/15 - n/16) * T(dup_alloc). Adding both parts together, the total time spent by the algorithm can be represented as: ((n-1)/15) * (T(ascend) + T(descend) + T(copy_node) + T(dup_alloc)) - n/16 * T(dup_alloc) - (T(ascend) + T(descend)) Let C1 = T(ascend) + T(descend) + T(copy_node) + T(dup_alloc) Let C2 = T(dup_alloc) Let C3 = T(ascend) + T(descend) Finally, the expression can be simplified as: ((16 * C1 - 15 * C2) / (15 * 16)) * n - (C1 / 15 + C3). This is a linear function, so the average time complexity is O(n). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Peng Zhang <[email protected]> Suggested-by: Liam R. Howlett <[email protected]> Cc: Christian Brauner <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Mateusz Guzik <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Michael S. 
Tsirkin <[email protected]> Cc: Mike Christie <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Suren Baghdasaryan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
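The arithmetic in the commit message above, restated compactly (branching factor 16 and a full tree assumed, as in the analysis):

\[
  \underbrace{\frac{n}{16} + \frac{n}{16^{2}} + \dots + 1}_{\text{total nodes}} = \frac{n-1}{15},
  \qquad
  T(n) = \frac{16C_1 - 15C_2}{15 \cdot 16}\,n - \left(\frac{C_1}{15} + C_3\right) = O(n).
\]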
2023-12-10  maple_tree: introduce {mtree,mas}_lock_nested()  (Peng Zhang, 1 file, -0/+4)
In some cases, nested locks may be needed, so {mtree,mas}_lock_nested is introduced. For example, when duplicating maple tree, we need to hold the locks of two trees, in which case nested locks are needed. At the same time, add the definition of spin_lock_nested() in tools for testing. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Peng Zhang <[email protected]> Reviewed-by: Liam R. Howlett <[email protected]> Cc: Christian Brauner <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Mateusz Guzik <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Michael S. Tsirkin <[email protected]> Cc: Mike Christie <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Suren Baghdasaryan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
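A sketch of the idea; the macro body and the lock ordering below are assumptions, not the exact header contents. The _nested variants forward a lockdep subclass so that two maple tree locks of the same lock class can legitimately be held at once, e.g. while duplicating one tree into another.

#include <linux/maple_tree.h>
#include <linux/spinlock.h>

/* assumed shape of the helper; the real macro may differ */
#define example_mtree_lock_nested(mt, subclass) \
	spin_lock_nested(&(mt)->ma_lock, subclass)

static void example_dup_locked(struct maple_tree *new, struct maple_tree *old)
{
	mtree_lock(old);					/* outer lock, subclass 0 */
	example_mtree_lock_nested(new, SINGLE_DEPTH_NESTING);	/* inner lock */
	/* ... duplicate old into new ... */
	mtree_unlock(new);
	mtree_unlock(old);
}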
2023-12-10  mm/vmstat: move pgdemote_* to per-node stats  (Li Zhijian, 2 files, -3/+4)
Demotion will migrate pages across nodes. Previously, only the global demotion statistics were accounted for. Changed them to per-node statistics, making it easier to observe where demotion occurs on each node. This will help to identify which nodes are under pressure. This patch also puts pgdemote_* behind CONFIG_NUMA_BALANCING, since demotion is not available for !CONFIG_NUMA_BALANCING. With this patch, here is a sample where node0 node1 are DRAM, node3 is PMEM: Global stats: $ grep demote /proc/vmstat pgdemote_kswapd 254288 pgdemote_direct 113497 pgdemote_khugepaged 0 Per-node stats: $ grep demote /sys/devices/system/node/node0/vmstat # demotion source pgdemote_kswapd 68454 pgdemote_direct 83431 pgdemote_khugepaged 0 $ grep demote /sys/devices/system/node/node1/vmstat # demotion source pgdemote_kswapd 185834 pgdemote_direct 30066 pgdemote_khugepaged 0 $ grep demote /sys/devices/system/node/node3/vmstat # demotion target pgdemote_kswapd 0 pgdemote_direct 0 pgdemote_khugepaged 0 Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Li Zhijian <[email protected]> Acked-by: "Huang, Ying" <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-12-10  drm/exec: Pass in initial # of objects  (Rob Clark, 1 file, -1/+1)
In cases where the # is known ahead of time, it is silly to do the table resize dance. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Christian König <[email protected]> Patchwork: https://patchwork.freedesktop.org/patch/568338/
2023-12-10  drm/msm: Add param for the highest bank bit  (Connor Abbott, 1 file, -0/+1)
This parameter is programmed by the kernel and influences the tiling layout of images. Exposing it to userspace will allow it to tile/untile images correctly without guessing what value the kernel programmed, and allow us to change it in the future without breaking userspace. Signed-off-by: Connor Abbott <[email protected]> Patchwork: https://patchwork.freedesktop.org/patch/571181/ Signed-off-by: Rob Clark <[email protected]>
2023-12-10  Merge remote-tracking branch 'drm-misc/drm-misc-next' into msm-next  (Rob Clark, 65 files, -804/+2876)
Backmerge drm-misc-next to pick up some dependencies for drm/msm patches, in particular: https://patchwork.freedesktop.org/patch/570219/?series=127251&rev=1 https://patchwork.freedesktop.org/series/123411/ Signed-off-by: Rob Clark <[email protected]>
2023-12-10  dt-bindings: clock: Add Google gs101 clock management unit bindings  (Peter Griffin, 1 file, -0/+392)
Provide dt-schema documentation for Google gs101 SoC clock controller. Currently this adds support for cmu_top, cmu_misc and cmu_apm. Reviewed-by: Sam Protsenko <[email protected]> Signed-off-by: Peter Griffin <[email protected]> Reviewed-by: Krzysztof Kozlowski <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Krzysztof Kozlowski <[email protected]>
2023-12-10  iio: adc: ad9467: fix scale setting  (Nuno Sa, 1 file, -0/+4)
When reading in_voltage_scale we can get something like: root@analog:/sys/bus/iio/devices/iio:device2# cat in_voltage_scale 0.038146 However, when reading the available options: root@analog:/sys/bus/iio/devices/iio:device2# cat in_voltage_scale_available 2000.000000 2100.000006 2200.000007 2300.000008 2400.000009 2500.000010 which does not make sense. Moreover, when trying to set a new scale we get an error because there's no call to __ad9467_get_scale() to give us values as given when reading in_voltage_scale. Fix it by computing the available scales during probe and properly pass the list when .read_available() is called. While at it, change to use .read_available() from iio_info. Also note that to properly fix this, adi-axi-adc.c has to be changed accordingly. Fixes: ad6797120238 ("iio: adc: ad9467: add support AD9467 ADC") Signed-off-by: Nuno Sa <[email protected]> Reviewed-by: David Lechner <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jonathan Cameron <[email protected]>
2023-12-10  init: move THIS_MODULE from <linux/export.h> to <linux/init.h>  (Masahiro Yamada, 2 files, -18/+7)
Commit f50169324df4 ("module.h: split out the EXPORT_SYMBOL into export.h") appropriately separated EXPORT_SYMBOL into <linux/export.h> because modules and EXPORT_SYMBOL are orthogonal; modules are symbol consumers, while EXPORT_SYMBOL are used by symbol providers, which may not be necessarily a module. However, that commit also relocated THIS_MODULE. As explained in the commit description, the intention was to define THIS_MODULE in a lightweight header, but I do not believe <linux/export.h> was the best location because EXPORT_SYMBOL and THIS_MODULE are unrelated. Move it to another lightweight header, <linux/init.h>. The reason for choosing <linux/init.h> is to make <linux/moduleparam.h> self-contained without relying on <linux/linkage.h> incorrectly including <linux/export.h>. With this adjustment, the role of <linux/export.h> becomes clearer as it only defines EXPORT_SYMBOL. Signed-off-by: Masahiro Yamada <[email protected]> Reviewed-by: Luis Chamberlain <[email protected]>
2023-12-08  ipv6: do not check fib6_has_expires() in fib6_info_release()  (Eric Dumazet, 1 file, -1/+0)
My prior patch went a bit too far, because apparently fib6_has_expires() could be true while f6i->gc_link is not hashed yet. fib6_set_expires_locked() can indeed set RTF_EXPIRES while f6i->fib6_table is NULL. Original syzbot reports were about corruptions caused by dangling f6i->gc_link. Fixes: 5a08d0065a91 ("ipv6: add debug checks in fib6_info_release()") Reported-by: [email protected] Signed-off-by: Eric Dumazet <[email protected]> Cc: Kui-Feng Lee <[email protected]> Reviewed-by: David Ahern <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-12-09  GPIO descriptor cleanup for some Wolfson codecs  (Mark Brown, 5 files, -39/+0)
Merge series from Linus Walleij <[email protected]>: This converts the remaining Wolfson ASoC codecs to use GPIO descriptors. These Wolfson codecs are mostly used with different Samsung S3C (especially Cragganmore 6410) board files, so the in-tree users are fixed up in the process.
2023-12-08  bpf: Add some comments to stack representation  (Andrei Matei, 1 file, -0/+14)
Add comments to the datastructure tracking the stack state, as the mapping between each stack slot and where its state is stored is not entirely obvious. Signed-off-by: Andrei Matei <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Acked-by: Eduard Zingerman <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
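For reference, the offset-to-slot mapping those comments describe, written out as a stand-alone helper; this follows the convention used throughout kernel/bpf/verifier.c, with the constant reproduced here purely for illustration.

#define BPF_REG_SIZE 8	/* each stack slot mirrors one 8-byte register */

/*
 * off is the (negative) offset from the frame pointer R10; the returned
 * index ("spi") selects the per-slot state entry that tracks that slot.
 */
static int stack_slot_index(int off)
{
	int slot = -off - 1;		/* byte distance from the top of the stack */

	return slot / BPF_REG_SIZE;
}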
2023-12-08  KVM: clean up directives to compile out irqfds  (Paolo Bonzini, 1 file, -0/+2)
Keep all #ifdef CONFIG_HAVE_KVM_IRQCHIP parts of eventfd.c together, and compile out the irqfds field of struct kvm if the symbol is not defined. Signed-off-by: Paolo Bonzini <[email protected]>
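An illustrative shape of the result; member names are abbreviated and the struct is renamed here, since only the config guard around the irqfds field is the point.

#include <linux/list.h>
#include <linux/spinlock.h>

struct example_kvm {
	/* ... */
#ifdef CONFIG_HAVE_KVM_IRQCHIP
	struct {
		spinlock_t        lock;
		struct list_head  items;
		/* resampler list, cleanup work, ... */
	} irqfds;
#endif
	/* ... */
};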
2023-12-08  KVM: remove deprecated UAPIs  (Paolo Bonzini, 1 file, -90/+0)
The deprecated interfaces were removed 15 years ago. KVM's device assignment was deprecated in 4.2 and removed 6.5 years ago; the only interest might be in compiling ancient versions of QEMU, but QEMU has been using its own imported copy of the kernel headers since June 2011. So again we go into archaeology territory; just remove the cruft. Signed-off-by: Paolo Bonzini <[email protected]>