aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2023-02-13ionic: remove unnecessary indirectionShannon Nelson1-2/+2
We have the pointer already, don't need to go through the lif struct for it. Signed-off-by: Shannon Nelson <[email protected]> Reviewed-by: Leon Romanovsky <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-02-13bnxt_en: Fix mqprio and XDP ring checking logicMichael Chan1-2/+6
In bnxt_reserve_rings(), there is logic to check that the number of TX rings reserved is enough to cover all the mqprio TCs, but it fails to account for the TX XDP rings. So the check will always fail if there are mqprio TCs and TX XDP rings. As a result, the driver always fails to initialize after the XDP program is attached and the device will be brought down. A subsequent ifconfig up will also fail because the number of TX rings is set to an inconsistent number. Fix the check to properly account for TX XDP rings. If the check fails, set the number of TX rings back to a consistent number after calling netdev_reset_tc(). Fixes: 674f50a5b026 ("bnxt_en: Implement new method to reserve rings.") Reviewed-by: Hongguang Gao <[email protected]> Signed-off-by: Michael Chan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-02-13Merge branch 'net-ipa-GSI-regs'David S. Miller7-281/+903
Alex Elder says: ==================== net: ipa: determine GSI register offsets differently This series changes the way GSI register offset are specified, using the "reg" mechanism currently used for IPA registers. A follow-on series will extend this work so fields within GSI registers are also specified this way. The first patch rearranges the GSI register initialization code so it is similar to the way it's done for the IPA registers. The second identifies all the GSI registers in an enumerated type. The third introduces "gsi_reg-v3.1.c" and uses the "reg" code to define one GSI register offset. The second-to-last patch just adds "gsi_reg-v3.5.1.c", because that version introduces a new register not previously defined. All the rest just define the rest of the GSI register offsets using the "reg" mechanism. Note that, to have continued lines align with an open parenthesis, new files created in this series cause some checkpatch warnings. ==================== Signed-off-by: David S. Miller <[email protected]>
2023-02-13net: ipa: define IPA remaining GSI register offsetsAlex Elder5-27/+76
Add the remaining GSI register offset definitions. Use gsi_reg() rather than the corresponding GSI_*_OFFSET() macros to get the offsets for these registers, and get rid of the macros. Note that we are now defining information for the HW_PARAM_2 register, and that doesn't appear until IPA v3.5.1. Signed-off-by: Alex Elder <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-02-13net: ipa: add "gsi_v3.5.1.c"Alex Elder3-1/+185
The next patch adds a GSI register field that is only valid starting at IPA v3.5.1. Create "gsi_v3.5.1.c" from "gsi_v3.1.c", changing only the name of the public regs structure it defines. Signed-off-by: Alex Elder <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-02-13net: ipa: define IPA v3.1 GSI interrupt register offsetsAlex Elder3-102/+195
Add definitions of the offsets for IRQ-related GSI registers. Use gsi_reg() rather than the corresponding GSI_CNTXT_*_OFFSET() macros to get the offsets for these registers, and get rid of the macros. Signed-off-by: Alex Elder <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-02-13net: ipa: define IPA v3.1 GSI event ring register offsetsAlex Elder3-53/+90
Add definitions of the offsets and strides for registers whose offset depends on an event ring ID, and use gsi_reg() and its returned value to determine offsets for these registers. Get rid of the corresponding GSI_EV_CH_E_*_OFFSET() macros. Signed-off-by: Alex Elder <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-02-13net: ipa: add more GSI register definitionsAlex Elder3-38/+62
Continue populating with GSI register definitions, adding remaining registers whose offset depends on a channel ID. Use gsi_reg() and reg_n_offset() to determine offsets for those registers, and get rid of the corresponding GSI_CH_C_*_OFFSET() macros. Signed-off-by: Alex Elder <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-02-13net: ipa: start creating GSI register definitionsAlex Elder6-6/+75
Create a new register definition file in the "reg" subdirectory, and begin populating it with GSI register definitions based on IPA version. The GSI registers haven't changed much, so several IPA versions can share the same GSI register definitions. As with IPA registers, an array of pointers indexed by GSI register ID refers to these register definitions, and a new "regs" field in the GSI structure is initialized in gsi_reg_init() to refer to register information based on the IPA version (though for now there's only one). The new function gsi_reg() returns register information for a given GSI register, and the result can be used to look up that register's offset. This patch is meant only to put the infrastructure in place, so only eon register (CH_C_QOS) is defined for each version, and only the offset and stride are defined for that register. Use new function gsi_reg() to look up that register's information to get its offset, This makes the GSI_CH_C_QOS_OFFSET() unnecessary, so get rid of it. Signed-off-by: Alex Elder <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-02-13net: ipa: introduce GSI register IDsAlex Elder2-0/+122
Create a new gsi_reg_id enumerated type, which identifies each GSI register with a symbolic identifier. Create a function that indicates whether a register ID is valid. Signed-off-by: Alex Elder <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-02-13net: ipa: introduce gsi_reg_init()Alex Elder4-59/+103
Create a new source file "gsi_reg.c", and in it, introduce a new function to encapsulate initializing GSI registers, including looking up and I/O mapping their memory. Create gsi_reg_exit() as the inverse of the init function. Signed-off-by: Alex Elder <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-02-13net: Fix unwanted sign extension in netdev_stats_to_stats64()Felix Riemann1-1/+1
When converting net_device_stats to rtnl_link_stats64 sign extension is triggered on ILP32 machines as 6c1c509778 changed the previous "ulong -> u64" conversion to "long -> u64" by accessing the net_device_stats fields through a (signed) atomic_long_t. This causes for example the received bytes counter to jump to 16EiB after having received 2^31 bytes. Casting the atomic value to "unsigned long" beforehand converting it into u64 avoids this. Fixes: 6c1c5097781f ("net: add atomic_long_t to net_device_stats fields") Signed-off-by: Felix Riemann <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-02-13net/sched: fix error recovery in qdisc_create()Eric Dumazet1-7/+8
If TCA_STAB attribute is malformed, qdisc_get_stab() returns an error, and we end up calling ops->destroy() while ops->init() has not been called yet. While we are at it, call qdisc_put_stab() after ops->destroy(). Fixes: 1f62879e3632 ("net/sched: make stab available before ops->init() call") Reported-by: [email protected] Signed-off-by: Eric Dumazet <[email protected]> Cc: Vladimir Oltean <[email protected]> Cc: Kurt Kanzenbach <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-02-13net: micrel: Add PHC support for lan8841Horatiu Vultur1-24/+599
Add support for PHC and timestamping operations for the lan8841 PHY. PTP 1-step and 2-step modes are supported, over Ethernet and UDP both ipv4 and ipv6. Signed-off-by: Horatiu Vultur <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-02-13Merge branch 'devlink-params-cleanup'David S. Miller5-66/+91
Jiri Pirko says: ==================== devlink: params cleanups and devl_param_driverinit_value_get() fix The primary motivation of this patchset is the patch #6, which fixes an issue introduced by 075935f0ae0f ("devlink: protect devlink param list by instance lock") and reported by Kim Phillips <[email protected]> (https://lore.kernel.org/netdev/[email protected]/) and my colleagues doing mlx5 driver regression testing. The basis idea is that devl_param_driverinit_value_get() could be possible to the called without holding devlink intance lock in most of the cases (all existing ones in the current codebase), which would fix some mlx5 flows where the lock is not held. To achieve that, make sure that the param value does not change between reloads with patch #2. Also, convert the param list to xarray which removes the worry about list_head consistency when doing lockless lookup. The rest of the patches are doing some small related cleanup of things that poke me in the eye during the work. --- v1->v2: - a small bug was fixed in patch #2, the rest of the code stays the same so I left review/ack tags attached to them ==================== Signed-off-by: David S. Miller <[email protected]>
2023-02-13devlink: add forgotten devlink instance lock assertion to ↵Jiri Pirko1-0/+2
devl_param_driverinit_value_set() Driver calling devl_param_driverinit_value_set() has to hold devlink instance lock while doing that. Put an assertion there. Signed-off-by: Jiri Pirko <[email protected]> Reviewed-by: Simon Horman <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Reviewed-by: Jacob Keller <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-02-13devlink: allow to call devl_param_driverinit_value_get() without holding ↵Jiri Pirko1-2/+11
instance lock If the driver maintains following basic sane behavior, the devl_param_driverinit_value_get() function could be called without holding instance lock: 1) Driver ensures a call to devl_param_driverinit_value_get() cannot race with registering/unregistering the parameter with the same parameter ID. 2) Driver ensures a call to devl_param_driverinit_value_get() cannot race with devl_param_driverinit_value_set() call with the same parameter ID. 3) Driver ensures a call to devl_param_driverinit_value_get() cannot race with reload operation. By the nature of params usage, these requirements should be trivially achievable. If the driver for some off reason is not able to comply, it has to take the devlink->lock while calling devl_param_driverinit_value_get(). Remove the lock assertion and add comment describing the locking requirements. This fixes a splat in mlx5 driver introduced by the commit referenced in the "Fixes" tag. Lore: https://lore.kernel.org/netdev/[email protected]/ Reported-by: Kim Phillips <[email protected]> Fixes: 075935f0ae0f ("devlink: protect devlink param list by instance lock") Signed-off-by: Jiri Pirko <[email protected]> Reviewed-by: Simon Horman <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Reviewed-by: Jacob Keller <[email protected]> Tested-by: Kim Phillips <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-02-13devlink: convert param list to xarrayJiri Pirko3-39/+39
Loose the linked list for params and use xarray instead. Note that this is required to be eventually possible to call devl_param_driverinit_value_get() without holding instance lock. Signed-off-by: Jiri Pirko <[email protected]> Reviewed-by: Simon Horman <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Reviewed-by: Jacob Keller <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-02-13devlink: use xa_for_each_start() helper in devlink_nl_cmd_port_get_dump_one()Jiri Pirko1-8/+2
As xarray has an iterator helper that allows to start from specified index, use this directly and avoid repeated iteration from 0. Signed-off-by: Jiri Pirko <[email protected]> Reviewed-by: Simon Horman <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Reviewed-by: Jacob Keller <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-02-13devlink: fix the name of value arg of devl_param_driverinit_value_get()Jiri Pirko2-4/+5
Probably due to copy-paste error, the name of the arg is "init_val" which is misleading, as the pointer is used to point to struct where to store the current value. Rename it to "val" and change the arg comment a bit on the way. Signed-off-by: Jiri Pirko <[email protected]> Reviewed-by: Simon Horman <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Reviewed-by: Jacob Keller <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-02-13devlink: make sure driver does not read updated driverinit param before reloadJiri Pirko4-4/+32
The driverinit param purpose is to serve the driver during init/reload time to provide a value, either default or set by user. Make sure that driver does not read value updated by user before the reload is performed. Hold the new value in a separate struct and switch it during reload. Note that this is required to be eventually possible to call devl_param_driverinit_value_get() without holding instance lock. Signed-off-by: Jiri Pirko <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-02-13devlink: don't use strcpy() to copy param valueJiri Pirko1-12/+3
No need to treat string params any different comparing to other types. Rely on the struct assign to copy the whole struct, including the string. Signed-off-by: Jiri Pirko <[email protected]> Reviewed-by: Simon Horman <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Reviewed-by: Jacob Keller <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-02-13nfp: ethtool: supplement nfp link modes supportedYu Xiao2-0/+15
Add support for the following modes to the nfp driver: NFP_MEDIA_10GBASE_LR NFP_MEDIA_25GBASE_LR NFP_MEDIA_25GBASE_ER These modes are supported by the hardware and, support for them was recently added to firmware. Signed-off-by: Yu Xiao <[email protected]> Signed-off-by: Simon Horman <[email protected]> Reviewed-by: Leon Romanovsky <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-02-13net/usb: kalmia: Don't pass act_len in usb_bulk_msg error pathMiko Larsson1-4/+4
syzbot reported that act_len in kalmia_send_init_packet() is uninitialized when passing it to the first usb_bulk_msg error path. Jiri Pirko noted that it's pointless to pass it in the error path, and that the value that would be printed in the second error path would be the value of act_len from the first call to usb_bulk_msg.[1] With this in mind, let's just not pass act_len to the usb_bulk_msg error paths. 1: https://lore.kernel.org/lkml/Y9pY61y1nwTuzMOa@nanopsycho/ Fixes: d40261236e8e ("net/usb: Add Samsung Kalmia driver for Samsung GT-B3730") Reported-and-tested-by: [email protected] Signed-off-by: Miko Larsson <[email protected]> Reviewed-by: Alexander Duyck <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-02-13net: openvswitch: fix possible memory leak in ovs_meter_cmd_set()Hangyu Hua1-1/+3
old_meter needs to be free after it is detached regardless of whether the new meter is successfully attached. Fixes: c7c4c44c9a95 ("net: openvswitch: expand the meters supported number") Signed-off-by: Hangyu Hua <[email protected]> Acked-by: Eelco Chaudron <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-02-13devlink: stop using NL_SET_ERR_MSG_MODJacob Keller2-70/+65
NL_SET_ERR_MSG_MOD inserts the KBUILD_MODNAME and a ':' before the actual extended error message. The devlink feature hasn't been able to be compiled as a module since commit f4b6bcc7002f ("net: devlink: turn devlink into a built-in"). Stop using NL_SET_ERR_MSG_MOD, and just use the base NL_SET_ERR_MSG. This aligns the extended error messages better with the NL_SET_ERR_MSG_ATTR messages as well. Signed-off-by: Jacob Keller <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-02-13rds: rds_rm_zerocopy_callback() correct order for list_add_tail()Pietro Borrello1-1/+1
rds_rm_zerocopy_callback() uses list_add_tail() with swapped arguments. This links the list head with the new entry, losing the references to the remaining part of the list. Fixes: 9426bbc6de99 ("rds: use list structure to track information for zerocopy completion notification") Suggested-by: Paolo Abeni <[email protected]> Signed-off-by: Pietro Borrello <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-02-13af_key: Fix heap information leakHyunwoo Kim1-1/+1
Since x->encap of pfkey_msg2xfrm_state() is not initialized to 0, kernel heap data can be leaked. Fix with kzalloc() to prevent this. Signed-off-by: Hyunwoo Kim <[email protected]> Acked-by: Herbert Xu <[email protected]> Reviewed-by: Sabrina Dubroca <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-02-13Merge tag 'iwlwifi-next-for-kalle-2023-01-30' of ↵Kalle Valo17-67/+314
http://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-next iwlwifi updates towards v6.3, this patch-set contains: * EHT rate reporting * Sniffer support for EHT and a few fixes in the related code * A few general cleanups and small bugfixes * Bump FW API to 74 for AX devices * Fix a compilation error in mei (it still depends on BROKEN) * STEP equalizer support - transfer some Phy related parameters from the BIOS to the firmware
2023-02-13nvme-pci: add bogus ID quirk for ADATA SX6000PNPDaniel Wagner1-0/+2
Yet another device which needs a quirk: nvme nvme1: globally duplicate IDs for nsid 1 nvme nvme1: VID:DID 10ec:5763 model:ADATA SX6000PNP firmware:V9002s94 Link: http://bugzilla.opensuse.org/show_bug.cgi?id=1207827 Reported-by: Gustavo Freitas <[email protected]> Signed-off-by: Daniel Wagner <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2023-02-12tracing: Make trace_define_field_ext() staticSteven Rostedt (Google)1-1/+1
trace_define_field_ext() is not used outside of trace_events.c, it should be static. Link: https://lore.kernel.org/oe-kbuild-all/[email protected]/ Fixes: b6c7abd1c28a ("tracing: Fix TASK_COMM_LEN in trace event format file") Reported-by: Reported-by: kernel test robot <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
2023-02-12Linux 6.2-rc8Linus Torvalds1-1/+1
2023-02-12MAINTAINERS: Add myself as maintainer for arch/sh (SUPERH)John Paul Adrian Glaubitz1-0/+1
Both Rich Felker and Yoshinori Sato haven't done any work on arch/sh for a while. As I have been maintaining Debian's sh4 port since 2014, I am interested to keep the architecture alive. Signed-off-by: John Paul Adrian Glaubitz <[email protected]> Acked-by: Yoshinori Sato <[email protected]> Acked-by: Geert Uytterhoeven <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2023-02-12Merge tag 'trace-v6.2-rc7' of ↵Linus Torvalds5-11/+36
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fix from Steven Rostedt: "Fix showing of TASK_COMM_LEN instead of its value The TASK_COMM_LEN was converted from a macro into an enum so that BTF would have access to it. But this unfortunately caused TASK_COMM_LEN to display in the format fields of trace events, as they are created by the TRACE_EVENT() macro and such, macros convert to their values, where as enums do not. To handle this, instead of using the field itself to be display, save the value of the array size as another field in the trace_event_fields structure, and use that instead. Not only does this fix the issue, but also converts the other trace events that have this same problem (but were not breaking tooling). With this change, the original work around b3bc8547d3be6 ("tracing: Have TRACE_DEFINE_ENUM affect trace event types as well") could be reverted (but that should be done in the merge window)" * tag 'trace-v6.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Fix TASK_COMM_LEN in trace event format file
2023-02-12Merge tag 'for-6.2-rc7-tag' of ↵Linus Torvalds4-9/+34
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - one more fix for a tree-log 'write time corruption' report, update the last dir index directly and don't keep in the log context - do VFS-level inode lock around FIEMAP to prevent a deadlock with concurrent fsync, the extent-level lock is not sufficient - don't cache a single-device filesystem device to avoid cases when a loop device is reformatted and the entry gets stale * tag 'for-6.2-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: free device in btrfs_close_devices for a single device filesystem btrfs: lock the inode in shared mode before starting fiemap btrfs: simplify update of last_dir_index_offset when logging a directory
2023-02-12Merge tag 'usb-6.2-rc8' of ↵Linus Torvalds3-4/+11
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are 2 small USB driver fixes that resolve some reported regressions and one new device quirk. Specifically these are: - new quirk for Alcor Link AK9563 smartcard reader - revert of u_ether gadget change in 6.2-rc1 that caused problems - typec pin probe fix All of these have been in linux-next with no reported problems" * tag 'usb-6.2-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: usb: core: add quirk for Alcor Link AK9563 smartcard reader usb: typec: altmodes/displayport: Fix probe pin assign check Revert "usb: gadget: u_ether: Do not make UDC parent of the net device"
2023-02-12Merge tag 'efi-fixes-for-v6.2-4' of ↵Linus Torvalds1-3/+6
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI fix from Ard Biesheuvel: "A fix from Darren to widen the SMBIOS match for detecting Ampere Altra machines with problematic firmware. In the mean time, we are working on a more precise check, but this is still work in progress" * tag 'efi-fixes-for-v6.2-4' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: arm64: efi: Force the use of SetVirtualAddressMap() on eMAG and Altra Max machines
2023-02-12Merge tag 'powerpc-6.2-5' of ↵Linus Torvalds3-3/+5
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix interrupt exit race with security mitigation switching. - Don't select ARCH_WANTS_NO_INSTR until warnings are fixed. - Build fix for CONFIG_NUMA=n. Thanks to Nicholas Piggin, Randy Dunlap, and Sachin Sant. * tag 'powerpc-6.2-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/64s/interrupt: Fix interrupt exit race with security mitigation switch powerpc/kexec_file: fix implicit decl error powerpc: Don't select ARCH_WANTS_NO_INSTR
2023-02-12Fix page corruption caused by racy check in __free_pagesDavid Chen1-1/+4
When we upgraded our kernel, we started seeing some page corruption like the following consistently: BUG: Bad page state in process ganesha.nfsd pfn:1304ca page:0000000022261c55 refcount:0 mapcount:-128 mapping:0000000000000000 index:0x0 pfn:0x1304ca flags: 0x17ffffc0000000() raw: 0017ffffc0000000 ffff8a513ffd4c98 ffffeee24b35ec08 0000000000000000 raw: 0000000000000000 0000000000000001 00000000ffffff7f 0000000000000000 page dumped because: nonzero mapcount CPU: 0 PID: 15567 Comm: ganesha.nfsd Kdump: loaded Tainted: P B O 5.10.158-1.nutanix.20221209.el7.x86_64 #1 Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/05/2016 Call Trace: dump_stack+0x74/0x96 bad_page.cold+0x63/0x94 check_new_page_bad+0x6d/0x80 rmqueue+0x46e/0x970 get_page_from_freelist+0xcb/0x3f0 ? _cond_resched+0x19/0x40 __alloc_pages_nodemask+0x164/0x300 alloc_pages_current+0x87/0xf0 skb_page_frag_refill+0x84/0x110 ... Sometimes, it would also show up as corruption in the free list pointer and cause crashes. After bisecting the issue, we found the issue started from commit e320d3012d25 ("mm/page_alloc.c: fix freeing non-compound pages"): if (put_page_testzero(page)) free_the_page(page, order); else if (!PageHead(page)) while (order-- > 0) free_the_page(page + (1 << order), order); So the problem is the check PageHead is racy because at this point we already dropped our reference to the page. So even if we came in with compound page, the page can already be freed and PageHead can return false and we will end up freeing all the tail pages causing double free. Fixes: e320d3012d25 ("mm/page_alloc.c: fix freeing non-compound pages") Link: https://lore.kernel.org/lkml/BYAPR02MB448855960A9656EEA81141FC94D99@BYAPR02MB4488.namprd02.prod.outlook.com/ Cc: Andrew Morton <[email protected]> Cc: [email protected] Signed-off-by: Chunwei Chen <[email protected]> Reviewed-by: Vlastimil Babka <[email protected]> Reviewed-by: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2023-02-12tracing: Fix TASK_COMM_LEN in trace event format fileYafang Shao5-11/+36
After commit 3087c61ed2c4 ("tools/testing/selftests/bpf: replace open-coded 16 with TASK_COMM_LEN"), the content of the format file under /sys/kernel/tracing/events/task/task_newtask was changed from field:char comm[16]; offset:12; size:16; signed:0; to field:char comm[TASK_COMM_LEN]; offset:12; size:16; signed:0; John reported that this change breaks older versions of perfetto. Then Mathieu pointed out that this behavioral change was caused by the use of __stringify(_len), which happens to work on macros, but not on enum labels. And he also gave the suggestion on how to fix it: :One possible solution to make this more robust would be to extend :struct trace_event_fields with one more field that indicates the length :of an array as an actual integer, without storing it in its stringified :form in the type, and do the formatting in f_show where it belongs. The result as follows after this change, $ cat /sys/kernel/tracing/events/task/task_newtask/format field:char comm[16]; offset:12; size:16; signed:0; Link: https://lore.kernel.org/lkml/[email protected]/ Link: https://lore.kernel.org/linux-trace-kernel/[email protected]/ Link: https://lore.kernel.org/linux-trace-kernel/[email protected] Cc: [email protected] Cc: Alexei Starovoitov <[email protected]> Cc: Kajetan Puchalski <[email protected]> CC: Qais Yousef <[email protected]> Fixes: 3087c61ed2c4 ("tools/testing/selftests/bpf: replace open-coded 16 with TASK_COMM_LEN") Reported-by: John Stultz <[email protected]> Debugged-by: Mathieu Desnoyers <[email protected]> Suggested-by: Mathieu Desnoyers <[email protected]> Suggested-by: Steven Rostedt <[email protected]> Signed-off-by: Yafang Shao <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
2023-02-11Merge tag 'spi-fix-v6.2-rc7' of ↵Linus Torvalds2-7/+17
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "A couple of hopefully final fixes for spi: one driver specific fix for an issue with very large transfers and a fix for an issue with the locking fixes in spidev merged earlier this release cycle which was missed" * tag 'spi-fix-v6.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: spidev: fix a recursive locking error spi: dw: Fix wrong FIFO level setting for long xfers
2023-02-11Merge tag 'x86-urgent-2023-02-11' of ↵Linus Torvalds2-1/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Fix a kprobes bug, plus add a new Intel model number to the upstream <asm/intel-family.h> header for drivers to use" * tag 'x86-urgent-2023-02-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpu: Add Lunar Lake M x86/kprobes: Fix 1 byte conditional jump target
2023-02-11Merge tag 'locking-urgent-2023-02-11' of ↵Linus Torvalds1-2/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fix from Ingo Molnar: "Fix an rtmutex missed-wakeup bug" * tag 'locking-urgent-2023-02-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: rtmutex: Ensure that the top waiter is always woken up
2023-02-11Merge tag 'cxl-fixes-6.2' of ↵Linus Torvalds1-5/+7
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull cxl fixes from Dan Williams: "Two fixups for CXL (Compute Express Link) in presence of passthrough decoders. This primarily helps developers using the QEMU CXL emulation, but with the impending arrival of CXL switches these types of topologies will be of interest to end users. - Fix a crash when shutting down regions in the presence of passthrough decoders - Fix region creation to understand passthrough decoders instead of the narrower definition of passthrough ports" * tag 'cxl-fixes-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: cxl/region: Fix passthrough-decoder detection cxl/region: Fix null pointer dereference for resetting decoder
2023-02-11Merge tag 'libnvdimm-fixes-6.2' of ↵Linus Torvalds5-18/+49
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm fixes from Dan Williams: "A fix for an issue that could causes users to inadvertantly reserve too much capacity when debugging the KMSAN and persistent memory namespace, a lockdep fix, and a kernel-doc build warning: - Resolve the conflict between KMSAN and NVDIMM with respect to reserving pmem namespace / volume capacity for larger sizeof(struct page) - Fix a lockdep warning in the the NFIT code - Fix a kernel-doc build warning" * tag 'libnvdimm-fixes-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: nvdimm: Support sizeof(struct page) > MAX_STRUCT_PAGE_SIZE ACPI: NFIT: fix a potential deadlock during NFIT teardown dax: super.c: fix kernel-doc bad line warning
2023-02-11Merge tag 'fixes-2023-02-11' of ↵Linus Torvalds2-11/+1
git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock Pull memblock revert from Mike Rapoport: "Revert 'mm: Always release pages to the buddy allocator in memblock_free_late()' The pages being freed by memblock_free_late() have already been initialized, but if they are in the deferred init range, __free_one_page() might access nearby uninitialized pages when trying to coalesce buddies, which will cause a crash. A proper fix will be more involved so revert this change for the time being" * tag 'fixes-2023-02-11' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock: Revert "mm: Always release pages to the buddy allocator in memblock_free_late()."
2023-02-11nfsd: don't destroy global nfs4_file table in per-net shutdownJeff Layton1-1/+1
The nfs4_file table is global, so shutting it down when a containerized nfsd is shut down is wrong and can lead to double-frees. Tear down the nfs4_file_rhltable in nfs4_state_shutdown instead of nfs4_state_shutdown_net. Fixes: d47b295e8d76 ("NFSD: Use rhashtable for managing nfs4_file objects") Link: https://bugzilla.redhat.com/show_bug.cgi?id=2169017 Reported-by: JianHong Yin <[email protected]> Signed-off-by: Jeff Layton <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2023-02-10Merge branch 'sk-sk_forward_alloc-fixes'Jakub Kicinski5-13/+19
Kuniyuki Iwashima says: ==================== sk->sk_forward_alloc fixes. The first patch fixes a negative sk_forward_alloc by adding sk_rmem_schedule() before skb_set_owner_r(), and second patch removes an unnecessary WARN_ON_ONCE(). v2: https://lore.kernel.org/netdev/[email protected]/ v1: https://lore.kernel.org/netdev/[email protected]/ ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-02-10net: Remove WARN_ON_ONCE(sk->sk_forward_alloc) from sk_stream_kill_queues().Kuniyuki Iwashima2-1/+1
Christoph Paasch reported that commit b5fc29233d28 ("inet6: Remove inet6_destroy_sock() in sk->sk_prot->destroy().") started triggering WARN_ON_ONCE(sk->sk_forward_alloc) in sk_stream_kill_queues(). [0 - 2] Also, we can reproduce it by a program in [3]. In the commit, we delay freeing ipv6_pinfo.pktoptions from sk->destroy() to sk->sk_destruct(), so sk->sk_forward_alloc is no longer zero in inet_csk_destroy_sock(). The same check has been in inet_sock_destruct() from at least v2.6, we can just remove the WARN_ON_ONCE(). However, among the users of sk_stream_kill_queues(), only CAIF is not calling inet_sock_destruct(). Thus, we add the same WARN_ON_ONCE() to caif_sock_destructor(). [0]: https://lore.kernel.org/netdev/[email protected]/ [1]: https://github.com/multipath-tcp/mptcp_net-next/issues/341 [2]: WARNING: CPU: 0 PID: 3232 at net/core/stream.c:212 sk_stream_kill_queues+0x2f9/0x3e0 Modules linked in: CPU: 0 PID: 3232 Comm: syz-executor.0 Not tainted 6.2.0-rc5ab24eb4698afbe147b424149c529e2a43ec24eb5 #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:sk_stream_kill_queues+0x2f9/0x3e0 Code: 03 0f b6 04 02 84 c0 74 08 3c 03 0f 8e ec 00 00 00 8b ab 08 01 00 00 e9 60 ff ff ff e8 d0 5f b6 fe 0f 0b eb 97 e8 c7 5f b6 fe <0f> 0b eb a0 e8 be 5f b6 fe 0f 0b e9 6a fe ff ff e8 02 07 e3 fe e9 RSP: 0018:ffff88810570fc68 EFLAGS: 00010293 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: ffff888101f38f40 RSI: ffffffff8285e529 RDI: 0000000000000005 RBP: 0000000000000ce0 R08: 0000000000000005 R09: 0000000000000000 R10: 0000000000000ce0 R11: 0000000000000001 R12: ffff8881009e9488 R13: ffffffff84af2cc0 R14: 0000000000000000 R15: ffff8881009e9458 FS: 00007f7fdfbd5800(0000) GS:ffff88811b600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b32923000 CR3: 00000001062fc006 CR4: 0000000000170ef0 Call Trace: <TASK> inet_csk_destroy_sock+0x1a1/0x320 __tcp_close+0xab6/0xe90 tcp_close+0x30/0xc0 inet_release+0xe9/0x1f0 inet6_release+0x4c/0x70 __sock_release+0xd2/0x280 sock_close+0x15/0x20 __fput+0x252/0xa20 task_work_run+0x169/0x250 exit_to_user_mode_prepare+0x113/0x120 syscall_exit_to_user_mode+0x1d/0x40 do_syscall_64+0x48/0x90 entry_SYSCALL_64_after_hwframe+0x72/0xdc RIP: 0033:0x7f7fdf7ae28d Code: c1 20 00 00 75 10 b8 03 00 00 00 0f 05 48 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 ee fb ff ff 48 89 04 24 b8 03 00 00 00 0f 05 <48> 8b 3c 24 48 89 c2 e8 37 fc ff ff 48 89 d0 48 83 c4 08 48 3d 01 RSP: 002b:00000000007dfbb0 EFLAGS: 00000293 ORIG_RAX: 0000000000000003 RAX: 0000000000000000 RBX: 0000000000000004 RCX: 00007f7fdf7ae28d RDX: 0000000000000000 RSI: ffffffffffffffff RDI: 0000000000000003 RBP: 0000000000000000 R08: 000000007f338e0f R09: 0000000000000e0f R10: 000000007f338e13 R11: 0000000000000293 R12: 00007f7fdefff000 R13: 00007f7fdefffcd8 R14: 00007f7fdefffce0 R15: 00007f7fdefffcd8 </TASK> [3]: https://lore.kernel.org/netdev/[email protected]/ Fixes: b5fc29233d28 ("inet6: Remove inet6_destroy_sock() in sk->sk_prot->destroy().") Reported-by: syzbot <[email protected]> Reported-by: Christoph Paasch <[email protected]> Signed-off-by: Kuniyuki Iwashima <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-02-10dccp/tcp: Avoid negative sk_forward_alloc by ipv6_pinfo.pktoptions.Kuniyuki Iwashima3-12/+18
Eric Dumazet pointed out [0] that when we call skb_set_owner_r() for ipv6_pinfo.pktoptions, sk_rmem_schedule() has not been called, resulting in a negative sk_forward_alloc. We add a new helper which clones a skb and sets its owner only when sk_rmem_schedule() succeeds. Note that we move skb_set_owner_r() forward in (dccp|tcp)_v6_do_rcv() because tcp_send_synack() can make sk_forward_alloc negative before ipv6_opt_accepted() in the crossed SYN-ACK or self-connect() cases. [0]: https://lore.kernel.org/netdev/CANn89iK9oc20Jdi_41jb9URdF210r7d1Y-+uypbMSbOfY6jqrg@mail.gmail.com/ Fixes: 323fbd0edf3f ("net: dccp: Add handling of IPV6_PKTOPTIONS to dccp_v6_do_rcv()") Fixes: 3df80d9320bc ("[DCCP]: Introduce DCCPv6") Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>