aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2024-01-02Revert "PCI/ASPM: Remove pcie_aspm_pm_state_change()"Bjorn Helgaas3-0/+27
This reverts commit 08d0cc5f34265d1a1e3031f319f594bd1970976c. Michael reported that when attempting to resume from suspend to RAM on ASUS mini PC PN51-BB757MDE1 (DMI model: MINIPC PN51-E1), 08d0cc5f3426 ("PCI/ASPM: Remove pcie_aspm_pm_state_change()") caused a 12-second delay with no output, followed by a reboot. Workarounds include: - Reverting 08d0cc5f3426 ("PCI/ASPM: Remove pcie_aspm_pm_state_change()") - Booting with "pcie_aspm=off" - Booting with "pcie_aspm.policy=performance" - "echo 0 | sudo tee /sys/bus/pci/devices/0000:03:00.0/link/l1_aspm" before suspending - Connecting a USB flash drive Link: https://lore.kernel.org/r/[email protected] Fixes: 08d0cc5f3426 ("PCI/ASPM: Remove pcie_aspm_pm_state_change()") Reported-by: Michael Schaller <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Bjorn Helgaas <[email protected]> Cc: <[email protected]>
2024-01-02Revert "net: ipv6/addrconf: clamp preferred_lft to the minimum required"Alex Henrie2-14/+6
The commit had a bug and might not have been the right approach anyway. Fixes: 629df6701c8a ("net: ipv6/addrconf: clamp preferred_lft to the minimum required") Fixes: ec575f885e3e ("Documentation: networking: explain what happens if temp_prefered_lft is too small or too large") Reported-by: Dan Moulding <[email protected]> Closes: https://lore.kernel.org/netdev/[email protected]/ Link: https://lore.kernel.org/netdev/CAMMLpeTdYhd=7hhPi2Y7pwdPCgnnW5JYh-bu3hSc7im39uxnEA@mail.gmail.com/ Signed-off-by: Alex Henrie <[email protected]> Reviewed-by: David Ahern <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-01-02eventfs: Fix bitwise fields for "is_events"Steven Rostedt (Google)1-1/+1
A flag was needed to denote which eventfs_inode was the "events" directory, so a bit was taken from the "nr_entries" field, as there's not that many entries, and 2^30 is plenty. But the bit number for nr_entries was not updated to reflect the bit taken from it, which would add an unnecessary integer to the structure. Link: https://lore.kernel.org/linux-trace-kernel/[email protected] Cc: [email protected] Cc: Masami Hiramatsu <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Fixes: 7e8358edf503e ("eventfs: Fix file and directory uid and gid ownership") Signed-off-by: Steven Rostedt (Google) <[email protected]>
2024-01-02tracefs: Check for dentry->d_inode exists in set_gid()Steven Rostedt (Google)1-0/+4
If a getdents() is called on the tracefs directory but does not get all the files, it can leave a "cursor" dentry in the d_subdirs list of tracefs dentry. This cursor dentry does not have a d_inode for it. Before referencing tracefs_inode from the dentry, the d_inode must first be checked if it has content. If not, then it's not a tracefs_inode and can be ignored. The following caused a crash: #define getdents64(fd, dirp, count) syscall(SYS_getdents64, fd, dirp, count) #define BUF_SIZE 256 #define TDIR "/tmp/file0" int main(void) { char buf[BUF_SIZE]; int fd; int n; mkdir(TDIR, 0777); mount(NULL, TDIR, "tracefs", 0, NULL); fd = openat(AT_FDCWD, TDIR, O_RDONLY); n = getdents64(fd, buf, BUF_SIZE); ret = mount(NULL, TDIR, NULL, MS_NOSUID|MS_REMOUNT|MS_RELATIME|MS_LAZYTIME, "gid=1000"); return 0; } That's because the 256 BUF_SIZE was not big enough to read all the dentries of the tracefs file system and it left a "cursor" dentry in the subdirs of the tracefs root inode. Then on remounting with "gid=1000", it would cause an iteration of all dentries which hit: ti = get_tracefs(dentry->d_inode); if (ti && (ti->flags & TRACEFS_EVENT_INODE)) eventfs_update_gid(dentry, gid); Which crashed because of the dereference of the cursor dentry which had a NULL d_inode. In the subdir loop of the dentry lookup of set_gid(), if a child has a NULL d_inode, simply skip it. Link: https://lore.kernel.org/all/[email protected]/ Link: https://lore.kernel.org/linux-trace-kernel/[email protected] Cc: [email protected] Cc: Masami Hiramatsu <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Fixes: 7e8358edf503e ("eventfs: Fix file and directory uid and gid ownership") Reported-by: "Ubisectech Sirius" <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
2024-01-02efi/x86: Fix the missing KASLR_FLAG bit in boot_params->hdr.loadflagsYuntao Wang1-0/+2
When KASLR is enabled, the KASLR_FLAG bit in boot_params->hdr.loadflags should be set to 1 to propagate KASLR status from compressed kernel to kernel, just as the choose_random_location() function does. Currently, when the kernel is booted via the EFI stub, the KASLR_FLAG bit in boot_params->hdr.loadflags is not set, even though it should be. This causes some functions, such as kernel_randomize_memory(), not to execute as expected. Fix it. Fixes: a1b87d54f4e4 ("x86/efistub: Avoid legacy decompressor when doing EFI boot") Signed-off-by: Yuntao Wang <[email protected]> [ardb: drop 'else' branch clearing KASLR_FLAG] Signed-off-by: Ard Biesheuvel <[email protected]>
2024-01-02ARM: sun9i: smp: fix return code check of of_property_match_stringStefan Wahren1-2/+2
of_property_match_string returns an int; either an index from 0 or greater if successful or negative on failure. Even it's very unlikely that the DT CPU node contains multiple enable-methods these checks should be fixed. This patch was inspired by the work of Nick Desaulniers. Link: https://lore.kernel.org/lkml/[email protected]/T/ Cc: Nick Desaulniers <[email protected]> Signed-off-by: Stefan Wahren <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Chen-Yu Tsai <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]>
2024-01-02ARM: sun9i: smp: Fix array-index-out-of-bounds read in sunxi_mc_smp_initStefan Wahren1-2/+2
Running a multi-arch kernel (multi_v7_defconfig) on a Raspberry Pi 3B+ with enabled CONFIG_UBSAN triggers the following warning: UBSAN: array-index-out-of-bounds in arch/arm/mach-sunxi/mc_smp.c:810:29 index 2 is out of range for type 'sunxi_mc_smp_data [2]' CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.7.0-rc6-00248-g5254c0cbc92d Hardware name: BCM2835 unwind_backtrace from show_stack+0x10/0x14 show_stack from dump_stack_lvl+0x40/0x4c dump_stack_lvl from ubsan_epilogue+0x8/0x34 ubsan_epilogue from __ubsan_handle_out_of_bounds+0x78/0x80 __ubsan_handle_out_of_bounds from sunxi_mc_smp_init+0xe4/0x4cc sunxi_mc_smp_init from do_one_initcall+0xa0/0x2fc do_one_initcall from kernel_init_freeable+0xf4/0x2f4 kernel_init_freeable from kernel_init+0x18/0x158 kernel_init from ret_from_fork+0x14/0x28 Since the enabled method couldn't match with any entry from sunxi_mc_smp_data, the value of the index shouldn't be used right after the loop. So move it after the check of ret in order to have a valid index. Fixes: 1631090e34f5 ("ARM: sun9i: smp: Add is_a83t field") Signed-off-by: Stefan Wahren <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Chen-Yu Tsai <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]>
2024-01-02ALSA: hda/realtek: fix mute/micmute LEDs for a HP ZBookAndy Chi1-0/+1
There is a HP ZBook which using ALC236 codec and need the ALC236_FIXUP_HP_MUTE_LED_MICMUTE_VREF quirk to make mute LED and micmute LED work. [ confirmed that the new entries are for new models that have no proper name, so the strings are left as "HP" which will be updated eventually later -- tiwai ] Signed-off-by: Andy Chi <[email protected]> Cc: <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2024-01-02selftests: bonding: do not set port down when adding to bondHangbin Liu1-3/+3
Similar to commit be809424659c ("selftests: bonding: do not set port down before adding to bond"). The bond-arp-interval-causes-panic test failed after commit a4abfa627c38 ("net: rtnetlink: Enslave device before bringing it up") as the kernel will set the port down _after_ adding to bond if setting port down specifically. Fix it by removing the link down operation when adding to bond. Fixes: 2ffd57327ff1 ("selftests: bonding: cause oops in bond_rr_gen_slave_id") Signed-off-by: Hangbin Liu <[email protected]> Tested-by: Benjamin Poirier <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-01-02connector: Fix proc_event_num_listeners count not clearedwangkeqi1-2/+3
When we register a cn_proc listening event, the proc_event_num_listener variable will be incremented by one, but if PROC_CN_MCAST_IGNORE is not called, the count will not decrease. This will cause the proc_*_connector function to take the wrong path. It will reappear when the forkstat tool exits via ctrl + c. We solve this problem by determining whether there are still listeners to clear proc_event_num_listener. Signed-off-by: wangkeqi <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-01-02net: phy: linux/phy.h: fix Excess kernel-doc description warningRandy Dunlap1-1/+0
Remove the @phy_timer: line to prevent the kernel-doc warning: include/linux/phy.h:768: warning: Excess struct member 'phy_timer' description in 'phy_device' Signed-off-by: Randy Dunlap <[email protected]> Cc: Andrew Lunn <[email protected]> Cc: Heiner Kallweit <[email protected]> Cc: Russell King <[email protected]> Cc: [email protected] Reviewed-by: Russell King (Oracle) <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-01-02net: Implement missing getsockopt(SO_TIMESTAMPING_NEW)Jörn-Thorben Hinz1-2/+9
Commit 9718475e6908 ("socket: Add SO_TIMESTAMPING_NEW") added the new socket option SO_TIMESTAMPING_NEW. Setting the option is handled in sk_setsockopt(), querying it was not handled in sk_getsockopt(), though. Following remarks on an earlier submission of this patch, keep the old behavior of getsockopt(SO_TIMESTAMPING_OLD) which returns the active flags even if they actually have been set through SO_TIMESTAMPING_NEW. The new getsockopt(SO_TIMESTAMPING_NEW) is stricter, returning flags only if they have been set through the same option. Fixes: 9718475e6908 ("socket: Add SO_TIMESTAMPING_NEW") Link: https://lore.kernel.org/lkml/[email protected]/ Link: https://lore.kernel.org/netdev/[email protected]/ Signed-off-by: Jörn-Thorben Hinz <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-01-01net: qrtr: ns: Return 0 if server port is not presentSarannya S1-1/+3
When a 'DEL_CLIENT' message is received from the remote, the corresponding server port gets deleted. A DEL_SERVER message is then announced for this server. As part of handling the subsequent DEL_SERVER message, the name- server attempts to delete the server port which results in a '-ENOENT' error. The return value from server_del() is then propagated back to qrtr_ns_worker, causing excessive error prints. To address this, return 0 from control_cmd_del_server() without checking the return value of server_del(), since the above scenario is not an error case and hence server_del() doesn't have any other error return value. Signed-off-by: Sarannya Sasikumar <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-01-01bcachefs: make RO snapshots actually ROKent Overstreet6-14/+63
Add checks to all the VFS paths for "are we in a RO snapshot?". Note - we don't check this when setting inode options via our xattr interface, since those generally only affect data placement, not contents of data. Signed-off-by: Kent Overstreet <[email protected]> Reported-by: "Carl E. Thompson" <[email protected]>
2024-01-01bcachefs: bch_sb_field_downgradeKent Overstreet10-9/+255
Add a new superblock section that contains a list of { minor version, recovery passes, errors_to_fix } that is - a list of recovery passes that must be run when downgrading past a given version, and a list of errors to silently fix. The upcoming disk accounting rewrite is not going to be fully compatible: we're going to have to regenerate accounting both when upgrading to the new version, and also from downgrading from the new version, since the new method of doing disk space accounting is a completely different architecture based on deltas, and synchronizing them for every jounal entry write to maintain compatibility is going to be too expensive and impractical. Signed-off-by: Kent Overstreet <[email protected]>
2024-01-01bcachefs: bch_sb.recovery_passes_requiredKent Overstreet9-30/+172
Add two new superblock fields. Since the main section of the superblock is now fully, we have to add a new variable length section for them - bch_sb_field_ext. - recovery_passes_requried: recovery passes that must be run on the next mount - errors_silent: errors that will be silently fixed These are to improve upgrading and dwongrading: these fields won't be cleared until after recovery successfully completes, so there won't be any issues with crashing partway through an upgrade or a downgrade. Signed-off-by: Kent Overstreet <[email protected]>
2024-01-01bcachefs: Add persistent identifiers for recovery passesKent Overstreet3-39/+84
The next patch will start to refer to recovery passes from the superblock; naturally, we now need identifiers that don't change, since the existing enum is in the order in which they are run and is not fixed. Signed-off-by: Kent Overstreet <[email protected]>
2024-01-01bcachefs: prt_bitflags_vector()Kent Overstreet3-0/+25
similar to prt_bitflags(), but for ulong arrays Signed-off-by: Kent Overstreet <[email protected]>
2024-01-01bcachefs: move BCH_SB_ERRS() to sb-errors_types.hKent Overstreet2-253/+253
we need BCH_SB_ERR_MAX in bcachefs.h Signed-off-by: Kent Overstreet <[email protected]>
2024-01-01bcachefs: fix buffer overflow in nocow write pathKent Overstreet1-41/+41
BCH_REPLICAS_MAX isn't the actual maximum number of pointers in an extent, it's the maximum number of dirty pointers. We don't have a real restriction on the number of cached pointers, and we don't want a fixed size array here anyways - so switch to DARRAY_PREALLOCATED(). Signed-off-by: Kent Overstreet <[email protected]> Reported-and-tested-by: Daniel J Blueman <[email protected]>
2024-01-01bcachefs: DARRAY_PREALLOCATED()Kent Overstreet2-13/+20
Add support to darray for preallocating some number of elements. Signed-off-by: Kent Overstreet <[email protected]>
2024-01-01bcachefs: Switch darray to kvmalloc()Kent Overstreet2-2/+4
We sometimes use darrays for quite large buffers - the btree write buffer in particular needs large buffers, since it must be sized to hold all the write buffer keys outstanding in the journal. Signed-off-by: Kent Overstreet <[email protected]>
2024-01-01bcachefs: Factor out darray resize slowpathKent Overstreet3-11/+39
Move the slowpath (actually growing the darray) to an out-of-line function; also, add some helpers for the upcoming btree write buffer rewrite. Signed-off-by: Kent Overstreet <[email protected]>
2024-01-01bcachefs: fix setting version_upgrade_completeKent Overstreet1-2/+2
If a superblock write hasn't happened (i.e. we never had to go rw), then c->sb.version will be out of date w.r.t. c->disk_sb.sb->version. Signed-off-by: Kent Overstreet <[email protected]>
2024-01-01bcachefs: fix invalid free in dio write pathKent Overstreet1-8/+5
turns out iterate_iovec() mutates __iov, we need to save our own copy Signed-off-by: Kent Overstreet <[email protected]> Reported-by: Marcin Mirosław <[email protected]>
2024-01-01bcachefs: Fix extents iteration + snapshots interactionKent Overstreet1-11/+24
peek_upto() checks against the end position and bails out before FILTER_SNAPSHOTS checks; this is because if we end up at a different inode number than the original search key none of the keys we see might be visibile in the current snapshot - we might be looking at inode in a completely different subvolume. But this is broken, because when we're iterating over extents we're checking against the extent start position to decide when to bail out, and the extent start position isn't monotonically increasing until after we've run FILTER_SNAPSHOTS. Fix this by adding a simple inode number check where the old bailout check was, and moving the main check to the correct position. Signed-off-by: Kent Overstreet <[email protected]> Reported-by: "Carl E. Thompson" <[email protected]>
2024-01-01MAINTAINERS: step down as TJA11XX C45 maintainerRadu Pirea (NXP OSS)1-1/+1
I am stepping down as TJA11XX C45 maintainer. Andrei Botila will take the responsibility to maintain and improve the support for TJA11XX C45 PHYs. Signed-off-by: Radu Pirea (NXP OSS) <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-01-01r8169: Fix PCI error on system resumeKai-Heng Feng1-1/+1
Some r8168 NICs stop working upon system resume: [ 688.051096] r8169 0000:02:00.1 enp2s0f1: rtl_ep_ocp_read_cond == 0 (loop: 10, delay: 10000). [ 688.175131] r8169 0000:02:00.1 enp2s0f1: Link is Down ... [ 691.534611] r8169 0000:02:00.1 enp2s0f1: PCI error (cmd = 0x0407, status_errs = 0x0000) Not sure if it's related, but those NICs have a BMC device at function 0: 02:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. Realtek RealManage BMC [10ec:816e] (rev 1a) Trial and error shows that increase the loop wait on rtl_ep_ocp_read_cond to 30 can eliminate the issue, so let rtl8168ep_driver_start() to wait a bit longer. Fixes: e6d6ca6e1204 ("r8169: Add support for another RTL8168FP") Signed-off-by: Kai-Heng Feng <[email protected]> Reviewed-by: Heiner Kallweit <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-01-01net/tcp_sigpool: Use kref_get_unless_zero()Dmitry Safonov1-3/+2
The freeing and re-allocation of algorithm are protected by cpool_mutex, so it doesn't fix an actual use-after-free, but avoids a deserved refcount_warn_saturate() warning. A trivial fix for the racy behavior. Fixes: 8c73b26315aa ("net/tcp: Prepare tcp_md5sig_pool for TCP-AO") Suggested-by: Eric Dumazet <[email protected]> Signed-off-by: Dmitry Safonov <[email protected]> Tested-by: Bagas Sanjaya <[email protected]> Reported-by: syzbot <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-01-01net: sched: em_text: fix possible memory leak in em_text_destroy()Hangyu Hua1-1/+3
m->data needs to be freed when em_text_destroy is called. Fixes: d675c989ed2d ("[PKT_SCHED]: Packet classification based on textsearch (ematch)") Acked-by: Jamal Hadi Salim <[email protected]> Signed-off-by: Hangyu Hua <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-12-31Linux 6.7-rc8Linus Torvalds1-1/+1
2023-12-31get_maintainer: remove stray punctuation when cleaning file emailsAlvin Šipraga1-7/+11
When parsing emails from .yaml files in particular, stray punctuation such as a leading '-' can end up in the name. For example, consider a common YAML section such as: maintainers: - [email protected] This would previously be processed by get_maintainer.pl as: - <[email protected]> Make the logic in clean_file_emails more robust by deleting any sub-names which consist of common single punctuation marks before proceeding to the best-effort name extraction logic. The output is then correct: [email protected] Some additional comments are added to the function to make things clearer to future readers. Link: https://lore.kernel.org/all/[email protected]/ Suggested-by: Joe Perches <[email protected]> Signed-off-by: Alvin Šipraga <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2023-12-31get_maintainer: correctly parse UTF-8 encoded names in filesAlvin Šipraga1-13/+17
While the script correctly extracts UTF-8 encoded names from the MAINTAINERS file, the regular expressions damage my name when parsing from .yaml files. Fix this by replacing the Latin-1-compatible regular expressions with the unicode property matcher \p{L}, which matches on any letter according to the Unicode General Category of letters. The proposed solution only works if the script uses proper string encoding from the outset, so instruct Perl to unconditionally open all files with UTF-8 encoding. This should be safe, as the entire source tree is either UTF-8 or ASCII encoded anyway. See [1] for a detailed analysis. Furthermore, to prevent the \w expression from matching non-ASCII when checking for whether a name should be escaped with quotes, add the /a flag to the regular expression. The escaping logic was duplicated in two places, so it has been factored out into its own function. The original issue was also identified on the tools mailing list [2]. This should solve the observed side effects there as well. Link: https://lore.kernel.org/all/dzn6uco4c45oaa3ia4u37uo5mlt33obecv7gghj2l756fr4hdh@mt3cprft3tmq/ [1] Link: https://lore.kernel.org/tools/20230726-gush-slouching-a5cd41@meerkat/ [2] Signed-off-by: Alvin Šipraga <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2023-12-30Merge tag 'trace-v6.7-rc7' of ↵Linus Torvalds6-69/+176
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: - Fix readers that are blocked on the ring buffer when buffer_percent is 100%. They are supposed to wake up when the buffer is full, but because the sub-buffer that the writer is on is never considered "dirty" in the calculation, dirty pages will never equal nr_pages. Add +1 to the dirty count in order to count for the sub-buffer that the writer is on. - When a reader is blocked on the "snapshot_raw" file, it is to be woken up when a snapshot is done and be able to read the snapshot buffer. But because the snapshot swaps the buffers (the main one with the snapshot one), and the snapshot reader is waiting on the old snapshot buffer, it was not woken up (because it is now on the main buffer after the swap). Worse yet, when it reads the buffer after a snapshot, it's not reading the snapshot buffer, it's reading the live active main buffer. Fix this by forcing a wakeup of all readers on the snapshot buffer when a new snapshot happens, and then update the buffer that the reader is reading to be back on the snapshot buffer. - Fix the modification of the direct_function hash. There was a race when new functions were added to the direct_function hash as when it moved function entries from the old hash to the new one, a direct function trace could be hit and not see its entry. This is fixed by allocating the new hash, copy all the old entries onto it as well as the new entries, and then use rcu_assign_pointer() to update the new direct_function hash with it. This also fixes a memory leak in that code. - Fix eventfs ownership * tag 'trace-v6.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: ftrace: Fix modification of direct_function hash while in use tracing: Fix blocked reader of snapshot buffer ring-buffer: Fix wake ups when buffer_percent is set to 100 eventfs: Fix file and directory uid and gid ownership
2023-12-30locking/osq_lock: Clarify osq_wait_next()David Laight1-5/+4
Directly return NULL or 'next' instead of breaking out of the loop. Signed-off-by: David Laight <[email protected]> [ Split original patch into two independent parts - Linus ] Link: https://lore.kernel.org/lkml/[email protected]/ Signed-off-by: Linus Torvalds <[email protected]>
2023-12-30locking/osq_lock: Clarify osq_wait_next() calling conventionDavid Laight1-12/+9
osq_wait_next() is passed 'prev' from osq_lock() and NULL from osq_unlock() but only needs the 'cpu' value to write to lock->tail. Just pass prev->cpu or OSQ_UNLOCKED_VAL instead. Should have no effect on the generated code since gcc manages to assume that 'prev != NULL' due to an earlier dereference. Signed-off-by: David Laight <[email protected]> [ Changed 'old' to 'old_cpu' by request from Waiman Long - Linus ] Signed-off-by: Linus Torvalds <[email protected]>
2023-12-30locking/osq_lock: Move the definition of optimistic_spin_node into osq_lock.cDavid Laight2-5/+7
struct optimistic_spin_node is private to the implementation. Move it into the C file to ensure nothing is accessing it. Signed-off-by: David Laight <[email protected]> Acked-by: Waiman Long <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2023-12-30ftrace: Fix modification of direct_function hash while in useSteven Rostedt (Google)1-53/+47
Masami Hiramatsu reported a memory leak in register_ftrace_direct() where if the number of new entries are added is large enough to cause two allocations in the loop: for (i = 0; i < size; i++) { hlist_for_each_entry(entry, &hash->buckets[i], hlist) { new = ftrace_add_rec_direct(entry->ip, addr, &free_hash); if (!new) goto out_remove; entry->direct = addr; } } Where ftrace_add_rec_direct() has: if (ftrace_hash_empty(direct_functions) || direct_functions->count > 2 * (1 << direct_functions->size_bits)) { struct ftrace_hash *new_hash; int size = ftrace_hash_empty(direct_functions) ? 0 : direct_functions->count + 1; if (size < 32) size = 32; new_hash = dup_hash(direct_functions, size); if (!new_hash) return NULL; *free_hash = direct_functions; direct_functions = new_hash; } The "*free_hash = direct_functions;" can happen twice, losing the previous allocation of direct_functions. But this also exposed a more serious bug. The modification of direct_functions above is not safe. As direct_functions can be referenced at any time to find what direct caller it should call, the time between: new_hash = dup_hash(direct_functions, size); and direct_functions = new_hash; can have a race with another CPU (or even this one if it gets interrupted), and the entries being moved to the new hash are not referenced. That's because the "dup_hash()" is really misnamed and is really a "move_hash()". It moves the entries from the old hash to the new one. Now even if that was changed, this code is not proper as direct_functions should not be updated until the end. That is the best way to handle function reference changes, and is the way other parts of ftrace handles this. The following is done: 1. Change add_hash_entry() to return the entry it created and inserted into the hash, and not just return success or not. 2. Replace ftrace_add_rec_direct() with add_hash_entry(), and remove the former. 3. Allocate a "new_hash" at the start that is made for holding both the new hash entries as well as the existing entries in direct_functions. 4. Copy (not move) the direct_function entries over to the new_hash. 5. Copy the entries of the added hash to the new_hash. 6. If everything succeeds, then use rcu_pointer_assign() to update the direct_functions with the new_hash. This simplifies the code and fixes both the memory leak as well as the race condition mentioned above. Link: https://lore.kernel.org/all/170368070504.42064.8960569647118388081.stgit@devnote2/ Link: https://lore.kernel.org/linux-trace-kernel/[email protected] Cc: [email protected] Cc: Mark Rutland <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Daniel Borkmann <[email protected]> Acked-by: Masami Hiramatsu (Google) <[email protected]> Fixes: 763e34e74bb7d ("ftrace: Add register_ftrace_direct()") Signed-off-by: Steven Rostedt (Google) <[email protected]>
2023-12-30ALSA: hda/realtek: enable SND_PCI_QUIRK for hp pavilion 14-ec1xxx seriesAabish Malik1-0/+1
The HP Pavilion 14 ec1xxx series uses the HP mainboard 8A0F with the ALC287 codec. The mute led can be enabled using the already existing ALC287_FIXUP_HP_GPIO_LED quirk. Tested on an HP Pavilion ec1003AU Signed-off-by: Aabish Malik <[email protected]> Cc: <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2023-12-29mlxbf_gige: fix receive packet race conditionDavid Thompson1-2/+7
Under heavy traffic, the BlueField Gigabit interface can become unresponsive. This is due to a possible race condition in the mlxbf_gige_rx_packet function, where the function exits with producer and consumer indices equal but there are remaining packet(s) to be processed. In order to prevent this situation, read receive consumer index *before* the HW replenish so that the mlxbf_gige_rx_packet function returns an accurate return value even if a packet is received into just-replenished buffer prior to exiting this routine. If the just-replenished buffer is received and occupies the last RX ring entry, the interface would not recover and instead would encounter RX packet drops related to internal buffer shortages since the driver RX logic is not being triggered to drain the RX ring. This patch will address and prevent this "ring full" condition. Fixes: f92e1869d74e ("Add Mellanox BlueField Gigabit Ethernet driver") Reviewed-by: Asmaa Mnebhi <[email protected]> Signed-off-by: David Thompson <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-12-29Merge tag 'gpio-fixes-for-v6.7-rc8' of ↵Linus Torvalds1-3/+11
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux Pull gpio fixes from Bartosz Golaszewski: - Andy steps down as GPIO reviewer - Kent becomes a reviewer for GPIO uAPI - add missing intel file to the relevant MAINTAINERS section * tag 'gpio-fixes-for-v6.7-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: MAINTAINERS: Add a missing file to the INTEL GPIO section MAINTAINERS: Remove Andy from GPIO maintainers MAINTAINERS: split out the uAPI into a new section
2023-12-29Merge tag 'platform-drivers-x86-v6.7-6' of ↵Linus Torvalds7-68/+176
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Ilpo Järvinen: - Intel PMC GBE LTR regression - P2SB / PCI deadlock fix * tag 'platform-drivers-x86-v6.7-6' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86/intel/pmc: Move GBE LTR ignore to suspend callback platform/x86/intel/pmc: Allow reenabling LTRs platform/x86/intel/pmc: Add suspend callback platform/x86: p2sb: Allow p2sb_bar() calls during PCI device probe
2023-12-29Merge tag 'block-6.7-2023-12-29' of git://git.kernel.dk/linuxLinus Torvalds2-3/+5
Pull block fixes from Jens Axboe: "Fix for a badly numbered flag, and a regression fix for the badblocks updates from this merge window" * tag 'block-6.7-2023-12-29' of git://git.kernel.dk/linux: block: renumber QUEUE_FLAG_HW_WC badblocks: avoid checking invalid range in badblocks_check()
2023-12-29mailmap: add entries for Mathieu OthaceheMathieu Othacehe1-1/+1
Add my gnu.org mail address. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Mathieu Othacehe <[email protected]> Cc: Bjorn Andersson <[email protected]> Cc: Heiko Stuebner <[email protected]> Cc: Jakub Kicinski <[email protected]> Cc: Jiri Kosina <[email protected]> Cc: Konrad Dybcio <[email protected]> Cc: Matthieu Baerts <[email protected]> Cc: Matt Ranostay <[email protected]> Cc: Oleksij Rempel <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-12-29MAINTAINERS: change vmware.com addresses to broadcom.comZack Rusin2-5/+5
Update the email addresses for vmwgfx and vmmouse to reflect the fact that VMware is now part of Broadcom. Add a .mailmap entry because the vmware.com address will start bouncing soon. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Zack Rusin <[email protected]> Acked-by: Florian Fainelli <[email protected]> Cc: Ian Forbes <[email protected]> Cc: Martin Krastev <[email protected]> Cc: Maaz Mombasawala <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-12-29arch/mm/fault: fix major fault accounting when retrying under per-VMA lockSuren Baghdasaryan5-0/+11
A test [1] in Android test suite started failing after [2] was merged. It turns out that after handling a major fault under per-VMA lock, the process major fault counter does not register that fault as major. Before [2] read faults would be done under mmap_lock, in which case FAULT_FLAG_TRIED flag is set before retrying. That in turn causes mm_account_fault() to account the fault as major once retry completes. With per-VMA locks we often retry because a fault can't be handled without locking the whole mm using mmap_lock. Therefore such retries do not set FAULT_FLAG_TRIED flag. This logic does not work after [2] because we can now handle read major faults under per-VMA lock and upon retry the fact there was a major fault gets lost. Fix this by setting FAULT_FLAG_TRIED after retrying under per-VMA lock if VM_FAULT_MAJOR was returned. Ideally we would use an additional VM_FAULT bit to indicate the reason for the retry (could not handle under per-VMA lock vs other reason) but this simpler solution seems to work, so keeping it simple. [1] https://cs.android.com/android/platform/superproject/+/master:test/vts-testcase/kernel/api/drop_caches_prop/drop_caches_test.cpp [2] https://lore.kernel.org/all/[email protected]/ Link: https://lkml.kernel.org/r/[email protected] Fixes: 12214eba1992 ("mm: handle read faults under the VMA lock") Signed-off-by: Suren Baghdasaryan <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Gerald Schaefer <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-12-29mm/mglru: skip special VMAs in lru_gen_look_around()Yu Zhao1-4/+9
Special VMAs like VM_PFNMAP can contain anon pages from COW. There isn't much profit in doing lookaround on them. Besides, they can trigger the pte_special() warning in get_pte_pfn(). Skip them in lru_gen_look_around(). Link: https://lkml.kernel.org/r/[email protected] Fixes: 018ee47f1489 ("mm: multi-gen LRU: exploit locality in rmap") Signed-off-by: Yu Zhao <[email protected]> Reported-by: [email protected] Closes: https://lore.kernel.org/[email protected]/ Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-12-29MAINTAINERS: hand over hwpoison maintainership to Miaohe LinNaoya Horiguchi1-2/+2
Miaohe Lin has contributed to hwpoison subsystem as a reviewer for more than 1.5 year, and has made many patch contributions in hwpoison subsystem and the memory management subsystem. So I'd like to pass on the hwpoison maintainership to Miaohe. [[email protected]: update to keep myself as a reviewer] Link: https://lkml.kernel.org/r/20231223031115.GA2883156@u2004 Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Naoya Horiguchi <[email protected]> Cc: Miaohe Lin <[email protected]> Cc: David Hildenbrand <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-12-29MAINTAINERS: remove hugetlb maintainer Mike KravetzMike Kravetz2-1/+4
I am stepping away from my role as hugetlb maintainer. There should be no gap in coverage as Muchun Song is also a hugetlb maintainer. [[email protected]: update CREDITS] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Mike Kravetz <[email protected]> Cc: Muchun Song <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-12-29mm: fix unmap_mapping_range high bits shift bugJiajun Xie1-2/+2
The bug happens when highest bit of holebegin is 1, suppose holebegin is 0x8000000111111000, after shift, hba would be 0xfff8000000111111, then vma_interval_tree_foreach would look it up fail or leads to the wrong result. error call seq e.g.: - mmap(..., offset=0x8000000111111000) |- syscall(mmap, ... unsigned long, off): |- ksys_mmap_pgoff( ... , off >> PAGE_SHIFT); here pgoff is correctly shifted to 0x8000000111111, but pass 0x8000000111111000 as holebegin to unmap would then cause terrible result, as shown below: - unmap_mapping_range(..., loff_t const holebegin) |- pgoff_t hba = holebegin >> PAGE_SHIFT; /* hba = 0xfff8000000111111 unexpectedly */ The issue happens in Heterogeneous computing, where the device(e.g. gpu) and host share the same virtual address space. A simple workflow pattern which hit the issue is: /* host */ 1. userspace first mmap a file backed VA range with specified offset. e.g. (offset=0x800..., mmap return: va_a) 2. write some data to the corresponding sys page e.g. (va_a = 0xAABB) /* device */ 3. gpu workload touches VA, triggers gpu fault and notify the host. /* host */ 4. reviced gpu fault notification, then it will: 4.1 unmap host pages and also takes care of cpu tlb (use unmap_mapping_range with offset=0x800...) 4.2 migrate sys page to device 4.3 setup device page table and resolve device fault. /* device */ 5. gpu workload continued, it accessed va_a and got 0xAABB. 6. gpu workload continued, it wrote 0xBBCC to va_a. /* host */ 7. userspace access va_a, as expected, it will: 7.1 trigger cpu vm fault. 7.2 driver handling fault to migrate gpu local page to host. 8. userspace then could correctly get 0xBBCC from va_a 9. done But in step 4.1, if we hit the bug this patch mentioned, then userspace would never trigger cpu fault, and still get the old value: 0xAABB. Making holebegin unsigned first fixes the bug. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Jiajun Xie <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>