aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2022-01-24dt-bindings: can: tcan4x5x: fix mram-cfg RX FIFO configMarc Kleine-Budde1-1/+1
This tcan4x5x only comes with 2K of MRAM, a RX FIFO with a dept of 32 doesn't fit into the MRAM. Use a depth of 16 instead. Fixes: 4edd396a1911 ("dt-bindings: can: tcan4x5x: Add DT bindings for TCAN4x5X driver") Link: https://lore.kernel.org/all/[email protected] Signed-off-by: Marc Kleine-Budde <[email protected]>
2022-01-24mailmap: update email address of Brian SilvermanMarc Kleine-Budde1-0/+1
Brian Silverman's address at bluerivertech.com is not valid anymore, use Brian's private email address instead. Link: https://lore.kernel.org/all/[email protected] Cc: Brian Silverman <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]>
2022-01-24btrfs: update writeback index when starting defragFilipe Manana1-0/+9
When starting a defrag, we should update the writeback index of the inode's mapping in case it currently has a value beyond the start of the range we are defragging. This can help performance and often result in getting less extents after writeback - for e.g., if the current value of the writeback index sits somewhere in the middle of a range that gets dirty by the defrag, then after writeback we can get two smaller extents instead of a single, larger extent. We used to have this before the refactoring in 5.16, but it was removed without any reason to do so. Originally it was added in kernel 3.1, by commit 2a0f7f5769992b ("Btrfs: fix recursive auto-defrag"), in order to fix a loop with autodefrag resulting in dirtying and writing pages over and over, but some testing on current code did not show that happening, at least with the test described in that commit. So add back the behaviour, as at the very least it is a nice to have optimization. Fixes: 7b508037d4cac3 ("btrfs: defrag: use defrag_one_cluster() to implement btrfs_defrag_file()") CC: [email protected] # 5.16 Signed-off-by: Filipe Manana <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-01-24btrfs: add back missing dirty page rate limiting to defragFilipe Manana1-0/+5
A defrag operation can dirty a lot of pages, specially if operating on the entire file or a large file range. Any task dirtying pages should periodically call balance_dirty_pages_ratelimited(), as stated in that function's comments, otherwise they can leave too many dirty pages in the system. This is what we did before the refactoring in 5.16, and it should have remained, just like in the buffered write path and relocation. So restore that behaviour. Fixes: 7b508037d4cac3 ("btrfs: defrag: use defrag_one_cluster() to implement btrfs_defrag_file()") CC: [email protected] # 5.16 Reviewed-by: Qu Wenruo <[email protected]> Signed-off-by: Filipe Manana <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-01-24btrfs: fix deadlock when reserving space during defragFilipe Manana1-1/+30
When defragging we can end up collecting a range for defrag that has already pages under delalloc (dirty), as long as the respective extent map for their range is not mapped to a hole, a prealloc extent or the extent map is from an old generation. Most of the time that is harmless from a functional perspective at least, however it can result in a deadlock: 1) At defrag_collect_targets() we find an extent map that meets all requirements but there's delalloc for the range it covers, and we add its range to list of ranges to defrag; 2) The defrag_collect_targets() function is called at defrag_one_range(), after it locked a range that overlaps the range of the extent map; 3) At defrag_one_range(), while the range is still locked, we call defrag_one_locked_target() for the range associated to the extent map we collected at step 1); 4) Then finally at defrag_one_locked_target() we do a call to btrfs_delalloc_reserve_space(), which will reserve data and metadata space. If the space reservations can not be satisfied right away, the flusher might be kicked in and start flushing delalloc and wait for the respective ordered extents to complete. If this happens we will deadlock, because both flushing delalloc and finishing an ordered extent, requires locking the range in the inode's io tree, which was already locked at defrag_collect_targets(). So fix this by skipping extent maps for which there's already delalloc. Fixes: eb793cf857828d ("btrfs: defrag: introduce helper to collect target file extents") CC: [email protected] # 5.16 Reviewed-by: Qu Wenruo <[email protected]> Signed-off-by: Filipe Manana <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-01-24arm64: Mark start_backtrace() notrace and NOKPROBE_SYMBOLMasami Hiramatsu1-2/+3
Mark the start_backtrace() as notrace and NOKPROBE_SYMBOL because this function is called from ftrace and lockdep to get the caller address via return_address(). The lockdep is used in kprobes, it should also be NOKPROBE_SYMBOL. Fixes: b07f3499661c ("arm64: stacktrace: Move start_backtrace() out of the header") Cc: <[email protected]> # 5.13.x Signed-off-by: Masami Hiramatsu <[email protected]> Reviewed-by: Mark Brown <[email protected]> Link: https://lore.kernel.org/r/164301227374.1433152.12808232644267107415.stgit@devnote2 Signed-off-by: Catalin Marinas <[email protected]>
2022-01-24arm64: errata: Update ARM64_ERRATUM_[2119858|2224489] with Cortex-X2 rangesAnshuman Khandual3-6/+12
Errata ARM64_ERRATUM_[2119858|2224489] also affect some Cortex-X2 ranges as well. Lets update these errata definition and detection to accommodate all new Cortex-X2 based cpu MIDR ranges. Cc: Will Deacon <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Suzuki Poulose <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Signed-off-by: Anshuman Khandual <[email protected]> Reviewed-by: Suzuki K Poulose <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]>
2022-01-24arm64: Add Cortex-X2 CPU part definitionAnshuman Khandual1-0/+2
Add the CPU Partnumbers for the new Arm designs. Cc: Will Deacon <[email protected]> Cc: Suzuki Poulose <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Anshuman Khandual <[email protected]> Reviewed-by: Suzuki K Poulose <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]>
2022-01-24video: hyperv_fb: Fix validation of screen resolutionMichael Kelley1-13/+3
In the WIN10 version of the Synthetic Video protocol with Hyper-V, Hyper-V reports a list of supported resolutions as part of the protocol negotiation. The driver calculates the maximum width and height from the list of resolutions, and uses those maximums to validate any screen resolution specified in the video= option on the kernel boot line. This method of validation is incorrect. For example, the list of supported resolutions could contain 1600x1200 and 1920x1080, both of which fit in an 8 Mbyte frame buffer. But calculating the max width and height yields 1920 and 1200, and 1920x1200 resolution does not fit in an 8 Mbyte frame buffer. Unfortunately, this resolution is accepted, causing a kernel fault when the driver accesses memory outside the frame buffer. Instead, validate the specified screen resolution by calculating its size, and comparing against the frame buffer size. Delete the code for calculating the max width and height from the list of resolutions, since these max values have no use. Also add the frame buffer size to the info message to aid in understanding why a resolution might be rejected. Fixes: 67e7cdb4829d ("video: hyperv: hyperv_fb: Obtain screen resolution from Hyper-V host") Signed-off-by: Michael Kelley <[email protected]> Reviewed-by: Haiyang Zhang <[email protected]> Acked-by: Helge Deller <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Wei Liu <[email protected]>
2022-01-24KVM: remove async parameter of hva_to_pfn_remapped()Xianting Tian1-4/+3
The async parameter of hva_to_pfn_remapped() is not used, so remove it. Signed-off-by: Xianting Tian <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-01-24x86,kvm/xen: Remove superfluous .fixup usagePeter Zijlstra1-8/+2
Commit 14243b387137 ("KVM: x86/xen: Add KVM_IRQ_ROUTING_XEN_EVTCHN and event channel delivery") adds superfluous .fixup usage after the whole .fixup section was removed in commit e5eefda5aa51 ("x86: Remove .fixup section"). Fixes: 14243b387137 ("KVM: x86/xen: Add KVM_IRQ_ROUTING_XEN_EVTCHN and event channel delivery") Reported-by: Borislav Petkov <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-01-24KVM: VMX: Zero host's SYSENTER_ESP iff SYSENTER is NOT usedSean Christopherson1-3/+7
Zero vmcs.HOST_IA32_SYSENTER_ESP when initializing *constant* host state if and only if SYSENTER cannot be used, i.e. the kernel is a 64-bit kernel and is not emulating 32-bit syscalls. As the name suggests, vmx_set_constant_host_state() is intended for state that is *constant*. When SYSENTER is used, SYSENTER_ESP isn't constant because stacks are per-CPU, and the VMCS must be updated whenever the vCPU is migrated to a new CPU. The logic in vmx_vcpu_load_vmcs() doesn't differentiate between "never loaded" and "loaded on a different CPU", i.e. setting SYSENTER_ESP on VMCS load also handles setting correct host state when the VMCS is first loaded. Because a VMCS must be loaded before it is initialized during vCPU RESET, zeroing the field in vmx_set_constant_host_state() obliterates the value that was written when the VMCS was loaded. If the vCPU is run before it is migrated, the subsequent VM-Exit will zero out MSR_IA32_SYSENTER_ESP, leading to a #DF on the next 32-bit syscall. double fault: 0000 [#1] SMP CPU: 0 PID: 990 Comm: stable Not tainted 5.16.0+ #97 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 EIP: entry_SYSENTER_32+0x0/0xe7 Code: <9c> 50 eb 17 0f 20 d8 a9 00 10 00 00 74 0d 25 ff ef ff ff 0f 22 d8 EAX: 000000a2 EBX: a8d1300c ECX: a8d13014 EDX: 00000000 ESI: a8f87000 EDI: a8d13014 EBP: a8d12fc0 ESP: 00000000 DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 EFLAGS: 00210093 CR0: 80050033 CR2: fffffffc CR3: 02c3b000 CR4: 00152e90 Fixes: 6ab8a4053f71 ("KVM: VMX: Avoid to rdmsrl(MSR_IA32_SYSENTER_ESP)") Cc: Lai Jiangshan <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-01-24quota: cleanup double word in commentTom Rix1-1/+1
Remove the second 'handle'. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Tom Rix <[email protected]> Signed-off-by: Jan Kara <[email protected]>
2022-01-24udf: Restore i_lenAlloc when inode expansion failsJan Kara1-0/+1
When we fail to expand inode from inline format to a normal format, we restore inode to contain the original inline formatting but we forgot to set i_lenAlloc back. The mismatch between i_lenAlloc and i_size was then causing further problems such as warnings and lost data down the line. Reported-by: butt3rflyh4ck <[email protected]> CC: [email protected] Fixes: 7e49b6f2480c ("udf: Convert UDF to new truncate calling sequence") Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Jan Kara <[email protected]>
2022-01-24udf: Fix NULL ptr deref when converting from inline formatJan Kara1-5/+3
udf_expand_file_adinicb() calls directly ->writepage to write data expanded into a page. This however misses to setup inode for writeback properly and so we can crash on inode->i_wb dereference when submitting page for IO like: BUG: kernel NULL pointer dereference, address: 0000000000000158 #PF: supervisor read access in kernel mode ... <TASK> __folio_start_writeback+0x2ac/0x350 __block_write_full_page+0x37d/0x490 udf_expand_file_adinicb+0x255/0x400 [udf] udf_file_write_iter+0xbe/0x1b0 [udf] new_sync_write+0x125/0x1c0 vfs_write+0x28e/0x400 Fix the problem by marking the page dirty and going through the standard writeback path to write the page. Strictly speaking we would not even have to write the page but we want to catch e.g. ENOSPC errors early. Reported-by: butt3rflyh4ck <[email protected]> CC: [email protected] Fixes: 52ebea749aae ("writeback: make backing_dev_info host cgroup-specific bdi_writebacks") Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Jan Kara <[email protected]>
2022-01-24net: stmmac: remove unused members in struct stmmac_privJisheng Zhang1-2/+0
The tx_coalesce and mii_irq are not used at all now, so remove them. Signed-off-by: Jisheng Zhang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-01-24fsnotify: fix fsnotify hooks in pseudo filesystemsAmir Goldstein4-8/+9
Commit 49246466a989 ("fsnotify: move fsnotify_nameremove() hook out of d_delete()") moved the fsnotify delete hook before d_delete() so fsnotify will have access to a positive dentry. This allowed a race where opening the deleted file via cached dentry is now possible after receiving the IN_DELETE event. To fix the regression in pseudo filesystems, convert d_delete() calls to d_drop() (see commit 46c46f8df9aa ("devpts_pty_kill(): don't bother with d_delete()") and move the fsnotify hook after d_drop(). Add a missing fsnotify_unlink() hook in nfsdfs that was found during the audit of fsnotify hooks in pseudo filesystems. Note that the fsnotify hooks in simple_recursive_removal() follow d_invalidate(), so they require no change. Link: https://lore.kernel.org/r/[email protected] Reported-by: Ivan Delalande <[email protected]> Link: https://lore.kernel.org/linux-fsdevel/YeNyzoDM5hP5LtGW@visor/ Fixes: 49246466a989 ("fsnotify: move fsnotify_nameremove() hook out of d_delete()") Cc: [email protected] # v5.3+ Signed-off-by: Amir Goldstein <[email protected]> Signed-off-by: Jan Kara <[email protected]>
2022-01-24fsnotify: invalidate dcache before IN_DELETE eventAmir Goldstein3-15/+50
Apparently, there are some applications that use IN_DELETE event as an invalidation mechanism and expect that if they try to open a file with the name reported with the delete event, that it should not contain the content of the deleted file. Commit 49246466a989 ("fsnotify: move fsnotify_nameremove() hook out of d_delete()") moved the fsnotify delete hook before d_delete() so fsnotify will have access to a positive dentry. This allowed a race where opening the deleted file via cached dentry is now possible after receiving the IN_DELETE event. To fix the regression, create a new hook fsnotify_delete() that takes the unlinked inode as an argument and use a helper d_delete_notify() to pin the inode, so we can pass it to fsnotify_delete() after d_delete(). Backporting hint: this regression is from v5.3. Although patch will apply with only trivial conflicts to v5.4 and v5.10, it won't build, because fsnotify_delete() implementation is different in each of those versions (see fsnotify_link()). A follow up patch will fix the fsnotify_unlink/rmdir() calls in pseudo filesystem that do not need to call d_delete(). Link: https://lore.kernel.org/r/[email protected] Reported-by: Ivan Delalande <[email protected]> Link: https://lore.kernel.org/linux-fsdevel/YeNyzoDM5hP5LtGW@visor/ Fixes: 49246466a989 ("fsnotify: move fsnotify_nameremove() hook out of d_delete()") Cc: [email protected] # v5.3+ Signed-off-by: Amir Goldstein <[email protected]> Signed-off-by: Jan Kara <[email protected]>
2022-01-24net: atlantic: Use the bitmap API instead of hand-writing itChristophe JAILLET1-4/+2
Simplify code by using bitmap_weight() and bitmap_zero() instead of hand-writing these functions. Signed-off-by: Christophe JAILLET <[email protected]> Reviewed-by: Igor Russkikh <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-01-24ping: fix the sk_bound_dev_if match in ping_lookupXin Long1-1/+2
When 'ping' changes to use PING socket instead of RAW socket by: # sysctl -w net.ipv4.ping_group_range="0 100" the selftests 'router_broadcast.sh' will fail, as such command # ip vrf exec vrf-h1 ping -I veth0 198.51.100.255 -b can't receive the response skb by the PING socket. It's caused by mismatch of sk_bound_dev_if and dif in ping_rcv() when looking up the PING socket, as dif is vrf-h1 if dif's master was set to vrf-h1. This patch is to fix this regression by also checking the sk_bound_dev_if against sdif so that the packets can stil be received even if the socket is not bound to the vrf device but to the real iif. Fixes: c319b4d76b9e ("net: ipv4: add IPPROTO_ICMP socket kind") Reported-by: Hangbin Liu <[email protected]> Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-01-24arm64: vdso: Fix "no previous prototype" warningVincenzo Frascino1-1/+4
If compiling the arm64 kernel with W=1 the following warning is produced: | arch/arm64/kernel/vdso/vgettimeofday.c:9:5: error: no previous prototype for ‘__kernel_clock_gettime’ [-Werror=missing-prototypes] | 9 | int __kernel_clock_gettime(clockid_t clock, | | ^~~~~~~~~~~~~~~~~~~~~~ | arch/arm64/kernel/vdso/vgettimeofday.c:15:5: error: no previous prototype for ‘__kernel_gettimeofday’ [-Werror=missing-prototypes] | 15 | int __kernel_gettimeofday(struct __kernel_old_timeval *tv, | | ^~~~~~~~~~~~~~~~~~~~~ | arch/arm64/kernel/vdso/vgettimeofday.c:21:5: error: no previous prototype for ‘__kernel_clock_getres’ [-Werror=missing-prototypes] | 21 | int __kernel_clock_getres(clockid_t clock_id, | | ^~~~~~~~~~~~~~~~~~~~~ This patch removes "-Wmissing-prototypes" and "-Wmissing-declarations" compilers flags from the compilation of vgettimeofday.c to make possible to build the kernel with CONFIG_WERROR enabled. Cc: Will Deacon <[email protected]> Reported-by: Marc Kleine-Budde <[email protected]> Signed-off-by: Vincenzo Frascino <[email protected]> Tested-by: Marc Kleine-Budde <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]>
2022-01-24net/smc: Transitional solution for clcsock race issueWen Gu1-12/+51
We encountered a crash in smc_setsockopt() and it is caused by accessing smc->clcsock after clcsock was released. BUG: kernel NULL pointer dereference, address: 0000000000000020 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 1 PID: 50309 Comm: nginx Kdump: loaded Tainted: G E 5.16.0-rc4+ #53 RIP: 0010:smc_setsockopt+0x59/0x280 [smc] Call Trace: <TASK> __sys_setsockopt+0xfc/0x190 __x64_sys_setsockopt+0x20/0x30 do_syscall_64+0x34/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f16ba83918e </TASK> This patch tries to fix it by holding clcsock_release_lock and checking whether clcsock has already been released before access. In case that a crash of the same reason happens in smc_getsockopt() or smc_switch_to_fallback(), this patch also checkes smc->clcsock in them too. And the caller of smc_switch_to_fallback() will identify whether fallback succeeds according to the return value. Fixes: fd57770dd198 ("net/smc: wait for pending work before clcsock release_sock") Link: https://lore.kernel.org/lkml/[email protected]/T/ Signed-off-by: Wen Gu <[email protected]> Acked-by: Karsten Graul <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-01-24ibmvnic: remove unused ->wait_capabilitySukadev Bhattiprolu2-25/+14
With previous bug fix, ->wait_capability flag is no longer needed and can be removed. Fixes: 249168ad07cd ("ibmvnic: Make CRQ interrupt tasklet wait for all capabilities crqs") Signed-off-by: Sukadev Bhattiprolu <[email protected]> Reviewed-by: Dany Madden <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-01-24ibmvnic: don't spin in taskletSukadev Bhattiprolu1-6/+0
ibmvnic_tasklet() continuously spins waiting for responses to all capability requests. It does this to avoid encountering an error during initialization of the vnic. However if there is a bug in the VIOS and we do not receive a response to one or more queries the tasklet ends up spinning continuously leading to hard lock ups. If we fail to receive a message from the VIOS it is reasonable to timeout the login attempt rather than spin indefinitely in the tasklet. Fixes: 249168ad07cd ("ibmvnic: Make CRQ interrupt tasklet wait for all capabilities crqs") Signed-off-by: Sukadev Bhattiprolu <[email protected]> Reviewed-by: Dany Madden <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-01-24ibmvnic: init ->running_cap_crqs earlySukadev Bhattiprolu1-35/+71
We use ->running_cap_crqs to determine when the ibmvnic_tasklet() should send out the next protocol message type. i.e when we get back responses to all our QUERY_CAPABILITY CRQs we send out REQUEST_CAPABILITY crqs. Similiary, when we get responses to all the REQUEST_CAPABILITY crqs, we send out the QUERY_IP_OFFLOAD CRQ. We currently increment ->running_cap_crqs as we send out each CRQ and have the ibmvnic_tasklet() send out the next message type, when this running_cap_crqs count drops to 0. This assumes that all the CRQs of the current type were sent out before the count drops to 0. However it is possible that we send out say 6 CRQs, get preempted and receive all the 6 responses before we send out the remaining CRQs. This can result in ->running_cap_crqs count dropping to zero before all messages of the current type were sent and we end up sending the next protocol message too early. Instead initialize the ->running_cap_crqs upfront so the tasklet will only send the next protocol message after all responses are received. Use the cap_reqs local variable to also detect any discrepancy (either now or in future) in the number of capability requests we actually send. Currently only send_query_cap() is affected by this behavior (of sending next message early) since it is called from the worker thread (during reset) and from application thread (during ->ndo_open()) and they can be preempted. send_request_cap() is only called from the tasklet which processes CRQ responses sequentially, is not be affected. But to maintain the existing symmtery with send_query_capability() we update send_request_capability() also. Fixes: 249168ad07cd ("ibmvnic: Make CRQ interrupt tasklet wait for all capabilities crqs") Signed-off-by: Sukadev Bhattiprolu <[email protected]> Reviewed-by: Dany Madden <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-01-24ibmvnic: Allow extra failures before disablingSukadev Bhattiprolu1-4/+17
If auto-priority-failover (APF) is enabled and there are at least two backing devices of different priorities, some resets like fail-over, change-param etc can cause at least two back to back failovers. (Failover from high priority backing device to lower priority one and then back to the higher priority one if that is still functional). Depending on the timimg of the two failovers it is possible to trigger a "hard" reset and for the hard reset to fail due to failovers. When this occurs, the driver assumes that the network is unstable and disables the VNIC for a 60-second "settling time". This in turn can cause the ethtool command to fail with "No such device" while the vnic automatically recovers a little while later. Given that it's possible to have two back to back failures, allow for extra failures before disabling the vnic for the settling time. Fixes: f15fde9d47b8 ("ibmvnic: delay next reset if hard reset fails") Signed-off-by: Sukadev Bhattiprolu <[email protected]> Reviewed-by: Dany Madden <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-01-24ipv4: fix ip option filtering for locally generated fragmentsJakub Kicinski1-3/+12
During IP fragmentation we sanitize IP options. This means overwriting options which should not be copied with NOPs. Only the first fragment has the original, full options. ip_fraglist_prepare() copies the IP header and options from previous fragment to the next one. Commit 19c3401a917b ("net: ipv4: place control buffer handling away from fragmentation iterators") moved sanitizing options before ip_fraglist_prepare() which means options are sanitized and then overwritten again with the old values. Fixing this is not enough, however, nor did the sanitization work prior to aforementioned commit. ip_options_fragment() (which does the sanitization) uses ipcb->opt.optlen for the length of the options. ipcb->opt of fragments is not populated (it's 0), only the head skb has the state properly built. So even when called at the right time ip_options_fragment() does nothing. This seems to date back all the way to v2.5.44 when the fast path for pre-fragmented skbs had been introduced. Prior to that ip_options_build() would have been called for every fragment (in fact ever since v2.5.44 the fragmentation handing in ip_options_build() has been dead code, I'll clean it up in -next). In the original patch (see Link) caixf mentions fixing the handling for fragments other than the second one, but I'm not sure how _any_ fragment could have had their options sanitized with the code as it stood. Tested with python (MTU on lo lowered to 1000 to force fragmentation): import socket s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM) s.setsockopt(socket.IPPROTO_IP, socket.IP_OPTIONS, bytearray([7,4,5,192, 20|0x80,4,1,0])) s.sendto(b'1'*2000, ('127.0.0.1', 1234)) Before: IP (tos 0x0, ttl 64, id 1053, offset 0, flags [+], proto UDP (17), length 996, options (RR [bad length 4] [bad ptr 5] 192.148.4.1,,RA value 256)) localhost.36500 > localhost.search-agent: UDP, length 2000 IP (tos 0x0, ttl 64, id 1053, offset 968, flags [+], proto UDP (17), length 996, options (RR [bad length 4] [bad ptr 5] 192.148.4.1,,RA value 256)) localhost > localhost: udp IP (tos 0x0, ttl 64, id 1053, offset 1936, flags [none], proto UDP (17), length 100, options (RR [bad length 4] [bad ptr 5] 192.148.4.1,,RA value 256)) localhost > localhost: udp After: IP (tos 0x0, ttl 96, id 42549, offset 0, flags [+], proto UDP (17), length 996, options (RR [bad length 4] [bad ptr 5] 192.148.4.1,,RA value 256)) localhost.51607 > localhost.search-agent: UDP, bad length 2000 > 960 IP (tos 0x0, ttl 96, id 42549, offset 968, flags [+], proto UDP (17), length 996, options (NOP,NOP,NOP,NOP,RA value 256)) localhost > localhost: udp IP (tos 0x0, ttl 96, id 42549, offset 1936, flags [none], proto UDP (17), length 100, options (NOP,NOP,NOP,NOP,RA value 256)) localhost > localhost: udp RA (20 | 0x80) is now copied as expected, RR (7) is "NOPed out". Link: https://lore.kernel.org/netdev/[email protected]/ Fixes: 19c3401a917b ("net: ipv4: place control buffer handling away from fragmentation iterators") Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: caixf <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-01-24net-procfs: show net devices bound packet typesJianguo Wu1-3/+32
After commit:7866a621043f ("dev: add per net_device packet type chains"), we can not get packet types that are bound to a specified net device by /proc/net/ptype, this patch fix the regression. Run "tcpdump -i ens192 udp -nns0" Before and after apply this patch: Before: [root@localhost ~]# cat /proc/net/ptype Type Device Function 0800 ip_rcv 0806 arp_rcv 86dd ipv6_rcv After: [root@localhost ~]# cat /proc/net/ptype Type Device Function ALL ens192 tpacket_rcv 0800 ip_rcv 0806 arp_rcv 86dd ipv6_rcv v1 -> v2: - fix the regression rather than adding new /proc API as suggested by Stephen Hemminger. Fixes: 7866a621043f ("dev: add per net_device packet type chains") Signed-off-by: Jianguo Wu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-01-24bonding: use rcu_dereference_rtnl when get bonding active slaveHangbin Liu2-5/+1
bond_option_active_slave_get_rcu() should not be used in rtnl_mutex as it use rcu_dereference(). Replace to rcu_dereference_rtnl() so we also can use this function in rtnl protected context. With this update, we can rmeove the rcu_read_lock/unlock in bonding .ndo_eth_ioctl and .get_ts_info. Reported-by: Vladimir Oltean <[email protected]> Fixes: 94dd016ae538 ("bond: pass get_ts_info and SIOC[SG]HWTSTAMP ioctl to active device") Signed-off-by: Hangbin Liu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-01-24net: sfp: ignore disabled SFP nodeMarek Behún1-0/+5
Commit ce0aa27ff3f6 ("sfp: add sfp-bus to bridge between network devices and sfp cages") added code which finds SFP bus DT node even if the node is disabled with status = "disabled". Because of this, when phylink is created, it ends with non-null .sfp_bus member, even though the SFP module is not probed (because the node is disabled). We need to ignore disabled SFP bus node. Fixes: ce0aa27ff3f6 ("sfp: add sfp-bus to bridge between network devices and sfp cages") Signed-off-by: Marek Behún <[email protected]> Cc: [email protected] # 2203cbf2c8b5 ("net: sfp: move fwnode parsing into sfp-bus layer") Signed-off-by: David S. Miller <[email protected]>
2022-01-24KVM: arm64: Use shadow SPSR_EL1 when injecting exceptions on !VHEMarc Zyngier1-1/+4
Injecting an exception into a guest with non-VHE is risky business. Instead of writing in the shadow register for the switch code to restore it, we override the CPU register instead. Which gets overriden a few instructions later by said restore code. The result is that although the guest correctly gets the exception, it will return to the original context in some random state, depending on what was there the first place... Boo. Fix the issue by writing to the shadow register. The original code is absolutely fine on VHE, as the state is already loaded, and writing to the shadow register in that case would actually be a bug. Fixes: bb666c472ca2 ("KVM: arm64: Inject AArch64 exceptions from HYP") Cc: [email protected] Signed-off-by: Marc Zyngier <[email protected]> Reviewed-by: Fuad Tabba <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-01-24s390: update defconfigsHeiko Carstens3-17/+22
Signed-off-by: Heiko Carstens <[email protected]>
2022-01-24s390/module: test loading modules with a lot of relocationsIlya Leoshkevich5-0/+116
Add a test in order to prevent regressions. Signed-off-by: Ilya Leoshkevich <[email protected]> Reviewed-by: Heiko Carstens <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Christian Borntraeger <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2022-01-24s390/module: fix loading modules with a lot of relocationsIlya Leoshkevich1-19/+18
If the size of the PLT entries generated by apply_rela() exceeds 64KiB, the first ones can no longer reach __jump_r1 with brc. Fix by using brcl. An alternative solution is to add a __jump_r1 copy after every 64KiB, however, the space savings are quite small and do not justify the additional complexity. Fixes: f19fbd5ed642 ("s390: introduce execute-trampolines for branches") Cc: [email protected] Reported-by: Andrea Righi <[email protected]> Signed-off-by: Ilya Leoshkevich <[email protected]> Reviewed-by: Heiko Carstens <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Christian Borntraeger <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2022-01-24powerpc/perf: Fix power_pmu_disable to call clear_pmi_irq_pending only if ↵Athira Rajeev1-3/+14
PMI is pending Running selftest with CONFIG_PPC_IRQ_SOFT_MASK_DEBUG enabled in kernel triggered below warning: [ 172.851380] ------------[ cut here ]------------ [ 172.851391] WARNING: CPU: 8 PID: 2901 at arch/powerpc/include/asm/hw_irq.h:246 power_pmu_disable+0x270/0x280 [ 172.851402] Modules linked in: dm_mod bonding nft_ct nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables rfkill nfnetlink sunrpc xfs libcrc32c pseries_rng xts vmx_crypto uio_pdrv_genirq uio sch_fq_codel ip_tables ext4 mbcache jbd2 sd_mod t10_pi sg ibmvscsi ibmveth scsi_transport_srp fuse [ 172.851442] CPU: 8 PID: 2901 Comm: lost_exception_ Not tainted 5.16.0-rc5-03218-g798527287598 #2 [ 172.851451] NIP: c00000000013d600 LR: c00000000013d5a4 CTR: c00000000013b180 [ 172.851458] REGS: c000000017687860 TRAP: 0700 Not tainted (5.16.0-rc5-03218-g798527287598) [ 172.851465] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE> CR: 48004884 XER: 20040000 [ 172.851482] CFAR: c00000000013d5b4 IRQMASK: 1 [ 172.851482] GPR00: c00000000013d5a4 c000000017687b00 c000000002a10600 0000000000000004 [ 172.851482] GPR04: 0000000082004000 c0000008ba08f0a8 0000000000000000 00000008b7ed0000 [ 172.851482] GPR08: 00000000446194f6 0000000000008000 c00000000013b118 c000000000d58e68 [ 172.851482] GPR12: c00000000013d390 c00000001ec54a80 0000000000000000 0000000000000000 [ 172.851482] GPR16: 0000000000000000 0000000000000000 c000000015d5c708 c0000000025396d0 [ 172.851482] GPR20: 0000000000000000 0000000000000000 c00000000a3bbf40 0000000000000003 [ 172.851482] GPR24: 0000000000000000 c0000008ba097400 c0000000161e0d00 c00000000a3bb600 [ 172.851482] GPR28: c000000015d5c700 0000000000000001 0000000082384090 c0000008ba0020d8 [ 172.851549] NIP [c00000000013d600] power_pmu_disable+0x270/0x280 [ 172.851557] LR [c00000000013d5a4] power_pmu_disable+0x214/0x280 [ 172.851565] Call Trace: [ 172.851568] [c000000017687b00] [c00000000013d5a4] power_pmu_disable+0x214/0x280 (unreliable) [ 172.851579] [c000000017687b40] [c0000000003403ac] perf_pmu_disable+0x4c/0x60 [ 172.851588] [c000000017687b60] [c0000000003445e4] __perf_event_task_sched_out+0x1d4/0x660 [ 172.851596] [c000000017687c50] [c000000000d1175c] __schedule+0xbcc/0x12a0 [ 172.851602] [c000000017687d60] [c000000000d11ea8] schedule+0x78/0x140 [ 172.851608] [c000000017687d90] [c0000000001a8080] sys_sched_yield+0x20/0x40 [ 172.851615] [c000000017687db0] [c0000000000334dc] system_call_exception+0x18c/0x380 [ 172.851622] [c000000017687e10] [c00000000000c74c] system_call_common+0xec/0x268 The warning indicates that MSR_EE being set(interrupt enabled) when there was an overflown PMC detected. This could happen in power_pmu_disable since it runs under interrupt soft disable condition ( local_irq_save ) and not with interrupts hard disabled. commit 2c9ac51b850d ("powerpc/perf: Fix PMU callbacks to clear pending PMI before resetting an overflown PMC") intended to clear PMI pending bit in Paca when disabling the PMU. It could happen that PMC gets overflown while code is in power_pmu_disable callback function. Hence add a check to see if PMI pending bit is set in Paca before clearing it via clear_pmi_pending. Fixes: 2c9ac51b850d ("powerpc/perf: Fix PMU callbacks to clear pending PMI before resetting an overflown PMC") Reported-by: Sachin Sant <[email protected]> Signed-off-by: Athira Rajeev <[email protected]> Tested-by: Sachin Sant <[email protected]> Reviewed-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-01-24powerpc/fixmap: Fix VM debug warning on unmapChristophe Leroy6-2/+18
Unmapping a fixmap entry is done by calling __set_fixmap() with FIXMAP_PAGE_CLEAR as flags. Today, powerpc __set_fixmap() calls map_kernel_page(). map_kernel_page() is not happy when called a second time for the same page. WARNING: CPU: 0 PID: 1 at arch/powerpc/mm/pgtable.c:194 set_pte_at+0xc/0x1e8 CPU: 0 PID: 1 Comm: swapper Not tainted 5.16.0-rc3-s3k-dev-01993-g350ff07feb7d-dirty #682 NIP: c0017cd4 LR: c00187f0 CTR: 00000010 REGS: e1011d50 TRAP: 0700 Not tainted (5.16.0-rc3-s3k-dev-01993-g350ff07feb7d-dirty) MSR: 00029032 <EE,ME,IR,DR,RI> CR: 42000208 XER: 00000000 GPR00: c0165fec e1011e10 c14c0000 c0ee2550 ff800000 c0f3d000 00000000 c001686c GPR08: 00001000 b00045a9 00000001 c0f58460 c0f50000 00000000 c0007e10 00000000 GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 GPR24: 00000000 00000000 c0ee2550 00000000 c0f57000 00000ff8 00000000 ff800000 NIP [c0017cd4] set_pte_at+0xc/0x1e8 LR [c00187f0] map_kernel_page+0x9c/0x100 Call Trace: [e1011e10] [c0736c68] vsnprintf+0x358/0x6c8 (unreliable) [e1011e30] [c0165fec] __set_fixmap+0x30/0x44 [e1011e40] [c0c13bdc] early_iounmap+0x11c/0x170 [e1011e70] [c0c06cb0] ioremap_legacy_serial_console+0x88/0xc0 [e1011e90] [c0c03634] do_one_initcall+0x80/0x178 [e1011ef0] [c0c0385c] kernel_init_freeable+0xb4/0x250 [e1011f20] [c0007e34] kernel_init+0x24/0x140 [e1011f30] [c0016268] ret_from_kernel_thread+0x5c/0x64 Instruction dump: 7fe3fb78 48019689 80010014 7c630034 83e1000c 5463d97e 7c0803a6 38210010 4e800020 81250000 712a0001 41820008 <0fe00000> 9421ffe0 93e1001c 48000030 Implement unmap_kernel_page() which clears an existing pte. Reported-by: Maxime Bizon <[email protected]> Signed-off-by: Christophe Leroy <[email protected]> Tested-by: Maxime Bizon <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/b0b752f6f6ecc60653e873f385c6f0dce4e9ab6a.1638789098.git.christophe.leroy@csgroup.eu
2022-01-23hwmon: (adt7470) Prevent divide by zero in adt7470_fan_write()Dan Carpenter1-0/+3
The "val" variable is controlled by the user and comes from hwmon_attr_store(). The FAN_RPM_TO_PERIOD() macro divides by "val" so a zero will crash the system. Check for that and return -EINVAL. Negatives are also invalid so return -EINVAL for those too. Fixes: fc958a61ff6d ("hwmon: (adt7470) Convert to devm_hwmon_device_register_with_info API") Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Guenter Roeck <[email protected]>
2022-01-23hwmon: (pmbus/ir38064) Mark ir38064_of_match as __maybe_unusedGuenter Roeck1-1/+1
If CONFIG_PM is not enabled, the following warning is reported. drivers/hwmon/pmbus/ir38064.c:54:34: warning: unused variable 'ir38064_of_match' Mark it as __maybe_unused. Reported-by: kernel test robot <[email protected]> Cc: Arthur Heymans <[email protected]> Signed-off-by: Guenter Roeck <[email protected]>
2022-01-23hwmon: (lm90) Fix sysfs and udev notificationsGuenter Roeck1-6/+6
sysfs and udev notifications need to be sent to the _alarm attributes, not to the value attributes. Fixes: 94dbd23ed88c ("hwmon: (lm90) Use hwmon_notify_event()") Cc: Dmitry Osipenko <[email protected]> Signed-off-by: Guenter Roeck <[email protected]>
2022-01-23hwmon: (lm90) Mark alert as broken for MAX6646/6647/6649Guenter Roeck1-1/+1
Experiments with MAX6646 and MAX6648 show that the alert function of those chips is broken, similar to other chips supported by the lm90 driver. Mark it accordingly. Fixes: 4667bcb8d8fc ("hwmon: (lm90) Introduce chip parameter structure") Signed-off-by: Guenter Roeck <[email protected]>
2022-01-23hwmon: (lm90) Mark alert as broken for MAX6680Guenter Roeck1-1/+1
Experiments with MAX6680 and MAX6681 show that the alert function of those chips is broken, similar to other chips supported by the lm90 driver. Mark it accordingly. Fixes: 4667bcb8d8fc ("hwmon: (lm90) Introduce chip parameter structure") Signed-off-by: Guenter Roeck <[email protected]>
2022-01-23hwmon: (lm90) Mark alert as broken for MAX6654Guenter Roeck1-0/+1
Experiments with MAX6654 show that its alert function is broken, similar to other chips supported by the lm90 driver. Mark it accordingly. Fixes: 229d495d8189 ("hwmon: (lm90) Add max6654 support to lm90 driver") Cc: Josh Lehan <[email protected]> Signed-off-by: Guenter Roeck <[email protected]>
2022-01-23hwmon: (lm90) Re-enable interrupts after alert clearsGuenter Roeck1-1/+1
If alert handling is broken, interrupts are disabled after an alert and re-enabled after the alert clears. However, if there is an interrupt handler, this does not apply if alerts were originally disabled and enabled when the driver was loaded. In that case, interrupts will stay disabled after an alert was handled though the alert handler even after the alert condition clears. Address the situation by always re-enabling interrupts after the alert condition clears if there is an interrupt handler. Fixes: 2abdc357c55d9 ("hwmon: (lm90) Unmask hardware interrupt") Cc: Dmitry Osipenko <[email protected]> Signed-off-by: Guenter Roeck <[email protected]>
2022-01-23hwmon: (lm90) Reduce maximum conversion rate for G781Guenter Roeck1-1/+1
According to its datasheet, G781 supports a maximum conversion rate value of 8 (62.5 ms). However, chips labeled G781 and G780 were found to only support a maximum conversion rate value of 7 (125 ms). On the other side, chips labeled G781-1 and G784 were found to support a conversion rate value of 8. There is no known means to distinguish G780 from G781 or G784; all chips report the same manufacturer ID and chip revision. Setting the conversion rate register value to 8 on chips not supporting it causes unexpected behavior since the real conversion rate is set to 0 (16 seconds) if a value of 8 is written into the conversion rate register. Limit the conversion rate register value to 7 for all G78x chips to avoid the problem. Fixes: ae544f64cc7b ("hwmon: (lm90) Add support for GMT G781") Signed-off-by: Guenter Roeck <[email protected]>
2022-01-23Drivers: hv: balloon: account for vmbus packet header in max_pkt_sizeYanming Liu1-0/+7
Commit adae1e931acd ("Drivers: hv: vmbus: Copy packets sent by Hyper-V out of the ring buffer") introduced a notion of maximum packet size in vmbus channel and used that size to initialize a buffer holding all incoming packet along with their vmbus packet header. hv_balloon uses the default maximum packet size VMBUS_DEFAULT_MAX_PKT_SIZE which matches its maximum message size, however vmbus_open expects this size to also include vmbus packet header. This leads to 4096 bytes dm_unballoon_request messages being truncated to 4080 bytes. When the driver tries to read next packet it starts from a wrong read_index, receives garbage and prints a lot of "Unhandled message: type: <garbage>" in dmesg. Allocate the buffer with HV_HYP_PAGE_SIZE more bytes to make room for the header. Fixes: adae1e931acd ("Drivers: hv: vmbus: Copy packets sent by Hyper-V out of the ring buffer") Suggested-by: Michael Kelley (LINUX) <[email protected]> Suggested-by: Andrea Parri (Microsoft) <[email protected]> Signed-off-by: Yanming Liu <[email protected]> Reviewed-by: Michael Kelley <[email protected]> Reviewed-by: Andrea Parri (Microsoft) <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Wei Liu <[email protected]>
2022-01-23block: fix memory leak in disk_register_independent_access_rangesMiaoqian Lin1-1/+1
kobject_init_and_add() takes reference even when it fails. According to the doc of kobject_init_and_add() If this function returns an error, kobject_put() must be called to properly clean up the memory associated with the object. Fix this issue by adding kobject_put(). Callback function blk_ia_ranges_sysfs_release() in kobject_put() can handle the pointer "iars" properly. Fixes: a2247f19ee1c ("block: Add independent access ranges support") Signed-off-by: Miaoqian Lin <[email protected]> Reviewed-by: Damien Le Moal <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-01-23io_uring: fix bug in slow unregistering of nodesDylan Yudaken1-1/+6
In some cases io_rsrc_ref_quiesce will call io_rsrc_node_switch_start, and then immediately flush the delayed work queue &ctx->rsrc_put_work. However the percpu_ref_put does not immediately destroy the node, it will be called asynchronously via RCU. That ends up with io_rsrc_node_ref_zero only being called after rsrc_put_work has been flushed, and so the process ends up sleeping for 1 second unnecessarily. This patch executes the put code immediately if we are busy quiescing. Fixes: 4a38aed2a0a7 ("io_uring: batch reap of dead file registrations") Signed-off-by: Dylan Yudaken <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-01-23Merge tag 'powerpc-5.17-2' of ↵Linus Torvalds16-87/+131
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - A series of bpf fixes, including an oops fix and some codegen fixes. - Fix a regression in syscall_get_arch() for compat processes. - Fix boot failure on some 32-bit systems with KASAN enabled. - A couple of other build/minor fixes. Thanks to Athira Rajeev, Christophe Leroy, Dmitry V. Levin, Jiri Olsa, Johan Almbladh, Maxime Bizon, Naveen N. Rao, and Nicholas Piggin. * tag 'powerpc-5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/64s: Mask SRR0 before checking against the masked NIP powerpc/perf: Only define power_pmu_wants_prompt_pmi() for CONFIG_PPC64 powerpc/32s: Fix kasan_init_region() for KASAN powerpc/time: Fix build failure due to do_hard_irq_enable() on PPC32 powerpc/audit: Fix syscall_get_arch() powerpc64/bpf: Limit 'ldbrx' to processors compliant with ISA v2.06 tools/bpf: Rename 'struct event' to avoid naming conflict powerpc/bpf: Update ldimm64 instructions during extra pass powerpc32/bpf: Fix codegen for bpf-to-bpf calls bpf: Guard against accessing NULL pt_regs in bpf_get_task_stack()
2022-01-23Merge tag 'irq_urgent_for_v5.17_rc2' of ↵Linus Torvalds2-3/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fix from Borislav Petkov: "A single use-after-free fix in the PCI MSI irq domain allocation path" * tag 'irq_urgent_for_v5.17_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: PCI/MSI: Prevent UAF in error path
2022-01-23Merge tag 'sched_urgent_for_v5.17_rc2' of ↵Linus Torvalds10-103/+125
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Borislav Petkov: "A bunch of fixes: forced idle time accounting, utilization values propagation in the sched hierarchies and other minor cleanups and improvements" * tag 'sched_urgent_for_v5.17_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: kernel/sched: Remove dl_boosted flag comment sched: Avoid double preemption in __cond_resched_*lock*() sched/fair: Fix all kernel-doc warnings sched/core: Accounting forceidle time for all tasks except idle task sched/pelt: Relax the sync of load_sum with load_avg sched/pelt: Relax the sync of runnable_sum with runnable_avg sched/pelt: Continue to relax the sync of util_sum with util_avg sched/pelt: Relax the sync of util_sum with util_avg psi: Fix uaf issue when psi trigger is destroyed while being polled