Age  Commit message  Author  Files  Lines
2022-04-22  serial: 8250: Also set sticky MCR bits in console restoration  Maciej W. Rozycki  1  -1/+1
Sticky MCR bits are lost in console restoration if console suspending has been disabled. This currently affects the AFE bit, which works in combination with RTS which we set, so we want to make sure the UART retains control of its FIFO where previously requested. Also specific drivers may need other bits in the future. Signed-off-by: Maciej W. Rozycki <[email protected]> Fixes: 4516d50aabed ("serial: 8250: Use canary to restart console after suspend") Cc: [email protected] # v4.0+ Reviewed-by: Andy Shevchenko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
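The effect of the fix can be sketched as follows. This is a minimal illustration, not the driver's code: the `port_state` struct, its `sticky_mcr` field, and the helper name are hypothetical, and the bit values are assumed to follow the usual 16550/16750 Modem Control Register layout.

```c
#include <stdint.h>

/* Modem Control Register bits, assumed 16550/16750 layout. */
#define MCR_DTR 0x01u /* Data Terminal Ready */
#define MCR_RTS 0x02u /* Request To Send */
#define MCR_AFE 0x20u /* Auto Flow control Enable */

/* Hypothetical port state: bits that must survive every MCR rewrite. */
struct port_state {
	uint8_t sticky_mcr;
};

/* Value written to MCR on console restoration: DTR and RTS plus sticky bits. */
static uint8_t console_restore_mcr(const struct port_state *p)
{
	return (uint8_t)(MCR_DTR | MCR_RTS | p->sticky_mcr);
}
```

Without the OR of the sticky bits, a port that had requested AFE would come back from restoration with only DTR and RTS set.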
2022-04-22  tty: n_gsm: fix software flow control handling  Daniel Starke  1  -0/+16
n_gsm is based on the 3GPP 07.010 and its newer version is the 3GPP 27.010. See https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=1516 The changes from 07.010 to 27.010 are non-functional. Therefore, I refer to the newer 27.010 here. Chapter 5.4.8.1 states that XON/XOFF characters shall be used instead of Fcon/Fcoff command in advanced option mode to handle flow control. Chapter 5.4.8.2 describes how XON/XOFF characters shall be handled. Basic option mode only uses Fcon/Fcoff commands and no XON/XOFF characters. These are treated as data bytes here. The current implementation uses the gsm_mux field 'constipated' to handle flow control from the remote peer and the gsm_dlci field 'constipated' to handle flow control from each DLCI. The latter is unrelated to this patch. The gsm_mux field is correctly set for Fcon/Fcoff commands in gsm_control_message(). However, the same is not true for XON/XOFF characters in gsm1_receive(). Disable software flow control handling in the tty to allow explicit handling by n_gsm. Add the missing handling in advanced option mode for gsm_mux in gsm1_receive() to comply with the standard. This patch depends on the following commit: Commit 8838b2af23ca ("tty: n_gsm: fix SW flow control encoding/handling") Fixes: e1eaea46bb40 ("tty: n_gsm line discipline") Cc: [email protected] Signed-off-by: Daniel Starke <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
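A rough sketch of explicit XON/XOFF handling on the receive path is shown below. It is not the n_gsm code: the `mux` struct and `rx_flow_control()` are hypothetical stand-ins, and the sketch assumes the conventional meaning of the characters (XOFF pauses transmission, XON resumes it).

```c
#include <stdbool.h>
#include <stdint.h>

#define XON  0x11u /* DC1: peer can accept data again */
#define XOFF 0x13u /* DC3: peer asks us to stop sending */

/* Hypothetical, much reduced stand-in for the mux state. */
struct mux {
	bool constipated; /* true while the remote peer has flow-controlled us */
};

/*
 * Returns true if the byte was a flow control character and was consumed,
 * false if it is ordinary frame data that the caller should keep parsing.
 */
static bool rx_flow_control(struct mux *m, uint8_t c)
{
	if (c == XOFF) {
		m->constipated = true;
		return true;
	}
	if (c == XON) {
		m->constipated = false;
		return true;
	}
	return false;
}
```

In basic option mode the same bytes would simply be passed through as data, which is why the check belongs only in the advanced option receive path.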
2022-04-22  tty: n_gsm: fix invalid use of MSC in advanced option  Daniel Starke  1  -8/+117
n_gsm is based on the 3GPP 07.010 and its newer version is the 3GPP 27.010. See https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=1516 The changes from 07.010 to 27.010 are non-functional. Therefore, I refer to the newer 27.010 here. Chapter 5.4.6.3.7 states that the Modem Status Command (MSC) shall only be used if the basic option was chosen. The current implementation uses MSC frames even if advanced option was chosen to inform the peer about modem line state updates. A standard-conforming peer may choose to discard these frames in advanced option mode. Furthermore, gsmtty_modem_update() is not part of the 'tty_operations' functions despite its name. Rename gsmtty_modem_update() to gsm_modem_update() to clarify this. Split its function into gsm_modem_upd_via_data() and gsm_modem_upd_via_msc() depending on the encoding and adaptation. Introduce gsm_dlci_modem_output() as an adaptation of gsm_dlci_data_output() to encode and queue empty frames in advanced option mode. Use it in gsm_modem_upd_via_data(). gsm_modem_upd_via_msc() is based on the initial gsmtty_modem_update() function which used only MSC frames to update modem states. Fixes: e1eaea46bb40 ("tty: n_gsm line discipline") Cc: [email protected] Signed-off-by: Daniel Starke <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2022-04-22  tty: n_gsm: fix broken virtual tty handling  Daniel Starke  1  -72/+15
Dynamic virtual tty registration was introduced to allow the user to handle these cases with uevent rules. The following commits relate to this:
Commit 5b87686e3203 ("tty: n_gsm: Modify gsmtty driver register method when config requester")
Commit 0b91b5332368 ("tty: n_gsm: Save dlci address open status when config requester")
Commit 46292622ad73 ("tty: n_gsm: clean up indenting in gsm_queue()")
However, the following behavior can be seen with this implementation:
- n_gsm ldisc is activated via ioctl
- all configuration parameters are set to their default value (initiator=0)
- the mux gets activated and attached and gsmtty0 is being registered in gsm_dlci_open() after DLCI 0 was established (DLCI 0 is the control channel)
- the user configures n_gsm via ioctl GSMIOC_SETCONF as initiator
- this re-attaches the n_gsm mux
- no new gsmtty devices are registered in gsmld_attach_gsm() because the mux is already active
- the initiator side registered only the control channel as gsmtty0 (which should never happen) and no user channel tty
The commits above make it impossible to operate the initiator side as no user channel tty is or will be available. On the other hand, this behavior will also make it impossible to allow DLCI parameter negotiation on responder side in the future. The responder side first needs to provide a device for the application before the application can set its parameters of the associated DLCI via ioctl. Note that the user application is still able to detect a link establishment without relying on uevent by waiting for DTR open on responder side. This is the same behavior as on a physical serial interface. And on initiator side a tty hangup can be detected if a link establishment request failed. Revert the commits above completely to always register all user channels and no control channel after mux attachment. No other changes are made.
Fixes: 5b87686e3203 ("tty: n_gsm: Modify gsmtty driver register method when config requester") Cc: [email protected] Signed-off-by: Daniel Starke <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2022-04-22  Revert "serial: sc16is7xx: Clear RS485 bits in the shutdown"  Hui Wang  1  -4/+2
This reverts commit 927728a34f11b5a27f4610bdb7068317d6fdc72a. Once the uart_port->rs485->flag is set to SER_RS485_ENABLED, the port should always work in RS485 mode. If users want the port to leave RS485 mode, they need to call ioctl() to clear SER_RS485_ENABLED. So here we shouldn't clear the RS485 bits in the shutdown(). Fixes: 927728a34f11 ("serial: sc16is7xx: Clear RS485 bits in the shutdown") Signed-off-by: Hui Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2022-04-22  Merge tag 'icc-5.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/djakov/icc into char-misc-next  Greg Kroah-Hartman  2  -42/+0
Georgi writes: interconnect fixes for v5.18 This contains a fix for a reported issue on sc7180 platforms, where one of the resources has been incorrectly modelled as both clock and interconnect, which is causing a crash when both frameworks try to manage it. Fix the same issue also on another platform that appears to be affected by the same.
- interconnect: qcom: sc7180: Drop IP0 interconnects
- interconnect: qcom: sdx55: Drop IP0 interconnects
Signed-off-by: Georgi Djakov <[email protected]>
* tag 'icc-5.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/djakov/icc:
  interconnect: qcom: sdx55: Drop IP0 interconnects
  interconnect: qcom: sc7180: Drop IP0 interconnects
2022-04-22  Merge tag 'phy-fixes-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy into char-misc-linus  Greg Kroah-Hartman  7  -20/+41
Vinod writes: phy: fixes for 5.18 Fixes for bunch of drivers:
- TI fixes for runtime disable, missing of_node_put and error handling
- Samsung fixes for device_put and of_node_put
- Amlogic error path handling
* tag 'phy-fixes-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy:
  phy: amlogic: fix error path in phy_g12a_usb3_pcie_probe()
  phy: ti: Add missing pm_runtime_disable() in serdes_am654_probe
  phy: mapphone-mdm6600: Fix PM error handling in phy_mdm6600_probe
  phy: ti: omap-usb2: Fix error handling in omap_usb2_enable_clocks
  phy: ti: tusb1210: Fix an error handling path in tusb1210_probe()
  phy: samsung: exynos5250-sata: fix missing device put in probe error paths
  phy: samsung: Fix missing of_node_put() in exynos_sata_phy_probe
  phy: ti: Fix missing of_node_put in ti_pipe3_get_sysctrl()
  phy: ti: tusb1210: Make tusb1210_chg_det_states static
2022-04-22  netfilter: nft_set_rbtree: overlap detection with element re-addition after deletion  Pablo Neira Ayuso  1  -1/+5
This patch fixes spurious EEXIST errors. Extend d2df92e98a34 ("netfilter: nft_set_rbtree: handle element re-addition after deletion") to deal with elements with same end flags in the same transaction. Reset the overlap flag as described by 7c84d41416d8 ("netfilter: nft_set_rbtree: Detect partial overlaps on insertion"). Fixes: 7c84d41416d8 ("netfilter: nft_set_rbtree: Detect partial overlaps on insertion") Fixes: d2df92e98a34 ("netfilter: nft_set_rbtree: handle element re-addition after deletion") Signed-off-by: Pablo Neira Ayuso <[email protected]> Reviewed-by: Stefano Brivio <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2022-04-22  Merge tag 'mhi-fixes-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/mani/mhi into char-misc-linus  Greg Kroah-Hartman  1  -0/+2
Manivannan writes: MHI fixes for v5.18 Couple of patches fixing the hibernation issue seen on MHI endpoint devices like SDX65 modems:
- During hibernation, the host puts the device into D3cold after thaw() stage. But at that time, the device would be in M0 state. So the device emits a warning (not visible to the host but to device firmware only) stating invalid transition. This is fixed by adding a poweroff() callback that puts the device into M3 before D3cold.
- There is a possibility that the recovery worker might be running while trying to powerdown the device. So flush the recovery worker before that.
* tag 'mhi-fixes-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/mani/mhi:
  bus: mhi: host: pci_generic: Flush recovery worker during freeze
  bus: mhi: host: pci_generic: Add missing poweroff() PM callback
2022-04-22  usb: dwc3: core: Only handle soft-reset in DCTL  Thinh Nguyen  1  -1/+2
Make sure not to set run_stop bit or link state change request while initiating soft-reset. Register read-modify-write operation may unintentionally start the controller before the initialization completes with its previous DCTL value, which can cause initialization failure. Fixes: f59dcab17629 ("usb: dwc3: core: improve reset sequence") Cc: <[email protected]> Signed-off-by: Thinh Nguyen <[email protected]> Link: https://lore.kernel.org/r/6aecbd78328f102003d40ccf18ceeebd411d3703.1650594792.git.Thinh.Nguyen@synopsys.com Signed-off-by: Greg Kroah-Hartman <[email protected]>
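The hazard with a plain read-modify-write can be illustrated with a small sketch. The bit positions below are assumptions made for illustration and should be checked against the controller databook; the helper name is hypothetical and this is not the driver's code.

```c
#include <stdint.h>

/* Assumed DCTL bit layout, for illustration only. */
#define DCTL_RUN_STOP         (1u << 31)  /* starts the controller */
#define DCTL_CSFTRST          (1u << 30)  /* core soft reset */
#define DCTL_ULSTCHNGREQ_MASK (0xfu << 5) /* link state change request */

/*
 * Compute the value to write when initiating a soft reset: request the
 * reset, but never carry over run/stop or a pending link state request
 * from the previously read register value.
 */
static uint32_t dctl_soft_reset_value(uint32_t dctl)
{
	dctl |= DCTL_CSFTRST;
	dctl &= ~DCTL_RUN_STOP;
	dctl &= ~DCTL_ULSTCHNGREQ_MASK;
	return dctl;
}
```

Writing back the old value with only the reset bit added would keep run/stop set if it was set before, which is the unintended start the commit message describes.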
2022-04-22  net: dsa: Add missing of_node_put() in dsa_port_link_register_of  Miaoqian Lin  1  -0/+2
The device_node pointer is returned by of_parse_phandle() with refcount incremented. We should use of_node_put() on it when done. of_node_put() will check for NULL value. Fixes: a20f997010c4 ("net: dsa: Don't instantiate phylink for CPU/DSA ports unless needed") Signed-off-by: Miaoqian Lin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
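The get/put pairing can be modelled in plain C. This is a toy model under stated assumptions: `struct node`, `node_get()`, `node_put()` and `port_has_link()` are hypothetical names that only mimic the reference counting convention described above, not the device tree API itself.

```c
#include <stddef.h>

/* Toy stand-in for a reference-counted device tree node. */
struct node {
	int refcount;
};

/* Mimics a lookup that returns the node with its refcount incremented. */
static struct node *node_get(struct node *n)
{
	if (n)
		n->refcount++;
	return n;
}

/* Mimics the put side: drops one reference and tolerates NULL. */
static void node_put(struct node *n)
{
	if (n)
		n->refcount--;
}

/* Reports whether a link node exists and leaves the refcount balanced. */
static int port_has_link(struct node *phy)
{
	struct node *np = node_get(phy);
	int found = (np != NULL);

	node_put(np); /* safe for NULL, so no separate check is needed */
	return found;
}
```

Omitting the `node_put()` call would leave the count one higher after every call, which is the leak the patch closes.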
2022-04-22  regulator: dt-bindings: Revise the rt5190a buck/ldo description  ChiYuan Huang  1  -1/+1
Revise the rt5190a bucks and ldo property description. Signed-off-by: ChiYuan Huang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]>
2022-04-22  arm64: mm: fix p?d_leaf()  Muchun Song  1  -2/+2
The pmd_leaf() is used to test a leaf mapped PMD, however, it misses the PROT_NONE mapped PMD on arm64. Fix it. A real world issue [1] caused by this was reported by Qian Cai. Also fix pud_leaf(). Link: https://patchwork.kernel.org/comment/24798260/ [1] Fixes: 8aa82df3c123 ("arm64: mm: add p?d_leaf() definitions") Reported-by: Qian Cai <[email protected]> Signed-off-by: Muchun Song <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2022-04-22  net: cosa: fix error check return value of register_chrdev()  Lv Ruyi  1  -1/+1
If major equals 0, register_chrdev() returns an error code when it fails. In that case the function dynamically allocates a major and returns its number on success, so we should use "< 0" to check it instead of "!". Reported-by: Zeal Robot <[email protected]> Signed-off-by: Lv Ruyi <[email protected]> Acked-By: Jan "Yenya" Kasprzak <[email protected]> Signed-off-by: David S. Miller <[email protected]>
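The difference between the two checks can be shown with a stand-in that follows the return convention described above. `fake_register_chrdev()` and the major number 240 are hypothetical; only the convention (negative errno on failure, allocated major on success when major is 0) is taken from the text.

```c
#include <errno.h>

/* Hypothetical stand-in that mimics the described return convention. */
static int fake_register_chrdev(unsigned int major, int fail)
{
	if (fail)
		return -EBUSY;      /* negative errno on failure */
	return major ? 0 : 240; /* 240: arbitrary dynamically allocated major */
}

/* Old check: treats only a zero return as failure. */
static int failed_old_check(int ret)
{
	return !ret;
}

/* Fixed check: treats any negative return as failure. */
static int failed_new_check(int ret)
{
	return ret < 0;
}
```

With a dynamic major, a failing call returns a negative value, which the old check lets through as if it were a valid major number.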
2022-04-21  Merge tag 'drm-fixes-2022-04-22' of git://anongit.freedesktop.org/drm/drm  Linus Torvalds  3  -24/+33
Pull drm fixes from Dave Airlie: "Extra quiet after Easter, only have minor i915 and msm pulls. However I haven't seen a PR from our misc tree in a little while, I've cc'ed all the suspects. Once that unblocks I expect a bit larger bunch of patches to arrive. Otherwise as I said, one msm revert and two i915 fixes.
msm:
- revert iommu change that broke some platforms.
i915:
- Unset enable_psr2_sel_fetch if PSR2 detection fails
- Fix to detect when VRR is turned off from panel settings"
* tag 'drm-fixes-2022-04-22' of git://anongit.freedesktop.org/drm/drm:
  drm/i915/display/psr: Unset enable_psr2_sel_fetch if other checks in intel_psr2_config_valid() fails
  drm/msm: Revert "drm/msm: Stop using iommu_present()"
  drm/i915/display/vrr: Reset VRR capable property on a long hpd
2022-04-21  mm/mmu_notifier.c: fix race in mmu_interval_notifier_remove()  Alistair Popple  1  -1/+13
In some cases it is possible for mmu_interval_notifier_remove() to race with mn_tree_inv_end() allowing it to return while the notifier data structure is still in use. Consider the following sequence:

  CPU0 - mn_tree_inv_end()            CPU1 - mmu_interval_notifier_remove()
  ----------------------------------- ------------------------------------
                                      spin_lock(subscriptions->lock);
                                      seq = subscriptions->invalidate_seq;
  spin_lock(subscriptions->lock);     spin_unlock(subscriptions->lock);
  subscriptions->invalidate_seq++;
                                      wait_event(invalidate_seq != seq);
                                      return;
  interval_tree_remove(interval_sub); kfree(interval_sub);
  spin_unlock(subscriptions->lock);
  wake_up_all();

As the wait_event() condition is true it will return immediately. This can lead to use-after-free type errors if the caller frees the data structure containing the interval notifier subscription while it is still on a deferred list. Fix this by taking the appropriate lock when reading invalidate_seq to ensure proper synchronisation. I observed this whilst running stress testing during some development. You do have to be pretty unlucky, but it leads to the usual problems of use-after-free (memory corruption, kernel crash, difficult to diagnose WARN_ON, etc). Link: https://lkml.kernel.org/r/[email protected] Fixes: 99cb252f5e68 ("mm/mmu_notifier: add an interval tree notifier") Signed-off-by: Alistair Popple <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> Cc: Christian König <[email protected]> Cc: John Hubbard <[email protected]> Cc: Ralph Campbell <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-04-21  kcov: don't generate a warning on vm_insert_page()'s failure  Aleksandr Nogikh  1  -2/+5
vm_insert_page()'s failure is not an unexpected condition, so don't do WARN_ONCE() in such a case. Instead, print a kernel message and just return an error code. This flaw has been reported under an OOM condition by syzbot [1]. The message is mainly for the benefit of the test log, in this case the fuzzer's log so that humans inspecting the log can figure out what was going on. KCOV is a testing tool, so I think being a little more chatty when KCOV unexpectedly is about to fail will save someone debugging time. We don't want the WARN, because it's not a kernel bug that syzbot should report, and failure can happen if the fuzzer tries hard enough (as above). Link: https://lkml.kernel.org/r/[email protected] [1] Link: https://lkml.kernel.org/r/[email protected] Fixes: b3d7fe86fbd0 ("kcov: properly handle subsequent mmap calls") Signed-off-by: Aleksandr Nogikh <[email protected]> Acked-by: Marco Elver <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Andrey Konovalov <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Taras Madan <[email protected]> Cc: Sebastian Andrzej Siewior <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-04-21  MAINTAINERS: add Vincenzo Frascino to KASAN reviewers  Vincenzo Frascino  1  -0/+1
Add my email address to KASAN reviewers list to make sure that I am Cc'ed in all the KASAN changes that may affect arm64 MTE. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Vincenzo Frascino <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Andrey Konovalov <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Dmitry Vyukov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-04-21  oom_kill.c: futex: delay the OOM reaper to allow time for proper futex cleanup  Nico Pache  2  -14/+41
The pthread struct is allocated on PRIVATE|ANONYMOUS memory [1] which can be targeted by the oom reaper. This mapping is used to store the futex robust list head; the kernel does not keep a copy of the robust list and instead references a userspace address to maintain the robustness during a process death. A race can occur between exit_mm and the oom reaper that allows the oom reaper to free the memory of the futex robust list before the exit path has handled the futex death:

  CPU1                                  CPU2
  --------------------------------------------------------------------
  page_fault
  do_exit "signal"
  wake_oom_reaper
                                        oom_reaper
                                        oom_reap_task_mm (invalidates mm)
  exit_mm
  exit_mm_release
  futex_exit_release
  futex_cleanup
  exit_robust_list
  get_user (EFAULT- can't access memory)

If the get_user EFAULT's, the kernel will be unable to recover the waiters on the robust_list, leaving userspace mutexes hung indefinitely. Delay the OOM reaper, allowing more time for the exit path to perform the futex cleanup. Reproducer: https://gitlab.com/jsavitz/oom_futex_reproducer Based on a patch by Michal Hocko. Link: https://elixir.bootlin.com/glibc/glibc-2.35/source/nptl/allocatestack.c#L370 [1] Link: https://lkml.kernel.org/r/[email protected] Fixes: 212925802454 ("mm: oom: let oom_reap_task and exit_mmap run concurrently") Signed-off-by: Joel Savitz <[email protected]> Signed-off-by: Nico Pache <[email protected]> Co-developed-by: Joel Savitz <[email protected]> Suggested-by: Thomas Gleixner <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Rafael Aquini <[email protected]> Cc: Waiman Long <[email protected]> Cc: Herton R. Krzesinski <[email protected]> Cc: Juri Lelli <[email protected]> Cc: Vincent Guittot <[email protected]> Cc: Dietmar Eggemann <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Ben Segall <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Daniel Bristot de Oliveira <[email protected]> Cc: David Rientjes <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Joel Savitz <[email protected]> Cc: Darren Hart <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-04-21  selftest/vm: add skip support to mremap_test  Sidhartha Kumar  1  -3/+8
Allow the mremap test to be skipped due to errors such as failing to parse the mmap_min_addr sysctl. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sidhartha Kumar <[email protected]> Reviewed-by: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-04-21  selftest/vm: support xfail in mremap_test  Sidhartha Kumar  1  -1/+1
Use ksft_test_result_xfail for the tests which are expected to fail. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sidhartha Kumar <[email protected]> Reviewed-by: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-04-21  selftest/vm: verify remap destination address in mremap_test  Sidhartha Kumar  1  -3/+39
Because mremap does not have a MAP_FIXED_NOREPLACE flag, it can destroy existing mappings. This causes a segfault when regions such as text are remapped and the permissions are changed. Verify the requested mremap destination address does not overlap any existing mappings by using mmap's MAP_FIXED_NOREPLACE flag. Keep incrementing the destination address until a valid mapping is found or fail the current test once the max address is reached. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sidhartha Kumar <[email protected]> Reviewed-by: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
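The probing technique can be sketched as below. This is an illustration rather than the selftest's code: `range_is_free()` is a hypothetical helper, and the fallback definition of `MAP_FIXED_NOREPLACE` assumes the generic Linux value in case older headers lack it.

```c
#define _GNU_SOURCE
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>

#ifndef MAP_FIXED_NOREPLACE
#define MAP_FIXED_NOREPLACE 0x100000 /* assumed generic Linux value */
#endif

/*
 * Probe whether [addr, addr + len) is free of existing mappings by asking
 * mmap() for exactly that range without permission to replace anything.
 * The probe mapping is released again before returning.
 */
static bool range_is_free(void *addr, size_t len)
{
	void *p = mmap(addr, len, PROT_NONE,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED_NOREPLACE,
		       -1, 0);

	if (p == MAP_FAILED)
		return false; /* typically EEXIST: something is mapped there */

	munmap(p, len);
	/* Kernels that ignore the flag treat addr as a hint; reject that. */
	return p == addr;
}
```

A caller would advance the candidate destination until `range_is_free()` succeeds, and only then pass it to mremap(), giving up once the maximum address is reached.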
2022-04-21  selftest/vm: verify mmap addr in mremap_test  Sidhartha Kumar  1  -1/+40
Avoid calling mmap with requested addresses that are less than the system's mmap_min_addr. When run as root, mmap returns EACCES when trying to map addresses < mmap_min_addr. This is not one of the error codes for the condition to retry the mmap in the test. Rather than arbitrarily retrying on EACCES, don't attempt an mmap until addr > vm.mmap_min_addr. Add a munmap call after an alignment check as the mappings are retained after the retry and can reach the vm.max_map_count sysctl. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sidhartha Kumar <[email protected]> Reviewed-by: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-04-21  mm, hugetlb: allow for "high" userspace addresses  Christophe Leroy  3  -12/+13
This is a fix for commit f6795053dac8 ("mm: mmap: Allow for "high" userspace addresses") for hugetlb. This patch adds support for "high" userspace addresses that are optionally supported on the system and have to be requested via a hint mechanism ("high" addr parameter to mmap). Architectures such as powerpc and x86 achieve this by making changes to their architectural versions of hugetlb_get_unmapped_area() function. However, arm64 uses the generic version of that function. So take into account arch_get_mmap_base() and arch_get_mmap_end() in hugetlb_get_unmapped_area(). To allow that, move those two macros out of mm/mmap.c into include/linux/sched/mm.h If these macros are not defined in architectural code then they default to (TASK_SIZE) and (base) so should not introduce any behavioural changes to architectures that do not define them. For the time being, only ARM64 is affected by this change. Catalin (ARM64) said "We should have fixed hugetlb_get_unmapped_area() as well when we added support for 52-bit VA. The reason for commit f6795053dac8 was to prevent normal mmap() from returning addresses above 48-bit by default as some user-space had hard assumptions about this. It's a slight ABI change if you do this for hugetlb_get_unmapped_area() but I doubt anyone would notice. It's more likely that the current behaviour would cause issues, so I'd rather have them consistent. Basically when arm64 gained support for 52-bit addresses we did not want user-space calling mmap() to suddenly get such high addresses, otherwise we could have inadvertently broken some programs (similar behaviour to x86 here). Hence we added commit f6795053dac8. But we missed hugetlbfs which could still get such high mmap() addresses. 
So in theory that's a potential regression that should have been addressed at the same time as commit f6795053dac8 (and before arm64 enabled 52-bit addresses)" Link: https://lkml.kernel.org/r/ab847b6edb197bffdfe189e70fb4ac76bfe79e0d.1650033747.git.christophe.leroy@csgroup.eu Fixes: f6795053dac8 ("mm: mmap: Allow for "high" userspace addresses") Signed-off-by: Christophe Leroy <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Cc: Steve Capper <[email protected]> Cc: Will Deacon <[email protected]> Cc: <[email protected]> [5.0.x] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-04-21  userfaultfd: mark uffd_wp regardless of VM_WRITE flag  Nadav Amit  1  -6/+9
When a PTE is set by UFFD operations such as UFFDIO_COPY, the PTE is currently only marked as write-protected if the VMA has VM_WRITE flag set. This seems incorrect or at least would be unexpected by the users. Consider the following sequence of operations that are being performed on a certain page:

  mprotect(PROT_READ)
  UFFDIO_COPY(UFFDIO_COPY_MODE_WP)
  mprotect(PROT_READ|PROT_WRITE)

At this point the user would expect to still get UFFD notification when the page is accessed for write, but the user would not get one, since the PTE was not marked as UFFD_WP during UFFDIO_COPY. Fix it by always marking PTEs as UFFD_WP regardless of the write-permission in the VMA flags. Link: https://lkml.kernel.org/r/[email protected] Fixes: 292924b26024 ("userfaultfd: wp: apply _PAGE_UFFD_WP bit") Signed-off-by: Nadav Amit <[email protected]> Acked-by: Peter Xu <[email protected]> Cc: Axel Rasmussen <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Andrea Arcangeli <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-04-21  memcg: sync flush only if periodic flush is delayed  Shakeel Butt  3  -2/+17
Daniel Dao has reported [1] a regression on workloads that may trigger a lot of refaults (anon and file). The underlying issue is that flushing rstat is expensive. Although rstat flushes are batched with (nr_cpus * MEMCG_BATCH) stat updates, it seems like there are workloads which genuinely do stat updates larger than the batch value within a short amount of time. Since the rstat flush can happen in the performance critical codepaths like page faults, such workloads can suffer greatly. This patch fixes this regression by making the rstat flushing conditional in the performance critical codepaths. More specifically, the kernel relies on the async periodic rstat flusher to flush the stats and only if the periodic flusher is delayed by more than twice the amount of its normal time window then the kernel allows rstat flushing from the performance critical codepaths. Now the question: what are the side-effects of this change? The worst that can happen is the refault codepath will see 4sec old lruvec stats and may cause false (or missed) activations of the refaulted page which may under-or-overestimate the workingset size. Though that is not very concerning as the kernel can already miss or do false activations. There are two more codepaths whose flushing behavior is not changed by this patch and we may need to come to them in future. One is the writeback stats used by dirty throttling and second is the deactivation heuristic in the reclaim. For now we are keeping an eye on them, and if there is a report of a regression due to these codepaths, we will reevaluate then.
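The gating logic described above can be sketched in a few lines. This is a simplified model, not the memcg code: the `flusher` struct, its field names, and both helpers are hypothetical, and time is an abstract tick counter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the periodic flusher's bookkeeping. */
struct flusher {
	uint64_t period;     /* normal interval between periodic flushes */
	uint64_t next_flush; /* deadline after which sync flushing is allowed */
};

/* Called by the periodic worker each time it has flushed the stats. */
static void periodic_flush_done(struct flusher *f, uint64_t now)
{
	/* Allow hot paths to flush only once two full periods have passed. */
	f->next_flush = now + 2 * f->period;
}

/* Called from a performance critical path such as a refault. */
static bool should_sync_flush(const struct flusher *f, uint64_t now)
{
	return now > f->next_flush;
}
```

With a period of two seconds this yields the four second worst case staleness mentioned in the commit message; as long as the periodic worker keeps up, the hot path never flushes.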
Link: https://lore.kernel.org/all/CA+wXwBSyO87ZX5PVwdHm-=dBjZYECGmfnydUicUyrQqndgX2MQ@mail.gmail.com [1] Link: https://lkml.kernel.org/r/[email protected] Fixes: 1f828223b799 ("memcg: flush lruvec stats in the refault") Signed-off-by: Shakeel Butt <[email protected]> Reported-by: Daniel Dao <[email protected]> Tested-by: Ivan Babrou <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Michal Koutný <[email protected]> Cc: Frank Hofmann <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-04-21  mm/memory-failure.c: skip huge_zero_page in memory_failure()  Xu Yu  1  -0/+13
Kernel panic when injecting memory_failure for the global huge_zero_page, when CONFIG_DEBUG_VM is enabled, as follows. Injecting memory failure for pfn 0x109ff9 at process virtual address 0x20ff9000 page:00000000fb053fc3 refcount:2 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x109e00 head:00000000fb053fc3 order:9 compound_mapcount:0 compound_pincount:0 flags: 0x17fffc000010001(locked|head|node=0|zone=2|lastcpupid=0x1ffff) raw: 017fffc000010001 0000000000000000 dead000000000122 0000000000000000 raw: 0000000000000000 0000000000000000 00000002ffffffff 0000000000000000 page dumped because: VM_BUG_ON_PAGE(is_huge_zero_page(head)) ------------[ cut here ]------------ kernel BUG at mm/huge_memory.c:2499! invalid opcode: 0000 [#1] PREEMPT SMP PTI CPU: 6 PID: 553 Comm: split_bug Not tainted 5.18.0-rc1+ #11 Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 3288b3c 04/01/2014 RIP: 0010:split_huge_page_to_list+0x66a/0x880 Code: 84 9b fb ff ff 48 8b 7c 24 08 31 f6 e8 9f 5d 2a 00 b8 b8 02 00 00 e9 e8 fb ff ff 48 c7 c6 e8 47 3c 82 4c b RSP: 0018:ffffc90000dcbdf8 EFLAGS: 00010246 RAX: 000000000000003c RBX: 0000000000000001 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffffffff823e4c4f RDI: 00000000ffffffff RBP: ffff88843fffdb40 R08: 0000000000000000 R09: 00000000fffeffff R10: ffffc90000dcbc48 R11: ffffffff82d68448 R12: ffffea0004278000 R13: ffffffff823c6203 R14: 0000000000109ff9 R15: ffffea000427fe40 FS: 00007fc375a26740(0000) GS:ffff88842fd80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fc3757c9290 CR3: 0000000102174006 CR4: 00000000003706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: try_to_split_thp_page+0x3a/0x130 memory_failure+0x128/0x800 madvise_inject_error.cold+0x8b/0xa1 __x64_sys_madvise+0x54/0x60 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7fc3754f8bf9 Code: 01 00 48 81 c4 80 00 00 
00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 8 RSP: 002b:00007ffeda93a1d8 EFLAGS: 00000217 ORIG_RAX: 000000000000001c RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fc3754f8bf9 RDX: 0000000000000064 RSI: 0000000000003000 RDI: 0000000020ff9000 RBP: 00007ffeda93a200 R08: 0000000000000000 R09: 0000000000000000 R10: 00000000ffffffff R11: 0000000000000217 R12: 0000000000400490 R13: 00007ffeda93a2e0 R14: 0000000000000000 R15: 0000000000000000 This makes huge_zero_page bail out explicitly before split in memory_failure(), thus the panic above won't happen again. Link: https://lkml.kernel.org/r/497d3835612610e370c74e697ea3c721d1d55b9c.1649775850.git.xuyu@linux.alibaba.com Fixes: 6a46079cf57a ("HWPOISON: The high level memory error handler in the VM v7") Signed-off-by: Xu Yu <[email protected]> Reported-by: Abaci <[email protected]> Suggested-by: Naoya Horiguchi <[email protected]> Acked-by: Naoya Horiguchi <[email protected]> Reviewed-by: Miaohe Lin <[email protected]> Cc: Anshuman Khandual <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-04-21  mm/hwpoison: fix race between hugetlb free/demotion and memory_failure_hugetlb()  Naoya Horiguchi  4  -42/+127
There is a race condition between memory_failure_hugetlb() and hugetlb free/demotion, which causes setting PageHWPoison flag on the wrong page. The one simple result is that wrong processes can be killed, but another (more serious) one is that the actual error is left unhandled, so no one prevents later access to it, and that might lead to more serious results like consuming corrupted data. Think about the below race window:

  CPU 1                                   CPU 2
  memory_failure_hugetlb
  struct page *head = compound_head(p);
                                          hugetlb page might be freed to
                                          buddy, or even changed to another
                                          compound page.

  get_hwpoison_page -- page is not what we want now...

The current code first does prechecks roughly and then reconfirms after taking refcount, but it's found that it makes code overly complicated, so move the prechecks into a single hugetlb_lock range. A newly introduced function, try_memory_failure_hugetlb(), always takes hugetlb_lock (even for non-hugetlb pages). That can be improved, but memory_failure() is rare in principle, so should not be a big problem. Link: https://lkml.kernel.org/r/[email protected] Fixes: 761ad8d7c7b5 ("mm: hwpoison: introduce memory_failure_hugetlb()") Signed-off-by: Naoya Horiguchi <[email protected]> Reported-by: Mike Kravetz <[email protected]> Reviewed-by: Miaohe Lin <[email protected]> Reviewed-by: Mike Kravetz <[email protected]> Cc: Yang Shi <[email protected]> Cc: Dan Carpenter <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-04-21clk: qcom: clk-rcg2: fix gfx3d frequency calculationDmitry Baryshkov1-1/+1
Since the commit 948fb0969eae ("clk: Always clamp the rounded rate"), the clk_core_determine_round_nolock() would clamp the requested rate between min and max rates from the rate request. Normally these fields would be filled by clk_core_get_boundaries() called from clk_round_rate(). However clk_gfx3d_determine_rate() uses a manually crafted rate request, which did not have these fields filled. Thus the requested frequency would be clamped to 0, resulting in weird frequencies being requested from the hardware. Fix this by filling min_rate and max_rate to the values valid for the respective PLLs (0 and ULONG_MAX). Fixes: 948fb0969eae ("clk: Always clamp the rounded rate") Signed-off-by: Dmitry Baryshkov <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Bjorn Andersson <[email protected]> Reported-by: Rob Clark <[email protected]> Signed-off-by: Stephen Boyd <[email protected]>
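The clamping problem described in this commit can be sketched in plain userspace C. This is only an illustration: struct rate_request and the three functions are hypothetical stand-ins, not the kernel's struct clk_rate_request or the clk core code. It shows why a request whose bounds were never filled clamps every rate to 0, and why bounds of 0 and ULONG_MAX leave the rate alone.

```c
#include <limits.h>

/* Hypothetical stand-in for the rate request described in the commit. */
struct rate_request {
	unsigned long rate;
	unsigned long min_rate;
	unsigned long max_rate;
};

/* Clamp the requested rate into [min_rate, max_rate]. */
unsigned long clamp_rate(const struct rate_request *req)
{
	if (req->rate < req->min_rate)
		return req->min_rate;
	if (req->rate > req->max_rate)
		return req->max_rate;
	return req->rate;
}

/* Bounds left at zero (the manually crafted request): max_rate == 0. */
unsigned long clamp_unfilled(unsigned long rate)
{
	struct rate_request req = { .rate = rate };

	return clamp_rate(&req);
}

/* Bounds filled as in the fix: 0 and ULONG_MAX. */
unsigned long clamp_filled(unsigned long rate)
{
	struct rate_request req = {
		.rate = rate,
		.min_rate = 0,
		.max_rate = ULONG_MAX,
	};

	return clamp_rate(&req);
}
```

With the bounds unfilled, any non-zero request collapses to 0, which matches the "clamped to 0" symptom in the commit message.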
2022-04-21clk: microchip: mpfs: don't reset disabled peripheralsConor Dooley1-4/+0
The current clock driver for PolarFire SoC puts the hardware behind "periph" clocks into reset if their clock is disabled. CONFIG_PM was recently added to the riscv defconfig and exposed issues caused by this behaviour, where the Cadence GEM was being put into reset between its bringup & the PHY bringup: https://lore.kernel.org/linux-riscv/[email protected]/ Fix this (for now) by removing the reset from mpfs_periph_clk_disable. Fixes: 635e5e73370e ("clk: microchip: Add driver for Microchip PolarFire SoC") Reviewed-by: Daire McNamara <[email protected]> Signed-off-by: Conor Dooley <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Stephen Boyd <[email protected]>
2022-04-21f2fs: should not truncate blocks during roll-forward recoveryJaegeuk Kim1-1/+2
If a file has preallocated blocks and was fsync'ed, we should not truncate those blocks during roll-forward recovery, which will restore i_size correctly. Fixes: d4dd19ec1ea0 ("f2fs: do not expose unwritten blocks to user by DIO") Cc: <[email protected]> # 5.17+ Signed-off-by: Jaegeuk Kim <[email protected]>
2022-04-22ata: pata_marvell: Check the 'bmdma_addr' beforing readingZheyu Ma1-0/+2
Before detecting the cable type on the dma bar, the driver should check whether the 'bmdma_addr' is zero, which means the adapter does not support DMA, otherwise we will get the following error: [ 5.146634] Bad IO access at port 0x1 (return inb(port)) [ 5.147206] WARNING: CPU: 2 PID: 303 at lib/iomap.c:44 ioread8+0x4a/0x60 [ 5.150856] RIP: 0010:ioread8+0x4a/0x60 [ 5.160238] Call Trace: [ 5.160470] <TASK> [ 5.160674] marvell_cable_detect+0x6e/0xc0 [pata_marvell] [ 5.161728] ata_eh_recover+0x3520/0x6cc0 [ 5.168075] ata_do_eh+0x49/0x3c0 Signed-off-by: Zheyu Ma <[email protected]> Signed-off-by: Damien Le Moal <[email protected]>
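The guard described above can be sketched in userspace C. The enum, the function name and the register layout are assumptions made for illustration; the offset of 1 is taken only from the "Bad IO access at port 0x1" line in the log, and a NULL pointer stands in for a zero 'bmdma_addr'. The real driver's return values may differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cable types, standing in for the kernel's cable constants. */
enum cable_type { CABLE_UNKNOWN, CABLE_PATA40, CABLE_PATA80 };

/*
 * bmdma points at the bus-master DMA register block, or is NULL when the
 * adapter exposes no DMA BAR (the zero 'bmdma_addr' case in the commit).
 */
enum cable_type cable_detect(const volatile uint8_t *bmdma)
{
	/* Check before reading: without a DMA BAR there is nothing to read. */
	if (bmdma == NULL)
		return CABLE_UNKNOWN;

	/* Assumed layout: bit 0 of the byte at offset 1 selects the cable. */
	return (bmdma[1] & 1) ? CABLE_PATA40 : CABLE_PATA80;
}
```

Without the NULL check, the read at offset 1 would touch address 0x1, which is the bad access reported in the warning.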
2022-04-22Merge tag 'drm-msm-fixes-2022-04-20' of ↵Dave Airlie1-1/+1
https://gitlab.freedesktop.org/drm/msm into drm-fixes Revert to fix iommu regression. Signed-off-by: Dave Airlie <[email protected]> From: Rob Clark <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGtvPo4xD2peAztDMPP2n4utb7d9WQboMFwsba9E8U2rCw@mail.gmail.com
2022-04-21Merge tag 'dmaengine-fix-5.18' of ↵Linus Torvalds8-34/+53
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine Pull dmaengine fixes from Vinod Koul: "A bunch of driver fixes: - idxd device RO checks and device cleanup - dw-edma unaligned access and alignment - qcom: missing minItems in binding - mediatek pm usage fix - imx init script" * tag 'dmaengine-fix-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: dt-bindings: dmaengine: qcom: gpi: Add minItems for interrupts dmaengine: idxd: skip clearing device context when device is read-only dmaengine: idxd: add RO check for wq max_transfer_size write dmaengine: idxd: add RO check for wq max_batch_size write dmaengine: idxd: fix retry value to be constant for duration of function call dmaengine: idxd: match type for retries var in idxd_enqcmds() dmaengine: dw-edma: Fix inconsistent indenting dmaengine: dw-edma: Fix unaligned 64bit access dmaengine: mediatek:Fix PM usage reference leak of mtk_uart_apdma_alloc_chan_resources dmaengine: imx-sdma: Fix error checking in sdma_event_remap dma: at_xdmac: fix a missing check on list iterator dmaengine: imx-sdma: fix init of uart scripts dmaengine: idxd: fix device cleanup on disable
2022-04-21RISC-V: cpuidle: fix Kconfig select for RISCV_SBI_CPUIDLERandy Dunlap1-1/+1
There can be lots of build errors when building cpuidle-riscv-sbi.o. They are all caused by a kconfig problem with this warning: WARNING: unmet direct dependencies detected for RISCV_SBI_CPUIDLE Depends on [n]: CPU_IDLE [=y] && RISCV [=y] && RISCV_SBI [=n] Selected by [y]: - SOC_VIRT [=y] && CPU_IDLE [=y] so make the 'select' of RISCV_SBI_CPUIDLE also depend on RISCV_SBI. Fixes: c5179ef1ca0c ("RISC-V: Enable RISC-V SBI CPU Idle driver for QEMU virt machine") Signed-off-by: Randy Dunlap <[email protected]> Reported-by: kernel test robot <[email protected]> Reviewed-by: Anup Patel <[email protected]> Cc: [email protected] Signed-off-by: Palmer Dabbelt <[email protected]>
2022-04-21RISC-V: mm: Fix set_satp_mode() for platform not having Sv57Anup Patel1-0/+1
When Sv57 is not available, the satp.MODE test in set_satp_mode() fails and leads to pgdir re-programming for Sv48. The pgdir re-programming fails as well because of the pre-existing pgdir entry used for Sv57, and as a result the kernel fails to boot on RISC-V platforms that do not have Sv57. To fix the above issue, clear the pgdir memory in set_satp_mode() before re-programming. Fixes: 011f09d12052 ("riscv: mm: Set sv57 on defaultly") Reported-by: Mayuresh Chitale <[email protected]> Signed-off-by: Anup Patel <[email protected]> Reviewed-by: Atish Patra <[email protected]> Cc: [email protected] Signed-off-by: Palmer Dabbelt <[email protected]>
2022-04-22Merge tag 'drm-intel-fixes-2022-04-20' of ↵Dave Airlie2-23/+32
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes - Unset enable_psr2_sel_fetch if PSR2 detection fails - Fix to detect when VRR is turned off from panel settings Signed-off-by: Dave Airlie <[email protected]> From: Joonas Lahtinen <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2022-04-21kvm: selftests: introduce and use more page size-related constantsPaolo Bonzini8-13/+8
Clean up code that was hardcoding masks for various fields, now that the masks are included in processor.h. For more cleanup, define PAGE_SIZE and PAGE_MASK just like in Linux. PAGE_SIZE in particular was defined by several tests. Suggested-by: Sean Christopherson <[email protected]> Reviewed-by: Peter Xu <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-04-21kvm: selftests: do not use bitfields larger than 32-bits for PTEsPaolo Bonzini2-115/+92
Red Hat's QE team reported a test failure on access_tracking_perf_test:

    Testing guest mode: PA-bits:ANY, VA-bits:48, 4K pages
    guest physical test memory offset: 0x3fffbffff000

    Populating memory : 0.684014577s
    Writing to populated memory : 0.006230175s
    Reading from populated memory : 0.004557805s
    ==== Test Assertion Failure ====
    lib/kvm_util.c:1411: false
    pid=125806 tid=125809 errno=4 - Interrupted system call
    1 0x0000000000402f7c: addr_gpa2hva at kvm_util.c:1411
    2 (inlined by) addr_gpa2hva at kvm_util.c:1405
    3 0x0000000000401f52: lookup_pfn at access_tracking_perf_test.c:98
    4 (inlined by) mark_vcpu_memory_idle at access_tracking_perf_test.c:152
    5 (inlined by) vcpu_thread_main at access_tracking_perf_test.c:232
    6 0x00007fefe9ff81ce: ?? ??:0
    7 0x00007fefe9c64d82: ?? ??:0
    No vm physical memory at 0xffbffff000

I can easily reproduce it with an Intel(R) Xeon(R) CPU E5-2630 with 46 bits PA.

It turns out that the address translation for clearing idle page tracking returned a wrong result; addr_gva2gpa()'s last step, which is based on "pte[index[0]].pfn", did the calculation with 40 bits length and the high 12 bits got truncated. In the above case the GPA address to be returned should be 0x3fffbffff000 for GVA 0xc0000000, but it got truncated into 0xffbffff000 and the subsequent gpa2hva lookup failed.

The width of operations on bit fields greater than 32-bit is implementation defined, and differs between GCC (which uses the bitfield precision) and clang (which uses 64-bit arithmetic), so this is a potential minefield. Remove the bit fields and use manual masking instead.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2075036
Reported-by: Nana Liu <[email protected]>
Reviewed-by: Peter Xu <[email protected]>
Tested-by: Peter Xu <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
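The manual masking this commit switches to can be sketched as follows. The macro and function names are hypothetical, not the selftests' own, and the mask assumes the usual x86-64 layout where the physical frame address sits in bits 12..51 of a PTE. The second helper only reproduces the 40-bit truncation reported above so the numbers can be checked; it does not depend on any compiler's bit-field behaviour.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT	12
#define SKETCH_PAGE_SIZE	(1ULL << SKETCH_PAGE_SHIFT)
/* Bits 12..51 of a PTE: the physical frame address (assumed layout). */
#define SKETCH_PTE_ADDR_MASK	0x000ffffffffff000ULL

/*
 * Combine the frame address stored in a PTE with the page offset of the
 * guest virtual address, using 64-bit masks instead of bit fields.
 */
uint64_t pte_to_gpa(uint64_t pte, uint64_t gva)
{
	return (pte & SKETCH_PTE_ADDR_MASK) | (gva & (SKETCH_PAGE_SIZE - 1));
}

/* What a calculation limited to 40 bits of precision would keep. */
uint64_t truncate_to_40_bits(uint64_t value)
{
	return value & ((1ULL << 40) - 1);
}
```

Feeding the values from the report through these helpers gives 0x3fffbffff000 for the masked translation and 0xffbffff000 for the 40-bit truncation, the two addresses quoted in the commit message.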
2022-04-21KVM: SEV: add cache flush to solve SEV cache incoherency issuesMingwei Zhang8-3/+44
Flush the CPU caches when memory is reclaimed from an SEV guest (where reclaim also includes it being unmapped from KVM's memslots). Due to lack of coherency for SEV encrypted memory, failure to flush results in silent data corruption if userspace is malicious/broken and doesn't ensure SEV guest memory is properly pinned and unpinned. Cache coherency is not enforced across the VM boundary in SEV (AMD APM vol.2 Section 15.34.7). Confidential cachelines generated by confidential VM guests have to be explicitly flushed on the host side. If a memory page containing dirty confidential cachelines was released by the VM and reallocated to another user, the cachelines may corrupt the new user at a later time. KVM takes a shortcut by assuming all confidential memory remains pinned until the end of the VM lifetime. Therefore, KVM does not flush the cache at mmu_notifier invalidation events. Because of this incorrect assumption and the lack of cache flushing, malicious userspace can crash the host kernel by creating a malicious VM that continuously allocates/releases unpinned confidential memory pages while the VM is running. Add cache flush operations to mmu_notifier operations to ensure that any physical memory leaving the guest VM gets flushed. In particular, hook mmu_notifier_invalidate_range_start and mmu_notifier_release events and flush the cache accordingly. Call the hook after releasing the mmu lock to avoid contention with other vCPUs. Cc: [email protected] Suggested-by: Sean Christpherson <[email protected]> Reported-by: Mingwei Zhang <[email protected]> Signed-off-by: Mingwei Zhang <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-04-21Merge tag 'net-5.18-rc4' of ↵Linus Torvalds40-83/+210
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from xfrm and can. Current release - regressions: - rxrpc: restore removed timer deletion Current release - new code bugs: - gre: fix device lookup for l3mdev use-case - xfrm: fix egress device lookup for l3mdev use-case Previous releases - regressions: - sched: cls_u32: fix netns refcount changes in u32_change() - smc: fix sock leak when release after smc_shutdown() - xfrm: limit skb_page_frag_refill use to a single page - eth: atlantic: invert deep par in pm functions, preventing null derefs - eth: stmmac: use readl_poll_timeout_atomic() in atomic state Previous releases - always broken: - gre: fix skb_under_panic on xmit - openvswitch: fix OOB access in reserve_sfa_size() - dsa: hellcreek: calculate checksums in tagger - eth: ice: fix crash in switchdev mode - eth: igc: - fix infinite loop in release_swfw_sync - fix scheduling while atomic" * tag 'net-5.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (37 commits) drivers: net: hippi: Fix deadlock in rr_close() selftests: mlxsw: vxlan_flooding_ipv6: Prevent flooding of unwanted packets selftests: mlxsw: vxlan_flooding: Prevent flooding of unwanted packets nfc: MAINTAINERS: add Bug entry net: stmmac: Use readl_poll_timeout_atomic() in atomic state doc/ip-sysctl: add bc_forwarding netlink: reset network and mac headers in netlink_dump() net: mscc: ocelot: fix broken IP multicast flooding net: dsa: hellcreek: Calculate checksums in tagger net: atlantic: invert deep par in pm functions, preventing null derefs can: isotp: stop timeout monitoring when no first frame was sent bonding: do not discard lowest hash bit for non layer3+4 hashing net: lan966x: Make sure to release ptp interrupt ipv6: make ip6_rt_gc_expire an atomic_t net: Handle l3mdev in ip_tunnel_init_flow l3mdev: l3mdev_master_upper_ifindex_by_index_rcu should be using netdev_master_upper_dev_get_rcu net/sched: cls_u32: fix 
possible leak in u32_init_knode() net/sched: cls_u32: fix netns refcount changes in u32_change() powerpc: Update MAINTAINERS for ibmvnic and VAS net: restore alpha order to Ethernet devices in config ...
2022-04-21ALSA: hda/realtek: Add quirk for Clevo NP70PNPTim Crawford1-0/+1
Fixes headset detection on Clevo NP70PNP. Signed-off-by: Tim Crawford <[email protected]> Cc: <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2022-04-21ALSA: hda: intel-dsp-config: Add RaptorLake PCI IDsGongjun Song1-0/+9
Add RaptorLake-P PCI IDs Reviewed-by: Kai Vehmanen <[email protected]> Signed-off-by: Gongjun Song <[email protected]> Signed-off-by: Pierre-Louis Bossart <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2022-04-21jbd2: fix a potential race while discarding reserved buffers after an abortYe Bin1-1/+3
we got issue as follows: [ 72.796117] EXT4-fs error (device sda): ext4_journal_check_start:83: comm fallocate: Detected aborted journal [ 72.826847] EXT4-fs (sda): Remounting filesystem read-only fallocate: fallocate failed: Read-only file system [ 74.791830] jbd2_journal_commit_transaction: jh=0xffff9cfefe725d90 bh=0x0000000000000000 end delay [ 74.793597] ------------[ cut here ]------------ [ 74.794203] kernel BUG at fs/jbd2/transaction.c:2063! [ 74.794886] invalid opcode: 0000 [#1] PREEMPT SMP PTI [ 74.795533] CPU: 4 PID: 2260 Comm: jbd2/sda-8 Not tainted 5.17.0-rc8-next-20220315-dirty #150 [ 74.798327] RIP: 0010:__jbd2_journal_unfile_buffer+0x3e/0x60 [ 74.801971] RSP: 0018:ffffa828c24a3cb8 EFLAGS: 00010202 [ 74.802694] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ 74.803601] RDX: 0000000000000001 RSI: ffff9cfefe725d90 RDI: ffff9cfefe725d90 [ 74.804554] RBP: ffff9cfefe725d90 R08: 0000000000000000 R09: ffffa828c24a3b20 [ 74.805471] R10: 0000000000000001 R11: 0000000000000001 R12: ffff9cfefe725d90 [ 74.806385] R13: ffff9cfefe725d98 R14: 0000000000000000 R15: ffff9cfe833a4d00 [ 74.807301] FS: 0000000000000000(0000) GS:ffff9d01afb00000(0000) knlGS:0000000000000000 [ 74.808338] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 74.809084] CR2: 00007f2b81bf4000 CR3: 0000000100056000 CR4: 00000000000006e0 [ 74.810047] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 74.810981] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 74.811897] Call Trace: [ 74.812241] <TASK> [ 74.812566] __jbd2_journal_refile_buffer+0x12f/0x180 [ 74.813246] jbd2_journal_refile_buffer+0x4c/0xa0 [ 74.813869] jbd2_journal_commit_transaction.cold+0xa1/0x148 [ 74.817550] kjournald2+0xf8/0x3e0 [ 74.819056] kthread+0x153/0x1c0 [ 74.819963] ret_from_fork+0x22/0x30 Above issue may happen as follows: write truncate kjournald2 generic_perform_write ext4_write_begin ext4_walk_page_buffers do_journal_get_write_access ->add BJ_Reserved list 
ext4_journalled_write_end ext4_walk_page_buffers write_end_fn ext4_handle_dirty_metadata ***************JBD2 ABORT************** jbd2_journal_dirty_metadata -> return -EROFS, jh in reserved_list jbd2_journal_commit_transaction while (commit_transaction->t_reserved_list) jh = commit_transaction->t_reserved_list; truncate_pagecache_range do_invalidatepage ext4_journalled_invalidatepage jbd2_journal_invalidatepage journal_unmap_buffer __dispose_buffer __jbd2_journal_unfile_buffer jbd2_journal_put_journal_head ->put last ref_count __journal_remove_journal_head bh->b_private = NULL; jh->b_bh = NULL; jbd2_journal_refile_buffer(journal, jh); bh = jh2bh(jh); ->bh is NULL, later will trigger null-ptr-deref journal_free_journal_head(jh); After commit 96f1e0974575, we no longer hold the j_state_lock while iterating over the list of reserved handles in jbd2_journal_commit_transaction(). This potentially allows the journal_head to be freed by journal_unmap_buffer while the commit codepath is also trying to free the BJ_Reserved buffers. Keeping j_state_lock held while trying extends hold time of the lock minimally, and solves this issue. Fixes: 96f1e0974575("jbd2: avoid long hold times of j_state_lock while committing a transaction") Signed-off-by: Ye Bin <[email protected]> Reviewed-by: Jan Kara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2022-04-21thermal: int340x: Fix attr.show callback prototypeKees Cook1-2/+2
Control Flow Integrity (CFI) instrumentation of the kernel noticed that the caller, dev_attr_show(), and the callback, odvp_show(), did not have matching function prototypes, which would cause a CFI exception to be raised. Correct the prototype by using struct device_attribute instead of struct kobj_attribute. Reported-and-tested-by: Joao Moreira <[email protected]> Link: https://lore.kernel.org/lkml/[email protected]/ Fixes: 006f006f1e5c ("thermal/int340x_thermal: Export OEM vendor variables") Cc: 5.8+ <[email protected]> # 5.8+ Signed-off-by: Kees Cook <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
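The prototype requirement behind this fix can be illustrated with a small userspace sketch. All types and names below are simplified, hypothetical stand-ins (the kernel's struct device, struct device_attribute and ssize_t are replaced by minimal versions and long). The point is that the callback is invoked through a pointer of one specific function type, so its definition has to use exactly that type; a callback declared against a different attribute struct would not match, which is what CFI checks at the indirect call.

```c
#include <stdio.h>

/* Minimal hypothetical stand-ins for the kernel structures. */
struct device {
	int id;
};

struct device_attribute;

/* The one function type the caller uses for the indirect call. */
typedef long (*dev_show_fn)(struct device *dev,
			    struct device_attribute *attr, char *buf);

struct device_attribute {
	const char *name;
	dev_show_fn show;
};

/* Callback written against struct device_attribute, matching dev_show_fn. */
long odvp_show_sketch(struct device *dev, struct device_attribute *attr,
		      char *buf)
{
	(void)attr;
	return snprintf(buf, 32, "%d\n", dev->id);
}

/* Caller side: an indirect call through the typed pointer. */
long dev_attr_show_sketch(struct device *dev, struct device_attribute *attr,
			  char *buf)
{
	return attr->show(dev, attr, buf);
}
```

Because the pointer type and the callback agree, no cast is needed when filling in the show member, and the compiler would reject a callback with a different parameter list.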
2022-04-21Revert "ACPI: processor: idle: fix lockup regression on 32-bit ThinkPad T40"Ville Syrjälä1-5/+0
This reverts commit bfe55a1f7fd6bfede16078bf04c6250fbca11588. This was presumably misdiagnosed as an inability to use C3 at all when I suspect the real problem is just misconfiguration of C3 vs. ARB_DIS. Signed-off-by: Ville Syrjälä <[email protected]> Cc: 5.16+ <[email protected]> # 5.16+ Tested-by: Woody Suwalski <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2022-04-21ACPI: processor: idle: Avoid falling back to C3 type C-statesVille Syrjälä1-1/+2
The "safe state" index is used by acpi_idle_enter_bm() to avoid entering a C-state that may require bus mastering to be disabled on entry in the cases when this is not going to happen. For this reason, it should not be set to point to C3 type of C-states, because they may require bus mastering to be disabled on entry in principle. This was broken by commit d6b88ce2eb9d ("ACPI: processor idle: Allow playing dead in C3 state") which inadvertently allowed the "safe state" index to point to C3 type of C-states. This results in a machine that won't boot past the point when it first enters C3. Restore the correct behaviour (either demote to C1/C2, or use C3 but also set ARB_DIS=1). I hit this on a Fujitsu Siemens Lifebook S6010 (P3) machine. Fixes: d6b88ce2eb9d ("ACPI: processor idle: Allow playing dead in C3 state") Cc: 5.16+ <[email protected]> # 5.16+ Signed-off-by: Ville Syrjälä <[email protected]> Tested-by: Woody Suwalski <[email protected]> [ rjw: Subject and changelog adjustments ] Signed-off-by: Rafael J. Wysocki <[email protected]>
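The rule that the "safe state" index must not point at a C3 type state can be sketched as a small selection loop. The enum and function are hypothetical and only model the index choice; they are not the ACPI processor driver's code, and the alternative of using C3 with ARB_DIS set is not modelled here.

```c
#include <stddef.h>

/* Hypothetical C-state types, mirroring ACPI C1/C2/C3. */
enum cstate_type { CSTATE_C1 = 1, CSTATE_C2, CSTATE_C3 };

/*
 * Return the index of the deepest state that is safe to fall back to,
 * that is the last entry that is not of type C3, or -1 if none exists.
 * States are assumed to be ordered from shallowest to deepest.
 */
int pick_safe_state_index(const enum cstate_type *states, size_t count)
{
	int safe = -1;

	for (size_t i = 0; i < count; i++) {
		if (states[i] != CSTATE_C3)
			safe = (int)i;
	}
	return safe;
}
```

For a table of C1, C2, C3 the sketch picks index 1 (C2); when every state past the first is C3, it falls back to index 0.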
2022-04-21usb: gadget: configfs: clear deactivation flag in configfs_composite_unbind()Vijayavardhan Vennapusa1-0/+2
If a function such as UVC deactivates the gadget as part of a composition switch, pullup enablement is not called. Because the deactivation flag is never cleared, the gadget is not enabled after switching to the new composition, so USB enumeration does not happen. Hence clear the deactivation flag inside the gadget structure in configfs_composite_unbind() before switching to the new USB composition. Signed-off-by: Vijayavardhan Vennapusa <[email protected]> Signed-off-by: Dan Vacura <[email protected]> Cc: stable <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2022-04-21usb: misc: eud: Fix an error handling path in eud_probe()Christophe JAILLET1-5/+5
It is odd to call devm_add_action_or_reset() before calling the function that should be undone. Either, the "_or_reset" part should be omitted, or the action should be recorded after the resources have been allocated. Switch the order of devm_add_action_or_reset() and usb_role_switch_get(). Fixes: 9a1bf58ccd44 ("usb: misc: eud: Add driver support for Embedded USB Debugger(EUD)") Signed-off-by: Christophe JAILLET <[email protected]> Link: https://lore.kernel.org/r/362908699275ecec078381b42d87c817c6965fc6.1648979948.git.christophe.jaillet@wanadoo.fr Signed-off-by: Greg Kroah-Hartman <[email protected]>
2022-04-21usb: core: Don't hold the device lock while sleeping in do_proc_control()Tasos Sahanidis1-5/+9
Since commit ae8709b296d8 ("USB: core: Make do_proc_control() and do_proc_bulk() killable") if a device has the USB_QUIRK_DELAY_CTRL_MSG quirk set, it will temporarily block all other URBs (e.g. interrupts) while sleeping due to a control. This results in noticeable delays when, for example, a userspace usbfs application is sending URB interrupts at a high rate to a keyboard and simultaneously updates the lock indicators using controls. Interrupts with direction set to IN are also affected by this, meaning that delivery of HID reports (containing scancodes) to the usbfs application is delayed as well. This patch fixes the regression by calling msleep() while the device mutex is unlocked, as was the case originally with usb_control_msg(). Fixes: ae8709b296d8 ("USB: core: Make do_proc_control() and do_proc_bulk() killable") Cc: stable <[email protected]> Acked-by: Alan Stern <[email protected]> Signed-off-by: Tasos Sahanidis <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>