aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2022-12-02Bluetooth: btusb: Fix CSR clones again by re-adding ERR_DATA_REPORTING quirkIsmael Ferreras Morezuelas3-2/+19
A patch series by a Qualcomm engineer essentially removed my quirk/workaround because they thought it was unnecessary. It wasn't, and it broke everything again: https://patchwork.kernel.org/project/netdevbpf/list/?series=661703&archive=both&state=* He argues that the quirk is not necessary because the code should check if the dongle says if it's supported or not. The problem is that for these Chinese CSR clones they say that it would work: = New Index: 00:00:00:00:00:00 (Primary,USB,hci0) = Open Index: 00:00:00:00:00:00 < HCI Command: Read Local Version Information (0x04|0x0001) plen 0 > HCI Event: Command Complete (0x0e) plen 12 > [hci0] 11.276039 Read Local Version Information (0x04|0x0001) ncmd 1 Status: Success (0x00) HCI version: Bluetooth 5.0 (0x09) - Revision 2064 (0x0810) LMP version: Bluetooth 5.0 (0x09) - Subversion 8978 (0x2312) Manufacturer: Cambridge Silicon Radio (10) ... < HCI Command: Read Local Supported Features (0x04|0x0003) plen 0 > HCI Event: Command Complete (0x0e) plen 68 > [hci0] 11.668030 Read Local Supported Commands (0x04|0x0002) ncmd 1 Status: Success (0x00) Commands: 163 entries ... Read Default Erroneous Data Reporting (Octet 18 - Bit 2) Write Default Erroneous Data Reporting (Octet 18 - Bit 3) ... ... < HCI Command: Read Default Erroneous Data Reporting (0x03|0x005a) plen 0 = Close Index: 00:1A:7D:DA:71:XX So bring it back wholesale. Fixes: 63b1a7dd38bf ("Bluetooth: hci_sync: Remove HCI_QUIRK_BROKEN_ERR_DATA_REPORTING") Fixes: e168f6900877 ("Bluetooth: btusb: Remove HCI_QUIRK_BROKEN_ERR_DATA_REPORTING for fake CSR") Fixes: 766ae2422b43 ("Bluetooth: hci_sync: Check LMP feature bit instead of quirk") Cc: [email protected] Cc: Zijun Hu <[email protected]> Cc: Luiz Augusto von Dentz <[email protected]> Cc: Hans de Goede <[email protected]> Tested-by: Ismael Ferreras Morezuelas <[email protected]> Signed-off-by: Ismael Ferreras Morezuelas <[email protected]> Reviewed-by: Hans de Goede <[email protected]> Signed-off-by: Luiz Augusto von Dentz <[email protected]>
2022-12-02KVM: Document the interaction between KVM_CAP_HALT_POLL and halt_poll_nsDavid Matlack2-8/+20
Clarify the existing documentation about how KVM_CAP_HALT_POLL and halt_poll_ns interact to make it clear that VMs using KVM_CAP_HALT_POLL ignore halt_poll_ns. Signed-off-by: David Matlack <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-12-02KVM: Move halt-polling documentation into common directoryDavid Matlack3-1/+1
Move halt-polling.rst into the common KVM documentation directory and out of the x86-specific directory. Halt-polling is a common feature and the existing documentation is already written as such. Signed-off-by: David Matlack <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-12-02Merge tag 'nvme-6.1-2022-01-02' of git://git.infradead.org/nvme into block-6.1Jens Axboe3-1/+6
Pull NVMe fixes from Christoph: "nvme fixes for Linux 6.1 - fix SRCU protection of nvme_ns_head list (Caleb Sander) - clear the prp2 field when not used (Lei Rao)" * tag 'nvme-6.1-2022-01-02' of git://git.infradead.org/nvme: nvme: fix SRCU protection of nvme_ns_head list nvme-pci: clear the prp2 field when not used
2022-12-02iommu/vt-d: Fix PCI device refcount leak in dmar_dev_scope_init()Xiongfeng Wang1-0/+1
for_each_pci_dev() is implemented by pci_get_device(). The comment of pci_get_device() says that it will increase the reference count for the returned pci_dev and also decrease the reference count for the input pci_dev @from if it is not NULL. If we break for_each_pci_dev() loop with pdev not NULL, we need to call pci_dev_put() to decrease the reference count. Add the missing pci_dev_put() for the error path to avoid reference count leak. Fixes: 2e4552893038 ("iommu/vt-d: Unify the way to process DMAR device scope array") Signed-off-by: Xiongfeng Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lu Baolu <[email protected]> Signed-off-by: Joerg Roedel <[email protected]>
2022-12-02iommu/vt-d: Fix PCI device refcount leak in has_external_pci()Xiongfeng Wang1-1/+3
for_each_pci_dev() is implemented by pci_get_device(). The comment of pci_get_device() says that it will increase the reference count for the returned pci_dev and also decrease the reference count for the input pci_dev @from if it is not NULL. If we break for_each_pci_dev() loop with pdev not NULL, we need to call pci_dev_put() to decrease the reference count. Add the missing pci_dev_put() before 'return true' to avoid reference count leak. Fixes: 89a6079df791 ("iommu/vt-d: Force IOMMU on for platform opt in hint") Signed-off-by: Xiongfeng Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lu Baolu <[email protected]> Signed-off-by: Joerg Roedel <[email protected]>
2022-12-02iommu/vt-d: Fix PCI device refcount leak in prq_event_thread()Yang Yingliang1-5/+9
As comment of pci_get_domain_bus_and_slot() says, it returns a pci device with refcount increment, when finish using it, the caller must decrease the reference count by calling pci_dev_put(). So call pci_dev_put() after using the 'pdev' to avoid refcount leak. Besides, if the 'pdev' is null or intel_svm_prq_report() returns error, there is no need to trace this fault. Fixes: 06f4b8d09dba ("iommu/vt-d: Remove unnecessary SVA data accesses in page fault path") Suggested-by: Lu Baolu <[email protected]> Signed-off-by: Yang Yingliang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lu Baolu <[email protected]> Signed-off-by: Joerg Roedel <[email protected]>
2022-12-02iommu/vt-d: Add a fix for devices need extra dtlb flushJacob Pan3-3/+75
QAT devices on Intel Sapphire Rapids and Emerald Rapids have a defect in address translation service (ATS). These devices may inadvertently issue ATS invalidation completion before posted writes initiated with translated address that utilized translations matching the invalidation address range, violating the invalidation completion ordering. This patch adds an extra device TLB invalidation for the affected devices, it is needed to ensure no more posted writes with translated address following the invalidation completion. Therefore, the ordering is preserved and data-corruption is prevented. Device TLBs are invalidated under the following six conditions: 1. Device driver does DMA API unmap IOVA 2. Device driver unbind a PASID from a process, sva_unbind_device() 3. PASID is torn down, after PASID cache is flushed. e.g. process exit_mmap() due to crash 4. Under SVA usage, called by mmu_notifier.invalidate_range() where VM has to free pages that were unmapped 5. userspace driver unmaps a DMA buffer 6. Cache invalidation in vSVA usage (upcoming) For #1 and #2, device drivers are responsible for stopping DMA traffic before unmap/unbind. For #3, iommu driver gets mmu_notifier to invalidate TLB the same way as normal user unmap which will do an extra invalidation. The dTLB invalidation after PASID cache flush does not need an extra invalidation. Therefore, we only need to deal with #4 and #5 in this patch. #1 is also covered by this patch due to common code path with #5. Tested-by: Yuzhang Luo <[email protected]> Reviewed-by: Ashok Raj <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Signed-off-by: Jacob Pan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lu Baolu <[email protected]> Signed-off-by: Joerg Roedel <[email protected]>
2022-12-02Merge branch 'vmxnet3-fixes'David S. Miller1-4/+23
Ronak Doshi says: ==================== vmxnet3: couple of fixes This series fixes following issues: Patch 1: This patch provides a fix to correctly report encapsulated LRO'ed packet. Patch 2: This patch provides a fix to use correct intrConf reference. Changes in v2: - declare generic descriptor to be used - remove white spaces - remove single quote around commit reference in patch 2 - remove if check for encap_lro ==================== Signed-off-by: David S. Miller <[email protected]>
2022-12-02vmxnet3: use correct intrConf reference when using extended queuesRonak Doshi1-2/+14
Commit 39f9895a00f4 ("vmxnet3: add support for 32 Tx/Rx queues") added support for 32Tx/Rx queues. As a part of this patch, intrConf structure was extended to incorporate increased queues. This patch fixes the issue where incorrect reference is being used. Fixes: 39f9895a00f4 ("vmxnet3: add support for 32 Tx/Rx queues") Signed-off-by: Ronak Doshi <[email protected]> Acked-by: Guolin Yang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-12-02vmxnet3: correctly report encapsulated LRO packetRonak Doshi1-2/+9
Commit dacce2be3312 ("vmxnet3: add geneve and vxlan tunnel offload support") added support for encapsulation offload. However, the pathc did not report correctly the encapsulated packet which is LRO'ed by the hypervisor. This patch fixes this issue by using correct callback for the LRO'ed encapsulated packet. Fixes: dacce2be3312 ("vmxnet3: add geneve and vxlan tunnel offload support") Signed-off-by: Ronak Doshi <[email protected]> Acked-by: Guolin Yang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-12-01Merge branch '1GbE' of ↵Jakub Kicinski2-2/+4
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2022-11-30 (e1000e, igb) This series contains updates to e1000e and igb drivers. Akihiko Odaki fixes calculation for checking whether space for next frame exists for e1000e and properly sets MSI-X vector to fix failing ethtool interrupt test for igb. * '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: igb: Allocate MSI-X vector when testing e1000e: Fix TX dispatch condition ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-12-02Merge tag 'amd-drm-fixes-6.1-2022-12-01' of ↵Dave Airlie1-0/+3
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.1-2022-12-01: amdgpu: - VCN fix for vangogh Signed-off-by: Dave Airlie <[email protected]> From: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2022-12-02i2c: imx: Only DMA messages with I2C_M_DMA_SAFE flag setAndrew Lunn1-2/+4
Recent changes to the DMA code has resulting in the IMX driver failing I2C transfers when the buffer has been vmalloc. Only perform DMA transfers if the message has the I2C_M_DMA_SAFE flag set, indicating the client is providing a buffer which is DMA safe. This is a minimal fix for stable. The I2C core provides helpers to allocate a bounce buffer. For a fuller fix the master should make use of these helpers. Fixes: 4544b9f25e70 ("dma-mapping: Add vmap checks to dma_map_single()") Signed-off-by: Andrew Lunn <[email protected]> Acked-by: Oleksij Rempel <[email protected]> Signed-off-by: Wolfram Sang <[email protected]>
2022-12-01i2c: qcom-geni: fix error return code in geni_i2c_gpi_xferWang Yufen1-1/+0
Fix to return a negative error code from the gi2c->err instead of 0. Fixes: d8703554f4de ("i2c: qcom-geni: Add support for GPI DMA") Signed-off-by: Wang Yufen <[email protected]> Reviewed-by: Tommaso Merciai <[email protected]> Signed-off-by: Wolfram Sang <[email protected]>
2022-12-01i2c: cadence: Fix regression with bus recoveryCarsten Haitzler1-3/+8
Commit "i2c: cadence: Add standard bus recovery support" breaks for i2c devices that have no pinctrl defined. There is no requirement for this to exist in the DT. This has worked perfectly well without this before in at least 1 real usage case on hardware (Mali Komeda DPU, Cadence i2c to talk to a tda99xx phy). Adding the requirement to have pinctrl set up in the device tree (or otherwise be found) is a regression where the whole i2c device is lost entirely (in this case dropping entire devices which then leads to the drm display stack unable to find the phy for display output, thus having no drm display device and so on down the chain). This converts the above commit to an enhancement if pinctrl can be found for the i2c device, providing a timeout on read with recovery, but if not, do what used to be done rather than a fatal loss of a device. This restores the mentioned display devices to their working state again. Fixes: 58b924241d0a ("i2c: cadence: Add standard bus recovery support") Signed-off-by: Carsten Haitzler <[email protected]> Reviewed-by: Shubhrajyoti Datta <[email protected]> Reviewed-by: Michael Grzeschik <[email protected]> Acked-by: Michal Simek <[email protected]> [wsa: added braces to else-branch] Signed-off-by: Wolfram Sang <[email protected]>
2022-12-02Merge tag 'drm-intel-fixes-2022-12-01' of ↵Dave Airlie4-8/+16
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes - Fix dram info readout (Radhakrishna Sripada) - Remove non-existent pipes from bigjoiner pipe mask (Ville Syrjälä) - Fix negative value passed as remaining time (Janusz Krzysztofik) - Never return 0 if not all requests retired (Janusz Krzysztofik) Signed-off-by: Dave Airlie <[email protected]> From: Tvrtko Ursulin <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/Y4hp+a3TJ13t2ZA1@tursulin-desk
2022-12-01error-injection: Add prompt for function error injectionSteven Rostedt (Google)1-1/+7
The config to be able to inject error codes into any function annotated with ALLOW_ERROR_INJECTION() is enabled when FUNCTION_ERROR_INJECTION is enabled. But unfortunately, this is always enabled on x86 when KPROBES is enabled, and there's no way to turn it off. As kprobes is useful for observability of the kernel, it is useful to have it enabled in production environments. But error injection should be avoided. Add a prompt to the config to allow it to be disabled even when kprobes is enabled, and get rid of the "def_bool y". This is a kernel debug feature (it's in Kconfig.debug), and should have never been something enabled by default. Cc: [email protected] Fixes: 540adea3809f6 ("error-injection: Separate error-injection from kprobe") Signed-off-by: Steven Rostedt (Google) <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-12-01drm/amdgpu: enable Vangogh VCN indirect sram modeLeo Liu1-0/+3
So that uses PSP to initialize HW. Fixes: 0c2c02b66c672e ("drm/amdgpu/vcn: add firmware support for dimgrey_cavefish") Signed-off-by: Leo Liu <[email protected]> Reviewed-by: James Zhu <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2022-12-01RISC-V: Fix a race condition during kernel stack overflowPalmer Dabbelt3-0/+32
This fixes a concrete bug but is also the basis for some cleanup work, so I'm merging it based on the offending commit in order to minimize future conflicts. * commit '7e1864332fbc1b993659eab7974da9fe8bf8c128': riscv: fix race when vmap stack overflow
2022-12-01Merge tag 'efi-fixes-for-v6.1-4' of ↵Linus Torvalds4-69/+2
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI fix from Ard Biesheuvel: "A single revert for some code that I added during this cycle. The code is not wrong, but it should be a bit more careful about how to handle the shadow call stack pointer, so it is better to revert it for now and bring it back later in improved form. Summary: - Revert runtime service sync exception recovery on arm64" * tag 'efi-fixes-for-v6.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: arm64: efi: Revert "Recover from synchronous exceptions ..."
2022-12-01Merge tag 'hwmon-for-v6.1-rc8' of ↵Linus Torvalds6-5/+15
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon fixes from Guenter Roeck: - Fix refcount leak and NULL pointer access in coretemp driver - Fix UAF in ibmpex driver - Add missing pci_disable_device() to i5500_temp driver - Fix calculation in ina3221 driver - Fix temperature scaling in ltc2947 driver - Check result of devm_kcalloc call in asus-ec-sensors driver * tag 'hwmon-for-v6.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: (asus-ec-sensors) Add checks for devm_kcalloc hwmon: (coretemp) fix pci device refcount leak in nv1a_ram_new() hwmon: (coretemp) Check for null before removing sysfs attrs hwmon: (ibmpex) Fix possible UAF when ibmpex_register_bmc() fails hwmon: (i5500_temp) fix missing pci_disable_device() hwmon: (ina3221) Fix shunt sum critical calculation hwmon: (ltc2947) fix temperature scaling
2022-12-01hwmon: (asus-ec-sensors) Add checks for devm_kcallocYuan Can1-0/+2
As the devm_kcalloc may return NULL, the return value needs to be checked to avoid NULL poineter dereference. Fixes: d0ddfd241e57 ("hwmon: (asus-ec-sensors) add driver for ASUS EC") Signed-off-by: Yuan Can <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Guenter Roeck <[email protected]>
2022-12-01hwmon: (coretemp) fix pci device refcount leak in nv1a_ram_new()Yang Yingliang1-1/+4
As comment of pci_get_domain_bus_and_slot() says, it returns a pci device with refcount increment, when finish using it, the caller must decrement the reference count by calling pci_dev_put(). So call it after using to avoid refcount leak. Fixes: 14513ee696a0 ("hwmon: (coretemp) Use PCI host bridge ID to identify CPU if necessary") Signed-off-by: Yang Yingliang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Guenter Roeck <[email protected]>
2022-12-01hwmon: (coretemp) Check for null before removing sysfs attrsPhil Auld1-0/+4
If coretemp_add_core() gets an error then pdata->core_data[indx] is already NULL and has been kfreed. Don't pass that to sysfs_remove_group() as that will crash in sysfs_remove_group(). [Shortened for readability] [91854.020159] sysfs: cannot create duplicate filename '/devices/platform/coretemp.0/hwmon/hwmon2/temp20_label' <cpu offline> [91855.126115] BUG: kernel NULL pointer dereference, address: 0000000000000188 [91855.165103] #PF: supervisor read access in kernel mode [91855.194506] #PF: error_code(0x0000) - not-present page [91855.224445] PGD 0 P4D 0 [91855.238508] Oops: 0000 [#1] PREEMPT SMP PTI ... [91855.342716] RIP: 0010:sysfs_remove_group+0xc/0x80 ... [91855.796571] Call Trace: [91855.810524] coretemp_cpu_offline+0x12b/0x1dd [coretemp] [91855.841738] ? coretemp_cpu_online+0x180/0x180 [coretemp] [91855.871107] cpuhp_invoke_callback+0x105/0x4b0 [91855.893432] cpuhp_thread_fun+0x8e/0x150 ... Fix this by checking for NULL first. Signed-off-by: Phil Auld <[email protected]> Cc: [email protected] Cc: Fenghua Yu <[email protected]> Cc: Jean Delvare <[email protected]> Cc: Guenter Roeck <[email protected]> Link: https://lore.kernel.org/r/[email protected] Fixes: 199e0de7f5df3 ("hwmon: (coretemp) Merge pkgtemp with coretemp") Signed-off-by: Guenter Roeck <[email protected]>
2022-12-01arm64: efi: Revert "Recover from synchronous exceptions ..."Ard Biesheuvel4-69/+2
This reverts commit 23715a26c8d81291, which introduced some code in assembler that manipulates both the ordinary and the shadow call stack pointer in a way that could potentially be taken advantage of. So let's revert it, and do a better job the next time around. Signed-off-by: Ard Biesheuvel <[email protected]>
2022-12-01inet: ping: use hlist_nulls rcu iterator during lookupFlorian Westphal3-4/+7
ping_lookup() does not acquire the table spinlock, so iteration should use hlist_nulls_for_each_entry_rcu(). Spotted during code review. Fixes: dbca1596bbb0 ("ping: convert to RCU lookups, get rid of rwlock") Cc: Eric Dumazet <[email protected]> Signed-off-by: Florian Westphal <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2022-12-01Revert "clocksource/drivers/riscv: Events are stopped during CPU suspend"Conor Dooley1-1/+1
This reverts commit 232ccac1bd9b5bfe73895f527c08623e7fa0752d. On the subject of suspend, the RISC-V SBI spec states: This does not cover whether any given events actually reach the hart or not, just what the hart will do if it receives an event. On PolarFire SoC, and potentially other SiFive based implementations, events from the RISC-V timer do reach a hart during suspend. This is not the case for the implementation on the Allwinner D1 - there timer events are not received during suspend. To fix this, the CLOCK_EVT_FEAT_C3STOP (mis)feature was enabled for the timer driver - but this has broken both RCU stall detection and timers generally on PolarFire SoC and potentially other SiFive based implementations. If an AXI read to the PCIe controller on PolarFire SoC times out, the system will stall, however, with CLOCK_EVT_FEAT_C3STOP active, the system just locks up without RCU stalling: io scheduler mq-deadline registered io scheduler kyber registered microchip-pcie 2000000000.pcie: host bridge /soc/pcie@2000000000 ranges: microchip-pcie 2000000000.pcie: MEM 0x2008000000..0x2087ffffff -> 0x0008000000 microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer microchip-pcie 2000000000.pcie: axi read request error microchip-pcie 2000000000.pcie: axi read timeout microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer Freeing initrd memory: 7332K Similarly issues were reported with clock_nanosleep() - with a test app that sleeps each cpu for 6, 5, 4, 3 ms respectively, HZ=250 & the blamed commit in place, the sleep times are rounded up to the next jiffy: == CPU: 1 == == CPU: 2 == == CPU: 3 == == CPU: 4 == Mean: 7.974992 Mean: 7.976534 Mean: 7.962591 Mean: 3.952179 Std Dev: 0.154374 Std Dev: 0.156082 Std Dev: 0.171018 Std Dev: 0.076193 Hi: 9.472000 Hi: 10.495000 Hi: 8.864000 Hi: 4.736000 Lo: 6.087000 Lo: 6.380000 Lo: 4.872000 Lo: 3.403000 Samples: 521 Samples: 521 Samples: 521 Samples: 521 Fortunately, the D1 has a second timer, which is "currently used in preference to the RISC-V/SBI timer driver" so a revert here does not hurt operation of D1 in its current form. Ultimately, a DeviceTree property (or node) will be added to encode the behaviour of the timers, but until then revert the addition of CLOCK_EVT_FEAT_C3STOP. Fixes: 232ccac1bd9b ("clocksource/drivers/riscv: Events are stopped during CPU suspend") Signed-off-by: Conor Dooley <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Palmer Dabbelt <[email protected]> Acked-by: Palmer Dabbelt <[email protected]> Acked-by: Samuel Holland <[email protected]> Link: https://lore.kernel.org/linux-riscv/YzYTNQRxLr7Q9JR0@spud/ Link: https://github.com/riscv-non-isa/riscv-sbi-doc/issues/98/ Link: https://lore.kernel.org/linux-riscv/[email protected]/ Link: https://lore.kernel.org/r/[email protected]
2022-12-01mmc: sdhci-sprd: Fix no reset data and command after voltage switchWenchao Chen1-1/+3
After switching the voltage, no reset data and command will cause CMD2 timeout. Fixes: 29ca763fc26f ("mmc: sdhci-sprd: Add pin control support for voltage switch") Signed-off-by: Wenchao Chen <[email protected]> Acked-by: Adrian Hunter <[email protected]> Reviewed-by: Baolin Wang <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Ulf Hansson <[email protected]>
2022-12-01Merge branch 'af_unix-fix-a-null-deref-in-sk_diag_dump_uid'Paolo Abeni4-9/+192
Kuniyuki Iwashima says: ==================== af_unix: Fix a NULL deref in sk_diag_dump_uid(). The first patch fixes a NULL deref when we dump a AF_UNIX socket's UID, and the second patch adds a repro/test for such a case. ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2022-12-01af_unix: Add test for sock_diag and UDIAG_SHOW_UID.Kuniyuki Iwashima3-1/+180
The test prog dumps a single AF_UNIX socket's UID with and without unshare(CLONE_NEWUSER) and checks if it matches the result of getuid(). Without the preceding patch, the test prog is killed by a NULL deref in sk_diag_dump_uid(). # ./diag_uid TAP version 13 1..2 # Starting 2 tests from 3 test cases. # RUN diag_uid.uid.1 ... BUG: kernel NULL pointer dereference, address: 0000000000000270 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 105212067 P4D 105212067 PUD 1051fe067 PMD 0 Oops: 0000 [#1] PREEMPT SMP NOPTI Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-1.amzn2022.0.1 04/01/2014 RIP: 0010:sk_diag_fill (./include/net/sock.h:920 net/unix/diag.c:119 net/unix/diag.c:170) ... # 1: Test terminated unexpectedly by signal 9 # FAIL diag_uid.uid.1 not ok 1 diag_uid.uid.1 # RUN diag_uid.uid_unshare.1 ... # 1: Test terminated by timeout # FAIL diag_uid.uid_unshare.1 not ok 2 diag_uid.uid_unshare.1 # FAILED: 0 / 2 tests passed. # Totals: pass:0 fail:2 xfail:0 xpass:0 skip:0 error:0 With the patch, the test succeeds. # ./diag_uid TAP version 13 1..2 # Starting 2 tests from 3 test cases. # RUN diag_uid.uid.1 ... # OK diag_uid.uid.1 ok 1 diag_uid.uid.1 # RUN diag_uid.uid_unshare.1 ... # OK diag_uid.uid_unshare.1 ok 2 diag_uid.uid_unshare.1 # PASSED: 2 / 2 tests passed. # Totals: pass:2 fail:0 xfail:0 xpass:0 skip:0 error:0 Signed-off-by: Kuniyuki Iwashima <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2022-12-01af_unix: Get user_ns from in_skb in unix_diag_get_exact().Kuniyuki Iwashima1-8/+12
Wei Chen reported a NULL deref in sk_user_ns() [0][1], and Paolo diagnosed the root cause: in unix_diag_get_exact(), the newly allocated skb does not have sk. [2] We must get the user_ns from the NETLINK_CB(in_skb).sk and pass it to sk_diag_fill(). [0]: BUG: kernel NULL pointer dereference, address: 0000000000000270 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 12bbce067 P4D 12bbce067 PUD 12bc40067 PMD 0 Oops: 0000 [#1] PREEMPT SMP CPU: 0 PID: 27942 Comm: syz-executor.0 Not tainted 6.1.0-rc5-next-20221118 #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-48-gd9c812dda519-prebuilt.qemu.org 04/01/2014 RIP: 0010:sk_user_ns include/net/sock.h:920 [inline] RIP: 0010:sk_diag_dump_uid net/unix/diag.c:119 [inline] RIP: 0010:sk_diag_fill+0x77d/0x890 net/unix/diag.c:170 Code: 89 ef e8 66 d4 2d fd c7 44 24 40 00 00 00 00 49 8d 7c 24 18 e8 54 d7 2d fd 49 8b 5c 24 18 48 8d bb 70 02 00 00 e8 43 d7 2d fd <48> 8b 9b 70 02 00 00 48 8d 7b 10 e8 33 d7 2d fd 48 8b 5b 10 48 8d RSP: 0018:ffffc90000d67968 EFLAGS: 00010246 RAX: ffff88812badaa48 RBX: 0000000000000000 RCX: ffffffff840d481d RDX: 0000000000000465 RSI: 0000000000000000 RDI: 0000000000000270 RBP: ffffc90000d679a8 R08: 0000000000000277 R09: 0000000000000000 R10: 0001ffffffffffff R11: 0001c90000d679a8 R12: ffff88812ac03800 R13: ffff88812c87c400 R14: ffff88812ae42210 R15: ffff888103026940 FS: 00007f08b4e6f700(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000270 CR3: 000000012c58b000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> unix_diag_get_exact net/unix/diag.c:285 [inline] unix_diag_handler_dump+0x3f9/0x500 net/unix/diag.c:317 __sock_diag_cmd net/core/sock_diag.c:235 [inline] sock_diag_rcv_msg+0x237/0x250 net/core/sock_diag.c:266 netlink_rcv_skb+0x13e/0x250 net/netlink/af_netlink.c:2564 sock_diag_rcv+0x24/0x40 net/core/sock_diag.c:277 netlink_unicast_kernel net/netlink/af_netlink.c:1330 [inline] netlink_unicast+0x5e9/0x6b0 net/netlink/af_netlink.c:1356 netlink_sendmsg+0x739/0x860 net/netlink/af_netlink.c:1932 sock_sendmsg_nosec net/socket.c:714 [inline] sock_sendmsg net/socket.c:734 [inline] ____sys_sendmsg+0x38f/0x500 net/socket.c:2476 ___sys_sendmsg net/socket.c:2530 [inline] __sys_sendmsg+0x197/0x230 net/socket.c:2559 __do_sys_sendmsg net/socket.c:2568 [inline] __se_sys_sendmsg net/socket.c:2566 [inline] __x64_sys_sendmsg+0x42/0x50 net/socket.c:2566 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x4697f9 Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f08b4e6ec48 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 000000000077bf80 RCX: 00000000004697f9 RDX: 0000000000000000 RSI: 00000000200001c0 RDI: 0000000000000003 RBP: 00000000004d29e9 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 000000000077bf80 R13: 0000000000000000 R14: 000000000077bf80 R15: 00007ffdb36bc6c0 </TASK> Modules linked in: CR2: 0000000000000270 [1]: https://lore.kernel.org/netdev/CAO4mrfdvyjFpokhNsiwZiP-wpdSD0AStcJwfKcKQdAALQ9_2Qw@mail.gmail.com/ [2]: https://lore.kernel.org/netdev/[email protected]/ Fixes: cae9910e7344 ("net: Add UNIX_DIAG_UID to Netlink UNIX socket diagnostics.") Reported-by: syzbot <[email protected]> Reported-by: Wei Chen <[email protected]> Diagnosed-by: Paolo Abeni <[email protected]> Signed-off-by: Kuniyuki Iwashima <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2022-12-01drm: bridge: dw_hdmi: fix preference of RGB modes over YUV420Guillaume BRUN1-3/+3
Cheap monitors sometimes advertise YUV modes they don't really have (HDMI specification mandates YUV support so even monitors without actual support will often wrongfully advertise it) which results in YUV matches and user forum complaints of a red tint to light colour display areas in common desktop environments. Moving the default RGB fall-back before YUV selection results in RGB mode matching in most cases, reducing complaints. Fixes: 6c3c719936da ("drm/bridge: synopsys: dw-hdmi: add bus format negociation") Signed-off-by: Guillaume BRUN <[email protected]> Tested-by: Christian Hewitt <[email protected]> Reviewed-by: Robert Foss <[email protected]> Signed-off-by: Neil Armstrong <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2022-11-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nfJakub Kicinski4-17/+19
Pablo Neira Ayuso says: ==================== Netfilter fixes for net 1) Check for interval validity in all concatenation fields in nft_set_pipapo, from Stefano Brivio. 2) Missing preemption disabled in conntrack and flowtable stat updates, from Xin Long. 3) Fix compilation warning when CONFIG_NF_CONNTRACK_MARK=n. Except for 3) which was a bug introduced in a recent fix in 6.1-rc - anything else, broken for several releases. * git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: ctnetlink: fix compilation warning after data race fixes in ct mark netfilter: conntrack: fix using __this_cpu_add in preemptible netfilter: flowtable_offload: fix using __this_cpu_add in preemptible netfilter: nft_set_pipapo: Actually validate intervals in fields after the first one ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-11-30net: ethernet: ti: am65-cpsw: Fix RGMII configuration at SPEED_10Siddharth Vadapalli1-1/+1
The am65-cpsw driver supports configuring all RGMII variants at interface speed of 10 Mbps. However, in the process of shifting to the PHYLINK framework, the support for all variants of RGMII except the PHY_INTERFACE_MODE_RGMII variant was accidentally removed. Fix this by using phy_interface_mode_is_rgmii() to check for all variants of RGMII mode. Fixes: e8609e69470f ("net: ethernet: ti: am65-cpsw: Convert to PHYLINK") Reported-by: Schuyler Patton <[email protected]> Signed-off-by: Siddharth Vadapalli <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-11-30net: broadcom: Add PTP_1588_CLOCK_OPTIONAL dependency for BCMGENET under ↵YueHaibing1-1/+2
ARCH_BCM2835 commit 8d820bc9d12b ("net: broadcom: Fix BCMGENET Kconfig") fixes the build that contain 99addbe31f55 ("net: broadcom: Select BROADCOM_PHY for BCMGENET") and enable BCMGENET=y but PTP_1588_CLOCK_OPTIONAL=m, which otherwise leads to a link failure. However this may trigger a runtime failure. Fix the original issue by propagating the PTP_1588_CLOCK_OPTIONAL dependency of BROADCOM_PHY down to BCMGENET. Fixes: 8d820bc9d12b ("net: broadcom: Fix BCMGENET Kconfig") Fixes: 99addbe31f55 ("net: broadcom: Select BROADCOM_PHY for BCMGENET") Reported-by: Naresh Kamboju <[email protected]> Suggested-by: Arnd Bergmann <[email protected]> Signed-off-by: YueHaibing <[email protected]> Acked-by: Arnd Bergmann <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-11-30Merge tag 'clk-fixes-for-linus' of ↵Linus Torvalds8-64/+21
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fixes from Stephen Boyd: "A set of clk driver fixes that resolve issues for various SoCs. Most of these are incorrect clk data, like bad parent descriptions. When the clk tree is improperly described things don't work, like USB and UFS controllers, because clk frequencies are wonky. Here are the extra details: - Fix the parent of UFS reference clks on Qualcomm SC8280XP so that UFS works properly - Fix the clk ID for USB on AT91 RM9200 so the USB driver continues to probe - Stop using of_device_get_match_data() on the wrong device for a Samsung Exynos driver so it gets the proper clk data - Fix ExynosAutov9 binding - Fix the parent of the div4 clk on Exynos7885 - Stop calling runtime PM APIs from the Qualcomm GDSC driver directly as it leads to a lockdep splat and is just plain wrong because it violates runtime PM semantics by calling runtime PM APIs when the device has been runtime PM disabled" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: qcom: gcc-sc8280xp: add cxo as parent for three ufs ref clks ARM: at91: rm9200: fix usb device clock id clk: samsung: Revert "clk: samsung: exynos-clkout: Use of_device_get_match_data()" dt-bindings: clock: exynosautov9: fix reference to CMU_FSYS1 clk: qcom: gdsc: Remove direct runtime PM calls clk: samsung: exynos7885: Correct "div4" clock parents
2022-11-30revert "kbuild: fix -Wimplicit-function-declaration in ↵Andrew Morton1-2/+0
license_is_gpl_compatible" It causes build failures with unusual CC/HOSTCC combinations. Quoting https://lkml.kernel.org/r/[email protected]: HOSTCC scripts/mod/modpost.o - due to target missing In file included from include/linux/string.h:5, from scripts/mod/../../include/linux/license.h:5, from scripts/mod/modpost.c:24: include/linux/compiler.h:246:10: fatal error: asm/rwonce.h: No such file or directory 246 | #include <asm/rwonce.h> | ^~~~~~~~~~~~~~ compilation terminated. ... The problem is that HOSTCC is not necessarily the same compiler or even architecture as CC and pulling in <linux/compiler.h> or <asm/rwonce.h> files indirectly isn't a good idea then. My toolchain is providing HOSTCC = gcc (MacPorts) and CC = arm-linux-gnueabihf (built from gcc source) and all running on Darwin. If I change the include to <string.h> I can then "HOSTCC scripts/mod/modpost.c" but then it fails for "CC kernel/module/main.c" not finding <string.h>: CC kernel/module/main.o - due to target missing In file included from kernel/module/main.c:43:0: ./include/linux/license.h:5:20: fatal error: string.h: No such file or directory #include <string.h> ^ compilation terminated. Reported-by: "H. Nikolaus Schaller" <[email protected]> Cc: Sam James <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-11-30Kconfig.debug: provide a little extra FRAME_WARN leeway when KASAN is enabledLee Jones1-0/+1
When enabled, KASAN enlarges function's stack-frames. Pushing quite a few over the current threshold. This can mainly be seen on 32-bit architectures where the present limit (when !GCC) is a lowly 1024-Bytes. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Lee Jones <[email protected]> Acked-by: Arnd Bergmann <[email protected]> Cc: Alex Deucher <[email protected]> Cc: "Christian König" <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: David Airlie <[email protected]> Cc: Harry Wentland <[email protected]> Cc: Leo Li <[email protected]> Cc: Maarten Lankhorst <[email protected]> Cc: Maxime Ripard <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: "Pan, Xinhui" <[email protected]> Cc: Rodrigo Siqueira <[email protected]> Cc: Thomas Zimmermann <[email protected]> Cc: Tom Rix <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-11-30drm/amdgpu: temporarily disable broken Clang builds due to blown stack-frameLee Jones1-0/+7
Patch series "Fix a bunch of allmodconfig errors", v2. Since b339ec9c229aa ("kbuild: Only default to -Werror if COMPILE_TEST") WERROR now defaults to COMPILE_TEST meaning that it's enabled for allmodconfig builds. This leads to some interesting build failures when using Clang, each resolved in this set. With this set applied, I am able to obtain a successful allmodconfig Arm build. This patch (of 2): calculate_bandwidth() is presently broken on all !(X86_64 || SPARC64 || ARM64) architectures built with Clang (all released versions), whereby the stack frame gets blown up to well over 5k. This would cause an immediate kernel panic on most architectures. We'll revert this when the following bug report has been resolved: https://github.com/llvm/llvm-project/issues/41896. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Lee Jones <[email protected]> Suggested-by: Arnd Bergmann <[email protected]> Acked-by: Arnd Bergmann <[email protected]> Cc: Alex Deucher <[email protected]> Cc: "Christian König" <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: David Airlie <[email protected]> Cc: Harry Wentland <[email protected]> Cc: Lee Jones <[email protected]> Cc: Leo Li <[email protected]> Cc: Maarten Lankhorst <[email protected]> Cc: Maxime Ripard <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: "Pan, Xinhui" <[email protected]> Cc: Rodrigo Siqueira <[email protected]> Cc: Thomas Zimmermann <[email protected]> Cc: Tom Rix <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-11-30mm/khugepaged: invoke MMU notifiers in shmem/file collapse pathsJann Horn1-0/+5
Any codepath that zaps page table entries must invoke MMU notifiers to ensure that secondary MMUs (like KVM) don't keep accessing pages which aren't mapped anymore. Secondary MMUs don't hold their own references to pages that are mirrored over, so failing to notify them can lead to page use-after-free. I'm marking this as addressing an issue introduced in commit f3f0e1d2150b ("khugepaged: add support of collapse for tmpfs/shmem pages"), but most of the security impact of this only came in commit 27e1f8273113 ("khugepaged: enable collapse pmd for pte-mapped THP"), which actually omitted flushes for the removal of present PTEs, not just for the removal of empty page tables. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Fixes: f3f0e1d2150b ("khugepaged: add support of collapse for tmpfs/shmem pages") Signed-off-by: Jann Horn <[email protected]> Acked-by: David Hildenbrand <[email protected]> Reviewed-by: Yang Shi <[email protected]> Cc: John Hubbard <[email protected]> Cc: Peter Xu <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-11-30mm/khugepaged: fix GUP-fast interaction by sending IPIJann Horn3-3/+7
Since commit 70cbc3cc78a99 ("mm: gup: fix the fast GUP race against THP collapse"), the lockless_pages_from_mm() fastpath rechecks the pmd_t to ensure that the page table was not removed by khugepaged in between. However, lockless_pages_from_mm() still requires that the page table is not concurrently freed. Fix it by sending IPIs (if the architecture uses semi-RCU-style page table freeing) before freeing/reusing page tables. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Fixes: ba76149f47d8 ("thp: khugepaged") Signed-off-by: Jann Horn <[email protected]> Reviewed-by: Yang Shi <[email protected]> Acked-by: David Hildenbrand <[email protected]> Cc: John Hubbard <[email protected]> Cc: Peter Xu <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-11-30mm/khugepaged: take the right locks for page table retractionJann Horn1-4/+51
pagetable walks on address ranges mapped by VMAs can be done under the mmap lock, the lock of an anon_vma attached to the VMA, or the lock of the VMA's address_space. Only one of these needs to be held, and it does not need to be held in exclusive mode. Under those circumstances, the rules for concurrent access to page table entries are: - Terminal page table entries (entries that don't point to another page table) can be arbitrarily changed under the page table lock, with the exception that they always need to be consistent for hardware page table walks and lockless_pages_from_mm(). This includes that they can be changed into non-terminal entries. - Non-terminal page table entries (which point to another page table) can not be modified; readers are allowed to READ_ONCE() an entry, verify that it is non-terminal, and then assume that its value will stay as-is. Retracting a page table involves modifying a non-terminal entry, so page-table-level locks are insufficient to protect against concurrent page table traversal; it requires taking all the higher-level locks under which it is possible to start a page walk in the relevant range in exclusive mode. The collapse_huge_page() path for anonymous THP already follows this rule, but the shmem/file THP path was getting it wrong, making it possible for concurrent rmap-based operations to cause corruption. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Fixes: 27e1f8273113 ("khugepaged: enable collapse pmd for pte-mapped THP") Signed-off-by: Jann Horn <[email protected]> Reviewed-by: Yang Shi <[email protected]> Acked-by: David Hildenbrand <[email protected]> Cc: John Hubbard <[email protected]> Cc: Peter Xu <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-11-30mm: migrate: fix THP's mapcount on isolationGavin Shan1-11/+11
The issue is reported when removing memory through virtio_mem device. The transparent huge page, experienced copy-on-write fault, is wrongly regarded as pinned. The transparent huge page is escaped from being isolated in isolate_migratepages_block(). The transparent huge page can't be migrated and the corresponding memory block can't be put into offline state. Fix it by replacing page_mapcount() with total_mapcount(). With this, the transparent huge page can be isolated and migrated, and the memory block can be put into offline state. Besides, The page's refcount is increased a bit earlier to avoid the page is released when the check is executed. Link: https://lkml.kernel.org/r/[email protected] Fixes: 1da2f328fa64 ("mm,thp,compaction,cma: allow THP migration for CMA allocations") Signed-off-by: Gavin Shan <[email protected]> Reported-by: Zhenyu Zhang <[email protected]> Tested-by: Zhenyu Zhang <[email protected]> Suggested-by: David Hildenbrand <[email protected]> Acked-by: David Hildenbrand <[email protected]> Cc: Alistair Popple <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Kirill A. Shutemov <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: William Kucharski <[email protected]> Cc: Zi Yan <[email protected]> Cc: <[email protected]> [5.7+] Signed-off-by: Andrew Morton <[email protected]>
2022-11-30mm: introduce arch_has_hw_nonleaf_pmd_young()Juergen Gross3-5/+24
When running as a Xen PV guests commit eed9a328aa1a ("mm: x86: add CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG") can cause a protection violation in pmdp_test_and_clear_young(): BUG: unable to handle page fault for address: ffff8880083374d0 #PF: supervisor write access in kernel mode #PF: error_code(0x0003) - permissions violation PGD 3026067 P4D 3026067 PUD 3027067 PMD 7fee5067 PTE 8010000008337065 Oops: 0003 [#1] PREEMPT SMP NOPTI CPU: 7 PID: 158 Comm: kswapd0 Not tainted 6.1.0-rc5-20221118-doflr+ #1 RIP: e030:pmdp_test_and_clear_young+0x25/0x40 This happens because the Xen hypervisor can't emulate direct writes to page table entries other than PTEs. This can easily be fixed by introducing arch_has_hw_nonleaf_pmd_young() similar to arch_has_hw_pte_young() and test that instead of CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG. Link: https://lkml.kernel.org/r/[email protected] Fixes: eed9a328aa1a ("mm: x86: add CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG") Signed-off-by: Juergen Gross <[email protected]> Reported-by: Sander Eikelenboom <[email protected]> Acked-by: Yu Zhao <[email protected]> Tested-by: Sander Eikelenboom <[email protected]> Acked-by: David Hildenbrand <[email protected]> [core changes] Signed-off-by: Andrew Morton <[email protected]>
2022-11-30mm: add dummy pmd_young() for architectures not having itJuergen Gross7-0/+13
In order to avoid #ifdeffery add a dummy pmd_young() implementation as a fallback. This is required for the later patch "mm: introduce arch_has_hw_nonleaf_pmd_young()". Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Juergen Gross <[email protected]> Acked-by: Yu Zhao <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Sander Eikelenboom <[email protected]> Cc: Thomas Gleixner <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-11-30mm/damon/sysfs: fix wrong empty schemes assumption under online tuning in ↵SeongJae Park1-2/+44
damon_sysfs_set_schemes() Commit da87878010e5 ("mm/damon/sysfs: support online inputs update") made 'damon_sysfs_set_schemes()' to be called for running DAMON context, which could have schemes. In the case, DAMON sysfs interface is supposed to update, remove, or add schemes to reflect the sysfs files. However, the code is assuming the DAMON context wouldn't have schemes at all, and therefore creates and adds new schemes. As a result, the code doesn't work as intended for online schemes tuning and could have more than expected memory footprint. The schemes are all in the DAMON context, so it doesn't leak the memory, though. Remove the wrong asssumption (the DAMON context wouldn't have schemes) in 'damon_sysfs_set_schemes()' to fix the bug. Link: https://lkml.kernel.org/r/[email protected] Fixes: da87878010e5 ("mm/damon/sysfs: support online inputs update") Signed-off-by: SeongJae Park <[email protected]> Cc: <[email protected]> [5.19+] Signed-off-by: Andrew Morton <[email protected]>
2022-11-30tools/vm/slabinfo-gnuplot: use "grep -E" instead of "egrep"Tiezhu Yang1-2/+2
The latest version of grep claims the egrep is now obsolete so the build now contains warnings that look like: egrep: warning: egrep is obsolescent; using grep -E fix this up by moving the related file to use "grep -E" instead. sed -i "s/egrep/grep -E/g" `grep egrep -rwl tools/vm` Here are the steps to install the latest grep: wget http://ftp.gnu.org/gnu/grep/grep-3.8.tar.gz tar xf grep-3.8.tar.gz cd grep-3.8 && ./configure && make sudo make install export PATH=/usr/local/bin:$PATH Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Tiezhu Yang <[email protected]> Reviewed-by: Sergey Senozhatsky <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-11-30nilfs2: fix NULL pointer dereference in nilfs_palloc_commit_free_entry()ZhangPeng1-0/+7
Syzbot reported a null-ptr-deref bug: NILFS (loop0): segctord starting. Construction interval = 5 seconds, CP frequency < 30 seconds general protection fault, probably for non-canonical address 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017] CPU: 1 PID: 3603 Comm: segctord Not tainted 6.1.0-rc2-syzkaller-00105-gb229b6ca5abb #0 Hardware name: Google Compute Engine/Google Compute Engine, BIOS Google 10/11/2022 RIP: 0010:nilfs_palloc_commit_free_entry+0xe5/0x6b0 fs/nilfs2/alloc.c:608 Code: 00 00 00 00 fc ff df 80 3c 02 00 0f 85 cd 05 00 00 48 b8 00 00 00 00 00 fc ff df 4c 8b 73 08 49 8d 7e 10 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 26 05 00 00 49 8b 46 10 be a6 00 00 00 48 c7 c7 RSP: 0018:ffffc90003dff830 EFLAGS: 00010212 RAX: dffffc0000000000 RBX: ffff88802594e218 RCX: 000000000000000d RDX: 0000000000000002 RSI: 0000000000002000 RDI: 0000000000000010 RBP: ffff888071880222 R08: 0000000000000005 R09: 000000000000003f R10: 000000000000000d R11: 0000000000000000 R12: ffff888071880158 R13: ffff88802594e220 R14: 0000000000000000 R15: 0000000000000004 FS: 0000000000000000(0000) GS:ffff8880b9b00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fb1c08316a8 CR3: 0000000018560000 CR4: 0000000000350ee0 Call Trace: <TASK> nilfs_dat_commit_free fs/nilfs2/dat.c:114 [inline] nilfs_dat_commit_end+0x464/0x5f0 fs/nilfs2/dat.c:193 nilfs_dat_commit_update+0x26/0x40 fs/nilfs2/dat.c:236 nilfs_btree_commit_update_v+0x87/0x4a0 fs/nilfs2/btree.c:1940 nilfs_btree_commit_propagate_v fs/nilfs2/btree.c:2016 [inline] nilfs_btree_propagate_v fs/nilfs2/btree.c:2046 [inline] nilfs_btree_propagate+0xa00/0xd60 fs/nilfs2/btree.c:2088 nilfs_bmap_propagate+0x73/0x170 fs/nilfs2/bmap.c:337 nilfs_collect_file_data+0x45/0xd0 fs/nilfs2/segment.c:568 nilfs_segctor_apply_buffers+0x14a/0x470 fs/nilfs2/segment.c:1018 nilfs_segctor_scan_file+0x3f4/0x6f0 fs/nilfs2/segment.c:1067 nilfs_segctor_collect_blocks fs/nilfs2/segment.c:1197 [inline] nilfs_segctor_collect fs/nilfs2/segment.c:1503 [inline] nilfs_segctor_do_construct+0x12fc/0x6af0 fs/nilfs2/segment.c:2045 nilfs_segctor_construct+0x8e3/0xb30 fs/nilfs2/segment.c:2379 nilfs_segctor_thread_construct fs/nilfs2/segment.c:2487 [inline] nilfs_segctor_thread+0x3c3/0xf30 fs/nilfs2/segment.c:2570 kthread+0x2e4/0x3a0 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306 </TASK> ... If DAT metadata file is corrupted on disk, there is a case where req->pr_desc_bh is NULL and blocknr is 0 at nilfs_dat_commit_end() during a b-tree operation that cascadingly updates ancestor nodes of the b-tree, because nilfs_dat_commit_alloc() for a lower level block can initialize the blocknr on the same DAT entry between nilfs_dat_prepare_end() and nilfs_dat_commit_end(). If this happens, nilfs_dat_commit_end() calls nilfs_dat_commit_free() without valid buffer heads in req->pr_desc_bh and req->pr_bitmap_bh, and causes the NULL pointer dereference above in nilfs_palloc_commit_free_entry() function, which leads to a crash. Fix this by adding a NULL check on req->pr_desc_bh and req->pr_bitmap_bh before nilfs_palloc_commit_free_entry() in nilfs_dat_commit_free(). This also calls nilfs_error() in that case to notify that there is a fatal flaw in the filesystem metadata and prevent further operations. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: ZhangPeng <[email protected]> Signed-off-by: Ryusuke Konishi <[email protected]> Reported-by: [email protected] Tested-by: Ryusuke Konishi <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-11-30hugetlb: don't delete vma_lock in hugetlb MADV_DONTNEED processingMike Kravetz3-12/+19
madvise(MADV_DONTNEED) ends up calling zap_page_range() to clear page tables associated with the address range. For hugetlb vmas, zap_page_range will call __unmap_hugepage_range_final. However, __unmap_hugepage_range_final assumes the passed vma is about to be removed and deletes the vma_lock to prevent pmd sharing as the vma is on the way out. In the case of madvise(MADV_DONTNEED) the vma remains, but the missing vma_lock prevents pmd sharing and could potentially lead to issues with truncation/fault races. This issue was originally reported here [1] as a BUG triggered in page_try_dup_anon_rmap. Prior to the introduction of the hugetlb vma_lock, __unmap_hugepage_range_final cleared the VM_MAYSHARE flag to prevent pmd sharing. Subsequent faults on this vma were confused as VM_MAYSHARE indicates a sharable vma, but was not set so page_mapping was not set in new pages added to the page table. This resulted in pages that appeared anonymous in a VM_SHARED vma and triggered the BUG. Address issue by adding a new zap flag ZAP_FLAG_UNMAP to indicate an unmap call from unmap_vmas(). This is used to indicate the 'final' unmapping of a hugetlb vma. When called via MADV_DONTNEED, this flag is not set and the vm_lock is not deleted. [1] https://lore.kernel.org/lkml/CAO4mrfdLMXsao9RF4fUE8-Wfde8xmjsKrTNMNC9wjUb6JudD0g@mail.gmail.com/ Link: https://lkml.kernel.org/r/[email protected] Fixes: 90e7e7f5ef3f ("mm: enable MADV_DONTNEED for hugetlb mappings") Signed-off-by: Mike Kravetz <[email protected]> Reported-by: Wei Chen <[email protected]> Cc: Axel Rasmussen <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Mina Almasry <[email protected]> Cc: Nadav Amit <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: Peter Xu <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>