aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2020-10-02Merge tag 'gpio-v5.9-2' of ↵Linus Torvalds11-60/+138
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO fixes from Linus Walleij: "Some late GPIO fixes for the v5.9 series: - Fix compiler warnings on the OMAP when PM is disabled - Clear the interrupt when setting edge sensitivity on the Spreadtrum driver. - Fix up spurious interrupts on the TC35894. - Support threaded interrupts on the Siox controller. - Fix resource leaks on the mockup driver. - Fix line event handling in syscall compatible mode for the character device. - Fix an unitialized variable in the PCA953A driver. - Fix access to all GPIO IRQs on the Aspeed AST2600. - Fix line direction on the AMD FCH driver. - Use the bitmap API instead of compiler intrinsics for bit manipulation in the PCA953x driver" * tag 'gpio-v5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: gpio: pca953x: Correctly initialize registers 6 and 7 for PCA957x gpio: pca953x: Use bitmap API over implicit GCC extension gpio: amd-fch: correct logic of GPIO_LINE_DIRECTION gpio: aspeed: fix ast2600 bank properties gpio/aspeed-sgpio: don't enable all interrupts by default gpio/aspeed-sgpio: enable access to all 80 input & output sgpios gpio: pca953x: Fix uninitialized pending variable gpiolib: Fix line event handling in syscall compatible mode gpio: mockup: fix resource leak in error path gpio: siox: explicitly support only threaded irqs gpio: tc35894: fix up tc35894 interrupt configuration gpio: sprd: Clear interrupt when setting the type as edge gpio: omap: Fix warnings if PM is disabled
2020-10-02Merge tag 'mmc-v5.9-rc4-3' of ↵Linus Torvalds3-1/+7
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC fixes from Ulf Hansson: - Fix deadlock when removing MEMSTICK host - Workaround broken CMDQ on Intel GLK based IRBIS models * tag 'mmc-v5.9-rc4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: sdhci: Workaround broken command queuing on Intel GLK based IRBIS models memstick: Skip allocating card when removing host
2020-10-02random32: Restore __latent_entropy attribute on net_rand_stateThibaut Sautereau1-1/+1
Commit f227e3ec3b5c ("random32: update the net random state on interrupt and activity") broke compilation and was temporarily fixed by Linus in 83bdc7275e62 ("random32: remove net_rand_state from the latent entropy gcc plugin") by entirely moving net_rand_state out of the things handled by the latent_entropy GCC plugin. From what I understand when reading the plugin code, using the __latent_entropy attribute on a declaration was the wrong part and simply keeping the __latent_entropy attribute on the variable definition was the correct fix. Fixes: 83bdc7275e62 ("random32: remove net_rand_state from the latent entropy gcc plugin") Acked-by: Willy Tarreau <[email protected]> Cc: Emese Revfy <[email protected]> Signed-off-by: Thibaut Sautereau <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2020-10-02Merge branch 'pm-cpufreq'Rafael J. Wysocki1-0/+1
* pm-cpufreq: cpufreq: intel_pstate: Fix missing return statement
2020-10-02mm: memcg/slab: fix slab statistics in !SMP configurationRoman Gushchin1-0/+5
Since commit ea426c2a7de8 ("mm: memcg: prepare for byte-sized vmstat items") the write side of slab counters accepts a value in bytes and converts it to pages. It happens in __mod_node_page_state(). However a non-SMP version of __mod_node_page_state() doesn't perform this conversion. It leads to incorrect (unrealistically high) slab counters values. Fix this by adding a similar conversion to the non-SMP version of __mod_node_page_state(). Signed-off-by: Roman Gushchin <[email protected]> Reported-and-tested-by: Bastian Bittorf <[email protected]> Fixes: ea426c2a7de8 ("mm: memcg: prepare for byte-sized vmstat items") Acked-by: Vlastimil Babka <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2020-10-02MAINTAINERS: Add Mark Gross and Hans de Goede as x86 platform drivers ↵Hans de Goede1-3/+3
maintainers Darren Hart and Andy Shevchenko lately have not had enough time to maintain the x86 platform drivers, dropping their status to: "Odd Fixes". Mark Gross and Hans de Goede will take over maintainership of the x86 platform drivers. Replace Darren and Andy's entries with theirs and change the status to "Maintained". Signed-off-by: Hans de Goede <[email protected]> Acked-by: Mark Gross <[email protected]> Signed-off-by: Andy Shevchenko <[email protected]>
2020-10-02platform/x86: intel-vbtn: Switch to an allow-list for SW_TABLET_MODE reportingHans de Goede1-9/+43
2 recent commits: cfae58ed681c ("platform/x86: intel-vbtn: Only blacklist SW_TABLET_MODE on the 9 / "Laptop" chasis-type") 1fac39fd0316 ("platform/x86: intel-vbtn: Also handle tablet-mode switch on "Detachable" and "Portable" chassis-types") Enabled reporting of SW_TABLET_MODE on more devices since the vbtn ACPI interface is used by the firmware on some of those devices to report this. Testing has shown that unconditionally enabling SW_TABLET_MODE reporting on all devices with a chassis type of 8 ("Portable") or 10 ("Notebook") which support the VGBS method is a very bad idea. Many of these devices are normal laptops (non 2-in-1) models with a VGBS which always returns 0, which we translate to SW_TABLET_MODE=1. This in turn causes userspace (libinput) to suppress events from the builtin keyboard and touchpad, making the laptop essentially unusable. Since the problem of wrongly reporting SW_TABLET_MODE=1 in combination with libinput, leads to a non-usable system. Where as OTOH many people will not even notice when SW_TABLET_MODE is not being reported, this commit changes intel_vbtn_has_switches() to use a DMI based allow-list. The new DMI based allow-list matches on the 31 ("Convertible") and 32 ("Detachable") chassis-types, as these clearly are 2-in-1s and so far if they support the intel-vbtn ACPI interface they all have properly working SW_TABLET_MODE reporting. Besides these 2 generic matches, it also contains model specific matches for 2-in-1 models which use a different chassis-type and which are known to have properly working SW_TABLET_MODE reporting. This has been tested on the following 2-in-1 devices: Dell Venue 11 Pro 7130 vPro HP Pavilion X2 10-p002nd HP Stream x360 Convertible PC 11 Medion E1239T Fixes: cfae58ed681c ("platform/x86: intel-vbtn: Only blacklist SW_TABLET_MODE on the 9 / "Laptop" chasis-type") BugLink: https://forum.manjaro.org/t/keyboard-and-touchpad-only-work-on-kernel-5-6/22668 BugLink: https://bugzilla.opensuse.org/show_bug.cgi?id=1175599 Cc: Barnabás Pőcze <[email protected]> Cc: Takashi Iwai <[email protected]> Signed-off-by: Hans de Goede <[email protected]> Signed-off-by: Andy Shevchenko <[email protected]>
2020-10-02platform/x86: intel-vbtn: Revert "Fix SW_TABLET_MODE always reporting 1 on ↵Andy Shevchenko1-8/+4
the HP Pavilion 11 x360" After discussion, see the Link tag, it appears that this is not good enough. So, revert it now and apply a better fix. This reverts commit d823346876a970522ff9e4d2b323c9b734dcc4de. Link: https://lore.kernel.org/platform-driver-x86/[email protected]/ Signed-off-by: Andy Shevchenko <[email protected]>
2020-10-02Merge branch 'for-next/mte' into for-next/coreWill Deacon63-61/+1764
Add userspace support for the Memory Tagging Extension introduced by Armv8.5. (Catalin Marinas and others) * for-next/mte: (30 commits) arm64: mte: Fix typo in memory tagging ABI documentation arm64: mte: Add Memory Tagging Extension documentation arm64: mte: Kconfig entry arm64: mte: Save tags when hibernating arm64: mte: Enable swap of tagged pages mm: Add arch hooks for saving/restoring tags fs: Handle intra-page faults in copy_mount_options() arm64: mte: ptrace: Add NT_ARM_TAGGED_ADDR_CTRL regset arm64: mte: ptrace: Add PTRACE_{PEEK,POKE}MTETAGS support arm64: mte: Allow {set,get}_tagged_addr_ctrl() on non-current tasks arm64: mte: Restore the GCR_EL1 register after a suspend arm64: mte: Allow user control of the generated random tags via prctl() arm64: mte: Allow user control of the tag check mode via prctl() mm: Allow arm64 mmap(PROT_MTE) on RAM-based files arm64: mte: Validate the PROT_MTE request via arch_validate_flags() mm: Introduce arch_validate_flags() arm64: mte: Add PROT_MTE support to mmap() and mprotect() mm: Introduce arch_calc_vm_flag_bits() arm64: mte: Tags-aware aware memcmp_pages() implementation arm64: Avoid unnecessary clear_user_page() indirection ...
2020-10-02Merge branch 'for-next/ghostbusters' into for-next/coreWill Deacon32-1054/+981
Fix and subsequently rewrite Spectre mitigations, including the addition of support for PR_SPEC_DISABLE_NOEXEC. (Will Deacon and Marc Zyngier) * for-next/ghostbusters: (22 commits) arm64: Add support for PR_SPEC_DISABLE_NOEXEC prctl() option arm64: Pull in task_stack_page() to Spectre-v4 mitigation code KVM: arm64: Allow patching EL2 vectors even with KASLR is not enabled arm64: Get rid of arm64_ssbd_state KVM: arm64: Convert ARCH_WORKAROUND_2 to arm64_get_spectre_v4_state() KVM: arm64: Get rid of kvm_arm_have_ssbd() KVM: arm64: Simplify handling of ARCH_WORKAROUND_2 arm64: Rewrite Spectre-v4 mitigation code arm64: Move SSBD prctl() handler alongside other spectre mitigation code arm64: Rename ARM64_SSBD to ARM64_SPECTRE_V4 arm64: Treat SSBS as a non-strict system feature arm64: Group start_thread() functions together KVM: arm64: Set CSV2 for guests on hardware unaffected by Spectre-v2 arm64: Rewrite Spectre-v2 mitigation code arm64: Introduce separate file for spectre mitigations and reporting arm64: Rename ARM64_HARDEN_BRANCH_PREDICTOR to ARM64_SPECTRE_V2 KVM: arm64: Simplify install_bp_hardening_cb() KVM: arm64: Replace CONFIG_KVM_INDIRECT_VECTORS with CONFIG_RANDOMIZE_BASE arm64: Remove Spectre-related CONFIG_* options arm64: Run ARCH_WORKAROUND_2 enabling code on all CPUs ...
2020-10-02Merge branches 'for-next/acpi', 'for-next/boot', 'for-next/bpf', ↵Will Deacon105-777/+5565
'for-next/cpuinfo', 'for-next/fpsimd', 'for-next/misc', 'for-next/mm', 'for-next/pci', 'for-next/perf', 'for-next/ptrauth', 'for-next/sdei', 'for-next/selftests', 'for-next/stacktrace', 'for-next/svm', 'for-next/topology', 'for-next/tpyos' and 'for-next/vdso' into for-next/core Remove unused functions and parameters from ACPI IORT code. (Zenghui Yu via Lorenzo Pieralisi) * for-next/acpi: ACPI/IORT: Remove the unused inline functions ACPI/IORT: Drop the unused @ops of iort_add_device_replay() Remove redundant code and fix documentation of caching behaviour for the HVC_SOFT_RESTART hypercall. (Pingfan Liu) * for-next/boot: Documentation/kvm/arm: improve description of HVC_SOFT_RESTART arm64/relocate_kernel: remove redundant code Improve reporting of unexpected kernel traps due to BPF JIT failure. (Will Deacon) * for-next/bpf: arm64: Improve diagnostics when trapping BRK with FAULT_BRK_IMM Improve robustness of user-visible HWCAP strings and their corresponding numerical constants. (Anshuman Khandual) * for-next/cpuinfo: arm64/cpuinfo: Define HWCAP name arrays per their actual bit definitions Cleanups to handling of SVE and FPSIMD register state in preparation for potential future optimisation of handling across syscalls. (Julien Grall) * for-next/fpsimd: arm64/sve: Implement a helper to load SVE registers from FPSIMD state arm64/sve: Implement a helper to flush SVE registers arm64/fpsimdmacros: Allow the macro "for" to be used in more cases arm64/fpsimdmacros: Introduce a macro to update ZCR_EL1.LEN arm64/signal: Update the comment in preserve_sve_context arm64/fpsimd: Update documentation of do_sve_acc Miscellaneous changes. (Tian Tao and others) * for-next/misc: arm64/mm: return cpu_all_mask when node is NUMA_NO_NODE arm64: mm: Fix missing-prototypes in pageattr.c arm64/fpsimd: Fix missing-prototypes in fpsimd.c arm64: hibernate: Remove unused including <linux/version.h> arm64/mm: Refactor {pgd, pud, pmd, pte}_ERROR() arm64: Remove the unused include statements arm64: get rid of TEXT_OFFSET arm64: traps: Add str of description to panic() in die() Memory management updates and cleanups. (Anshuman Khandual and others) * for-next/mm: arm64: dbm: Invalidate local TLB when setting TCR_EL1.HD arm64: mm: Make flush_tlb_fix_spurious_fault() a no-op arm64/mm: Unify CONT_PMD_SHIFT arm64/mm: Unify CONT_PTE_SHIFT arm64/mm: Remove CONT_RANGE_OFFSET arm64/mm: Enable THP migration arm64/mm: Change THP helpers to comply with generic MM semantics arm64/mm/ptdump: Add address markers for BPF regions Allow prefetchable PCI BARs to be exposed to userspace using normal non-cacheable mappings. (Clint Sbisa) * for-next/pci: arm64: Enable PCI write-combine resources under sysfs Perf/PMU driver updates. (Julien Thierry and others) * for-next/perf: perf: arm-cmn: Fix conversion specifiers for node type perf: arm-cmn: Fix unsigned comparison to less than zero arm_pmu: arm64: Use NMIs for PMU arm_pmu: Introduce pmu_irq_ops KVM: arm64: pmu: Make overflow handler NMI safe arm64: perf: Defer irq_work to IPI_IRQ_WORK arm64: perf: Remove PMU locking arm64: perf: Avoid PMXEV* indirection arm64: perf: Add missing ISB in armv8pmu_enable_counter() perf: Add Arm CMN-600 PMU driver perf: Add Arm CMN-600 DT binding arm64: perf: Add support caps under sysfs drivers/perf: thunderx2_pmu: Fix memory resource error handling drivers/perf: xgene_pmu: Fix uninitialized resource struct perf: arm_dsu: Support DSU ACPI devices arm64: perf: Remove unnecessary event_idx check drivers/perf: hisi: Add missing include of linux/module.h arm64: perf: Add general hardware LLC events for PMUv3 Support for the Armv8.3 Pointer Authentication enhancements. (By Amit Daniel Kachhap) * for-next/ptrauth: arm64: kprobe: clarify the comment of steppable hint instructions arm64: kprobe: disable probe of fault prone ptrauth instruction arm64: cpufeature: Modify address authentication cpufeature to exact arm64: ptrauth: Introduce Armv8.3 pointer authentication enhancements arm64: traps: Allow force_signal_inject to pass esr error code arm64: kprobe: add checks for ARMv8.3-PAuth combined instructions Tonnes of cleanup to the SDEI driver. (Gavin Shan) * for-next/sdei: firmware: arm_sdei: Remove _sdei_event_unregister() firmware: arm_sdei: Remove _sdei_event_register() firmware: arm_sdei: Introduce sdei_do_local_call() firmware: arm_sdei: Cleanup on cross call function firmware: arm_sdei: Remove while loop in sdei_event_unregister() firmware: arm_sdei: Remove while loop in sdei_event_register() firmware: arm_sdei: Remove redundant error message in sdei_probe() firmware: arm_sdei: Remove duplicate check in sdei_get_conduit() firmware: arm_sdei: Unregister driver on error in sdei_init() firmware: arm_sdei: Avoid nested statements in sdei_init() firmware: arm_sdei: Retrieve event number from event instance firmware: arm_sdei: Common block for failing path in sdei_event_create() firmware: arm_sdei: Remove sdei_is_err() Selftests for Pointer Authentication and FPSIMD/SVE context-switching. (Mark Brown and Boyan Karatotev) * for-next/selftests: selftests: arm64: Add build and documentation for FP tests selftests: arm64: Add wrapper scripts for stress tests selftests: arm64: Add utility to set SVE vector lengths selftests: arm64: Add stress tests for FPSMID and SVE context switching selftests: arm64: Add test for the SVE ptrace interface selftests: arm64: Test case for enumeration of SVE vector lengths kselftests/arm64: add PAuth tests for single threaded consistency and differently initialized keys kselftests/arm64: add PAuth test for whether exec() changes keys kselftests/arm64: add nop checks for PAuth tests kselftests/arm64: add a basic Pointer Authentication test Implementation of ARCH_STACKWALK for unwinding. (Mark Brown) * for-next/stacktrace: arm64: Move console stack display code to stacktrace.c arm64: stacktrace: Convert to ARCH_STACKWALK arm64: stacktrace: Make stack walk callback consistent with generic code stacktrace: Remove reliable argument from arch_stack_walk() callback Support for ASID pinning, which is required when sharing page-tables with the SMMU. (Jean-Philippe Brucker) * for-next/svm: arm64: cpufeature: Export symbol read_sanitised_ftr_reg() arm64: mm: Pin down ASIDs for sharing mm with devices Rely on firmware tables for establishing CPU topology. (Valentin Schneider) * for-next/topology: arm64: topology: Stop using MPIDR for topology information Spelling fixes. (Xiaoming Ni and Yanfei Xu) * for-next/tpyos: arm64/numa: Fix a typo in comment of arm64_numa_init arm64: fix some spelling mistakes in the comments by codespell vDSO cleanups. (Will Deacon) * for-next/vdso: arm64: vdso: Fix unusual formatting in *setup_additional_pages() arm64: vdso32: Remove a bunch of #ifdef CONFIG_COMPAT_VDSO guards
2020-10-01pipe: remove pipe_wait() and fix wakeup race with spliceLinus Torvalds3-27/+48
The pipe splice code still used the old model of waiting for pipe IO by using a non-specific "pipe_wait()" that waited for any pipe event to happen, which depended on all pipe IO being entirely serialized by the pipe lock. So by checking the state you were waiting for, and then adding yourself to the wait queue before dropping the lock, you were guaranteed to see all the wakeups. Strictly speaking, the actual wakeups were not done under the lock, but the pipe_wait() model still worked, because since the waiter held the lock when checking whether it should sleep, it would always see the current state, and the wakeup was always done after updating the state. However, commit 0ddad21d3e99 ("pipe: use exclusive waits when reading or writing") split the single wait-queue into two, and in the process also made the "wait for event" code wait for _two_ wait queues, and that then showed a race with the wakers that were not serialized by the pipe lock. It's only splice that used that "pipe_wait()" model, so the problem wasn't obvious, but Josef Bacik reports: "I hit a hang with fstest btrfs/187, which does a btrfs send into /dev/null. This works by creating a pipe, the write side is given to the kernel to write into, and the read side is handed to a thread that splices into a file, in this case /dev/null. The box that was hung had the write side stuck here [pipe_write] and the read side stuck here [splice_from_pipe_next -> pipe_wait]. [ more details about pipe_wait() scenario ] The problem is we're doing the prepare_to_wait, which sets our state each time, however we can be woken up either with reads or writes. In the case above we race with the WRITER waking us up, and re-set our state to INTERRUPTIBLE, and thus never break out of schedule" Josef had a patch that avoided the issue in pipe_wait() by just making it set the state only once, but the deeper problem is that pipe_wait() depends on a level of synchonization by the pipe mutex that it really shouldn't. And the whole "wait for any pipe state change" model really isn't very good to begin with. So rather than trying to work around things in pipe_wait(), remove that legacy model of "wait for arbitrary pipe event" entirely, and actually create functions that wait for the pipe actually being readable or writable, and can do so without depending on the pipe lock serializing everything. Fixes: 0ddad21d3e99 ("pipe: use exclusive waits when reading or writing") Link: https://lore.kernel.org/linux-fsdevel/bfa88b5ad6f069b2b679316b9e495a970130416c.1601567868.git.josef@toxicpanda.com/ Reported-by: Josef Bacik <[email protected]> Reviewed-and-tested-by: Josef Bacik <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2020-10-01perf: arm-cmn: Fix conversion specifiers for node typeWill Deacon1-2/+2
The node type field is an enum type, so print it as a 32-bit quantity rather than as an unsigned short. Link: https://lore.kernel.org/r/[email protected] Reported-by: kernel test robot <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2020-10-01perf: arm-cmn: Fix unsigned comparison to less than zeroWill Deacon1-1/+1
Ensure that the 'irq' field of 'struct arm_cmn_dtc' is a signed int so that it can be compared '< 0'. Link: https://lore.kernel.org/r/20200929170835.GA15956@embeddedor Addresses-Coverity-ID: 1497488 ("Unsigned compared against 0") Fixes: 0ba64770a2f2 ("perf: Add Arm CMN-600 PMU driver") Reported-by: Gustavo A. R. Silva <[email protected]> Reviewed-by: Gustavo A. R. Silva <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2020-10-02tpm_tis: Add a check for invalid statusJames Bottomley2-0/+12
Some TIS based TPMs can return 0xff to status reads if the locality hasn't been properly requested. Detect this condition by checking the bits that the TIS spec specifies must return zero are clear and return zero in that case. Also drop a warning so the problem can be identified in the calling path and fixed (usually a missing try_get_ops()). Signed-off-by: James Bottomley <[email protected]> Reviewed-by: Jarkko Sakkinen <[email protected]> Signed-off-by: Jarkko Sakkinen <[email protected]>
2020-10-02tpm: use %*ph to print small bufferAndy Shevchenko1-21/+10
Use %*ph format to print small buffer as hex string. Signed-off-by: Andy Shevchenko <[email protected]> Reviewed-by: Petr Vorel <[email protected]> Reviewed-by: Jarkko Sakkinen <[email protected]> Signed-off-by: Jarkko Sakkinen <[email protected]>
2020-10-02dt-bindings: Add SynQucer TPM MMIO as a trivial deviceMasahisa Kojima1-0/+2
Add a compatible string for the SynQuacer TPM to the binding for a TPM exposed via a memory mapped TIS frame. The MMIO window behaves slightly differently on this hardware, so it requires its own identifier. Cc: Rob Herring <[email protected]> Cc: Ard Biesheuvel <[email protected]> Acked-by: Rob Herring <[email protected]> Signed-off-by: Masahisa Kojima <[email protected]> Acked-by: Jarkko Sakkinen <[email protected]> Signed-off-by: Jarkko Sakkinen <[email protected]>
2020-10-02tpm: tis: add support for MMIO TPM on SynQuacerMasahisa Kojima3-0/+221
When fitted, the SynQuacer platform exposes its SPI TPM via a MMIO window that is backed by the SPI command sequencer in the SPI bus controller. This arrangement has the limitation that only byte size accesses are supported, and so we'll need to provide a separate module that take this into account. Signed-off-by: Ard Biesheuvel <[email protected]> Signed-off-by: Masahisa Kojima <[email protected]> Reviewed-by: Jarkko Sakkinen <[email protected]> Signed-off-by: Jarkko Sakkinen <[email protected]>
2020-10-01Merge tag 'iommu-fixes-v5.9-rc7' of ↵Linus Torvalds3-50/+18
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu fixes from Joerg Roedel: - Fix a device reference counting bug in the Exynos IOMMU driver. - Lockdep fix for the Intel VT-d driver. - Fix a bug in the AMD IOMMU driver which caused corruption of the IVRS ACPI table and caused IOMMU driver initialization failures in kdump kernels. * tag 'iommu-fixes-v5.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/vt-d: Fix lockdep splat in iommu_flush_dev_iotlb() iommu/amd: Fix the overwritten field in IVMD header iommu/exynos: add missing put_device() call in exynos_iommu_of_xlate()
2020-10-01r8169: fix data corruption issue on RTL8402Heiner Kallweit1-0/+4
Petr reported that after resume from suspend RTL8402 partially truncates incoming packets, and re-initializing register RxConfig before the actual chip re-initialization sequence is needed to avoid the issue. Reported-by: Petr Tesarik <[email protected]> Proposed-by: Petr Tesarik <[email protected]> Tested-by: Petr Tesarik <[email protected]> Signed-off-by: Heiner Kallweit <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-10-01r8169: fix handling ether_clkHeiner Kallweit1-13/+19
Petr reported that system freezes on r8169 driver load on a system using ether_clk. The original change was done under the assumption that the clock isn't needed for basic operations like chip register access. But obviously that was wrong. Therefore effectively revert the original change, and in addition leave the clock active when suspending and WoL is enabled. Chip may not be able to process incoming packets otherwise. Fixes: 9f0b54cd1672 ("r8169: move switching optional clock on/off to pll power functions") Reported-by: Petr Tesarik <[email protected]> Tested-by: Petr Tesarik <[email protected]> Signed-off-by: Heiner Kallweit <[email protected]> Reviewed-by: Hans de Goede <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-10-01Merge tag 'arm64-fixes' of ↵Linus Torvalds2-3/+21
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fix from Catalin Marinas: "A previous commit to prevent AML memory opregions from accessing the kernel memory turned out to be too restrictive. Relax the permission check to permit the ACPI core to map kernel memory used for table overrides" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: permit ACPI core to map kernel memory used for table overrides
2020-10-01x86/uv/time: Use a flexible array in struct uv_rtc_timer_headGustavo A. R. Silva1-4/+3
There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. struct uv_rtc_timer_head contains a one-element array cpu[1]. Switch it to a flexible array and use the struct_size() helper to calculate the allocation size. Also, save some heap space in the process[3]. [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays [3] https://lore.kernel.org/lkml/20200518190114.GA7757@embeddedor/ [ bp: Massage a bit. ] Signed-off-by: Gustavo A. R. Silva <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Kees Cook <[email protected]> Cc: Steve Wahl <[email protected]> Link: https://lkml.kernel.org/r/20201001145608.GA10204@embeddedor
2020-10-01Merge tag 'drm-fixes-2020-10-01-1' of git://anongit.freedesktop.org/drm/drmLinus Torvalds18-43/+156
Pull drm fixes from Dave Airlie: "AMD and vmwgfx fixes. Just dequeuing these a bit early as the AMD ones are bit larger than I'd prefer, but Alex missed last week so it's a double set of fixes. The larger ones are just register header fixes for the new chips that were just introduced in rc1 along with some new PCI IDs for new hw. Otherwise it is usual fixes. The vmwgfx fix was due to some testing I was doing and found we weren't booting properly, vmware had the fix internally so hurried it vmwgfx: - fix a regression due to TTM refactor amdgpu: - Fix potential double free in userptr handling - Sienna Cichlid and Navy Flounder udpates - Add Sienna Cichlid PCI IDs - Drop experimental flag for navi12 - Raven fixes - Renoir fixes - HDCP fix - DCN3 fix for clang and older versions of gcc - Fix a runtime pm refcount issue" * tag 'drm-fixes-2020-10-01-1' of git://anongit.freedesktop.org/drm/drm: drm/amdgpu: disable gfxoff temporarily for navy_flounder drm/amd/pm: setup APU dpm clock table in SMU HW initialization drm/vmwgfx: Fix error handling in get_node drm/amd/display: remove duplicate call to rn_vbios_smu_get_smu_version() drm/amdgpu/swsmu/smu12: fix force clock handling for mclk drm/amdgpu: restore proper ref count in amdgpu_display_crtc_set_config drm/amdgpu/display: fix CFLAGS setup for DCN30 drm/amd/display: fix return value check for hdcp_work drm/amdgpu: remove gpu_info fw support for sienna_cichlid etc. drm/amd/pm: Removed fixed clock in auto mode DPM drm/amdgpu: remove experimental flag from navi12 drm/amdgpu: add device ID for sienna_cichlid (v2) drm/amdgpu: use the AV1 defines for VCN 3.0 drm/amdgpu: add VCN 3.0 AV1 registers drm/amdgpu: add the GC 10.3 VRS registers drm/amdgpu: prevent double kfree ttm->sg
2020-10-01Merge tag 'trace-v5.9-rc6' of ↵Linus Torvalds2-8/+8
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "Two tracing fixes: - Fix temp buffer accounting that caused a WARNING for ftrace_dump_on_opps() - Move the recursion check in one of the function callback helpers to the beginning of the function, as if the rcu_is_watching() gets traced, it will cause a recursive loop that will crash the kernel" * tag 'trace-v5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: ftrace: Move RCU is watching check after recursion check tracing: Fix trace_find_next_entry() accounting of temp buffer size
2020-10-01bpf: Fix "unresolved symbol" build error with resolve_btfidsYonghong Song1-0/+6
Michal reported a build failure likes below: BTFIDS vmlinux FAILED unresolved symbol tcp_timewait_sock make[1]: *** [/.../linux-5.9-rc7/Makefile:1176: vmlinux] Error 255 This error can be triggered when config has CONFIG_NET enabled but CONFIG_INET disabled. In this case, there is no user of istructs inet_timewait_sock and tcp_timewait_sock and hence vmlinux BTF types are not generated for these two structures. To fix the problem, let us force BTF generation for these two structures with BTF_TYPE_EMIT. Fixes: fce557bcef11 ("bpf: Make btf_sock_ids global") Reported-by: Michal Kubecek <[email protected]> Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-10-01pipe: Fix memory leaks in create_pipe_files()Qian Cai2-6/+11
Calling pipe2() with O_NOTIFICATION_PIPE could results in memory leaks unless watch_queue_init() is successful. In case of watch_queue_init() failure in pipe2() we are left with inode and pipe_inode_info instances that need to be freed. That failure exit has been introduced in commit c73be61cede5 ("pipe: Add general notification queue support") and its handling should've been identical to nearby treatment of alloc_file_pseudo() failures - it is dealing with the same situation. As it is, the mainline kernel leaks in that case. Another problem is that CONFIG_WATCH_QUEUE and !CONFIG_WATCH_QUEUE cases are treated differently (and the former leaks just pipe_inode_info, the latter - both pipe_inode_info and inode). Fixed by providing a dummy wacth_queue_init() in !CONFIG_WATCH_QUEUE case and by having failures of wacth_queue_init() handled the same way we handle alloc_file_pseudo() ones. Fixes: c73be61cede5 ("pipe: Add general notification queue support") Signed-off-by: Qian Cai <[email protected]> Signed-off-by: Al Viro <[email protected]>
2020-10-01iommu/vt-d: Fix lockdep splat in iommu_flush_dev_iotlb()Lu Baolu1-2/+2
Lock(&iommu->lock) without disabling irq causes lockdep warnings. [ 12.703950] ======================================================== [ 12.703962] WARNING: possible irq lock inversion dependency detected [ 12.703975] 5.9.0-rc6+ #659 Not tainted [ 12.703983] -------------------------------------------------------- [ 12.703995] systemd-udevd/284 just changed the state of lock: [ 12.704007] ffffffffbd6ff4d8 (device_domain_lock){..-.}-{2:2}, at: iommu_flush_dev_iotlb.part.57+0x2e/0x90 [ 12.704031] but this lock took another, SOFTIRQ-unsafe lock in the past: [ 12.704043] (&iommu->lock){+.+.}-{2:2} [ 12.704045] and interrupts could create inverse lock ordering between them. [ 12.704073] other info that might help us debug this: [ 12.704085] Possible interrupt unsafe locking scenario: [ 12.704097] CPU0 CPU1 [ 12.704106] ---- ---- [ 12.704115] lock(&iommu->lock); [ 12.704123] local_irq_disable(); [ 12.704134] lock(device_domain_lock); [ 12.704146] lock(&iommu->lock); [ 12.704158] <Interrupt> [ 12.704164] lock(device_domain_lock); [ 12.704174] *** DEADLOCK *** Signed-off-by: Lu Baolu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2020-10-01xen/events: don't use chip_data for legacy IRQsJuergen Gross1-8/+21
Since commit c330fb1ddc0a ("XEN uses irqdesc::irq_data_common::handler_data to store a per interrupt XEN data pointer which contains XEN specific information.") Xen is using the chip_data pointer for storing IRQ specific data. When running as a HVM domain this can result in problems for legacy IRQs, as those might use chip_data for their own purposes. Use a local array for this purpose in case of legacy IRQs, avoiding the double use. Cc: [email protected] Fixes: c330fb1ddc0a ("XEN uses irqdesc::irq_data_common::handler_data to store a per interrupt XEN data pointer which contains XEN specific information.") Signed-off-by: Juergen Gross <[email protected]> Tested-by: Stefan Bader <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Juergen Gross <[email protected]>
2020-10-01iommu/amd: Fix the overwritten field in IVMD headerAdrian Huang1-46/+10
Commit 387caf0b759a ("iommu/amd: Treat per-device exclusion ranges as r/w unity-mapped regions") accidentally overwrites the 'flags' field in IVMD (struct ivmd_header) when the I/O virtualization memory definition is associated with the exclusion range entry. This leads to the corrupted IVMD table (incorrect checksum). The kdump kernel reports the invalid checksum: ACPI BIOS Warning (bug): Incorrect checksum in table [IVRS] - 0x5C, should be 0x60 (20200717/tbprint-177) AMD-Vi: [Firmware Bug]: IVRS invalid checksum Fix the above-mentioned issue by modifying the 'struct unity_map_entry' member instead of the IVMD header. Cleanup: The *exclusion_range* functions are not used anymore, so get rid of them. Fixes: 387caf0b759a ("iommu/amd: Treat per-device exclusion ranges as r/w unity-mapped regions") Reported-and-tested-by: Baoquan He <[email protected]> Signed-off-by: Adrian Huang <[email protected]> Cc: Jerry Snitselaar <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2020-10-01arm64: dbm: Invalidate local TLB when setting TCR_EL1.HDWill Deacon1-0/+1
TCR_EL1.HD is permitted to be cached in a TLB, so invalidate the local TLB after setting the bit when detected support for the feature. Although this isn't strictly necessary, since we can happily operate with the bit effectively clear, the current code uses an ISB in a half-hearted attempt to make the change effective, so let's just fix that up. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Catalin Marinas <[email protected]> Reviewed-by: Mark Rutland <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2020-10-01KVM: arm64: Restore missing ISB on nVHE __tlb_switch_to_guestMarc Zyngier1-0/+7
Commit a0e50aa3f4a8 ("KVM: arm64: Factor out stage 2 page table data from struct kvm") dropped the ISB after __load_guest_stage2(), only leaving the one that is required when the speculative AT workaround is in effect. As Andrew points it: "This alternative is 'backwards' to avoid a double ISB as there is one in __load_guest_stage2 when the workaround is active." Restore the missing ISB, conditionned on the AT workaround not being active. Fixes: a0e50aa3f4a8 ("KVM: arm64: Factor out stage 2 page table data from struct kvm") Reported-by: Andrew Scull <[email protected]> Reported-by: Thomas Tai <[email protected]> Signed-off-by: Marc Zyngier <[email protected]>
2020-10-01arm64: mm: Make flush_tlb_fix_spurious_fault() a no-opWill Deacon2-1/+11
Our use of broadcast TLB maintenance means that spurious page-faults that have been handled already by another CPU do not require additional TLB maintenance. Make flush_tlb_fix_spurious_fault() a no-op and rely on the existing TLB invalidation instead. Add an explicit flush_tlb_page() when making a page dirty, as the TLB is permitted to cache the old read-only entry. Reviewed-by: Catalin Marinas <[email protected]> Link: https://lore.kernel.org/r/20200728092220.GA21800@willie-the-truck Signed-off-by: Will Deacon <[email protected]>
2020-10-01gpio: pca953x: Correctly initialize registers 6 and 7 for PCA957xAndy Shevchenko1-1/+4
When driver has been converted to the bitmap API the non-bitmap functions started behaving differently on 32-bit BE architectures since the bytes in two consequent unsigned longs are in different order in comparison to byte array. Hence if the chip had had more than 32 lines the memset() call over it would have not set up upper lines correctly. Although it's currently a theoretical case (no supported chips of this type has 32+ lines), it's better to provide a clean code to avoid people thinking this is okay and potentially producing not fully working things. Fixes: 35d13d94893f ("gpio: pca953x: convert to use bitmap API") Signed-off-by: Andy Shevchenko <[email protected]> Reviewed-by: Bartosz Golaszewski <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Linus Walleij <[email protected]>
2020-10-01gpio: pca953x: Use bitmap API over implicit GCC extensionAndy Shevchenko1-1/+3
In IRQ handler we have to clear bitmap before use. Currently the GCC extension has been used for that. For sake of the consistency switch to bitmap API. As expected bloat-o-meter shows no difference in the object size. Signed-off-by: Andy Shevchenko <[email protected]> Reviewed-by: Bartosz Golaszewski <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Linus Walleij <[email protected]>
2020-10-01pinctrl: mediatek: check mtk_is_virt_gpio input parameterHanks Chen1-0/+4
check mtk_is_virt_gpio input parameter, virtual gpio need to support eint mode. add error handler for the ko case to fix this boot fail: pc : mtk_is_virt_gpio+0x20/0x38 [pinctrl_mtk_common_v2] lr : mtk_gpio_get_direction+0x44/0xb0 [pinctrl_paris] Fixes: edd546465002 ("pinctrl: mediatek: avoid virtual gpio trying to set reg") Signed-off-by: Hanks Chen <[email protected]> Acked-by: Sean Wang <[email protected]> Singed-off-by: Jie Yang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Linus Walleij <[email protected]>
2020-10-01pinctrl: qcom: sm8250: correct sdc2_clkDmitry Baryshkov1-1/+1
Correct sdc2_clk pin definition (register offset is wrong, verified by the msm-4.19 driver). Fixes: 4e3ec9e407ad ("pinctrl: qcom: Add sm8250 pinctrl driver.") Signed-off-by: Dmitry Baryshkov <[email protected]> Reviewed-by: Bjorn Andersson <[email protected]> Acked-by: Manivannan Sadhasivam <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Linus Walleij <[email protected]>
2020-10-01Merge tag 'amd-drm-fixes-5.9-2020-09-30' of ↵Dave Airlie16-41/+154
git://people.freedesktop.org/~agd5f/linux into drm-fixes amd-drm-fixes-5.9-2020-09-30: amdgpu: - Fix potential double free in userptr handling - Sienna Cichlid and Navy Flounder udpates - Add Sienna Cichlid PCI IDs - Drop experimental flag for navi12 - Raven fixes - Renoir fixes - HDCP fix - DCN3 fix for clang and older versions of gcc - Fix a runtime pm refcount issue Signed-off-by: Dave Airlie <[email protected]> From: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-09-30Merge branch 'Fix-bugs-in-Octeontx2-netdev-driver'David S. Miller8-21/+47
Geetha sowjanya says: ==================== Fix bugs in Octeontx2 netdev driver In existing Octeontx2 network drivers code has issues like stale entries in broadcast replication list, missing L3TYPE for IPv6 frames, running tx queues on error and race condition in mbox reset. This patch set fixes the above issues. ==================== Signed-off-by: David S. Miller <[email protected]>
2020-09-30octeontx2-pf: Fix synchnorization issue in mboxHariprasad Kelam4-9/+19
Mbox implementation in octeontx2 driver has three states alloc, send and reset in mbox response. VF allocate and sends message to PF for processing, PF ACKs them back and reset the mbox memory. In some case we see synchronization issue where after msgs_acked is incremented and before mbox_reset API is called, if current execution is scheduled out and a different thread is scheduled in which checks for msgs_acked. Since the new thread sees msgs_acked == msgs_sent it will try to allocate a new message and to send a new mbox message to PF.Now if mbox_reset is scheduled in, PF will see '0' in msgs_send. This patch fixes the issue by calling mbox_reset before incrementing msgs_acked flag for last processing message and checks for valid message size. Fixes: d424b6c02 ("octeontx2-pf: Enable SRIOV and added VF mbox handling") Signed-off-by: Hariprasad Kelam <[email protected]> Signed-off-by: Geetha sowjanya <[email protected]> Signed-off-by: Sunil Goutham <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-30octeontx2-pf: Fix the device state on errorHariprasad Kelam1-1/+4
Currently in otx2_open on failure of nix_lf_start transmit queues are not stopped which are already started in link_event. Since the tx queues are not stopped network stack still try's to send the packets leading to driver crash while access the device resources. Fixes: 50fe6c02e ("octeontx2-pf: Register and handle link notifications") Signed-off-by: Hariprasad Kelam <[email protected]> Signed-off-by: Geetha sowjanya <[email protected]> Signed-off-by: Sunil Goutham <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-30octeontx2-pf: Fix TCP/UDP checksum offload for IPv6 framesGeetha sowjanya1-0/+1
For TCP/UDP checksum offload feature in Octeontx2 expects L3TYPE to be set irrespective of IP header checksum is being offloaded or not. Currently for IPv6 frames L3TYPE is not being set resulting in packet drop with checksum error. This patch fixes this issue. Fixes: 3ca6c4c88 ("octeontx2-pf: Add packet transmission support") Signed-off-by: Geetha sowjanya <[email protected]> Signed-off-by: Sunil Goutham <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-30octeontx2-af: Fix enable/disable of default NPC entriesSubbaraya Sundeep3-11/+23
Packet replication feature present in Octeontx2 is a hardware linked list of PF and its VF interfaces so that broadcast packets are sent to all interfaces present in the list. It is driver job to add and delete a PF/VF interface to/from the list when the interface is brought up and down. This patch fixes the npc_enadis_default_entries function to handle broadcast replication properly if packet replication feature is present. Fixes: 40df309e4166 ("octeontx2-af: Support to enable/disable default MCAM entries") Signed-off-by: Subbaraya Sundeep <[email protected]> Signed-off-by: Geetha sowjanya <[email protected]> Signed-off-by: Sunil Goutham <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-30Merge branch '100GbE' of https://github.com/anguy11/net-queueDavid S. Miller2-24/+35
Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2020-09-30 This series contains updates to ice driver only. Jake increases the wait time for firmware response as it can take longer than the current wait time. Preserves the NVM capabilities of the device in safe mode so the device reports its NVM update capabilities properly when in this state. v2: Added cover letter ==================== Signed-off-by: David S. Miller <[email protected]>
2020-09-30MAINTAINERS: Add Pali Rohár as aardvark PCI maintainerPali Rohár1-0/+1
Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Pali Rohár <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Acked-by: Thomas Petazzzoni <[email protected]>
2020-09-30arm64: permit ACPI core to map kernel memory used for table overridesArd Biesheuvel2-3/+21
Jonathan reports that the strict policy for memory mapped by the ACPI core breaks the use case of passing ACPI table overrides via initramfs. This is due to the fact that the memory type used for loading the initramfs in memory is not recognized as a memory type that is typically used by firmware to pass firmware tables. Since the purpose of the strict policy is to ensure that no AML or other ACPI code can manipulate any memory that is used by the kernel to keep its internal state or the state of user tasks, we can relax the permission check, and allow mappings of memory that is reserved and marked as NOMAP via memblock, and therefore not covered by the linear mapping to begin with. Fixes: 1583052d111f ("arm64/acpi: disallow AML memory opregions to access kernel memory") Fixes: 325f5585ec36 ("arm64/acpi: disallow writeable AML opregion mapping for EFI code regions") Reported-by: Jonathan Cameron <[email protected]> Signed-off-by: Ard Biesheuvel <[email protected]> Tested-by: Jonathan Cameron <[email protected]> Cc: Sudeep Holla <[email protected]> Cc: Lorenzo Pieralisi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]>
2020-09-30Merge tag 'clk-fixes-for-linus' of ↵Linus Torvalds5-8/+12
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fixes from Stephen Boyd: "Another batch of clk driver fixes: - Make sure DRAM and ChipID region doesn't get disabled on Exynos - Fix a SATA failure on Tegra - Fix the emac_ptp clk divider on stratix10" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: socfpga: stratix10: fix the divider for the emac_ptp_free_clk clk: samsung: exynos4: mark 'chipid' clock as CLK_IGNORE_UNUSED clk: tegra: Fix missing prototype for tegra210_clk_register_emc() clk: tegra: Always program PLL_E when enabled clk: tegra: Capitalization fixes clk: samsung: Keep top BPLL mux on Exynos542x enabled
2020-09-30RISC-V: Check clint_time_val before useAnup Patel2-4/+13
The NoMMU kernel is broken for QEMU virt machine from Linux-5.9-rc6 because clint_time_val is used even before CLINT driver is probed at following places: 1. rand_initialize() calls get_cycles() which in-turn uses clint_time_val 2. boot_init_stack_canary() calls get_cycles() which in-turn uses clint_time_val The issue#1 (above) is fixed by providing custom random_get_entropy() for RISC-V NoMMU kernel. For issue#2 (above), we remove dependency of boot_init_stack_canary() on get_cycles() and this is aligned with the boot_init_stack_canary() implementations of ARM, ARM64 and MIPS kernel. Fixes: d5be89a8d118 ("RISC-V: Resurrect the MMIO timer implementation for M-mode systems") Signed-off-by: Anup Patel <[email protected]> Tested-by: Damien Le Moal <[email protected]> Signed-off-by: Palmer Dabbelt <[email protected]>
2020-09-30btrfs: fix filesystem corruption after a device replaceFilipe Manana1-1/+39
We use a device's allocation state tree to track ranges in a device used for allocated chunks, and we set ranges in this tree when allocating a new chunk. However after a device replace operation, we were not setting the allocated ranges in the new device's allocation state tree, so that tree is empty after a device replace. This means that a fitrim operation after a device replace will trim the device ranges that have allocated chunks and extents, as we trim every range for which there is not a range marked in the device's allocation state tree. It is also important during chunk allocation, since the device's allocation state is used to determine if a range is already allocated when allocating a new chunk. This is trivial to reproduce and the following script triggers the bug: $ cat reproducer.sh #!/bin/bash DEV1="/dev/sdg" DEV2="/dev/sdh" DEV3="/dev/sdi" wipefs -a $DEV1 $DEV2 $DEV3 &> /dev/null # Create a raid1 test fs on 2 devices. mkfs.btrfs -f -m raid1 -d raid1 $DEV1 $DEV2 > /dev/null mount $DEV1 /mnt/btrfs xfs_io -f -c "pwrite -S 0xab 0 10M" /mnt/btrfs/foo echo "Starting to replace $DEV1 with $DEV3" btrfs replace start -B $DEV1 $DEV3 /mnt/btrfs echo echo "Running fstrim" fstrim /mnt/btrfs echo echo "Unmounting filesystem" umount /mnt/btrfs echo "Mounting filesystem in degraded mode using $DEV3 only" wipefs -a $DEV1 $DEV2 &> /dev/null mount -o degraded $DEV3 /mnt/btrfs if [ $? -ne 0 ]; then dmesg | tail echo echo "Failed to mount in degraded mode" exit 1 fi echo echo "File foo data (expected all bytes = 0xab):" od -A d -t x1 /mnt/btrfs/foo umount /mnt/btrfs When running the reproducer: $ ./replace-test.sh wrote 10485760/10485760 bytes at offset 0 10 MiB, 2560 ops; 0.0901 sec (110.877 MiB/sec and 28384.5216 ops/sec) Starting to replace /dev/sdg with /dev/sdi Running fstrim Unmounting filesystem Mounting filesystem in degraded mode using /dev/sdi only mount: /mnt/btrfs: wrong fs type, bad option, bad superblock on /dev/sdi, missing codepage or helper program, or other error. [19581.748641] BTRFS info (device sdg): dev_replace from /dev/sdg (devid 1) to /dev/sdi started [19581.803842] BTRFS info (device sdg): dev_replace from /dev/sdg (devid 1) to /dev/sdi finished [19582.208293] BTRFS info (device sdi): allowing degraded mounts [19582.208298] BTRFS info (device sdi): disk space caching is enabled [19582.208301] BTRFS info (device sdi): has skinny extents [19582.212853] BTRFS warning (device sdi): devid 2 uuid 1f731f47-e1bb-4f00-bfbb-9e5a0cb4ba9f is missing [19582.213904] btree_readpage_end_io_hook: 25839 callbacks suppressed [19582.213907] BTRFS error (device sdi): bad tree block start, want 30490624 have 0 [19582.214780] BTRFS warning (device sdi): failed to read root (objectid=7): -5 [19582.231576] BTRFS error (device sdi): open_ctree failed Failed to mount in degraded mode So fix by setting all allocated ranges in the replace target device when the replace operation is finishing, when we are holding the chunk mutex and we can not race with new chunk allocations. A test case for fstests follows soon. Fixes: 1c11b63eff2a67 ("btrfs: replace pending/pinned chunks lists with io tree") CC: [email protected] # 5.2+ Reviewed-by: Nikolay Borisov <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Signed-off-by: Filipe Manana <[email protected]> Signed-off-by: David Sterba <[email protected]>
2020-09-30btrfs: move btrfs_rm_dev_replace_free_srcdev outside of all locksJosef Bacik2-1/+5
When closing and freeing the source device we could end up doing our final blkdev_put() on the bdev, which will grab the bd_mutex. As such we want to be holding as few locks as possible, so move this call outside of the dev_replace->lock_finishing_cancel_unmount lock. Since we're modifying the fs_devices we need to make sure we're holding the uuid_mutex here, so take that as well. There's a report from syzbot probably hitting one of the cases where the bd_mutex and device_list_mutex are taken in the wrong order, however it's not with device replace, like this patch fixes. As there's no reproducer available so far, we can't verify the fix. https://lore.kernel.org/lkml/[email protected]/ dashboard link: https://syzkaller.appspot.com/bug?extid=84a0634dc5d21d488419 WARNING: possible circular locking dependency detected 5.9.0-rc5-syzkaller #0 Not tainted ------------------------------------------------------ syz-executor.0/6878 is trying to acquire lock: ffff88804c17d780 (&bdev->bd_mutex){+.+.}-{3:3}, at: blkdev_put+0x30/0x520 fs/block_dev.c:1804 but task is already holding lock: ffff8880908cfce0 (&fs_devs->device_list_mutex){+.+.}-{3:3}, at: close_fs_devices.part.0+0x2e/0x800 fs/btrfs/volumes.c:1159 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #4 (&fs_devs->device_list_mutex){+.+.}-{3:3}: __mutex_lock_common kernel/locking/mutex.c:956 [inline] __mutex_lock+0x134/0x10e0 kernel/locking/mutex.c:1103 btrfs_finish_chunk_alloc+0x281/0xf90 fs/btrfs/volumes.c:5255 btrfs_create_pending_block_groups+0x2f3/0x700 fs/btrfs/block-group.c:2109 __btrfs_end_transaction+0xf5/0x690 fs/btrfs/transaction.c:916 find_free_extent_update_loop fs/btrfs/extent-tree.c:3807 [inline] find_free_extent+0x23b7/0x2e60 fs/btrfs/extent-tree.c:4127 btrfs_reserve_extent+0x166/0x460 fs/btrfs/extent-tree.c:4206 cow_file_range+0x3de/0x9b0 fs/btrfs/inode.c:1063 btrfs_run_delalloc_range+0x2cf/0x1410 fs/btrfs/inode.c:1838 writepage_delalloc+0x150/0x460 fs/btrfs/extent_io.c:3439 __extent_writepage+0x441/0xd00 fs/btrfs/extent_io.c:3653 extent_write_cache_pages.constprop.0+0x69d/0x1040 fs/btrfs/extent_io.c:4249 extent_writepages+0xcd/0x2b0 fs/btrfs/extent_io.c:4370 do_writepages+0xec/0x290 mm/page-writeback.c:2352 __writeback_single_inode+0x125/0x1400 fs/fs-writeback.c:1461 writeback_sb_inodes+0x53d/0xf40 fs/fs-writeback.c:1721 wb_writeback+0x2ad/0xd40 fs/fs-writeback.c:1894 wb_do_writeback fs/fs-writeback.c:2039 [inline] wb_workfn+0x2dc/0x13e0 fs/fs-writeback.c:2080 process_one_work+0x94c/0x1670 kernel/workqueue.c:2269 worker_thread+0x64c/0x1120 kernel/workqueue.c:2415 kthread+0x3b5/0x4a0 kernel/kthread.c:292 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294 -> #3 (sb_internal#2){.+.+}-{0:0}: percpu_down_read include/linux/percpu-rwsem.h:51 [inline] __sb_start_write+0x234/0x470 fs/super.c:1672 sb_start_intwrite include/linux/fs.h:1690 [inline] start_transaction+0xbe7/0x1170 fs/btrfs/transaction.c:624 find_free_extent_update_loop fs/btrfs/extent-tree.c:3789 [inline] find_free_extent+0x25e1/0x2e60 fs/btrfs/extent-tree.c:4127 btrfs_reserve_extent+0x166/0x460 fs/btrfs/extent-tree.c:4206 cow_file_range+0x3de/0x9b0 fs/btrfs/inode.c:1063 btrfs_run_delalloc_range+0x2cf/0x1410 fs/btrfs/inode.c:1838 writepage_delalloc+0x150/0x460 fs/btrfs/extent_io.c:3439 __extent_writepage+0x441/0xd00 fs/btrfs/extent_io.c:3653 extent_write_cache_pages.constprop.0+0x69d/0x1040 fs/btrfs/extent_io.c:4249 extent_writepages+0xcd/0x2b0 fs/btrfs/extent_io.c:4370 do_writepages+0xec/0x290 mm/page-writeback.c:2352 __writeback_single_inode+0x125/0x1400 fs/fs-writeback.c:1461 writeback_sb_inodes+0x53d/0xf40 fs/fs-writeback.c:1721 wb_writeback+0x2ad/0xd40 fs/fs-writeback.c:1894 wb_do_writeback fs/fs-writeback.c:2039 [inline] wb_workfn+0x2dc/0x13e0 fs/fs-writeback.c:2080 process_one_work+0x94c/0x1670 kernel/workqueue.c:2269 worker_thread+0x64c/0x1120 kernel/workqueue.c:2415 kthread+0x3b5/0x4a0 kernel/kthread.c:292 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294 -> #2 ((work_completion)(&(&wb->dwork)->work)){+.+.}-{0:0}: __flush_work+0x60e/0xac0 kernel/workqueue.c:3041 wb_shutdown+0x180/0x220 mm/backing-dev.c:355 bdi_unregister+0x174/0x590 mm/backing-dev.c:872 del_gendisk+0x820/0xa10 block/genhd.c:933 loop_remove drivers/block/loop.c:2192 [inline] loop_control_ioctl drivers/block/loop.c:2291 [inline] loop_control_ioctl+0x3b1/0x480 drivers/block/loop.c:2257 vfs_ioctl fs/ioctl.c:48 [inline] __do_sys_ioctl fs/ioctl.c:753 [inline] __se_sys_ioctl fs/ioctl.c:739 [inline] __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:739 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 -> #1 (loop_ctl_mutex){+.+.}-{3:3}: __mutex_lock_common kernel/locking/mutex.c:956 [inline] __mutex_lock+0x134/0x10e0 kernel/locking/mutex.c:1103 lo_open+0x19/0xd0 drivers/block/loop.c:1893 __blkdev_get+0x759/0x1aa0 fs/block_dev.c:1507 blkdev_get fs/block_dev.c:1639 [inline] blkdev_open+0x227/0x300 fs/block_dev.c:1753 do_dentry_open+0x4b9/0x11b0 fs/open.c:817 do_open fs/namei.c:3251 [inline] path_openat+0x1b9a/0x2730 fs/namei.c:3368 do_filp_open+0x17e/0x3c0 fs/namei.c:3395 do_sys_openat2+0x16d/0x420 fs/open.c:1168 do_sys_open fs/open.c:1184 [inline] __do_sys_open fs/open.c:1192 [inline] __se_sys_open fs/open.c:1188 [inline] __x64_sys_open+0x119/0x1c0 fs/open.c:1188 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 -> #0 (&bdev->bd_mutex){+.+.}-{3:3}: check_prev_add kernel/locking/lockdep.c:2496 [inline] check_prevs_add kernel/locking/lockdep.c:2601 [inline] validate_chain kernel/locking/lockdep.c:3218 [inline] __lock_acquire+0x2a96/0x5780 kernel/locking/lockdep.c:4426 lock_acquire+0x1f3/0xae0 kernel/locking/lockdep.c:5006 __mutex_lock_common kernel/locking/mutex.c:956 [inline] __mutex_lock+0x134/0x10e0 kernel/locking/mutex.c:1103 blkdev_put+0x30/0x520 fs/block_dev.c:1804 btrfs_close_bdev fs/btrfs/volumes.c:1117 [inline] btrfs_close_bdev fs/btrfs/volumes.c:1107 [inline] btrfs_close_one_device fs/btrfs/volumes.c:1133 [inline] close_fs_devices.part.0+0x1a4/0x800 fs/btrfs/volumes.c:1161 close_fs_devices fs/btrfs/volumes.c:1193 [inline] btrfs_close_devices+0x95/0x1f0 fs/btrfs/volumes.c:1179 close_ctree+0x688/0x6cb fs/btrfs/disk-io.c:4149 generic_shutdown_super+0x144/0x370 fs/super.c:464 kill_anon_super+0x36/0x60 fs/super.c:1108 btrfs_kill_super+0x38/0x50 fs/btrfs/super.c:2265 deactivate_locked_super+0x94/0x160 fs/super.c:335 deactivate_super+0xad/0xd0 fs/super.c:366 cleanup_mnt+0x3a3/0x530 fs/namespace.c:1118 task_work_run+0xdd/0x190 kernel/task_work.c:141 tracehook_notify_resume include/linux/tracehook.h:188 [inline] exit_to_user_mode_loop kernel/entry/common.c:163 [inline] exit_to_user_mode_prepare+0x1e1/0x200 kernel/entry/common.c:190 syscall_exit_to_user_mode+0x7e/0x2e0 kernel/entry/common.c:265 entry_SYSCALL_64_after_hwframe+0x44/0xa9 other info that might help us debug this: Chain exists of: &bdev->bd_mutex --> sb_internal#2 --> &fs_devs->device_list_mutex Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&fs_devs->device_list_mutex); lock(sb_internal#2); lock(&fs_devs->device_list_mutex); lock(&bdev->bd_mutex); *** DEADLOCK *** 3 locks held by syz-executor.0/6878: #0: ffff88809070c0e0 (&type->s_umount_key#70){++++}-{3:3}, at: deactivate_super+0xa5/0xd0 fs/super.c:365 #1: ffffffff8a5b37a8 (uuid_mutex){+.+.}-{3:3}, at: btrfs_close_devices+0x23/0x1f0 fs/btrfs/volumes.c:1178 #2: ffff8880908cfce0 (&fs_devs->device_list_mutex){+.+.}-{3:3}, at: close_fs_devices.part.0+0x2e/0x800 fs/btrfs/volumes.c:1159 stack backtrace: CPU: 0 PID: 6878 Comm: syz-executor.0 Not tainted 5.9.0-rc5-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x198/0x1fd lib/dump_stack.c:118 check_noncircular+0x324/0x3e0 kernel/locking/lockdep.c:1827 check_prev_add kernel/locking/lockdep.c:2496 [inline] check_prevs_add kernel/locking/lockdep.c:2601 [inline] validate_chain kernel/locking/lockdep.c:3218 [inline] __lock_acquire+0x2a96/0x5780 kernel/locking/lockdep.c:4426 lock_acquire+0x1f3/0xae0 kernel/locking/lockdep.c:5006 __mutex_lock_common kernel/locking/mutex.c:956 [inline] __mutex_lock+0x134/0x10e0 kernel/locking/mutex.c:1103 blkdev_put+0x30/0x520 fs/block_dev.c:1804 btrfs_close_bdev fs/btrfs/volumes.c:1117 [inline] btrfs_close_bdev fs/btrfs/volumes.c:1107 [inline] btrfs_close_one_device fs/btrfs/volumes.c:1133 [inline] close_fs_devices.part.0+0x1a4/0x800 fs/btrfs/volumes.c:1161 close_fs_devices fs/btrfs/volumes.c:1193 [inline] btrfs_close_devices+0x95/0x1f0 fs/btrfs/volumes.c:1179 close_ctree+0x688/0x6cb fs/btrfs/disk-io.c:4149 generic_shutdown_super+0x144/0x370 fs/super.c:464 kill_anon_super+0x36/0x60 fs/super.c:1108 btrfs_kill_super+0x38/0x50 fs/btrfs/super.c:2265 deactivate_locked_super+0x94/0x160 fs/super.c:335 deactivate_super+0xad/0xd0 fs/super.c:366 cleanup_mnt+0x3a3/0x530 fs/namespace.c:1118 task_work_run+0xdd/0x190 kernel/task_work.c:141 tracehook_notify_resume include/linux/tracehook.h:188 [inline] exit_to_user_mode_loop kernel/entry/common.c:163 [inline] exit_to_user_mode_prepare+0x1e1/0x200 kernel/entry/common.c:190 syscall_exit_to_user_mode+0x7e/0x2e0 kernel/entry/common.c:265 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x460027 RSP: 002b:00007fff59216328 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6 RAX: 0000000000000000 RBX: 0000000000076035 RCX: 0000000000460027 RDX: 0000000000403188 RSI: 0000000000000002 RDI: 00007fff592163d0 RBP: 0000000000000333 R08: 0000000000000000 R09: 000000000000000b R10: 0000000000000005 R11: 0000000000000246 R12: 00007fff59217460 R13: 0000000002df2a60 R14: 0000000000000000 R15: 00007fff59217460 Signed-off-by: Josef Bacik <[email protected]> [ add syzbot reference ] Signed-off-by: David Sterba <[email protected]>