aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2021-10-11KVM: arm64: vgic-v3: Check redist region is not above the VM IPA sizeRicardo Koller2-4/+8
Verify that the redistributor regions do not extend beyond the VM-specified IPA range (phys_size). This can happen when using KVM_VGIC_V3_ADDR_TYPE_REDIST or KVM_VGIC_V3_ADDR_TYPE_REDIST_REGIONS with: base + size > phys_size AND base < phys_size Add the missing check into vgic_v3_alloc_redist_region() which is called when setting the regions, and into vgic_v3_check_base() which is called when attempting the first vcpu-run. The vcpu-run check does not apply to KVM_VGIC_V3_ADDR_TYPE_REDIST_REGIONS because the regions size is known before the first vcpu-run. Note that using the REDIST_REGIONS API results in a different check, which already exists, at first vcpu run: that the number of redist regions is enough for all vcpus. Finally, this patch also enables some extra tests in vgic_v3_alloc_redist_region() by calculating "size" early for the legacy redist api: like checking that the REDIST region can fit all the already created vcpus. Reviewed-by: Eric Auger <[email protected]> Signed-off-by: Ricardo Koller <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-11kvm: arm64: vgic: Introduce vgic_check_iorangeRicardo Koller2-0/+26
Add the new vgic_check_iorange helper that checks that an iorange is sane: the start address and size have valid alignments, the range is within the addressable PA range, start+size doesn't overflow, and the start wasn't already defined. No functional change. Reviewed-by: Eric Auger <[email protected]> Signed-off-by: Ricardo Koller <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-11Merge branch kvm-arm64/pkvm/restrict-hypercalls into kvmarm-master/nextMarc Zyngier6-60/+117
* kvm-arm64/pkvm/restrict-hypercalls: : . : Restrict the use of some hypercalls as well as kexec once : the protected KVM mode has been initialised. : . KVM: arm64: Disable privileged hypercalls after pKVM finalisation KVM: arm64: Prevent re-finalisation of pKVM for a given CPU KVM: arm64: Propagate errors from __pkvm_prot_finalize hypercall KVM: arm64: Reject stub hypercalls after pKVM has been initialised arm64: Prevent kexec and hibernation if is_protected_kvm_enabled() KVM: arm64: Turn __KVM_HOST_SMCCC_FUNC_* into an enum (mostly) Signed-off-by: Marc Zyngier <[email protected]>
2021-10-11KVM: arm64: Disable privileged hypercalls after pKVM finalisationWill Deacon2-22/+40
After pKVM has been 'finalised' using the __pkvm_prot_finalize hypercall, the calling CPU will have a Stage-2 translation enabled to prevent access to memory pages owned by EL2. Although this forms a significant part of the process to deprivilege the host kernel, we also need to ensure that the hypercall interface is reduced so that the EL2 code cannot, for example, be re-initialised using a new set of vectors. Re-order the hypercalls so that only a suffix remains available after finalisation of pKVM. Cc: Marc Zyngier <[email protected]> Cc: Quentin Perret <[email protected]> Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-11KVM: arm64: Prevent re-finalisation of pKVM for a given CPUWill Deacon1-0/+3
__pkvm_prot_finalize() completes the deprivilege of the host when pKVM is in use by installing a stage-2 translation table for the calling CPU. Issuing the hypercall multiple times for a given CPU makes little sense, but in such a case just return early with -EPERM rather than go through the whole page-table dance again. Cc: Marc Zyngier <[email protected]> Cc: Quentin Perret <[email protected]> Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-11KVM: arm64: Propagate errors from __pkvm_prot_finalize hypercallWill Deacon1-11/+19
If the __pkvm_prot_finalize hypercall returns an error, we WARN but fail to propagate the failure code back to kvm_arch_init(). Pass a pointer to a zero-initialised return variable so that failure to finalise the pKVM protections on a host CPU can be reported back to KVM. Cc: Marc Zyngier <[email protected]> Cc: Quentin Perret <[email protected]> Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-11KVM: arm64: Reject stub hypercalls after pKVM has been initialisedWill Deacon2-17/+40
The stub hypercalls provide mechanisms to reset and replace the EL2 code, so uninstall them once pKVM has been initialised in order to ensure the integrity of the hypervisor code. To ensure pKVM initialisation remains functional, split cpu_hyp_reinit() into two helper functions to separate usage of the stub from usage of pkvm hypercalls either side of __pkvm_init on the boot CPU. Cc: Marc Zyngier <[email protected]> Cc: Quentin Perret <[email protected]> Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-11arm64: Prevent kexec and hibernation if is_protected_kvm_enabled()Will Deacon1-1/+2
When pKVM is enabled, the hypervisor code at EL2 and its data structures are inaccessible to the host kernel and cannot be torn down or replaced as this would defeat the integrity properies which pKVM aims to provide. Furthermore, the ABI between the host and EL2 is flexible and private to whatever the current implementation of KVM requires and so booting a new kernel with an old EL2 component is very likely to end in disaster. In preparation for uninstalling the hyp stub calls which are relied upon to reset EL2, disable kexec and hibernation in the host when protected KVM is enabled. Cc: Marc Zyngier <[email protected]> Cc: Quentin Perret <[email protected]> Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-11KVM: arm64: Turn __KVM_HOST_SMCCC_FUNC_* into an enum (mostly)Marc Zyngier1-20/+24
__KVM_HOST_SMCCC_FUNC_* is a royal pain, as there is a fair amount of churn around these #defines, and we avoid making it an enum only for the sake of the early init, low level code that requires __KVM_HOST_SMCCC_FUNC___kvm_hyp_init to be usable from assembly. Let's be brave and turn everything but this symbol into an enum, using a bit of arithmetic to avoid any overlap. Acked-by: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Marc Zyngier <[email protected]> Signed-off-by: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-05KVM: arm64: Release mmap_lock when using VM_SHARED with MTEQuentin Perret1-2/+4
VM_SHARED mappings are currently forbidden in a memslot with MTE to prevent two VMs racing to sanitise the same page. However, this check is performed while holding current->mm's mmap_lock, but fails to release it. Fix this by releasing the lock when needed. Fixes: ea7fc1bb1cd1 ("KVM: arm64: Introduce MTE VM feature") Signed-off-by: Quentin Perret <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-05KVM: arm64: Report corrupted refcount at EL2Quentin Perret1-0/+1
Some of the refcount manipulation helpers used at EL2 are instrumented to catch a corrupted state, but not all of them are treated equally. Let's make things more consistent by instrumenting hyp_page_ref_dec_and_test() as well. Acked-by: Will Deacon <[email protected]> Suggested-by: Will Deacon <[email protected]> Signed-off-by: Quentin Perret <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-05KVM: arm64: Fix host stage-2 PGD refcountQuentin Perret3-1/+27
The KVM page-table library refcounts the pages of concatenated stage-2 PGDs individually. However, when running KVM in protected mode, the host's stage-2 PGD is currently managed by EL2 as a single high-order compound page, which can cause the refcount of the tail pages to reach 0 when they shouldn't, hence corrupting the page-table. Fix this by introducing a new hyp_split_page() helper in the EL2 page allocator (matching the kernel's split_page() function), and make use of it from host_s2_zalloc_pages_exact(). Fixes: 1025c8c0c6ac ("KVM: arm64: Wrap the host with a stage 2") Acked-by: Will Deacon <[email protected]> Suggested-by: Will Deacon <[email protected]> Signed-off-by: Quentin Perret <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-05Merge tag 'kvm-riscv-5.16-1' of git://github.com/kvm-riscv/linux into HEADPaolo Bonzini27-10/+4714
Initial KVM RISC-V support Following features are supported by the initial KVM RISC-V support: 1. No RISC-V specific KVM IOCTL 2. Loadable KVM RISC-V module 3. Minimal possible KVM world-switch which touches only GPRs and few CSRs 4. Works on both RV64 and RV32 host 5. Full Guest/VM switch via vcpu_get/vcpu_put infrastructure 6. KVM ONE_REG interface for VCPU register access from KVM user-space 7. Interrupt controller emulation in KVM user-space 8. Timer and IPI emuation in kernel 9. Both Sv39x4 and Sv48x4 supported for RV64 host 10. MMU notifiers supported 11. Generic dirty log supported 12. FP lazy save/restore supported 13. SBI v0.1 emulation for Guest/VM 14. Forward unhandled SBI calls to KVM user-space 15. Hugepage support for Guest/VM 16. IOEVENTFD support for Vhost
2021-10-04RISC-V: KVM: Add MAINTAINERS entryAnup Patel1-0/+12
Add myself as maintainer for KVM RISC-V and Atish as designated reviewer. Signed-off-by: Atish Patra <[email protected]> Signed-off-by: Anup Patel <[email protected]> Acked-by: Paolo Bonzini <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]> Reviewed-by: Alexander Graf <[email protected]> Acked-by: Palmer Dabbelt <[email protected]>
2021-10-04RISC-V: KVM: Document RISC-V specific parts of KVM APIAnup Patel1-9/+184
Document RISC-V specific parts of the KVM API, such as: - The interrupt numbers passed to the KVM_INTERRUPT ioctl. - The states supported by the KVM_{GET,SET}_MP_STATE ioctls. - The registers supported by the KVM_{GET,SET}_ONE_REG interface and the encoding of those register ids. - The exit reason KVM_EXIT_RISCV_SBI for SBI calls forwarded to userspace tool. CC: Jonathan Corbet <[email protected]> CC: [email protected] Signed-off-by: Anup Patel <[email protected]> Acked-by: Palmer Dabbelt <[email protected]>
2021-10-04RISC-V: KVM: Add SBI v0.1 supportAtish Patra6-0/+217
The KVM host kernel is running in HS-mode needs so we need to handle the SBI calls coming from guest kernel running in VS-mode. This patch adds SBI v0.1 support in KVM RISC-V. Almost all SBI v0.1 calls are implemented in KVM kernel module except GETCHAR and PUTCHART calls which are forwarded to user space because these calls cannot be implemented in kernel space. In future, when we implement SBI v0.2 for Guest, we will forward SBI v0.2 experimental and vendor extension calls to user space. Signed-off-by: Atish Patra <[email protected]> Signed-off-by: Anup Patel <[email protected]> Acked-by: Paolo Bonzini <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]> Acked-by: Palmer Dabbelt <[email protected]>
2021-10-04RISC-V: KVM: Implement ONE REG interface for FP registersAtish Patra2-0/+114
Add a KVM_GET_ONE_REG/KVM_SET_ONE_REG ioctl interface for floating point registers such as F0-F31 and FCSR. This support is added for both 'F' and 'D' extensions. Signed-off-by: Atish Patra <[email protected]> Signed-off-by: Anup Patel <[email protected]> Acked-by: Paolo Bonzini <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]> Reviewed-by: Alexander Graf <[email protected]> Acked-by: Palmer Dabbelt <[email protected]>
2021-10-04RISC-V: KVM: FP lazy save/restoreAtish Patra4-0/+342
This patch adds floating point (F and D extension) context save/restore for guest VCPUs. The FP context is saved and restored lazily only when kernel enter/exits the in-kernel run loop and not during the KVM world switch. This way FP save/restore has minimal impact on KVM performance. Signed-off-by: Atish Patra <[email protected]> Signed-off-by: Anup Patel <[email protected]> Acked-by: Paolo Bonzini <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]> Reviewed-by: Alexander Graf <[email protected]> Acked-by: Palmer Dabbelt <[email protected]>
2021-10-04RISC-V: KVM: Add timer functionalityAtish Patra9-1/+334
The RISC-V hypervisor specification doesn't have any virtual timer feature. Due to this, the guest VCPU timer will be programmed via SBI calls. The host will use a separate hrtimer event for each guest VCPU to provide timer functionality. We inject a virtual timer interrupt to the guest VCPU whenever the guest VCPU hrtimer event expires. This patch adds guest VCPU timer implementation along with ONE_REG interface to access VCPU timer state from user space. Signed-off-by: Atish Patra <[email protected]> Signed-off-by: Anup Patel <[email protected]> Acked-by: Paolo Bonzini <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]> Acked-by: Daniel Lezcano <[email protected]> Acked-by: Palmer Dabbelt <[email protected]>
2021-10-04RISC-V: KVM: Implement MMU notifiersAnup Patel4-5/+89
This patch implements MMU notifiers for KVM RISC-V so that Guest physical address space is in-sync with Host physical address space. This will allow swapping, page migration, etc to work transparently with KVM RISC-V. Signed-off-by: Anup Patel <[email protected]> Acked-by: Paolo Bonzini <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]> Reviewed-by: Alexander Graf <[email protected]> Acked-by: Palmer Dabbelt <[email protected]>
2021-10-04RISC-V: KVM: Implement stage2 page table programmingAnup Patel5-16/+676
This patch implements all required functions for programming the stage2 page table for each Guest/VM. At high-level, the flow of stage2 related functions is similar from KVM ARM/ARM64 implementation but the stage2 page table format is quite different for KVM RISC-V. [jiangyifei: stage2 dirty log support] Signed-off-by: Yifei Jiang <[email protected]> Signed-off-by: Anup Patel <[email protected]> Acked-by: Paolo Bonzini <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]> Acked-by: Palmer Dabbelt <[email protected]>
2021-10-04RISC-V: KVM: Implement VMID allocatorAnup Patel7-2/+249
We implement a simple VMID allocator for Guests/VMs which: 1. Detects number of VMID bits at boot-time 2. Uses atomic number to track VMID version and increments VMID version whenever we run-out of VMIDs 3. Flushes Guest TLBs on all host CPUs whenever we run-out of VMIDs 4. Force updates HW Stage2 VMID for each Guest VCPU whenever VMID changes using VCPU request KVM_REQ_UPDATE_HGATP Signed-off-by: Anup Patel <[email protected]> Acked-by: Paolo Bonzini <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]> Reviewed-by: Alexander Graf <[email protected]> Acked-by: Palmer Dabbelt <[email protected]>
2021-10-04RISC-V: KVM: Handle WFI exits for VCPUAnup Patel1-0/+76
We get illegal instruction trap whenever Guest/VM executes WFI instruction. This patch handles WFI trap by blocking the trapped VCPU using kvm_vcpu_block() API. The blocked VCPU will be automatically resumed whenever a VCPU interrupt is injected from user-space or from in-kernel IRQCHIP emulation. Signed-off-by: Anup Patel <[email protected]> Acked-by: Paolo Bonzini <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]> Acked-by: Palmer Dabbelt <[email protected]>
2021-10-04RISC-V: KVM: Handle MMIO exits for VCPUAnup Patel8-4/+651
We will get stage2 page faults whenever Guest/VM access SW emulated MMIO device or unmapped Guest RAM. This patch implements MMIO read/write emulation by extracting MMIO details from the trapped load/store instruction and forwarding the MMIO read/write to user-space. The actual MMIO emulation will happen in user-space and KVM kernel module will only take care of register updates before resuming the trapped VCPU. The handling for stage2 page faults for unmapped Guest RAM will be implemeted by a separate patch later. [jiangyifei: ioeventfd and in-kernel mmio device support] Signed-off-by: Yifei Jiang <[email protected]> Signed-off-by: Anup Patel <[email protected]> Acked-by: Paolo Bonzini <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]> Reviewed-by: Alexander Graf <[email protected]> Acked-by: Palmer Dabbelt <[email protected]>
2021-10-04RISC-V: KVM: Implement VCPU world-switchAnup Patel5-4/+319
This patch implements the VCPU world-switch for KVM RISC-V. The KVM RISC-V world-switch (i.e. __kvm_riscv_switch_to()) mostly switches general purpose registers, SSTATUS, STVEC, SSCRATCH and HSTATUS CSRs. Other CSRs are switched via vcpu_load() and vcpu_put() interface in kvm_arch_vcpu_load() and kvm_arch_vcpu_put() functions respectively. Signed-off-by: Anup Patel <[email protected]> Acked-by: Paolo Bonzini <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]> Reviewed-by: Alexander Graf <[email protected]> Acked-by: Palmer Dabbelt <[email protected]>
2021-10-04RISC-V: KVM: Implement KVM_GET_ONE_REG/KVM_SET_ONE_REG ioctlsAnup Patel2-4/+290
For KVM RISC-V, we use KVM_GET_ONE_REG/KVM_SET_ONE_REG ioctls to access VCPU config and registers from user-space. We have three types of VCPU registers: 1. CONFIG - these are VCPU config and capabilities 2. CORE - these are VCPU general purpose registers 3. CSR - these are VCPU control and status registers The CONFIG register available to user-space is ISA. The ISA register is a read and write register where user-space can only write the desired VCPU ISA capabilities before running the VCPU. The CORE registers available to user-space are PC, RA, SP, GP, TP, A0-A7, T0-T6, S0-S11 and MODE. Most of these are RISC-V general registers except PC and MODE. The PC register represents program counter whereas the MODE register represent VCPU privilege mode (i.e. S/U-mode). The CSRs available to user-space are SSTATUS, SIE, STVEC, SSCRATCH, SEPC, SCAUSE, STVAL, SIP, and SATP. All of these are read/write registers. In future, more VCPU register types will be added (such as FP) for the KVM_GET_ONE_REG/KVM_SET_ONE_REG ioctls. Signed-off-by: Anup Patel <[email protected]> Acked-by: Paolo Bonzini <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]> Acked-by: Palmer Dabbelt <[email protected]>
2021-10-04RISC-V: KVM: Implement VCPU interrupts and requests handlingAnup Patel3-13/+197
This patch implements VCPU interrupts and requests which are both asynchronous events. The VCPU interrupts can be set/unset using KVM_INTERRUPT ioctl from user-space. In future, the in-kernel IRQCHIP emulation will use kvm_riscv_vcpu_set_interrupt() and kvm_riscv_vcpu_unset_interrupt() functions to set/unset VCPU interrupts. Important VCPU requests implemented by this patch are: KVM_REQ_SLEEP - set whenever VCPU itself goes to sleep state KVM_REQ_VCPU_RESET - set whenever VCPU reset is requested The WFI trap-n-emulate (added later) will use KVM_REQ_SLEEP request and kvm_riscv_vcpu_has_interrupt() function. The KVM_REQ_VCPU_RESET request will be used by SBI emulation (added later) to power-up a VCPU in power-off state. The user-space can use the GET_MPSTATE/SET_MPSTATE ioctls to get/set power state of a VCPU. Signed-off-by: Anup Patel <[email protected]> Acked-by: Paolo Bonzini <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]> Reviewed-by: Alexander Graf <[email protected]> Acked-by: Palmer Dabbelt <[email protected]>
2021-10-04RISC-V: KVM: Implement VCPU create, init and destroy functionsAnup Patel2-9/+115
This patch implements VCPU create, init and destroy functions required by generic KVM module. We don't have much dynamic resources in struct kvm_vcpu_arch so these functions are quite simple for KVM RISC-V. Signed-off-by: Anup Patel <[email protected]> Acked-by: Paolo Bonzini <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]> Reviewed-by: Alexander Graf <[email protected]> Acked-by: Palmer Dabbelt <[email protected]>
2021-10-04RISC-V: Add initial skeletal KVM supportAnup Patel12-0/+805
This patch adds initial skeletal KVM RISC-V support which has: 1. A simple implementation of arch specific VM functions except kvm_vm_ioctl_get_dirty_log() which will implemeted in-future as part of stage2 page loging. 2. Stubs of required arch specific VCPU functions except kvm_arch_vcpu_ioctl_run() which is semi-complete and extended by subsequent patches. 3. Stubs for required arch specific stage2 MMU functions. Signed-off-by: Anup Patel <[email protected]> Acked-by: Paolo Bonzini <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]> Reviewed-by: Alexander Graf <[email protected]> Acked-by: Palmer Dabbelt <[email protected]>
2021-10-04RISC-V: Add hypervisor extension related CSR definesAnup Patel1-0/+87
This patch adds asm/kvm_csr.h for RISC-V hypervisor extension related defines. Signed-off-by: Anup Patel <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]> Reviewed-by: Alexander Graf <[email protected]> Message-Id: <[email protected]> Acked-by: Palmer Dabbelt <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-10-04Merge tag 'kvm-s390-master-5.15-1' of ↵Paolo Bonzini2-1/+15
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-master KVM: s390: allow to compile without warning with W=1
2021-10-03Linux 5.15-rc4Linus Torvalds1-1/+1
2021-10-03elf: don't use MAP_FIXED_NOREPLACE for elf interpreter mappingsChen Jingwen1-1/+1
In commit b212921b13bd ("elf: don't use MAP_FIXED_NOREPLACE for elf executable mappings") we still leave MAP_FIXED_NOREPLACE in place for load_elf_interp. Unfortunately, this will cause kernel to fail to start with: 1 (init): Uhuuh, elf segment at 00003ffff7ffd000 requested but the memory is mapped already Failed to execute /init (error -17) The reason is that the elf interpreter (ld.so) has overlapping segments. readelf -l ld-2.31.so Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flags Align LOAD 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x000000000002c94c 0x000000000002c94c R E 0x10000 LOAD 0x000000000002dae0 0x000000000003dae0 0x000000000003dae0 0x00000000000021e8 0x0000000000002320 RW 0x10000 LOAD 0x000000000002fe00 0x000000000003fe00 0x000000000003fe00 0x00000000000011ac 0x0000000000001328 RW 0x10000 The reason for this problem is the same as described in commit ad55eac74f20 ("elf: enforce MAP_FIXED on overlaying elf segments"). Not only executable binaries, elf interpreters (e.g. ld.so) can have overlapping elf segments, so we better drop MAP_FIXED_NOREPLACE and go back to MAP_FIXED in load_elf_interp. Fixes: 4ed28639519c ("fs, elf: drop MAP_FIXED usage from elf_map") Cc: <[email protected]> # v4.19 Cc: Andrew Morton <[email protected]> Cc: Michal Hocko <[email protected]> Signed-off-by: Chen Jingwen <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-10-03Merge tag 'ext4_for_linus_stable' of ↵Linus Torvalds7-199/+182
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "Fix a number of ext4 bugs in fast_commit, inline data, and delayed allocation. Also fix error handling code paths in ext4_dx_readdir() and ext4_fill_super(). Finally, avoid a grabbing a journal head in the delayed allocation write in the common cases where we are overwriting a pre-existing block or appending to an inode" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: recheck buffer uptodate bit under buffer lock ext4: fix potential infinite loop in ext4_dx_readdir() ext4: flush s_error_work before journal destroy in ext4_fill_super ext4: fix loff_t overflow in ext4_max_bitmap_size() ext4: fix reserved space counter leakage ext4: limit the number of blocks in one ADD_RANGE TLV ext4: enforce buffer head state assertion in ext4_da_map_blocks ext4: remove extent cache entries when truncating inline data ext4: drop unnecessary journal handle in delalloc write ext4: factor out write end code of inline file ext4: correct the error path of ext4_write_inline_data_end() ext4: check and update i_disksize properly ext4: add error checking to ext4_ext_replay_set_iblocks()
2021-10-03objtool: print out the symbol type when complaining about itLinus Torvalds1-4/+8
The objtool warning that the kvm instruction emulation code triggered wasn't very useful: arch/x86/kvm/emulate.o: warning: objtool: __ex_table+0x4: don't know how to handle reloc symbol type: kvm_fastop_exception in that it helpfully tells you which symbol name it had trouble figuring out the relocation for, but it doesn't actually say what the unknown symbol type was that triggered it all. In this case it was because of missing type information (type 0, aka STT_NOTYPE), but on the whole it really should just have printed that out as part of the message. Because if this warning triggers, that's very much the first thing you want to know - why did reloc2sec_off() return failure for that symbol? So rather than just saying you can't handle some type of symbol without saying what the type _was_, just print out the type number too. Fixes: 24ff65257375 ("objtool: Teach get_alt_entry() about more relocation types") Link: https://lore.kernel.org/lkml/CAHk-=wiZwq-0LknKhXN4M+T8jbxn_2i9mcKpO+OaBSSq_Eh7tg@mail.gmail.com/ Signed-off-by: Linus Torvalds <[email protected]>
2021-10-03kvm: fix objtool relocation warningLinus Torvalds1-1/+0
The recent change to make objtool aware of more symbol relocation types (commit 24ff65257375: "objtool: Teach get_alt_entry() about more relocation types") also added another check, and resulted in this objtool warning when building kvm on x86: arch/x86/kvm/emulate.o: warning: objtool: __ex_table+0x4: don't know how to handle reloc symbol type: kvm_fastop_exception The reason seems to be that kvm_fastop_exception() is marked as a global symbol, which causes the relocation to ke kept around for objtool. And at the same time, the kvm_fastop_exception definition (which is done as an inline asm statement) doesn't actually set the type of the global, which then makes objtool unhappy. The minimal fix is to just not mark kvm_fastop_exception as being a global symbol. It's only used in that one compilation unit anyway, so it was always pointless. That's how all the other local exception table labels are done. I'm not entirely happy about the kinds of games that the kvm code plays with doing its own exception handling, and the fact that it confused objtool is most definitely a symptom of the code being a bit too subtle and ad-hoc. But at least this trivial one-liner makes objtool no longer upset about what is going on. Fixes: 24ff65257375 ("objtool: Teach get_alt_entry() about more relocation types") Link: https://lore.kernel.org/lkml/CAHk-=wiZwq-0LknKhXN4M+T8jbxn_2i9mcKpO+OaBSSq_Eh7tg@mail.gmail.com/ Cc: Borislav Petkov <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Sean Christopherson <[email protected]> Cc: Vitaly Kuznetsov <[email protected]> Cc: Wanpeng Li <[email protected]> Cc: Jim Mattson <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Nathan Chancellor <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-10-03Merge tag 'char-misc-5.15-rc4' of ↵Linus Torvalds3-26/+108
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver fixes from Greg KH: "Here are some small misc driver fixes for 5.15-rc4. They are in two "groups": - ipack driver fixes for issues found by Johan Hovold - interconnect driver fixes for reported problems All of these have been in linux-next for a while with no reported issues" * tag 'char-misc-5.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: ipack: ipoctal: fix module reference leak ipack: ipoctal: fix missing allocation-failure check ipack: ipoctal: fix tty-registration error handling ipack: ipoctal: fix tty registration race ipack: ipoctal: fix stack information leak interconnect: qcom: sdm660: Add missing a2noc qos clocks dt-bindings: interconnect: sdm660: Add missing a2noc qos clocks interconnect: qcom: sdm660: Correct NOC_QOS_PRIORITY shift and mask interconnect: qcom: sdm660: Fix id of slv_cnoc_mnoc_cfg
2021-10-03Merge tag 'driver-core-5.15-rc4' of ↵Linus Torvalds6-36/+87
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fixes from Greg KH: "Here are some driver core and kernfs fixes for reported issues for 5.15-rc4. These fixes include: - kernfs positive dentry bugfix - debugfs_create_file_size error path fix - cpumask sysfs file bugfix to preserve the user/kernel abi (has been reported multiple times.) - devlink fixes for mdiobus devices as reported by the subsystem maintainers. Also included in here are some devlink debugging changes to make it easier for people to report problems when asked. They have already helped with the mdiobus and other subsystems reporting issues. All of these have been linux-next for a while with no reported issues" * tag 'driver-core-5.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: kernfs: also call kernfs_set_rev() for positive dentry driver core: Add debug logs when fwnode links are added/deleted driver core: Create __fwnode_link_del() helper function driver core: Set deferred probe reason when deferred by driver core net: mdiobus: Set FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD for mdiobus parents driver core: fw_devlink: Add support for FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD driver core: fw_devlink: Improve handling of cyclic dependencies cpumask: Omit terminating null byte in cpumap_print_{list,bitmask}_to_buf debugfs: debugfs_create_file_size(): use IS_ERR to check for error
2021-10-03Merge tag 'sched_urgent_for_v5.15_rc4' of ↵Linus Torvalds3-3/+13
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Borislav Petkov: - Tell the compiler to always inline is_percpu_thread() - Make sure tunable_scaling buffer is null-terminated after an update in sysfs - Fix LTP named regression due to cgroup list ordering * tag 'sched_urgent_for_v5.15_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Always inline is_percpu_thread() sched/fair: Null terminate buffer when updating tunable_scaling sched/fair: Add ancestors of unthrottled undecayed cfs_rq
2021-10-03Merge tag 'perf_urgent_for_v5.15_rc4' of ↵Linus Torvalds4-5/+35
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Borislav Petkov: - Make sure the destroy callback is reset when a event initialization fails - Update the event constraints for Icelake - Make sure the active time of an event is updated even for inactive events * tag 'perf_urgent_for_v5.15_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/core: fix userpage->time_enabled of inactive events perf/x86/intel: Update event constraints for ICX perf/x86: Reset destroy callback on event init failure
2021-10-03Merge tag 'objtool_urgent_for_v5.15_rc4' of ↵Linus Torvalds1-7/+25
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool fix from Borislav Petkov: - Handle symbol relocations properly due to changes in the toolchains which remove section symbols now * tag 'objtool_urgent_for_v5.15_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool: Teach get_alt_entry() about more relocation types
2021-10-02Merge tag 'hwmon-for-v5.15-rc4' of ↵Linus Torvalds11-125/+101
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon fixes from Guenter Roeck: - Fixed various potential NULL pointer accesses in w8379* drivers - Improved error handling, fault reporting, and fixed rounding in thmp421 driver - Fixed error handling in ltc2947 driver - Added missing attribute to pmbus/mp2975 driver - Fixed attribute values in pbus/ibm-cffps, occ, and mlxreg-fan drivers - Removed unused residual code from k10temp driver * tag 'hwmon-for-v5.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: (w83793) Fix NULL pointer dereference by removing unnecessary structure field hwmon: (w83792d) Fix NULL pointer dereference by removing unnecessary structure field hwmon: (w83791d) Fix NULL pointer dereference by removing unnecessary structure field hwmon: (pmbus/mp2975) Add missed POUT attribute for page 1 mp2975 controller hwmon: (pmbus/ibm-cffps) max_power_out swap changes hwmon: (occ) Fix P10 VRM temp sensors hwmon: (ltc2947) Properly handle errors when looking for the external clock hwmon: (tmp421) fix rounding for negative values hwmon: (tmp421) report /PVLD condition as fault hwmon: (tmp421) handle I2C errors hwmon: (mlxreg-fan) Return non-zero value when fan current state is enforced from sysfs hwmon: (k10temp) Remove residues of current and voltage
2021-10-02Merge tag '5.15-rc3-ksmbd-fixes' of git://git.samba.org/ksmbdLinus Torvalds12-342/+294
Pull ksmbd server fixes from Steve French: "Eleven fixes for the ksmbd kernel server, mostly security related: - an important fix for disabling weak NTLMv1 authentication - seven security (improved buffer overflow checks) fixes - fix for wrong infolevel struct used in some getattr/setattr paths - two small documentation fixes" * tag '5.15-rc3-ksmbd-fixes' of git://git.samba.org/ksmbd: ksmbd: missing check for NULL in convert_to_nt_pathname() ksmbd: fix transform header validation ksmbd: add buffer validation for SMB2_CREATE_CONTEXT ksmbd: add validation in smb2 negotiate ksmbd: add request buffer validation in smb2_set_info ksmbd: use correct basic info level in set_file_basic_info() ksmbd: remove NTLMv1 authentication ksmbd: fix documentation for 2 functions MAINTAINERS: rename cifs_common to smbfs_common in cifs and ksmbd entry ksmbd: fix invalid request buffer access in compound ksmbd: remove RFC1002 check in smb2 request
2021-10-02Merge tag 'scsi-fixes' of ↵Linus Torvalds5-7/+7
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Five fairly minor fixes and spelling updates, all in drivers. Even though the ufs fix is in tracing, it's a potentially exploitable use beyond end of array bug" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: csiostor: Add module softdep on cxgb4 scsi: qla2xxx: Fix excessive messages during device logout scsi: virtio_scsi: Fix spelling mistake "Unsupport" -> "Unsupported" scsi: ses: Fix unsigned comparison with less than zero scsi: ufs: Fix illegal offset in UPIU event trace
2021-10-02Merge tag 'block-5.15-2021-10-01' of git://git.kernel.dk/linux-blockLinus Torvalds5-27/+31
Pull block fixes from Jens Axboe: "A few block fixes for this release: - Revert a BFQ commit that causes breakage for people. Unfortunately it was auto-selected for stable as well, so now 5.14.7 suffers from it too. Hopefully stable will pick up this revert quickly too, so we can remove the issue on that end as well. - Add a quirk for Apple NVMe controllers, which due to their non-compliance broke due to the introduction of command sequences (Keith) - Use shifts in nbd, fixing a __divdi3 issue (Nick)" * tag 'block-5.15-2021-10-01' of git://git.kernel.dk/linux-block: nbd: use shifts rather than multiplies Revert "block, bfq: honor already-setup queue merges" nvme: add command id quirk for apple controllers
2021-10-02Merge tag 'io_uring-5.15-2021-10-01' of git://git.kernel.dk/linux-blockLinus Torvalds2-19/+3
Pull io_uring fixes from Jens Axboe: "Two fixes in here: - The signal issue that was discussed start of this week (me). - Kill dead fasync support in io_uring. Looks like it was broken since io_uring was initially merged, and given that nobody has ever complained about it, let's just kill it (Pavel)" * tag 'io_uring-5.15-2021-10-01' of git://git.kernel.dk/linux-block: io_uring: kill fasync io-wq: exclusively gate signal based exit on get_signal() return
2021-10-02Merge tag 'libnvdimm-fixes-5.15-rc4' of ↵Linus Torvalds2-4/+13
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm fixes from Dan Williams: "A fix for a regression added this cycle in the pmem driver, and for a long standing bug for failed NUMA node lookups on ARM64. This has appeared in -next for several days with no reported issues. Summary: - Fix a regression that caused the sysfs ABI for pmem block devices to not be registered. This fails the nvdimm unit tests and dax xfstests. - Fix numa node lookups for dax-kmem memory (device-dax memory assigned to the page allocator) on ARM64" * tag 'libnvdimm-fixes-5.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: nvdimm/pmem: fix creating the dax group ACPI: NFIT: Use fallback node id when numa info in NFIT table is incorrect
2021-10-02cachefiles: Fix oops in trace_cachefiles_mark_buried due to NULL objectDave Wysochanski1-1/+1
In cachefiles_mark_object_buried, the dentry in question may not have an owner, and thus our cachefiles_object pointer may be NULL when calling the tracepoint, in which case we will also not have a valid debug_id to print in the tracepoint. Check for NULL object in the tracepoint and if so, just set debug_id to MAX_UINT as was done in 2908f5e101e3 ("fscache: Add a cookie debug ID and use that in traces"). This fixes the following oops: FS-Cache: Cache "mycache" added (type cachefiles) CacheFiles: File cache on vdc registered ... Workqueue: fscache_object fscache_object_work_func [fscache] RIP: 0010:trace_event_raw_event_cachefiles_mark_buried+0x4e/0xa0 [cachefiles] .... Call Trace: cachefiles_mark_object_buried+0xa5/0xb0 [cachefiles] cachefiles_bury_object+0x270/0x430 [cachefiles] cachefiles_walk_to_object+0x195/0x9c0 [cachefiles] cachefiles_lookup_object+0x5a/0xc0 [cachefiles] fscache_look_up_object+0xd7/0x160 [fscache] fscache_object_work_func+0xb2/0x340 [fscache] process_one_work+0x1f1/0x390 worker_thread+0x53/0x3e0 kthread+0x127/0x150 Fixes: 2908f5e101e3 ("fscache: Add a cookie debug ID and use that in traces") Signed-off-by: Dave Wysochanski <[email protected]> Signed-off-by: David Howells <[email protected]> cc: [email protected] Signed-off-by: Linus Torvalds <[email protected]>
2021-10-02drm/i915: fix blank screen booting crashesHugh Dickins1-2/+3
5.15-rc1 crashes with blank screen when booting up on two ThinkPads using i915. Bisections converge convincingly, but arrive at different and suprising "culprits", none of them the actual culprit. netconsole (with init_netconsole() hacked to call i915_init() when logging has started, instead of by module_init()) tells the story: kernel BUG at drivers/gpu/drm/i915/i915_sw_fence.c:245! with RSI: ffffffff814d408b pointing to sw_fence_dummy_notify(). I've been building with CONFIG_CC_OPTIMIZE_FOR_SIZE=y, and that function needs to be 4-byte aligned. Fixes: 62eaf0ae217d ("drm/i915/guc: Support request cancellation") Signed-off-by: Hugh Dickins <[email protected]> Tested-by: Steven Rostedt (VMware) <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-10-02hwmon: (w83793) Fix NULL pointer dereference by removing unnecessary ↵Nadezda Lutovinova1-15/+11
structure field If driver read tmp value sufficient for (tmp & 0x08) && (!(tmp & 0x80)) && ((tmp & 0x7) == ((tmp >> 4) & 0x7)) from device then Null pointer dereference occurs. (It is possible if tmp = 0b0xyz1xyz, where same literals mean same numbers) Also lm75[] does not serve a purpose anymore after switching to devm_i2c_new_dummy_device() in w83791d_detect_subclients(). The patch fixes possible NULL pointer dereference by removing lm75[]. Found by Linux Driver Verification project (linuxtesting.org). Cc: [email protected] Signed-off-by: Nadezda Lutovinova <[email protected]> Link: https://lore.kernel.org/r/[email protected] [groeck: Dropped unnecessary continuation lines, fixed multi-line alignments] Signed-off-by: Guenter Roeck <[email protected]>