Age | Commit message (Collapse) | Author | Files | Lines |
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- Fix a kernel fault during page table walking in huge_pte_alloc() with
PTABLE_LEVELS=5 due to using p4d_offset() instead of p4d_alloc()
- head.S fix and cleanup to disable the MMU before toggling the
HCR_EL2.E2H bit when entering the kernel with the MMU on from the EFI
stub. Changing this bit (currently from VHE to nVHE) causes some
system registers as well as page table descriptors to be interpreted
differently, potentially resulting in spurious MMU faults
- Fix translation fault in swsusp_save() accessing MEMBLOCK_NOMAP
memory ranges due to kernel_page_present() returning true in most
configurations other than rodata_full == true,
CONFIG_DEBUG_PAGEALLOC=y or CONFIG_KFENCE=y
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: hibernate: Fix level3 translation fault in swsusp_save()
arm64/head: Disable MMU at EL2 before clearing HCR_EL2.E2H
arm64/head: Drop unnecessary pre-disable-MMU workaround
arm64/hugetlb: Fix page table walk in huge_pte_alloc()
|
|
On arm64 machines, swsusp_save() faults if it attempts to access
MEMBLOCK_NOMAP memory ranges. This can be reproduced in QEMU using UEFI
when booting with rodata=off debug_pagealloc=off and CONFIG_KFENCE=n:
Unable to handle kernel paging request at virtual address ffffff8000000000
Mem abort info:
ESR = 0x0000000096000007
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x07: level 3 translation fault
Data abort info:
ISV = 0, ISS = 0x00000007, ISS2 = 0x00000000
CM = 0, WnR = 0, TnD = 0, TagAccess = 0
GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
swapper pgtable: 4k pages, 39-bit VAs, pgdp=00000000eeb0b000
[ffffff8000000000] pgd=180000217fff9803, p4d=180000217fff9803, pud=180000217fff9803, pmd=180000217fff8803, pte=0000000000000000
Internal error: Oops: 0000000096000007 [#1] SMP
Internal error: Oops: 0000000096000007 [#1] SMP
Modules linked in: xt_multiport ipt_REJECT nf_reject_ipv4 xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_filter bpfilter rfkill at803x snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg dwmac_generic stmmac_platform snd_hda_codec stmmac joydev pcs_xpcs snd_hda_core phylink ppdev lp parport ramoops reed_solomon ip_tables x_tables nls_iso8859_1 vfat multipath linear amdgpu amdxcp drm_exec gpu_sched drm_buddy hid_generic usbhid hid radeon video drm_suballoc_helper drm_ttm_helper ttm i2c_algo_bit drm_display_helper cec drm_kms_helper drm
CPU: 0 PID: 3663 Comm: systemd-sleep Not tainted 6.6.2+ #76
Source Version: 4e22ed63a0a48e7a7cff9b98b7806d8d4add7dc0
Hardware name: Greatwall GW-XXXXXX-XXX/GW-XXXXXX-XXX, BIOS KunLun BIOS V4.0 01/19/2021
pstate: 600003c5 (nZCv DAIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : swsusp_save+0x280/0x538
lr : swsusp_save+0x280/0x538
sp : ffffffa034a3fa40
x29: ffffffa034a3fa40 x28: ffffff8000001000 x27: 0000000000000000
x26: ffffff8001400000 x25: ffffffc08113e248 x24: 0000000000000000
x23: 0000000000080000 x22: ffffffc08113e280 x21: 00000000000c69f2
x20: ffffff8000000000 x19: ffffffc081ae2500 x18: 0000000000000000
x17: 6666662074736420 x16: 3030303030303030 x15: 3038666666666666
x14: 0000000000000b69 x13: ffffff9f89088530 x12: 00000000ffffffea
x11: 00000000ffff7fff x10: 00000000ffff7fff x9 : ffffffc08193f0d0
x8 : 00000000000bffe8 x7 : c0000000ffff7fff x6 : 0000000000000001
x5 : ffffffa0fff09dc8 x4 : 0000000000000000 x3 : 0000000000000027
x2 : 0000000000000000 x1 : 0000000000000000 x0 : 000000000000004e
Call trace:
swsusp_save+0x280/0x538
swsusp_arch_suspend+0x148/0x190
hibernation_snapshot+0x240/0x39c
hibernate+0xc4/0x378
state_store+0xf0/0x10c
kobj_attr_store+0x14/0x24
The reason is swsusp_save() -> copy_data_pages() -> page_is_saveable()
-> kernel_page_present() assuming that a page is always present when
can_set_direct_map() is false (all of rodata_full,
debug_pagealloc_enabled() and arm64_kfence_can_set_direct_map() false),
irrespective of the MEMBLOCK_NOMAP ranges. Such MEMBLOCK_NOMAP regions
should not be saved during hibernation.
This problem was introduced by changes to the pfn_valid() logic in
commit a7d9f306ba70 ("arm64: drop pfn_valid_within() and simplify
pfn_valid()").
Similar to other architectures, drop the !can_set_direct_map() check in
kernel_page_present() so that page_is_savable() skips such pages.
Fixes: a7d9f306ba70 ("arm64: drop pfn_valid_within() and simplify pfn_valid()")
Cc: <stable@vger.kernel.org> # 5.14.x
Suggested-by: Mike Rapoport <rppt@kernel.org>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Co-developed-by: xiongxin <xiongxin@kylinos.cn>
Signed-off-by: xiongxin <xiongxin@kylinos.cn>
Signed-off-by: Yaxiong Tian <tianyaxiong@kylinos.cn>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Link: https://lore.kernel.org/r/20240417025248.386622-1-tianyaxiong@kylinos.cn
[catalin.marinas@arm.com: rework commit message]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Even though the boot protocol stipulates otherwise, an exception has
been made for the EFI stub, and entering the core kernel with the MMU
enabled is permitted. This allows a substantial amount of cache
maintenance to be elided, wich is significant when fast boot times are
critical (e.g., for booting micro-VMs)
Once the initial ID map has been populated, the MMU is disabled as part
of the logic sequence that puts all system registers into a known state.
Any code that needs to execute within the window where the MMU is off is
cleaned to the PoC explicitly, which includes all of HYP text when
entering at EL2.
However, the current sequence of initializing the EL2 system registers
is not safe: HCR_EL2 is set to its nVHE initial state before SCTLR_EL2
is reprogrammed, and this means that a VHE-to-nVHE switch may occur
while the MMU is enabled. This switch causes some system registers as
well as page table descriptors to be interpreted in a different way,
potentially resulting in spurious exceptions relating to MMU
translation.
So disable the MMU explicitly first when entering in EL2 with the MMU
and caches enabled.
Fixes: 617861703830 ("efi: arm64: enter with MMU and caches enabled")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Cc: <stable@vger.kernel.org> # 6.3.x
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240415075412.2347624-6-ardb+git@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
The Falkor erratum that results in the need for an ISB before clearing
the M bit in SCTLR_ELx only applies to execution at exception level x,
and so the workaround is not needed when disabling the EL1 MMU while
running at EL2.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20240415075412.2347624-5-ardb+git@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Currently normal HugeTLB fault ends up crashing the kernel, as p4dp derived
from p4d_offset() is an invalid address when PGTABLE_LEVEL = 5. A p4d level
entry needs to be allocated when not available while walking the page table
during HugeTLB faults. Let's call p4d_alloc() to allocate such entries when
required instead of current p4d_offset().
Unable to handle kernel paging request at virtual address ffffffff80000000
Mem abort info:
ESR = 0x0000000096000005
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x05: level 1 translation fault
Data abort info:
ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
CM = 0, WnR = 0, TnD = 0, TagAccess = 0
GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
swapper pgtable: 4k pages, 52-bit VAs, pgdp=0000000081da9000
[ffffffff80000000] pgd=1000000082cec003, p4d=0000000082c32003, pud=0000000000000000
Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
Modules linked in:
CPU: 1 PID: 108 Comm: high_addr_hugep Not tainted 6.9.0-rc4 #48
Hardware name: Foundation-v8A (DT)
pstate: 01402005 (nzcv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
pc : huge_pte_alloc+0xd4/0x334
lr : hugetlb_fault+0x1b8/0xc68
sp : ffff8000833bbc20
x29: ffff8000833bbc20 x28: fff000080080cb58 x27: ffff800082a7cc58
x26: 0000000000000000 x25: fff0000800378e40 x24: fff00008008d6c60
x23: 00000000de9dbf07 x22: fff0000800378e40 x21: 0004000000000000
x20: 0004000000000000 x19: ffffffff80000000 x18: 1ffe00010011d7a1
x17: 0000000000000001 x16: ffffffffffffffff x15: 0000000000000001
x14: 0000000000000000 x13: ffff8000816120d0 x12: ffffffffffffffff
x11: 0000000000000000 x10: fff00008008ebd0c x9 : 0004000000000000
x8 : 0000000000001255 x7 : fff00008003e2000 x6 : 00000000061d54b0
x5 : 0000000000001000 x4 : ffffffff80000000 x3 : 0000000000200000
x2 : 0000000000000004 x1 : 0000000080000000 x0 : 0000000000000000
Call trace:
huge_pte_alloc+0xd4/0x334
hugetlb_fault+0x1b8/0xc68
handle_mm_fault+0x260/0x29c
do_page_fault+0xfc/0x47c
do_translation_fault+0x68/0x74
do_mem_abort+0x44/0x94
el0_da+0x2c/0x9c
el0t_64_sync_handler+0x70/0xc4
el0t_64_sync+0x190/0x194
Code: aa000084 cb010084 b24c2c84 8b130c93 (f9400260)
---[ end trace 0000000000000000 ]---
Cc: Will Deacon <will@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Fixes: a6bbf5d4d9d1 ("arm64: mm: Add definitions to support 5 levels of paging")
Reported-by: Dev Jain <dev.jain@arm.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Link: https://lore.kernel.org/r/20240415094003.1812018-1-anshuman.khandual@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fix from Catalin Marinas:
"Fix the TLBI RANGE operand calculation causing live migration under
KVM/arm64 to miss dirty pages due to stale TLB entries"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: tlb: Fix TLBI RANGE operand
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC fixes from Arnd Bergmann:
"The device tree changes this time are all for NXP i.MX platforms,
addressing issues with clocks and regulators on i.MX7 and i.MX8.
The old OMAP2 based Nokia N8x0 tablet get a couple of code fixes for
regressions that came in.
The ARM SCMI and FF-A firmware interfaces get a couple of minor bug
fixes.
A regression fix for RISC-V cache management addresses a problem with
probe order on Sifive cores"
* tag 'soc-fixes-6.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (23 commits)
MAINTAINERS: Change Krzysztof Kozlowski's email address
arm64: dts: imx8qm-ss-dma: fix can lpcg indices
arm64: dts: imx8-ss-dma: fix can lpcg indices
arm64: dts: imx8-ss-dma: fix adc lpcg indices
arm64: dts: imx8-ss-dma: fix pwm lpcg indices
arm64: dts: imx8-ss-dma: fix spi lpcg indices
arm64: dts: imx8-ss-conn: fix usb lpcg indices
arm64: dts: imx8-ss-lsio: fix pwm lpcg indices
ARM: dts: imx7s-warp: Pass OV2680 link-frequencies
ARM: dts: imx7-mba7: Use 'no-mmc' property
arm64: dts: imx8-ss-conn: fix usdhc wrong lpcg clock order
arm64: dts: freescale: imx8mp-venice-gw73xx-2x: fix USB vbus regulator
arm64: dts: freescale: imx8mp-venice-gw72xx-2x: fix USB vbus regulator
cache: sifive_ccache: Partially convert to a platform driver
firmware: arm_scmi: Make raw debugfs entries non-seekable
firmware: arm_scmi: Fix wrong fastchannel initialization
firmware: arm_ffa: Fix the partition ID check in ffa_notification_info_get()
ARM: OMAP2+: fix USB regression on Nokia N8x0
mmc: omap: restore original power up/down steps
mmc: omap: fix deferred probe
...
|
|
KVM/arm64 relies on TLBI RANGE feature to flush TLBs when the dirty
pages are collected by VMM and the page table entries become write
protected during live migration. Unfortunately, the operand passed
to the TLBI RANGE instruction isn't correctly sorted out due to the
commit 117940aa6e5f ("KVM: arm64: Define kvm_tlb_flush_vmid_range()").
It leads to crash on the destination VM after live migration because
TLBs aren't flushed completely and some of the dirty pages are missed.
For example, I have a VM where 8GB memory is assigned, starting from
0x40000000 (1GB). Note that the host has 4KB as the base page size.
In the middile of migration, kvm_tlb_flush_vmid_range() is executed
to flush TLBs. It passes MAX_TLBI_RANGE_PAGES as the argument to
__kvm_tlb_flush_vmid_range() and __flush_s2_tlb_range_op(). SCALE#3
and NUM#31, corresponding to MAX_TLBI_RANGE_PAGES, isn't supported
by __TLBI_RANGE_NUM(). In this specific case, -1 has been returned
from __TLBI_RANGE_NUM() for SCALE#3/2/1/0 and rejected by the loop
in the __flush_tlb_range_op() until the variable @scale underflows
and becomes -9, 0xffff708000040000 is set as the operand. The operand
is wrong since it's sorted out by __TLBI_VADDR_RANGE() according to
invalid @scale and @num.
Fix it by extending __TLBI_RANGE_NUM() to support the combination of
SCALE#3 and NUM#31. With the changes, [-1 31] instead of [-1 30] can
be returned from the macro, meaning the TLBs for 0x200000 pages in the
above example can be flushed in one shoot with SCALE#3 and NUM#31. The
macro TLBI_RANGE_MASK is dropped since no one uses it any more. The
comments are also adjusted accordingly.
Fixes: 117940aa6e5f ("KVM: arm64: Define kvm_tlb_flush_vmid_range()")
Cc: stable@kernel.org # v6.6+
Reported-by: Yihuang Yu <yihyu@redhat.com>
Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Link: https://lore.kernel.org/r/20240405035852.1532010-2-gshan@redhat.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into arm/fixes
i.MX fixes for 6.9:
- A couple of i.MX7 board fixes from Fabio Estevam that use correct
'no-mmc' property and pass 'link-frequencies' for OV2680.
- A series from Frank Li to fix LPCG clock indices for i.MX8 subsystems.
- A couple of changes from Tim Harvey that fix USB VBUS regulator for
imx8mp-venice board.
* tag 'imx-fixes-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
arm64: dts: imx8qm-ss-dma: fix can lpcg indices
arm64: dts: imx8-ss-dma: fix can lpcg indices
arm64: dts: imx8-ss-dma: fix adc lpcg indices
arm64: dts: imx8-ss-dma: fix pwm lpcg indices
arm64: dts: imx8-ss-dma: fix spi lpcg indices
arm64: dts: imx8-ss-conn: fix usb lpcg indices
arm64: dts: imx8-ss-lsio: fix pwm lpcg indices
ARM: dts: imx7s-warp: Pass OV2680 link-frequencies
ARM: dts: imx7-mba7: Use 'no-mmc' property
arm64: dts: imx8-ss-conn: fix usdhc wrong lpcg clock order
arm64: dts: freescale: imx8mp-venice-gw73xx-2x: fix USB vbus regulator
arm64: dts: freescale: imx8mp-venice-gw72xx-2x: fix USB vbus regulator
Link: https://lore.kernel.org/r/Zg5rfaVVvD9egoBK@dragon
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fix from Catalin Marinas:
"arm64/ptrace fix to use the correct SVE layout based on the saved
floating point state rather than the TIF_SVE flag. The latter may be
left on during syscalls even if the SVE state is discarded"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64/ptrace: Use saved floating point state type to determine SVE layout
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from netfilter, bluetooth and bpf.
Fairly usual collection of driver and core fixes. The large selftest
accompanying one of the fixes is also becoming a common occurrence.
Current release - regressions:
- ipv6: fix infinite recursion in fib6_dump_done()
- net/rds: fix possible null-deref in newly added error path
Current release - new code bugs:
- net: do not consume a full cacheline for system_page_pool
- bpf: fix bpf_arena-related file descriptor leaks in the verifier
- drv: ice: fix freeing uninitialized pointers, fixing misuse of the
newfangled __free() auto-cleanup
Previous releases - regressions:
- x86/bpf: fixes the BPF JIT with retbleed=stuff
- xen-netfront: add missing skb_mark_for_recycle, fix page pool
accounting leaks, revealed by recently added explicit warning
- tcp: fix bind() regression for v6-only wildcard and v4-mapped-v6
non-wildcard addresses
- Bluetooth:
- replace "hci_qca: Set BDA quirk bit if fwnode exists in DT" with
better workarounds to un-break some buggy Qualcomm devices
- set conn encrypted before conn establishes, fix re-connecting to
some headsets which use slightly unusual sequence of msgs
- mptcp:
- prevent BPF accessing lowat from a subflow socket
- don't account accept() of non-MPC client as fallback to TCP
- drv: mana: fix Rx DMA datasize and skb_over_panic
- drv: i40e: fix VF MAC filter removal
Previous releases - always broken:
- gro: various fixes related to UDP tunnels - netns crossing
problems, incorrect checksum conversions, and incorrect packet
transformations which may lead to panics
- bpf: support deferring bpf_link dealloc to after RCU grace period
- nf_tables:
- release batch on table validation from abort path
- release mutex after nft_gc_seq_end from abort path
- flush pending destroy work before exit_net release
- drv: r8169: skip DASH fw status checks when DASH is disabled"
* tag 'net-6.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (81 commits)
netfilter: validate user input for expected length
net/sched: act_skbmod: prevent kernel-infoleak
net: usb: ax88179_178a: avoid the interface always configured as random address
net: dsa: sja1105: Fix parameters order in sja1110_pcs_mdio_write_c45()
net: ravb: Always update error counters
net: ravb: Always process TX descriptor ring
netfilter: nf_tables: discard table flag update with pending basechain deletion
netfilter: nf_tables: Fix potential data-race in __nft_flowtable_type_get()
netfilter: nf_tables: reject new basechain after table flag update
netfilter: nf_tables: flush pending destroy work before exit_net release
netfilter: nf_tables: release mutex after nft_gc_seq_end from abort path
netfilter: nf_tables: release batch on table validation from abort path
Revert "tg3: Remove residual error handling in tg3_suspend"
tg3: Remove residual error handling in tg3_suspend
net: mana: Fix Rx DMA datasize and skb_over_panic
net/sched: fix lockdep splat in qdisc_tree_reduce_backlog()
net: phy: micrel: lan8814: Fix when enabling/disabling 1-step timestamping
net: stmmac: fix rx queue priority assignment
net: txgbe: fix i2c dev name cannot match clkdev
net: fec: Set mac_managed_pm during probe
...
|
|
Pull KVM fixes from Paolo Bonzini:
"ARM:
- Ensure perf events programmed to count during guest execution are
actually enabled before entering the guest in the nVHE
configuration
- Restore out-of-range handler for stage-2 translation faults
- Several fixes to stage-2 TLB invalidations to avoid stale
translations, possibly including partial walk caches
- Fix early handling of architectural VHE-only systems to ensure E2H
is appropriately set
- Correct a format specifier warning in the arch_timer selftest
- Make the KVM banner message correctly handle all of the possible
configurations
RISC-V:
- Remove redundant semicolon in num_isa_ext_regs()
- Fix APLIC setipnum_le/be write emulation
- Fix APLIC in_clrip[x] read emulation
x86:
- Fix a bug in KVM_SET_CPUID{2,} where KVM looks at the wrong CPUID
entries (old vs. new) and ultimately neglects to clear PV_UNHALT
from vCPUs with HLT-exiting disabled
- Documentation fixes for SEV
- Fix compat ABI for KVM_MEMORY_ENCRYPT_OP
- Fix a 14-year-old goof in a declaration shared by host and guest;
the enabled field used by Linux when running as a guest pushes the
size of "struct kvm_vcpu_pv_apf_data" from 64 to 68 bytes. This is
really unconsequential because KVM never consumes anything beyond
the first 64 bytes, but the resulting struct does not match the
documentation
Selftests:
- Fix spelling mistake in arch_timer selftest"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (25 commits)
KVM: arm64: Rationalise KVM banner output
arm64: Fix early handling of FEAT_E2H0 not being implemented
KVM: arm64: Ensure target address is granule-aligned for range TLBI
KVM: arm64: Use TLBI_TTL_UNKNOWN in __kvm_tlb_flush_vmid_range()
KVM: arm64: Don't pass a TLBI level hint when zapping table entries
KVM: arm64: Don't defer TLB invalidation when zapping table entries
KVM: selftests: Fix __GUEST_ASSERT() format warnings in ARM's arch timer test
KVM: arm64: Fix out-of-IPA space translation fault handling
KVM: arm64: Fix host-programmed guest events in nVHE
RISC-V: KVM: Fix APLIC in_clrip[x] read emulation
RISC-V: KVM: Fix APLIC setipnum_le/be write emulation
RISC-V: KVM: Remove second semicolon
KVM: selftests: Fix spelling mistake "trigged" -> "triggered"
Documentation: kvm/sev: clarify usage of KVM_MEMORY_ENCRYPT_OP
Documentation: kvm/sev: separate description of firmware
KVM: SEV: fix compat ABI for KVM_MEMORY_ENCRYPT_OP
KVM: selftests: Check that PV_UNHALT is cleared when HLT exiting is disabled
KVM: x86: Use actual kvm_cpuid.base for clearing KVM_FEATURE_PV_UNHALT
KVM: x86: Introduce __kvm_get_hypervisor_cpuid() helper
KVM: SVM: Return -EINVAL instead of -EBUSY on attempt to re-init SEV/SEV-ES
...
|
|
The SVE register sets have two different formats, one of which is a wrapped
version of the standard FPSIMD register set and another with actual SVE
register data. At present we check TIF_SVE to see if full SVE register
state should be provided when reading the SVE regset but if we were in a
syscall we may have saved only floating point registers even though that is
set.
Fix this and simplify the logic by checking and using the format which we
recorded when deciding if we should use FPSIMD or SVE format.
Fixes: 8c845e273104 ("arm64/sve: Leave SVE enabled on syscall if we don't context switch")
Cc: <stable@vger.kernel.org> # 6.2.x
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20240325-arm64-ptrace-fp-type-v1-1-8dc846caf11f@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
can1_lpcg: clock-controller@5ace0000 {
... Col1 Col2
clocks = <&clk IMX_SC_R_CAN_1 IMX_SC_PM_CLK_PER>,// 0 0
<&dma_ipg_clk>, // 1 4
<&dma_ipg_clk>; // 2 5
clock-indices = <IMX_LPCG_CLK_0>,
<IMX_LPCG_CLK_4>,
<IMX_LPCG_CLK_5>;
};
Col1: index, which existing dts try to get.
Col2: actual index in lpcg driver
&flexcan2 {
clocks = <&can1_lpcg 1>, <&can1_lpcg 0>;
^^ ^^
Should be:
clocks = <&can1_lpcg IMX_LPCG_CLK_4>, <&can1_lpcg IMX_LPCG_CLK_0>;
};
Arg0 is divided by 4 in lpcg driver. So flexcan get IMX_SC_PM_CLK_PER by
<&can1_lpcg 1> and <&can1_lpcg 0>. Although function work, code logic is
wrong. Fix it by using correct clock indices.
Cc: stable@vger.kernel.org
Fixes: be85831de020 ("arm64: dts: imx8qm: add can node in devicetree")
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|
|
can0_lpcg: clock-controller@5acd0000 {
... Col1 Col2
clocks = <&clk IMX_SC_R_CAN_0 IMX_SC_PM_CLK_PER>, // 0 0
<&dma_ipg_clk>, // 1 4
<&dma_ipg_clk>; // 2 5
clock-indices = <IMX_LPCG_CLK_0>,
<IMX_LPCG_CLK_4>,
<IMX_LPCG_CLK_5>;
}
Col1: index, which existing dts try to get.
Col2: actual index in lpcg driver.
flexcan1: can@5a8d0000 {
clocks = <&can0_lpcg 1>, <&can0_lpcg 0>;
^^ ^^
Should be:
clocks = <&can0_lpcg IMX_LPCG_CLK_4>, <&can0_lpcg IMX_LPCG_CLK_0>;
};
Arg0 is divided by 4 in lpcg driver. flexcan driver get IMX_SC_PM_CLK_PER
by <&can0_lpcg 1> and <&can0_lpcg 0>. Although function can work, code
logic is wrong. Fix it by using correct clock indices.
Cc: stable@vger.kernel.org
Fixes: 5e7d5b023e03 ("arm64: dts: imx8qxp: add flexcan in adma")
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|
|
adc0_lpcg: clock-controller@5ac80000 {
... Col1 Col2
clocks = <&clk IMX_SC_R_ADC_0 IMX_SC_PM_CLK_PER>, // 0 0
<&dma_ipg_clk>; // 1 4
clock-indices = <IMX_LPCG_CLK_0>, <IMX_LPCG_CLK_4>;
};
Col1: index, which existing dts try to get.
Col2: actual index in lpcg driver.
adc0: adc@5a880000 {
clocks = <&adc0_lpcg 0>, <&adc0_lpcg 1>;
^^ ^^
clocks = <&adc0_lpcg IMX_LPCG_CLK_0>, <&adc0_lpcg IMX_LPCG_CLK_4>;
Arg0 is divided by 4 in lpcg driver. So adc get IMX_SC_PM_CLK_PER by
<&adc0_lpcg 0>, <&adc0_lpcg 1>. Although function can work, code logic is
wrong. Fix it by using correct indices.
Cc: stable@vger.kernel.org
Fixes: 1db044b25d2e ("arm64: dts: imx8dxl: add adc0 support")
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|
|
adma_pwm_lpcg: clock-controller@5a590000 {
... col1 col2
clocks = <&clk IMX_SC_R_LCD_0_PWM_0 IMX_SC_PM_CLK_PER>,// 0 0
<&dma_ipg_clk>; // 1 4
clock-indices = <IMX_LPCG_CLK_0>, <IMX_LPCG_CLK_4>;
...
};
Col1: index, which existing dts try to get.
Col2: actual index in lpcg driver.
adma_pwm: pwm@5a190000 {
...
clocks = <&adma_pwm_lpcg 1>, <&adma_pwm_lpcg 0>;
^^ ^^
Should be
clocks = <&adma_pwm_lpcg IMX_LPCG_CLK_4>,
<&adma_pwm_lpcg IMX_LPCG_CLK_0>;
};
Arg0 will be divided by 4 in lcpg driver, so pwm will get IMX_SC_PM_CLK_PER
by <&adma_pwm_lpcg 1>, <&adma_pwm_lpcg 0>. Although function can work, code
logic is wrong. Fix it by use correct indices.
Cc: stable@vger.kernel.org
Fixes: f1d6a6b991ef ("arm64: dts: imx8qxp: add adma_pwm in adma")
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|
|
spi0_lpcg: clock-controller@5a400000 {
... Col0 Col1
clocks = <&clk IMX_SC_R_SPI_0 IMX_SC_PM_CLK_PER>,// 0 1
<&dma_ipg_clk>; // 1 4
clock-indices = <IMX_LPCG_CLK_0>, <IMX_LPCG_CLK_4>;
};
Col1: index, which existing dts try to get.
Col2: actual index in lpcg driver.
lpspi0: spi@5a000000 {
...
clocks = <&spi0_lpcg 0>, <&spi0_lpcg 1>;
^ ^
Should be:
clocks = <&spi0_lpcg IMX_LPCG_CLK_0>, <&spi0_lpcg IMX_LPCG_CLK_4>;
};
Arg0 is divided by 4 in lpcg driver. <&spi0_lpcg 0> and <&spi0_lpcg 1> are
IMX_SC_PM_CLK_PER. Although code can work, code logic is wrong. It should
use IMX_LPCG_CLK_0 and IMX_LPCG_CLK_4 for lpcg arg0.
Cc: stable@vger.kernel.org
Fixes: c4098885e790 ("arm64: dts: imx8dxl: add lpspi support")
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|
|
usb2_lpcg: clock-controller@5b270000 {
... Col1 Col2
clocks = <&conn_ahb_clk>, <&conn_ipg_clk>; // 0 6
clock-indices = <IMX_LPCG_CLK_6>, <IMX_LPCG_CLK_7>; // 0 7
...
};
Col1: index, which existing dts try to get.
Col2: actual index in lpcg driver.
usbotg1: usb@5b0d0000 {
...
clocks = <&usb2_lpcg 0>;
^^
Should be:
clocks = <&usb2_lpcg IMX_LPCG_CLK_6>;
};
usbphy1: usbphy@5b100000 {
clocks = <&usb2_lpcg 1>;
^^
SHould be:
clocks = <&usb2_lpcg IMX_LPCG_CLK_7>;
};
Arg0 is divided by 4 in lpcg driver. So lpcg will do dummy enable. Fix it
by use correct clock indices.
Cc: stable@vger.kernel.org
Fixes: 8065fc937f0f ("arm64: dts: imx8dxl: add usb1 and usb2 support")
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|
|
lpcg's arg0 should use clock indices instead of index.
pwm0_lpcg: clock-controller@5d400000 {
... // Col1 Col2
clocks = <&clk IMX_SC_R_PWM_0 IMX_SC_PM_CLK_PER>, // 0 0
<&clk IMX_SC_R_PWM_0 IMX_SC_PM_CLK_PER>, // 1 1
<&clk IMX_SC_R_PWM_0 IMX_SC_PM_CLK_PER>, // 2 4
<&lsio_bus_clk>, // 3 5
<&clk IMX_SC_R_PWM_0 IMX_SC_PM_CLK_PER>; // 4 6
clock-indices = <IMX_LPCG_CLK_0>, <IMX_LPCG_CLK_1>,
<IMX_LPCG_CLK_4>, <IMX_LPCG_CLK_5>,
<IMX_LPCG_CLK_6>;
};
Col1: index, which existing dts try to get.
Col2: actual index in lpcg driver.
pwm1 {
....
clocks = <&pwm1_lpcg 4>, <&pwm1_lpcg 1>;
^^ ^^
should be:
clocks = <&pwm1_lpcg IMX_LPCG_CLK_6>, <&pwm1_lpcg IMX_LPCG_CLK_1>;
};
Arg0 is divided by 4 in lpcg driver, so index 0 and 1 will be get by pwm
driver, which are same as IMX_LPCG_CLK_6 and IMX_LPCG_CLK_1. Even it can
work, but code logic is wrong. Fixed it by use correct indices.
Cc: stable@vger.kernel.org
Fixes: 23fa99b205ea ("arm64: dts: freescale: imx8-ss-lsio: add support for lsio_pwm0-3")
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|
|
The actual clock show wrong frequency:
echo on >/sys/devices/platform/bus\@5b000000/5b010000.mmc/power/control
cat /sys/kernel/debug/mmc0/ios
clock: 200000000 Hz
actual clock: 166000000 Hz
^^^^^^^^^
.....
According to
sdhc0_lpcg: clock-controller@5b200000 {
compatible = "fsl,imx8qxp-lpcg";
reg = <0x5b200000 0x10000>;
#clock-cells = <1>;
clocks = <&clk IMX_SC_R_SDHC_0 IMX_SC_PM_CLK_PER>,
<&conn_ipg_clk>, <&conn_axi_clk>;
clock-indices = <IMX_LPCG_CLK_0>, <IMX_LPCG_CLK_4>,
<IMX_LPCG_CLK_5>;
clock-output-names = "sdhc0_lpcg_per_clk",
"sdhc0_lpcg_ipg_clk",
"sdhc0_lpcg_ahb_clk";
power-domains = <&pd IMX_SC_R_SDHC_0>;
}
"per_clk" should be IMX_LPCG_CLK_0 instead of IMX_LPCG_CLK_5.
After correct clocks order:
echo on >/sys/devices/platform/bus\@5b000000/5b010000.mmc/power/control
cat /sys/kernel/debug/mmc0/ios
clock: 200000000 Hz
actual clock: 198000000 Hz
^^^^^^^^
...
Fixes: 16c4ea7501b1 ("arm64: dts: imx8: switch to new lpcg clock binding")
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|
|
We are not very consistent when it comes to displaying which mode
we're in (VHE, {n,h}VHE, protected or not). For example, booting
in protected mode with hVHE results in:
[ 0.969545] kvm [1]: Protected nVHE mode initialized successfully
which is mildly amusing considering that the machine is VHE only.
We already cleaned this up a bit with commit 1f3ca7023fe6 ("KVM:
arm64: print Hyp mode"), but that's still unsatisfactory.
Unify the three strings into one and use a mess of conditional
statements to sort it out (yes, it's a slow day).
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240321173706.3280796-1-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Commit 3944382fa6f2 introduced checks for the FEAT_E2H0 not being
implemented. However, the check is absolutely wrong and makes a
point it testing a bit that is guaranteed to be zero.
On top of that, the detection happens way too late, after the
init_el2_state has done its job.
This went undetected because the HW this was tested on has E2H being
RAO/WI, and not RES1. However, the bug shows up when run as a nested
guest, where HCR_EL2.E2H is not necessarily set to 1. As a result,
booting the kernel in hVHE mode fails with timer accesses being
cought in a trap loop (which was fun to debug).
Fix the check for ID_AA64MMFR4_EL1.E2H0, and set the HCR_EL2.E2H bit
early so that it can be checked by the rest of the init sequence.
With this, hVHE works again in a NV environment that doesn't have
FEAT_E2H0.
Fixes: 3944382fa6f2 ("arm64: Treat HCR_EL2.E2H as RES1 when ID_AA64MMFR4_EL1.E2H0 is negative")
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20240321115414.3169115-1-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
When zapping a table entry in stage2_try_break_pte(), we issue range
TLB invalidation for the region that was mapped by the table. However,
we neglect to align the base address down to the granule size and so
if we ended up reaching the table entry via a misaligned address then
we will accidentally skip invalidation for some prefix of the affected
address range.
Align 'ctx->addr' down to the granule size when performing TLB
invalidation for an unmapped table in stage2_try_break_pte().
Cc: Raghavendra Rao Ananta <rananta@google.com>
Cc: Gavin Shan <gshan@redhat.com>
Cc: Shaoqin Huang <shahuang@redhat.com>
Cc: Quentin Perret <qperret@google.com>
Fixes: defc8cc7abf0 ("KVM: arm64: Invalidate the table entries upon a range")
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240327124853.11206-5-will@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Commit c910f2b65518 ("arm64/mm: Update tlb invalidation routines for
FEAT_LPA2") updated the __tlbi_level() macro to take the target level
as an argument, with TLBI_TTL_UNKNOWN (rather than 0) indicating that
the caller cannot provide level information. Unfortunately, the two
implementations of __kvm_tlb_flush_vmid_range() were not updated and so
now ask for an level 0 invalidation if FEAT_LPA2 is implemented.
Fix the problem by passing TLBI_TTL_UNKNOWN instead of 0 as the level
argument to __flush_s2_tlb_range_op() in __kvm_tlb_flush_vmid_range().
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Marc Zyngier <maz@kernel.org>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Fixes: c910f2b65518 ("arm64/mm: Update tlb invalidation routines for FEAT_LPA2")
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240327124853.11206-4-will@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
The TLBI level hints are for leaf entries only, so take care not to pass
them incorrectly after clearing a table entry.
Cc: Gavin Shan <gshan@redhat.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Quentin Perret <qperret@google.com>
Fixes: 82bb02445de5 ("KVM: arm64: Implement kvm_pgtable_hyp_unmap() at EL2")
Fixes: 6d9d2115c480 ("KVM: arm64: Add support for stage-2 map()/unmap() in generic page-table")
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240327124853.11206-3-will@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Commit 7657ea920c54 ("KVM: arm64: Use TLBI range-based instructions for
unmap") introduced deferred TLB invalidation for the stage-2 page-table
so that range-based invalidation can be used for the accumulated
addresses. This works fine if the structure of the page-tables remains
unchanged, but if entire tables are zapped and subsequently freed then
we transiently leave the hardware page-table walker with a reference
to freed memory thanks to the translation walk caches. For example,
stage2_unmap_walker() will free page-table pages:
if (childp)
mm_ops->put_page(childp);
and issue the TLB invalidation later in kvm_pgtable_stage2_unmap():
if (stage2_unmap_defer_tlb_flush(pgt))
/* Perform the deferred TLB invalidations */
kvm_tlb_flush_vmid_range(pgt->mmu, addr, size);
For now, take the conservative approach and invalidate the TLB eagerly
when we clear a table entry. Note, however, that the existing level
hint passed to __kvm_tlb_flush_vmid_ipa() is incorrect and will be
fixed in a subsequent patch.
Cc: Raghavendra Rao Ananta <rananta@google.com>
Cc: Shaoqin Huang <shahuang@redhat.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240327124853.11206-2-will@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Several Qualcomm Bluetooth controllers lack persistent storage for the
device address and instead one can be provided by the boot firmware
using the 'local-bd-address' devicetree property.
The Bluetooth bindings clearly states that the address should be
specified in little-endian order, but due to a long-standing bug in the
Qualcomm driver which reversed the address some boot firmware has been
providing the address in big-endian order instead.
The boot firmware in SC7180 Trogdor Chromebooks is known to be affected
so mark the 'local-bd-address' property as broken to maintain backwards
compatibility with older firmware when fixing the underlying driver bug.
Note that ChromeOS always updates the kernel and devicetree in lockstep
so that there is no need to handle backwards compatibility with older
devicetrees.
Fixes: 7ec3e67307f8 ("arm64: dts: qcom: sc7180-trogdor: add initial trogdor and lazor dt")
Cc: stable@vger.kernel.org # 5.10
Cc: Rob Clark <robdclark@chromium.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Acked-by: Bjorn Andersson <andersson@kernel.org>
Reviewed-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
When using usb-conn-gpio to control USB role and VBUS, the vbus-supply
property must be present in the usb-conn-gpio node. Additionally it
should not be present in the phy node as that isn't what controls vbus
and will upset the use count.
This resolves an issue where VBUS is enabled with OTG in peripheral
mode.
Fixes: ad9a12f7a522 ("arm64: dts: imx8mp-venice: Fix USB connector description")
Signed-off-by: Tim Harvey <tharvey@gateworks.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|
|
When using usb-conn-gpio to control USB role and VBUS, the vbus-supply
property must be present in the usb-conn-gpio node. Additionally it
should not be present in the phy node as that isn't what controls vbus
and will upset the use count.
This resolves an issue where VBUS is enabled with OTG in peripheral
mode.
Fixes: ad9a12f7a522 ("arm64: dts: imx8mp-venice: Fix USB connector description")
Signed-off-by: Tim Harvey <tharvey@gateworks.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from bpf, WiFi and netfilter.
Current release - regressions:
- ipv6: fix address dump when IPv6 is disabled on an interface
Current release - new code bugs:
- bpf: temporarily disable atomic operations in BPF arena
- nexthop: fix uninitialized variable in nla_put_nh_group_stats()
Previous releases - regressions:
- bpf: protect against int overflow for stack access size
- hsr: fix the promiscuous mode in offload mode
- wifi: don't always use FW dump trig
- tls: adjust recv return with async crypto and failed copy to
userspace
- tcp: properly terminate timers for kernel sockets
- ice: fix memory corruption bug with suspend and rebuild
- at803x: fix kernel panic with at8031_probe
- qeth: handle deferred cc1
Previous releases - always broken:
- bpf: fix bug in BPF_LDX_MEMSX
- netfilter: reject table flag and netdev basechain updates
- inet_defrag: prevent sk release while still in use
- wifi: pick the version of SESSION_PROTECTION_NOTIF
- wwan: t7xx: split 64bit accesses to fix alignment issues
- mlxbf_gige: call request_irq() after NAPI initialized
- hns3: fix kernel crash when devlink reload during pf
initialization"
* tag 'net-6.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (81 commits)
inet: inet_defrag: prevent sk release while still in use
Octeontx2-af: fix pause frame configuration in GMP mode
net: lan743x: Add set RFE read fifo threshold for PCI1x1x chips
net: bcmasp: Remove phy_{suspend/resume}
net: bcmasp: Bring up unimac after PHY link up
net: phy: qcom: at803x: fix kernel panic with at8031_probe
netfilter: arptables: Select NETFILTER_FAMILY_ARP when building arp_tables.c
netfilter: nf_tables: skip netdev hook unregistration if table is dormant
netfilter: nf_tables: reject table flag and netdev basechain updates
netfilter: nf_tables: reject destroy command to remove basechain hooks
bpf: update BPF LSM designated reviewer list
bpf: Protect against int overflow for stack access size
bpf: Check bloom filter map value size
bpf: fix warning for crash_kexec
selftests: netdevsim: set test timeout to 10 minutes
net: wan: framer: Add missing static inline qualifiers
mlxbf_gige: call request_irq() after NAPI initialized
tls: get psock ref after taking rxlock to avoid leak
selftests: tls: add test with a partially invalid iov
tls: adjust recv return with async crypto and failed copy to userspace
...
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:
====================
pull-request: bpf 2024-03-25
The following pull-request contains BPF updates for your *net* tree.
We've added 17 non-merge commits during the last 12 day(s) which contain
a total of 19 files changed, 184 insertions(+), 61 deletions(-).
The main changes are:
1) Fix an arm64 BPF JIT bug in BPF_LDX_MEMSX implementation's offset handling
found via test_bpf module, from Puranjay Mohan.
2) Various fixups to the BPF arena code in particular in the BPF verifier and
around BPF selftests to match latest corresponding LLVM implementation,
from Puranjay Mohan and Alexei Starovoitov.
3) Fix xsk to not assume that metadata is always requested in TX completion,
from Stanislav Fomichev.
4) Fix riscv BPF JIT's kfunc parameter incompatibility between BPF and the riscv
ABI which requires sign-extension on int/uint, from Pu Lehui.
5) Fix s390x BPF JIT's bpf_plt pointer arithmetic which triggered a crash when
testing struct_ops, from Ilya Leoshkevich.
6) Fix libbpf's arena mmap handling which had incorrect u64-to-pointer cast on
32-bit architectures, from Andrii Nakryiko.
7) Fix libbpf to define MFD_CLOEXEC when not available, from Arnaldo Carvalho de Melo.
8) Fix arm64 BPF JIT implementation for 32bit unconditional bswap which
resulted in an incorrect swap as indicated by test_bpf, from Artem Savkov.
9) Fix BPF man page build script to use silent mode, from Hangbin Liu.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
riscv, bpf: Fix kfunc parameters incompatibility between bpf and riscv abi
bpf: verifier: reject addr_space_cast insn without arena
selftests/bpf: verifier_arena: fix mmap address for arm64
bpf: verifier: fix addr_space_cast from as(1) to as(0)
libbpf: Define MFD_CLOEXEC if not available
arm64: bpf: fix 32bit unconditional bswap
bpf, arm64: fix bug in BPF_LDX_MEMSX
libbpf: fix u64-to-pointer cast on 32-bit arches
s390/bpf: Fix bpf_plt pointer arithmetic
xsk: Don't assume metadata is always requested in TX completion
selftests/bpf: Add arena test case for 4Gbyte corner case
selftests/bpf: Remove hard coded PAGE_SIZE macro.
libbpf, selftests/bpf: Adjust libbpf, bpftool, selftests to match LLVM
bpf: Clarify bpf_arena comments.
MAINTAINERS: Update email address for Quentin Monnet
scripts/bpf_doc: Use silent mode when exec make cmd
bpf: Temporarily disable atomic operations in BPF arena
====================
Link: https://lore.kernel.org/r/20240325213520.26688-1-daniel@iogearbox.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Commit 11e5ea5242e3 ("KVM: arm64: Use helpers to classify exception
types reported via ESR") tried to abstract the translation fault
check when handling an out-of IPA space condition, but incorrectly
replaced it with a permission fault check.
Restore the previous translation fault check.
Fixes: 11e5ea5242e3 ("KVM: arm64: Use helpers to classify exception types reported via ESR")
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Wujie Duan <wjduan@linx-info.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/kvmarm/864jd3269g.wl-maz@kernel.org/
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- Re-instate the CPUMASK_OFFSTACK option for arm64 when NR_CPUS > 256.
The bug that led to the initial revert was the cpufreq-dt code not
using zalloc_cpumask_var().
- Make the STARFIVE_STARLINK_PMU config option depend on 64BIT to
prevent compile-test failures on 32-bit architectures due to missing
writeq().
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
perf: starfive: fix 64-bit only COMPILE_TEST condition
ARM64: Dynamically allocate cpumasks and increase supported CPUs to 512
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB / Thunderbolt updates from Greg KH:
"Here is the big set of USB and Thunderbolt changes for 6.9-rc1. Lots
of tiny changes and forward progress to support new hardware and
better support for existing devices. Included in here are:
- Thunderbolt (i.e. USB4) updates for newer hardware and uses as more
people start to use the hardware
- default USB authentication mode Kconfig and documentation update to
make it more obvious what is going on
- USB typec updates and enhancements
- usual dwc3 driver updates
- usual xhci driver updates
- function USB (i.e. gadget) driver updates and additions
- new device ids for lots of drivers
- loads of other small updates, full details in the shortlog
All of these, including a "last minute regression fix" have been in
linux-next with no reported issues"
* tag 'usb-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (185 commits)
usb: usb-acpi: Fix oops due to freeing uninitialized pld pointer
usb: gadget: net2272: Use irqflags in the call to net2272_probe_fin
usb: gadget: tegra-xudc: Fix USB3 PHY retrieval logic
phy: tegra: xusb: Add API to retrieve the port number of phy
USB: gadget: pxa27x_udc: Remove unused of_gpio.h
usb: gadget/snps_udc_plat: Remove unused of_gpio.h
usb: ohci-pxa27x: Remove unused of_gpio.h
usb: sl811-hcd: only defined function checkdone if QUIRK2 is defined
usb: Clarify expected behavior of dev_bin_attrs_are_visible()
xhci: Allow RPM on the USB controller (1022:43f7) by default
usb: isp1760: remove SLAB_MEM_SPREAD flag usage
usb: misc: onboard_hub: use pointer consistently in the probe function
usb: gadget: fsl: Increase size of name buffer for endpoints
usb: gadget: fsl: Add of device table to enable module autoloading
usb: typec: tcpm: add support to set tcpc connector orientatition
usb: typec: tcpci: add generic tcpci fallback compatible
dt-bindings: usb: typec-tcpci: add tcpci fallback binding
usb: gadget: fsl-udc: Replace custom log wrappers by dev_{err,warn,dbg,vdbg}
usb: core: Set connect_type of ports based on DT node
dt-bindings: usb: Add downstream facing ports to realtek binding
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull hyperv updates from Wei Liu:
- Use Hyper-V entropy to seed guest random number generator (Michael
Kelley)
- Convert to platform remove callback returning void for vmbus (Uwe
Kleine-König)
- Introduce hv_get_hypervisor_version function (Nuno Das Neves)
- Rename some HV_REGISTER_* defines for consistency (Nuno Das Neves)
- Change prefix of generic HV_REGISTER_* MSRs to HV_MSR_* (Nuno Das
Neves)
- Cosmetic changes for hv_spinlock.c (Purna Pavan Chandra Aekkaladevi)
- Use per cpu initial stack for vtl context (Saurabh Sengar)
* tag 'hyperv-next-signed-20240320' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
x86/hyperv: Use Hyper-V entropy to seed guest random number generator
x86/hyperv: Cosmetic changes for hv_spinlock.c
hyperv-tlfs: Rename some HV_REGISTER_* defines for consistency
hv: vmbus: Convert to platform remove callback returning void
mshyperv: Introduce hv_get_hypervisor_version function
x86/hyperv: Use per cpu initial stack for vtl context
hyperv-tlfs: Change prefix of generic HV_REGISTER_* MSRs to HV_MSR_*
|
|
In case when is64 == 1 in emit(A64_REV32(is64, dst, dst), ctx) the
generated insn reverses byte order for both high and low 32-bit words,
resuling in an incorrect swap as indicated by the jit test:
[ 9757.262607] test_bpf: #312 BSWAP 16: 0x0123456789abcdef -> 0xefcd jited:1 8 PASS
[ 9757.264435] test_bpf: #313 BSWAP 32: 0x0123456789abcdef -> 0xefcdab89 jited:1 ret 1460850314 != -271733879 (0x5712ce8a != 0xefcdab89)FAIL (1 times)
[ 9757.266260] test_bpf: #314 BSWAP 64: 0x0123456789abcdef -> 0x67452301 jited:1 8 PASS
[ 9757.268000] test_bpf: #315 BSWAP 64: 0x0123456789abcdef >> 32 -> 0xefcdab89 jited:1 8 PASS
[ 9757.269686] test_bpf: #316 BSWAP 16: 0xfedcba9876543210 -> 0x1032 jited:1 8 PASS
[ 9757.271380] test_bpf: #317 BSWAP 32: 0xfedcba9876543210 -> 0x10325476 jited:1 ret -1460850316 != 271733878 (0xa8ed3174 != 0x10325476)FAIL (1 times)
[ 9757.273022] test_bpf: #318 BSWAP 64: 0xfedcba9876543210 -> 0x98badcfe jited:1 7 PASS
[ 9757.274721] test_bpf: #319 BSWAP 64: 0xfedcba9876543210 >> 32 -> 0x10325476 jited:1 9 PASS
Fix this by forcing 32bit variant of rev32.
Fixes: 1104247f3f979 ("bpf, arm64: Support unconditional bswap")
Signed-off-by: Artem Savkov <asavkov@redhat.com>
Tested-by: Puranjay Mohan <puranjay12@gmail.com>
Acked-by: Puranjay Mohan <puranjay12@gmail.com>
Acked-by: Xu Kuohai <xukuohai@huawei.com>
Message-ID: <20240321081809.158803-1-asavkov@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
A64_LDRSW() takes three registers: Xt, Xn, Xm as arguments and it loads
and sign extends the value at address Xn + Xm into register Xt.
Currently, the offset is being directly used in place of the tmp
register which has the offset already loaded by the last emitted
instruction.
This will cause JIT failures. The easiest way to reproduce this is to
test the following code through test_bpf module:
{
"BPF_LDX_MEMSX | BPF_W",
.u.insns_int = {
BPF_LD_IMM64(R1, 0x00000000deadbeefULL),
BPF_LD_IMM64(R2, 0xffffffffdeadbeefULL),
BPF_STX_MEM(BPF_DW, R10, R1, -7),
BPF_LDX_MEMSX(BPF_W, R0, R10, -7),
BPF_JMP_REG(BPF_JNE, R0, R2, 1),
BPF_ALU64_IMM(BPF_MOV, R0, 0),
BPF_EXIT_INSN(),
},
INTERNAL,
{ },
{ { 0, 0 } },
.stack_depth = 7,
},
We need to use the offset as -7 to trigger this code path, there could
be other valid ways to trigger this from proper BPF programs as well.
This code is rejected by the JIT because -7 is passed to A64_LDRSW() but
it expects a valid register (0 - 31).
roott@pjy:~# modprobe test_bpf test_name="BPF_LDX_MEMSX | BPF_W"
[11300.490371] test_bpf: test_bpf: set 'test_bpf' as the default test_suite.
[11300.491750] test_bpf: #345 BPF_LDX_MEMSX | BPF_W
[11300.493179] aarch64_insn_encode_register: unknown register encoding -7
[11300.494133] aarch64_insn_encode_register: unknown register encoding -7
[11300.495292] FAIL to select_runtime err=-524
[11300.496804] test_bpf: Summary: 0 PASSED, 1 FAILED, [0/0 JIT'ed]
modprobe: ERROR: could not insert 'test_bpf': Invalid argument
Applying this patch fixes the issue.
root@pjy:~# modprobe test_bpf test_name="BPF_LDX_MEMSX | BPF_W"
[ 292.837436] test_bpf: test_bpf: set 'test_bpf' as the default test_suite.
[ 292.839416] test_bpf: #345 BPF_LDX_MEMSX | BPF_W jited:1 156 PASS
[ 292.844794] test_bpf: Summary: 1 PASSED, 0 FAILED, [1/1 JIT'ed]
Fixes: cc88f540da52 ("bpf, arm64: Support sign-extension load instructions")
Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Message-ID: <20240312235917.103626-1-puranjay12@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull more ARM SoC updates from Arnd Bergmann:
"These are changes that for some reason ended up not making it into the
first four branches but that should still make it into 6.9:
- A rework of the omap clock support that touches both drivers and
device tree files
- The reset controller branch changes that had a dependency on late
bugfixes. Merging them here avoids a backmerge of 6.8-rc5 into the
drivers branch
- The RISC-V/starfive, RISC-V/microchip and ARM/Broadcom devicetree
changes that got delayed and needed some extra time in linux-next
for wider testing"
* tag 'soc-late-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (31 commits)
soc: fsl: dpio: fix kcalloc() argument order
bus: ts-nbus: Improve error reporting
bus: ts-nbus: Convert to atomic pwm API
riscv: dts: starfive: jh7110: Add camera subsystem nodes
ARM: bcm: stop selecing CONFIG_TICK_ONESHOT
ARM: dts: omap3: Update clksel clocks to use reg instead of ti,bit-shift
ARM: dts: am3: Update clksel clocks to use reg instead of ti,bit-shift
clk: ti: Improve clksel clock bit parsing for reg property
clk: ti: Handle possible address in the node name
dt-bindings: pwm: opencores: Add compatible for StarFive JH8100
dt-bindings: riscv: cpus: reg matches hart ID
reset: Instantiate reset GPIO controller for shared reset-gpios
reset: gpio: Add GPIO-based reset controller
cpufreq: do not open-code of_phandle_args_equal()
of: Add of_phandle_args_equal() helper
reset: simple: add support for Sophgo SG2042
dt-bindings: reset: sophgo: support SG2042
riscv: dts: microchip: add specific compatible for mpfs pdma
riscv: dts: microchip: add missing CAN bus clocks
ARM: brcmstb: Add debug UART entry for 74165
...
|
|
A Hyper-V host provides its guest VMs with entropy in a custom ACPI
table named "OEM0". The entropy bits are updated each time Hyper-V
boots the VM, and are suitable for seeding the Linux guest random
number generator (rng). See a brief description of OEM0 in [1].
Generation 2 VMs on Hyper-V use UEFI to boot. Existing EFI code in
Linux seeds the rng with entropy bits from the EFI_RNG_PROTOCOL.
Via this path, the rng is seeded very early during boot with good
entropy. The ACPI OEM0 table provided in such VMs is an additional
source of entropy.
Generation 1 VMs on Hyper-V boot from BIOS. For these VMs, Linux
doesn't currently get any entropy from the Hyper-V host. While this
is not fundamentally broken because Linux can generate its own entropy,
using the Hyper-V host provided entropy would get the rng off to a
better start and would do so earlier in the boot process.
Improve the rng seeding for Generation 1 VMs by having Hyper-V specific
code in Linux take advantage of the OEM0 table to seed the rng. For
Generation 2 VMs, use the OEM0 table to provide additional entropy
beyond the EFI_RNG_PROTOCOL. Because the OEM0 table is custom to
Hyper-V, parse it directly in the Hyper-V code in the Linux kernel
and use add_bootloader_randomness() to add it to the rng. Once the
entropy bits are read from OEM0, zero them out in the table so
they don't appear in /sys/firmware/acpi/tables/OEM0 in the running
VM. The zero'ing is done out of an abundance of caution to avoid
potential security risks to the rng. Also set the OEM0 data length
to zero so a kexec or other subsequent use of the table won't try
to use the zero'ed bits.
[1] https://download.microsoft.com/download/1/c/9/1c9813b8-089c-4fef-b2ad-ad80e79403ba/Whitepaper%20-%20The%20Windows%2010%20random%20number%20generation%20infrastructure.pdf
Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20240318155408.216851-1-mhklinux@outlook.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <20240318155408.216851-1-mhklinux@outlook.com>
|
|
[ a.k.a. Revert "Revert "ARM64: Dynamically allocate cpumasks and
increase supported CPUs to 512""; originally reverted because of a
bug in the cpufreq-dt code not using zalloc_cpumask_var() ]
Currently defconfig selects NR_CPUS=256, but some vendors (e.g. Ampere
Computing) are planning to ship systems with 512 CPUs. So that all CPUs on
these systems can be used with defconfig, we'd like to bump NR_CPUS to 512.
Therefore this patch increases the default NR_CPUS from 256 to 512.
As increasing NR_CPUS will increase the size of cpumasks, there's a fear that
this might have a significant impact on stack usage due to code which places
cpumasks on the stack. To mitigate that concern, we can select
CPUMASK_OFFSTACK. As that doesn't seem to be a problem today with
NR_CPUS=256, we only select this when NR_CPUS > 256.
CPUMASK_OFFSTACK configures the cpumasks in the kernel to be
dynamically allocated. This was used in the X86 architecture in the
past to enable support for larger CPU configurations up to 8k cpus.
With that is becomes possible to dynamically size the allocation of
the cpu bitmaps depending on the quantity of processors detected on
bootup. Memory used for cpumasks will increase if the kernel is
run on a machine with more cores.
Further increases may be needed if ARM processor vendors start
supporting more processors. Given the current inflationary trends
in core counts from multiple processor manufacturers this may occur.
There are minor regressions for hackbench. The kernel data size
for 512 cpus is smaller with offstack than with onstack.
Benchmark results using hackbench average over 10 runs of
hackbench -s 512 -l 2000 -g 15 -f 25 -P
on Altra 80 Core
Support for 256 CPUs on stack. Baseline
7.8564 sec
Support for 512 CUs on stack.
7.8713 sec + 0.18%
512 CPUS offstack
7.8916 sec + 0.44%
Kernel size comparison:
text data filename Difference to onstack256 baseline
25755648 9589248 vmlinuz-6.8.0-rc4-onstack256
25755648 9607680 vmlinuz-6.8.0-rc4-onstack512 +0.19%
25755648 9603584 vmlinuz-6.8.0-rc4-offstack512 +0.14%
Tested-by: Eric Mackay <eric.mackay@oracle.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Christoph Lameter (Ampere) <cl@linux.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/37099a57-b655-3b3a-56d0-5f7fbd49d7db@gentwo.org
Link: https://lore.kernel.org/r/20240314125457.186678-1-m.szyprowski@samsung.com
[catalin.marinas@arm.com: use 'select' instead of duplicating 'config CPUMASK_OFFSTACK']
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Rename HV_REGISTER_GUEST_OSID to HV_REGISTER_GUEST_OS_ID. This matches
the existing HV_X64_MSR_GUEST_OS_ID.
Rename HV_REGISTER_CRASH_* to HV_REGISTER_GUEST_CRASH_*. Including
GUEST_ is consistent with other #defines such as
HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE. The new names also match the TLFS
document more accurately, i.e. HvRegisterGuestCrash*.
Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Link: https://lore.kernel.org/r/1710285687-9160-1-git-send-email-nunodasneves@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <1710285687-9160-1-git-send-email-nunodasneves@linux.microsoft.com>
|
|
checking"
This reverts commits 99101dda29e3186b1356b0dc4dbb835c02c71ac9 and
b80b701d5a67d07f4df4a21e09cb31f6bc1feeca.
Linus reports that the sysreg reserved bit checks in KVM have led to
build failures, arising from commit fdd867fe9b32 ("arm64/sysreg: Add
register fields for ID_AA64DFR1_EL1") giving meaning to fields that were
previously RES0.
Of course, this is a genuine issue, since KVM's sysreg emulation depends
heavily on the definition of reserved fields. But at this point the
build breakage is far more offensive, and the right course of action is
to revert and retry later.
All of these build-time assertions were on by default before
commit 99101dda29e3 ("KVM: arm64: Make build-time check of RES0/RES1
bits optional"), so deliberately revert it all atomically to avoid
introducing further breakage of bisection.
Link: https://lore.kernel.org/all/CAHk-=whCvkhc8BbFOUf1ddOsgSGgEjwoKv77=HEY1UiVCydGqw@mail.gmail.com/
Acked-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
"API:
- Avoid unnecessary copying in scomp for trivial SG lists
Algorithms:
- Optimise NEON CCM implementation on ARM64
Drivers:
- Add queue stop/query debugfs support in hisilicon/qm
- Intel qat updates and cleanups"
* tag 'v6.9-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (79 commits)
Revert "crypto: remove CONFIG_CRYPTO_STATS"
crypto: scomp - remove memcpy if sg_nents is 1 and pages are lowmem
crypto: tcrypt - add ffdhe2048(dh) test
crypto: iaa - fix the missing CRYPTO_ALG_ASYNC in cra_flags
crypto: hisilicon/zip - fix the missing CRYPTO_ALG_ASYNC in cra_flags
hwrng: hisi - use dev_err_probe
MAINTAINERS: Remove T Ambarus from few mchp entries
crypto: iaa - Fix comp/decomp delay statistics
crypto: iaa - Fix async_disable descriptor leak
dt-bindings: rng: atmel,at91-trng: add sam9x7 TRNG
dt-bindings: crypto: add sam9x7 in Atmel TDES
dt-bindings: crypto: add sam9x7 in Atmel SHA
dt-bindings: crypto: add sam9x7 in Atmel AES
crypto: remove CONFIG_CRYPTO_STATS
crypto: dh - Make public key test FIPS-only
crypto: rockchip - fix to check return value
crypto: jitter - fix CRYPTO_JITTERENTROPY help text
crypto: qat - make ring to service map common for QAT GEN4
crypto: qat - fix ring to service map for dcc in 420xx
crypto: qat - fix ring to service map for dcc in 4xxx
...
|
|
Pull kvm updates from Paolo Bonzini:
"S390:
- Changes to FPU handling came in via the main s390 pull request
- Only deliver to the guest the SCLP events that userspace has
requested
- More virtual vs physical address fixes (only a cleanup since
virtual and physical address spaces are currently the same)
- Fix selftests undefined behavior
x86:
- Fix a restriction that the guest can't program a PMU event whose
encoding matches an architectural event that isn't included in the
guest CPUID. The enumeration of an architectural event only says
that if a CPU supports an architectural event, then the event can
be programmed *using the architectural encoding*. The enumeration
does NOT say anything about the encoding when the CPU doesn't
report support the event *in general*. It might support it, and it
might support it using the same encoding that made it into the
architectural PMU spec
- Fix a variety of bugs in KVM's emulation of RDPMC (more details on
individual commits) and add a selftest to verify KVM correctly
emulates RDMPC, counter availability, and a variety of other
PMC-related behaviors that depend on guest CPUID and therefore are
easier to validate with selftests than with custom guests (aka
kvm-unit-tests)
- Zero out PMU state on AMD if the virtual PMU is disabled, it does
not cause any bug but it wastes time in various cases where KVM
would check if a PMC event needs to be synthesized
- Optimize triggering of emulated events, with a nice ~10%
performance improvement in VM-Exit microbenchmarks when a vPMU is
exposed to the guest
- Tighten the check for "PMI in guest" to reduce false positives if
an NMI arrives in the host while KVM is handling an IRQ VM-Exit
- Fix a bug where KVM would report stale/bogus exit qualification
information when exiting to userspace with an internal error exit
code
- Add a VMX flag in /proc/cpuinfo to report 5-level EPT support
- Rework TDP MMU root unload, free, and alloc to run with mmu_lock
held for read, e.g. to avoid serializing vCPUs when userspace
deletes a memslot
- Tear down TDP MMU page tables at 4KiB granularity (used to be
1GiB). KVM doesn't support yielding in the middle of processing a
zap, and 1GiB granularity resulted in multi-millisecond lags that
are quite impolite for CONFIG_PREEMPT kernels
- Allocate write-tracking metadata on-demand to avoid the memory
overhead when a kernel is built with i915 virtualization support
but the workloads use neither shadow paging nor i915 virtualization
- Explicitly initialize a variety of on-stack variables in the
emulator that triggered KMSAN false positives
- Fix the debugregs ABI for 32-bit KVM
- Rework the "force immediate exit" code so that vendor code
ultimately decides how and when to force the exit, which allowed
some optimization for both Intel and AMD
- Fix a long-standing bug where kvm_has_noapic_vcpu could be left
elevated if vCPU creation ultimately failed, causing extra
unnecessary work
- Cleanup the logic for checking if the currently loaded vCPU is
in-kernel
- Harden against underflowing the active mmu_notifier invalidation
count, so that "bad" invalidations (usually due to bugs elsehwere
in the kernel) are detected earlier and are less likely to hang the
kernel
x86 Xen emulation:
- Overlay pages can now be cached based on host virtual address,
instead of guest physical addresses. This removes the need to
reconfigure and invalidate the cache if the guest changes the gpa
but the underlying host virtual address remains the same
- When possible, use a single host TSC value when computing the
deadline for Xen timers in order to improve the accuracy of the
timer emulation
- Inject pending upcall events when the vCPU software-enables its
APIC to fix a bug where an upcall can be lost (and to follow Xen's
behavior)
- Fall back to the slow path instead of warning if "fast" IRQ
delivery of Xen events fails, e.g. if the guest has aliased xAPIC
IDs
RISC-V:
- Support exception and interrupt handling in selftests
- New self test for RISC-V architectural timer (Sstc extension)
- New extension support (Ztso, Zacas)
- Support userspace emulation of random number seed CSRs
ARM:
- Infrastructure for building KVM's trap configuration based on the
architectural features (or lack thereof) advertised in the VM's ID
registers
- Support for mapping vfio-pci BARs as Normal-NC (vaguely similar to
x86's WC) at stage-2, improving the performance of interacting with
assigned devices that can tolerate it
- Conversion of KVM's representation of LPIs to an xarray, utilized
to address serialization some of the serialization on the LPI
injection path
- Support for _architectural_ VHE-only systems, advertised through
the absence of FEAT_E2H0 in the CPU's ID register
- Miscellaneous cleanups, fixes, and spelling corrections to KVM and
selftests
LoongArch:
- Set reserved bits as zero in CPUCFG
- Start SW timer only when vcpu is blocking
- Do not restart SW timer when it is expired
- Remove unnecessary CSR register saving during enter guest
- Misc cleanups and fixes as usual
Generic:
- Clean up Kconfig by removing CONFIG_HAVE_KVM, which was basically
always true on all architectures except MIPS (where Kconfig
determines the available depending on CPU capabilities). It is
replaced either by an architecture-dependent symbol for MIPS, and
IS_ENABLED(CONFIG_KVM) everywhere else
- Factor common "select" statements in common code instead of
requiring each architecture to specify it
- Remove thoroughly obsolete APIs from the uapi headers
- Move architecture-dependent stuff to uapi/asm/kvm.h
- Always flush the async page fault workqueue when a work item is
being removed, especially during vCPU destruction, to ensure that
there are no workers running in KVM code when all references to
KVM-the-module are gone, i.e. to prevent a very unlikely
use-after-free if kvm.ko is unloaded
- Grab a reference to the VM's mm_struct in the async #PF worker
itself instead of gifting the worker a reference, so that there's
no need to remember to *conditionally* clean up after the worker
Selftests:
- Reduce boilerplate especially when utilize selftest TAP
infrastructure
- Add basic smoke tests for SEV and SEV-ES, along with a pile of
library support for handling private/encrypted/protected memory
- Fix benign bugs where tests neglect to close() guest_memfd files"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (246 commits)
selftests: kvm: remove meaningless assignments in Makefiles
KVM: riscv: selftests: Add Zacas extension to get-reg-list test
RISC-V: KVM: Allow Zacas extension for Guest/VM
KVM: riscv: selftests: Add Ztso extension to get-reg-list test
RISC-V: KVM: Allow Ztso extension for Guest/VM
RISC-V: KVM: Forward SEED CSR access to user space
KVM: riscv: selftests: Add sstc timer test
KVM: riscv: selftests: Change vcpu_has_ext to a common function
KVM: riscv: selftests: Add guest helper to get vcpu id
KVM: riscv: selftests: Add exception handling support
LoongArch: KVM: Remove unnecessary CSR register saving during enter guest
LoongArch: KVM: Do not restart SW timer when it is expired
LoongArch: KVM: Start SW timer only when vcpu is blocking
LoongArch: KVM: Set reserved bits as zero in CPUCFG
KVM: selftests: Explicitly close guest_memfd files in some gmem tests
KVM: x86/xen: fix recursive deadlock in timer injection
KVM: pfncache: simplify locking and make more self-contained
KVM: x86/xen: remove WARN_ON_ONCE() with false positives in evtchn delivery
KVM: x86/xen: inject vCPU upcall vector when local APIC is enabled
KVM: x86/xen: improve accuracy of Xen timers
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux
Pull MTD updates from Miquel Raynal:
"MTD:
- The Carillo Ranch driver has been removed
- Top level mtd bindings have received a couple of improvements
(references, selects)
- The ssfdc driver received few minor adjustments
- The usual load of misc/small improvements and fixes
Raw NAND:
- The main series brought is an update of the Broadcom support to
support all BCMBCA SoCs and their specificity (ECC, write
protection, configuration straps), plus a few misc fixes and
changes in the main driver. Device tree updates are also part of
this PR, initially because of a misunderstanding on my side.
- The STM32_FMC2 controller driver is also upgraded to properly
support MP1 and MP25 SoCs.
- A new compatible is added for an Atmel flavor.
- Among all these feature changes, there is as well a load of
continuous read related fixes, avoiding more corner conditions and
clarifying the logic. Finally a few miscellaneous fixes are made to
the core, the lpx32xx_mlc, fsl_lbc, Meson and Atmel controller
driver, as well as final one in the Hynix vendor driver.
SPI-NAND:
- The ESMT support has been extended to match 5 bytes ID to avoid
collisions. Winbond support on its side receives support for
W25N04KV chips.
SPI NOR:
- SPI NOR gets the non uniform erase code cleaned. We stopped using
bitmasks for erase types and flags, and instead introduced
dedicated members. We then passed the SPI NOR erase map to MTD.
Users can now determine the erase regions and make informed
decisions on partitions size.
- An optional interrupt property is now described in the bindings"
* tag 'mtd/for-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: (50 commits)
mtd: rawnand: Ensure continuous reads are well disabled
mtd: rawnand: Constrain even more when continuous reads are enabled
mtd: rawnand: brcmnand: Add support for getting ecc setting from strap
mtd: rawnand: brcmnand: fix sparse warnings
mtd: nand: raw: atmel: Fix comment in timings preparation
mtd: rawnand: Ensure all continuous terms are always in sync
mtd: rawnand: Add a helper for calculating a page index
mtd: rawnand: Fix and simplify again the continuous read derivations
mtd: rawnand: hynix: remove @nand_technology kernel-doc description
dt-bindings: atmel-nand: add microchip,sam9x7-pmecc
mtd: rawnand: brcmnand: Support write protection setting from dts
mtd: rawnand: brcmnand: Add BCMBCA read data bus interface
mtd: rawnand: brcmnand: Rename bcm63138 nand driver
arm64: dts: broadcom: bcmbca: Update router boards
arm64: dts: broadcom: bcmbca: Add NAND controller node
ARM: dts: broadcom: bcmbca: Add NAND controller node
mtd: spi-nor: core: correct type of i
mtd: spi-nor: core: set mtd->eraseregions for non-uniform erase map
mtd: spi-nor: core: get rid of SNOR_OVERLAID_REGION flag
mtd: spi-nor: core: get rid of SNOR_LAST_REGION flag
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
- Kuan-Wei Chiu has developed the well-named series "lib min_heap: Min
heap optimizations".
- Kuan-Wei Chiu has also sped up the library sorting code in the series
"lib/sort: Optimize the number of swaps and comparisons".
- Alexey Gladkov has added the ability for code running within an IPC
namespace to alter its IPC and MQ limits. The series is "Allow to
change ipc/mq sysctls inside ipc namespace".
- Geert Uytterhoeven has contributed some dhrystone maintenance work in
the series "lib: dhry: miscellaneous cleanups".
- Ryusuke Konishi continues nilfs2 maintenance work in the series
"nilfs2: eliminate kmap and kmap_atomic calls"
"nilfs2: fix kernel bug at submit_bh_wbc()"
- Nathan Chancellor has updated our build tools requirements in the
series "Bump the minimum supported version of LLVM to 13.0.1".
- Muhammad Usama Anjum continues with the selftests maintenance work in
the series "selftests/mm: Improve run_vmtests.sh".
- Oleg Nesterov has done some maintenance work against the signal code
in the series "get_signal: minor cleanups and fix".
Plus the usual shower of singleton patches in various parts of the tree.
Please see the individual changelogs for details.
* tag 'mm-nonmm-stable-2024-03-14-09-36' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (77 commits)
nilfs2: prevent kernel bug at submit_bh_wbc()
nilfs2: fix failure to detect DAT corruption in btree and direct mappings
ocfs2: enable ocfs2_listxattr for special files
ocfs2: remove SLAB_MEM_SPREAD flag usage
assoc_array: fix the return value in assoc_array_insert_mid_shortcut()
buildid: use kmap_local_page()
watchdog/core: remove sysctl handlers from public header
nilfs2: use div64_ul() instead of do_div()
mul_u64_u64_div_u64: increase precision by conditionally swapping a and b
kexec: copy only happens before uchunk goes to zero
get_signal: don't initialize ksig->info if SIGNAL_GROUP_EXIT/group_exec_task
get_signal: hide_si_addr_tag_bits: fix the usage of uninitialized ksig
get_signal: don't abuse ksig->info.si_signo and ksig->sig
const_structs.checkpatch: add device_type
Normalise "name (ad@dr)" MODULE_AUTHORs to "name <ad@dr>"
dyndbg: replace kstrdup() + strchr() with kstrdup_and_replace()
list: leverage list_is_head() for list_entry_is_head()
nilfs2: MAINTAINERS: drop unreachable project mirror site
smp: make __smp_processor_id() 0-argument macro
fat: fix uninitialized field in nostale filehandles
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- Sumanth Korikkar has taught s390 to allocate hotplug-time page frames
from hotplugged memory rather than only from main memory. Series
"implement "memmap on memory" feature on s390".
- More folio conversions from Matthew Wilcox in the series
"Convert memcontrol charge moving to use folios"
"mm: convert mm counter to take a folio"
- Chengming Zhou has optimized zswap's rbtree locking, providing
significant reductions in system time and modest but measurable
reductions in overall runtimes. The series is "mm/zswap: optimize the
scalability of zswap rb-tree".
- Chengming Zhou has also provided the series "mm/zswap: optimize zswap
lru list" which provides measurable runtime benefits in some
swap-intensive situations.
- And Chengming Zhou further optimizes zswap in the series "mm/zswap:
optimize for dynamic zswap_pools". Measured improvements are modest.
- zswap cleanups and simplifications from Yosry Ahmed in the series
"mm: zswap: simplify zswap_swapoff()".
- In the series "Add DAX ABI for memmap_on_memory", Vishal Verma has
contributed several DAX cleanups as well as adding a sysfs tunable to
control the memmap_on_memory setting when the dax device is
hotplugged as system memory.
- Johannes Weiner has added the large series "mm: zswap: cleanups",
which does that.
- More DAMON work from SeongJae Park in the series
"mm/damon: make DAMON debugfs interface deprecation unignorable"
"selftests/damon: add more tests for core functionalities and corner cases"
"Docs/mm/damon: misc readability improvements"
"mm/damon: let DAMOS feeds and tame/auto-tune itself"
- In the series "mm/mempolicy: weighted interleave mempolicy and sysfs
extension" Rakie Kim has developed a new mempolicy interleaving
policy wherein we allocate memory across nodes in a weighted fashion
rather than uniformly. This is beneficial in heterogeneous memory
environments appearing with CXL.
- Christophe Leroy has contributed some cleanup and consolidation work
against the ARM pagetable dumping code in the series "mm: ptdump:
Refactor CONFIG_DEBUG_WX and check_wx_pages debugfs attribute".
- Luis Chamberlain has added some additional xarray selftesting in the
series "test_xarray: advanced API multi-index tests".
- Muhammad Usama Anjum has reworked the selftest code to make its
human-readable output conform to the TAP ("Test Anything Protocol")
format. Amongst other things, this opens up the use of third-party
tools to parse and process out selftesting results.
- Ryan Roberts has added fork()-time PTE batching of THP ptes in the
series "mm/memory: optimize fork() with PTE-mapped THP". Mainly
targeted at arm64, this significantly speeds up fork() when the
process has a large number of pte-mapped folios.
- David Hildenbrand also gets in on the THP pte batching game in his
series "mm/memory: optimize unmap/zap with PTE-mapped THP". It
implements batching during munmap() and other pte teardown
situations. The microbenchmark improvements are nice.
- And in the series "Transparent Contiguous PTEs for User Mappings"
Ryan Roberts further utilizes arm's pte's contiguous bit ("contpte
mappings"). Kernel build times on arm64 improved nicely. Ryan's
series "Address some contpte nits" provides some followup work.
- In the series "mm/hugetlb: Restore the reservation" Breno Leitao has
fixed an obscure hugetlb race which was causing unnecessary page
faults. He has also added a reproducer under the selftest code.
- In the series "selftests/mm: Output cleanups for the compaction
test", Mark Brown did what the title claims.
- Kinsey Ho has added the series "mm/mglru: code cleanup and
refactoring".
- Even more zswap material from Nhat Pham. The series "fix and extend
zswap kselftests" does as claimed.
- In the series "Introduce cpu_dcache_is_aliasing() to fix DAX
regression" Mathieu Desnoyers has cleaned up and fixed rather a mess
in our handling of DAX on archiecctures which have virtually aliasing
data caches. The arm architecture is the main beneficiary.
- Lokesh Gidra's series "per-vma locks in userfaultfd" provides
dramatic improvements in worst-case mmap_lock hold times during
certain userfaultfd operations.
- Some page_owner enhancements and maintenance work from Oscar Salvador
in his series
"page_owner: print stacks and their outstanding allocations"
"page_owner: Fixup and cleanup"
- Uladzislau Rezki has contributed some vmalloc scalability
improvements in his series "Mitigate a vmap lock contention". It
realizes a 12x improvement for a certain microbenchmark.
- Some kexec/crash cleanup work from Baoquan He in the series "Split
crash out from kexec and clean up related config items".
- Some zsmalloc maintenance work from Chengming Zhou in the series
"mm/zsmalloc: fix and optimize objects/page migration"
"mm/zsmalloc: some cleanup for get/set_zspage_mapping()"
- Zi Yan has taught the MM to perform compaction on folios larger than
order=0. This a step along the path to implementaton of the merging
of large anonymous folios. The series is named "Enable >0 order folio
memory compaction".
- Christoph Hellwig has done quite a lot of cleanup work in the
pagecache writeback code in his series "convert write_cache_pages()
to an iterator".
- Some modest hugetlb cleanups and speedups in Vishal Moola's series
"Handle hugetlb faults under the VMA lock".
- Zi Yan has changed the page splitting code so we can split huge pages
into sizes other than order-0 to better utilize large folios. The
series is named "Split a folio to any lower order folios".
- David Hildenbrand has contributed the series "mm: remove
total_mapcount()", a cleanup.
- Matthew Wilcox has sought to improve the performance of bulk memory
freeing in his series "Rearrange batched folio freeing".
- Gang Li's series "hugetlb: parallelize hugetlb page init on boot"
provides large improvements in bootup times on large machines which
are configured to use large numbers of hugetlb pages.
- Matthew Wilcox's series "PageFlags cleanups" does that.
- Qi Zheng's series "minor fixes and supplement for ptdesc" does that
also. S390 is affected.
- Cleanups to our pagemap utility functions from Peter Xu in his series
"mm/treewide: Replace pXd_large() with pXd_leaf()".
- Nico Pache has fixed a few things with our hugepage selftests in his
series "selftests/mm: Improve Hugepage Test Handling in MM
Selftests".
- Also, of course, many singleton patches to many things. Please see
the individual changelogs for details.
* tag 'mm-stable-2024-03-13-20-04' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (435 commits)
mm/zswap: remove the memcpy if acomp is not sleepable
crypto: introduce: acomp_is_async to expose if comp drivers might sleep
memtest: use {READ,WRITE}_ONCE in memory scanning
mm: prohibit the last subpage from reusing the entire large folio
mm: recover pud_leaf() definitions in nopmd case
selftests/mm: skip the hugetlb-madvise tests on unmet hugepage requirements
selftests/mm: skip uffd hugetlb tests with insufficient hugepages
selftests/mm: dont fail testsuite due to a lack of hugepages
mm/huge_memory: skip invalid debugfs new_order input for folio split
mm/huge_memory: check new folio order when split a folio
mm, vmscan: retry kswapd's priority loop with cache_trim_mode off on failure
mm: add an explicit smp_wmb() to UFFDIO_CONTINUE
mm: fix list corruption in put_pages_list
mm: remove folio from deferred split list before uncharging it
filemap: avoid unnecessary major faults in filemap_fault()
mm,page_owner: drop unnecessary check
mm,page_owner: check for null stack_record before bumping its refcount
mm: swap: fix race between free_swap_and_cache() and swapoff()
mm/treewide: align up pXd_leaf() retval across archs
mm/treewide: drop pXd_large()
...
|
|
Enable the nand controller and add WP pin connection property in actual
board dts as they are board level properties now that they are disabled
and moved out from SoC dtsi.
Also remove the unnecessary brcm,nand-has-wp property from AC5300 board.
This property is only needed for some old controller that this board
does not apply.
Signed-off-by: William Zhang <william.zhang@broadcom.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Acked-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/linux-mtd/20240223034758.13753-10-william.zhang@broadcom.com
|
|
Add support for Broadcom STB NAND controller in BCMBCA ARMv8 chip dts
files.
Signed-off-by: William Zhang <william.zhang@broadcom.com>
Reviewed-by: David Regan <dregan@broadcom.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Acked-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/linux-mtd/20240223034758.13753-9-william.zhang@broadcom.com
|