aboutsummaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)AuthorFilesLines
2022-11-30KVM: Shorten gfn_to_pfn_cache function namesMichal Luczaj2-19/+19
Formalize "gpc" as the acronym and use it in function names. No functional change intended. Suggested-by: Sean Christopherson <[email protected]> Signed-off-by: Michal Luczaj <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: David Woodhouse <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-11-30KVM: x86/xen: Add runstate tests for 32-bit mode and crossing page boundaryDavid Woodhouse1-0/+2
Torture test the cases where the runstate crosses a page boundary, and and especially the case where it's configured in 32-bit mode and doesn't, but then switching to 64-bit mode makes it go onto the second page. To simplify this, make the KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADJUST ioctl also update the guest runstate area. It already did so if the actual runstate changed, as a side-effect of kvm_xen_update_runstate(). So doing it in the plain adjustment case is making it more consistent, as well as giving us a nice way to trigger the update without actually running the vCPU again and changing the values. Signed-off-by: David Woodhouse <[email protected]> Reviewed-by: Paul Durrant <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-11-30KVM: x86/xen: Allow XEN_RUNSTATE_UPDATE flag behaviour to be configuredDavid Woodhouse3-14/+47
Closer inspection of the Xen code shows that we aren't supposed to be using the XEN_RUNSTATE_UPDATE flag unconditionally. It should be explicitly enabled by guests through the HYPERVISOR_vm_assist hypercall. If we randomly set the top bit of ->state_entry_time for a guest that hasn't asked for it and doesn't expect it, that could make the runtimes fail to add up and confuse the guest. Without the flag it's perfectly safe for a vCPU to read its own vcpu_runstate_info; just not for one vCPU to read *another's*. I briefly pondered adding a word for the whole set of VMASST_TYPE_* flags but the only one we care about for HVM guests is this, so it seemed a bit pointless. Signed-off-by: David Woodhouse <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-11-30KVM: x86/xen: Compatibility fixes for shared runstate areaDavid Woodhouse3-107/+270
The guest runstate area can be arbitrarily byte-aligned. In fact, even when a sane 32-bit guest aligns the overall structure nicely, the 64-bit fields in the structure end up being unaligned due to the fact that the 32-bit ABI only aligns them to 32 bits. So setting the ->state_entry_time field to something|XEN_RUNSTATE_UPDATE is buggy, because if it's unaligned then we can't update the whole field atomically; the low bytes might be observable before the _UPDATE bit is. Xen actually updates the *byte* containing that top bit, on its own. KVM should do the same. In addition, we cannot assume that the runstate area fits within a single page. One option might be to make the gfn_to_pfn cache cope with regions that cross a page — but getting a contiguous virtual kernel mapping of a discontiguous set of IOMEM pages is a distinctly non-trivial exercise, and it seems this is the *only* current use case for the GPC which would benefit from it. An earlier version of the runstate code did use a gfn_to_hva cache for this purpose, but it still had the single-page restriction because it used the uhva directly — because it needs to be able to do so atomically when the vCPU is being scheduled out, so it used pagefault_disable() around the accesses and didn't just use kvm_write_guest_cached() which has a fallback path. So... use a pair of GPCs for the first and potential second page covering the runstate area. We can get away with locking both at once because nothing else takes more than one GPC lock at a time so we can invent a trivial ordering rule. The common case where it's all in the same page is kept as a fast path, but in both cases, the actual guest structure (compat or not) is built up from the fields in @vx, following preset pointers to the state and times fields. The only difference is whether those pointers point to the kernel stack (in the split case) or to guest memory directly via the GPC. The fast path is also fixed to use a byte access for the XEN_RUNSTATE_UPDATE bit, then the only real difference is the dual memcpy. Finally, Xen also does write the runstate area immediately when it's configured. Flip the kvm_xen_update_runstate() and …_guest() functions and call the latter directly when the runstate area is set. This means that other ioctls which modify the runstate also write it immediately to the guest when they do so, which is also intended. Update the xen_shinfo_test to exercise the pathological case where the XEN_RUNSTATE_UPDATE flag in the top byte of the state_entry_time is actually in a different page to the rest of the 64-bit word. Signed-off-by: David Woodhouse <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-11-30Merge tag 'mvebu-dt-6.2-1' of ↵Arnd Bergmann26-71/+339
git://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu into soc/dt mvebu dt for 6.2 (part 1) Fix assigned-addresses for every PCIe Root Port Align LED node names with dtschema Add a new kirkwood based board: Zyxel NSA310S Fix compatible string for gpios for Armada 38x and 39x Add interrupts for watchdog on Armada XP Turris Omnia (Armada 385 based): - Add switch port 6 node - Add ethernet aliases Switch to using gpiod API in pm-board code * tag 'mvebu-dt-6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu: ARM: dts: armada-xp: add interrupts for watchdog ARM: dts: armada: align LED node names with dtschema ARM: mvebu: switch to using gpiod API in pm-board code ARM: dts: armada-39x: Fix compatible string for gpios ARM: dts: armada-38x: Fix compatible string for gpios ARM: dts: turris-omnia: Add switch port 6 node ARM: dts: turris-omnia: Add ethernet aliases ARM: dts: armada-39x: Fix assigned-addresses for every PCIe Root Port ARM: dts: armada-38x: Fix assigned-addresses for every PCIe Root Port ARM: dts: armada-375: Fix assigned-addresses for every PCIe Root Port ARM: dts: armada-xp: Fix assigned-addresses for every PCIe Root Port ARM: dts: armada-370: Fix assigned-addresses for every PCIe Root Port ARM: dts: dove: Fix assigned-addresses for every PCIe Root Port ARM: dts: kirkwood: Add Zyxel NSA310S board Link: https://lore.kernel.org/r/87cz979adf.fsf@BL-laptop Signed-off-by: Arnd Bergmann <[email protected]>
2022-11-30Merge tag 'mvebu-dt64-6.2-1' of ↵Arnd Bergmann7-0/+20
git://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu into soc/dt mvebu dt64 for 6.2 (part 1) Update cache properties for various Marvell SoCs Reserved memory for optee firmware Turris Mox (Armada 3720 based Socs) - Define slot-power-limit-milliwatt for PCIe - Add missing interrupt for RTC * tag 'mvebu-dt64-6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu: arm64: dts: marvell: add optee FW definitions arm64: dts: Update cache properties for marvell arm64: dts: armada-3720-turris-mox: Add missing interrupt for RTC arm64: dts: armada-3720-turris-mox: Define slot-power-limit-milliwatt for PCIe Link: https://lore.kernel.org/r/87fse39aer.fsf@BL-laptop Signed-off-by: Arnd Bergmann <[email protected]>
2022-11-30Merge tag 'at91-dt-6.2-3' of ↵Arnd Bergmann2-1/+9
https://git.kernel.org/pub/scm/linux/kernel/git/at91/linux into soc/dt AT91 DT for 6.2 #3 It contains: - proper power rail description for SDMMC devices available on SAMA7G5-EK - OTP controller has been added for LAN966X devices * tag 'at91-dt-6.2-3' of https://git.kernel.org/pub/scm/linux/kernel/git/at91/linux: ARM: dts: lan966x: Add otp support ARM: dts: at91: sama7g5ek: align power rails for sdmmc0/1 Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnd Bergmann <[email protected]>
2022-11-30Merge tag 'musb-for-v6.2-signed' of ↵Arnd Bergmann5-116/+69
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into soc/dt Devicetree related musb changes for omap3 for v6.2 Recent musb driver regressions eposed two issues for musb legacy probing. The changes to use device_set_of_node_from_dev() confuse the legacy interconnect code. And we now have to manually populate the musb core irq resources. The musb driver has a fix for these, but it's not a good long term solution. To fix the issue properly, let's just update musb to probe with ti-sysc interconnect driver with proper devicetree data. This allows dropping most of the musb driver workaround later on. And with these changes we have the omap2430 musb glue layer behaving the same way for all the SoCs using it. We need to patch the ti-sysc driver quirks, and add devicetree data to make things work. And we want to drop the legacy data too to avoid pointless warnings. As we have a musb driver workaround, these changes are not needed as fixes and can wait for the merge window. * tag 'musb-for-v6.2-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: OMAP2+: Drop legacy hwmod data for omap3 otg ARM: dts: Update omap3 musb to probe with ti-sysc bus: ti-sysc: Add otg quirk flags for omap3 musb Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnd Bergmann <[email protected]>
2022-11-30Merge tag 'omap-for-v6.2/dt-signed' of ↵Arnd Bergmann9-24/+33
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into soc/dt Devicetree fixes for omaps for v6.2 Two devicetree fixes for omaps. These fixes are not urgent and can wait for the merge window: - Fix up the node names and missing #pwm-cells property for ti,omap-dmtimer-pwm to avoid warnings when the the related yaml binding gets merged - Fix TDA998x port addressing * tag 'omap-for-v6.2/dt-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: dts: Unify pwm-omap-dmtimer node names ARM: dts: am335x: Fix TDA998x ports addressing ARM: dts: am335x-pcm-953: Define fixed regulators in root node Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnd Bergmann <[email protected]>
2022-11-30Merge tag 'qcom-arm64-for-6.2' of ↵Arnd Bergmann168-3668/+12393
https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into soc/dt Qualcomm ARM64 DTS updates for 6.2 This introduces support for SM4250, SM6115, SM6375 and SDM670 platforms and Sony Xperia 10 IV, Google Pixel 3a, OnePlus 3, OnePlus 3T, Google Pazquel and OnePlus Nord N100. A wide variety of updates to align with DeviceTree bindings across many/most platforms is introduced, and incorrectly styled comments are adjusted across the tree. Apps RSC is added to the cluster-idle power-domain across SM8150, SM8250, SM8350 and SM8450, to ensure sleep and wake votes are flushed as the last core is being powered down. Remoteproc firmware patches are aligned with agreed upon structure used in linux-firmware across Inforce 6560, Lenovo Miix 630, various Sony Xperia devices and Samsung Galaxy Book2 (although these are not available in linux-firmware today). On IPQ8074 CPU clocks are added, thermal zones are introduced and vqmmc supply is specified for the HK01 board. Alcatel OneTouch Idol 3 gains LED nodes and Samsung Galaxy A3U gained vibrator support. The application subsystem's IOMMU and the display subsystem is enabled for MSM8953. A new CPU frequency table is introduced for MSM8996Pro, to properly describe it separate of MSM8996. The GPU opp-table is extended as well. On SC7180 USB is marked as a wakeup source, USB gains required-opps to ensure that the core voltage rail is voted for as needed. The description of the fingerprint sensor in Trogdor is corrected. On SC7280 Wake-on-WLAN is introduced, and PHY parameters for the SNPS USB PHY is defined across SC7280. The memory map across Google Herobrine is adjusted, to regain unused memory on the WiFi SKUs. A LTE SKU of the Evoker board is introduced and the bard gains touchscreen. NVME support is disabled on Villager boards, as it's not used. PCIe support is introduced on SC8280XP, with NVMe, SDX55 (5G) and WiFi enabled on the Lenovo Thinkpad X13s and Compute Reference Device. ADCs and thermal zones are intrduced for the same. Lenovo Thinkpad X13s gains LID switch support. Fairphone FP3 gains touchscreen support. Support for Xiaomi Poco F1 variant with EBBG panel. The round-robin ADC is enabled across DB845c, OnePlus devices and Pocophone F1 devices. The displayport controller on SDM845 is introduced. SM6350 gains SDHCI support and on Sony Xperia 10 III sd-card, touchscreen and GPI DMA is enabled. Fairphone FP4 got SD-card support. UFS PHY register ranges are corrected across SM8150, SM8250, SM8350 and SM8450. Sony Xperia 1 II got NFC support and Sony Xperia 5 III got PMIC regulators defined and USB definition corrected, to enable USB3. The SDHCI controller is described for SM8450 and microSD support is enabled for the HDK and QRD devices. SM8450 also gains camera CCI interface and display clock controller. * tag 'qcom-arm64-for-6.2' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux: (261 commits) arm64: dts: qcom: sdm845-polaris: Don't duplicate DMA assignment arm64: dts: qcom: sm8350-sagami: Wire up USB regulators and fix USB3 arm64: dts: qcom: sm8350-sagami: Add most RPMh regulators arm64: dts: qcom: sc7280: Make herobrine-audio-rt5682 mic dtsi's match more arm64: dts: qcom: trim addresses to 8 digits arm64: dts: msm8998: unify PCIe clock order withMSM8996 arm64: dts: msm8998: add MSM8998 specific compatible arm64: dts: qcom: sc8280xp-x13s: enable WiFi controller arm64: dts: qcom: sc8280xp-x13s: enable modem arm64: dts: qcom: sc8280xp-x13s: enable NVMe SSD arm64: dts: qcom: sc8280xp-crd: enable WiFi controller arm64: dts: qcom: sc8280xp-crd: enable SDX55 modem arm64: dts: qcom: sc8280xp-crd: enable NVMe SSD arm64: dts: qcom: sc8280xp-crd: rename backlight and misc regulators arm64: dts: qcom: sa8295p-adp: enable PCIe arm64: dts: qcom: sc8280xp/sa8540p: add PCIe2-4 nodes arm64: dts: qcom: add sdm670 and pixel 3a device trees arm64: dts: qcom: sc7280: Add Google Herobrine WIFI SKU dts fragment arm64: dts: qcom: sc7280: Mark all Qualcomm reference boards as LTE arm64: dts: qcom: sm7225-fairphone-fp4: Enable SD card ... Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnd Bergmann <[email protected]>
2022-11-30powerpc/tlb: Add local flush for page given mm_struct and psizeBenjamin Gray4-0/+31
Adds a local TLB flush operation that works given an mm_struct, VA to flush, and page size representation. Most implementations mirror the surrounding code. The book3s/32/tlbflush.h implementation is left as a BUILD_BUG because it is more complicated and not required for anything as yet. This removes the need to create a vm_area_struct, which the temporary patching mm work does not need. Signed-off-by: Benjamin Gray <[email protected]> Reviewed-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-30powerpc/mm: Remove flush_all_mm, local_flush_all_mmBenjamin Gray2-37/+0
These functions were introduced for "cxl: Enable global TLBIs for cxl contexts" [1], which ended up using them for Radix only. They were never implemented on Hash (and creating an implementation appears to be difficult), so nothing can actually rely on them. They behave differently to the existing surrounding functions too, in that they actually need to do something on Hash. The other functions are primarily for use in generic code that expects their definitions, but Hash updates the TLB during PTE updates. After replacing the only usage with the Radix specific version, there are no more users of these functions, and given they are not implemented anyway it is safe to delete them. [1]: https://patchwork.ozlabs.org/project/linuxppc-dev/patch/[email protected]/ Signed-off-by: Benjamin Gray <[email protected]> Reviewed-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-30cxl: Use radix__flush_all_mm instead of generic flush_all_mmBenjamin Gray1-3/+3
The generic implementation of this function isn't really generic (Hash is not implemented). Unfortunately, the runtime warnings cannot be replaced with BUILD_BUG's, so it seems safer not to provide a stub in the first place. Signed-off-by: Benjamin Gray <[email protected]> Reviewed-by: Andrew Donnellan <[email protected]> Reviewed-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-30powerpc/mm: Remove empty hash__ functionsBenjamin Gray2-46/+9
The empty hash__* functions are unnecessary. The empty definitions were introduced when 64-bit Hash support was added, as the functions were still used in generic code. These empty definitions were prefixed with hash__ when Radix support was added, and new wrappers with the original names were added that selected the Radix or Hash version based on radix_enabled(). But the hash__ prefixed functions were not part of a public interface, so there is no need to include them for compatibility with anything. Generic code will use the non-prefixed wrappers, and Hash specific code will know that there is no point in calling them (or even worse, call them and expect them to do something). Signed-off-by: Benjamin Gray <[email protected]> Reviewed-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-30powerpc/code-patching: Use WARN_ON and fix check in poking_initBenjamin Gray1-8/+9
BUG_ON() when failing to initialise the code patching window is unnecessary, and use of BUG_ON is discouraged. We don't set poking_init_done in this case, so failure to init the boot CPU will result in a strict RWX error when a following patch_instruction uses raw_patch_instruction. If it only fails for later CPUs, they won't be onlined in the first place. The return value of cpuhp_setup_state() is also >= 0 on success, so check for < 0. Signed-off-by: Benjamin Gray <[email protected]> Reviewed-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-30powerpc: Allow clearing and restoring registers independent of saved ↵Jordan Niethe2-3/+37
breakpoint state For the coming temporary mm used for instruction patching, the breakpoint registers need to be cleared to prevent them from accidentally being triggered. As soon as the patching is done, the breakpoints will be restored. The breakpoint state is stored in the per-cpu variable current_brk[]. Add a suspend_breakpoints() function which will clear the breakpoint registers without touching the state in current_brk[]. Add a pair function restore_breakpoints() which will move the state in current_brk[] back to the registers. Signed-off-by: Jordan Niethe <[email protected]> Signed-off-by: Benjamin Gray <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-30powerpc/fsl-pci: Choose PCI host bridge with alias pci0 as the primaryPali Rohár1-0/+13
If there's no PCI host bridge with ISA then check for PCI host bridge with alias "pci0" (first PCI host bridge) and if it exists then choose it as the primary PCI host bridge. This makes choice of primary PCI host bridge more stable across boots and updates as the last fallback candidate for primary PCI host bridge (if there is no choice) is selected arbitrary. Signed-off-by: Pali Rohár <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-30powerpc: dts: turris1x.dts: Add channel labels for temperature sensorPali Rohár1-0/+14
Channel 0 of SA56004ED chip refers to internal SA56004ED chip sensor (chip itself is located on the board) and channel 1 of SA56004ED chip refers to external sensor which is connected to temperature diode of the P2020 CPU. Fixes: 54c15ec3b738 ("powerpc: dts: Add DTS file for CZ.NIC Turris 1.x routers") Signed-off-by: Pali Rohár <[email protected]> Reviewed-by: Marek Behún <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-30powerpc/book3e: remove #include <generated/utsrelease.h>Thomas Weißschuh1-1/+0
Commit 7ad4bd887d27 ("powerpc/book3e: get rid of #include <generated/compile.h>") removed the usage of the define UTS_RELEASE but forgot to drop the include. utsrelease.h is potentially generated on each build. By removing the unused include we can get rid of some spurious recompilations. Fixes: 7ad4bd887d27 ("powerpc/book3e: get rid of #include <generated/compile.h>") Signed-off-by: Thomas Weißschuh <[email protected]> Reviewed-by: Masahiro Yamada <[email protected]> Reviewed-by: Christophe Leroy <[email protected]> [mpe: Fix typo in change log and add more explanation] Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-30powerpc/ps3: mark ps3_system_bus_type staticChristoph Hellwig2-5/+1
ps3_system_bus_type is only used inside of system-bus.c, so remove the external declaration and the very outdated comment next to it. Signed-off-by: Christoph Hellwig <[email protected]> Acked-by: Geoff Levand <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-30Merge branch 'fixes' into nextMichael Ellerman20-81/+261
Merge our fixes branch to bring in some changes that are prerequisites for work in next.
2022-11-30Merge branch 'topic/ppc-kvm' into nextMichael Ellerman7-22/+26
Merge our KVM topic branch.
2022-11-30KVM: PPC: Book3E: Fix CONFIG_TRACE_IRQFLAGS supportNicholas Piggin3-9/+15
32-bit does not trace_irqs_off() to match the trace_irqs_on() call in kvmppc_fix_ee_before_entry(). This can lead to irqs being enabled twice in the trace, and the irqs-off region between guest exit and the host enabling local irqs again is not properly traced. 64-bit code does call this, but from asm code where volatiles are live and so incorrectly get clobbered. Move the irq reconcile into C to fix both problems. Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-29Merge patch series "riscv: kexec: Fxiup crash_save percpu and ↵Palmer Dabbelt3-13/+133
machine_kexec_mask_interrupts" [email protected] <[email protected]> says: From: Guo Ren <[email protected]> Current riscv kexec can't crash_save percpu states and disable interrupts properly. The patch series fix them, make kexec work correct. * b4-shazam-merge: riscv: kexec: Fixup crash_smp_send_stop without multi cores riscv: kexec: Fixup irq controller broken in kexec crash path Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Palmer Dabbelt <[email protected]>
2022-11-29riscv: kexec: Fixup crash_smp_send_stop without multi coresGuo Ren3-18/+103
Current crash_smp_send_stop is the same as the generic one in kernel/panic and misses crash_save_cpu in percpu. This patch is inspired by 78fd584cdec0 ("arm64: kdump: implement machine_crash_shutdown()") and adds the same mechanism for riscv. Before this patch, test result: crash> help -r CPU 0: [OFFLINE] CPU 1: epc : ffffffff80009ff0 ra : ffffffff800b789a sp : ff2000001098bb40 gp : ffffffff815fca60 tp : ff60000004680000 t0 : 6666666666663c5b t1 : 0000000000000000 t2 : 666666666666663c s0 : ff2000001098bc90 s1 : ffffffff81600798 a0 : ff2000001098bb48 a1 : 0000000000000000 a2 : 0000000000000000 a3 : 0000000000000001 a4 : 0000000000000000 a5 : ff60000004690800 a6 : 0000000000000000 a7 : 0000000000000000 s2 : ff2000001098bb48 s3 : ffffffff81093ec8 s4 : ffffffff816004ac s5 : 0000000000000000 s6 : 0000000000000007 s7 : ffffffff80e7f720 s8 : 00fffffffffff3f0 s9 : 0000000000000007 s10: 00aaaaaaaab98700 s11: 0000000000000001 t3 : ffffffff819a8097 t4 : ffffffff819a8097 t5 : ffffffff819a8098 t6 : ff2000001098b9a8 CPU 2: [OFFLINE] CPU 3: [OFFLINE] After this patch, test result: crash> help -r CPU 0: epc : ffffffff80003f34 ra : ffffffff808caa7c sp : ffffffff81403eb0 gp : ffffffff815fcb48 tp : ffffffff81413400 t0 : 0000000000000000 t1 : 0000000000000000 t2 : 0000000000000000 s0 : ffffffff81403ec0 s1 : 0000000000000000 a0 : 0000000000000000 a1 : 0000000000000000 a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000 a5 : 0000000000000000 a6 : 0000000000000000 a7 : 0000000000000000 s2 : ffffffff816001c8 s3 : ffffffff81600370 s4 : ffffffff80c32e18 s5 : ffffffff819d3018 s6 : ffffffff810e2110 s7 : 0000000000000000 s8 : 0000000000000000 s9 : 0000000080039eac s10: 0000000000000000 s11: 0000000000000000 t3 : 0000000000000000 t4 : 0000000000000000 t5 : 0000000000000000 t6 : 0000000000000000 CPU 1: epc : ffffffff80003f34 ra : ffffffff808caa7c sp : ff2000000068bf30 gp : ffffffff815fcb48 tp : ff6000000240d400 t0 : 0000000000000000 t1 : 0000000000000000 t2 : 0000000000000000 s0 : ff2000000068bf40 s1 : 0000000000000001 a0 : 0000000000000000 a1 : 0000000000000000 a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000 a5 : 0000000000000000 a6 : 0000000000000000 a7 : 0000000000000000 s2 : ffffffff816001c8 s3 : ffffffff81600370 s4 : ffffffff80c32e18 s5 : ffffffff819d3018 s6 : ffffffff810e2110 s7 : 0000000000000000 s8 : 0000000000000000 s9 : 0000000080039ea8 s10: 0000000000000000 s11: 0000000000000000 t3 : 0000000000000000 t4 : 0000000000000000 t5 : 0000000000000000 t6 : 0000000000000000 CPU 2: epc : ffffffff80003f34 ra : ffffffff808caa7c sp : ff20000000693f30 gp : ffffffff815fcb48 tp : ff6000000240e900 t0 : 0000000000000000 t1 : 0000000000000000 t2 : 0000000000000000 s0 : ff20000000693f40 s1 : 0000000000000002 a0 : 0000000000000000 a1 : 0000000000000000 a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000 a5 : 0000000000000000 a6 : 0000000000000000 a7 : 0000000000000000 s2 : ffffffff816001c8 s3 : ffffffff81600370 s4 : ffffffff80c32e18 s5 : ffffffff819d3018 s6 : ffffffff810e2110 s7 : 0000000000000000 s8 : 0000000000000000 s9 : 0000000080039eb0 s10: 0000000000000000 s11: 0000000000000000 t3 : 0000000000000000 t4 : 0000000000000000 t5 : 0000000000000000 t6 : 0000000000000000 CPU 3: epc : ffffffff8000a1e4 ra : ffffffff800b7bba sp : ff200000109bbb40 gp : ffffffff815fcb48 tp : ff6000000373aa00 t0 : 6666666666663c5b t1 : 0000000000000000 t2 : 666666666666663c s0 : ff200000109bbc90 s1 : ffffffff816007a0 a0 : ff200000109bbb48 a1 : 0000000000000000 a2 : 0000000000000000 a3 : 0000000000000001 a4 : 0000000000000000 a5 : ff60000002c61c00 a6 : 0000000000000000 a7 : 0000000000000000 s2 : ff200000109bbb48 s3 : ffffffff810941a8 s4 : ffffffff816004b4 s5 : 0000000000000000 s6 : 0000000000000007 s7 : ffffffff80e7f7a0 s8 : 00fffffffffff3f0 s9 : 0000000000000007 s10: 00aaaaaaaab98700 s11: 0000000000000001 t3 : ffffffff819a8097 t4 : ffffffff819a8097 t5 : ffffffff819a8098 t6 : ff200000109bb9a8 Fixes: ad943893d5f1 ("RISC-V: Fixup schedule out issue in machine_crash_shutdown()") Reviewed-by: Xianting Tian <[email protected]> Signed-off-by: Guo Ren <[email protected]> Signed-off-by: Guo Ren <[email protected]> Cc: Nick Kossifidis <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Palmer Dabbelt <[email protected]>
2022-11-29riscv: kexec: Fixup irq controller broken in kexec crash pathGuo Ren1-0/+35
If a crash happens on cpu3 and all interrupts are binding on cpu0, the bad irq routing will cause a crash kernel which can't receive any irq. Because crash kernel won't clean up all harts' PLIC enable bits in enable registers. This patch is similar to 9141a003a491 ("ARM: 7316/1: kexec: EOI active and mask all interrupts in kexec crash path") and 78fd584cdec0 ("arm64: kdump: implement machine_crash_shutdown()"), and PowerPC also has the same mechanism. Fixes: fba8a8674f68 ("RISC-V: Add kexec support") Signed-off-by: Guo Ren <[email protected]> Signed-off-by: Guo Ren <[email protected]> Reviewed-by: Xianting Tian <[email protected]> Cc: Nick Kossifidis <[email protected]> Cc: Palmer Dabbelt <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Palmer Dabbelt <[email protected]>
2022-11-29riscv: mm: Proper page permissions after initmem freeBjörn Töpel1-4/+5
64-bit RISC-V kernels have the kernel image mapped separately to alias the linear map. The linear map and the kernel image map are documented as "direct mapping" and "kernel" respectively in [1]. At image load time, the linear map corresponding to the kernel image is set to PAGE_READ permission, and the kernel image map is set to PAGE_READ|PAGE_EXEC. When the initmem is freed, the pages in the linear map should be restored to PAGE_READ|PAGE_WRITE, whereas the corresponding pages in the kernel image map should be restored to PAGE_READ, by removing the PAGE_EXEC permission. This is not the case. For 64-bit kernels, only the linear map is restored to its proper page permissions at initmem free, and not the kernel image map. In practise this results in that the kernel can potentially jump to dead __init code, and start executing invalid instructions, without getting an exception. Restore the freed initmem properly, by setting both the kernel image map to the correct permissions. [1] Documentation/riscv/vm-layout.rst Fixes: e5c35fa04019 ("riscv: Map the kernel with correct permissions the first time") Signed-off-by: Björn Töpel <[email protected]> Reviewed-by: Alexandre Ghiti <[email protected]> Tested-by: Alexandre Ghiti <[email protected]> Link: https://lore.kernel.org/r/[email protected] Cc: [email protected] Signed-off-by: Palmer Dabbelt <[email protected]>
2022-11-29riscv: vdso: fix section overlapping under some conditionsJisheng Zhang1-0/+1
lkp reported a build error, I tried the config and can reproduce build error as below: VDSOLD arch/riscv/kernel/vdso/vdso.so.dbg ld.lld: error: section .note file range overlaps with .text >>> .note range is [0x7C8, 0x803] >>> .text range is [0x800, 0x1993] ld.lld: error: section .text file range overlaps with .dynamic >>> .text range is [0x800, 0x1993] >>> .dynamic range is [0x808, 0x937] ld.lld: error: section .note virtual address range overlaps with .text >>> .note range is [0x7C8, 0x803] >>> .text range is [0x800, 0x1993] Fix it by setting DISABLE_BRANCH_PROFILING which will disable branch tracing for vdso, thus avoid useless _ftrace_annotated_branch section and _ftrace_branch section. Although we can also fix it by removing the hardcoded .text begin address, but I think that's another story and should be put into another patch. Link: https://lore.kernel.org/lkml/[email protected]/#r Reported-by: kernel test robot <[email protected]> Signed-off-by: Jisheng Zhang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Fixes: ad5d1122b82f ("riscv: use vDSO common flow to reduce the latency of the time-related functions") Cc: [email protected] Signed-off-by: Palmer Dabbelt <[email protected]>
2022-11-29riscv: fix race when vmap stack overflowJisheng Zhang3-0/+32
Currently, when detecting vmap stack overflow, riscv firstly switches to the so called shadow stack, then use this shadow stack to call the get_overflow_stack() to get the overflow stack. However, there's a race here if two or more harts use the same shadow stack at the same time. To solve this race, we introduce spin_shadow_stack atomic var, which will be swap between its own address and 0 in atomic way, when the var is set, it means the shadow_stack is being used; when the var is cleared, it means the shadow_stack isn't being used. Fixes: 31da94c25aea ("riscv: add VMAP_STACK overflow detection") Signed-off-by: Jisheng Zhang <[email protected]> Suggested-by: Guo Ren <[email protected]> Reviewed-by: Guo Ren <[email protected]> Link: https://lore.kernel.org/r/[email protected] [Palmer: Add AQ to the swap, and also some comments.] Signed-off-by: Palmer Dabbelt <[email protected]>
2022-11-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski99-331/+475
tools/lib/bpf/ringbuf.c 927cbb478adf ("libbpf: Handle size overflow for ringbuf mmap") b486d19a0ab0 ("libbpf: checkpatch: Fixed code alignments in ringbuf.c") https://lore.kernel.org/all/[email protected]/ Signed-off-by: Jakub Kicinski <[email protected]>
2022-11-29Merge tag 'v6.2-rockchip-dts64-1' of ↵Arnd Bergmann29-719/+4101
git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into asahi-wip New boards: - Model A and blade baseboards for the SOQuartz (rk3568) SoM, - Anberic RG351M, RG353V, RG353VS; Odroid Go Super, Advance gaming devices - Odroid M1 - Theobroma px30 SoM with baseboard - Rockchip's own rk3566 demo board Some core support for per SoC specifics: - crypto support for rk3399 and rk3328 - second I2S controller for rk3568 - Cache properties for follow the binding for rk3308 and rk3328 Bigger device support updates for: - SOQuartz: PCIe2, video output, gpu, HDMI sound - Rock 3A: eth regulator, eth clock input, Wifi+Bt, I2S, PCIe3 As well as some minor extensions for Rock960 (hdmi supplies), rk3566-roc-pc (PCIe2), Rock 4C+ (thermal support), Pinephone Pro (Wifi+Bt) * tag 'v6.2-rockchip-dts64-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip: (51 commits) arm64: dts: rockchip: update cache properties for rk3308 and rk3328 arm64: dts: rockchip: Add SOQuartz Model A baseboard dt-bindings: arm: rockchip: Add SOQuartz Model A arm64: dts: rockchip: Add SOQuartz blade board dt-bindings: arm: rockchip: Add SOQuartz Blade arm64: dts: rockchip: Add Anbernic RG351M arm64: dts: rockchip: Add Odroid Go Super arm64: dts: rockchip: Add Odroid Go Advance Black Edition dt-bindings: arm: rockchip: Add more RK3326 devices arm64: dts: rockchip: Move most of Odroid Go Advance DTS into a DTSI arm64: dts: rockchip: Add support of regulator for ethernet node on Rock 3A SBC arm64: dts: rockchip: Add support of external clock to ethernet node on Rock 3A SBC arm64: dts: rockchip: Add HDMI supplies on Rock960 arm64: dts: rockchip: Add dts for rockchip rk3566 box demo board dt-bindings: rockchip: Add Rockchip rk3566 box demo board arm64: dts: rockchip: Enable PCIe 2 on SOQuartz CM4IO arm64: dts: rockchip: Enable HDMI sound on SOQuartz arm64: dts: rockchip: Enable video output and HDMI on SOQuartz arm64: dts: rockchip: Enable GPU on SOQuartz CM4 arm64: dts: rockchip: enable pcie2 on rk3566-roc-pc ... Link: https://lore.kernel.org/r/4716610.aeNJFYEL58@phil Signed-off-by: Arnd Bergmann <[email protected]>
2022-11-29Merge tag 'renesas-arm-dt-for-v6.2-tag3' of ↵Arnd Bergmann2-0/+0
git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel into soc/dt Renesas ARM DT updates for v6.2 (take three) - Rename Renesas DTB overlay source files from .dts to .dtso. * tag 'renesas-arm-dt-for-v6.2-tag3' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel: arm64: dts: renesas: Rename DTB overlay source files from .dts to .dtso Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnd Bergmann <[email protected]>
2022-11-29RISC-V: enable sparsemem by default for defconfigConor Dooley1-0/+1
on an arch level, RISC-V defaults to FLATMEM. On PolarFire SoC, the memory layout is almost always sparse, with a maximum of 1 GiB at 0x8000_0000 & a possible 16 GiB range at 0x10_0000_0000. The Icicle kit, for example, has 2 GiB of DDR - so there's a big hole in the memory map between the two gigs. Prior to v6.1-rc1, boot times from defconfig builds were pretty bad on Icicle but enabling sparsemem would fix those issues. As of v6.1-rc1, the Icicle kit no longer boots from defconfig builds with the in-kernel devicetree. A change to the memory map resulted in a futher "sparse-ification", producing a splat on boot: OF: fdt: Ignoring memory range 0x80000000 - 0x80200000 Machine model: Microchip PolarFire-SoC Icicle Kit earlycon: ns16550a0 at MMIO32 0x0000000020100000 (options '115200n8') printk: bootconsole [ns16550a0] enabled printk: debug: skip boot console de-registration. efi: UEFI not found. Zone ranges: DMA32 [mem 0x0000000080200000-0x00000000ffffffff] Normal [mem 0x0000000100000000-0x000000107fffffff] Movable zone start for each node Early memory node ranges node 0: [mem 0x0000000080200000-0x00000000bfbfffff] node 0: [mem 0x00000000bfc00000-0x00000000bfffffff] node 0: [mem 0x0000001040000000-0x000000107fffffff] Initmem setup node 0 [mem 0x0000000080200000-0x000000107fffffff] Kernel panic - not syncing: Failed to allocate 1073741824 bytes for node 0 memory map CPU: 0 PID: 0 Comm: swapper Not tainted 5.19.0-dirty #1 Hardware name: Microchip PolarFire-SoC Icicle Kit (DT) Call Trace: [<ffffffff800057f0>] show_stack+0x30/0x3c [<ffffffff807d5802>] dump_stack_lvl+0x4a/0x66 [<ffffffff807d5836>] dump_stack+0x18/0x20 [<ffffffff807d1ae8>] panic+0x124/0x2c6 [<ffffffff80814064>] free_area_init_core+0x0/0x11e [<ffffffff80813720>] free_area_init_node+0xc2/0xf6 [<ffffffff8081331e>] free_area_init+0x222/0x260 [<ffffffff808064d6>] misc_mem_init+0x62/0x9a [<ffffffff80803cb2>] setup_arch+0xb0/0xea [<ffffffff8080039a>] start_kernel+0x88/0x4ee ---[ end Kernel panic - not syncing: Failed to allocate 1073741824 bytes for node 0 memory map ]--- With the aim of keeping defconfig builds booting on icicle, enable SPARSEMEM_MANUAL. Signed-off-by: Conor Dooley <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Palmer Dabbelt <[email protected]>
2022-11-29x86/cpuid: Carve out all CPUID functionalityBorislav Petkov2-134/+140
Carve it out into a special header, where it belongs. No functional changes. Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-29x86/hyperv: Remove unregister syscore call from Hyper-V cleanupGaurav Kohli1-2/+0
Hyper-V cleanup code comes under panic path where preemption and irq is already disabled. So calling of unregister_syscore_ops might schedule out the thread even for the case where mutex lock is free. hyperv_cleanup unregister_syscore_ops mutex_lock(&syscore_ops_lock) might_sleep Here might_sleep might schedule out this thread, where voluntary preemption config is on and this thread will never comes back. And also this was added earlier to maintain the symmetry which is not required as this can comes during crash shutdown path only. To prevent the same, removing unregister_syscore_ops function call. Signed-off-by: Gaurav Kohli <[email protected]> Reviewed-by: Michael Kelley <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Wei Liu <[email protected]>
2022-11-29ARM: dts: socfpga: align LED node names with dtschemaKrzysztof Kozlowski3-9/+9
The node names should be generic and DT schema expects certain pattern: socfpga_arria5_socdk.dtb: leds: 'hps0', 'hps1', 'hps2', 'hps3' do not match any of the regexes: '(^led-[0-9a-f]$|led)', 'pinctrl-[0-9]+' Signed-off-by: Krzysztof Kozlowski <[email protected]> Signed-off-by: Dinh Nguyen <[email protected]>
2022-11-29arm64: dts: altera: align LED node names with dtschemaKrzysztof Kozlowski2-6/+6
The node names should be generic and DT schema expects certain pattern: altera/socfpga_stratix10_socdk.dtb: leds: 'hps0', 'hps1', 'hps2' do not match any of the regexes: '(^led-[0-9a-f]$|led)', 'pinctrl-[0-9]+' Signed-off-by: Krzysztof Kozlowski <[email protected]> Signed-off-by: Dinh Nguyen <[email protected]>
2022-11-29x86/boot: Remove x86_32 PIC using %ebx workaroundUros Bizjak1-12/+3
The currently supported minimum gcc version is 5.1. Before that, the PIC register, when generating Position Independent Code, was considered "fixed" in the sense that it wasn't in the set of registers available to the compiler's register allocator. Which, on x86-32, is already a very small set. What is more, the register allocator was unable to satisfy extended asm "=b" constraints. (Yes, PIC code uses %ebx on 32-bit as the base reg.) With gcc 5.1: "Reuse of the PIC hard register, instead of using a fixed register, was implemented on x86/x86-64 targets. This improves generated PIC code performance as more hard registers can be used. Shared libraries can significantly benefit from this optimization. Currently it is switched on only for x86/x86-64 targets. As RA infrastructure is already implemented for PIC register reuse, other targets might follow this in the future." (from: https://gcc.gnu.org/gcc-5/changes.html) which basically means that the register allocator has a higher degree of freedom when handling %ebx, including reloading it with the correct value before a PIC access. Furthermore: arch/x86/Makefile: # Never want PIC in a 32-bit kernel, prevent breakage with GCC built # with nonstandard options KBUILD_CFLAGS += -fno-pic $ gcc -Wp,-MMD,arch/x86/boot/.cpuflags.o.d ... -fno-pic ... -D__KBUILD_MODNAME=kmod_cpuflags -c -o arch/x86/boot/cpuflags.o arch/x86/boot/cpuflags.c so the 32-bit workaround in cpuid_count() is fixing exactly nothing because 32-bit configs don't even allow PIC builds. As to 64-bit builds: they're done using -mcmodel=kernel which produces RIP-relative addressing for PIC builds and thus does not apply here either. So get rid of the thing and make cpuid_count() nice and simple. There should be no functional changes resulting from this. [ bp: Expand commit message. ] Signed-off-by: Uros Bizjak <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-29arm64/fp: Use a struct to pass data to fpsimd_bind_state_to_cpu()Mark Brown3-43/+32
For reasons that are unclear to this reader fpsimd_bind_state_to_cpu() populates the struct fpsimd_last_state_struct that it uses to store the active floating point state for KVM guests by passing an argument for each member of the structure. As the richness of the architecture increases this is resulting in a function with a rather large number of arguments which isn't ideal. Simplify the interface by using the struct directly as the single argument for the function, renaming it as we lift the definition into the header. This could be built on further to reduce the work we do adding storage for new FP state in various places but for now it just simplifies this one interface. Signed-off-by: Mark Brown <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Reviewed-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2022-11-29arm64/sve: Leave SVE enabled on syscall if we don't context switchMark Brown2-15/+12
The syscall ABI says that the SVE register state not shared with FPSIMD may not be preserved on syscall, and this is the only mechanism we have in the ABI to stop tracking the extra SVE state for a process. Currently we do this unconditionally by means of disabling SVE for the process on syscall, causing userspace to take a trap to EL1 if it uses SVE again. These extra traps result in a noticeable overhead for using SVE instead of FPSIMD in some workloads, especially for simple syscalls where we can return directly to userspace and would not otherwise need to update the floating point registers. Tests with fp-pidbench show an approximately 70% overhead on a range of implementations when SVE is in use - while this is an extreme and entirely artificial benchmark it is clear that there is some useful room for improvement here. Now that we have the ability to track the decision about what to save seprately to TIF_SVE we can improve things by leaving TIF_SVE enabled on syscall but only saving the FPSIMD registers if we are in a syscall. This means that if we need to restore the register state from memory (eg, after a context switch or kernel mode NEON) we will drop TIF_SVE and reenable traps for userspace but if we can just return to userspace then traps will remain disabled. Since our current implementation and hence ABI has the effect of zeroing all the SVE register state not shared with FPSIMD on syscall we replace the disabling of TIF_SVE with a flush of the non-shared register state, this means that there is still some overhead for syscalls when SVE is in use but it is very much reduced. Signed-off-by: Mark Brown <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Reviewed-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2022-11-29arm64/fpsimd: SME no longer requires SVE register stateMark Brown2-4/+1
Now that we track the type of the stored register state separately to what is active in the task, it is valid to have the FPSIMD register state stored while in streaming mode. Remove the special case handling for SME when setting FPSIMD register state. Signed-off-by: Mark Brown <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Reviewed-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2022-11-29arm64/fpsimd: Load FP state based on recorded data typeMark Brown1-8/+32
Now that we are recording the type of floating point register state we are saving when we write the register state out to memory we can use that information when we load from memory to decide which format to load, bringing TIF_SVE into line with what we saved rather than relying on TIF_SVE to determine what to load. The SME state details are already recorded directly in the saved SVCR and handled based on the information there. Since we are not changing any of the save paths there should be no functional change from this patch, further patches will make use of this to optimise and clarify the code. Signed-off-by: Mark Brown <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Reviewed-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2022-11-29arm64/fpsimd: Stop using TIF_SVE to manage register saving in KVMMark Brown2-21/+4
Now that we are explicitly telling the host FP code which register state it needs to save we can remove the manipulation of TIF_SVE from the KVM code, simplifying it and allowing us to optimise our handling of normal tasks. Remove the manipulation of TIF_SVE from KVM and instead rely on to_save to ensure we save the correct data for it. There should be no functional or performance impact from this change. Signed-off-by: Mark Brown <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Reviewed-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2022-11-29arm64/fpsimd: Have KVM explicitly say which FP registers to saveMark Brown4-5/+35
In order to avoid needlessly saving and restoring the guest registers KVM relies on the host FPSMID code to save the guest registers when we context switch away from the guest. This is done by binding the KVM guest state to the CPU on top of the task state that was originally there, then carefully managing the TIF_SVE flag for the task to cause the host to save the full SVE state when needed regardless of the needs of the host task. This works well enough but isn't terribly direct about what is going on and makes it much more complicated to try to optimise what we're doing with the SVE register state. Let's instead have KVM pass in the register state it wants saving when it binds to the CPU. We introduce a new FP_STATE_CURRENT for use during normal task binding to indicate that we should base our decisions on the current task. This should not be used when actually saving. Ideally we might want to use a separate enum for the type to save but this enum and the enum values would then need to be named which has problems with clarity and ambiguity. In order to ease any future debugging that might be required this patch does not actually update any of the decision making about what to save, it merely starts tracking the new information and warns if the requested state is not what we would otherwise have decided to save. Signed-off-by: Mark Brown <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Reviewed-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2022-11-29arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVEMark Brown8-19/+74
When we save the state for the floating point registers this can be done in the form visible through either the FPSIMD V registers or the SVE Z and P registers. At present we track which format is currently used based on TIF_SVE and the SME streaming mode state but particularly in the SVE case this limits our options for optimising things, especially around syscalls. Introduce a new enum which we place together with saved floating point state in both thread_struct and the KVM guest state which explicitly states which format is active and keep it up to date when we change it. At present we do not use this state except to verify that it has the expected value when loading the state, future patches will introduce functional changes. Signed-off-by: Mark Brown <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Reviewed-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2022-11-29KVM: arm64: Discard any SVE state when entering KVM guestsMark Brown3-1/+26
Since 8383741ab2e773a99 (KVM: arm64: Get rid of host SVE tracking/saving) KVM has not tracked the host SVE state, relying on the fact that we currently disable SVE whenever we perform a syscall. This may not be true in future since performance optimisation may result in us keeping SVE enabled in order to avoid needing to take access traps to reenable it. Handle this by clearing TIF_SVE and converting the stored task state to FPSIMD format when preparing to run the guest. This is done with a new call fpsimd_kvm_prepare() to keep the direct state manipulation functions internal to fpsimd.c. Signed-off-by: Mark Brown <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Reviewed-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2022-11-29Merge tag 'at91-fixes-6.1-3' of ↵Arnd Bergmann1-1/+1
https://git.kernel.org/pub/scm/linux/kernel/git/at91/linux into arm/fixes AT91 fixes for 6.1 #3 It contains: - build fix for SAMA5D3 devices which don't have an L2 cache and due to this accesssing outer_cache.write_sec in sama5_secure_cache_init() could throw undefined reference to `outer_cache' if CONFIG_OUTER_CACHE is disabled from common sama5_defconfig. * tag 'at91-fixes-6.1-3' of https://git.kernel.org/pub/scm/linux/kernel/git/at91/linux: ARM: at91: fix build for SAMA5D3 w/o L2 cache Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnd Bergmann <[email protected]>
2022-11-29arm64/perf: Replace PMU version number '0' with ID_AA64DFR0_EL1_PMUVer_NIAnshuman Khandual1-1/+2
__armv8pmu_probe_pmu() returns if detected PMU is either not implemented or implementation defined. Extracted ID_AA64DFR0_EL1_PMUVer value, when PMU is not implemented is '0' which can be replaced with ID_AA64DFR0_EL1_PMUVer_NI defined as '0b0000'. Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Will Deacon <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Signed-off-by: Anshuman Khandual <[email protected]> Acked-by: Mark Rutland <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2022-11-29ARM: 9277/1: Make the dumped instructions are consistent with the ↵Zhen Lei1-1/+3
disassembled ones In ARM, the mapping of instruction memory is always little-endian, except some BE-32 supported ARM architectures. Such as ARMv7-R, its instruction endianness may be BE-32. Of course, its data endianness will also be BE-32 mode. Due to two negatives make a positive, the instruction stored in the register after reading is in little-endian format. But for the case of BE-8, the instruction endianness is LE, the instruction stored in the register after reading is in big-endian format, which is inconsistent with the disassembled one. For example: The content of disassembly: c0429ee8: e3500000 cmp r0, #0 c0429eec: 159f2044 ldrne r2, [pc, #68] c0429ef0: 108f2002 addne r2, pc, r2 c0429ef4: 1882000a stmne r2, {r1, r3} c0429ef8: e7f000f0 udf #0 The output of undefined instruction exception: Internal error: Oops - undefined instruction: 0 [#1] SMP ARM ... ... Code: 000050e3 44209f15 02208f10 0a008218 (f000f0e7) This inconveniences the checking of instructions. What's worse is that, for somebody who don't know about this, might think the instructions are all broken. So, when CONFIG_CPU_ENDIAN_BE8=y, let's convert the instructions to little-endian format before they are printed. The conversion result is as follows: Code: e3500000 159f2044 108f2002 1882000a (e7f000f0) Signed-off-by: Zhen Lei <[email protected]> Signed-off-by: Russell King (Oracle) <[email protected]>
2022-11-29KVM: arm64: permit all VM_MTE_ALLOWED mappings with MTE enabledPeter Collingbourne1-8/+0
Certain VMMs such as crosvm have features (e.g. sandboxing) that depend on being able to map guest memory as MAP_SHARED. The current restriction on sharing MAP_SHARED pages with the guest is preventing the use of those features with MTE. Now that the races between tasks concurrently clearing tags on the same page have been fixed, remove this restriction. Note that this is a relaxation of the ABI. Signed-off-by: Peter Collingbourne <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Reviewed-by: Steven Price <[email protected]> Reviewed-by: Cornelia Huck <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]