aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2021-11-18tools headers UAPI: Sync x86's asm/kvm.h with the kernel sourcesArnaldo Carvalho de Melo1-0/+4
To pick the changes in: 828ca89628bfcb1b ("KVM: x86: Expose TSC offset controls to userspace") That just rebuilds kvm-stat.c on x86, no change in functionality. This silences these perf build warning: Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/kvm.h' differs from latest version at 'arch/x86/include/uapi/asm/kvm.h' diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h Cc: Oliver Upton <[email protected]> Cc: Paolo Bonzini <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-11-18perf sort: Fix the 'p_stage_cyc' sort key behaviorNamhyung Kim3-11/+7
andle 'p_stage_cyc' (for pipeline stage cycles) sort key with the same rationale as for the 'weight' and 'local_weight', see the fix in this series for a full explanation. Not sure it also needs the local and global variants. But I couldn't test it actually because I don't have the machine. Reviewed-by: Athira Jajeev <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Athira Jajeev <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Jajeev <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-11-18perf sort: Fix the 'ins_lat' sort key behaviorNamhyung Kim3-25/+12
Handle 'ins_lat' (for instruction latency) and 'local_ins_lat' sort keys with the same rationale as for the 'weight' and 'local_weight', see the previous fix in this series for a full explanation. But I couldn't test it actually, so only build tested. Reviewed-by: Athira Jajeev <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Athira Jajeev <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Jajeev <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-11-18perf sort: Fix the 'weight' sort key behaviorNamhyung Kim3-27/+13
Currently, the 'weight' field in the perf sample has latency information for some instructions like in memory accesses. And perf tool has 'weight' and 'local_weight' sort keys to display the info. But it's somewhat confusing what it shows exactly. In my understanding, 'local_weight' shows a weight in a single sample, and (global) 'weight' shows a sum of the weights in the hist_entry. For example: $ perf mem record -t load dd if=/dev/zero of=/dev/null bs=4k count=1M $ perf report --stdio -n -s +local_weight ... # # Overhead Samples Command Shared Object Symbol Local Weight # ........ ....... ....... ................ ......................... ............ # 21.23% 313 dd [kernel.vmlinux] [k] lockref_get_not_zero 32 12.43% 183 dd [kernel.vmlinux] [k] lockref_get_not_zero 35 11.97% 159 dd [kernel.vmlinux] [k] lockref_get_not_zero 36 10.40% 141 dd [kernel.vmlinux] [k] lockref_put_return 32 7.63% 113 dd [kernel.vmlinux] [k] lockref_get_not_zero 33 6.37% 92 dd [kernel.vmlinux] [k] lockref_get_not_zero 34 6.15% 90 dd [kernel.vmlinux] [k] lockref_put_return 33 ... So let's look at the 'lockref_get_not_zero' symbols. The top entry shows that 313 samples were captured with 'local_weight' 32, so the total weight should be 313 x 32 = 10016. But it's not the case: $ perf report --stdio -n -s +local_weight,weight -S lockref_get_not_zero ... # # Overhead Samples Command Shared Object Local Weight Weight # ........ ....... ....... ................ ............ ...... # 1.36% 4 dd [kernel.vmlinux] 36 144 0.47% 4 dd [kernel.vmlinux] 37 148 0.42% 4 dd [kernel.vmlinux] 32 128 0.40% 4 dd [kernel.vmlinux] 34 136 0.35% 4 dd [kernel.vmlinux] 36 144 0.34% 4 dd [kernel.vmlinux] 35 140 0.30% 4 dd [kernel.vmlinux] 36 144 0.30% 4 dd [kernel.vmlinux] 34 136 0.30% 4 dd [kernel.vmlinux] 32 128 0.30% 4 dd [kernel.vmlinux] 32 128 ... With the 'weight' sort key, it's divided to 4 samples even with the same info ('comm', 'dso', 'sym' and 'local_weight'). I don't think this is what we want. I found this because of the way it aggregates the 'weight' value. Since it's not a period, we should not add them in the he->stat. Otherwise, two 32 'weight' entries will create a 64 'weight' entry. After that, new 32 'weight' samples don't have a matching entry so it'd create a new entry and make it a 64 'weight' entry again and again. Later, they will be merged into 128 'weight' entries during the hists__collapse_resort() with 4 samples, multiple times like above. Let's keep the weight and display it differently. For 'local_weight', it can show the weight as is, and for (global) 'weight' it can display the number multiplied by the number of samples. With this change, I can see the expected numbers. $ perf report --stdio -n -s +local_weight,weight -S lockref_get_not_zero ... # # Overhead Samples Command Shared Object Local Weight Weight # ........ ....... ....... ................ ............ ..... # 21.23% 313 dd [kernel.vmlinux] 32 10016 12.43% 183 dd [kernel.vmlinux] 35 6405 11.97% 159 dd [kernel.vmlinux] 36 5724 7.63% 113 dd [kernel.vmlinux] 33 3729 6.37% 92 dd [kernel.vmlinux] 34 3128 4.17% 59 dd [kernel.vmlinux] 37 2183 0.08% 1 dd [kernel.vmlinux] 269 269 0.08% 1 dd [kernel.vmlinux] 38 38 Reviewed-by: Athira Jajeev <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Athira Jajeev <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-11-18perf tools: Set COMPAT_NEED_REALLOCARRAY for CONFIG_AUXTRACE=1Arnaldo Carvalho de Melo1-0/+3
As it is being used in tools/perf/arch/arm64/util/arm-spe.c and the COMPAT_NEED_REALLOCARRAY was only being set when CORESIGHT=1 is set. Fixes: 56c31cdff7c2a640 ("perf arm-spe: Implement find_snapshot callback") Cc: Alexander Shishkin <[email protected]> Cc: German Gomez <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lore.kernel.org/all/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-11-18perf tests wp: Remove unused functions on s390Arnaldo Carvalho de Melo1-1/+1
Fixing these build problems: tests/wp.c:24:12: error: 'wp_read' defined but not used [-Werror=unused-function] static int wp_read(int fd, long long *count, int size) ^ tests/wp.c:35:13: error: 'get__perf_event_attr' defined but not used [-Werror=unused-function] static void get__perf_event_attr(struct perf_event_attr *attr, int wp_type, ^ CC /tmp/build/perf/util/print_binary.o Fixes: e47c6ecaae1df54a ("perf test: Convert watch point tests to test cases.") Cc: Alexander Shishkin <[email protected]> Cc: Brendan Higgins <[email protected]> Cc: Daniel Latypov <[email protected]> Cc: David Gow <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jin Yao <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Clarke <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sohaib Mohamed <[email protected]> Cc: Stephane Eranian <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-11-18tools headers UAPI: Sync linux/kvm.h with the kernel sourcesArnaldo Carvalho de Melo1-3/+27
To pick the changes in: b56639318bb2be66 ("KVM: SEV: Add support for SEV intra host migration") e615e355894e6197 ("KVM: x86: On emulation failure, convey the exit reason, etc. to userspace") a9d496d8e08ca1eb ("KVM: x86: Clarify the kvm_run.emulation_failure structure layout") c68dc1b577eabd56 ("KVM: x86: Report host tsc and realtime values in KVM_GET_CLOCK") dea8ee31a0392775 ("RISC-V: KVM: Add SBI v0.1 support") That just rebuilds perf, as these patches don't add any new KVM ioctl to be harvested for the the 'perf trace' ioctl syscall argument beautifiers. This is also by now used by tools/testing/selftests/kvm/, a simple test build succeeded. This silences this perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h' diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h Cc: Anup Patel <[email protected]> Cc: Atish Patra <[email protected]> Cc: David Edmondson <[email protected]> Cc: Oliver Upton <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Peter Gonda <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-11-18tools headers cpufeatures: Sync with the kernel sourcesArnaldo Carvalho de Melo1-0/+2
To pick the changes from: eec2113eabd92b7b ("x86/fpu/amx: Define AMX state components and have it used for boot-time checks") This only causes these perf files to be rebuilt: CC /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o CC /tmp/build/perf/bench/mem-memset-x86-64-asm.o And addresses this perf build warning: Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h' diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h Cc: Borislav Petkov <[email protected]> Cc: Chang S. Bae <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-11-18ptp: ocp: Fix a couple NULL vs IS_ERR() checksDan Carpenter1-4/+5
The ptp_ocp_get_mem() function does not return NULL, it returns error pointers. Fixes: 773bda964921 ("ptp: ocp: Expose various resources on the timecard.") Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-18net: ethernet: dec: tulip: de4x5: fix possible array overflows in ↵Teng Qi1-0/+4
type3_infoblock() The definition of macro MOTO_SROM_BUG is: #define MOTO_SROM_BUG (lp->active == 8 && (get_unaligned_le32( dev->dev_addr) & 0x00ffffff) == 0x3e0008) and the if statement if (MOTO_SROM_BUG) lp->active = 0; using this macro indicates lp->active could be 8. If lp->active is 8 and the second comparison of this macro is false. lp->active will remain 8 in: lp->phy[lp->active].gep = (*p ? p : NULL); p += (2 * (*p) + 1); lp->phy[lp->active].rst = (*p ? p : NULL); p += (2 * (*p) + 1); lp->phy[lp->active].mc = get_unaligned_le16(p); p += 2; lp->phy[lp->active].ana = get_unaligned_le16(p); p += 2; lp->phy[lp->active].fdx = get_unaligned_le16(p); p += 2; lp->phy[lp->active].ttm = get_unaligned_le16(p); p += 2; lp->phy[lp->active].mci = *p; However, the length of array lp->phy is 8, so array overflows can occur. To fix these possible array overflows, we first check lp->active and then return -EINVAL if it is greater or equal to ARRAY_SIZE(lp->phy) (i.e. 8). Reported-by: TOTE Robot <[email protected]> Signed-off-by: Teng Qi <[email protected]> Reviewed-by: Arnd Bergmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-18net: tulip: de4x5: fix the problem that the array 'lp->phy[8]' may be out of ↵zhangyue1-13/+17
bound In line 5001, if all id in the array 'lp->phy[8]' is not 0, when the 'for' end, the 'k' is 8. At this time, the array 'lp->phy[8]' may be out of bound. Signed-off-by: zhangyue <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-18Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-David S. Miller3-136/+147
queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2021-11-17 This series contains updates to i40e driver only. Eryk adds accounting for VLAN header in packet size when VF port VLAN is configured. He also fixes TC queue distribution when the user has changed queue counts as well as for configuration of VF ADQ which caused dropped packets. Michal adds tracking for when a VSI is being released to prevent null pointer dereference when managing filters. Karen ensures PF successfully initiates VF requested reset which could cause a call trace otherwise. Jedrzej moves validation of channel queue value earlier to prevent partial configuration when the value is invalid. Grzegorz corrects the reported error when adding filter fails. ==================== Signed-off-by: David S. Miller <[email protected]>
2021-11-18ipv6: check return value of ipv6_skip_exthdrJordy Zomer1-0/+6
The offset value is used in pointer math on skb->data. Since ipv6_skip_exthdr may return -1 the pointer to uh and th may not point to the actual udp and tcp headers and potentially overwrite other stuff. This is why I think this should be checked. EDIT: added {}'s, thanks Kees Signed-off-by: Jordy Zomer <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-18e100: fix device suspend/resumeJesse Brandeburg1-5/+13
As reported in [1], e100 was no longer working for suspend/resume cycles. The previous commit mentioned in the fixes appears to have broken things and this attempts to practice best known methods for device power management and keep wake-up working while allowing suspend/resume to work. To do this, I reorder a little bit of code and fix the resume path to make sure the device is enabled. [1] https://bugzilla.kernel.org/show_bug.cgi?id=214933 Fixes: 69a74aef8a18 ("e100: use generic power management") Cc: Vaibhav Gupta <[email protected]> Reported-by: Alexey Kuznetsov <[email protected]> Signed-off-by: Jesse Brandeburg <[email protected]> Tested-by: Alexey Kuznetsov <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-18devlink: Don't throw an error if flash notification sent before devlink visibleLeon Romanovsky1-1/+3
The mlxsw driver calls to various devlink flash routines even before users can get any access to the devlink instance itself. For example, mlxsw_core_fw_rev_validate() one of such functions. __mlxsw_core_bus_device_register -> mlxsw_core_fw_rev_validate -> mlxsw_core_fw_flash -> mlxfw_firmware_flash -> mlxfw_status_notify -> devlink_flash_update_status_notify -> __devlink_flash_update_notify -> WARN_ON(...) It causes to the WARN_ON to trigger warning about devlink not registered. Fixes: cf530217408e ("devlink: Notify users when objects are accessible") Reported-by: Danielle Ratson <[email protected]> Tested-by: Danielle Ratson <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-18page_pool: Revert "page_pool: disable dma mapping support..."Yunsheng Lin3-8/+27
This reverts commit d00e60ee54b12de945b8493cf18c1ada9e422514. As reported by Guillaume in [1]: Enabling LPAE always enables CONFIG_ARCH_DMA_ADDR_T_64BIT in 32-bit systems, which breaks the bootup proceess when a ethernet driver is using page pool with PP_FLAG_DMA_MAP flag. As we were hoping we had no active consumers for such system when we removed the dma mapping support, and LPAE seems like a common feature for 32 bits system, so revert it. 1. https://www.spinics.net/lists/netdev/msg779890.html Fixes: d00e60ee54b1 ("page_pool: disable dma mapping support for 32-bit arch with 64-bit DMA") Signed-off-by: Yunsheng Lin <[email protected]> Reported-by: "kernelci.org bot" <[email protected]> Tested-by: "kernelci.org bot" <[email protected]> Acked-by: Jesper Dangaard Brouer <[email protected]> Acked-by: Ilias Apalodimas <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-18ethernet: hisilicon: hns: hns_dsaf_misc: fix a possible array overflow in ↵Teng Qi1-0/+4
hns_dsaf_ge_srst_by_port() The if statement: if (port >= DSAF_GE_NUM) return; limits the value of port less than DSAF_GE_NUM (i.e., 8). However, if the value of port is 6 or 7, an array overflow could occur: port_rst_off = dsaf_dev->mac_cb[port]->port_rst_off; because the length of dsaf_dev->mac_cb is DSAF_MAX_PORT_NUM (i.e., 6). To fix this possible array overflow, we first check port and if it is greater than or equal to DSAF_MAX_PORT_NUM, the function returns. Reported-by: TOTE Robot <[email protected]> Signed-off-by: Teng Qi <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-18Merge branch 'rework/printk_safe-removal' into for-linusPetr Mladek4-0/+21
2021-11-18parisc: Enable CONFIG_PRINTK_TIME=y in 32bit defconfigHelge Deller1-0/+1
Signed-off-by: Helge Deller <[email protected]>
2021-11-18Revert "parisc: Reduce sigreturn trampoline to 3 instructions"Helge Deller3-8/+9
This reverts commit e4f2006f1287e7ea17660490569cff323772dac4. This patch shows problems with signal handling. Revert it for now. Signed-off-by: Helge Deller <[email protected]> Cc: <[email protected]> # v5.15
2021-11-18parisc: Wrap assembler related defines inside __ASSEMBLY__Helge Deller1-20/+24
Building allmodconfig shows errors in the gpu/drm/msm snapdragon drivers, because a COND() define is used there which conflicts with the COND() for PA-RISC assembly. Although the snapdragon driver isn't relevant for parisc, it is nevertheless compiled when CONFIG_COMPILE_TEST is defined. Move the COND() define and other PA-RISC mnemonics inside the #ifdef __ASSEMBLY__ part to avoid this conflict. Signed-off-by: Helge Deller <[email protected]> Reported-by: kernel test robot <[email protected]>
2021-11-18parisc: Wire up futex_waitvHelge Deller1-0/+1
Signed-off-by: Helge Deller <[email protected]>
2021-11-18parisc: Include stringify.h to avoid build error in crypto/api.cHelge Deller1-0/+1
Include stringify.h to avoid this build error: arch/parisc/include/asm/jump_label.h: error: expected ':' before '__stringify' arch/parisc/include/asm/jump_label.h: error: label 'l_yes' defined but not used [-Werror=unused-label] Signed-off-by: Helge Deller <[email protected]> Reported-by: kernel test robot <[email protected]>
2021-11-18ALSA: hda/realtek: Fix LED on HP ProBook 435 G7Takashi Iwai1-0/+1
HP ProBook 435 G7 (SSID 103c:8735) needs the similar quirk as another HP ProBook for enabling the mute and the mic-mute LEDs. BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=215021 Cc: <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2021-11-18KVM: x86: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUSVitaly Kuznetsov1-1/+1
It doesn't make sense to return the recommended maximum number of vCPUs which exceeds the maximum possible number of vCPUs. Signed-off-by: Vitaly Kuznetsov <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-11-18KVM: s390: Cap KVM_CAP_NR_VCPUS by num_online_cpus()Vitaly Kuznetsov1-0/+2
KVM_CAP_NR_VCPUS is a legacy advisory value which on other architectures return num_online_cpus() caped by KVM_CAP_NR_VCPUS or something else (ppc and arm64 are special cases). On s390, KVM_CAP_NR_VCPUS returns the same as KVM_CAP_MAX_VCPUS and this may turn out to be a bad 'advice'. Switch s390 to returning caped num_online_cpus() too. Acked-by: Christian Borntraeger <[email protected]> Signed-off-by: Vitaly Kuznetsov <[email protected]> Reviewed-by: Christian Borntraeger <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-11-18KVM: RISC-V: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUSVitaly Kuznetsov1-1/+1
It doesn't make sense to return the recommended maximum number of vCPUs which exceeds the maximum possible number of vCPUs. Signed-off-by: Vitaly Kuznetsov <[email protected]> Acked-by: Anup Patel <[email protected]> Reviewed-by: Anup Patel <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-11-18KVM: PPC: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUSVitaly Kuznetsov1-2/+2
It doesn't make sense to return the recommended maximum number of vCPUs which exceeds the maximum possible number of vCPUs. Signed-off-by: Vitaly Kuznetsov <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-11-18KVM: MIPS: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUSVitaly Kuznetsov1-1/+1
It doesn't make sense to return the recommended maximum number of vCPUs which exceeds the maximum possible number of vCPUs. Signed-off-by: Vitaly Kuznetsov <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-11-18KVM: arm64: Cap KVM_CAP_NR_VCPUS by kvm_arm_default_max_vcpus()Vitaly Kuznetsov1-1/+8
Generally, it doesn't make sense to return the recommended maximum number of vCPUs which exceeds the maximum possible number of vCPUs. Note: ARM64 is special as the value returned by KVM_CAP_MAX_VCPUS differs depending on whether it is a system-wide ioctl or a per-VM one. Previously, KVM_CAP_NR_VCPUS didn't have this difference and it seems preferable to keep the status quo. Cap KVM_CAP_NR_VCPUS by kvm_arm_default_max_vcpus() which is what gets returned by system-wide KVM_CAP_MAX_VCPUS. Signed-off-by: Vitaly Kuznetsov <[email protected]> Message-Id: <[email protected]> Acked-by: Marc Zyngier <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-11-18KVM: x86: Assume a 64-bit hypercall for guests with protected stateTom Lendacky4-4/+16
When processing a hypercall for a guest with protected state, currently SEV-ES guests, the guest CS segment register can't be checked to determine if the guest is in 64-bit mode. For an SEV-ES guest, it is expected that communication between the guest and the hypervisor is performed to shared memory using the GHCB. In order to use the GHCB, the guest must have been in long mode, otherwise writes by the guest to the GHCB would be encrypted and not be able to be comprehended by the hypervisor. Create a new helper function, is_64_bit_hypercall(), that assumes the guest is in 64-bit mode when the guest has protected state, and returns true, otherwise invoking is_64_bit_mode() to determine the mode. Update the hypercall related routines to use is_64_bit_hypercall() instead of is_64_bit_mode(). Add a WARN_ON_ONCE() to is_64_bit_mode() to catch occurences of calls to this helper function for a guest running with protected state. Fixes: f1c6366e3043 ("KVM: SVM: Add required changes to support intercepts under SEV-ES") Reported-by: Sean Christopherson <[email protected]> Signed-off-by: Tom Lendacky <[email protected]> Message-Id: <e0b20c770c9d0d1403f23d83e785385104211f74.1621878537.git.thomas.lendacky@amd.com> Cc: [email protected] Signed-off-by: Paolo Bonzini <[email protected]>
2021-11-18selftests: KVM: Add /x86_64/sev_migrate_tests to .gitignoreArnaldo Carvalho de Melo1-0/+1
$ git status nothing to commit, working tree clean $ $ make -C tools/testing/selftests/kvm/ > /dev/null 2>&1 $ git status Untracked files: (use "git add <file>..." to include in what will be committed) tools/testing/selftests/kvm/x86_64/sev_migrate_tests nothing added to commit but untracked files present (use "git add" to track) $ Fixes: 6a58150859fdec76 ("selftest: KVM: Add intra host migration tests") Cc: Brijesh Singh <[email protected]> Cc: David Rientjes <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Marc Orr <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Peter Gonda <[email protected]> Cc: Sean Christopherson <[email protected]> Cc: Tom Lendacky <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Message-Id: <YZPIPfvYgRDCZi/[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-11-18riscv: kvm: fix non-kernel-doc comment blockRandy Dunlap1-1/+1
Don't use "/**" to begin a comment block for a non-kernel-doc comment. Prevents this docs build warning: vcpu_sbi.c:3: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst * Copyright (c) 2019 Western Digital Corporation or its affiliates. Fixes: dea8ee31a039 ("RISC-V: KVM: Add SBI v0.1 support") Signed-off-by: Randy Dunlap <[email protected]> Reported-by: kernel test robot <[email protected]> Cc: Atish Patra <[email protected]> Cc: Anup Patel <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: Paul Walmsley <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Albert Ou <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-11-18Merge branch 'kvm-5.16-fixes' into kvm-masterPaolo Bonzini12-159/+120
* Fixes for Xen emulation * Kill kvm_map_gfn() / kvm_unmap_gfn() and broken gfn_to_pfn_cache * Fixes for migration of 32-bit nested guests on 64-bit hypervisor * Compilation fixes * More SEV cleanups
2021-11-18KVM: SEV: Fix typo in and tweak name of cmd_allowed_from_miror()Sean Christopherson1-2/+2
Rename cmd_allowed_from_miror() to is_cmd_allowed_from_mirror(), fixing a typo and making it obvious that the result is a boolean where false means "not allowed". No functional change intended. Signed-off-by: Sean Christopherson <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-11-18KVM: SEV: Drop a redundant setting of sev->asid during initializationSean Christopherson1-1/+0
Remove a fully redundant write to sev->asid during SEV/SEV-ES guest initialization. The ASID is set a few lines earlier prior to the call to sev_platform_init(), which doesn't take "sev" as a param, i.e. can't muck with the ASID barring some truly magical behind-the-scenes code. No functional change intended. Signed-off-by: Sean Christopherson <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-11-18KVM: SEV: WARN if SEV-ES is marked active but SEV is notSean Christopherson1-1/+1
WARN if the VM is tagged as SEV-ES but not SEV. KVM relies on SEV and SEV-ES being set atomically, and guards common flows with "is SEV", i.e. observing SEV-ES without SEV means KVM has a fatal bug. Signed-off-by: Sean Christopherson <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-11-18KVM: SEV: Set sev_info.active after initial checks in sev_guest_init()Sean Christopherson1-3/+3
Set sev_info.active during SEV/SEV-ES activation before calling any code that can potentially consume sev_info.es_active, e.g. set "active" and "es_active" as a pair immediately after the initial sanity checks. KVM generally expects that es_active can be true if and only if active is true, e.g. sev_asid_new() deliberately avoids sev_es_guest() so that it doesn't get a false negative. This will allow WARNing in sev_es_guest() if the VM is tagged as SEV-ES but not SEV. Signed-off-by: Sean Christopherson <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-11-18KVM: SEV: Disallow COPY_ENC_CONTEXT_FROM if target has created vCPUsSean Christopherson1-1/+6
Reject COPY_ENC_CONTEXT_FROM if the destination VM has created vCPUs. KVM relies on SEV activation to occur before vCPUs are created, e.g. to set VMCB flags and intercepts correctly. Fixes: 54526d1fd593 ("KVM: x86: Support KVM VMs sharing SEV context") Cc: [email protected] Cc: Peter Gonda <[email protected]> Cc: Marc Orr <[email protected]> Cc: Sean Christopherson <[email protected]> Cc: Nathan Tempelman <[email protected]> Cc: Brijesh Singh <[email protected]> Cc: Tom Lendacky <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-11-18KVM: Kill kvm_map_gfn() / kvm_unmap_gfn() and gfn_to_pfn_cacheDavid Woodhouse3-101/+12
In commit 7e2175ebd695 ("KVM: x86: Fix recording of guest steal time / preempted status") I removed the only user of these functions because it was basically impossible to use them safely. There are two stages to the GFN->PFN mapping; first through the KVM memslots to a userspace HVA and then through the page tables to translate that HVA to an underlying PFN. Invalidations of the former were being handled correctly, but no attempt was made to use the MMU notifiers to invalidate the cache when the HVA->GFN mapping changed. As a prelude to reinventing the gfn_to_pfn_cache with more usable semantics, rip it out entirely and untangle the implementation of the unsafe kvm_vcpu_map()/kvm_vcpu_unmap() functions from it. All current users of kvm_vcpu_map() also look broken right now, and will be dealt with separately. They broadly fall into two classes: * Those which map, access the data and immediately unmap. This is mostly gratuitous and could just as well use the existing user HVA, and could probably benefit from a gfn_to_hva_cache as they do so. * Those which keep the mapping around for a longer time, perhaps even using the PFN directly from the guest. These will need to be converted to the new gfn_to_pfn_cache and then kvm_vcpu_map() can be removed too. Signed-off-by: David Woodhouse <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-11-18KVM: nVMX: Use a gfn_to_hva_cache for vmptrldDavid Woodhouse2-9/+22
And thus another call to kvm_vcpu_map() can die. Signed-off-by: David Woodhouse <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-11-18KVM: nVMX: Use kvm_read_guest_offset_cached() for nested VMCS checkDavid Woodhouse1-11/+15
Kill another mostly gratuitous kvm_vcpu_map() which could just use the userspace HVA for it. Signed-off-by: David Woodhouse <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-11-18KVM: x86/xen: Use sizeof_field() instead of open-coding itDavid Woodhouse1-9/+9
Signed-off-by: David Woodhouse <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-11-18KVM: nVMX: Use kvm_{read,write}_guest_cached() for shadow_vmcs12David Woodhouse2-9/+20
Using kvm_vcpu_map() for reading from the guest is entirely gratuitous, when all we do is a single memcpy and unmap it again. Fix it up to use kvm_read_guest()... but in fact I couldn't bring myself to do that without also making it use a gfn_to_hva_cache for both that *and* the copy in the other direction. Signed-off-by: David Woodhouse <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-11-18KVM: x86/xen: Fix get_attr of KVM_XEN_ATTR_TYPE_SHARED_INFODavid Woodhouse1-1/+1
In commit 319afe68567b ("KVM: xen: do not use struct gfn_to_hva_cache") we stopped storing this in-kernel as a GPA, and started storing it as a GFN. Which means we probably should have stopped calling gpa_to_gfn() on it when userspace asks for it back. Cc: [email protected] Fixes: 319afe68567b ("KVM: xen: do not use struct gfn_to_hva_cache") Signed-off-by: David Woodhouse <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-11-18KVM: x86/mmu: include EFER.LMA in extended mmu roleMaxim Levitsky2-0/+2
Incorporate EFER.LMA into kvm_mmu_extended_role, as it used to compute the guest root level and is not reflected in kvm_mmu_page_role.level when TDP is in use. When simply running the guest, it is impossible for EFER.LMA and kvm_mmu.root_level to get out of sync, as the guest cannot transition from PAE paging to 64-bit paging without toggling CR0.PG, i.e. without first bouncing through a different MMU context. And stuffing guest state via KVM_SET_SREGS{,2} also ensures a full MMU context reset. However, if KVM_SET_SREGS{,2} is followed by KVM_SET_NESTED_STATE, e.g. to set guest state when migrating the VM while L2 is active, the vCPU state will reflect L2, not L1. If L1 is using TDP for L2, then root_mmu will have been configured using L2's state, despite not being used for L2. If L2.EFER.LMA != L1.EFER.LMA, and L2 is using PAE paging, then root_mmu will be configured for guest PAE paging, but will match the mmu_role for 64-bit paging and cause KVM to not reconfigure root_mmu on the next nested VM-Exit. Alternatively, the root_mmu's role could be invalidated after a successful KVM_SET_NESTED_STATE that yields vcpu->arch.mmu != vcpu->arch.root_mmu, i.e. that switches the active mmu to guest_mmu, but doing so is unnecessarily tricky, and not even needed if L1 and L2 do have the same role (e.g., they are both 64-bit guests and run with the same CR4). Suggested-by: Sean Christopherson <[email protected]> Signed-off-by: Maxim Levitsky <[email protected]> Message-Id: <[email protected]> Cc: [email protected] Signed-off-by: Paolo Bonzini <[email protected]>
2021-11-18KVM: nVMX: don't use vcpu->arch.efer when checking host state on nested ↵Maxim Levitsky1-5/+17
state load When loading nested state, don't use check vcpu->arch.efer to get the L1 host's 64-bit vs. 32-bit state and don't check it for consistency with respect to VM_EXIT_HOST_ADDR_SPACE_SIZE, as register state in vCPU may be stale when KVM_SET_NESTED_STATE is called---and architecturally does not exist. When restoring L2 state in KVM, the CPU is placed in non-root where nested VMX code has no snapshot of L1 host state: VMX (conditionally) loads host state fields loaded on VM-exit, but they need not correspond to the state before entry. A simple case occurs in KVM itself, where the host RIP field points to vmx_vmexit rather than the instruction following vmlaunch/vmresume. However, for the particular case of L1 being in 32- or 64-bit mode on entry, the exit controls can be treated instead as the source of truth regarding the state of L1 on entry, and can be used to check that vmcs12.VM_EXIT_HOST_ADDR_SPACE_SIZE matches vmcs12.HOST_EFER if vmcs12.VM_EXIT_LOAD_IA32_EFER is set. The consistency check on CPU EFER vs. vmcs12.VM_EXIT_HOST_ADDR_SPACE_SIZE, instead, happens only on VM-Enter. That's because, again, there's conceptually no "current" L1 EFER to check on KVM_SET_NESTED_STATE. Suggested-by: Paolo Bonzini <[email protected]> Signed-off-by: Maxim Levitsky <[email protected]> Message-Id: <[email protected]> Cc: [email protected] Signed-off-by: Paolo Bonzini <[email protected]>
2021-11-18KVM: Fix steal time asm constraintsDavid Woodhouse1-3/+3
In 64-bit mode, x86 instruction encoding allows us to use the low 8 bits of any GPR as an 8-bit operand. In 32-bit mode, however, we can only use the [abcd] registers. For which, GCC has the "q" constraint instead of the less restrictive "r". Also fix st->preempted, which is an input/output operand rather than an input. Fixes: 7e2175ebd695 ("KVM: x86: Fix recording of guest steal time / preempted status") Reported-by: kernel test robot <[email protected]> Signed-off-by: David Woodhouse <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-11-18cpuid: kvm_find_kvm_cpuid_features() should be declared 'static'Paul Durrant1-1/+1
The lack a static declaration currently results in: arch/x86/kvm/cpuid.c:128:26: warning: no previous prototype for function 'kvm_find_kvm_cpuid_features' when compiling with "W=1". Reported-by: kernel test robot <[email protected]> Fixes: 760849b1476c ("KVM: x86: Make sure KVM_CPUID_FEATURES really are KVM_CPUID_FEATURES") Signed-off-by: Paul Durrant <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-11-18ata: libata-sata: Declare ata_ncq_sdev_attrs staticDamien Le Moal1-1/+1
Since ata_ncq_sdev_attrs is a local struct, declare it static. This avoids a sparse warning at compile time. Signed-off-by: Damien Le Moal <[email protected]>