2023-08-31  drm/amdgpu: Only support RAS EEPROM on dGPU platform  (Candice Li, 1 file, -1/+2)
RAS EEPROM device is only supported on dGPU platform for smu v13_0_6. Signed-off-by: Candice Li <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-08-31  Documentation/gpu: Update amdgpu documentation  (Lijo Lazar, 1 file, -4/+4)
7957ec80ef97 ("drm/amdgpu: Add FRU sysfs nodes only if needed") moved the documentation for some of the sysfs nodes to amdgpu_fru_eeprom.c. Update the documentation accordingly. Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-08-31  Merge tag 'v6.6-vfs.super.fixes.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs  (Linus Torvalds, 4 files, -56/+58)
Pull more superblock follow-on fixes from Christian Brauner:
 "This contains two more small follow-up fixes for the super work this
  cycle. I went through all filesystems once more and detected two minor
  issues that still needed fixing:

   - Some filesystems support mtd devices (e.g., mount -t jffs2 mtd2 /mnt).
     The mtd infrastructure uses the sb->s_mtd pointer to find an existing
     superblock. When the mtd device is put and sb->s_mtd is cleared, the
     superblock can still be found in fs_supers, and so this risks a
     use-after-free. Add a small patch that aligns mtd with what we did for
     regular block devices and switch keying to rely on sb->s_dev. (This was
     tested with mtd devices and jffs2, as xfstests doesn't support mtd
     devices.)

   - Switch nfs back to relying on kill_anon_super() so the superblock is
     removed from the list of active supers before sb->s_fs_info is freed"

* tag 'v6.6-vfs.super.fixes.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  NFS: switch back to using kill_anon_super
  mtd: key superblock by device number
  fs: export sget_dev()
2023-08-31  drm/amdgpu/pm: Add notification for no DC support  (Bokun Zhang, 5 files, -12/+11)
- There is a DPM issue where, if DC is not present, FCLK will stay at a low level. We need to send an SMU message to configure the DPM.
- Reuse smu_v13_0_notify_display_change() for this purpose.

Reviewed-by: Evan Quan <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Bokun Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-08-31  drm/amd/display: Enable Replay for static screen use cases  (Bhawanpreet Lakha, 3 files, -1/+32)
- Set up replay config on device init.
- Enable replay if the feature is enabled (prioritize replay over PSR, since it can be enabled in more use cases).
- Add debug masks to enable replay on supported ASICs.

Signed-off-by: Bhawanpreet Lakha <[email protected]> Reviewed-by: Harry Wentland <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-08-31  powerpc: Fix pud_mkwrite() definition after pte_mkwrite() API changes  (Ingo Molnar, 1 file, -1/+1)
Fix up a missed semantic mis-merge between commits 161e393c0f63 ("mm: Make pte_mkwrite() take a VMA") and 27af67f35631 ("powerpc/book3s64/mm: enable transparent pud hugepage"), where the newly introduced powerpc use of 'pte_mkwrite()' needs to use the 'novma()' versions, as per commit 2f0584f3f4bd ("mm: Rename arch pte_mkwrite()'s to pte_mkwrite_novma()"). Fixes: df57721f9a63 ("Merge tag 'x86_shstk_for_6.6-rc1' of [...]") Signed-off-by: Ingo Molnar <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2023-08-31  x86/fpu/xstate: Fix PKRU covert channel  (Jim Mattson, 1 file, -1/+1)
When XCR0[9] is set, PKRU can be read and written from userspace with XSAVE and XRSTOR, even when CR4.PKE is clear. Clear XCR0[9] when protection keys are disabled. Reported-by: Tavis Ormandy <[email protected]> Signed-off-by: Jim Mattson <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Acked-by: Dave Hansen <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-08-31  pstore: Base compression input buffer size on estimated compressed size  (Ard Biesheuvel, 1 file, -7/+27)
Commit 1756ddea6916 ("pstore: Remove worst-case compression size logic") removed some clunky per-algorithm worst-case size estimation routines on the basis that we can always store pstore records uncompressed, and these worst-case estimations are about how much the size might inadvertently *increase* due to encapsulation overhead when the input cannot be compressed at all. So if compression results in a size increase, we just store the original data instead.

However, it seems that the original code was misinterpreting these calculations as an estimation of how much uncompressed data might fit into a compressed buffer of a given size, and it was using the results to consume the input data in larger chunks than the pstore record size, relying on the compression to ensure that what ultimately gets stored fits into the available space.

One result of this, as observed and reported by Linus, is that upgrading to a newer kernel that includes the given commit may result in pstore decompression errors reported in the kernel log. This is due to the fact that the existing records may unexpectedly decompress to a size that is larger than the pstore record size.

Another potential problem caused by this change is that we may underutilize the fixed sized records on pstore backends such as ramoops. And on pstore backends with variable sized records such as EFI, we will end up creating many more entries than before to store the same amount of compressed data.

So let's fix both issues, by bringing back the typical case estimation of how much ASCII text captured from the dmesg log might fit into a pstore record of a given size after compression. The original implementation used the computation given below for zlib:

  switch (size) {
  /* buffer range for efivars */
  case 1000 ... 2000:
          cmpr = 56;
          break;
  case 2001 ... 3000:
          cmpr = 54;
          break;
  case 3001 ... 3999:
          cmpr = 52;
          break;
  /* buffer range for nvram, erst */
  case 4000 ... 10000:
          cmpr = 45;
          break;
  default:
          cmpr = 60;
          break;
  }

  return (size * 100) / cmpr;

We will use the previous worst-case of 60% for compression. For decompression go extra large (3x) so we make sure there's enough space for anything. While at it, rate limit the error message so we don't flood the log unnecessarily on systems that have accumulated a lot of pstore history.

Cc: Linus Torvalds <[email protected]> Cc: Eric Biggers <[email protected]> Cc: Kees Cook <[email protected]> Cc: Herbert Xu <[email protected]> Signed-off-by: Ard Biesheuvel <[email protected]> Link: https://lore.kernel.org/r/[email protected] Co-developed-by: Kees Cook <[email protected]> Signed-off-by: Kees Cook <[email protected]>
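To make the arithmetic concrete, here is a hedged sketch of the typical-case estimate; the helper name and constant are illustrative, not necessarily what the patch uses, and the decompression buffer is then simply sized at 3x the record:

  /* Assume typical dmesg text compresses to no less than ~60% of its size. */
  #define PSTORE_CMPR_RATIO	60

  /* How much uncompressed input to admit for a record of 'record_size' bytes. */
  static size_t pstore_estimate_input_size(size_t record_size)
  {
  	/* e.g. a 1024-byte EFI record admits 1024 * 100 / 60 = 1706 bytes */
  	return (record_size * 100) / PSTORE_CMPR_RATIO;
  }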
2023-08-31  fbdev: mx3fb: Remove the driver  (Fabio Estevam, 4 files, -1757/+0)
The mx3fb driver does not support devicetree and i.MX has been converted to a DT-only platform since kernel 5.10. As there is no user for this driver anymore, just remove it. Signed-off-by: Fabio Estevam <[email protected]> Reviewed-by: Arnd Bergmann <[email protected]> Signed-off-by: Helge Deller <[email protected]>
2023-08-31  fbdev/core: Use list_for_each_entry() helper  (Jinjie Ruan, 2 files, -23/+7)
Convert list_for_each() to list_for_each_entry() so that the pos list_head pointer and list_entry() call are no longer needed, which saves a few lines of code. No functional change. Signed-off-by: Jinjie Ruan <[email protected]> Signed-off-by: Helge Deller <[email protected]>
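The general shape of such a conversion, as a sketch with placeholder struct and function names:

  /* Before: a separate cursor plus a list_entry() lookup. */
  struct list_head *pos;
  struct my_item *item;

  list_for_each(pos, &item_list) {
  	item = list_entry(pos, struct my_item, node);
  	consume(item);
  }

  /* After: the iterator hands back the containing struct directly. */
  list_for_each_entry(item, &item_list, node)
  	consume(item);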
2023-08-31  selftests/bpf: Include build flavors for install target  (Björn Töpel, 1 file, -0/+12)
When using the "install" or targets depending on install, e.g. "gen_tar", the BPF machine flavors weren't included. A command like:

  | make ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu- O=/workspace/kbuild \
  |      HOSTCC=gcc FORMAT= SKIP_TARGETS="arm64 ia64 powerpc sparc64 x86 sgx" \
  |      -C tools/testing/selftests gen_tar

would not include bpf/no_alu32, bpf/cpuv4, or bpf/bpf-gcc. Include the BPF machine flavors for the "install" make target. Signed-off-by: Björn Töpel <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2023-08-31  bpf: Annotate bpf_long_memcpy with data_race  (Daniel Borkmann, 1 file, -1/+1)
syzbot reported a data race splat between two processes trying to update the same BPF map value via syscall on different CPUs:

  BUG: KCSAN: data-race in bpf_percpu_array_update / bpf_percpu_array_update

  write to 0xffffe8fffe7425d8 of 8 bytes by task 8257 on cpu 1:
   bpf_long_memcpy include/linux/bpf.h:428 [inline]
   bpf_obj_memcpy include/linux/bpf.h:441 [inline]
   copy_map_value_long include/linux/bpf.h:464 [inline]
   bpf_percpu_array_update+0x3bb/0x500 kernel/bpf/arraymap.c:380
   bpf_map_update_value+0x190/0x370 kernel/bpf/syscall.c:175
   generic_map_update_batch+0x3ae/0x4f0 kernel/bpf/syscall.c:1749
   bpf_map_do_batch+0x2df/0x3d0 kernel/bpf/syscall.c:4648
   __sys_bpf+0x28a/0x780
   __do_sys_bpf kernel/bpf/syscall.c:5241 [inline]
   __se_sys_bpf kernel/bpf/syscall.c:5239 [inline]
   __x64_sys_bpf+0x43/0x50 kernel/bpf/syscall.c:5239
   do_syscall_x64 arch/x86/entry/common.c:50 [inline]
   do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
   entry_SYSCALL_64_after_hwframe+0x63/0xcd

  write to 0xffffe8fffe7425d8 of 8 bytes by task 8268 on cpu 0:
   bpf_long_memcpy include/linux/bpf.h:428 [inline]
   bpf_obj_memcpy include/linux/bpf.h:441 [inline]
   copy_map_value_long include/linux/bpf.h:464 [inline]
   bpf_percpu_array_update+0x3bb/0x500 kernel/bpf/arraymap.c:380
   bpf_map_update_value+0x190/0x370 kernel/bpf/syscall.c:175
   generic_map_update_batch+0x3ae/0x4f0 kernel/bpf/syscall.c:1749
   bpf_map_do_batch+0x2df/0x3d0 kernel/bpf/syscall.c:4648
   __sys_bpf+0x28a/0x780
   __do_sys_bpf kernel/bpf/syscall.c:5241 [inline]
   __se_sys_bpf kernel/bpf/syscall.c:5239 [inline]
   __x64_sys_bpf+0x43/0x50 kernel/bpf/syscall.c:5239
   do_syscall_x64 arch/x86/entry/common.c:50 [inline]
   do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
   entry_SYSCALL_64_after_hwframe+0x63/0xcd

  value changed: 0x0000000000000000 -> 0xfffffff000002788

The bpf_long_memcpy is used with 8-byte aligned pointers, power-of-8 size, and is forced to use long read/writes to try to atomically copy long counters. It is best-effort only and no barriers are here since it _will_ race with concurrent updates from BPF programs. The bpf_long_memcpy() is called from the bpf(2) syscall. Marco suggested that the best way to make this known to KCSAN would be to use the data_race() annotation. Reported-by: [email protected] Suggested-by: Marco Elver <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Marco Elver <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Link: https://lore.kernel.org/bpf/57628f7a15e20d502247c3b55fceb1cb2b31f266.1693342186.git.daniel@iogearbox.net
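The resulting copy loop keeps its long-granular stores and simply wraps them in data_race() so KCSAN treats the race as intentional; roughly (abridged from include/linux/bpf.h):

  static inline void bpf_long_memcpy(void *dst, const void *src, u32 size)
  {
  	const long *lsrc = src;
  	long *ldst = dst;

  	size /= sizeof(long);
  	while (size--)
  		data_race(*ldst++ = *lsrc++);
  }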
2023-08-31  Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm  (Linus Torvalds, 18 files, -554/+313)
Pull ARM updates from Russell King:

 - Refactor VFP code and convert to C code (Ard Biesheuvel)
 - Fix hardware breakpoint single-stepping using bpf_overflow_handler
 - Make SMP stop calls asynchronous, allowing panic from irq context to work
 - Fix for kernel-doc warnings for locomo

* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
  Revert part of ae1f8d793a19 ("ARM: 9304/1: add prototype for function called only from asm")
  ARM: 9318/1: locomo: move kernel-doc to prevent warnings
  ARM: 9317/1: kexec: Make smp stop calls asynchronous
  ARM: 9316/1: hw_breakpoint: fix single-stepping when using bpf_overflow_handler
  ARM: entry: Make asm coproc dispatch code NWFPE only
  ARM: iwmmxt: Use undef hook to enable coprocessor for task
  ARM: entry: Disregard Thumb undef exception in coproc dispatch
  ARM: vfp: Use undef hook for handling VFP exceptions
  ARM: kernel: Get rid of thread_info::used_cp[] array
  ARM: vfp: Reimplement VFP exception entry in C code
  ARM: vfp: Remove workaround for Feroceon CPUs
  ARM: vfp: Record VFP bounces as perf emulation faults
2023-08-31  Merge tag 'powerpc-6.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux  (Linus Torvalds, 305 files, -3290/+4042)
Pull powerpc updates from Michael Ellerman:

 - Add HOTPLUG_SMT support (/sys/devices/system/cpu/smt) and honour the configured SMT state when hotplugging CPUs into the system
 - Combine final TLB flush and lazy TLB mm shootdown IPIs when using the Radix MMU to avoid a broadcast TLBIE flush on exit
 - Drop the exclusion between ptrace/perf watchpoints, and drop the now unused associated arch hooks
 - Add support for the "nohlt" command line option to disable CPU idle
 - Add support for -fpatchable-function-entry for ftrace, with GCC >= 13.1
 - Rework memory block size determination, and support 256MB size on systems with GPUs that have hotpluggable memory
 - Various other small features and fixes

Thanks to Andrew Donnellan, Aneesh Kumar K.V, Arnd Bergmann, Athira Rajeev, Benjamin Gray, Christophe Leroy, Frederic Barrat, Gautam Menghani, Geoff Levand, Hari Bathini, Immad Mir, Jialin Zhang, Joel Stanley, Jordan Niethe, Justin Stitt, Kajol Jain, Kees Cook, Krzysztof Kozlowski, Laurent Dufour, Liang He, Linus Walleij, Mahesh Salgaonkar, Masahiro Yamada, Michal Suchanek, Nageswara R Sastry, Nathan Chancellor, Nathan Lynch, Naveen N Rao, Nicholas Piggin, Nick Desaulniers, Omar Sandoval, Randy Dunlap, Reza Arbab, Rob Herring, Russell Currey, Sourabh Jain, Thomas Gleixner, Trevor Woerner, Uwe Kleine-König, Vaibhav Jain, Xiongfeng Wang, Yuan Tan, Zhang Rui, and Zheng Zengkai.

* tag 'powerpc-6.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (135 commits)
  macintosh/ams: linux/platform_device.h is needed
  powerpc/xmon: Reapply "Relax frame size for clang"
  powerpc/mm/book3s64: Use 256M as the upper limit with coherent device memory attached
  powerpc/mm/book3s64: Fix build error with SPARSEMEM disabled
  powerpc/iommu: Fix notifiers being shared by PCI and VIO buses
  powerpc/mpc5xxx: Add missing fwnode_handle_put()
  powerpc/config: Disable SLAB_DEBUG_ON in skiroot
  powerpc/pseries: Remove unused hcall tracing instruction
  powerpc/pseries: Fix hcall tracepoints with JUMP_LABEL=n
  powerpc: dts: add missing space before {
  powerpc/eeh: Use pci_dev_id() to simplify the code
  powerpc/64s: Move CPU -mtune options into Kconfig
  powerpc/powermac: Fix unused function warning
  powerpc/pseries: Rework lppaca_shared_proc() to avoid DEBUG_PREEMPT
  powerpc: Don't include lppaca.h in paca.h
  powerpc/pseries: Move hcall_vphn() prototype into vphn.h
  powerpc/pseries: Move VPHN constants into vphn.h
  cxl: Drop unused detach_spa()
  powerpc: Drop zalloc_maybe_bootmem()
  powerpc/powernv: Use struct opal_prd_msg in more places
  ...
2023-08-31  perf parse-events: Fix propagation of term's no_value when cloning  (Ian Rogers, 3 files, -21/+19)
The no_value field in 'struct parse_events_term' indicates that the val variable isn't used, the case for an event name. Cloning wasn't propagating this, making cloned event name terms appear to have a constant assigned to them. Working around the bug would check for a value of 1 assigned to value, but then this meant a user value of 1 couldn't be differentiated, causing the value to be lost in debug printing and perf list. The change fixes the cloning and updates the "val.num ==/!= 1" tests to use no_value instead. To better check that no_value is set appropriately, parameter comments are added for constant values. This found that no_value wasn't set correctly in parse_events_multi_pmu_add, which matters now that no_value is used to indicate an event name. Fixes: 7a6e91644708d514 ("perf parse-events: Make common term list to strbuf helper") Fixes: 99e7138eb7897aa0 ("perf tools: Fail on using multiple bits long terms without value") Signed-off-by: Ian Rogers <[email protected]> Tested-by: Kan Liang <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
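The crux of the fix, as a sketch (field names per 'struct parse_events_term'; the surrounding clone helper and exact test site are abridged and illustrative):

  /* Propagate that 'val' is unused when cloning a term ... */
  new_term->no_value = term->no_value;

  /* ... and test the flag, not the magic constant 1, for event names. */
  if (term->type_val == PARSE_EVENTS__TERM_TYPE_NUM && term->no_value) {
  	/* term names an event rather than assigning "name=1" */
  }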
2023-08-31  perf parse-events: Name the two term enums  (Ian Rogers, 4 files, -67/+187)
Name the enums used by 'struct parse_events_term' to parse_events__term_val_type and parse_events__term_type. This allows greater compile time error checking. Fix -Wswitch related issues by explicitly listing all enum values prior to default. Add config_term_name to safely look up a parse_events__term_type name, bounds checking the array access first. Add documentation to 'struct parse_events_terms' and reorder to save space. Signed-off-by: Ian Rogers <[email protected]> Tested-by: Kan Liang <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-31  perf list: Don't print Unit for "default_core"  (Ian Rogers, 1 file, -1/+1)
"default_core" was added as a way to demark JSON events whose PMU should be whatever the default core PMU is; previously this had been assumed to be "cpu", but that fails on s390 and ARM. 'perf list' displays the PMU in the event description to save storing it in JSON, but was still comparing against "cpu" and not "default_core", so update this. Fixes: d2045f87154bf67a ("perf jevents: Use "default_core" for events with no Unit") Signed-off-by: Ian Rogers <[email protected]> Tested-by: Kan Liang <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-31  Merge tag 'x86_shstk_for_6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip  (Linus Torvalds, 118 files, -308/+2790)
Pull x86 shadow stack support from Dave Hansen:
 "This is the long awaited x86 shadow stack support, part of Intel's
  Control-flow Enforcement Technology (CET).

  CET consists of two related security features: shadow stacks and
  indirect branch tracking. This series implements just the shadow stack
  part of this feature, and just for userspace.

  The main use case for shadow stack is providing protection against
  return oriented programming attacks. It works by maintaining a
  secondary (shadow) stack using a special memory type that has
  protections against modification. When executing a CALL instruction,
  the processor pushes the return address to both the normal stack and
  to the special permission shadow stack. Upon RET, the processor pops
  the shadow stack copy and compares it to the normal stack copy.

  For more information, refer to the links below for the earlier
  versions of this patch set"

Link: https://lore.kernel.org/lkml/[email protected]/
Link: https://lore.kernel.org/lkml/[email protected]/

* tag 'x86_shstk_for_6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (47 commits)
  x86/shstk: Change order of __user in type
  x86/ibt: Convert IBT selftest to asm
  x86/shstk: Don't retry vm_munmap() on -EINTR
  x86/kbuild: Fix Documentation/ reference
  x86/shstk: Move arch detail comment out of core mm
  x86/shstk: Add ARCH_SHSTK_STATUS
  x86/shstk: Add ARCH_SHSTK_UNLOCK
  x86: Add PTRACE interface for shadow stack
  selftests/x86: Add shadow stack test
  x86/cpufeatures: Enable CET CR4 bit for shadow stack
  x86/shstk: Wire in shadow stack interface
  x86: Expose thread features in /proc/$PID/status
  x86/shstk: Support WRSS for userspace
  x86/shstk: Introduce map_shadow_stack syscall
  x86/shstk: Check that signal frame is shadow stack mem
  x86/shstk: Check that SSP is aligned on sigreturn
  x86/shstk: Handle signals for shadow stack
  x86/shstk: Introduce routines modifying shstk
  x86/shstk: Handle thread shadow stack
  x86/shstk: Add user-mode shadow stack support
  ...
2023-08-31  x86/irq/i8259: Fix kernel-doc annotation warning  (Vincenzo Palazzo, 1 file, -3/+1)
Fix this warning:

  arch/x86/kernel/i8259.c:235: warning: This comment starts with '/**',
  but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
   * ELCR registers (0x4d0, 0x4d1) control edge/level of IRQ
    CC      arch/x86/kernel/irqinit.o

Signed-off-by: Vincenzo Palazzo <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-08-31  x86/speculation: Mark all Skylake CPUs as vulnerable to GDS  (Dave Hansen, 1 file, -4/+4)
The Gather Data Sampling (GDS) vulnerability is common to all Skylake processors. However, the "client" Skylakes* are now in this list:

  https://www.intel.com/content/www/us/en/support/articles/000022396/processors.html

which means they are no longer included for new vulnerabilities here:

  https://www.intel.com/content/www/us/en/developer/topic-technology/software-security-guidance/processors-affected-consolidated-product-cpu-model.html

or in other GDS documentation. Thus, they were not included in the original GDS mitigation patches.

Mark SKYLAKE and SKYLAKE_L as vulnerable to GDS to match all the other Skylake CPUs (which include Kaby Lake). Also group the CPUs so that the ones that share the exact same vulnerabilities are next to each other. Last, move SRBDS to the end of each line. This makes it clear at a glance that SKYLAKE_X is unique. Of the five Skylakes, it is the only "server" CPU and has a different implementation from the clients of the "special register" hardware, making it immune to SRBDS. This makes the diff much harder to read, but the resulting table is worth it.

I very much appreciate the report from Michael Zhivich about this issue. Despite what level of support a hardware vendor is providing, the kernel very much needs an accurate and up-to-date list of vulnerable CPUs. More reports like this are very welcome.

* Client Skylakes are CPUID 406E3/506E3, which is family 6, models 0x4E and 0x5E, aka INTEL_FAM6_SKYLAKE and INTEL_FAM6_SKYLAKE_L.

Reported-by: Michael Zhivich <[email protected]> Fixes: 8974eb588283 ("x86/speculation: Add Gather Data Sampling mitigation") Signed-off-by: Dave Hansen <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Reviewed-by: Daniel Sneddon <[email protected]> Cc: Linus Torvalds <[email protected]>
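The regrouped entries in the cpu_vuln_blacklist table come out along these lines (a sketch of the result, not the verbatim diff; note SKYLAKE_X alone lacks the trailing SRBDS):

  VULNBL_INTEL_STEPPINGS(SKYLAKE_X,  X86_STEPPING_ANY, MMIO | RETBLEED | GDS),
  VULNBL_INTEL_STEPPINGS(SKYLAKE_L,  X86_STEPPING_ANY, MMIO | RETBLEED | GDS | SRBDS),
  VULNBL_INTEL_STEPPINGS(SKYLAKE,    X86_STEPPING_ANY, MMIO | RETBLEED | GDS | SRBDS),
  VULNBL_INTEL_STEPPINGS(KABYLAKE_L, X86_STEPPING_ANY, MMIO | RETBLEED | GDS | SRBDS),
  VULNBL_INTEL_STEPPINGS(KABYLAKE,   X86_STEPPING_ANY, MMIO | RETBLEED | GDS | SRBDS),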
2023-08-31  KVM: x86/mmu: Include mmu.h in spte.h  (Sean Christopherson, 1 file, -0/+1)
Explicitly include mmu.h in spte.h instead of relying on the "parent" to include mmu.h. spte.h references a variety of macros and variables that are defined/declared in mmu.h, and so including spte.h before (or instead of) mmu.h will result in build errors, e.g.

  arch/x86/kvm/mmu/spte.h: In function ‘is_mmio_spte’:
  arch/x86/kvm/mmu/spte.h:242:23: error: ‘enable_mmio_caching’ undeclared
    242 |            likely(enable_mmio_caching);
        |                   ^~~~~~~~~~~~~~~~~~~

  arch/x86/kvm/mmu/spte.h: In function ‘is_large_pte’:
  arch/x86/kvm/mmu/spte.h:302:22: error: ‘PT_PAGE_SIZE_MASK’ undeclared
    302 |        return pte & PT_PAGE_SIZE_MASK;
        |                     ^~~~~~~~~~~~~~~~~

  arch/x86/kvm/mmu/spte.h: In function ‘is_dirty_spte’:
  arch/x86/kvm/mmu/spte.h:332:56: error: ‘PT_WRITABLE_MASK’ undeclared
    332 |        return dirty_mask ? spte & dirty_mask : spte & PT_WRITABLE_MASK;
        |                                                       ^~~~~~~~~~~~~~~~

Fixes: 5a9624affe7c ("KVM: mmu: extract spte.h and spte.c") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2023-08-31  KVM: x86/mmu: Use dummy root, backed by zero page, for !visible guest roots  (Sean Christopherson, 4 files, -24/+47)
When attempting to allocate a shadow root for a !visible guest root gfn, e.g. that resides in MMIO space, load a dummy root that is backed by the zero page instead of immediately synthesizing a triple fault shutdown (using the zero page ensures any attempt to translate memory will generate a !PRESENT fault and thus VM-Exit). Unless the vCPU is racing with memslot activity, KVM will inject a page fault due to not finding a visible slot in FNAME(walk_addr_generic), i.e. the end result is mostly the same, but critically KVM will inject a fault only *after* KVM runs the vCPU with the bogus root. Waiting to inject a fault until after running the vCPU fixes a bug where KVM would bail from nested VM-Enter if L1 tried to run L2 with TDP enabled and a !visible root. Even though a bad root will *probably* lead to shutdown, (a) it's not guaranteed and (b) the CPU won't read the underlying memory until after VM-Enter succeeds. E.g. if L1 runs L2 with a VMX preemption timer value of '0', then architecturally the preemption timer VM-Exit is guaranteed to occur before the CPU executes any instruction, i.e. before the CPU needs to translate a GPA to a HPA (so long as there are no injected events with higher priority than the preemption timer). If KVM manages to get to FNAME(fetch) with a dummy root, e.g. because userspace created a memslot between installing the dummy root and handling the page fault, simply unload the MMU to allocate a new root and retry the instruction. Use KVM_REQ_MMU_FREE_OBSOLETE_ROOTS to drop the root, as invoking kvm_mmu_free_roots() while holding mmu_lock would deadlock, and conceptually the dummy root has indeed become obsolete. The only difference versus existing usage of KVM_REQ_MMU_FREE_OBSOLETE_ROOTS is that the root has become obsolete due to memslot *creation*, not memslot deletion or movement. Reported-by: Reima Ishii <[email protected]> Cc: Yu Zhang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2023-08-31  KVM: x86/mmu: Disallow guest from using !visible slots for page tables  (Sean Christopherson, 1 file, -1/+6)
Explicitly inject a page fault if guest attempts to use a !visible gfn as a page table. kvm_vcpu_gfn_to_hva_prot() will naturally handle the case where there is no memslot, but doesn't catch the scenario where the gfn points at a KVM-internal memslot. Letting the guest backdoor its way into accessing KVM-internal memslots isn't dangerous on its own, e.g. at worst the guest can crash itself, but disallowing the behavior will simplify fixing how KVM handles !visible guest root gfns (immediately synthesizing a triple fault when loading the root is architecturally wrong). Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2023-08-31  KVM: x86/mmu: Harden TDP MMU iteration against root w/o shadow page  (Sean Christopherson, 1 file, -5/+6)
Explicitly check that tdp_iter_start() is handed a valid shadow page to harden KVM against bugs, e.g. if KVM calls into the TDP MMU with an invalid or shadow MMU root (which would be a fatal KVM bug), the shadow page pointer will be NULL. Opportunistically stop the TDP MMU iteration instead of continuing on with garbage if the incoming root is bogus. Attempting to walk a garbage root is more likely to cause major problems than doing nothing. Cc: Yu Zhang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2023-08-31  KVM: x86/mmu: Harden new PGD against roots without shadow pages  (Sean Christopherson, 1 file, -6/+19)
Harden kvm_mmu_new_pgd() against NULL pointer dereference bugs by sanity checking that the target root has an associated shadow page prior to dereferencing said shadow page. The code in question is guaranteed to only see roots with shadow pages as fast_pgd_switch() explicitly frees the current root if it doesn't have a shadow page, i.e. is a PAE root, and that in turn prevents valid roots from being cached, but that's all very subtle. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2023-08-31  KVM: x86/mmu: Add helper to convert root hpa to shadow page  (Sean Christopherson, 3 files, -16/+23)
Add a dedicated helper for converting a root hpa to a shadow page in anticipation of using a "dummy" root to handle the scenario where KVM needs to load a valid shadow root (from hardware's perspective), but the guest doesn't have a visible root to shadow. Similar to PAE roots, the dummy root won't have an associated kvm_mmu_page and will need special handling when finding a shadow page given a root. Opportunistically retrieve the root shadow page in kvm_mmu_sync_roots() *after* verifying the root is unsync (the dummy root can never be unsync). Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
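A sketch of the helper's shape (illustrative; the dummy-root special case lands in the later patches above):

  static inline struct kvm_mmu_page *root_to_sp(hpa_t root)
  {
  	/*
  	 * Treat the root as a SPTE so any non-PA bits are stripped.
  	 * Special roots (PAE entries and, later, the dummy root) have
  	 * no associated shadow page and need separate handling.
  	 */
  	return spte_to_child_sp(root);
  }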
2023-08-31  drm/i915/gvt: Drop final dependencies on KVM internal details  (Sean Christopherson, 2 files, -2/+3)
Open code gpa_to_gfn() in kvmgt_page_track_write() and drop KVMGT's dependency on kvm_host.h, i.e. include only kvm_page_track.h. KVMGT assumes "gfn == gpa >> PAGE_SHIFT" all over the place, including a few lines below in the same function with the same gpa, i.e. there's no reason to use KVM's helper for this one case. No functional change intended. Reviewed-by: Yan Zhao <[email protected]> Tested-by: Yongwei Ma <[email protected]> Reviewed-by: Zhi Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2023-08-31  KVM: x86/mmu: Handle KVM bookkeeping in page-track APIs, not callers  (Sean Christopherson, 3 files, -22/+24)
Get/put references to KVM when a page-track notifier is (un)registered instead of relying on the caller to do so. Forcing the caller to do the bookkeeping is unnecessary and adds one more thing for users to get wrong, e.g. see commit 9ed1fdee9ee3 ("drm/i915/gvt: Get reference to KVM iff attachment to VM is successful"). Reviewed-by: Yan Zhao <[email protected]> Tested-by: Yongwei Ma <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2023-08-31  KVM: x86/mmu: Drop @slot param from exported/external page-track APIs  (Sean Christopherson, 5 files, -58/+80)
Refactor KVM's exported/external page-track, a.k.a. write-track, APIs to take only the gfn and do the required memslot lookup in KVM proper. Forcing users of the APIs to get the memslot unnecessarily bleeds KVM internals into KVMGT and complicates usage of the APIs. No functional change intended. Reviewed-by: Yan Zhao <[email protected]> Tested-by: Yongwei Ma <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
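In terms of prototypes, the refactor is roughly the following (a sketch; names follow the rename further down this log, and the exact return types are per the patch):

  /* Before: callers performed the memslot lookup themselves. */
  int kvm_write_track_add_gfn(struct kvm *kvm,
  			    struct kvm_memory_slot *slot, gfn_t gfn);

  /* After: KVM proper resolves gfn -> memslot internally. */
  int kvm_write_track_add_gfn(struct kvm *kvm, gfn_t gfn);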
2023-08-31  KVM: x86/mmu: Bug the VM if write-tracking is used but not enabled  (Sean Christopherson, 1 file, -2/+2)
Bug the VM if something attempts to write-track a gfn, but write-tracking isn't enabled. The VM is doomed (and KVM has an egregious bug) if KVM or KVMGT wants to shadow guest page tables but can't because write-tracking isn't enabled. Tested-by: Yongwei Ma <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
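The guard amounts to something like this at the top of the write-track add/remove paths (a sketch); KVM_BUG_ON() marks the VM as bugged and unusable rather than merely warning:

  if (KVM_BUG_ON(!kvm_page_track_write_tracking_enabled(kvm), kvm))
  	return;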
2023-08-31  KVM: x86/mmu: Assert that correct locks are held for page write-tracking  (Sean Christopherson, 1 file, -6/+11)
When adding/removing gfns to/from write-tracking, assert that mmu_lock is held for write, and that either slots_lock or kvm->srcu is held. mmu_lock must be held for write to protect gfn_write_track's refcount, and SRCU or slots_lock must be held to protect the memslot itself. Tested-by: Yan Zhao <[email protected]> Tested-by: Yongwei Ma <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
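Expressed as lockdep assertions, the locking rules look roughly like:

  /* gfn_write_track's refcount is protected by mmu_lock held for write. */
  lockdep_assert_held_write(&kvm->mmu_lock);

  /* The memslot itself is protected by slots_lock or SRCU. */
  lockdep_assert_once(lockdep_is_held(&kvm->slots_lock) ||
  		    srcu_read_lock_held(&kvm->srcu));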
2023-08-31  KVM: x86/mmu: Rename page-track APIs to reflect the new reality  (Sean Christopherson, 5 files, -23/+21)
Rename the page-track APIs to capture that they're all about tracking writes, now that the facade of supporting multiple modes is gone. Opportunistically replace "slot" with "gfn" in anticipation of removing the @slot param from the external APIs. No functional change intended. Tested-by: Yongwei Ma <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2023-08-31  KVM: x86/mmu: Drop infrastructure for multiple page-track modes  (Sean Christopherson, 6 files, -102/+48)
Drop "support" for multiple page-track modes, as there is no evidence that array-based and refcounted metadata is the optimal solution for other modes, nor is there any evidence that other use cases, e.g. for access-tracking, will be a good fit for the page-track machinery in general. E.g. one potential use case of access-tracking would be to prevent guest access to poisoned memory (from the guest's perspective). In that case, the number of poisoned pages is likely to be a very small percentage of the guest memory, and there is no need to reference count the number of access-tracking users, i.e. expanding gfn_track[] for a new mode would be grossly inefficient. And for poisoned memory, host userspace would also likely want to trap accesses, e.g. to inject #MC into the guest, and that isn't currently supported by the page-track framework. A better alternative for that poisoned page use case is likely a variation of the proposed per-gfn attributes overlay (linked), which would allow efficiently tracking the sparse set of poisoned pages, and by default would exit to userspace on access. Link: https://lore.kernel.org/all/[email protected] Cc: Ben Gardon <[email protected]> Tested-by: Yongwei Ma <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2023-08-31  KVM: x86/mmu: Use page-track notifiers iff there are external users  (Sean Christopherson, 4 files, -16/+47)
Disable the page-track notifier code at compile time if there are no external users, i.e. if CONFIG_KVM_EXTERNAL_WRITE_TRACKING=n. KVM itself now hooks emulated writes directly instead of relying on the page-track mechanism. Provide a stub for "struct kvm_page_track_notifier_node" so that including headers directly from the command line, e.g. for testing include guards, doesn't fail due to a struct having an incomplete type. Reviewed-by: Yan Zhao <[email protected]> Tested-by: Yongwei Ma <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
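The stub keeps the header self-contained when the notifier code is compiled out; a sketch of the shape (hook list abridged):

  #ifdef CONFIG_KVM_EXTERNAL_WRITE_TRACKING
  struct kvm_page_track_notifier_node {
  	struct hlist_node node;

  	void (*track_write)(gpa_t gpa, const u8 *new, int bytes,
  			    struct kvm_page_track_notifier_node *node);
  	void (*track_remove_region)(gfn_t gfn, unsigned long nr_pages,
  				    struct kvm_page_track_notifier_node *node);
  };
  #else
  /* Empty stub so lone includes and include-guard tests still compile. */
  struct kvm_page_track_notifier_node {};
  #endif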
2023-08-31  KVM: x86/mmu: Move KVM-only page-track declarations to internal header  (Sean Christopherson, 5 files, -27/+39)
Bury the declaration of the page-track helpers that are intended only for internal KVM use in a "private" header. In addition to guarding against unwanted usage of the internal-only helpers, dropping their definitions avoids exposing other structures that should be KVM-internal, e.g. for memslots. This is a baby step toward making kvm_host.h a KVM-internal header in the very distant future. Tested-by: Yongwei Ma <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2023-08-31  KVM: x86: Remove the unused page-track hook track_flush_slot()  (Yan Zhao, 3 files, -39/+0)
Remove ->track_flush_slot(); there are no longer any users, and it's unlikely a "flush" hook will ever be the correct API to provide to an external page-track user. Cc: Zhenyu Wang <[email protected]> Suggested-by: Sean Christopherson <[email protected]> Signed-off-by: Yan Zhao <[email protected]> Tested-by: Yongwei Ma <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2023-08-31  drm/i915/gvt: switch from ->track_flush_slot() to ->track_remove_region()  (Yan Zhao, 1 file, -12/+9)
Switch from the poorly named and flawed ->track_flush_slot() to the newly introduced ->track_remove_region(). From KVMGT's perspective, the two hooks are functionally equivalent, the only difference being that ->track_remove_region() is called only when KVM is 100% certain the memory region will be removed, i.e. is invoked slightly later in KVM's memslot modification flow. Cc: Zhenyu Wang <[email protected]> Suggested-by: Sean Christopherson <[email protected]> Signed-off-by: Yan Zhao <[email protected]> [sean: handle name change, massage changelog, rebase] Tested-by: Yan Zhao <[email protected]> Tested-by: Yongwei Ma <[email protected]> Reviewed-by: Zhi Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2023-08-31  KVM: x86: Add a new page-track hook to handle memslot deletion  (Yan Zhao, 3 files, -2/+40)
Add a new page-track hook, track_remove_region(), that is called when a memslot DELETE operation is about to be committed. The "remove" hook will be used by KVMGT and will effectively replace the existing track_flush_slot() altogether now that KVM itself doesn't rely on the "flush" hook either. The "flush" hook is flawed as it's invoked before the memslot operation is guaranteed to succeed, i.e. KVM might ultimately keep the existing memslot without notifying external page track users, a.k.a. KVMGT. In practice, this can't currently happen on x86, but there are no guarantees that won't change in the future, not to mention that "flush" does a very poor job of describing what is happening. Pass in the gfn+nr_pages instead of the slot itself so external users, i.e. KVMGT, don't need to exposed to KVM internals (memslots). This will help set the stage for additional cleanups to the page-track APIs. Opportunistically align the existing srcu_read_lock_held() usage so that the new case doesn't stand out like a sore thumb (and not aligning the new code makes bots unhappy). Cc: Zhenyu Wang <[email protected]> Tested-by: Yongwei Ma <[email protected]> Signed-off-by: Yan Zhao <[email protected]> Co-developed-by: Sean Christopherson <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
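The new hook's signature passes the affected range rather than the slot itself, roughly:

  /* Called just before a memslot DELETE operation is committed (sketch). */
  void (*track_remove_region)(gfn_t gfn, unsigned long nr_pages,
  			    struct kvm_page_track_notifier_node *node);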
2023-08-31  drm/i915/gvt: Don't bother removing write-protection on to-be-deleted slot  (Sean Christopherson, 1 file, -7/+1)
When handling a slot "flush", don't call back into KVM to drop write protection for gfns in the slot. Now that KVM rejects attempts to move memory slots while KVMGT is attached, the only time a slot is "flushed" is when it's being removed, i.e. the memslot and all its write-tracking metadata is about to be deleted. Reviewed-by: Yan Zhao <[email protected]> Tested-by: Yongwei Ma <[email protected]> Reviewed-by: Zhi Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2023-08-31  KVM: x86: Reject memslot MOVE operations if KVMGT is attached  (Sean Christopherson, 3 files, -0/+15)
Disallow moving memslots if the VM has external page-track users, i.e. if KVMGT is being used to expose a virtual GPU to the guest, as KVMGT doesn't correctly handle moving memory regions. Note, this is potential ABI breakage! E.g. userspace could move regions that aren't shadowed by KVMGT without harming the guest. However, the only known user of KVMGT is QEMU, and QEMU doesn't move generic memory regions. KVM's own support for moving memory regions was also broken for multiple years (albeit for an edge case, but arguably moving RAM is itself an edge case), e.g. see commit edd4fa37baa6 ("KVM: x86: Allocate new rmap and large page tracking when moving memslot"). Reviewed-by: Yan Zhao <[email protected]> Tested-by: Yongwei Ma <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2023-08-31  KVM: drm/i915/gvt: Drop @vcpu from KVM's ->track_write() hook  (Sean Christopherson, 3 files, -10/+7)
Drop @vcpu from KVM's ->track_write() hook provided for external users of the page-track APIs now that KVM itself doesn't use the page-track mechanism. Reviewed-by: Yan Zhao <[email protected]> Tested-by: Yongwei Ma <[email protected]> Reviewed-by: Zhi Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2023-08-31  KVM: x86/mmu: Don't bounce through page-track mechanism for guest PTEs  (Sean Christopherson, 4 files, -12/+6)
Don't use the generic page-track mechanism to handle writes to guest PTEs in KVM's MMU. KVM's MMU needs access to information that should not be exposed to external page-track users, e.g. KVM needs (for some definitions of "need") the vCPU to query the current paging mode, whereas external users, i.e. KVMGT, have no ties to the current vCPU and so should never need the vCPU. Moving away from the page-track mechanism will allow dropping use of the page-track mechanism for KVM's own MMU, and will also allow simplifying and cleaning up the page-track APIs. Reviewed-by: Yan Zhao <[email protected]> Tested-by: Yongwei Ma <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2023-08-31  KVM: x86/mmu: Don't rely on page-track mechanism to flush on memslot change  (Sean Christopherson, 1 file, -8/+2)
Call kvm_mmu_zap_all_fast() directly when flushing a memslot instead of bouncing through the page-track mechanism. KVM (unfortunately) needs to zap and flush all page tables on memslot DELETE/MOVE irrespective of whether KVM is shadowing guest page tables. This will allow changing KVM to register a page-track notifier on the first shadow root allocation, and will also allow deleting the misguided kvm_page_track_flush_slot() hook itself once KVM-GT also moves to a different method for reacting to memslot changes. No functional change intended. Cc: Yan Zhao <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Yan Zhao <[email protected]> Tested-by: Yongwei Ma <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2023-08-31  KVM: x86/mmu: Move kvm_arch_flush_shadow_{all,memslot}() to mmu.c  (Sean Christopherson, 3 files, -13/+12)
Move x86's implementation of kvm_arch_flush_shadow_{all,memslot}() into mmu.c, and make kvm_mmu_zap_all() static as it was globally visible only for kvm_arch_flush_shadow_all(). This will allow refactoring kvm_arch_flush_shadow_memslot() to call kvm_mmu_zap_all() directly without having to expose kvm_mmu_zap_all_fast() outside of mmu.c. Keeping everything in mmu.c will also likely simplify supporting TDX, which intends to do zap only relevant SPTEs on memslot updates. No functional change intended. Suggested-by: Yan Zhao <[email protected]> Tested-by: Yongwei Ma <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
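After this move (together with the flush change one entry up), the two arch hooks reduce to thin wrappers in mmu.c, roughly:

  void kvm_arch_flush_shadow_all(struct kvm *kvm)
  {
  	kvm_mmu_zap_all(kvm);
  }

  void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
  				   struct kvm_memory_slot *slot)
  {
  	kvm_mmu_zap_all_fast(kvm);
  }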
2023-08-31  drm/i915/gvt: Protect gfn hash table with vgpu_lock  (Sean Christopherson, 2 files, -24/+25)
Use vgpu_lock instead of KVM's mmu_lock to protect accesses to the hash table used to track which gfns are write-protected when shadowing the guest's GTT, and hoist the acquisition of vgpu_lock from intel_vgpu_page_track_handler() out to its sole caller, kvmgt_page_track_write(). This fixes a bug where kvmgt_page_track_write(), which doesn't hold kvm->mmu_lock, could race with intel_gvt_page_track_remove() and trigger a use-after-free. Fixing kvmgt_page_track_write() by taking kvm->mmu_lock is not an option as mmu_lock is a r/w spinlock, and intel_vgpu_page_track_handler() might sleep when acquiring vgpu->cache_lock deep down the callstack:

  intel_vgpu_page_track_handler()
  |
  |-> page_track->handler() / ppgtt_write_protection_handler()
      |
      |-> ppgtt_handle_guest_write_page_table_bytes()
          |
          |-> ppgtt_handle_guest_write_page_table()
              |
              |-> ppgtt_handle_guest_entry_removal()
                  |
                  |-> ppgtt_invalidate_pte()
                      |
                      |-> intel_gvt_dma_unmap_guest_page()
                          |
                          |-> mutex_lock(&vgpu->cache_lock);

Reviewed-by: Yan Zhao <[email protected]> Tested-by: Yongwei Ma <[email protected]> Reviewed-by: Zhi Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2023-08-31  drm/i915/gvt: Drop unused helper intel_vgpu_reset_gtt()  (Sean Christopherson, 2 files, -19/+0)
Drop intel_vgpu_reset_gtt() as it no longer has any callers. In addition to eliminating dead code, this eliminates the last possible scenario where __kvmgt_protect_table_find() can be reached without holding vgpu_lock. Requiring vgpu_lock to be held when calling __kvmgt_protect_table_find() will allow a protecting the gfn hash with vgpu_lock without too much fuss. No functional change intended. Fixes: ba25d977571e ("drm/i915/gvt: Do not destroy ppgtt_mm during vGPU D3->D0.") Reviewed-by: Yan Zhao <[email protected]> Tested-by: Yongwei Ma <[email protected]> Reviewed-by: Zhi Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2023-08-31  drm/i915/gvt: Use an "unsigned long" to iterate over memslot gfns  (Sean Christopherson, 1 file, -1/+1)
Use an "unsigned long" instead of an "int" when iterating over the gfns in a memslot. The number of pages in the memslot is tracked as an "unsigned long", e.g. KVMGT could theoretically break if a KVM memslot larger than 16TiB were deleted (2^32 * 4KiB). Reviewed-by: Yan Zhao <[email protected]> Tested-by: Yongwei Ma <[email protected]> Reviewed-by: Zhi Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2023-08-31  drm/i915/gvt: Don't rely on KVM's gfn_to_pfn() to query possible 2M GTT  (Sean Christopherson, 2 files, -42/+8)
Now that gvt_pin_guest_page() explicitly verifies the pinned PFN is a transparent hugepage page, don't use KVM's gfn_to_pfn() to pre-check if a 2MiB GTT entry is possible and instead just try to map the GFN with a 2MiB entry. Using KVM to query a pfn that is ultimately managed through VFIO is odd, and KVM's gfn_to_pfn() is not intended for non-KVM consumption; it's exported only because of KVM vendor modules (x86 and PPC). Open code the check on 2MiB support instead of keeping is_2MB_gtt_possible() around for a single line of code. Move the call to intel_gvt_dma_map_guest_page() for a 4KiB entry into its case statement, i.e. fork the common path into the 4KiB and 2MiB "direct" shadow paths. Keeping the call in the "common" path is arguably more in the spirit of "one change per patch", but retaining the local "page_size" variable is silly, i.e. the call site will be changed either way, and jumping around the no-longer-common code is more subtle and rather odd, i.e. would just need to be immediately cleaned up. Drop the error message from gvt_pin_guest_page() when KVMGT attempts to shadow a 2MiB guest page that isn't backed by a compatible hugepage in the host. Dropping the pre-check on a THP makes it much more likely that the "error" will be encountered in normal operation. Reviewed-by: Yan Zhao <[email protected]> Tested-by: Yan Zhao <[email protected]> Tested-by: Yongwei Ma <[email protected]> Reviewed-by: Zhi Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2023-08-31  drm/i915/gvt: Error out on an attempt to shadow an unknown GTT entry type  (Sean Christopherson, 1 file, -0/+1)
Bail from ppgtt_populate_shadow_entry() if an unexpected GTT entry type is encountered instead of subtly falling through to the common "direct shadow" path. Eliminating the default/error path's reliance on the common handling will allow hoisting intel_gvt_dma_map_guest_page() into the case statements so that the 2MiB case can try intel_gvt_dma_map_guest_page() and fallback to splitting the entry on failure. Reviewed-by: Zhi Wang <[email protected]> Tested-by: Yongwei Ma <[email protected]> Reviewed-by: Yan Zhao <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2023-08-31  drm/i915/gvt: Explicitly check that vGPU is attached before shadowing  (Sean Christopherson, 1 file, -2/+3)
Move the check that a vGPU is attached from is_2MB_gtt_possible() all the way up to shadow_ppgtt_mm() to avoid unnecessary work, and to make it more obvious that a future cleanup of is_2MB_gtt_possible() isn't introducing a bug. is_2MB_gtt_possible() has only one caller, ppgtt_populate_shadow_entry(), and all paths in ppgtt_populate_shadow_entry() eventually check for attachment by way of intel_gvt_dma_map_guest_page(). And of the paths that lead to ppgtt_populate_shadow_entry(), shadow_ppgtt_mm() is the only one that doesn't already check for INTEL_VGPU_STATUS_ACTIVE or INTEL_VGPU_STATUS_ATTACHED.

  workload_thread() <= pick_next_workload() => INTEL_VGPU_STATUS_ACTIVE
  |
  |-> dispatch_workload()
  |   |
  |   |-> prepare_workload()
  |       |
  |       |-> intel_vgpu_sync_oos_pages()
  |       |   |
  |       |   |-> ppgtt_set_guest_page_sync()
  |       |       |
  |       |       |-> sync_oos_page()
  |       |           |
  |       |           |-> ppgtt_populate_shadow_entry()
  |       |
  |       |-> intel_vgpu_flush_post_shadow()
  | 1:        |
  |           |-> ppgtt_handle_guest_write_page_table()
  |               |
  |               |-> ppgtt_handle_guest_entry_add()
  | 2:                |
  |                   |-> ppgtt_populate_spt_by_guest_entry()
  |                   |   |
  |                   |   |-> ppgtt_populate_spt()
  |                   |       |
  |                   |       |-> ppgtt_populate_shadow_entry()
  |                   |           |
  |                   |           |-> ppgtt_populate_spt_by_guest_entry() [see 2]
  |                   |
  |                   |-> ppgtt_populate_shadow_entry()

  kvmgt_page_track_write() <= KVM callback => INTEL_VGPU_STATUS_ATTACHED
  |
  |-> intel_vgpu_page_track_handler()
      |
      |-> ppgtt_write_protection_handler()
          |
          |-> ppgtt_handle_guest_write_page_table_bytes()
              |
              |-> ppgtt_handle_guest_write_page_table() [see 1]

Reviewed-by: Yan Zhao <[email protected]> Tested-by: Yan Zhao <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>