aboutsummaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2022-04-05ACPI: Add perf low power callbackStephane Eranian1-0/+6
Add an optional callback needed by some PMU features, e.g., AMD BRS, to give a chance to the perf_events code to change its state before a CPU goes to low power and after it comes back. The callback is void when the PERF_NEEDS_LOPWR_CB flag is not set. This flag must be set in arch specific perf_event.h header whenever needed. When not set, there is no impact on the ACPI code. Signed-off-by: Stephane Eranian <[email protected]> [peterz: build fix] Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-04-05locking/mutex: Make contention tracepoints more consistent wrt adaptive spinningPeter Zijlstra1-1/+3
Have the trace_contention_*() tracepoints consistently include adaptive spinning. In order to differentiate between the spinning and non-spinning states add LCB_F_MUTEX and combine with LCB_F_SPIN. The consequence is that a mutex contention can now triggler multiple _begin() tracepoints before triggering an _end(). Additionally, this fixes one path where mutex would trigger _end() without ever seeing a _begin(). Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
2022-04-05perf/core: Add perf_clear_branch_entry_bitfields() helperStephane Eranian1-0/+16
Make it simpler to reset all the info fields on the perf_branch_entry by adding a helper inline function. The goal is to centralize the initialization to avoid missing a field in case more are added. Signed-off-by: Stephane Eranian <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-04-05locking: Add lock contention tracepointsNamhyung Kim1-3/+58
This adds two new lock contention tracepoints like below: * lock:contention_begin * lock:contention_end The lock:contention_begin takes a flags argument to classify locks. I found it useful to identify what kind of locks it's tracing like if it's spinning or sleeping, reader-writer lock, real-time, and per-cpu. Move tracepoint definitions into mutex.c so that we can use them without lockdep. Signed-off-by: Namhyung Kim <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Tested-by: Hyeonggon Yoo <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2022-04-05lockdep: Fix -Wunused-parameter for _THIS_IP_Nick Desaulniers2-3/+3
While looking into a bug related to the compiler's handling of addresses of labels, I noticed some uses of _THIS_IP_ seemed unused in lockdep. Drive by cleanup. -Wunused-parameter: kernel/locking/lockdep.c:1383:22: warning: unused parameter 'ip' kernel/locking/lockdep.c:4246:48: warning: unused parameter 'ip' kernel/locking/lockdep.c:4844:19: warning: unused parameter 'ip' Signed-off-by: Nick Desaulniers <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Waiman Long <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-04-05Revert "locking/local_lock: Make the empty local_lock_*() function a macro."Sebastian Andrzej Siewior1-3/+3
With volatile removed from arch_raw_cpu_ptr() the compiler no longer creates the per-CPU reference. The usage of the macro can be reverted now. Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-04-05static_call: Remove __DEFINE_STATIC_CALL macroChristophe Leroy1-13/+10
Only DEFINE_STATIC_CALL use __DEFINE_STATIC_CALL macro now when CONFIG_HAVE_STATIC_CALL is selected. Only keep __DEFINE_STATIC_CALL() for the generic fallback, and also use it to implement DEFINE_STATIC_CALL_NULL() in that case. Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Josh Poimboeuf <[email protected]> Link: https://lore.kernel.org/r/329074f92d96e3220ebe15da7bbe2779beee31eb.1647253456.git.christophe.leroy@csgroup.eu
2022-04-05static_call: Properly initialise DEFINE_STATIC_CALL_RET0()Christophe Leroy1-3/+17
When a static call is updated with __static_call_return0() as target, arch_static_call_transform() set it to use an optimised set of instructions which are meant to lay in the same cacheline. But when initialising a static call with DEFINE_STATIC_CALL_RET0(), we get a branch to the real __static_call_return0() function instead of getting the optimised setup: c00d8120 <__SCT__perf_snapshot_branch_stack>: c00d8120: 4b ff ff f4 b c00d8114 <__static_call_return0> c00d8124: 3d 80 c0 0e lis r12,-16370 c00d8128: 81 8c 81 3c lwz r12,-32452(r12) c00d812c: 7d 89 03 a6 mtctr r12 c00d8130: 4e 80 04 20 bctr c00d8134: 38 60 00 00 li r3,0 c00d8138: 4e 80 00 20 blr c00d813c: 00 00 00 00 .long 0x0 Add ARCH_DEFINE_STATIC_CALL_RET0_TRAMP() defined by each architecture to setup the optimised configuration, and rework DEFINE_STATIC_CALL_RET0() to call it: c00d8120 <__SCT__perf_snapshot_branch_stack>: c00d8120: 48 00 00 14 b c00d8134 <__SCT__perf_snapshot_branch_stack+0x14> c00d8124: 3d 80 c0 0e lis r12,-16370 c00d8128: 81 8c 81 3c lwz r12,-32452(r12) c00d812c: 7d 89 03 a6 mtctr r12 c00d8130: 4e 80 04 20 bctr c00d8134: 38 60 00 00 li r3,0 c00d8138: 4e 80 00 20 blr c00d813c: 00 00 00 00 .long 0x0 Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Josh Poimboeuf <[email protected]> Link: https://lore.kernel.org/r/1e0a61a88f52a460f62a58ffc2a5f847d1f7d9d8.1647253456.git.christophe.leroy@csgroup.eu
2022-04-05static_call: Don't make __static_call_return0 staticChristophe Leroy1-4/+1
System.map shows that vmlinux contains several instances of __static_call_return0(): c0004fc0 t __static_call_return0 c0011518 t __static_call_return0 c00d8160 t __static_call_return0 arch_static_call_transform() uses the middle one to check whether we are setting a call to __static_call_return0 or not: c0011520 <arch_static_call_transform>: c0011520: 3d 20 c0 01 lis r9,-16383 <== r9 = 0xc001 << 16 c0011524: 39 29 15 18 addi r9,r9,5400 <== r9 += 0x1518 c0011528: 7c 05 48 00 cmpw r5,r9 <== r9 has value 0xc0011518 here So if static_call_update() is called with one of the other instances of __static_call_return0(), arch_static_call_transform() won't recognise it. In order to work properly, global single instance of __static_call_return0() is required. Fixes: 3f2a8fc4b15d ("static_call/x86: Add __static_call_return0()") Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Josh Poimboeuf <[email protected]> Link: https://lkml.kernel.org/r/30821468a0e7d28251954b578e5051dc09300d04.1647258493.git.christophe.leroy@csgroup.eu
2022-04-05dma-buf: finally make dma_resv_excl_fence private v2Christian König1-17/+0
Drivers should never touch this directly. v2: fix rebase clash Signed-off-by: Christian König <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Daniel Vetter <[email protected]>
2022-04-04kunit: split resource API from test.h into new resource.hDaniel Latypov2-299/+322
Background: Currently, a reader looking at kunit/test.h will find the file is quite long, and the first meaty comment is a doc comment about struct kunit_resource. Most users will not ever use the KUnit resource API directly. They'll use kunit_kmalloc() and friends, or decide it's simpler to do cleanups via labels (it often can be) instead of figuring out how to use the API. It's also logically separate from everything else in test.h. Removing it from the file doesn't cause any compilation errors (since struct kunit has `struct list_head resources` to store them). This commit: Let's move it into a kunit/resource.h file and give it a separate page in the docs, kunit/api/resource.rst. We include resource.h at the bottom of test.h since * don't want to force existing users to add a new include if they use the API * it accesses `lock` inside `struct kunit` in a inline func * so we can't just forward declare, and the alternatives require uninlining the func, adding hepers to lock/unlock, or other more invasive changes. Now the first big comment in test.h is about kunit_case, which is a lot more relevant to what a new user wants to know. A side effect of this is git blame won't properly track history by default, users need to run $ git blame -L ,1 -C17 include/kunit/resource.h Signed-off-by: Daniel Latypov <[email protected]> Reviewed-by: David Gow <[email protected]> Reviewed-by: Brendan Higgins <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2022-04-04kunit: Introduce _NULL and _NOT_NULL macrosRicardo Ribalda1-0/+84
Today, when we want to check if a pointer is NULL and not ERR we have two options: KUNIT_EXPECT_TRUE(test, ptr == NULL); or KUNIT_EXPECT_PTR_NE(test, ptr, (struct mystruct *)NULL); Create a new set of macros that take care of NULL checks. Reviewed-by: Brendan Higgins <[email protected]> Reviewed-by: Daniel Latypov <[email protected]> Signed-off-by: Ricardo Ribalda <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2022-04-04drm: fix a kernel-doc typoRandy Dunlap1-1/+1
Fix a build warning from 'make htmldocs' by correcting the lock name in the kernel-doc comment. include/drm/drm_file.h:369: warning: Function parameter or member 'master_lookup_lock' not described in 'drm_file' Signed-off-by: Randy Dunlap <[email protected]> Cc: David Airlie <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: [email protected] Reviewed-by: Simon Ser <[email protected]> Signed-off-by: Simon Ser <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2022-04-04drm/gem: Delete gem array fencing helpersDaniel Vetter1-5/+0
Integrated into the scheduler now and all users converted over. v2: Rebased over changes from König. Acked-by: Christian König <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Cc: Maarten Lankhorst <[email protected]> Cc: Maxime Ripard <[email protected]> Cc: Thomas Zimmermann <[email protected]> Cc: David Airlie <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Sumit Semwal <[email protected]> Cc: "Christian König" <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2022-04-04regulator: mt6366: Add support for MT6366 regulatorJohnson Wang1-0/+45
The MT6366 is a regulator found on boards based on MediaTek MT8186 and probably other SoCs. It is a so called pmic and connects as a slave to SoC using SPI, wrapped inside the pmic-wrapper. Reviewed-by: AngeloGioacchino Del Regno <[email protected]> Signed-off-by: Johnson Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]>
2022-04-04IB/uverbs: Move part of enum ib_device_cap_flags to uapiXiao Yang2-38/+97
1) Part of enum ib_device_cap_flags are used by ibv_query_device(3) or ibv_query_device_ex(3), so we define them in include/uapi/rdma/ib_user_verbs.h and only expose them to userspace. 2) Reformat enum ib_device_cap_flags by removing the indent before '='. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Xiao Yang <[email protected]> Reviewed-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2022-04-04IB/uverbs: Move enum ib_raw_packet_caps to uapiXiao Yang2-7/+18
This enum is used by ibv_query_device_ex(3) so it should be defined in include/uapi/rdma/ib_user_verbs.h. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Xiao Yang <[email protected]> Reviewed-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2022-04-04gpio: Restrict usage of GPIO chip irq members before initializationShreeya Patel1-0/+9
GPIO chip irq members are exposed before they could be completely initialized and this leads to race conditions. One such issue was observed for the gc->irq.domain variable which was accessed through the I2C interface in gpiochip_to_irq() before it could be initialized by gpiochip_add_irqchip(). This resulted in Kernel NULL pointer dereference. Following are the logs for reference :- kernel: Call Trace: kernel: gpiod_to_irq+0x53/0x70 kernel: acpi_dev_gpio_irq_get_by+0x113/0x1f0 kernel: i2c_acpi_get_irq+0xc0/0xd0 kernel: i2c_device_probe+0x28a/0x2a0 kernel: really_probe+0xf2/0x460 kernel: RIP: 0010:gpiochip_to_irq+0x47/0xc0 To avoid such scenarios, restrict usage of GPIO chip irq members before they are completely initialized. Signed-off-by: Shreeya Patel <[email protected]> Cc: [email protected] Reviewed-by: Andy Shevchenko <[email protected]> Reviewed-by: Linus Walleij <[email protected]> Signed-off-by: Bartosz Golaszewski <[email protected]>
2022-04-04arm64: dts: mt8192: Add the mmsys reset bit to reset the dsi0Allen-KH Cheng1-0/+3
Reset the DSI hardware is needed to prevent different settings between the bootloader and the kernel. Signed-off-by: Allen-KH Cheng <[email protected]> Reviewed-by: Nícolas F. R. A. Prado <[email protected]> Reviewed-by: AngeloGioacchino Del Regno <[email protected]> Acked-by: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Matthias Brugger <[email protected]>
2022-04-04task_stack, x86/cea: Force-inline stack helpersBorislav Petkov1-1/+1
Force-inline two stack helpers to fix the following objtool warnings: vmlinux.o: warning: objtool: in_task_stack()+0xc: call to task_stack_page() leaves .noinstr.text section vmlinux.o: warning: objtool: in_entry_stack()+0x10: call to cpu_entry_stack() leaves .noinstr.text section Signed-off-by: Borislav Petkov <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-04-04iio: adc: ad_sigma_delta: Add sequencer supportLars-Peter Clausen1-0/+38
Some sigma-delta chips support sampling of multiple channels in continuous mode. When the operating with more than one channel enabled, the channel sequencer cycles through the enabled channels in sequential order, from first channel to the last one. If a channel is disabled, it is skipped by the sequencer. If more than one channel is used in continuous mode, instruct the device to append the status to the SPI transfer (1 extra byte) every time we receive a sample. All sigma-delta chips possessing a sampling sequencer have this ability. Inside the status register there will be the number of the converted channel. In this way, even if the CPU won't keep up with the sampling rate, it won't send to userspace wrong channel samples. When multiple channels are enabled in continuous mode, the device needs to perform a measurement on all slots before we can push to userspace the sample. If, during sequencing and data reading, a channel measurement is lost, a desync occurred. In this case, ad_sigma_delta drops the incomplete sample and waits for the device to send the measurement on the first active slot. Co-developed-by: Alexandru Tachici <[email protected]> Signed-off-by: Alexandru Tachici <[email protected]> Signed-off-by: Lars-Peter Clausen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jonathan Cameron <[email protected]>
2022-04-04linux/fb.h: Spelling s/palette/palette/Geert Uytterhoeven1-1/+1
Fix a misspelling of "palette" in a comment. Signed-off-by: Geert Uytterhoeven <[email protected]> Reviewed-by: Pekka Paalanen <[email protected]> Signed-off-by: Helge Deller <[email protected]>
2022-04-03bpf: Correct the comment for BTF kind bitfieldHaiyue Wang1-2/+2
The commit 8fd886911a6a ("bpf: Add BTF_KIND_FLOAT to uapi") has extended the BTF kind bitfield from 4 to 5 bits, correct the comment. Signed-off-by: Haiyue Wang <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-04-03Merge tag 'trace-v5.18-2' of ↵Linus Torvalds11-68/+29
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull more tracing updates from Steven Rostedt: - Rename the staging files to give them some meaning. Just stage1,stag2,etc, does not show what they are for - Check for NULL from allocation in bootconfig - Hold event mutex for dyn_event call in user events - Mark user events to broken (to work on the API) - Remove eBPF updates from user events - Remove user events from uapi header to keep it from being installed. - Move ftrace_graph_is_dead() into inline as it is called from hot paths and also convert it into a static branch. * tag 'trace-v5.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Move user_events.h temporarily out of include/uapi ftrace: Make ftrace_graph_is_dead() a static branch tracing: Set user_events to BROKEN tracing/user_events: Remove eBPF interfaces tracing/user_events: Hold event_mutex during dyn_event_add proc: bootconfig: Add null pointer check tracing: Rename the staging files for trace_events
2022-04-03Merge tag 'core-urgent-2022-04-03' of ↵Linus Torvalds1-3/+0
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RT signal fix from Thomas Gleixner: "Revert the RT related signal changes. They need to be reworked and generalized" * tag 'core-urgent-2022-04-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Revert "signal, x86: Delay calling signals in atomic on RT enabled kernels"
2022-04-03Merge tag 'dma-mapping-5.18-1' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds2-131/+1
Pull more dma-mapping updates from Christoph Hellwig: - fix a regression in dma remap handling vs AMD memory encryption (me) - finally kill off the legacy PCI DMA API (Christophe JAILLET) * tag 'dma-mapping-5.18-1' of git://git.infradead.org/users/hch/dma-mapping: dma-mapping: move pgprot_decrypted out of dma_pgprot PCI/doc: cleanup references to the legacy PCI DMA API PCI: Remove the deprecated "pci-dma-compat.h" API
2022-04-03dma-buf: add dma_resv_get_singleton v2Christian König1-0/+2
Add a function to simplify getting a single fence for all the fences in the dma_resv object. v2: fix ref leak in error handling Signed-off-by: Christian König <[email protected]> Reviewed-by: Daniel Vetter <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2022-04-02Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2-23/+48
Pull kvm fixes from Paolo Bonzini: - Only do MSR filtering for MSRs accessed by rdmsr/wrmsr - Documentation improvements - Prevent module exit until all VMs are freed - PMU Virtualization fixes - Fix for kvm_irq_delivery_to_apic_fast() NULL-pointer dereferences - Other miscellaneous bugfixes * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (42 commits) KVM: x86: fix sending PV IPI KVM: x86/mmu: do compare-and-exchange of gPTE via the user address KVM: x86: Remove redundant vm_entry_controls_clearbit() call KVM: x86: cleanup enter_rmode() KVM: x86: SVM: fix tsc scaling when the host doesn't support it kvm: x86: SVM: remove unused defines KVM: x86: SVM: move tsc ratio definitions to svm.h KVM: x86: SVM: fix avic spec based definitions again KVM: MIPS: remove reference to trap&emulate virtualization KVM: x86: document limitations of MSR filtering KVM: x86: Only do MSR filtering when access MSR by rdmsr/wrmsr KVM: x86/emulator: Emulate RDPID only if it is enabled in guest KVM: x86/pmu: Fix and isolate TSX-specific performance event logic KVM: x86: mmu: trace kvm_mmu_set_spte after the new SPTE was set KVM: x86/svm: Clear reserved bits written to PerfEvtSeln MSRs KVM: x86: Trace all APICv inhibit changes and capture overall status KVM: x86: Add wrappers for setting/clearing APICv inhibits KVM: x86: Make APICv inhibit reasons an enum and cleanup naming KVM: X86: Handle implicit supervisor access with SMAP KVM: X86: Rename variable smap to not_smap in permission_fault() ...
2022-04-02tracing: mark user_events as BROKENSteven Rostedt (Google)1-0/+0
After being merged, user_events become more visible to a wider audience that have concerns with the current API. It is too late to fix this for this release, but instead of a full revert, just mark it as BROKEN (which prevents it from being selected in make config). Then we can work finding a better API. If that fails, then it will need to be completely reverted. To not have the code silently bitrot, still allow building it with COMPILE_TEST. And to prevent the uapi header from being installed, then later changed, and then have an old distro user space see the old version, move the header file out of the uapi directory. Surround the include with CONFIG_COMPILE_TEST to the current location, but when the BROKEN tag is taken off, it will use the uapi directory, and fail to compile. This is a good way to remind us to move the header back. Link: https://lore.kernel.org/all/[email protected] Link: https://lkml.kernel.org/r/[email protected] Suggested-by: Mathieu Desnoyers <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-04-02tracing: Move user_events.h temporarily out of include/uapiSteven Rostedt (Google)1-0/+0
While user_events API is under development and has been marked for broken to not let the API become fixed, move the header file out of the uapi directory. This is to prevent it from being installed, then later changed, and then have an old distro user space update with a new kernel, where applications see the user_events being available, but the old header is in place, and then they get compiled incorrectly. Also, surround the include with CONFIG_COMPILE_TEST to the current location, but when the BROKEN tag is taken off, it will use the uapi directory, and fail to compile. This is a good way to remind us to move the header back. Link: https://lore.kernel.org/all/[email protected] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Suggested-by: Mathieu Desnoyers <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
2022-04-02ftrace: Make ftrace_graph_is_dead() a static branchChristophe Leroy1-1/+15
ftrace_graph_is_dead() is used on hot paths, it just reads a variable in memory and is not worth suffering function call constraints. For instance, at entry of prepare_ftrace_return(), inlining it avoids saving prepare_ftrace_return() parameters to stack and restoring them after calling ftrace_graph_is_dead(). While at it using a static branch is even more performant and is rather well adapted considering that the returned value will almost never change. Inline ftrace_graph_is_dead() and replace 'kill_ftrace_graph' bool by a static branch. The performance improvement is noticeable. Link: https://lkml.kernel.org/r/e0411a6a0ed3eafff0ad2bc9cd4b0e202b4617df.1648623570.git.christophe.leroy@csgroup.eu Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
2022-04-02tracing/user_events: Remove eBPF interfacesBeau Belgrave1-53/+0
Remove eBPF interfaces within user_events to ensure they are fully reviewed. Link: https://lore.kernel.org/all/20220329165718.GA10381@kbox/ Link: https://lkml.kernel.org/r/[email protected] Suggested-by: Alexei Starovoitov <[email protected]> Signed-off-by: Beau Belgrave <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
2022-04-02tracing: Rename the staging files for trace_eventsSteven Rostedt (Google)9-14/+14
When looking for implementation of different phases of the creation of the TRACE_EVENT() macro, it is pretty useless when all helper macro redefinitions are in files labeled "stageX_defines.h". Rename them to state which phase the files are for. For instance, when looking for the defines that are used to create the event fields, seeing "stage4_event_fields.h" gives the developer a good idea that the defines are in that file. Signed-off-by: Steven Rostedt (Google) <[email protected]>
2022-04-02KVM: x86: Accept KVM_[GS]ET_TSC_KHZ as a VM ioctl.David Woodhouse1-1/+3
This sets the default TSC frequency for subsequently created vCPUs. Signed-off-by: David Woodhouse <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-04-02KVM: x86/xen: Advertise and document KVM_XEN_HVM_CONFIG_EVTCHN_SENDDavid Woodhouse1-0/+1
At the end of the patch series adding this batch of event channel acceleration features, finally add the feature bit which advertises them and document it all. For SCHEDOP_poll we need to wake a polling vCPU when a given port is triggered, even when it's masked — and we want to implement that in the kernel, for efficiency. So we want the kernel to know that it has sole ownership of event channel delivery. Thus, we allow userspace to make the 'promise' by setting the corresponding feature bit in its KVM_XEN_HVM_CONFIG call. As we implement SCHEDOP_poll bypass later, we will do so only if that promise has been made by userspace. Signed-off-by: David Woodhouse <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-04-02KVM: x86/xen: Support per-vCPU event channel upcall via local APICDavid Woodhouse1-0/+2
Windows uses a per-vCPU vector, and it's delivered via the local APIC basically like an MSI (with associated EOI) unlike the traditional guest-wide vector which is just magically asserted by Xen (and in the KVM case by kvm_xen_has_interrupt() / kvm_cpu_get_extint()). Now that the kernel is able to raise event channel events for itself, being able to do so for Windows guests is also going to be useful. Signed-off-by: David Woodhouse <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-04-02KVM: x86/xen: Kernel acceleration for XENVER_versionDavid Woodhouse1-1/+2
Turns out this is a fast path for PV guests because they use it to trigger the event channel upcall. So letting it bounce all the way up to userspace is not great. Signed-off-by: David Woodhouse <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-04-02KVM: x86/xen: handle PV timers oneshot modeJoao Martins1-0/+6
If the guest has offloaded the timer virq, handle the following hypercalls for programming the timer: VCPUOP_set_singleshot_timer VCPUOP_stop_singleshot_timer set_timer_op(timestamp_ns) The event channel corresponding to the timer virq is then used to inject events once timer deadlines are met. For now we back the PV timer with hrtimer. [ dwmw2: Add save/restore, 32-bit compat mode, immediate delivery, don't check timer in kvm_vcpu_has_event() ] Signed-off-by: Joao Martins <[email protected]> Signed-off-by: David Woodhouse <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-04-02KVM: x86/xen: Add KVM_XEN_VCPU_ATTR_TYPE_VCPU_IDDavid Woodhouse1-0/+3
In order to intercept hypercalls such as VCPUOP_set_singleshot_timer, we need to be aware of the Xen CPU numbering. This looks a lot like the Hyper-V handling of vpidx, for obvious reasons. Signed-off-by: David Woodhouse <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-04-02KVM: x86/xen: intercept EVTCHNOP_send from guestsJoao Martins1-0/+28
Userspace registers a sending @port to either deliver to an @eventfd or directly back to a local event channel port. After binding events the guest or host may wish to bind those events to a particular vcpu. This is usually done for unbound and and interdomain events. Update requests are handled via the KVM_XEN_EVTCHN_UPDATE flag. Unregistered ports are handled by the emulator. Co-developed-by: Ankur Arora <[email protected]> Co-developed-By: David Woodhouse <[email protected]> Signed-off-by: Joao Martins <[email protected]> Signed-off-by: Ankur Arora <[email protected]> Signed-off-by: David Woodhouse <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-04-02KVM: x86/xen: Support direct injection of event channel eventsDavid Woodhouse1-0/+3
This adds a KVM_XEN_HVM_EVTCHN_SEND ioctl which allows direct injection of events given an explicit { vcpu, port, priority } in precisely the same form that those fields are given in the IRQ routing table. Userspace is currently able to inject 2-level events purely by setting the bits in the shared_info and vcpu_info, but FIFO event channels are harder to deal with; we will need the kernel to take sole ownership of delivery when we support those. A patch advertising this feature with a new bit in the KVM_CAP_XEN_HVM ioctl will be added in a subsequent patch. Signed-off-by: David Woodhouse <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-04-02KVM: x86/xen: Make kvm_xen_set_evtchn() reusable from other placesDavid Woodhouse1-1/+2
Clean it up to return -errno on error consistently, while still being compatible with the return conventions for kvm_arch_set_irq_inatomic() and the kvm_set_irq() callback. We use -ENOTCONN to indicate when the port is masked. No existing users care, except that it's negative. Also allow it to optimise the vCPU lookup. Unless we abuse the lapic map, there is no quick lookup from APIC ID to a vCPU; the logic in kvm_get_vcpu_by_id() will just iterate over all vCPUs till it finds the one it wants. So do that just once and stash the result in the struct kvm_xen_evtchn for next time. Signed-off-by: David Woodhouse <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-04-02KVM: Remove dirty handling from gfn_to_pfn_cache completelyDavid Woodhouse2-10/+5
It isn't OK to cache the dirty status of a page in internal structures for an indefinite period of time. Any time a vCPU exits the run loop to userspace might be its last; the VMM might do its final check of the dirty log, flush the last remaining dirty pages to the destination and complete a live migration. If we have internal 'dirty' state which doesn't get flushed until the vCPU is finally destroyed on the source after migration is complete, then we have lost data because that will escape the final copy. This problem already exists with the use of kvm_vcpu_unmap() to mark pages dirty in e.g. VMX nesting. Note that the actual Linux MM already considers the page to be dirty since we have a writeable mapping of it. This is just about the KVM dirty logging. For the nesting-style use cases (KVM_GUEST_USES_PFN) we will need to track which gfn_to_pfn_caches have been used and explicitly mark the corresponding pages dirty before returning to userspace. But we would have needed external tracking of that anyway, rather than walking the full list of GPCs to find those belonging to this vCPU which are dirty. So let's rely *solely* on that external tracking, and keep it simple rather than laying a tempting trap for callers to fall into. Signed-off-by: David Woodhouse <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-04-02KVM: Use enum to track if cached PFN will be used in guest and/or hostSean Christopherson2-10/+16
Replace the guest_uses_pa and kernel_map booleans in the PFN cache code with a unified enum/bitmask. Using explicit names makes it easier to review and audit call sites. Opportunistically add a WARN to prevent passing garbage; instantating a cache without declaring its usage is either buggy or pointless. Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: David Woodhouse <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-04-02KVM: Don't actually set a request when evicting vCPUs for GFN cache invdSean Christopherson1-7/+31
Don't actually set a request bit in vcpu->requests when making a request purely to force a vCPU to exit the guest. Logging a request but not actually consuming it would cause the vCPU to get stuck in an infinite loop during KVM_RUN because KVM would see the pending request and bail from VM-Enter to service the request. Note, it's currently impossible for KVM to set KVM_REQ_GPC_INVALIDATE as nothing in KVM is wired up to set guest_uses_pa=true. But, it'd be all too easy for arch code to introduce use of kvm_gfn_to_pfn_cache_init() without implementing handling of the request, especially since getting test coverage of MMU notifier interaction with specific KVM features usually requires a directed test. Opportunistically rename gfn_to_pfn_cache_invalidate_start()'s wake_vcpus to evict_vcpus. The purpose of the request is to get vCPUs out of guest mode, it's supposed to _avoid_ waking vCPUs that are blocking. Opportunistically rename KVM_REQ_GPC_INVALIDATE to be more specific as to what it wants to accomplish, and to genericize the name so that it can used for similar but unrelated scenarios, should they arise in the future. Add a comment and documentation to explain why the "no action" request exists. Add compile-time assertions to help detect improper usage. Use the inner assertless helper in the one s390 path that makes requests without a hardcoded request. Cc: David Woodhouse <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-04-01Merge branch 'work.misc' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs updates from Al Viro: "Assorted bits and pieces" * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: aio: drop needless assignment in aio_read() clean overflow checks in count_mounts() a bit seq_file: fix NULL pointer arithmetic warning uml/x86: use x86 load_unaligned_zeropad() asm/user.h: killed unused macros constify struct path argument of finish_automount()/do_add_mount() fs: Remove FIXME comment in generic_write_checks()
2022-04-02drm/ttm: Add a parameter to add extra pages into ttm_ttRamalingam C1-1/+3
Add a parameter called "extra_pages" for ttm_tt_init, to indicate that driver needs extra pages in ttm_tt. v2: Used imperative wording [Thomas and Christian] Signed-off-by: Ramalingam C <[email protected]> cc: Christian Koenig <[email protected]> cc: Hellstrom Thomas <[email protected]> Reviewed-by: Thomas Hellstrom <[email protected]> Reviewed-by: Christian Konig <[email protected]> Reviewed-by: Nirmoy Das <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2022-04-01Merge tag 'for-5.18/drivers-2022-04-01' of git://git.kernel.dk/linux-blockLinus Torvalds2-2/+3
Pull block driver fixes from Jens Axboe: "Followup block driver updates and fixes for the 5.18-rc1 merge window. In detail: - NVMe pull request - Fix multipath hang when disk goes live over reconnect (Anton Eidelman) - fix RCU hole that allowed for endless looping in multipath round robin (Chris Leech) - remove redundant assignment after left shift (Colin Ian King) - add quirks for Samsung X5 SSDs (Monish Kumar R) - fix the read-only state for zoned namespaces with unsupposed features (Pankaj Raghav) - use a private workqueue instead of the system workqueue in nvmet (Sagi Grimberg) - allow duplicate NSIDs for private namespaces (Sungup Moon) - expose use_threaded_interrupts read-only in sysfs (Xin Hao)" - nbd minor allocation fix (Zhang) - drbd fixes and maintainer addition (Lars, Jakob, Christoph) - n64cart build fix (Jackie) - loop compat ioctl fix (Carlos) - misc fixes (Colin, Dongli)" * tag 'for-5.18/drivers-2022-04-01' of git://git.kernel.dk/linux-block: drbd: remove check of list iterator against head past the loop body drbd: remove usage of list iterator variable after loop nbd: fix possible overflow on 'first_minor' in nbd_dev_add() MAINTAINERS: add drbd co-maintainer drbd: fix potential silent data corruption loop: fix ioctl calls using compat_loop_info nvme-multipath: fix hang when disk goes live over reconnect nvme: fix RCU hole that allowed for endless looping in multipath round robin nvme: allow duplicate NSIDs for private namespaces nvmet: remove redundant assignment after left shift nvmet: use a private workqueue instead of the system workqueue nvme-pci: add quirks for Samsung X5 SSDs nvme-pci: expose use_threaded_interrupts read-only in sysfs nvme: fix the read-only state for zoned namespaces with unsupposed features n64cart: convert bi_disk to bi_bdev->bd_disk fix build xen/blkfront: fix comment for need_copy xen-blkback: remove redundant assignment to variable i
2022-04-01Merge tag 'for-5.18/block-2022-04-01' of git://git.kernel.dk/linux-blockLinus Torvalds2-2/+5
Pull block fixes from Jens Axboe: "Either fixes or a few additions that got missed in the initial merge window pull. In detail: - List iterator fix to avoid leaking value post loop (Jakob) - One-off fix in minor count (Christophe) - Fix for a regression in how io priority setting works for an exiting task (Jiri) - Fix a regression in this merge window with blkg_free() being called in an inappropriate context (Ming) - Misc fixes (Ming, Tom)" * tag 'for-5.18/block-2022-04-01' of git://git.kernel.dk/linux-block: blk-wbt: remove wbt_track stub block: use dedicated list iterator variable block: Fix the maximum minor value is blk_alloc_ext_minor() block: restore the old set_task_ioprio() behaviour wrt PF_EXITING block: avoid calling blkg_free() in atomic context lib/sbitmap: allocate sb->map via kvzalloc_node
2022-04-01Merge tag 'for-5.18/io_uring-2022-04-01' of git://git.kernel.dk/linux-blockLinus Torvalds1-2/+0
Pull io_uring fixes from Jens Axboe: "A little bit all over the map, some regression fixes for this merge window, and some general fixes that are stable bound. In detail: - Fix an SQPOLL memory ordering issue (Almog) - Accept fixes (Dylan) - Poll fixes (me) - Fixes for provided buffers and recycling (me) - Tweak to IORING_OP_MSG_RING command added in this merge window (me) - Memory leak fix (Pavel) - Misc fixes and tweaks (Pavel, me)" * tag 'for-5.18/io_uring-2022-04-01' of git://git.kernel.dk/linux-block: io_uring: defer msg-ring file validity check until command issue io_uring: fail links if msg-ring doesn't succeeed io_uring: fix memory leak of uid in files registration io_uring: fix put_kbuf without proper locking io_uring: fix invalid flags for io_put_kbuf() io_uring: improve req fields comments io_uring: enable EPOLLEXCLUSIVE for accept poll io_uring: improve task work cache utilization io_uring: fix async accept on O_NONBLOCK sockets io_uring: remove IORING_CQE_F_MSG io_uring: add flag for disabling provided buffer recycling io_uring: ensure recv and recvmsg handle MSG_WAITALL correctly io_uring: don't recycle provided buffer if punted to async worker io_uring: fix assuming triggered poll waitqueue is the single poll io_uring: bump poll refs to full 31-bits io_uring: remove poll entry from list when canceling all io_uring: fix memory ordering when SQPOLL thread goes to sleep io_uring: ensure that fsnotify is always called io_uring: recycle provided before arming poll