aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2022-04-05dpaa2-ptp: Fix refcount leak in dpaa2_ptp_probeMiaoqian Lin1-1/+3
This node pointer is returned by of_find_compatible_node() with refcount incremented. Calling of_node_put() to aovid the refcount leak. Fixes: d346c9e86d86 ("dpaa2-ptp: reuse ptp_qoriq driver") Signed-off-by: Miaoqian Lin <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2022-04-05netfilter: nf_tables: memcg accounting for dynamically allocated objectsVasily Averin6-6/+6
nft_*.c files whose NFT_EXPR_STATEFUL flag is set on need to use __GFP_ACCOUNT flag for objects that are dynamically allocated from the packet path. Such objects are allocated inside nft_expr_ops->init() callbacks executed in task context while processing netlink messages. In addition, this patch adds accounting to nft_set_elem_expr_clone() used for the same purposes. Signed-off-by: Vasily Averin <[email protected]> Acked-by: Roman Gushchin <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2022-04-05drm/nouveau: support more than one write fence in fenv50_wndw_prepare_fbChristian König1-9/+5
Use dma_resv_get_singleton() here to eventually get more than one write fence as single fence. Signed-off-by: Christian König <[email protected]> Reviewed-by: Daniel Vetter <[email protected]> Cc: Thomas Zimmermann <[email protected]> Cc: Laurent Pinchart <[email protected]> Cc: Maxime Ripard <[email protected]> Cc: Lyude Paul <[email protected]> Cc: [email protected] Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2022-04-05Merge drm-misc/drm-misc-next-fixes into drm-misc-fixesMaxime Ripard10-117/+540
There were a few patches left in drm-misc-next-fixes, let's bring them into drm-misc-fixes. Signed-off-by: Maxime Ripard <[email protected]>
2022-04-05Merge drm/drm-fixes into drm-misc-fixesMaxime Ripard12850-290398/+1031004
Let's start the 5.18 fixes cycle. Signed-off-by: Maxime Ripard <[email protected]>
2022-04-05Merge drm/drm-next into drm-misc-nextMaxime Ripard13184-293003/+1031865
Let's start the 5.19 development cycle. Signed-off-by: Maxime Ripard <[email protected]>
2022-04-05objtool: Fix SLS validation for kcov tail-call replacementPeter Zijlstra1-0/+11
Since not all compilers have a function attribute to disable KCOV instrumentation, objtool can rewrite KCOV instrumentation in noinstr functions as per commit: f56dae88a81f ("objtool: Handle __sanitize_cov*() tail calls") However, this has subtle interaction with the SLS validation from commit: 1cc1e4c8aab4 ("objtool: Add straight-line-speculation validation") In that when a tail-call instrucion is replaced with a RET an additional INT3 instruction is also written, but is not represented in the decoded instruction stream. This then leads to false positive missing INT3 objtool warnings in noinstr code. Instead of adding additional struct instruction objects, mark the RET instruction with retpoline_safe to suppress the warning (since we know there really is an INT3). Fixes: 1cc1e4c8aab4 ("objtool: Add straight-line-speculation validation") Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2022-04-05objtool: Fix IBT tail-call detectionPeter Zijlstra1-5/+14
Objtool reports: arch/x86/crypto/poly1305-x86_64.o: warning: objtool: poly1305_blocks_avx() falls through to next function poly1305_blocks_x86_64() arch/x86/crypto/poly1305-x86_64.o: warning: objtool: poly1305_emit_avx() falls through to next function poly1305_emit_x86_64() arch/x86/crypto/poly1305-x86_64.o: warning: objtool: poly1305_blocks_avx2() falls through to next function poly1305_blocks_x86_64() Which reads like: 0000000000000040 <poly1305_blocks_x86_64>: 40: f3 0f 1e fa endbr64 ... 0000000000000400 <poly1305_blocks_avx>: 400: f3 0f 1e fa endbr64 404: 44 8b 47 14 mov 0x14(%rdi),%r8d 408: 48 81 fa 80 00 00 00 cmp $0x80,%rdx 40f: 73 09 jae 41a <poly1305_blocks_avx+0x1a> 411: 45 85 c0 test %r8d,%r8d 414: 0f 84 2a fc ff ff je 44 <poly1305_blocks_x86_64+0x4> ... These are simple conditional tail-calls and *should* be recognised as such by objtool, however due to a mistake in commit 08f87a93c8ec ("objtool: Validate IBT assumptions") this is failing. Specifically, the jump_dest is +4, this means the instruction pointed at will not be ENDBR and as such it will fail the second clause of is_first_func_insn() that was supposed to capture this exact case. Instead, have is_first_func_insn() look at the previous instruction. Fixes: 08f87a93c8ec ("objtool: Validate IBT assumptions") Reported-by: Stephen Rothwell <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2022-04-05x86/bug: Prevent shadowing in __WARN_FLAGSVincent Mailhol1-2/+2
The macro __WARN_FLAGS() uses a local variable named "f". This being a common name, there is a risk of shadowing other variables. For example, GCC would yield: | In file included from ./include/linux/bug.h:5, | from ./include/linux/cpumask.h:14, | from ./arch/x86/include/asm/cpumask.h:5, | from ./arch/x86/include/asm/msr.h:11, | from ./arch/x86/include/asm/processor.h:22, | from ./arch/x86/include/asm/timex.h:5, | from ./include/linux/timex.h:65, | from ./include/linux/time32.h:13, | from ./include/linux/time.h:60, | from ./include/linux/stat.h:19, | from ./include/linux/module.h:13, | from virt/lib/irqbypass.mod.c:1: | ./include/linux/rcupdate.h: In function 'rcu_head_after_call_rcu': | ./arch/x86/include/asm/bug.h:80:21: warning: declaration of 'f' shadows a parameter [-Wshadow] | 80 | __auto_type f = BUGFLAG_WARNING|(flags); \ | | ^ | ./include/asm-generic/bug.h:106:17: note: in expansion of macro '__WARN_FLAGS' | 106 | __WARN_FLAGS(BUGFLAG_ONCE | \ | | ^~~~~~~~~~~~ | ./include/linux/rcupdate.h:1007:9: note: in expansion of macro 'WARN_ON_ONCE' | 1007 | WARN_ON_ONCE(func != (rcu_callback_t)~0L); | | ^~~~~~~~~~~~ | In file included from ./include/linux/rbtree.h:24, | from ./include/linux/mm_types.h:11, | from ./include/linux/buildid.h:5, | from ./include/linux/module.h:14, | from virt/lib/irqbypass.mod.c:1: | ./include/linux/rcupdate.h:1001:62: note: shadowed declaration is here | 1001 | rcu_head_after_call_rcu(struct rcu_head *rhp, rcu_callback_t f) | | ~~~~~~~~~~~~~~~^ For reference, sparse also warns about it, c.f. [1]. This patch renames the variable from f to __flags (with two underscore prefixes as suggested in the Linux kernel coding style [2]) in order to prevent collisions. [1] https://lore.kernel.org/all/CAFGhKbyifH1a+nAMCvWM88TK6fpNPdzFtUXPmRGnnQeePV+1sw@mail.gmail.com/ [2] Linux kernel coding style, section 12) Macros, Enums and RTL, paragraph 5) namespace collisions when defining local variables in macros resembling functions https://www.kernel.org/doc/html/latest/process/coding-style.html#macros-enums-and-rtl Fixes: bfb1a7c91fb7 ("x86/bug: Merge annotate_reachable() into_BUG_FLAGS() asm") Signed-off-by: Vincent Mailhol <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Nick Desaulniers <[email protected]> Acked-by: Josh Poimboeuf <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2022-04-05drm/i915/dp: Fix DFP rgb->ycbcr conversion matrixVille Syrjälä1-35/+3
Our YCbCr output is always supposed to be limited range BT.709. That's what we send with native HDMI. The conn_state->colorspace stuff is entirely independent of that and is not supposed to alter the generated output in any way. If we want a way to do that then we need a new proprty for it. Make it so that the RGB->YCbCr conversion when performed by the DPF will match the BT.709 we would transmit with native HDMI. Cc: Ankit Nautiyal <[email protected]> Cc: Uma Shankar <[email protected]> Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Uma Shankar <[email protected]>
2022-04-05drm/i915/dp: Duplicate native HDMI TMDS clock limit handling for DP HDMI DFPsVille Syrjälä1-13/+38
With native HDMI we allow the user to override the mode with something that may not respect the downstream (sink,dual-mode adapter) TMDS clock limits. Let's reuse the same logic for DP HDMI DFPs so that behaviour is more or less uniform. Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Uma Shankar <[email protected]>
2022-04-05drm/i915/dp: Add support for "4:2:0 also" modes for DPVille Syrjälä1-8/+59
Currently we only support "4:2:0 also" modes on native HDMI. Extend that support for DP as well. With all the HDMI DFP TMDS clock handling sorted out this is now going to work for both native DP and DP->HDMI converters. As with native HDMI we first check if RGB output is possible, and if not we try YCbCr 4:2:0 instead. Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Uma Shankar <[email protected]>
2022-04-05drm/i915/dp: Rework HDMI DFP TMDS clock handlingVille Syrjälä1-10/+25
Rework the HDMI DFP TMDS clock checks to also check at 8bpc. Previously we only checked the deep color cases. But I suppose a sink could potentially declare "4:2:0 also" modes that only actually fit within its own limits when using 4:2:0. Even if that is too nuts to be real there is no real harm in running through the full checks for everything. Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Uma Shankar <[email protected]>
2022-04-05drm/i915/dp: Make intel_dp_output_format() usable for "4:2:0 also" modesVille Syrjälä1-6/+7
Hoist the drm_mode_is_420_only() from intel_dp_output_format() into the caller. This will allow intel_dp_output_format() to be reused for "4:2:0 also" modes. Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Uma Shankar <[email protected]>
2022-04-05drm/i915/dp: Pass around intel_connector rather than drm_connectorVille Syrjälä1-20/+18
Prefer to use intel_connector over drm_connector. Also clean up the related variable names a bit. Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Uma Shankar <[email protected]>
2022-04-05drm/i915/dp: Reorder intel_dp_compute_config() a bitVille Syrjälä1-13/+10
Consolidate the double pfit call, and reorder things so that intel_dp_output_format() and intel_dp_compute_link_config() are back-to-back. They are intimately related, and will need to be called twice to properly handle the "4:2:0 also" modes. Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Uma Shankar <[email protected]>
2022-04-05drm/i915/dp: s/intel_dp_hdmi_ycbcr420/intel_dp_is_ycbcr420/Ville Syrjälä1-3/+3
intel_dp_hdmi_ycbcr420() does account for native DP 4:2:0 output as well, so lets rename it a bit. Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Uma Shankar <[email protected]>
2022-04-05drm/i915/dp: Extract intel_dp_has_audio()Ville Syrjälä1-10/+20
Declutter intel_dp_compute_config() a bit by moving the has_audio computation into a helper. HDMI already does the same thing. Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Uma Shankar <[email protected]>
2022-04-05drm/i915/dp: Respect the sink's max TMDS clock when dealing with DP->HDMI DFPsVille Syrjälä1-5/+19
Currently we only look at the DFPs max TMDS clock limit when considering whether the mode is valid, or whether we can do deep color. The sink's max TMDS clock limit may be lower than the DFPs, so we need to account for it as well. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/4095 Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2844 Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Uma Shankar <[email protected]>
2022-04-05drm/i915/dp: Extract intel_dp_tmds_clock_valid()Ville Syrjälä1-33/+26
We're currently duplicating the DFP min/max TMDS clock checks in .mode_valid() and .compute_config(). Extract a helper suitable for both use cases. Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Uma Shankar <[email protected]>
2022-04-05perf/core: Always set cpuctx cgrp when enable cgroup eventChengming Zhou1-16/+2
When enable a cgroup event, cpuctx->cgrp setting is conditional on the current task cgrp matching the event's cgroup, so have to do it for every new event. It brings complexity but no advantage. To keep it simple, this patch would always set cpuctx->cgrp when enable the first cgroup event, and reset to NULL when disable the last cgroup event. Signed-off-by: Chengming Zhou <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-04-05perf/core: Fix perf_cgroup_switch()Chengming Zhou1-107/+25
There is a race problem that can trigger WARN_ON_ONCE(cpuctx->cgrp) in perf_cgroup_switch(). CPU1 CPU2 perf_cgroup_sched_out(prev, next) cgrp1 = perf_cgroup_from_task(prev) cgrp2 = perf_cgroup_from_task(next) if (cgrp1 != cgrp2) perf_cgroup_switch(prev, PERF_CGROUP_SWOUT) cgroup_migrate_execute() task->cgroups = ? perf_cgroup_attach() task_function_call(task, __perf_cgroup_move) perf_cgroup_sched_in(prev, next) cgrp1 = perf_cgroup_from_task(prev) cgrp2 = perf_cgroup_from_task(next) if (cgrp1 != cgrp2) perf_cgroup_switch(next, PERF_CGROUP_SWIN) __perf_cgroup_move() perf_cgroup_switch(task, PERF_CGROUP_SWOUT | PERF_CGROUP_SWIN) The commit a8d757ef076f ("perf events: Fix slow and broken cgroup context switch code") want to skip perf_cgroup_switch() when the perf_cgroup of "prev" and "next" are the same. But task->cgroups can change in concurrent with context_switch() in cgroup_migrate_execute(). If cgrp1 == cgrp2 in sched_out(), cpuctx won't do sched_out. Then task->cgroups changed cause cgrp1 != cgrp2 in sched_in(), cpuctx will do sched_in. So trigger WARN_ON_ONCE(cpuctx->cgrp). Even though __perf_cgroup_move() will be synchronized as the context switch disables the interrupt, context_switch() still can see the task->cgroups is changing in the middle, since task->cgroups changed before sending IPI. So we have to combine perf_cgroup_sched_in() into perf_cgroup_sched_out(), unified into perf_cgroup_switch(), to fix the incosistency between perf_cgroup_sched_out() and perf_cgroup_sched_in(). But we can't just compare prev->cgroups with next->cgroups to decide whether to skip cpuctx sched_out/in since the prev->cgroups is changing too. For example: CPU1 CPU2 cgroup_migrate_execute() prev->cgroups = ? perf_cgroup_attach() task_function_call(task, __perf_cgroup_move) perf_cgroup_switch(task) cgrp1 = perf_cgroup_from_task(prev) cgrp2 = perf_cgroup_from_task(next) if (cgrp1 != cgrp2) cpuctx sched_out/in ... task_function_call() will return -ESRCH In the above example, prev->cgroups changing cause (cgrp1 == cgrp2) to be true, so skip cpuctx sched_out/in. And later task_function_call() would return -ESRCH since the prev task isn't running on cpu anymore. So we would leave perf_events of the old prev->cgroups still sched on the CPU, which is wrong. The solution is that we should use cpuctx->cgrp to compare with the next task's perf_cgroup. Since cpuctx->cgrp can only be changed on local CPU, and we have irq disabled, we can read cpuctx->cgrp to compare without holding ctx lock. Fixes: a8d757ef076f ("perf events: Fix slow and broken cgroup context switch code") Signed-off-by: Chengming Zhou <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-04-05perf/core: Use perf_cgroup_info->active to check if cgroup is activeChengming Zhou1-5/+2
Since we use perf_cgroup_set_timestamp() to start cgroup time and set active to 1, then use update_cgrp_time_from_cpuctx() to stop cgroup time and set active to 0. We can use info->active directly to check if cgroup is active. Signed-off-by: Chengming Zhou <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-04-05perf/core: Don't pass task around when ctx sched inChengming Zhou1-32/+26
The current code pass task around for ctx_sched_in(), only to get perf_cgroup of the task, then update the timestamp of it and its ancestors and set them to active. But we can use cpuctx->cgrp to get active perf_cgroup and its ancestors since cpuctx->cgrp has been set before ctx_sched_in(). This patch remove the task argument in ctx_sched_in() and cleanup related code. Signed-off-by: Chengming Zhou <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-04-05perf/x86/intel: Update the FRONTEND MSR mask on Sapphire RapidsKan Liang1-1/+1
On Sapphire Rapids, the FRONTEND_RETIRED.MS_FLOWS event requires the FRONTEND MSR value 0x8. However, the current FRONTEND MSR mask doesn't support it. Update intel_spr_extra_regs[] to support it. Fixes: 61b985e3e775 ("perf/x86/intel: Add perf core PMU support for Sapphire Rapids") Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected]
2022-04-05perf/x86/intel: Don't extend the pseudo-encoding to GP countersKan Liang2-1/+10
The INST_RETIRED.PREC_DIST event (0x0100) doesn't count on SPR. perf stat -e cpu/event=0xc0,umask=0x0/,cpu/event=0x0,umask=0x1/ -C0 Performance counter stats for 'CPU(s) 0': 607,246 cpu/event=0xc0,umask=0x0/ 0 cpu/event=0x0,umask=0x1/ The encoding for INST_RETIRED.PREC_DIST is pseudo-encoding, which doesn't work on the generic counters. However, current perf extends its mask to the generic counters. The pseudo event-code for a fixed counter must be 0x00. Check and avoid extending the mask for the fixed counter event which using the pseudo-encoding, e.g., ref-cycles and PREC_DIST event. With the patch, perf stat -e cpu/event=0xc0,umask=0x0/,cpu/event=0x0,umask=0x1/ -C0 Performance counter stats for 'CPU(s) 0': 583,184 cpu/event=0xc0,umask=0x0/ 583,048 cpu/event=0x0,umask=0x1/ Fixes: 2de71ee153ef ("perf/x86/intel: Fix ICL/SPR INST_RETIRED.PREC_DIST encodings") Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2022-04-05perf/core: Inherit event_capsNamhyung Kim1-0/+3
It was reported that some perf event setup can make fork failed on ARM64. It was the case of a group of mixed hw and sw events and it failed in perf_event_init_task() due to armpmu_event_init(). The ARM PMU code checks if all the events in a group belong to the same PMU except for software events. But it didn't set the event_caps of inherited events and no longer identify them as software events. Therefore the test failed in a child process. A simple reproducer is: $ perf stat -e '{cycles,cs,instructions}' perf bench sched messaging # Running 'sched/messaging' benchmark: perf: fork(): Invalid argument The perf stat was fine but the perf bench failed in fork(). Let's inherit the event caps from the parent. Signed-off-by: Namhyung Kim <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2022-04-05perf/x86/uncore: Add Raptor Lake uncore supportKan Liang2-0/+21
The uncore PMU of the Raptor Lake is the same as Alder Lake. Add new PCIIDs of IMC for Raptor Lake. Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-04-05perf/x86/msr: Add Raptor Lake CPU supportKan Liang1-0/+1
Raptor Lake is Intel's successor to Alder lake. PPERF and SMI_COUNT MSRs are also supported. Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-04-05perf/x86/cstate: Add Raptor Lake supportKan Liang1-10/+12
Raptor Lake is Intel's successor to Alder lake. From the perspective of Intel cstate residency counters, there is nothing changed compared with Alder lake. Share adl_cstates with Alder lake. Update the comments for Raptor Lake. Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-04-05perf/x86: Add Intel Raptor Lake supportKan Liang1-0/+1
From PMU's perspective, Raptor Lake is the same as the Alder Lake. The only difference is the event list, which will be supported in the perf tool later. Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-04-05Revert "mm/page_alloc: mark pagesets as __maybe_unused"Sebastian Andrzej Siewior1-1/+1
The local_lock() is now using a proper static inline function which is enough for llvm to accept that the variable is used. Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-04-05Revert "locking/local_lock: Make the empty local_lock_*() function a macro."Sebastian Andrzej Siewior1-3/+3
With volatile removed from arch_raw_cpu_ptr() the compiler no longer creates the per-CPU reference. The usage of the macro can be reverted now. Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-04-05x86/percpu: Remove volatile from arch_raw_cpu_ptr().Sebastian Andrzej Siewior1-3/+3
The volatile attribute in the inline assembly of arch_raw_cpu_ptr() forces the compiler to always generate the code, even if the compiler can decide upfront that its result is not needed. For instance invoking __intel_pmu_disable_all(false) (like intel_pmu_snapshot_arch_branch_stack() does) leads to loading the address of &cpu_hw_events into the register while compiler knows that it has no need for it. This ends up with code like: | movq $cpu_hw_events, %rax #, tcp_ptr__ | add %gs:this_cpu_off(%rip), %rax # this_cpu_off, tcp_ptr__ | xorl %eax, %eax # tmp93 It also creates additional code within local_lock() with !RT && !LOCKDEP which is not desired. By removing the volatile attribute the compiler can place the function freely and avoid it if it is not needed in the end. By using the function twice the compiler properly caches only the variable offset and always loads the CPU-offset. this_cpu_ptr() also remains properly placed within a preempt_disable() sections because - arch_raw_cpu_ptr() assembly has a memory input ("m" (this_cpu_off)) - prempt_{dis,en}able() fundamentally has a 'barrier()' in it Therefore this_cpu_ptr() is already properly serialized and does not rely on the 'volatile' attribute. Remove volatile from arch_raw_cpu_ptr(). [ bigeasy: Added Linus' explanation why this_cpu_ptr() is not moved out of a preempt_disable() section without the 'volatile' attribute. ] Suggested-by: Linus Torvalds <[email protected]> Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-04-05static_call: Remove __DEFINE_STATIC_CALL macroChristophe Leroy1-13/+10
Only DEFINE_STATIC_CALL use __DEFINE_STATIC_CALL macro now when CONFIG_HAVE_STATIC_CALL is selected. Only keep __DEFINE_STATIC_CALL() for the generic fallback, and also use it to implement DEFINE_STATIC_CALL_NULL() in that case. Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Josh Poimboeuf <[email protected]> Link: https://lore.kernel.org/r/329074f92d96e3220ebe15da7bbe2779beee31eb.1647253456.git.christophe.leroy@csgroup.eu
2022-04-05static_call: Properly initialise DEFINE_STATIC_CALL_RET0()Christophe Leroy3-3/+20
When a static call is updated with __static_call_return0() as target, arch_static_call_transform() set it to use an optimised set of instructions which are meant to lay in the same cacheline. But when initialising a static call with DEFINE_STATIC_CALL_RET0(), we get a branch to the real __static_call_return0() function instead of getting the optimised setup: c00d8120 <__SCT__perf_snapshot_branch_stack>: c00d8120: 4b ff ff f4 b c00d8114 <__static_call_return0> c00d8124: 3d 80 c0 0e lis r12,-16370 c00d8128: 81 8c 81 3c lwz r12,-32452(r12) c00d812c: 7d 89 03 a6 mtctr r12 c00d8130: 4e 80 04 20 bctr c00d8134: 38 60 00 00 li r3,0 c00d8138: 4e 80 00 20 blr c00d813c: 00 00 00 00 .long 0x0 Add ARCH_DEFINE_STATIC_CALL_RET0_TRAMP() defined by each architecture to setup the optimised configuration, and rework DEFINE_STATIC_CALL_RET0() to call it: c00d8120 <__SCT__perf_snapshot_branch_stack>: c00d8120: 48 00 00 14 b c00d8134 <__SCT__perf_snapshot_branch_stack+0x14> c00d8124: 3d 80 c0 0e lis r12,-16370 c00d8128: 81 8c 81 3c lwz r12,-32452(r12) c00d812c: 7d 89 03 a6 mtctr r12 c00d8130: 4e 80 04 20 bctr c00d8134: 38 60 00 00 li r3,0 c00d8138: 4e 80 00 20 blr c00d813c: 00 00 00 00 .long 0x0 Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Josh Poimboeuf <[email protected]> Link: https://lore.kernel.org/r/1e0a61a88f52a460f62a58ffc2a5f847d1f7d9d8.1647253456.git.christophe.leroy@csgroup.eu
2022-04-05static_call: Don't make __static_call_return0 staticChristophe Leroy4-546/+546
System.map shows that vmlinux contains several instances of __static_call_return0(): c0004fc0 t __static_call_return0 c0011518 t __static_call_return0 c00d8160 t __static_call_return0 arch_static_call_transform() uses the middle one to check whether we are setting a call to __static_call_return0 or not: c0011520 <arch_static_call_transform>: c0011520: 3d 20 c0 01 lis r9,-16383 <== r9 = 0xc001 << 16 c0011524: 39 29 15 18 addi r9,r9,5400 <== r9 += 0x1518 c0011528: 7c 05 48 00 cmpw r5,r9 <== r9 has value 0xc0011518 here So if static_call_update() is called with one of the other instances of __static_call_return0(), arch_static_call_transform() won't recognise it. In order to work properly, global single instance of __static_call_return0() is required. Fixes: 3f2a8fc4b15d ("static_call/x86: Add __static_call_return0()") Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Josh Poimboeuf <[email protected]> Link: https://lkml.kernel.org/r/30821468a0e7d28251954b578e5051dc09300d04.1647258493.git.christophe.leroy@csgroup.eu
2022-04-05x86,static_call: Fix __static_call_return0 for i386Peter Zijlstra1-3/+2
Paolo reported that the instruction sequence that is used to replace: call __static_call_return0 namely: 66 66 48 31 c0 data16 data16 xor %rax,%rax decodes to something else on i386, namely: 66 66 48 data16 dec %ax 31 c0 xor %eax,%eax Which is a nonsensical sequence that happens to have the same outcome. *However* an important distinction is that it consists of 2 instructions which is a problem when the thing needs to be overwriten with a regular call instruction again. As such, replace the instruction with something that decodes the same on both i386 and x86_64. Fixes: 3f2a8fc4b15d ("static_call/x86: Add __static_call_return0()") Reported-by: Paolo Bonzini <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2022-04-05entry: Fix compile error in dynamic_irqentry_exit_cond_resched()Sven Schnelle1-1/+1
kernel/entry/common.c: In function ‘dynamic_irqentry_exit_cond_resched’: kernel/entry/common.c:409:14: error: implicit declaration of function ‘static_key_unlikely’; did you mean ‘static_key_enable’? [-Werror=implicit-function-declaration] 409 | if (!static_key_unlikely(&sk_dynamic_irqentry_exit_cond_resched)) | ^~~~~~~~~~~~~~~~~~~ | static_key_enable static_key_unlikely() should be static_branch_unlikely(). Fixes: 99cf983cc8bca ("sched/preempt: Add PREEMPT_DYNAMIC using static keys") Signed-off-by: Sven Schnelle <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Mark Rutland <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-04-05sched: Teach the forced-newidle balancer about CPU affinity limitation.Sebastian Andrzej Siewior1-1/+1
try_steal_cookie() looks at task_struct::cpus_mask to decide if the task could be moved to `this' CPU. It ignores that the task might be in a migration disabled section while not on the CPU. In this case the task must not be moved otherwise per-CPU assumption are broken. Use is_cpu_allowed(), as suggested by Peter Zijlstra, to decide if the a task can be moved. Fixes: d2dfa17bc7de6 ("sched: Trivial forced-newidle balancer") Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2022-04-05sched/core: Fix forceidle balancingPeter Zijlstra3-11/+10
Steve reported that ChromeOS encounters the forceidle balancer being ran from rt_mutex_setprio()'s balance_callback() invocation and explodes. Now, the forceidle balancer gets queued every time the idle task gets selected, set_next_task(), which is strictly too often. rt_mutex_setprio() also uses set_next_task() in the 'change' pattern: queued = task_on_rq_queued(p); /* p->on_rq == TASK_ON_RQ_QUEUED */ running = task_current(rq, p); /* rq->curr == p */ if (queued) dequeue_task(...); if (running) put_prev_task(...); /* change task properties */ if (queued) enqueue_task(...); if (running) set_next_task(...); However, rt_mutex_setprio() will explicitly not run this pattern on the idle task (since priority boosting the idle task is quite insane). Most other 'change' pattern users are pidhash based and would also not apply to idle. Also, the change pattern doesn't contain a __balance_callback() invocation and hence we could have an out-of-band balance-callback, which *should* trigger the WARN in rq_pin_lock() (which guards against this exact anti-pattern). So while none of that explains how this happens, it does indicate that having it in set_next_task() might not be the most robust option. Instead, explicitly queue the forceidle balancer from pick_next_task() when it does indeed result in forceidle selection. Having it here, ensures it can only be triggered under the __schedule() rq->lock instance, and hence must be ran from that context. This also happens to clean up the code a little, so win-win. Fixes: d2dfa17bc7de ("sched: Trivial forced-newidle balancer") Reported-by: Steven Rostedt <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Tested-by: T.J. Alumbaugh <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2022-04-05dma-buf: finally make dma_resv_excl_fence private v2Christian König2-17/+6
Drivers should never touch this directly. v2: fix rebase clash Signed-off-by: Christian König <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Daniel Vetter <[email protected]>
2022-04-05dt-bindings: display: bridge: Drop requirement on input port for DSI devicesMaxime Ripard2-2/+0
MIPI-DSI devices, if they are controlled through the bus itself, have to be described as a child node of the controller they are attached to. Thus, there's no requirement on the controller having an OF-Graph output port to model the data stream: it's assumed that it would go from the parent to the child. However, some bridges controlled through the DSI bus still require an input OF-Graph port, thus requiring a controller with an OF-Graph output port. This prevents those bridges from being used with the controllers that do not have one without any particular reason to. Let's drop that requirement. Signed-off-by: Maxime Ripard <[email protected]> Reviewed-by: Rob Herring <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2022-04-05sctp: count singleton chunks in assoc user statsJamie Bainbridge1-1/+5
Singleton chunks (INIT, HEARTBEAT PMTU probes, and SHUTDOWN- COMPLETE) are not counted in SCTP_GET_ASOC_STATS "sas_octrlchunks" counter available to the assoc owner. These are all control chunks so they should be counted as such. Add counting of singleton chunks so they are properly accounted for. Fixes: 196d67593439 ("sctp: Add support to per-association statistics via a new SCTP_GET_ASSOC_STATS call") Signed-off-by: Jamie Bainbridge <[email protected]> Acked-by: Marcelo Ricardo Leitner <[email protected]> Link: https://lore.kernel.org/r/c9ba8785789880cf07923b8a5051e174442ea9ee.1649029663.git.jamie.bainbridge@gmail.com Signed-off-by: Paolo Abeni <[email protected]>
2022-04-05drm/i915: Expose client engine utilisation via fdinfoTvrtko Ursulin5-0/+125
Similar to AMD commit 874442541133 ("drm/amdgpu: Add show_fdinfo() interface"), using the infrastructure added in previous patches, we add basic client info and GPU engine utilisation for i915. Example of the output: pos: 0 flags: 0100002 mnt_id: 21 drm-driver: i915 drm-pdev: 0000:00:02.0 drm-client-id: 7 drm-engine-render: 9288864723 ns drm-engine-copy: 2035071108 ns drm-engine-video: 0 ns drm-engine-video-enhance: 0 ns v2: * Update for removal of name and pid. v3: * Use drm_driver.name. v4: * Added drm-engine-capacity- tag. * Fix typo. (Umesh) v5: * Don't output engine data before Gen8. Signed-off-by: Tvrtko Ursulin <[email protected]> Cc: David M Nieto <[email protected]> Cc: Christian König <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Chris Healy <[email protected]> Acked-by: Christian König <[email protected]> Reviewed-by: Umesh Nerlige Ramappa <[email protected]> Acked-by: Rob Clark <[email protected]> Acked-by: Daniel Vetter <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2022-04-05drm/i915: Count engine instances per uabi classTvrtko Ursulin2-5/+7
This will be useful to have at hand in a following patch. Signed-off-by: Tvrtko Ursulin <[email protected]> Reviewed-by: Umesh Nerlige Ramappa <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2022-04-05drm: Document fdinfo format specificationTvrtko Ursulin2-0/+107
Proposal to standardise the fdinfo text format as optionally output by DRM drivers. Idea is that a simple but, well defined, spec will enable generic userspace tools to be written while at the same time avoiding a more heavy handed approach of adding a mid-layer to DRM. i915 implements a subset of the spec, everything apart from the memory stats currently, and a matching intel_gpu_top tool exists. Open is to see if AMD can migrate to using the proposed GPU utilisation key-value pairs, or if they are not workable to see whether to go vendor specific, or if a standardised alternative can be found which is workable for both drivers. Same for the memory utilisation key-value pairs proposal. v2: * Update for removal of name and pid. v3: * 'Drm-driver' tag will be obtained from struct drm_driver.name. (Daniel) v4: * Added drm-engine-capacity- tag. Signed-off-by: Tvrtko Ursulin <[email protected]> Cc: David M Nieto <[email protected]> Cc: Christian König <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Daniel Stone <[email protected]> Cc: Chris Healy <[email protected]> Acked-by: Christian König <[email protected]> Reviewed-by: Umesh Nerlige Ramappa <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2022-04-05drm/i915: Track context current active timeTvrtko Ursulin10-50/+118
Track context active (on hardware) status together with the start timestamp. This will be used to provide better granularity of context runtime reporting in conjunction with already tracked pphwsp accumulated runtime. The latter is only updated on context save so does not give us visibility to any currently executing work. As part of the patch the existing runtime tracking data is moved under the new ce->stats member and updated under the seqlock. This provides the ability to atomically read out accumulated plus active runtime. v2: * Rename and make __intel_context_get_active_time unlocked. v3: * Use GRAPHICS_VER. Signed-off-by: Tvrtko Ursulin <[email protected]> Reviewed-by: Aravind Iddamsetty <[email protected]> # v1 Reviewed-by: Chris Wilson <[email protected]> Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Umesh Nerlige Ramappa <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2022-04-05drm/i915: Track all user contexts per clientTvrtko Ursulin4-0/+23
We soon want to start answering questions like how much GPU time is the context belonging to a client which exited still using. To enable this we start tracking all context belonging to a client on a separate list. Signed-off-by: Tvrtko Ursulin <[email protected]> Reviewed-by: Aravind Iddamsetty <[email protected]> Reviewed-by: Chris Wilson <[email protected]> Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Umesh Nerlige Ramappa <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2022-04-05drm/i915: Track runtime spent in closed and unreachable GEM contextsTvrtko Ursulin2-2/+32
As contexts are abandoned we want to remember how much GPU time they used (per class) so later we can used it for smarter purposes. As GEM contexts are closed we want to have the DRM client remember how much GPU time they used (per class) so later we can used it for smarter purposes. v2: * Size past runtimes array by uabi class, not internal. Signed-off-by: Tvrtko Ursulin <[email protected]> Reviewed-by: Aravind Iddamsetty <[email protected]> # v1 Reviewed-by: Chris Wilson <[email protected]> # v1 Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Umesh Nerlige Ramappa <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]