aboutsummaryrefslogtreecommitdiff
path: root/drivers/gpu/drm
AgeCommit message (Collapse)AuthorFilesLines
2023-10-26drm/amd/pm: Fix the return value in default caseMa Jun1-3/+1
Fix the return value in default case and drop redundant 'break'. Signed-off-by: Ma Jun <[email protected]> Reviewed-by: Kenneth Feng <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-10-26drm/amdgpu: Add API to get full IP versionLijo Lazar1-0/+7
Fetch the full version of IP including variant and subrevision. Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Reviewed-by: Le Ma <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-10-26drm/amdgpu: add tmz support for GC IP v11.5.0Jiadong Zhu1-0/+1
Add tmz support for GC 11.5.0. Signed-off-by: Jiadong Zhu <[email protected]> Reviewed-by: Yifan Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-10-26drm/amd/pm: drop unneeded dpm features disablement for SMU 14.0.0Jiadong Zhu1-1/+2
PMFW will handle the features disablement properly for gpu reset case, driver involvement may cause some unexpected issues. Signed-off-by: Jiadong Zhu <[email protected]> Reviewed-by: Yifan Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-10-26drm/amdgpu: modify if condition in nbio_v7_7.cLi Ma1-2/+2
remove unnecessary "enable" in if condition. Signed-off-by: Li Ma <[email protected]> Reviewed-by: Tim Huang <[email protected]> Reviewed-by: Yifan Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-10-26drm/amdkfd: Fix shift out-of-bounds issueJesse Zhang1-1/+1
[ 567.613292] shift exponent 255 is too large for 64-bit type 'long unsigned int' [ 567.614498] CPU: 5 PID: 238 Comm: kworker/5:1 Tainted: G OE 6.2.0-34-generic #34~22.04.1-Ubuntu [ 567.614502] Hardware name: AMD Splinter/Splinter-RPL, BIOS WS43927N_871 09/25/2023 [ 567.614504] Workqueue: events send_exception_work_handler [amdgpu] [ 567.614748] Call Trace: [ 567.614750] <TASK> [ 567.614753] dump_stack_lvl+0x48/0x70 [ 567.614761] dump_stack+0x10/0x20 [ 567.614763] __ubsan_handle_shift_out_of_bounds+0x156/0x310 [ 567.614769] ? srso_alias_return_thunk+0x5/0x7f [ 567.614773] ? update_sd_lb_stats.constprop.0+0xf2/0x3c0 [ 567.614780] svm_range_split_by_granularity.cold+0x2b/0x34 [amdgpu] [ 567.615047] ? srso_alias_return_thunk+0x5/0x7f [ 567.615052] svm_migrate_to_ram+0x185/0x4d0 [amdgpu] [ 567.615286] do_swap_page+0x7b6/0xa30 [ 567.615291] ? srso_alias_return_thunk+0x5/0x7f [ 567.615294] ? __free_pages+0x119/0x130 [ 567.615299] handle_pte_fault+0x227/0x280 [ 567.615303] __handle_mm_fault+0x3c0/0x720 [ 567.615311] handle_mm_fault+0x119/0x330 [ 567.615314] ? lock_mm_and_find_vma+0x44/0x250 [ 567.615318] do_user_addr_fault+0x1a9/0x640 [ 567.615323] exc_page_fault+0x81/0x1b0 [ 567.615328] asm_exc_page_fault+0x27/0x30 [ 567.615332] RIP: 0010:__get_user_8+0x1c/0x30 Signed-off-by: Jesse Zhang <[email protected]> Suggested-by: Philip Yang <[email protected]> Reviewed-by: Yifan Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-10-26drm/amdgpu: refine ras error kernel log printYang Wang2-39/+82
refine ras error kernel log to avoid user-ridden ambiguity. Signed-off-by: Yang Wang <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-10-26drm/amdgpu: fix find ras error node errorYang Wang1-4/+3
the origin function might return the wrong node. Fixes: 5b1270beb380 ("drm/amdgpu: add ras_err_info to identify RAS error source") Signed-off-by: Yang Wang <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-10-26drm/amd/display: reprogram det size while seamless bootHugo Hu3-0/+33
[Why] During system boot in second screen only mode on a seamless boot system, there is a chance that the pipe's det size might not be reset. [How] Reset the det size while resetting the pipe during seamless boot. Reviewed-by: Dmytro Laktyushkin <[email protected]> Acked-by: Roman Li <[email protected]> Signed-off-by: Hugo Hu <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-10-26drm/amd/pm: record mca debug mode in RASTao Zhou1-0/+1
Call amdgpu_ras_set_mca_debug_mode when we set mca debug mode in smu v13_0_6. Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-10-26drm/amdgpu: move buffer funcs setting up a levelAlex Deucher11-84/+19
Rather than doing this in the IP code for the SDMA paging engine, move it up to the core device level init level. This should fix the scheduler init ordering. v2: drop extra parens v3: drop SDMA helpers v4: Added a Fixes tag because amdgpu dereferences an uninitialized scheduler without this patch, and this patch fixes this. (Luben) Tested-by: Luben Tuikov <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Link: https://lore.kernel.org/r/[email protected] Acked-by: Christian König <[email protected]> Fixes: 56e449603f0ac5 ("drm/sched: Convert the GPU scheduler to variable number of run-queues") Signed-off-by: Luben Tuikov <[email protected]>
2023-10-26drm/syncobj: fix DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLEErik Kurzinger1-1/+2
If DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT is invoked with the DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE flag set but no fence has yet been submitted for the given timeline point the call will fail immediately with EINVAL. This does not match the intended behavior where the call should wait until the fence has been submitted (or the timeout expires). The following small example program illustrates the issue. It should wait for 5 seconds and then print ETIME, but instead it terminates right away after printing EINVAL. #include <stdio.h> #include <fcntl.h> #include <time.h> #include <errno.h> #include <xf86drm.h> int main(void) { int fd = open("/dev/dri/card0", O_RDWR); uint32_t syncobj; drmSyncobjCreate(fd, 0, &syncobj); struct timespec ts; clock_gettime(CLOCK_MONOTONIC, &ts); uint64_t point = 1; if (drmSyncobjTimelineWait(fd, &syncobj, &point, 1, ts.tv_sec * 1000000000 + ts.tv_nsec + 5000000000, // 5s DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE, NULL)) { printf("drmSyncobjTimelineWait failed %d\n", errno); } } Fixes: 01d6c3578379 ("drm/syncobj: add support for timeline point wait v8") Signed-off-by: Erik Kurzinger <[email protected]> Reviewed by: Simon Ser <[email protected]> Signed-off-by: Simon Ser <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-26drm/sched: Convert the GPU scheduler to variable number of run-queuesLuben Tuikov10-22/+92
The GPU scheduler has now a variable number of run-queues, which are set up at drm_sched_init() time. This way, each driver announces how many run-queues it requires (supports) per each GPU scheduler it creates. Note, that run-queues correspond to scheduler "priorities", thus if the number of run-queues is set to 1 at drm_sched_init(), then that scheduler supports a single run-queue, i.e. single "priority". If a driver further sets a single entity per run-queue, then this creates a 1-to-1 correspondence between a scheduler and a scheduled entity. Cc: Lucas Stach <[email protected]> Cc: Russell King <[email protected]> Cc: Qiang Yu <[email protected]> Cc: Rob Clark <[email protected]> Cc: Abhinav Kumar <[email protected]> Cc: Dmitry Baryshkov <[email protected]> Cc: Danilo Krummrich <[email protected]> Cc: Matthew Brost <[email protected]> Cc: Boris Brezillon <[email protected]> Cc: Alex Deucher <[email protected]> Cc: Christian König <[email protected]> Cc: Emma Anholt <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Signed-off-by: Luben Tuikov <[email protected]> Acked-by: Christian König <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-10-26drm/ci: do not automatically retry on errorHelen Koike1-14/+0
Since the kernel doesn't use a bot like Mesa that requires tests to pass in order to merge the patches, leave it to developers and/or maintainers to manually retry. Suggested-by: Rob Clark <[email protected]> Signed-off-by: Helen Koike <[email protected]> Reviewed-by: David Heidelberg <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Maxime Ripard <[email protected]>
2023-10-26drm/ci: export kernel configHelen Koike2-1/+2
Export the resultant kernel config, making it easier to verify if the resultant config was correctly generated. Suggested-by: Rob Clark <[email protected]> Signed-off-by: Helen Koike <[email protected]> Acked-by: Dmitry Baryshkov <[email protected]> Reviewed-by: David Heidelberg <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Maxime Ripard <[email protected]>
2023-10-26drm/ci: increase i915 job timeout to 1h30mHelen Koike1-0/+5
With the new sharding, the default job timeout is not enough for i915 and their jobs are failing before completing. See below the current execution time: 🞋 job i915:tgl 8/8 has new status: success (37m3s) 🞋 job i915:tgl 7/8 has new status: success (19m43s) 🞋 job i915:tgl 6/8 has new status: success (21m47s) 🞋 job i915:tgl 5/8 has new status: success (18m16s) 🞋 job i915:tgl 4/8 has new status: success (21m43s) 🞋 job i915:tgl 3/8 has new status: success (17m59s) 🞋 job i915:tgl 2/8 has new status: success (22m15s) 🞋 job i915:tgl 1/8 has new status: success (18m52s) 🞋 job i915:cml 2/2 has new status: success (1h19m58s) 🞋 job i915:cml 1/2 has new status: success (55m45s) 🞋 job i915:whl 2/2 has new status: success (1h8m56s) 🞋 job i915:whl 1/2 has new status: success (54m3s) 🞋 job i915:kbl 3/3 has new status: success (37m43s) 🞋 job i915:kbl 2/3 has new status: success (36m37s) 🞋 job i915:kbl 1/3 has new status: success (34m52s) 🞋 job i915:amly 2/2 has new status: success (1h7m60s) 🞋 job i915:amly 1/2 has new status: success (59m18s) 🞋 job i915:glk 2/2 has new status: success (58m26s) 🞋 job i915:glk 1/2 has new status: success (50m23s) 🞋 job i915:apl 3/3 has new status: success (1h6m39s) 🞋 job i915:apl 2/3 has new status: success (1h4m45s) 🞋 job i915:apl 1/3 has new status: success (1h7m38s) (generated with ci_run_n_monitor.py script) The longest job is 1h19m58s, so adjust the timeout. Signed-off-by: Helen Koike <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Maxime Ripard <[email protected]>
2023-10-26drm/ci: add subset-1-gfx to LAVA_TAGS and adjust shardsHelen Koike2-10/+15
The Collabora Lava farm added a tag called `subset-1-gfx` to half of devices the graphics community use. Lets use this tag so we don't occupy all the resources. This is particular important because Mesa3D shares the resources with DRM-CI and use them to do pre-merge tests, so it can block developers from getting their patches merged. Signed-off-by: Helen Koike <[email protected]> Reviewed-by: David Heidelberg <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Maxime Ripard <[email protected]>
2023-10-26drm/ci: clean up xfails (specially flakes list)Helen Koike33-287/+162
Since the script that collected the list of the expectation files was bogus and placing test to the flakes list incorrectly, restart the expectation files with the correct script. This reduces a lot the number of tests in the flakes list. Signed-off-by: Helen Koike <[email protected]> Reviewed-by: David Heidelberg <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Maxime Ripard <[email protected]>
2023-10-26drm/ci: uprev IGT and make sure core_getversion is runHelen Koike3-9/+26
IGT has recently merged a patch that makes code_getversion test to fails if the driver isn't loaded or if it isn't the expected one defined in variable IGT_FORCE_DRIVER. Without this test, jobs were passing when the driver didn't load or probe for some reason, giving the illusion that everything was ok. Uprev IGT to include this modification and include core_getversion test in all the shards. Signed-off-by: Helen Koike <[email protected]> Reviewed-by: David Heidelberg <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Maxime Ripard <[email protected]>
2023-10-26drm/ci: add helper script update-xfails.pyHelen Koike2-0/+221
Add helper script that given a gitlab pipeline url, analyse which are the failures and flakes and update the xfails folder accordingly. Example: Trigger a pipeline in gitlab infrastructure, than re-try a few jobs more than once (so we can have data if failures are consistent across jobs with the same name or if they are flakes) and execute: update-xfails.py https://gitlab.freedesktop.org/helen.fornazier/linux/-/pipelines/970661 git diff should show you that it updated files in xfails folder. Signed-off-by: Helen Koike <[email protected]> Tested-by: Vignesh Raman <[email protected]> Reviewed-by: David Heidelberg <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Maxime Ripard <[email protected]>
2023-10-26drm/ci: fix DEBIAN_ARCH and get amdgpu probingHelen Koike4-8/+8
amdgpu driver wasn't loading because amdgpu firmware wasn't being installed in the rootfs due to the wrong DEBIAN_ARCH variable. rename ARCH to DEBIAN_ARCH also, so we don't have the confusing DEBIAN_ARCH, KERNEL_ARCH and ARCH. Signed-off-by: Helen Koike <[email protected]> Reviewed-by: David Heidelberg <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Maxime Ripard <[email protected]>
2023-10-26drm/ci: uprev mesa version: fix container build & crosvmHelen Koike4-3/+22
When building containers, some rust packages were installed without locking the dependencies version, which got updated and started giving errors like: error: failed to compile `bindgen-cli v0.62.0`, intermediate artifacts can be found at `/tmp/cargo-installkNKRwf` Caused by: package `rustix v0.38.13` cannot be built because it requires rustc 1.63 or newer, while the currently active rustc version is 1.60.0 A patch to Mesa was added fixing this error, so update it. Also, commit in linux kernel 6.6 rc3 broke booting in crosvm. Mesa has upreved crosvm to fix this issue. Signed-off-by: Helen Koike <[email protected]> [crosvm mesa update] Co-Developed-by: Vignesh Raman <[email protected]> Signed-off-by: Vignesh Raman <[email protected]> [v1 container build uprev] Tested-by: Jessica Zhang <[email protected]> Acked-by: Jessica Zhang <[email protected]> Reviewed-by: David Heidelberg <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Maxime Ripard <[email protected]>
2023-10-26drm/ci: Enable CONFIG_BACKLIGHT_CLASS_DEVICERob Clark2-0/+2
Dependency for CONFIG_DRM_PANEL_EDP. Missing this was causing the drm driver to not probe on devices that use panel-edp. Signed-off-by: Rob Clark <[email protected]> Tested-by: Helen Koike <[email protected]> Acked-by: Helen Koike <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Maxime Ripard <[email protected]>
2023-10-26drm/ci: force-enable CONFIG_MSM_MMCC_8996 as built-inDmitry Baryshkov1-0/+1
Enable CONFIG_MSM_MMCC_8996, the multimedia clock controller on Qualcomm MSM8996 to prevent the the board from hitting the probe deferral timeouts in CI run. Signed-off-by: Dmitry Baryshkov <[email protected]> Tested-by: Helen Koike <[email protected]> Acked-by: Helen Koike <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Maxime Ripard <[email protected]>
2023-10-26drm/ci: pick up -external-fixes from the merge target repoDmitry Baryshkov1-0/+5
In case of the merge requests it might be useful to push repo-specific fixes which have not yet propagated to the -external-fixes branch in the main UPSTREAM_REPO. For example, in case of drm/msm development, we are staging fixes locally for testing, before pushing them to the drm/drm repo. Thus, if the CI run was triggered by merge request, also pick up the -external fixes basing on the the CI_MERGE target repo / and branch. Signed-off-by: Dmitry Baryshkov <[email protected]> Acked-by: Helen Koike <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Maxime Ripard <[email protected]>
2023-10-26drm/vc4: tests: Fix UAF in the mock helpersMaxime Ripard2-2/+2
The VC4 mock helpers allocate the CRTC, encoders and connectors using a call to kunit_kzalloc(), but the DRM device they are attache to survives for longer than the test itself which leads to use-after-frees reported by KASAN. Switch to drmm_kzalloc to tie the lifetime of these objects to the main DRM device. Fixes: f759f5b53f1c ("drm/vc4: tests: Introduce a mocking infrastructure") Reported-by: Linux Kernel Functional Testing <[email protected]> Closes: https://lore.kernel.org/all/CA+G9fYvJA2HGqzR9LGgq63v0SKaUejHAE6f7+z9cwWN-ourJ_g@mail.gmail.com/ Tested-by: Anders Roxell <[email protected]> Reviewed-by: Maíra Canal <[email protected]> Signed-off-by: Maxime Ripard <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-25file, i915: fix file reference for mmap_singleton()Christian Brauner1-3/+1
Today we got a report at [1] for rcu stalls on the i915 testsuite in [2] due to the conversion of files to SLAB_TYPSSAFE_BY_RCU. Afaict, get_file_rcu() goes into an infinite loop trying to carefully verify that i915->gem.mmap_singleton hasn't changed - see the splat below. So I stared at this code to figure out what it actually does. It seems that the i915->gem.mmap_singleton pointer itself never had rcu semantics. The i915->gem.mmap_singleton is replaced in file->f_op->release::singleton_release(): static int singleton_release(struct inode *inode, struct file *file) { struct drm_i915_private *i915 = file->private_data; cmpxchg(&i915->gem.mmap_singleton, file, NULL); drm_dev_put(&i915->drm); return 0; } The cmpxchg() is ordered against a concurrent update of i915->gem.mmap_singleton from mmap_singleton(). IOW, when mmap_singleton() fails to get a reference on i915->gem.mmap_singleton: While mmap_singleton() does rcu_read_lock(); file = get_file_rcu(&i915->gem.mmap_singleton); rcu_read_unlock(); it allocates a new file via anon_inode_getfile() and does smp_store_mb(i915->gem.mmap_singleton, file); So, then what happens in the case of this bug is that at some point fput() is called and drops the file->f_count to zero leaving the pointer in i915->gem.mmap_singleton in tact. Now, there might be delays until file->f_op->release::singleton_release() is called and i915->gem.mmap_singleton is set to NULL. Say concurrently another task hits mmap_singleton() and does: rcu_read_lock(); file = get_file_rcu(&i915->gem.mmap_singleton); rcu_read_unlock(); When get_file_rcu() fails to get a reference via atomic_inc_not_zero() it will try the reload from i915->gem.mmap_singleton expecting it to be NULL, assuming it has comparable semantics as we expect in __fget_files_rcu(). But it hasn't so it reloads the same pointer again, trying the same atomic_inc_not_zero() again and doing so until file->f_op->release::singleton_release() of the old file has been called. So, in contrast to __fget_files_rcu() here we want to not retry when atomic_inc_not_zero() has failed. We only want to retry in case we managed to get a reference but the pointer did change on reload. <3> [511.395679] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: <3> [511.395716] rcu: Tasks blocked on level-1 rcu_node (CPUs 0-9): P6238 <3> [511.395934] rcu: (detected by 16, t=65002 jiffies, g=123977, q=439 ncpus=20) <6> [511.395944] task:i915_selftest state:R running task stack:10568 pid:6238 tgid:6238 ppid:1001 flags:0x00004002 <6> [511.395962] Call Trace: <6> [511.395966] <TASK> <6> [511.395974] ? __schedule+0x3a8/0xd70 <6> [511.395995] ? asm_sysvec_apic_timer_interrupt+0x1a/0x20 <6> [511.396003] ? lockdep_hardirqs_on+0xc3/0x140 <6> [511.396013] ? asm_sysvec_apic_timer_interrupt+0x1a/0x20 <6> [511.396029] ? get_file_rcu+0x10/0x30 <6> [511.396039] ? get_file_rcu+0x10/0x30 <6> [511.396046] ? i915_gem_object_mmap+0xbc/0x450 [i915] <6> [511.396509] ? i915_gem_mmap+0x272/0x480 [i915] <6> [511.396903] ? mmap_region+0x253/0xb60 <6> [511.396925] ? do_mmap+0x334/0x5c0 <6> [511.396939] ? vm_mmap_pgoff+0x9f/0x1c0 <6> [511.396949] ? rcu_is_watching+0x11/0x50 <6> [511.396962] ? igt_mmap_offset+0xfc/0x110 [i915] <6> [511.397376] ? __igt_mmap+0xb3/0x570 [i915] <6> [511.397762] ? igt_mmap+0x11e/0x150 [i915] <6> [511.398139] ? __trace_bprintk+0x76/0x90 <6> [511.398156] ? __i915_subtests+0xbf/0x240 [i915] <6> [511.398586] ? __pfx___i915_live_setup+0x10/0x10 [i915] <6> [511.399001] ? __pfx___i915_live_teardown+0x10/0x10 [i915] <6> [511.399433] ? __run_selftests+0xbc/0x1a0 [i915] <6> [511.399875] ? i915_live_selftests+0x4b/0x90 [i915] <6> [511.400308] ? i915_pci_probe+0x106/0x200 [i915] <6> [511.400692] ? pci_device_probe+0x95/0x120 <6> [511.400704] ? really_probe+0x164/0x3c0 <6> [511.400715] ? __pfx___driver_attach+0x10/0x10 <6> [511.400722] ? __driver_probe_device+0x73/0x160 <6> [511.400731] ? driver_probe_device+0x19/0xa0 <6> [511.400741] ? __driver_attach+0xb6/0x180 <6> [511.400749] ? __pfx___driver_attach+0x10/0x10 <6> [511.400756] ? bus_for_each_dev+0x77/0xd0 <6> [511.400770] ? bus_add_driver+0x114/0x210 <6> [511.400781] ? driver_register+0x5b/0x110 <6> [511.400791] ? i915_init+0x23/0xc0 [i915] <6> [511.401153] ? __pfx_i915_init+0x10/0x10 [i915] <6> [511.401503] ? do_one_initcall+0x57/0x270 <6> [511.401515] ? rcu_is_watching+0x11/0x50 <6> [511.401521] ? kmalloc_trace+0xa3/0xb0 <6> [511.401532] ? do_init_module+0x5f/0x210 <6> [511.401544] ? load_module+0x1d00/0x1f60 <6> [511.401581] ? init_module_from_file+0x86/0xd0 <6> [511.401590] ? init_module_from_file+0x86/0xd0 <6> [511.401613] ? idempotent_init_module+0x17c/0x230 <6> [511.401639] ? __x64_sys_finit_module+0x56/0xb0 <6> [511.401650] ? do_syscall_64+0x3c/0x90 <6> [511.401659] ? entry_SYSCALL_64_after_hwframe+0x6e/0xd8 <6> [511.401684] </TASK> Link: [1]: https://lore.kernel.org/intel-gfx/SJ1PR11MB6129CB39EED831784C331BAFB9DEA@SJ1PR11MB6129.namprd11.prod.outlook.com Link: [2]: https://intel-gfx-ci.01.org/tree/linux-next/next-20231013/bat-dg2-11/igt@i915_selftest@[email protected]#dmesg-warnings10963 Cc: Jann Horn <[email protected]>, Cc: Linus Torvalds <[email protected]> Link: https://lore.kernel.org/r/20231025-formfrage-watscheln-84526cd3bd7d@brauner Signed-off-by: Christian Brauner <[email protected]>
2023-10-25drm/amd: Disable ASPM for VI w/ all Intel systemsMario Limonciello1-1/+1
Originally we were quirking ASPM disabled specifically for VI when used with Alder Lake, but it appears to have problems with Rocket Lake as well. Like we've done in the case of dpm for newer platforms, disable ASPM for all Intel systems. Cc: [email protected] # 5.15+ Fixes: 0064b0ce85bb ("drm/amd/pm: enable ASPM by default") Reported-and-tested-by: Paolo Gentili <[email protected]> Closes: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2036742 Signed-off-by: Mario Limonciello <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-10-25drm/i915/pmu: Check if pmu is closed before stopping eventUmesh Nerlige Ramappa1-0/+9
When the driver unbinds, pmu is unregistered and i915->uabi_engines is set to RB_ROOT. Due to this, when i915 PMU tries to stop the engine events, it issues a warn_on because engine lookup fails. All perf hooks are taking care of this using a pmu->closed flag that is set when PMU unregisters. The stop event seems to have been left out. Check for pmu->closed in pmu_event_stop as well. Based on discussion here - https://patchwork.freedesktop.org/patch/492079/?series=105790&rev=2 v2: s/is/if/ in commit title v3: Add fixes tag and cc stable Cc: <[email protected]> # v5.11+ Fixes: b00bccb3f0bb ("drm/i915/pmu: Handle PCI unbind") Signed-off-by: Umesh Nerlige Ramappa <[email protected]> Reviewed-by: Tvrtko Ursulin <[email protected]> Reviewed-by: Andi Shyti <[email protected]> Signed-off-by: Andi Shyti <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit 31f6a06f0c543b43a38fab10f39e5fc45ad62aa2) Signed-off-by: Rodrigo Vivi <[email protected]>
2023-10-25drm/i915/mcr: Hold GT forcewake during steering operationsMatt Roper1-2/+22
The steering control and semaphore registers are inside an "always on" power domain with respect to RC6. However there are some issues if higher-level platform sleep states are entering/exiting at the same time these registers are accessed. Grabbing GT forcewake and holding it over the entire lock/steer/unlock cycle ensures that those sleep states have been fully exited before we access these registers. This is expected to become a formally documented/numbered workaround soon. Note that this patch alone isn't expected to have an immediately noticeable impact on MCR (mis)behavior; an upcoming pcode firmware update will also be necessary to provide the other half of this workaround. v2: - Move the forcewake inside the Xe_LPG-specific IP version check. This should only be necessary on platforms that have a steering semaphore. Fixes: 3100240bf846 ("drm/i915/mtl: Add hardware-level lock for steering") Cc: Radhakrishna Sripada <[email protected]> Cc: Jonathan Cavitt <[email protected]> Signed-off-by: Matt Roper <[email protected]> Reviewed-by: Radhakrishna Sripada <[email protected]> Reviewed-by: Jonathan Cavitt <[email protected]> Reviewed-by: Andi Shyti <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit 8fa1c7cd1fe9cdfc426a603e1f1eecd3f463c487) Signed-off-by: Rodrigo Vivi <[email protected]>
2023-10-25drm/logicvc: Kconfig: select REGMAP and REGMAP_MMIOSui Jingfeng1-0/+2
drm/logicvc driver is depend on REGMAP and REGMAP_MMIO, should select this two kconfig option, otherwise the driver failed to compile on platform without REGMAP_MMIO selected: ERROR: modpost: "__devm_regmap_init_mmio_clk" [drivers/gpu/drm/logicvc/logicvc-drm.ko] undefined! make[1]: *** [scripts/Makefile.modpost:136: Module.symvers] Error 1 make: *** [Makefile:1978: modpost] Error 2 Signed-off-by: Sui Jingfeng <[email protected]> Acked-by: Paul Kocialkowski <[email protected]> Fixes: efeeaefe9be5 ("drm: Add support for the LogiCVC display controller") Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Paul Kocialkowski <[email protected]>
2023-10-25Merge tag 'amd-drm-next-6.7-2023-10-20' of ↵Dave Airlie58-1464/+1452
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.7-2023-10-20: amdgpu: - SMU 13 updates - UMSCH updates - DC MPO fixes - RAS updates - MES 11 fixes - Fix possible memory leaks in error pathes - GC 11.5 fixes - Kernel doc updates - PSP updates - APU IMU fixes - Misc code cleanups - SMU 11 fixes - OD fix - Frame size warning fixes - SR-IOV fixes - NBIO 7.11 updates - NBIO 7.7 updates - XGMI fixes - devcoredump updates amdkfd: - Misc code cleanups - SVM fixes Signed-off-by: Dave Airlie <[email protected]> From: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-24drm/rockchip: vop: Add NV15, NV20 and NV30 supportJonas Karlman3-17/+86
Add support for displaying 10-bit 4:2:0 and 4:2:2 formats produced by the Rockchip Video Decoder on RK322X, RK3288, RK3328 and RK3399. Also add support for 10-bit 4:4:4 format while at it. V5: Use drm_format_info_min_pitch() for correct bpp Add missing NV21, NV61 and NV42 formats V4: Rework RK3328/RK3399 win0/1 data to not affect RK3368 V2: Added NV30 support Signed-off-by: Jonas Karlman <[email protected]> Reviewed-by: Sandy Huang <[email protected]> Reviewed-by: Christopher Obbard <[email protected]> Tested-by: Christopher Obbard <[email protected]> Signed-off-by: Heiko Stuebner <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-24drm/fourcc: Add NV20 and NV30 YUV formatsJonas Karlman1-0/+8
DRM_FORMAT_NV20 and DRM_FORMAT_NV30 formats is the 2x1 and non-subsampled variant of NV15, a 10-bit 2-plane YUV format that has no padding between components. Instead, luminance and chrominance samples are grouped into 4s so that each group is packed into an integer number of bytes: YYYY = UVUV = 4 * 10 bits = 40 bits = 5 bytes The '20' and '30' suffix refers to the optimum effective bits per pixel which is achieved when the total number of luminance samples is a multiple of 4. V2: Added NV30 format Signed-off-by: Jonas Karlman <[email protected]> Reviewed-by: Sandy Huang <[email protected]> Reviewed-by: Christopher Obbard <[email protected]> Tested-by: Christopher Obbard <[email protected]> Signed-off-by: Heiko Stuebner <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-24drm/rockchip: vop2: rename window formats to show window type using themAndy Yan1-15/+15
formats_win_full_10bit is for cluster window, formats_win_full_10bit_yuyv is for rk356x esmart, rk3588 esmart window will support more format. formats_win_lite is for smart window. Rename it based the windows type may let meaning is clearer Signed-off-by: Andy Yan <[email protected]> Acked-by: Sascha Hauer <[email protected]> Signed-off-by: Heiko Stuebner <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-24drm/rockchip: vop2: Add more supported 10bit formatsAndy Yan2-6/+61
Add 10 bit RGB and AFBC based YUV format supported by vop2. Signed-off-by: Andy Yan <[email protected]> Acked-by: Sascha Hauer <[email protected]> Signed-off-by: Heiko Stuebner <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-24drm/rockchip: vop2: remove the unsupported format of cluster windowAndy Yan2-26/+1
The cluster window on vop2 doesn't support linear yuv format(NV12/16/24), it only support afbc based yuv format(DRM_FORMAT_YUV420_8BIT/10BIT), which will be added in next patch. Fixes: 604be85547ce ("drm/rockchip: Add VOP2 driver") Signed-off-by: Andy Yan <[email protected]> Acked-by: Sascha Hauer <[email protected]> Signed-off-by: Heiko Stuebner <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-24drm/rockchip: vop: fix format bpp calculationAndy Yan1-2/+16
We can't rely on cpp for bpp calculation as the cpp of some formats(DRM_FORMAT_YUV420_8BIT/10BIT, etc) is zero. Signed-off-by: Andy Yan <[email protected]> Acked-by: Sascha Hauer <[email protected]> Signed-off-by: Heiko Stuebner <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-24drm/bridge: synopsys: dw-mipi-dsi: Fix hcomponent lbcc for burst modeLiu Ying1-6/+12
In order to support burst mode, vendor drivers set lane_mbps higher than bandwidth through DPI interface. So, calculate horizontal component lane byte clock cycle(lbcc) based on lane_mbps instead of pixel clock rate for burst mode. Fixes: ac87d23694f4 ("drm/bridge: synopsys: dw-mipi-dsi: Use pixel clock rate to calculate lbcc") Reported-by: Heiko Stübner <[email protected]> Closes: https://lore.kernel.org/linux-arm-kernel/5979575.UjTJXf6HLC@diego/T/#u Tested-by: Heiko Stübner <[email protected]> # px30 minievb with xinpeng xpp055c272 Signed-off-by: Liu Ying <[email protected]> Signed-off-by: Heiko Stuebner <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-24drm/client: Convert drm_client_buffer_addfb() to drm_mode_addfb2()Geert Uytterhoeven1-8/+5
Currently drm_client_buffer_addfb() uses the legacy drm_mode_addfb(), which uses bpp and depth to guess the wanted buffer format. However, drm_client_buffer_addfb() already knows the exact buffer format, so there is no need to convert back and forth between buffer format and bpp/depth, and the function can just call drm_mode_addfb2() directly instead. Signed-off-by: Geert Uytterhoeven <[email protected]> Reviewed-by: Javier Martinez Canillas <[email protected]> Tested-by: Javier Martinez Canillas <[email protected]> Signed-off-by: Hamza Mahfooz <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/4b84adfc686288714e69d0442d22f1259ff74903.1697379891.git.geert@linux-m68k.org
2023-10-24drm/i915/perf: Determine context valid in OA reportsUmesh Nerlige Ramappa1-2/+2
When supporting OA for TGL, it was seen that the context valid bit in the report ID was not defined, however revisiting the spec seems to have this bit defined. The bit is used to determine if a context is valid on a context switch and is essential to determine active and idle periods for a context. Re-enable the context valid bit for gen12 platforms. BSpec: 52196 (description of report_id) v2: Include BSpec reference (Ashutosh) Fixes: 00a7f0d7155c ("drm/i915/tgl: Add perf support on TGL") Signed-off-by: Umesh Nerlige Ramappa <[email protected]> Reviewed-by: Ashutosh Dixit <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit 7eeaedf79989a8f131939782832e21e9218ed2a0) Signed-off-by: Rodrigo Vivi <[email protected]>
2023-10-24Merge tag 'topic/vmemdup-user-array-2023-10-24-1' of ↵Dave Airlie2-4/+4
git://anongit.freedesktop.org/drm/drm into drm-next vmemdup-user-array API and changes with it. This is just a process PR to merge the topic branch into drm-next, this contains some core kernel and drm changes. Signed-off-by: Dave Airlie <[email protected]> From: Dave Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-23drm/vc4: fix typoDario Binacchi1-1/+1
Replace 'pack' with 'back'. Fixes: c8b75bca92cb ("drm/vc4: Add KMS support for Raspberry Pi.") Signed-off-by: Dario Binacchi <[email protected]> Reviewed-by: Dave Stevenson <[email protected]> Reviewed-by: Bagas Sanjaya <[email protected]> Signed-off-by: Maxime Ripard <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-23drm/amdkfd: reserve a fence slot while locking the BOChristian König1-1/+1
Looks like the KFD still needs this. Signed-off-by: Christian König <[email protected]> Fixes: 8abc1eb2987a ("drm/amdkfd: switch over to using drm_exec v3") Acked-by: Alex Deucher <[email protected]> Acked-by: Felix Kuehling <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-23Merge tag 'drm-msm-next-2023-10-17' of ↵Dave Airlie81-1400/+2214
https://gitlab.freedesktop.org/drm/msm into drm-next Updates for v6.7 DP: - use existing helpers for DPCD handling instead of open-coded functions - set the subconnector type according to the plugged cable / dongle skip validity check for DP CTS EDID checksum DPU: - continued migration of feature flags to use core revision checks - reworked interrupts code to use '0' as NO_IRQ, removed raw IRQ indices from log / trace output gpu: - a7xx support (a730, a740) - fixes and additional speedbins for a635, a643 core: - decouple msm_drv from kms to more cleanly support headless devices (like imx5+a2xx) From: Rob Clark <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGvzkBL2_OgyOeP_b6rVEjrNdfm8jcKzaB04HqHyT5jYwA@mail.gmail.com Signed-off-by: Dave Airlie <[email protected]>
2023-10-23BackMerge tag 'v6.6-rc7' into drm-nextDave Airlie47-157/+290
This is needed to add the msm pr which is based on a higher base. Signed-off-by: Dave Airlie <[email protected]>
2023-10-21drm/amdgpu: Remove redundant call to priority_is_valid()Luben Tuikov1-7/+8
Remove a redundant call to amdgpu_ctx_priority_is_valid() from amdgpu_ctx_priority_permit(), which is called from amdgpu_ctx_init() which is called from amdgpu_ctx_alloc() which is called from amdgpu_ctx_ioctl(), where we've called amdgpu_ctx_priority_is_valid() already first thing in the function. Cc: Alex Deucher <[email protected]> Cc: Christian König <[email protected]> Signed-off-by: Luben Tuikov <[email protected]> Acked-by: Christian König <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-10-20drm/amd/display: Fix stack size issue on DML2Rodrigo Siqueira1-45/+54
This commit is the last part of the fix that reduces the stack size in the DML2 code. Cc: Stephen Rothwell <[email protected]> Cc: Alex Deucher <[email protected]> Cc: Roman Li <[email protected]> Cc: Chaitanya Dhere <[email protected]> Fixes: 7966f319c66d ("drm/amd/display: Introduce DML2") Tested-by: Stephen Rothwell <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Rodrigo Siqueira <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-10-20drm/amd/display: Reduce stack size by splitting functionRodrigo Siqueira1-481/+489
When compiling with allmodconfig, gcc highlights the following error: drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/display_mode_core.c: In function 'dml_core_mode_support': drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/display_mode_core.c:8229:1: error: the frame size of 2736 bytes is larger than 2048 bytes [-Werror=frame-larger-than=] 8229 | } // dml_core_mode_support | ^ cc1: all warnings being treated as errors This commit mitigates part of this problem by extracting the prefetch code to its own function. After applying this commit, the stack size reduces from 2736 to 2464, however, the stack size issue becomes part of the new function. Cc: Stephen Rothwell <[email protected]> Cc: Alex Deucher <[email protected]> Cc: Roman Li <[email protected]> Cc: Chaitanya Dhere <[email protected]> Fixes: 7966f319c66d ("drm/amd/display: Introduce DML2") Tested-by: Stephen Rothwell <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Rodrigo Siqueira <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-10-20drm/amdkfd: remap unaligned svm ranges that have splitAlex Sierra1-10/+31
Split SVM ranges that have been mapped into 2MB page table entries, require to be remap in case the split has happened in a non-aligned VA. [WHY]: This condition causes the 2MB page table entries be split into 4KB PTEs. Signed-off-by: Alex Sierra <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>