aboutsummaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/v3d/v3d_sched.c
AgeCommit message (Collapse)AuthorFilesLines
2024-09-11Merge v6.11-rc7 into drm-nextSimona Vetter1-0/+6
Thomas needs 5a498d4d06d6 ("drm/fbdev-dma: Only install deferred I/O if necessary") in drm-misc, so start the backmerge cascade. Signed-off-by: Simona Vetter <[email protected]>
2024-08-28drm/v3d: Disable preemption while updating GPU statsTvrtko Ursulin1-0/+6
We forgot to disable preemption around the write_seqcount_begin/end() pair while updating GPU stats: [ ] WARNING: CPU: 2 PID: 12 at include/linux/seqlock.h:221 __seqprop_assert.isra.0+0x128/0x150 [v3d] [ ] Workqueue: v3d_bin drm_sched_run_job_work [gpu_sched] <...snip...> [ ] Call trace: [ ] __seqprop_assert.isra.0+0x128/0x150 [v3d] [ ] v3d_job_start_stats.isra.0+0x90/0x218 [v3d] [ ] v3d_bin_job_run+0x23c/0x388 [v3d] [ ] drm_sched_run_job_work+0x520/0x6d0 [gpu_sched] [ ] process_one_work+0x62c/0xb48 [ ] worker_thread+0x468/0x5b0 [ ] kthread+0x1c4/0x1e0 [ ] ret_from_fork+0x10/0x20 Fix it. Cc: Maíra Canal <[email protected]> Cc: [email protected] # v6.10+ Fixes: 6abe93b621ab ("drm/v3d: Fix race-condition between sysfs/fdinfo and interrupt handler") Signed-off-by: Tvrtko Ursulin <[email protected]> Acked-by: Maíra Canal <[email protected]> Signed-off-by: Maíra Canal <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-08-27Merge v6.11-rc5 into drm-nextDaniel Vetter1-3/+11
amdgpu pr conconflicts due to patches cherry-picked to -fixes, I might as well catch up with a backmerge and handle them all. Plus both misc and intel maintainers asked for a backmerge anyway. Signed-off-by: Daniel Vetter <[email protected]>
2024-08-12drm/v3d: Fix out-of-bounds read in `v3d_csd_job_run()`Maíra Canal1-3/+11
When enabling UBSAN on Raspberry Pi 5, we get the following warning: [ 387.894977] UBSAN: array-index-out-of-bounds in drivers/gpu/drm/v3d/v3d_sched.c:320:3 [ 387.903868] index 7 is out of range for type '__u32 [7]' [ 387.909692] CPU: 0 PID: 1207 Comm: kworker/u16:2 Tainted: G WC 6.10.3-v8-16k-numa #151 [ 387.919166] Hardware name: Raspberry Pi 5 Model B Rev 1.0 (DT) [ 387.925961] Workqueue: v3d_csd drm_sched_run_job_work [gpu_sched] [ 387.932525] Call trace: [ 387.935296] dump_backtrace+0x170/0x1b8 [ 387.939403] show_stack+0x20/0x38 [ 387.942907] dump_stack_lvl+0x90/0xd0 [ 387.946785] dump_stack+0x18/0x28 [ 387.950301] __ubsan_handle_out_of_bounds+0x98/0xd0 [ 387.955383] v3d_csd_job_run+0x3a8/0x438 [v3d] [ 387.960707] drm_sched_run_job_work+0x520/0x6d0 [gpu_sched] [ 387.966862] process_one_work+0x62c/0xb48 [ 387.971296] worker_thread+0x468/0x5b0 [ 387.975317] kthread+0x1c4/0x1e0 [ 387.978818] ret_from_fork+0x10/0x20 [ 387.983014] ---[ end trace ]--- This happens because the UAPI provides only seven configuration registers and we are reading the eighth position of this u32 array. Therefore, fix the out-of-bounds read in `v3d_csd_job_run()` by accessing only seven positions on the '__u32 [7]' array. The eighth register exists indeed on V3D 7.1, but it isn't currently used. That being so, let's guarantee that it remains unused and add a note that it could be set in a future patch. Fixes: 0ad5bc1ce463 ("drm/v3d: fix up register addresses for V3D 7.x") Reported-by: Tvrtko Ursulin <[email protected]> Signed-off-by: Maíra Canal <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-08-08Merge tag 'drm-misc-next-2024-08-01' of ↵Daniel Vetter1-36/+43
https://gitlab.freedesktop.org/drm/misc/kernel into drm-next drm-misc-next for v6.12: UAPI Changes: virtio: - Define DRM capset Cross-subsystem Changes: dma-buf: - heaps: Clean up documentation printk: - Pass description to kmsg_dump() Core Changes: CI: - Update IGT tests - Point upstream repo to GitLab instance modesetting: - Introduce Power Saving Policy property for connectors - Add might_fault() to drm_modeset_lock priming - Add dynamic per-crtc vblank configuration support panic: - Avoid build-time interference with framebuffer console docs: - Document Colorspace property scheduler: - Remove full_recover from drm_sched_start TTM: - Make LRU walk restartable after dropping locks - Allow direct reclaim to allocate local memory Driver Changes: amdgpu: - Support Power Saving Policy connector property ast: - astdp: Support AST2600 with VGA; Clean up HPD bridge: - Silence error message on -EPROBE_DEFER - analogix: Clean aup - bridge-connector: Fix double free - lt6505: Disable interrupt when powered off - tc358767: Make default DP port preemphasis configurable gma500: - Update i2c terminology ivpu: - Add MODULE_FIRMWARE() lcdif: - Fix pixel clock loongson: - Use GEM refcount over TTM's mgag200: - Improve BMC handling - Support VBLANK intterupts nouveau: - Refactor and clean up internals - Use GEM refcount over TTM's panel: - Shutdown fixes plus documentation - Refactor several drivers for better code sharing - boe-th101mb31ig002: Support for starry-er88577 MIPI-DSI panel plus DT; Fix porch parameter - edp: Support AOU B116XTN02.3, AUO B116XAN06.1, AOU B116XAT04.1, BOE NV140WUM-N41, BOE NV133WUM-N63, BOE NV116WHM-A4D, CMN N116BCA-EA2, CMN N116BCP-EA2, CSW MNB601LS1-4 - himax-hx8394: Support Microchip AC40T08A MIPI Display panel plus DT - ilitek-ili9806e: Support Densitron DMT028VGHMCMI-1D TFT plus DT - jd9365da: Support Melfas lmfbx101117480 MIPI-DSI panel plus DT; Refactor for code sharing sti: - Fix module owner stm: - Avoid UAF wih managed plane and CRTC helpers - Fix module owner - Fix error handling in probe - Depend on COMMON_CLK - ltdc: Fix transparency after disabling plane; Remove unused interrupt tegra: - Call drm_atomic_helper_shutdown() v3d: - Clean up perfmon vkms: - Clean up Signed-off-by: Daniel Vetter <[email protected]> From: Thomas Zimmermann <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-07-30Merge drm/drm-fixes into drm-misc-fixesMaxime Ripard1-4/+14
Let's start the new drm-misc-fixes cycle by bringing in 6.11-rc1. Signed-off-by: Maxime Ripard <[email protected]>
2024-07-29Merge drm/drm-next into drm-misc-nextThomas Zimmermann1-3/+13
Backmerging to get a late RC of v6.10 before moving into v6.11. Signed-off-by: Thomas Zimmermann <[email protected]>
2024-07-25drm/scheduler: remove full_recover from drm_sched_startChristian König1-1/+1
This was basically just another one of amdgpus hacks. The parameter allowed to restart the scheduler without turning fence signaling on again. That this is absolutely not a good idea should be obvious by now since the fences will then just sit there and never signal. While at it cleanup the code a bit. Signed-off-by: Christian König <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-07-18drm/v3d: Fix potential memory leak in the performance extensionTvrtko Ursulin1-6/+16
If fetching of userspace memory fails during the main loop, all drm sync objs looked up until that point will be leaked because of the missing drm_syncobj_put. Fix it by exporting and using a common cleanup helper. Signed-off-by: Tvrtko Ursulin <[email protected]> Fixes: bae7cb5d6800 ("drm/v3d: Create a CPU job extension for the reset performance query job") Cc: Maíra Canal <[email protected]> Cc: Iago Toral Quiroga <[email protected]> Cc: [email protected] # v6.8+ Signed-off-by: Maíra Canal <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit 484de39fa5f5b7bd0c5f2e2c5265167250ef7501) Signed-off-by: Thomas Zimmermann <[email protected]>
2024-07-18drm/v3d: Fix potential memory leak in the timestamp extensionTvrtko Ursulin1-6/+16
If fetching of userspace memory fails during the main loop, all drm sync objs looked up until that point will be leaked because of the missing drm_syncobj_put. Fix it by exporting and using a common cleanup helper. Signed-off-by: Tvrtko Ursulin <[email protected]> Fixes: 9ba0ff3e083f ("drm/v3d: Create a CPU job extension for the timestamp query job") Cc: Maíra Canal <[email protected]> Cc: Iago Toral Quiroga <[email protected]> Cc: [email protected] # v6.8+ Reviewed-by: Maíra Canal <[email protected]> Signed-off-by: Maíra Canal <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit 753ce4fea62182c77e1691ab4f9022008f25b62e) Signed-off-by: Thomas Zimmermann <[email protected]>
2024-07-15drm/v3d: Fix Indirect Dispatch configuration for V3D 7.1.6 and laterMaíra Canal1-3/+13
`args->cfg[4]` is configured in Indirect Dispatch using the number of batches. Currently, for all V3D tech versions, `args->cfg[4]` equals the number of batches subtracted by 1. But, for V3D 7.1.6 and later, we must not subtract 1 from the number of batches. Implement the fix by checking the V3D tech version and revision. Fixes several `dEQP-VK.synchronization*` CTS tests related to Indirect Dispatch. Fixes: 18b8413b25b7 ("drm/v3d: Create a CPU job extension for a indirect CSD job") Signed-off-by: Maíra Canal <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-07-13drm/v3d: Do not use intermediate storage when copying performance query resultsTvrtko Ursulin1-21/+36
Removing the intermediate buffer removes the last use of the V3D_MAX_COUNTERS define, which will enable further driver cleanup. While at it pull the 32 vs 64 bit copying decision outside the loop in order to reduce the number of conditional instructions. Signed-off-by: Tvrtko Ursulin <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Maíra Canal <[email protected]> Signed-off-by: Maíra Canal <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-07-13drm/v3d: Size the kperfmon_ids array at runtimeTvrtko Ursulin1-1/+3
Instead of statically reserving pessimistic space for the kperfmon_ids array, make the userspace extension code allocate the exactly required amount of space. Apart from saving some memory at runtime, this also removes the need for the V3D_MAX_PERFMONS macro whose removal will benefit further driver cleanup. Signed-off-by: Tvrtko Ursulin <[email protected]> Reviewed-by: Maíra Canal <[email protected]> Signed-off-by: Maíra Canal <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-07-13drm/v3d: Fix potential memory leak in the performance extensionTvrtko Ursulin1-6/+16
If fetching of userspace memory fails during the main loop, all drm sync objs looked up until that point will be leaked because of the missing drm_syncobj_put. Fix it by exporting and using a common cleanup helper. Signed-off-by: Tvrtko Ursulin <[email protected]> Fixes: bae7cb5d6800 ("drm/v3d: Create a CPU job extension for the reset performance query job") Cc: Maíra Canal <[email protected]> Cc: Iago Toral Quiroga <[email protected]> Cc: [email protected] # v6.8+ Signed-off-by: Maíra Canal <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-07-13drm/v3d: Fix potential memory leak in the timestamp extensionTvrtko Ursulin1-6/+16
If fetching of userspace memory fails during the main loop, all drm sync objs looked up until that point will be leaked because of the missing drm_syncobj_put. Fix it by exporting and using a common cleanup helper. Signed-off-by: Tvrtko Ursulin <[email protected]> Fixes: 9ba0ff3e083f ("drm/v3d: Create a CPU job extension for the timestamp query job") Cc: Maíra Canal <[email protected]> Cc: Iago Toral Quiroga <[email protected]> Cc: [email protected] # v6.8+ Reviewed-by: Maíra Canal <[email protected]> Signed-off-by: Maíra Canal <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-05-20drm/v3d: Use V3D_MAX_COUNTERS instead of V3D_PERFCNT_NUMMaíra Canal1-1/+1
V3D_PERFCNT_NUM represents the maximum number of performance counters for V3D 4.2, but not for V3D 7.1. This means that, if we use V3D_PERFCNT_NUM, we might go out-of-bounds on V3D 7.1. Therefore, use the number of performance counters on V3D 7.1 as the maximum number of counters. This will allow us to create arrays on the stack with reasonable size. Note that userspace must use the value provided by DRM_V3D_PARAM_MAX_PERF_COUNTERS. Signed-off-by: Maíra Canal <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-04-23drm/v3d: Fix race-condition between sysfs/fdinfo and interrupt handlerMaíra Canal1-0/+7
In V3D, the conclusion of a job is indicated by a IRQ. When a job finishes, then we update the local and the global GPU stats of that queue. But, while the GPU stats are being updated, a user might be reading the stats from sysfs or fdinfo. For example, on `gpu_stats_show()`, we could think about a scenario where `v3d->queue[queue].start_ns != 0`, then an interrupt happens, we update the value of `v3d->queue[queue].start_ns` to 0, we come back to `gpu_stats_show()` to calculate `active_runtime` and now, `active_runtime = timestamp`. In this simple example, the user would see a spike in the queue usage, that didn't match reality. In order to address this issue properly, use a seqcount to protect read and write sections of the code. Fixes: 09a93cc4f7d1 ("drm/v3d: Implement show_fdinfo() callback for GPU usage stats") Reported-by: Tvrtko Ursulin <[email protected]> Signed-off-by: Maíra Canal <[email protected]> Reviewed-by: Tvrtko Ursulin <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-04-23drm/v3d: Create function to update a set of GPU statsMaíra Canal1-7/+10
Given a set of GPU stats, that is, a `struct v3d_stats` related to a queue in a given context, create a function that can update this set of GPU stats. Signed-off-by: Maíra Canal <[email protected]> Reviewed-by: Tvrtko Ursulin <[email protected]> Reviewed-by: Jose Maria Casanova Crespo <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-04-23drm/v3d: Create a struct to store the GPU statsMaíra Canal1-8/+12
This will make it easier to instantiate the GPU stats variables and it will create a structure where we can store all the variables that refer to GPU stats. Note that, when we created the struct `v3d_stats`, we renamed `jobs_sent` to `jobs_completed`. This better express the semantics of the variable, as we are only accounting jobs that have been completed. Signed-off-by: Maíra Canal <[email protected]> Reviewed-by: Tvrtko Ursulin <[email protected]> Reviewed-by: Jose Maria Casanova Crespo <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-04-23drm/v3d: Create two functions to update all GPU stats variablesMaíra Canal1-45/+35
Currently, we manually perform all operations to update the GPU stats variables. Apart from the code repetition, this is very prone to errors, as we can see on commit 35f4f8c9fc97 ("drm/v3d: Don't increment `enabled_ns` twice"). Therefore, create two functions to manage updating all GPU stats variables. Now, the jobs only need to call for `v3d_job_update_stats()` when the job is done and `v3d_job_start_stats()` when starting the job. Co-developed-by: Tvrtko Ursulin <[email protected]> Signed-off-by: Tvrtko Ursulin <[email protected]> Signed-off-by: Maíra Canal <[email protected]> Reviewed-by: Jose Maria Casanova Crespo <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-12-01drm/v3d: Create a CPU job extension for the copy performance query jobMaíra Canal1-0/+65
A CPU job is a type of job that performs operations that requires CPU intervention. A copy performance query job is a job that copy the complete or partial result of a query to a buffer. In order to copy the result of a performance query to a buffer, we need to get the values from the performance monitors. So, create a user extension for the CPU job that enables the creation of a copy performance query job. This user extension will allow the creation of a CPU job that copy the results of a performance query to a BO with the possibility to indicate the availability with a availability bit. Signed-off-by: Maíra Canal <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-12-01drm/v3d: Create a CPU job extension for the reset performance query jobMaíra Canal1-0/+36
A CPU job is a type of job that performs operations that requires CPU intervention. A reset performance query job is a job that resets the performance queries by resetting the values of the perfmons. Moreover, we also reset the syncobjs related to the availability of the query. So, create a user extension for the CPU job that enables the creation of a reset performance job. This user extension will allow the creation of a CPU job that resets the perfmons values and resets the availability syncobj. Signed-off-by: Maíra Canal <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-12-01drm/v3d: Create a CPU job extension to copy timestamp query to a bufferMaíra Canal1-0/+56
A CPU job is a type of job that performs operations that requires CPU intervention. A copy timestamp query job is a job that copy the complete or partial result of a query to a buffer. As V3D doesn't provide any mechanism to obtain a timestamp from the GPU, it is a job that needs CPU intervention. So, create a user extension for the CPU job that enables the creation of a copy timestamp query job. This user extension will allow the creation of a CPU job that copy the results of a timestamp query to a BO with the possibility to indicate the timestamp availability with a availability bit. Signed-off-by: Maíra Canal <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-12-01drm/v3d: Create a CPU job extension for the reset timestamp jobMaíra Canal1-0/+21
A CPU job is a type of job that performs operations that requires CPU intervention. A reset timestamp job is a job that resets the timestamp queries based on the value offset of the first query. As V3D doesn't provide any mechanism to obtain a timestamp from the GPU, it is a job that needs CPU intervention. So, create a user extension for the CPU job that enables the creation of a reset timestamp job. This user extension will allow the creation of a CPU job that resets the timestamp value in the timestamp BO and resets the availability syncobj. Signed-off-by: Maíra Canal <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-12-01drm/v3d: Create a CPU job extension for the timestamp query jobMaíra Canal1-1/+39
A CPU job is a type of job that performs operations that requires CPU intervention. A timestamp query job is a job that calculates the query timestamp and updates the query availability by signaling a syncobj. As V3D doesn't provide any mechanism to obtain a timestamp from the GPU, it is a job that needs CPU intervention. So, create a user extension for the CPU job that enables the creation of a timestamp query job. This user extension will allow the creation of a CPU job that performs the timestamp query calculation and updates the timestamp BO with the proper value. Signed-off-by: Maíra Canal <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-12-01drm/v3d: Create a CPU job extension for a indirect CSD jobMaíra Canal1-1/+40
A CPU job is a type of job that performs operations that requires CPU intervention. An indirect CSD job is a job that, when executed in the queue, will map the indirect buffer, read the dispatch parameters, and submit a regular dispatch. Therefore, it is a job that needs CPU intervention. So, create a user extension for the CPU job that enables the creation of an indirect CSD. This user extension will allow the creation of a CSD job linked to a CPU job. The CPU job will wait for the indirect CSD job dependencies and, once they are signaled, it will update the CSD job parameters. Co-developed-by: Melissa Wen <[email protected]> Signed-off-by: Melissa Wen <[email protected]> Signed-off-by: Maíra Canal <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-12-01drm/v3d: Create tracepoints to track the CPU jobMaíra Canal1-0/+4
Create tracepoints to track the three major events of a CPU job lifetime: 1. Submission of a `v3d_submit_cpu` IOCTL 2. Beginning of the execution of a CPU job 3. Ending of the execution of a CPU job Signed-off-by: Maíra Canal <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-12-01drm/v3d: Add a CPU job submissionMelissa Wen1-0/+57
Create a new type of job, a CPU job. A CPU job is a type of job that performs operations that requires CPU intervention. The overall idea is to use user extensions to enable different types of CPU job, allowing the CPU job to perform different operations according to the type of user extension. The user extension ID identify the type of CPU job that must be dealt. Having a CPU job is interesting for synchronization purposes as a CPU job has a queue like any other V3D job and can be synchoronized by the multisync extension. Signed-off-by: Melissa Wen <[email protected]> Co-developed-by: Maíra Canal <[email protected]> Signed-off-by: Maíra Canal <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-11-06drm/v3d: Expose the total GPU usage stats on sysfsMaíra Canal1-1/+14
The previous patch exposed the accumulated amount of active time per client for each V3D queue. But this doesn't provide a global notion of the GPU usage. Therefore, provide the accumulated amount of active time for each V3D queue (BIN, RENDER, CSD, TFU and CACHE_CLEAN), considering all the jobs submitted to the queue, independent of the client. This data is exposed through the sysfs interface, so that if the interface is queried at two different points of time the usage percentage of each of the queues can be calculated. Co-developed-by: Jose Maria Casanova Crespo <[email protected]> Signed-off-by: Jose Maria Casanova Crespo <[email protected]> Signed-off-by: Maíra Canal <[email protected]> Acked-by: Jose Maria Casanova Crespo <[email protected]> Reviewed-by: Melissa Wen <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-11-06drm/v3d: Implement show_fdinfo() callback for GPU usage statsMaíra Canal1-0/+20
This patch exposes the accumulated amount of active time per client through the fdinfo infrastructure. The amount of active time is exposed for each V3D queue: BIN, RENDER, CSD, TFU and CACHE_CLEAN. In order to calculate the amount of active time per client, a CPU clock is used through the function local_clock(). The point where the jobs has started is marked and is finally compared with the time that the job had finished. Moreover, the number of jobs submitted to each queue is also exposed on fdinfo through the identifier "v3d-jobs-<queue>". Co-developed-by: Jose Maria Casanova Crespo <[email protected]> Signed-off-by: Jose Maria Casanova Crespo <[email protected]> Signed-off-by: Maíra Canal <[email protected]> Acked-by: Jose Maria Casanova Crespo <[email protected]> Reviewed-by: Melissa Wen <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-11-02drm/v3d: fix up register addresses for V3D 7.xIago Toral Quiroga1-17/+21
This patch updates a number of register addresses that have been changed in Raspberry Pi 5 (V3D 7.1) and updates the code to use the corresponding registers and addresses based on the actual V3D version. Signed-off-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Maíra Canal <[email protected]> Signed-off-by: Maíra Canal <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-11-01drm/sched: Convert drm scheduler to use a work queue rather than kthreadMatthew Brost1-5/+5
In Xe, the new Intel GPU driver, a choice has made to have a 1 to 1 mapping between a drm_gpu_scheduler and drm_sched_entity. At first this seems a bit odd but let us explain the reasoning below. 1. In Xe the submission order from multiple drm_sched_entity is not guaranteed to be the same completion even if targeting the same hardware engine. This is because in Xe we have a firmware scheduler, the GuC, which allowed to reorder, timeslice, and preempt submissions. If a using shared drm_gpu_scheduler across multiple drm_sched_entity, the TDR falls apart as the TDR expects submission order == completion order. Using a dedicated drm_gpu_scheduler per drm_sched_entity solve this problem. 2. In Xe submissions are done via programming a ring buffer (circular buffer), a drm_gpu_scheduler provides a limit on number of jobs, if the limit of number jobs is set to RING_SIZE / MAX_SIZE_PER_JOB we get flow control on the ring for free. A problem with this design is currently a drm_gpu_scheduler uses a kthread for submission / job cleanup. This doesn't scale if a large number of drm_gpu_scheduler are used. To work around the scaling issue, use a worker rather than kthread for submission / job cleanup. v2: - (Rob Clark) Fix msm build - Pass in run work queue v3: - (Boris) don't have loop in worker v4: - (Tvrtko) break out submit ready, stop, start helpers into own patch v5: - (Boris) default to ordered work queue v6: - (Luben / checkpatch) fix alignment in msm_ringbuffer.c - (Luben) s/drm_sched_submit_queue/drm_sched_wqueue_enqueue - (Luben) Update comment for drm_sched_wqueue_enqueue - (Luben) Positive check for submit_wq in drm_sched_init - (Luben) s/alloc_submit_wq/own_submit_wq v7: - (Luben) s/drm_sched_wqueue_enqueue/drm_sched_run_job_queue v8: - (Luben) Adjust var names / comments Signed-off-by: Matthew Brost <[email protected]> Reviewed-by: Luben Tuikov <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Luben Tuikov <[email protected]>
2023-10-26drm/sched: Convert the GPU scheduler to variable number of run-queuesLuben Tuikov1-0/+5
The GPU scheduler has now a variable number of run-queues, which are set up at drm_sched_init() time. This way, each driver announces how many run-queues it requires (supports) per each GPU scheduler it creates. Note, that run-queues correspond to scheduler "priorities", thus if the number of run-queues is set to 1 at drm_sched_init(), then that scheduler supports a single run-queue, i.e. single "priority". If a driver further sets a single entity per run-queue, then this creates a 1-to-1 correspondence between a scheduler and a scheduled entity. Cc: Lucas Stach <[email protected]> Cc: Russell King <[email protected]> Cc: Qiang Yu <[email protected]> Cc: Rob Clark <[email protected]> Cc: Abhinav Kumar <[email protected]> Cc: Dmitry Baryshkov <[email protected]> Cc: Danilo Krummrich <[email protected]> Cc: Matthew Brost <[email protected]> Cc: Boris Brezillon <[email protected]> Cc: Alex Deucher <[email protected]> Cc: Christian König <[email protected]> Cc: Emma Anholt <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Signed-off-by: Luben Tuikov <[email protected]> Acked-by: Christian König <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-03-01drm/v3d: centralize error handling when init scheduler failsMelissa Wen1-27/+13
Remove redundant error message (since now it is very similar to what we do in drm_sched_init) and centralize all error handling in a unique place, as we follow the same steps in any case of failure. Signed-off-by: Melissa Wen <[email protected]> Acked-by: Andrey Grodzovsky <[email protected]> Signed-off-by: Melissa Wen <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2022-02-23drm/sched: Add device pointer to drm_gpu_schedulerJiawei Gu1-5/+5
Add device pointer so scheduler's printing can use DRM_DEV_ERROR() instead, which makes life easier under multiple GPU scenario. v2: amend all calls of drm_sched_init() v3: fill dev pointer for all drm_sched_init() calls Signed-off-by: Jiawei Gu <[email protected]> Reviewed-by: Andrey Grodzovsky <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Christian König <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-08-30drm/v3d: Use scheduler dependency handlingDaniel Vetter1-28/+1
With the prep work out of the way this isn't tricky anymore. Aside: The chaining of the various jobs is a bit awkward, with the possibility of failure in bad places. I think with the drm_sched_job_init/arm split and maybe preloading the job->dependencies xarray this should be fixable. v2: Rebase over renamed function names for adding dependencies. Reviewed-by: Melissa Wen <[email protected]> (v1) Acked-by: Emma Anholt <[email protected]> Cc: Melissa Wen <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Cc: Emma Anholt <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-08-30drm/v3d: Move drm_sched_job_init to v3d_job_initDaniel Vetter1-8/+7
Prep work for using the scheduler dependency handling. We need to call drm_sched_job_init earlier so we can use the new drm_sched_job_await* functions for dependency handling here. v2: Slightly better commit message and rebase to include the drm_sched_job_arm() call (Emma). v3: Cleanup jobs under construction correctly (Emma) v4: Rebase over perfmon patch Reviewed-by: Melissa Wen <[email protected]> (v3) Acked-by: Emma Anholt <[email protected]> Cc: Melissa Wen <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Cc: Emma Anholt <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-07-21drm/v3d: Expose performance counters to userspaceJuan A. Suarez Romero1-0/+16
The V3D engine has several hardware performance counters that can of interest for userspace performance analysis tools. This exposes new ioctls to create and destroy performance monitor objects, as well as to query the counter values. Each created performance monitor object has an ID that can be attached to CL/CSD submissions, so the driver enables the requested counters when the job is submitted, and updates the performance monitor values when the job is done. It is up to the user to ensure all the jobs have been finished before getting the performance monitor values. It is also up to the user to properly synchronize BCL jobs when submitting jobs with different performance monitors attached. Cc: Daniel Vetter <[email protected]> Cc: David Airlie <[email protected]> Cc: Emma Anholt <[email protected]> To: [email protected] Signed-off-by: Juan A. Suarez Romero <[email protected]> Acked-by: Melissa Wen <[email protected]> Signed-off-by: Melissa Wen <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-07-01drm/sched: Allow using a dedicated workqueue for the timeout/fault tdrBoris Brezillon1-5/+5
Mali Midgard/Bifrost GPUs have 3 hardware queues but only a global GPU reset. This leads to extra complexity when we need to synchronize timeout works with the reset work. One solution to address that is to have an ordered workqueue at the driver level that will be used by the different schedulers to queue their timeout work. Thanks to the serialization provided by the ordered workqueue we are guaranteed that timeout handlers are executed sequentially, and can thus easily reset the GPU from the timeout handler without extra synchronization. v5: * Add a new paragraph to the timedout_job() method v3: * New patch v4: * Actually use the timeout_wq to queue the timeout work Suggested-by: Daniel Vetter <[email protected]> Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Steven Price <[email protected]> Reviewed-by: Lucas Stach <[email protected]> Acked-by: Daniel Vetter <[email protected]> Acked-by: Christian König <[email protected]> Cc: Qiang Yu <[email protected]> Cc: Emma Anholt <[email protected]> Cc: Alex Deucher <[email protected]> Cc: "Christian König" <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-02-05drm/scheduler: provide scheduler score externallyChristian König1-5/+5
Allow multiple schedulers to share the load balancing score. This is useful when one engine has different hw rings. Signed-off-by: Christian König <[email protected]> Reviewed-and-Tested-by: Leo Liu <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-02-02drm/v3d/v3d_sched: fix scheduler callbacks return statusChristian König1-6/+6
Looks like this was not correctly adjusted. Signed-off-by: Christian König <[email protected]> Acked-by: Daniel Vetter <[email protected]> Fixes: a6a1f036c74e ("drm/scheduler: Job timeout handler returns status (v3)") Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-01-29drm/scheduler: Job timeout handler returns status (v3)Luben Tuikov1-15/+17
This patch does not change current behaviour. The driver's job timeout handler now returns status indicating back to the DRM layer whether the device (GPU) is no longer available, such as after it's been unplugged, or whether all is normal, i.e. current behaviour. All drivers which make use of the drm_sched_backend_ops' .timedout_job() callback have been accordingly renamed and return the would've-been default value of DRM_GPU_SCHED_STAT_NOMINAL to restart the task's timeout timer--this is the old behaviour, and is preserved by this patch. v2: Use enum as the status of a driver's job timeout callback method. v3: Return scheduler/device information, rather than task information. Cc: Alexander Deucher <[email protected]> Cc: Andrey Grodzovsky <[email protected]> Cc: Christian König <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Lucas Stach <[email protected]> Cc: Russell King <[email protected]> Cc: Christian Gmeiner <[email protected]> Cc: Qiang Yu <[email protected]> Cc: Rob Herring <[email protected]> Cc: Tomeu Vizoso <[email protected]> Cc: Steven Price <[email protected]> Cc: Alyssa Rosenzweig <[email protected]> Cc: Eric Anholt <[email protected]> Reported-by: kernel test robot <[email protected]> Signed-off-by: Luben Tuikov <[email protected]> Acked-by: Alyssa Rosenzweig <[email protected]> Acked-by: Christian König <[email protected]> Acked-by: Steven Price <[email protected]> Signed-off-by: Christian König <[email protected]> Link: https://patchwork.freedesktop.org/patch/415095/
2020-11-18drm/v3d/v3d_sched: Demote non-conformant kernel-doc headerLee Jones1-2/+2
Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/v3d/v3d_sched.c:75: warning: Function parameter or member 'sched_job' not described in 'v3d_job_dependency' drivers/gpu/drm/v3d/v3d_sched.c:75: warning: Function parameter or member 's_entity' not described in 'v3d_job_dependency' Cc: Eric Anholt <[email protected]> Cc: David Airlie <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Sumit Semwal <[email protected]> Cc: "Christian König" <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Signed-off-by: Lee Jones <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-11-16drm: fix some kernel-doc markupsMauro Carvalho Chehab1-1/+1
Some identifiers have different names between their prototypes and the kernel-doc markup. Others need to be fixed, as kernel-doc markups should use this format: identifier - description Signed-off-by: Mauro Carvalho Chehab <[email protected]> Acked-by: Jani Nikula <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/12d4ca26f6843618200529ce5445063734d38c04.1605521731.git.mchehab+huawei@kernel.org
2020-04-28drm/v3d: Delete v3d_dev->devDaniel Vetter1-5/+5
We already have it in v3d_dev->drm.dev with zero additional pointer chasing. Personally I don't like duplicated pointers like this because: - reviewers need to check whether the pointer is for the same or different objects if there's multiple - compilers have an easier time too But also a bit a bikeshed, so feel free to ignore. Acked-by: Eric Anholt <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Cc: Eric Anholt <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2019-05-02drm/scheduler: rework job destructionChristian König1-1/+1
We now destroy finished jobs from the worker thread to make sure that we never destroy a job currently in timeout processing. By this we avoid holding lock around ring mirror list in drm_sched_stop which should solve a deadlock reported by a user. v2: Remove unused variable. v4: Move guilty job free into sched code. v5: Move sched->hw_rq_count to drm_sched_start to account for counter decrement in drm_sched_stop even when we don't call resubmit jobs if guily job did signal. v6: remove unused variable Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109692 Acked-by: Chunming Zhou <[email protected]> Signed-off-by: Christian König <[email protected]> Signed-off-by: Andrey Grodzovsky <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2019-04-18drm/v3d: Add missing implicit synchronization.Eric Anholt1-36/+4
It is the expectation of existing userspace (X11 + Mesa, in particular) that jobs submitted to the kernel against a shared BO will get implicitly synchronized by their submission order. If we want to allow clever userspace to disable implicit synchronization, we should do that under its own submit flag (as amdgpu and lima do). Note that we currently only implicitly sync for the rendering pass, not binning -- if you texture-from-pixmap in the binning vertex shader (vertex coordinate generation), you'll miss out on synchronization. Fixes flickering when multiple clients are running in parallel, particularly GL apps and compositors. v2: Fix a missing refcount on the CSD done fence for L2 cleaning. Signed-off-by: Eric Anholt <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Acked-by: Rob Clark <[email protected]>
2019-04-18drm/v3d: Add support for compute shader dispatch.Eric Anholt1-7/+114
The compute shader dispatch interface is pretty simple -- just pass in the regs that userspace has passed us, with no CLs to run. However, with no CL to run it means that we need to do manual cache flushing of the L2 after the HW execution completes (for SSBO, atomic, and image_load_store writes that are the output of compute shaders). This doesn't yet expose the L2 cache's ability to have a region of the address space not write back to memory (which could be used for shared_var storage). So far, the Mesa side has been tested on V3D v4.2 simpenrose (passing the ES31 tests), and on the kernel side on 7278 (failing atomic compswap tests in a way that doesn't reproduce on simpenrose). v2: Fix excessive allocation for the clean_job (reported by Dan Carpenter). Keep refs on jobs until clean_job is finished, to avoid spurious MMU errors if the output BOs are freed by userspace before L2 cleaning is finished. Signed-off-by: Eric Anholt <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Acked-by: Rob Clark <[email protected]>
2019-04-18drm/v3d: Refactor job management.Eric Anholt1-108/+151
The CL submission had two jobs embedded in an exec struct. When I added TFU support, I had to replicate some of the exec stuff and some of the job stuff. As I went to add CSD, it became clear that actually what was in exec should just be in the two CL jobs, and it would let us share a lot more code between the 4 queues. v2: Fix missing error path in TFU ioctl's bo[] allocation. Signed-off-by: Eric Anholt <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Acked-by: Rob Clark <[email protected]>
2019-04-01drm/v3d: Rename the fence signaled from IRQs to "irq_fence".Eric Anholt1-6/+6
We have another thing called the "done fence" that tracks when the scheduler considers the job done, and having the shared name was confusing. Signed-off-by: Eric Anholt <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Dave Emett <[email protected]>