path: root/drivers/gpu/drm/drm_mm.c
Age  Commit message  Author  Files  Lines
2020-06-23  drm/mm: cleanup and improve next_hole_*_addr()  Christian König  1  -72/+34
Skipping just one branch of the tree is not the most effective approach. Instead use a macro to define the traversal functions and sort out both branch sides. This improves the performance of the unit tests by a factor of more than 4. Signed-off-by: Christian König <[email protected]> Reviewed-by: Nirmoy Das <[email protected]> Link: https://patchwork.freedesktop.org/patch/370298/
2020-06-23  drm/mm: optimize find_hole() as well  Christian König  1  -4/+7
Abort early if there isn't enough space to allocate from a subtree. Signed-off-by: Christian König <[email protected]> Acked-by: Nirmoy Das <[email protected]> Link: https://patchwork.freedesktop.org/patch/370297/
2020-06-23  drm/mm: remove unused rb_hole_size()  Christian König  1  -5/+0
Just some code cleanup. Signed-off-by: Christian König <[email protected]> Reviewed-by: Nirmoy Das <[email protected]> Link: https://patchwork.freedesktop.org/patch/370296/
2020-06-15  drm/mm: remove invalid entry based optimization  Christian König  1  -4/+2
When the current entry is rejected as candidate for the search it does not mean that we can abort the subtree search. It is perfectly possible that only the alignment, but not the size is the reason for the rejection. Signed-off-by: Christian König <[email protected]> Reviewed-by: Nirmoy Das <[email protected]> Link: https://patchwork.freedesktop.org/patch/369394/
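The distinction between a size rejection and an alignment rejection can be sketched as below. This is a minimal user-space illustration, not the drm_mm code: `hole_fits()` and its parameters are hypothetical names, and an alignment of 0 is assumed to mean "no alignment requirement".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: can a request of @size with @alignment be placed
 * inside the hole [hole_start, hole_end)?  An alignment of 0 is treated
 * as "no alignment requirement" (an assumption of this sketch).
 */
static bool hole_fits(uint64_t hole_start, uint64_t hole_end,
		      uint64_t size, uint64_t alignment)
{
	uint64_t start = hole_start;

	assert(hole_start <= hole_end);

	if (alignment) {
		uint64_t rem = start % alignment;

		if (rem) {
			/* Rounding up must not wrap around the u64 range. */
			if (alignment - rem > UINT64_MAX - start)
				return false;
			start += alignment - rem;
		}
	}

	/* The aligned start must still leave @size bytes inside the hole. */
	return start <= hole_end && hole_end - start >= size;
}
```

With these definitions a hole of 0x1800 bytes at 0x0800 is rejected for a 0x1000-byte request aligned to 0x2000, while a smaller 0x1000-byte hole at 0x4000 accepts the same request. A rejected entry therefore says nothing about the rest of its subtree.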
2020-06-04  drm/mm: fix hole size comparison  Nirmoy Das  1  -2/+2
Fixes: 0cdea4455acd350a ("drm/mm: optimize rb_hole_addr rbtree search") Signed-off-by: Nirmoy Das <[email protected]> Reported-by: Christian König <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Christian König <[email protected]> Link: https://patchwork.freedesktop.org/patch/367726/
2020-05-05  drm/mm: optimize rb_hole_addr rbtree search  Nirmoy Das  1  -19/+114
Userspace can severely fragment the rb_hole_addr rbtree by manipulating alignment while allocating buffers. A fragmented rb_hole_addr rbtree results in large delays while allocating a buffer object for a userspace application. It takes a long time to find a suitable hole because, if we fail to find one on the first attempt, we look at neighbouring nodes using rb_prev()/rb_next(). Traversing the rbtree using rb_prev()/rb_next() can take a really long time if the tree is fragmented.

This patch improves searches in a fragmented rb_hole_addr rbtree by turning it into an augmented rbtree which stores an extra field in drm_mm_node, subtree_max_hole. Each drm_mm_node now stores the maximum hole size for its subtree in drm_mm_node->subtree_max_hole. Using drm_mm_node->subtree_max_hole, it is possible to eliminate a complete subtree if that subtree is unable to serve a request, hence reducing the number of rb_prev()/rb_next() calls.

With this patch applied, 1 million bo allocs on amdgpu took ~8 sec, compared to 50k bo allocs which took 28 sec without it.

partial test code:

    int test_fragmentation(void)
    {
        int i = 0;
        uint32_t minor_version;
        uint32_t major_version;
        struct amdgpu_bo_alloc_request request = {};
        amdgpu_bo_handle vram_handle[MAX_ALLOC] = {};
        amdgpu_device_handle device_handle;

        request.alloc_size = 4096;
        request.phys_alignment = 8192;
        request.preferred_heap = AMDGPU_GEM_DOMAIN_VRAM;

        int fd = open("/dev/dri/card0", O_RDWR | O_CLOEXEC);
        amdgpu_device_initialize(fd, &major_version, &minor_version, &device_handle);

        for (i = 0; i < MAX_ALLOC; i++) {
            amdgpu_bo_alloc(device_handle, &request, &vram_handle[i]);
        }

        for (i = 0; i < MAX_ALLOC; i++)
            amdgpu_bo_free(vram_handle[i]);

        return 0;
    }

v2: Use RB_DECLARE_CALLBACKS_MAX to maintain subtree_max_hole
v3: insert_hole_addr() should be a static function
    fix return value of next_hole_high_addr()/next_hole_low_addr()
    Reported-by: kbuild test robot <[email protected]>
v4: Fix commit message.
Signed-off-by: Nirmoy Das <[email protected]> Reviewed-by: Chris Wilson <[email protected]> Acked-by: Christian König <[email protected]> Link: https://patchwork.freedesktop.org/patch/364341/ Signed-off-by: Christian König <[email protected]>
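The subtree pruning described in this commit message can be sketched in user space roughly as follows. This is a minimal illustration under stated assumptions, not the kernel implementation: it uses a plain, unbalanced binary tree instead of the kernel rbtree, and `struct hole`, `update_max()` and `find_low()` are hypothetical names. Only the field name `subtree_max_hole` is taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a hole in a tree sorted by start address. */
struct hole {
	uint64_t start;
	uint64_t size;
	uint64_t subtree_max_hole;	/* largest hole size in this subtree */
	struct hole *left;
	struct hole *right;
};

/* Recompute the augmented value from the children; call bottom-up. */
static void update_max(struct hole *h)
{
	uint64_t max;

	assert(h != NULL);
	max = h->size;
	if (h->left && h->left->subtree_max_hole > max)
		max = h->left->subtree_max_hole;
	if (h->right && h->right->subtree_max_hole > max)
		max = h->right->subtree_max_hole;
	h->subtree_max_hole = max;
}

/* Return the lowest-address hole with at least @size bytes, or NULL. */
static const struct hole *find_low(const struct hole *h, uint64_t size)
{
	const struct hole *found;

	/* Prune: nothing in this subtree is large enough. */
	if (!h || h->subtree_max_hole < size)
		return NULL;

	found = find_low(h->left, size);	/* lower addresses first */
	if (found)
		return found;
	if (h->size >= size)
		return h;
	return find_low(h->right, size);
}
```

The single comparison against `subtree_max_hole` is what lets the search skip a whole subtree instead of stepping through it node by node. Alignment is deliberately left out of this sketch.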
2020-03-31  drm/mm: revert "Break long searches in fragmented address spaces"  Christian König  1  -7/+1
This reverts commit 7be1b9b8e9d1e9ef0342d2e001f44eec4030aa4d. The drm_mm is supposed to work in atomic context, so calling schedule() or in this case cond_resched() is illegal. Signed-off-by: Christian König <[email protected]> Acked-by: Daniel Vetter <[email protected]> Link: https://patchwork.freedesktop.org/patch/359278/
2020-03-10  drm/mm: Remove redundant assignment in drm_mm_reserve_node  Akeem G Abodunrin  1  -1/+1
In Pete Goodliffe's words, "You can improve a system by adding new code. You can also improve a system by removing code" - In this case, commit "202b52b7fbf70" added new code to initialize the end of the node. So, there is no need for duplicated initialization, and this patch simply removes it. Signed-off-by: Akeem G Abodunrin <[email protected]> Cc: Chris Wilson <[email protected]> Reviewed-by: Chris Wilson <[email protected]> Signed-off-by: Chris Wilson <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-03-06  drm/mm: Break long searches in fragmented address spaces  Chris Wilson  1  -1/+7
We try hard to select a suitable hole in the drm_mm first time. But if that is unsuccessful, we then have to look at neighbouring nodes, and this requires traversing the rbtree. Walking the rbtree can be slow (much slower than a linear list for deep trees), and if the drm_mm has been purposefully fragmented our search can be trapped for a long, long time. For non-preemptible kernels, we need to break up long CPU bound sections by manually checking for cond_resched(); similarly we should also bail out if we have been told to terminate. (In an ideal world, we would break for any signal, but we need to trade off having to perform the search again after ERESTARTSYS, which again may form a trap of making no forward progress.) Reported-by: Zbigniew Kempczyński <[email protected]> Signed-off-by: Chris Wilson <[email protected]> Cc: Zbigniew Kempczyński <[email protected]> Cc: Joonas Lahtinen <[email protected]> Reviewed-by: Andi Shyti <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2019-10-04  drm/mm: Use clear_bit_unlock() for releasing the drm_mm_node()  Chris Wilson  1  -5/+6
A few callers need to serialise the destruction of their drm_mm_node and ensure it is removed from the drm_mm before freeing. However, to be completely sure that any access from another thread is complete before we free the struct, we require the RELEASE semantics of clear_bit_unlock().

This allows the conditional locking such as

    Thread A                        Thread B
    mutex_lock(mm_lock);            if (drm_mm_node_allocated(node)) {
    drm_mm_node_remove(node);           mutex_lock(mm_lock);
    mutex_unlock(mm_lock);              if (drm_mm_node_allocated(node))
                                            drm_mm_node_remove(node);
                                        mutex_unlock(mm_lock);
                                    }
                                    kfree(node);

to serialise correctly without any lingering accesses from A to the freed node. Allocation / insertion of the node is assumed never to race with removal or eviction scanning.

Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Tvrtko Ursulin <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
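A user-space analogue of the release-on-clear idea can be written with C11 atomics. This is only a sketch under stated assumptions: `struct node`, `NODE_ALLOCATED` and the helper names are hypothetical, and the acquire load in `node_allocated()` is a choice made here for C11 correctness, not a statement about how the kernel helper is implemented.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NODE_ALLOCATED (1ul << 0)	/* hypothetical flag bit */

struct node {
	atomic_ulong flags;
	unsigned long start;
};

static void node_init(struct node *n)
{
	assert(n != NULL);
	atomic_init(&n->flags, 0);
	n->start = 0;
}

/* Insertion is assumed not to race with removal, so relaxed is enough. */
static void node_insert(struct node *n, unsigned long start)
{
	n->start = start;
	atomic_fetch_or_explicit(&n->flags, NODE_ALLOCATED,
				 memory_order_relaxed);
}

/*
 * Clearing the flag with release ordering makes every earlier write to
 * the node visible to a thread that later observes the flag as clear.
 */
static void node_remove(struct node *n)
{
	n->start = 0;
	atomic_fetch_and_explicit(&n->flags, ~NODE_ALLOCATED,
				  memory_order_release);
}

/* Acquire load pairs with the release in node_remove(). */
static bool node_allocated(struct node *n)
{
	return (atomic_load_explicit(&n->flags, memory_order_acquire) &
		NODE_ALLOCATED) != 0;
}
```

In the two-thread pattern above, a thread that sees `node_allocated()` return false may free the node without taking the lock, because the release/acquire pair orders the other thread's last accesses before the free.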
2019-10-04  drm/mm: Convert drm_mm_node booleans to bitops  Chris Wilson  1  -9/+9
A straightforward conversion of assignment and checking of the boolean state flags (allocated, scanned) into non-atomic bitops. The caller remains responsible for all locking around the drm_mm and its nodes. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Tvrtko Ursulin <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2019-10-04  drm/mm: Use helpers for drm_mm_node booleans  Chris Wilson  1  -7/+12
In preparation for rearranging the booleans into a flags field, ensure all the current users are using the inline helpers and not directly accessing the members. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Tvrtko Ursulin <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2019-06-26  drm: Allow range of 0 for drm_mm_insert_node_in_range()  Chris Wilson  1  -1/+1
We gracefully handle the caller specifying a zero range, so don't force them to special case that condition if it naturally falls out of their setup. What we don't check is if the end < start, so keep that as an assert for an illegal call. Signed-off-by: Chris Wilson <[email protected]> Cc: Joonas Lahtinen <[email protected]> Cc: Daniele Ceraolo Spurio <[email protected]> Cc: Daniel Vetter <[email protected]> Reviewed-by: Daniele Ceraolo Spurio <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2019-05-27  drm: drop use of drmP.h in drm/*  Sam Ravnborg  1  -4/+5
The use of the drmP.h header file is deprecated. Remove use from all files in drm/* so people do not look there and follow a bad example. Build tested allyesconfig,allmodconfig on x86, arm etc. Including alpha that is as always more challenging than the rest. Signed-off-by: Sam Ravnborg <[email protected]> Acked-by: Daniel Vetter <[email protected]> Cc: Maarten Lankhorst <[email protected]> Cc: Maxime Ripard <[email protected]> Cc: Sean Paul <[email protected]> Cc: David Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2019-04-29  drm: Simplify stacktrace handling  Thomas Gleixner  1  -15/+7
Replace the indirection through struct stack_trace by using the storage array based interfaces. The original code in all printing functions is really wrong. It allocates a storage array on stack which is unused because depot_fetch_stack() does not store anything in it. It overwrites the entries pointer in the stack_trace struct so it points to the depot storage. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Josh Poimboeuf <[email protected]> Acked-by: Daniel Vetter <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: [email protected] Cc: Joonas Lahtinen <[email protected]> Cc: Maarten Lankhorst <[email protected]> Cc: [email protected] Cc: David Airlie <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: [email protected] Cc: David Rientjes <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: [email protected] Cc: Mike Rapoport <[email protected]> Cc: Akinobu Mita <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: [email protected] Cc: Robin Murphy <[email protected]> Cc: Marek Szyprowski <[email protected]> Cc: Johannes Thumshirn <[email protected]> Cc: David Sterba <[email protected]> Cc: Chris Mason <[email protected]> Cc: Josef Bacik <[email protected]> Cc: [email protected] Cc: [email protected] Cc: Mike Snitzer <[email protected]> Cc: Alasdair Kergon <[email protected]> Cc: Tom Zanussi <[email protected]> Cc: Miroslav Benes <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2019-04-14  drm: Remove the ULONG_MAX stack trace hackery  Thomas Gleixner  1  -3/+0
No architecture terminates the stack trace with ULONG_MAX anymore. Remove the cruft. Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: [email protected] Cc: Joonas Lahtinen <[email protected]> Cc: Maarten Lankhorst <[email protected]> Cc: [email protected] Cc: David Airlie <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Rodrigo Vivi <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-02-04  drm: Trivial comment grammar cleanups  Matt Roper  1  -1/+1
Most of these are just cases where code comments used contractions (it's, who's) where they actually mean to use a possessive pronoun (its, whose) or vice-versa. Signed-off-by: Matt Roper <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2018-05-24  drm/mm: Add a search-by-address variant to only inspect a single hole  Chris Wilson  1  -2/+7
Searching for an available hole by address is slow, as there is no guarantee that a hole will be available and so we must walk over all nodes in the rbtree before we determine the search was futile. In many cases, the caller doesn't strictly care for the highest available hole and was just opportunistically laying out the address space in a preferred order. In such cases, the caller can accept any address and would rather do so than do a slow walk. To be able to mix search strategies, the caller wants to tell the drm_mm how long to spend on the search. Without a good guide for what should be the best split, start with a request to try once at most. That is, return the top-most (or lowest) hole if it fulfils the alignment and size requirements. v2: Documentation, by way of example (selftests) and kerneldoc. Signed-off-by: Chris Wilson <[email protected]> Cc: Joonas Lahtinen <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2018-05-24  drm/mm: Reject over-sized allocation requests early  Chris Wilson  1  -25/+57
As we keep an rbtree of available holes sorted by their size, we can very easily determine if there is any hole large enough that might satisfy the allocation request. This helps when dealing with a highly fragmented address space and a request for a search by address. To cache the largest size, we convert into the cached rbtree variant which tracks the leftmost node for us. However, currently we sorted into ascending size order so the leftmost node is the smallest, and so to make it the largest hole we need to invert our sorting. Signed-off-by: Chris Wilson <[email protected]> Cc: Joonas Lahtinen <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
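The early rejection can be illustrated with a much simpler container than a cached rbtree. The sketch below assumes a hypothetical `struct hole_index` holding hole sizes sorted largest first, so that element 0 plays the role of the cached leftmost node. The names are illustrative and not part of drm_mm.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical index of hole sizes, kept sorted largest first. */
struct hole_index {
	const uint64_t *sizes;
	size_t count;
};

/*
 * A request can only succeed if it is no larger than the largest hole,
 * which is always the first entry thanks to the inverted sort order.
 */
static bool request_may_fit(const struct hole_index *idx, uint64_t size)
{
	assert(idx != NULL);
	return idx->count > 0 && idx->sizes[0] >= size;
}
```

A `true` result only means the search is worth starting. Alignment or range restrictions can still make the request fail later.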
2018-03-28  Backmerge tag 'v4.16-rc7' into drm-next  Dave Airlie  1  -3/+18
Linux 4.16-rc7 This was requested by Daniel, and things were getting a bit hard to reconcile, most of the conflicts were trivial though.
2018-02-22  Merge tag 'drm-misc-fixes-2018-02-21' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes  Dave Airlie  1  -3/+18
Fixes for 4.16. It contains fixes for a deadlock on runtime suspend on a few drivers, a memory leak on non-blocking commits, and a crash on color-eviction. There are also meson and edid fixes, plus a fix for a doc warning.

* tag 'drm-misc-fixes-2018-02-21' of git://anongit.freedesktop.org/drm/drm-misc:
  drm/tve200: fix kernel-doc documentation comment include
  drm/meson: fix vsync buffer update
  drm: Handle unexpected holes in color-eviction
  drm/edid: Add 6 bpc quirk for CPT panel in Asus UX303LA
  drm/amdgpu: Fix deadlock on runtime suspend
  drm/radeon: Fix deadlock on runtime suspend
  drm/nouveau: Fix deadlock on runtime suspend
  drm: Allow determining if current task is output poll worker
  workqueue: Allow retrieval of current task's work struct
  drm/atomic: Fix memleak on ERESTARTSYS during non-blocking commits
2018-02-20  drm/mm: Fix caching of leftmost node in the interval tree  Chris Wilson  1  -4/+5
When we descend the tree to find our slot, if we step to the right, we are no longer the leftmost node. Fixes: f808c13fd373 ("lib/interval_tree: fast overlap detection") Signed-off-by: Chris Wilson <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Jérôme Glisse <[email protected]> Cc: Christian König <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Acked-by: Christian König <[email protected]> for now. Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2018-02-20  drm: Handle unexpected holes in color-eviction  Chris Wilson  1  -3/+18
During eviction, the driver may free more than one hole in the drm_mm due to the side-effects in evicting the scanned nodes. However, drm_mm_scan_color_evict() expects that the scan result is the first available hole (in the mru freed hole_stack list): kernel BUG at drivers/gpu/drm/drm_mm.c:844! invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: i915 snd_hda_codec_analog snd_hda_codec_generic coretemp snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core lpc_ich snd_pcm e1000e mei_me prime_numbers mei CPU: 1 PID: 1490 Comm: gem_userptr_bli Tainted: G U 4.16.0-rc1-g740f57c54ecf-kasan_6+ #1 Hardware name: Dell Inc. OptiPlex 755 /0PU052, BIOS A08 02/19/2008 RIP: 0010:drm_mm_scan_color_evict+0x2b8/0x3d0 RSP: 0018:ffff880057a573f8 EFLAGS: 00010287 RAX: ffff8800611f5980 RBX: ffff880057a575d0 RCX: dffffc0000000000 RDX: 00000000029d5000 RSI: 1ffff1000af4aec1 RDI: ffff8800611f5a10 RBP: ffff88005ab884d0 R08: ffff880057a57600 R09: 000000000afff000 R10: 1ffff1000b5710b5 R11: 0000000000001000 R12: 1ffff1000af4ae82 R13: ffff8800611f59b0 R14: ffff8800611f5980 R15: ffff880057a57608 FS: 00007f2de0c2e8c0(0000) GS:ffff88006ac40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f2ddde1e000 CR3: 00000000609b2000 CR4: 00000000000006e0 Call Trace: ? drm_mm_scan_remove_block+0x330/0x330 ? drm_mm_scan_remove_block+0x151/0x330 i915_gem_evict_something+0x711/0xbd0 [i915] ? igt_evict_contexts+0x50/0x50 [i915] ? nop_clear_range+0x10/0x10 [i915] ? igt_evict_something+0x90/0x90 [i915] ? i915_gem_gtt_reserve+0x1a1/0x320 [i915] i915_gem_gtt_insert+0x237/0x400 [i915] __i915_vma_do_pin+0xc25/0x1a20 [i915] eb_lookup_vmas+0x1c63/0x3790 [i915] ? i915_gem_check_execbuffer+0x250/0x250 [i915] ? trace_hardirqs_on_caller+0x33f/0x590 ? _raw_spin_unlock_irqrestore+0x39/0x60 ? __pm_runtime_resume+0x7d/0xf0 i915_gem_do_execbuffer+0x86a/0x2ff0 [i915] ? __kmalloc+0x132/0x340 ? 
i915_gem_execbuffer2_ioctl+0x10f/0x760 [i915] ? drm_ioctl_kernel+0x12e/0x1c0 ? drm_ioctl+0x662/0x980 ? eb_relocate_slow+0xa90/0xa90 [i915] ? i915_gem_execbuffer2_ioctl+0x10f/0x760 [i915] ? __might_fault+0xea/0x1a0 i915_gem_execbuffer2_ioctl+0x3cc/0x760 [i915] ? i915_gem_execbuffer_ioctl+0xba0/0xba0 [i915] ? lock_acquire+0x3c0/0x3c0 ? i915_gem_execbuffer_ioctl+0xba0/0xba0 [i915] drm_ioctl_kernel+0x12e/0x1c0 drm_ioctl+0x662/0x980 ? i915_gem_execbuffer_ioctl+0xba0/0xba0 [i915] ? drm_getstats+0x20/0x20 ? debug_check_no_obj_freed+0x2a6/0x8c0 do_vfs_ioctl+0x170/0xe70 ? ioctl_preallocate+0x170/0x170 ? task_work_run+0xbe/0x160 ? lock_acquire+0x3c0/0x3c0 ? trace_hardirqs_on_caller+0x33f/0x590 ? _raw_spin_unlock_irq+0x2f/0x50 SyS_ioctl+0x36/0x70 ? do_vfs_ioctl+0xe70/0xe70 do_syscall_64+0x18c/0x5d0 entry_SYSCALL_64_after_hwframe+0x26/0x9b RIP: 0033:0x7f2ddf13b587 RSP: 002b:00007fff15c4f9d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f2ddf13b587 RDX: 00007fff15c4fa20 RSI: 0000000040406469 RDI: 0000000000000003 RBP: 00007fff15c4fa20 R08: 0000000000000000 R09: 00007f2ddf3fe120 R10: 0000000000000073 R11: 0000000000000246 R12: 0000000040406469 R13: 0000000000000003 R14: 00007fff15c4fa20 R15: 00000000000000c7 Code: 00 00 00 4a c7 44 22 08 00 00 00 00 42 c7 44 22 10 00 00 00 00 48 81 c4 b8 00 00 00 5b 5d 41 5c 41 5d 41 5e 41 5f c3 0f 0b 0f 0b <0f> 0b 31 c0 eb c0 4c 89 ef e8 9a 09 41 ff e9 1e fe ff ff 4c 89 RIP: drm_mm_scan_color_evict+0x2b8/0x3d0 RSP: ffff880057a573f8 We can trivially relax this assumption by searching the hole_stack for the scan result and warn instead if the driver called us without any result. 
Fixes: 3fa489dabea9 ("drm: Apply tight eviction scanning to color_adjust") Signed-off-by: Chris Wilson <[email protected]> Cc: Joonas Lahtinen <[email protected]> Cc: <[email protected]> # v4.11+ Reviewed-by: Joonas Lahtinen <[email protected]> Reviewed-by: Daniel Vetter <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2017-12-19  BackMerge tag 'v4.15-rc4' into drm-next  Dave Airlie  1  -3/+5
Linux 4.15-rc4 Daniel requested it to fix some messy conflicts.
2017-12-14  lib/rbtree,drm/mm: add rbtree_replace_node_cached()  Chris Wilson  1  -3/+5
Add a variant of rbtree_replace_node() that maintains the leftmost cache of struct rbtree_root_cached when replacing nodes within the rbtree. As drm_mm is the only rb_replace_node() being used on an interval tree, the mistake looks fairly self-contained. Furthermore the only user of drm_mm_replace_node() is its testsuite... Testcase: igt/drm_mm/replace Link: http://lkml.kernel.org/r/[email protected] Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Fixes: f808c13fd373 ("lib/interval_tree: fast overlap detection") Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Acked-by: Davidlohr Bueso <[email protected]> Cc: Jérôme Glisse <[email protected]> Cc: Joonas Lahtinen <[email protected]> Cc: Daniel Vetter <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-02  drm: Spelling fixes  Liviu Dudau  1  -1/+1
Minor spelling fix for 'monster' and replace 'on' with 'own' in comments. Signed-off-by: Liviu Dudau <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Gustavo Padovan <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2017-09-08  lib/interval_tree: fast overlap detection  Davidlohr Bueso  1  -8/+11
Allow interval trees to quickly check for overlaps to avoid unnecessary tree lookups in interval_tree_iter_first(). As of this patch, all interval tree flavors will require using a 'rb_root_cached' such that we can have the leftmost node easily available. While most users will make use of this feature, those with special functions (in addition to the generic insert, delete, search calls) will avoid using the cached option as they can do funky things with insertions -- for example, vma_interval_tree_insert_after(). [[email protected]: fix deadlock from typo vm_lock_anon_vma()] Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Davidlohr Bueso <[email protected]> Signed-off-by: Jérôme Glisse <[email protected]> Acked-by: Christian König <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Doug Ledford <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]> Cc: David Airlie <[email protected]> Cc: Jason Wang <[email protected]> Cc: Christian Benvenuti <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-02-06  drm: Micro-optimise drm_mm_for_each_node_in_range()  Chris Wilson  1  -1/+1
As we require valid start/end parameters, we can replace the initial potential NULL with a pointer to the drm_mm.head_node and so reduce the test on every iteration from a NULL + address comparison to just an address comparison. add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-26 (-26) function old new delta i915_gem_evict_for_node 719 693 -26 (No other users outside of the test harness.) Signed-off-by: Chris Wilson <[email protected]> Cc: Joonas Lahtinen <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2017-02-03  drm: Improve drm_mm search (and fix topdown allocation) with rbtrees  Chris Wilson  1  -216/+272
The drm_mm range manager claimed to support top-down insertion, but it was neither searching for the top-most hole that could fit the allocation request nor fitting the request to the hole correctly. In order to search the range efficiently, we create a secondary index for the holes using either their size or their address. This index allows us to find the smallest hole or the hole at the bottom or top of the range efficiently, whilst keeping the hole stack to rapidly service evictions. v2: Search for holes both high and low. Rename flags to mode. v3: Discover rb_entry_safe() and use it! v4: Kerneldoc for enum drm_mm_insert_mode. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Cc: Alex Deucher <[email protected]> Cc: "Christian König" <[email protected]> Cc: David Airlie <[email protected]> Cc: Russell King <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Sean Paul <[email protected]> Cc: Lucas Stach <[email protected]> Cc: Christian Gmeiner <[email protected]> Cc: Rob Clark <[email protected]> Cc: Thierry Reding <[email protected]> Cc: Stephen Warren <[email protected]> Cc: Alexandre Courbot <[email protected]> Cc: Eric Anholt <[email protected]> Cc: Sinclair Yeh <[email protected]> Cc: Thomas Hellstrom <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Sinclair Yeh <[email protected]> # vmwgfx Reviewed-by: Lucas Stach <[email protected]> #etnaviv Signed-off-by: Daniel Vetter <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2017-01-25  drm/gem|prime|mm: Use recommened kerneldoc for struct member refs  Daniel Vetter  1  -2/+2
I just learned that &struct_name.member_name works and looks pretty even. It doesn't (yet) link to the member directly though, which would be really good for big structures or vfunc tables (where the per-member kerneldoc tends to be long). Also some minor drive-by polish where it makes sense, I read a lot of docs ... Cc: Jani Nikula <[email protected]> Cc: Chris Wilson <[email protected]> Reviewed-by: Gustavo Padovan <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-12-30  drm/mm: Some doc polish  Daniel Vetter  1  -19/+22
Added some boilerplate for the structs, documented members where they are relevant and plenty of markup for hyperlinks all over. And a few small wording polish. Note that the intro needs some more love after the DRM_MM_INSERT_* patch from Chris has landed. v2: Spelling fixes (Chris). v3: Use &struct foo instead of &foo structure (Chris). Cc: Chris Wilson <[email protected]> Cc: Joonas Lahtinen <[email protected]> Reviewed-by: David Herrmann <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-12-30  drm/mm: Convert to drm_printer  Daniel Vetter  1  -56/+11
Including all drivers. I thought about keeping small compat functions to avoid having to change all drivers. But I really like the drm_printer idea, so figured spreading it more widely is a good thing. v2: Review from Chris: - Natural argument order and better name for drm_mm_print. - show_mm() macro in the selftest. Cc: Rob Clark <[email protected]> Cc: Russell King <[email protected]> Cc: Alex Deucher <[email protected]> Cc: Christian König <[email protected]> Cc: Lucas Stach <[email protected]> Cc: Tomi Valkeinen <[email protected]> Cc: Thierry Reding <[email protected]> Cc: Jyri Sarha <[email protected]> Reviewed-by: Chris Wilson <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-12-28  drm: Add kerneldoc markup for new @scan parameters in drm_mm  Chris Wilson  1  -0/+2
A couple of parameters slipped through the kerneldoc net. Reported-by: kbuild test robot <[email protected]> Signed-off-by: Chris Wilson <[email protected]> Cc: Daniel Vetter <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-12-28  drm/mm: Document locking rules  Daniel Vetter  1  -0/+5
Drivers need to take care. Motivated by a discussion between Mark and Rob on dri-devel. Cc: Mark yao <[email protected]> Cc: Rob Clark <[email protected]> Reviewed-by: Chris Wilson <[email protected]> [danvet: s/alloc|freeing/modifications/ per Chris' suggestion.] Signed-off-by: Daniel Vetter <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-12-28  drm: Use drm_mm_insert_node_in_range_generic() for everyone  Chris Wilson  1  -155/+11
Remove a superfluous helper as drm_mm_insert_node is equivalent to insert_node_in_range with a range of [0, U64_MAX]. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-12-28  drm: Apply range restriction after color adjustment when allocation  Chris Wilson  1  -10/+6
mm->color_adjust() compares the hole with its neighbouring nodes. They only abut before we restrict the hole, so we have to apply color_adjust before we apply the range restriction. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-12-28  drm: Wrap drm_mm_node.hole_follows  Chris Wilson  1  -6/+6
Insulate users from changes to the internal hole tracking within struct drm_mm_node by using an accessor for hole_follows. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> [danvet: resolve conflicts in i915_vma.c] Signed-off-by: Daniel Vetter <[email protected]>
2016-12-28  drm: Apply tight eviction scanning to color_adjust  Chris Wilson  1  -25/+51
Using mm->color_adjust makes the eviction scanner much trickier since we don't know the actual neighbours of the target hole until after it is created (after scanning is complete). To work out whether we need to evict the neighbours because they impact upon the hole, we have to then check the hole afterwards - requiring an extra step in the user of the eviction scanner when they apply color_adjust. v2: Massage kerneldoc. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-12-28  drm: Simplify drm_mm scan-list manipulation  Chris Wilson  1  -17/+18
Since we mandate a strict reverse-order of drm_mm_scan_remove_block() after drm_mm_scan_add_block() we can further simplify the list manipulations when generating the temporary scan-hole. v2: Highlight the games being played with the lists to track the scan holes without allocation. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-12-28  drm: Optimise power-of-two alignments in drm_mm_scan_add_block()  Chris Wilson  1  -1/+8
For power-of-two alignments, we can avoid the 64bit divide and do a simple bitwise AND instead. v2: s/alignment_mask/remainder_mask/ Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
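For a power-of-two alignment the remainder of a division equals the low bits of the address, so a mask can replace the 64-bit divide. The sketch below shows the idea in user-space C. `is_pow2()`, `align_remainder()` and `align_up()` are hypothetical helper names chosen for illustration, not the functions touched by this commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when @x is a non-zero power of two. */
static bool is_pow2(uint64_t x)
{
	return x != 0 && (x & (x - 1)) == 0;
}

/* Remainder of addr / alignment, avoiding the divide when possible. */
static uint64_t align_remainder(uint64_t addr, uint64_t alignment)
{
	assert(alignment != 0);

	if (is_pow2(alignment))
		return addr & (alignment - 1);	/* remainder mask */

	return addr % alignment;		/* general, slower path */
}

/*
 * Round @addr up to the next multiple of @alignment.  This sketch
 * assumes the result does not overflow 64 bits.
 */
static uint64_t align_up(uint64_t addr, uint64_t alignment)
{
	uint64_t rem = align_remainder(addr, alignment);

	return rem ? addr + (alignment - rem) : addr;
}
```

Both paths return the same value for any non-zero alignment. The mask path only matters on targets where a 64-bit division is expensive.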
2016-12-28  drm: Compute tight evictions for drm_mm_scan  Chris Wilson  1  -10/+50
Compute the minimal required hole during scan and only evict those nodes that overlap. This enables us to reduce the number of nodes we need to evict to the bare minimum. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-12-28  drm: Fix application of color vs range restriction when scanning drm_mm  Chris Wilson  1  -6/+9
The range restriction should be applied after the color adjustment, or else we may inadvertently apply the color adjustment to the restricted hole (and not against its neighbours). Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
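The ordering problem can be shown with a small sketch. The colour rule here is invented for illustration (a fixed guard gap after the hole's left neighbour), and `struct span`, `color_adjust`, `clamp_to_range` and `usable_span` are hypothetical names, not the drm_mm interface.

```c
#include <assert.h>
#include <stdint.h>

/* Half-open address range [start, end). Hypothetical type for illustration. */
struct span {
	uint64_t start;
	uint64_t end;
};

/* Assumed colour rule: leave `guard` bytes free after the left neighbour. */
static struct span color_adjust(struct span hole, uint64_t guard)
{
	assert(hole.start <= hole.end);
	if (guard > hole.end - hole.start)
		hole.start = hole.end;	/* nothing usable remains */
	else
		hole.start += guard;
	return hole;
}

/* Restrict a span to the caller's allowed range. */
static struct span clamp_to_range(struct span s, struct span range)
{
	if (s.start < range.start)
		s.start = range.start;
	if (s.end > range.end)
		s.end = range.end;
	if (s.start > s.end)
		s.start = s.end;	/* empty intersection */
	return s;
}

/* Colour adjustment first (against the real hole), range restriction second. */
static struct span usable_span(struct span hole, uint64_t guard,
			       struct span range)
{
	return clamp_to_range(color_adjust(hole, guard), range);
}
```

For a hole of `[0, 100)`, a guard of 10 and a range of `[50, 80)`, this order yields `[50, 80)`. Clamping first and adjusting afterwards would yield `[60, 80)`, applying the guard at the range boundary where no neighbour exists.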
2016-12-28drm: Unconditionally do the range check in drm_mm_scan_add_block()Chris Wilson1-49/+4
Doing the check is trivial (low cost in comparison to overall eviction) and helps simplify the code. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-12-28drm: Rename prev_node to hole in drm_mm_scan_add_block()Chris Wilson1-8/+8
Acknowledging that we were building up the hole was more useful to me when reading the code, than knowing the relationship between this node and the previous node. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-12-27drm: Extract struct drm_mm_scan from struct drm_mmChris Wilson1-54/+70
The scan state occupies a large proportion of the struct drm_mm, is rarely used, and only contains temporary state. That makes it suitable for moving into its own struct and onto the stack of the callers. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> [danvet: Fix up etnaviv to compile, was missing a BUG_ON.] Signed-off-by: Daniel Vetter <[email protected]>
2016-12-27drm: Add asserts to catch overflow in drm_mm_init() and drm_mm_init_scan()Chris Wilson1-0/+7
A simple assert to ensure that we don't overflow start + size when initialising the drm_mm, or its scanner. In future, we may want to switch to tracking the value of ranges (rather than size) so that we can cover the full u64, for example like resource tracking. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-12-27drm: Simplify drm_mm_clean()Chris Wilson1-18/+1
Since commit ea7b1dd44867 ("drm: mm: track free areas implicitly"), to test whether there are any nodes allocated within the range manager, we merely have to ask whether the node_list is empty. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-12-27drm: Detect overflow in drm_mm_reserve_node()Chris Wilson1-3/+2
Protect ourselves from a caller passing in node.start + node.size that will overflow and trick us into reserving that node. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
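A minimal sketch of this kind of overflow guard, assuming unsigned 64-bit start and size values. The helper name `range_is_valid` is hypothetical, and the real function reports an error code rather than returning a boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Reject ranges whose end wraps around the top of the u64 address space.
 * Unsigned overflow is well defined in C, so a wrapped (or empty) range
 * shows up as end <= start.
 */
static bool range_is_valid(uint64_t start, uint64_t size)
{
	uint64_t end = start + size;	/* may wrap */

	return end > start;
}
```

Note that this form also rejects a range that ends exactly at 2^64, since `end` wraps to 0. Tracking the last valid address instead of the size would avoid that limitation.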
2016-12-27drm: Fix kerneldoc for drm_mm_scan_remove_block()Chris Wilson1-16/+18
The nodes must be removed in the *reverse* order. This is correct in the overview, but backwards in the function description. Whilst here add Intel's copyright statement and tweak some formatting. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-12-27drm: Promote drm_mm alignment to u64Chris Wilson1-20/+17
In places (e.g. i915.ko), the alignment is exported to userspace as u64 and there now exists hardware for which we can indeed utilize a u64 alignment. As such, we need to keep 64bit integers throughout when handling alignment. Testcase: igt/drm_mm/align64 Testcase: igt/gem_exec_alignment Signed-off-by: Chris Wilson <[email protected]> Cc: Joonas Lahtinen <[email protected]> Cc: Christian König <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]