blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2016-08-19	drm/i915: Flush delayed fence releases after reset	Chris Wilson	1	-1/+5
	What I never hit in testing, but Mika immediately did, was a GPU hang with a pending fence release (where a tiled object has been changed by the user to be untiled, and the update has not yet been committed to the fence register). As the stride/tiling is 0, this causes a divide-by-zero error when trying to write the new fence parameters: [ 28.784518] drm/i915: Resetting chip after gpu hang [ 28.784551] divide error: 0000 [#1] PREEMPT SMP [ 28.784565] Modules linked in: nls_iso8859_1 nls_cp437 vfat fat mxm_wmi x86_pkg_temp_thermal snd_hda_codec_hdmi kvm irqbypass snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec serio_raw snd_hwdep snd_hda_core snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq snd_timer snd_seq_device snd soundcore mac_hid wmi efivarfs autofs4 raid10 raid456 libcrc32c async_raid6_recov async_memcpy async_pq raid6_pq async_xor xor async_tx raid0 multipath linear psmouse e1000e ptp pps_core nvme nvme_core i915 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm video [ 28.784738] CPU: 0 PID: 1692 Comm: kworker/0:2 Not tainted 4.8.0-rc2+ #895 [ 28.784752] Hardware name: System manufacturer System Product Name/Z170M-PLUS, BIOS 1803 05/09/2016 [ 28.784786] Workqueue: events_long i915_hangcheck_elapsed [i915] [ 28.784814] task: ffff923c18f59d40 task.stack: ffff923c1b7e4000 [ 28.784827] RIP: 0010:[<ffffffffc0475b5f>] [<ffffffffc0475b5f>] fence_write+0x9f/0x3b0 [i915] [ 28.784854] RSP: 0018:ffff923c1b7e7b30 EFLAGS: 00010246 [ 28.784866] RAX: 00000000008ca000 RBX: ffff923c18540000 RCX: 0000000000000020 [ 28.784880] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000596d000 [ 28.784894] RBP: ffff923c1b7e7b68 R08: 0000000000000000 R09: 0000000000000000 [ 28.784908] R10: 0000000000000000 R11: 00000000008ca000 R12: ffff923c1ef9d600 [ 28.784921] R13: 0000000000100040 R14: 0000000000100044 R15: ffff923c18549908 [ 28.784935] FS: 0000000000000000(0000) GS:ffff923c36c00000(0000) knlGS:0000000000000000 [ 28.784951] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 28.784962] CR2: 00007f193373c893 CR3: 0000000419c78000 CR4: 00000000003406f0 [ 28.784976] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 28.784990] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 28.785004] Stack: [ 28.785009] 000000000596c03b ffff923c1b7e7b68 ffff923c18549938 0000000000000009 [ 28.785026] ffff923c18540000 ffff923c18549280 ffff923c18547ce8 ffff923c1b7e7b90 [ 28.785044] ffffffffc04761f9 ffff923c18540000 ffff923c18547d00 ffff923c18548ff8 [ 28.785062] Call Trace: [ 28.785078] [<ffffffffc04761f9>] i915_gem_restore_fences+0x39/0x50 [i915] [ 28.785102] [<ffffffffc047fe89>] i915_gem_reset+0x179/0x300 [i915] Reported-by: Mika Kuoppala <[email protected]> Fixes: 49ef5294cda2 ("drm/i915: Move fence tracking from object to vma") Signed-off-by: Chris Wilson <[email protected]> Cc: Mika Kuoppala <[email protected]> Cc: Joonas Lahtinen <[email protected]> Reviewed-by: Mika Kuoppala <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-19	drm/i915: Reattach comment, complete type specification	Dave Gordon	1	-2/+3
	In the recent patch bc3d674 drm/i915: Allow userspace to request no-error-capture upon ... the final version moved the flags and the associated #defines around so they were adjacent; unfortunately, they ended up between a comment and the thing (hw_id) to which the comment applies :( So this patch reshuffles the comment and subject back together. Also, as we're touching 'hw_id', let's change it from just 'unsigned' to a fully-specified 'unsigned int', because some code checking tools (including checkpatch) object to plain 'unsigned'. Fixes: bc3d674462e5 ("drm/i915: Allow userspace to request no-error-capture...") Signed-off-by: Dave Gordon <[email protected]> Cc: Chris Wilson <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Chris Wilson <[email protected]> Signed-off-by: Chris Wilson <[email protected]>
2016-08-18	drm/i915/cmdparser: Accelerate copies from WC memory	Chris Wilson	1	-27/+43
	If we need to use clflush to prepare our batch for reads from memory, we can bypass the cache instead by using non-temporal copies. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Matthew Auld <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-18	drm/i915/cmdparser: Use binary search for faster register lookup	Chris Wilson	1	-22/+20
	A significant proportion of the cmdparsing time for some batches is the cost to find the register in the mmiotable. We ensure that those tables are in ascending order such that we could do a binary search if it was ever merited. It is. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Matthew Auld <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-18	drm/i915/cmdparser: Check for SKIP descriptors first	Chris Wilson	1	-0/+3
	If the command descriptor says to skip it, ignore checking for anyother other conflict. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Matthew Auld <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-18	drm/i915/cmdparser: Compare against the previous command descriptor	Chris Wilson	1	-7/+14
	On the blitter (and in test code), we see long sequences of repeated commands, e.g. XY_PIXEL_BLT, XY_SCANLINE_BLT, or XY_SRC_COPY. For these, we can skip the hashtable lookup by remembering the previous command descriptor and doing a straightforward compare of the command header. The corollary is that we need to do one extra comparison before lookup up new commands. v2: Less magic mask (ok, it is still magic, but now you cannot see!) Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Matthew Auld <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-18	drm/i915/cmdparser: Improve hash function	Chris Wilson	1	-20/+31
	The existing code's hashfunction is very suboptimal (most 3D commands use the same bucket degrading the hash to a long list). The code even acknowledge that the issue was known and the fix simple: /* * If we attempt to generate a perfect hash, we should be able to look at bits * 31:29 of a command from a batch buffer and use the full mask for that * client. The existing INSTR_CLIENT_MASK/SHIFT defines can be used for this. */ Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Matthew Auld <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-18	drm/i915/cmdparser: Only cache the dst vmap	Chris Wilson	1	-14/+19
	For simplicity, we want to continue using a contiguous mapping of the command buffer, but we can reduce the number of vmappings we hold by switching over to a page-by-page copy from the user batch buffer to the shadow. The cost for saving one linear mapping is about 5% in trivial workloads - which is more or less the overhead in calling kmap_atomic(). Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Matthew Auld <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-18	drm/i915/cmdparser: Use cached vmappings	Chris Wilson	2	-81/+54
	The single largest factor in the overhead of parsing the commands is the setup of the virtual mapping to provide a continuous block for the batch buffer. If we keep those vmappings around (against the better judgement of mm/vmalloc.c, which we offset by handwaving and looking suggestively at the shrinker) we can dramatically improve the performance of the parser for small batches (such as media workloads). Furthermore, we can use the prepare shmem read/write functions to determine how best we need to clflush the range (rather than every page of the object). The impact of caching both src/dst vmaps is +80% on ivb and +140% on byt for the throughput on small batches. (Caching just the dst vmap and iterating over the src, doing a page by page copy is roughly 5% slower on both platforms. That may be an acceptable trade-off to eliminate one cached vmapping, and we may be able to reduce the per-page copying overhead further.) For this simple test case, the cmdparser is now within a factor of 2 of ideal performance. Signed-off-by: Chris Wilson <[email protected]> Cc: Matthew Auld <[email protected]> Reviewed-by: Matthew Auld <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-18	drm/i915/cmdparser: Add the TIMESTAMP register for the other engines	Chris Wilson	1	-0/+5
	Since I have been using the BCS_TIMESTAMP to measure latency of execution upon the blitter ring, allow regular userspace to also read from that register. They are already allowed RCS_TIMESTAMP! Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Matthew Auld <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-18	drm/i915/cmdparser: Make initialisation failure non-fatal	Chris Wilson	3	-16/+19
	If the developer adds a register in the wrong order, we BUG during boot. That makes development and testing very difficult. Let's be a bit more friendly and disable the command parser with a big warning if the tables are invalid. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Matthew Auld <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-18	drm/i915: Stop discarding GTT cache-domain on unbind vma	Chris Wilson	1	-23/+3
	Since commit 43566dedde54 ("drm/i915: Broaden application of set-domain(GTT)") we allowed objects to be in the GTT domain, but unbound. Therefore removing the GTT cache domain when removing the GGTT vma is no longer semantically correct. An unfortunate side-effect is we lose the wondrously named i915_gem_object_finish_gtt(), not to be confused with i915_gem_gtt_finish_object()! Signed-off-by: Chris Wilson <[email protected]> Cc: Akash Goel <[email protected]> Cc: Joonas Lahtinen <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-18	drm/i915: Bump the inactive tracking for all VMA accessed	Chris Wilson	1	-6/+23
	We track the LRU access for eviction and bump the last access for the user GGTT on set-to-gtt. When we do so we need to not only bump the primary GGTT VMA but all partials as well. Similarly we want to bump the last access tracking for when unpinning an object from the scanout so that they do not get promptly evicted and hopefully remain available for reuse on the next frame. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-18	drm/i915: Track display alignment on VMA	Chris Wilson	2	-13/+9
	When using the aliasing ppgtt and pageflipping with the shrinker/eviction active, we note that we often have to rebind the backbuffer before flipping onto the scanout because it has an invalid alignment. If we store the worst-case alignment required for a VMA, we can avoid having to rebind at critical junctures. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-18	drm/i915: Fallback to using unmappable memory for scanout	Chris Wilson	2	-4/+14
	The existing ABI says that scanouts are pinned into the mappable region so that legacy clients (e.g. old Xorg or plymouthd) can write directly into the scanout through a GTT mapping. However if the surface does not fit into the mappable region, we are better off just trying to fit it anywhere and hoping for the best. (Any userspace that is capable of using ginormous scanouts is also likely not to rely on pure GTT updates.) With the partial vma fault support, we are no longer restricted to only using scanouts that we can pin (though it is still preferred for performance reasons and for powersaving features like FBC). v2: Skip fence pinning when not mappable. v3: Add a comment to explain the possible ramifications of not being able to use fences for unmappable scanouts. v4: Rebase to skip over some local patches v5: Rebase to defer until after we have unmappable GTT fault support Signed-off-by: Chris Wilson <[email protected]> Cc: Deepak S <[email protected]> Cc: Damien Lespiau <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Jani Nikula <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-18	drm/i915: Choose not to evict faultable objects from the GGTT	Chris Wilson	3	-4/+16
	Often times we do not want to evict mapped objects from the GGTT as these are quite expensive to teardown and frequently reused (causing an equally, if not more so, expensive setup). In particular, when faulting in a new object we want to avoid evicting an active object, or else we may trigger a page-fault-of-doom as we ping-pong between evicting two objects. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-18	drm/i915: Drop ORIGIN_GTT for untracked GTT writes	Chris Wilson	2	-4/+10
	If FBC is set on a framebuffer that is unmapped, all GTT faults will be from a partial mapping. Writes by the user through the partial VMA are then untracked by the FBC and so we must use the ORIGIN_CPU when flushing the I915_GEM_DOMAIN_GTT. v2: Keep ORIGIN_CPU for set-to-domain(.write=CPU) Signed-off-by: Chris Wilson <[email protected]> Cc: Joonas Lahtinen <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: "Zanoni, Paulo R" <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-18	drm/i915: Convert partial ggtt vma to full ggtt if it spans the entire object	Chris Wilson	1	-0/+6
	If we want to create a partial vma from a chunk that is the same size as the object, create a normal ggtt vma instead. The benefit is that it will match future requests for the normal ggtt. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-18	drm/i915: Fix partial GGTT faulting	Chris Wilson	1	-31/+30
	We want to always use the partial VMA as a fallback for a failure to bind the object into the GGTT. This extends the support partial objects in the GGTT to cover everything, not just objects too large. v2: Call the partial view, view not partial. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-18	drm/i915: Choose partial chunksize based on tile row size	Chris Wilson	1	-1/+16
	In order to support setting up fences for partial mappings of an object, we have to align those mappings with the fence. The minimum chunksize we choose is at least the size of a single tile row. v2: Make minimum chunk size a define for later use Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-18	drm/i915: Move fence tracking from object to vma	Chris Wilson	12	-388/+330
	In order to handle tiled partial GTT mmappings, we need to associate the fence with an individual vma. v2: A couple of silly drops replaced spotted by Joonas Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-18	drm/i915: Rename fence.lru_list to link	Chris Wilson	4	-8/+7
	Our current practice is to only name the actual list (here dev_priv->fence_list) using "list", and elements upon that list are referred to as "link". Further, the lru nature is of the list and not of the node and including in the name does not disambiguate the link from anything else. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-18	drm/i915/userptr: Make gup errors stickier	Chris Wilson	1	-10/+7
	Keep any error reported by the gup_worker until we are notified that the arena has changed (via the mmu-notifier). This has the importance of making two consecutive calls to i915_gem_object_get_pages() reporting the same error, and curtailing a loop of detecting a fault and requeueing a gup_worker. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Mika Kuoppala <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-18	drm/i915: Allocate rings from stolen	Chris Wilson	1	-4/+2
	If we have stolen available, make use of it for ringbuffer allocation. Previously this was restricted to !llc platforms, as writing to stolen requires a GGTT mapping - but now that we have partial mappable support, the mappable aperture isn't quite so precious so we can use it more freely and ringbuffers are a good user for the otherwise wasted stolen. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-18	drm/i915: Allow ringbuffers to be bound anywhere	Chris Wilson	2	-8/+8
	Now that we have WC vmapping available, we can bind our rings anywhere in the GGTT and do not need to restrict them to the mappable region. Except for stolen objects, for which direct access is verbatim and we must use the mappable aperture. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-18	drm/i915: Move map-and-fenceable tracking to the VMA	Chris Wilson	8	-40/+36
	By moving map-and-fenceable tracking from the object to the VMA, we gain fine-grained tracking and the ability to track individual fences on the VMA (subsequent patch). Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-18	drm/i915: Disallow direct CPU access to stolen pages for relocations	Chris Wilson	1	-0/+3
	As we cannot access the backing pages behind stolen objects, we should not attempt to do so for relocations. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-18	drm/i915: Fallback to single page GTT mmappings for relocations	Chris Wilson	1	-11/+51
	If we cannot pin the entire object into the mappable region of the GTT, try to pin a single page instead. This is much more likely to succeed, and prevents us falling back to the clflush slow path. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-18	drm/i915: Refactor execbuffer relocation writing	Chris Wilson	1	-139/+152
	With the introduction of the reloc page cache, we are just one step away from refactoring the relocation write functions into one. Not only does it tidy the code (slightly), but it greatly simplifies the control logic much to gcc's satisfaction. v2: Add selftests to document the relationship between the clflush flags, the KMAP bit and packing into the page-aligned pointer. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-18	drm/i915: Tidy up flush cpu/gtt write domains	Chris Wilson	1	-11/+4
	Since we know the write domain, we can drop the local variable and make the code look a tiny bit simpler. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-18	drm/i915: Pin the pages first in shmem prepare read/write	Chris Wilson	1	-20/+28
	There is an improbable, but not impossible, case that if we leave the pages unpin as we operate on the object, then somebody via the shrinker may steal the lock (which lock? right now, it is struct_mutex, THE lock) and change the cache domains after we have already inspected them. (Whilst here, avail ourselves of the opportunity to take a couple of steps to make the two functions look more similar.) Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-18	drm/i915: Wait for writes through the GTT to land before reading back	Chris Wilson	1	-1/+11
	If we quickly switch from writing through the GTT to a read of the physical page directly with the CPU (e.g. performing relocations through the GTT and then running the command parser), we can observe that the writes are not visible to the CPU. It is not a coherency problem, as extensive investigations with clflush have demonstrated, but a mere timing issue - we have to wait for the GTT to complete it's write before we start our read from the CPU. The issue can be illustrated in userspace with: gtt = gem_mmap__gtt(fd, handle, 0, OBJECT_SIZE, PROT_READ \| PROT_WRITE); cpu = gem_mmap__cpu(fd, handle, 0, OBJECT_SIZE, PROT_READ \| PROT_WRITE); gem_set_domain(fd, handle, I915_GEM_DOMAIN_GTT, I915_GEM_DOMAIN_GTT); for (i = 0; i < OBJECT_SIZE / 64; i++) { int x = 16*i + (i%16); gtt[x] = i; clflush(&cpu[x], sizeof(cpu[x])); assert(cpu[x] == i); } Experimenting with that shows that this behaviour is indeed limited to recent Atom-class hardware. Testcase: igt/gem_exec_flush/basic-batch-default-cmd #byt Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-18	drm/i915: Before accessing an object via the cpu, flush GTT writes	Chris Wilson	1	-0/+4
	If we want to read the pages directly via the CPU, we have to be sure that we have to flush the writes via the GTT (as the CPU can not see the address aliasing). Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-18	drm/i915: Extract i915_gem_obj_prepare_shmem_write()	Chris Wilson	3	-65/+102
	This is a companion to i915_gem_obj_prepare_shmem_read() that prepares the backing storage for direct writes. It first serialises with the GPU, pins the backing storage and then indicates what clfushes are required in order for the writes to be coherent. Whilst here, fix support for ancient CPUs without clflush for which we cannot do the GTT+clflush tricks. v2: Add i915_gem_obj_finish_shmem_access() for symmetry Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-18	drm/i915: Cache kmap between relocations	Chris Wilson	1	-54/+111
	When doing relocations, we have to obtain a mapping to the page containing the target address. This is either a kmap or iomap depending on GPU and its cache coherency. Neighbouring relocation entries are typically within the same page and so we can cache our kmapping between them and avoid those pesky TLB flushes. Note that there is some sleight-of-hand in how the slow relocate works as the reloc_entry_cache implies pagefaults disabled (as we are inside a kmap_atomic section). However, the slow relocate code is meant to be the fallback from the atomic fast path failing. Fortunately it works as we already have performed the copy_from_user for the relocation array (no more pagefaults there) and the kmap_atomic cache is enabled after we have waited upon an active buffer (so no more sleeping in atomic). Magic! Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-18	drm/i915: Fallback to single page pwrite/pread if unable to release fence	Chris Wilson	2	-13/+19
	If we cannot release the fence (for example if someone is inexplicably trying to write into a tiled framebuffer that is currently pinned to the display! cough kms_frontbuffer_tracking cough) fallback to using the page-by-page pwrite/pread interface, rather than fail the syscall entirely. Since this is triggerable by the user (along pwrite) we have to remove the WARN_ON(fence->pin_count). Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-18	drm/i915: Mark up the GTT flush following WC writes as ORIGIN_CPU	Chris Wilson	1	-2/+2
	Similarly to invalidating beforehand, if the object is mmapped via I915_MMAP_WC we cannot track writes through the I915_GEM_DOMAIN_GTT. At the conclusion of the write, i915_gem_object_flush_gtt_writes() we also need to treat the origin carefully in case it may have been untracked. See also commit aeecc9696aa0 ("drm/i915: use ORIGIN_CPU for frontbuffer invalidation on WC mmaps"). Signed-off-by: Chris Wilson <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Paulo Zanoni <[email protected]> Reviewed-by: Paulo Zanoni <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-18	drm/i915: Use ORIGIN_CPU for fb invalidation from pwrite	Chris Wilson	1	-2/+2
	As pwrite does not use the fence for its GTT access, and may even go through a secondary interface avoiding the main VMA, we cannot treat the write as automatically invalidated by the hardware and so we require ORIGIN_CPU frontbufer invalidate/flushes. Signed-off-by: Chris Wilson <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Daniel Vetter <[email protected]>
2016-08-18	drm/i915: vfree() no longer ignores the low bits of the address	Chris Wilson	2	-4/+12
	Since vfree() now likes to WARN when passed a non-page-aligned pointer, we need to discard the low bits to comply with it. Fixes: d31d7cb1460c ("drm/i915: Support for creating write combined type vmaps") Signed-off-by: Chris Wilson <[email protected]> Cc: Joonas Lahtinen <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Reviewed-by: Tvrtko Ursulin <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-18	agp/intel: Flush chipset writes after updating a single PTE	Chris Wilson	1	-0/+2
	After we update one PTE for a page, the caller expects to be able to immediately use that through a GGTT read/write. To comply with the callers expectations we therefore need to flush the chipset buffers before returning. Reported-by: Matti Hämäläinen <[email protected]> Fixes: d6473f566417 ("drm/i915: Add support for mapping an object page...") Signed-off-by: Chris Wilson <[email protected]> Cc: Ankitprasad Sharma <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Tested-by: Matti Hämäläinen <[email protected]> Cc: [email protected] Reviewed-by: Mika Kuoppala <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-18	drm/i915: Unconditionally flush any chipset buffers before execbuf	Chris Wilson	2	-10/+4
	If userspace is asynchronously streaming into the batch or other execobjects, we may not flush those writes along with a change in cache domain (as there is no change). Therefore those writes may end up in internal chipset buffers and not visible to the GPU upon execution. We must issue a flush command or otherwise we encounter incoherency in the batchbuffers and the GPU executing invalid commands (i.e. hanging) quite regularly. v2: Throw a paranoid wmb() into the general flush so that we remain consistent with before. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90841 Fixes: 1816f9236303 ("drm/i915: Support creation of unbound wc user...") Signed-off-by: Chris Wilson <[email protected]> Cc: Akash Goel <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Tested-by: Matti Hämäläinen <[email protected]> Cc: [email protected] Reviewed-by: Mika Kuoppala <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-18	drm/i915/gen9: Drop invalid WARN() during data rate calculation	Matt Roper	1	-2/+0
	It's possible to have a non-zero plane mask and still wind up with a total data rate of zero. There are two cases where this can happen: * planes are active (from the KMS point of view), but are all fully clipped (positioned offscreen) * the only active plane on a CRTC is the cursor (which is handled independently and not counted into the general data rate computations These are both valid display setups (although unusual), so we need to drop the WARN(). Signed-off-by: Matt Roper <[email protected]> Reviewed-by: Maarten Lankhorst <[email protected]> Testcase: kms_universal_planes.cursor-only-pipe-* Signed-off-by: Maarten Lankhorst <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected] Cc: [email protected] #v4.7+
2016-08-18	drm/i915/gen9: Initialize intel_state->active_crtcs during WM sanitization (v2)	Matt Roper	1	-1/+16
	intel_state->active_crtcs is usually only initialized when doing a modeset. During our first atomic commit after boot, we're effectively faking a modeset to sanitize the DDB/wm setup, so ensure that this field gets initialized before use. v2: - Don't clobber active_crtcs if our first commit really is a modeset (Maarten) - Grab connection_mutex when faking a modeset during sanitization (Maarten) Reported-by: Tvrtko Ursulin <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Cc: Maarten Lankhorst <[email protected]> Signed-off-by: Matt Roper <[email protected]> Tested-by: Tvrtko Ursulin <[email protected]> Signed-off-by: Maarten Lankhorst <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected] Cc: [email protected] #v4.7+
2016-08-17	drm/i915: Add missing kerneldoc for guc_client_alloc:engines	Chris Wilson	1	-0/+1
	Brief parameter text to describe @engines. Signed-off-by: Chris Wilson <[email protected]> Cc: Dave Gordon <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Tvrtko Ursulin <[email protected]>
2016-08-17	drm/i915: Remember to set vma->pages for the preallocated stolen object	Chris Wilson	1	-0/+1
	In commit 247177ddd517 ("drm/i915: Always set the vma->pages"), as it title implies, we always set vma->pages for bound objects. Even before that, we would set vma->ggtt_view.pages, for globally bound objects. This was forgotten for the fixup inside the preallocated stolen objects, which has to recreate a global GTT binding outside of the usual VMA insertion path Fixes: 247177ddd517 ("drm/i915: Always set the vma->pages") Signed-off-by: Chris Wilson <[email protected]> Cc: Joonas Lahtinen <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Joonas Lahtinen <[email protected]>
2016-08-17	drm/i915: Mark i915_hpd_poll_init_work as static	Chris Wilson	1	-1/+2
	Local function with forgotten static declaration. Fixes: 19625e85c6ec ("drm/i915: Enable polling when we don't have hpd") Signed-off-by: Chris Wilson <[email protected]> Cc: Ville Syrjälä <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Lyude <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Tvrtko Ursulin <[email protected]>
2016-08-17	drm/i915: Mark the static key for movntqda as static	Chris Wilson	1	-1/+1
	Silence sparse who warns that the global variable is not declared static. Fixes: 0b1de5d58e19 ("drm/i915: Use SSE4.1 movntdqa to ...") Signed-off-by: Chris Wilson <[email protected]> Cc: Akash Goel <[email protected]> Cc: Damien Lespiau <[email protected]> Cc: Mika Kuoppala <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Tvrtko Ursulin <[email protected]>
2016-08-17	drm/i915: Initialize legacy semaphores from engine hw id indexed array	Tvrtko Ursulin	2	-28/+34
	Build the legacy semaphore initialisation array using the engine hardware ids instead of driver internal ones. This makes the static array size dependent only on the number of gen6 semaphore engines. Also makes the per-engine semaphore wait and signal tables hardware id indexed saving some more space. v2: Refactor I915_GEN6_NUM_ENGINES to GEN6_SEMAPHORE_LAST. (Chris Wilson) v3: More polish. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <[email protected]> Reviewed-by: Chris Wilson <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2016-08-17	drm/i915: Add enum for hardware engine identifiers	Tvrtko Ursulin	2	-9/+15
	Put the engine hardware id in the common header so they are not only associated with the GuC since they are needed for the legacy semaphores implementation. Signed-off-by: Tvrtko Ursulin <[email protected]> Reviewed-by: Chris Wilson <[email protected]>
2016-08-16	drm/i915: Silence GCC warning for cmn_a_well	Chris Wilson	1	-2/+2
	Just make the logic simple enough for even GCC to understand (and foolproof against random changes): drivers/gpu/drm/i915/intel_runtime_pm.c: warning: 'cmn_a_well' may be used uninitialized in this function [-Wuninitialized]: => 871:23 Reported-by: Geert Uytterhoeven <[email protected]> Signed-off-by: Chris Wilson <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Imre Deak <[email protected]>