Age | Commit message (Collapse) | Author | Files | Lines |
|
Builds on the context tracking that was added earlier.
- marks engine context PTEs as 'priv' where possible
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
- adds support for specifying SUBDEVICE_ID for channel
- rounds non-power-of-two GPFIFO sizes down, rather than up
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
And use it to cleanup multiple implementations of almost the same thing.
- prepares for non-polled / client-provided USERD
- only zeroes relevant "registers", rather than entire USERD
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
Currently provided by {chan,dma,gpfifo}*.c, and those are going away.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
- less dependence on waiting for runlist updates, on GPUs that allow it
- supports runqueue selector in RAMRL entries
- completes switch to common runl/cgrp/chan topology info
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
That sure was fun to untangle.
- handled per-runlist, rather than globally
- more straight-forward process in general
- various potential SW/HW races have been fixed
- fixes lockdep issues that were present in >=gk104's prior implementation
- volta recovery now actually stands a chance of working
- volta/turing waiting for PBDMA idle before engine reset
- turing using hw-provided TSG info for CTXSW_TIMEOUT
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
A bunch of these can be handled in such a way that the channel can
continue, however, any of these are a pretty decent sign something
has gone horribly wrong, and the safest option is to disable the
channel.
This is a bit of a hack, we will want to handle these individually
and dump relevant debug info for each at some point.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
- nvkm_chan_error() built on top, stops channel and sends 'killed' event
- removes an odd double-bashing of channel enable regs on kepler and up
- pokes doorbell on turing and up, after enabling channel
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
- stops programming (non-existent) runl id field on bind(), from maxwell
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
- adds g8x/turing registers, which were missing before
- switches fermi to polled wait, like later hw (see: 4f2fc25c0f8bc...)
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
Channel groups have somewhat more complicated requirements than what we
currently support. An engine context is shared between all channels in
a channel group, VEID/subctx support (later) brings per-VEID components,
and we need to track an individual channel's engine context pointers.
This commit adds the structures and refcounting to support the above,
wrapping the prior implementation for the moment.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
- supports per-runlist CHIDs
- channel group lock held across reference, rather than global lock
v2:
- remove unnecessary parenthesis
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
After updating GF100 implementation from the GK104/TU102 ones, and using
the new runlist/engine topology info, all three handlers become (almost)
identical.
- there's a temporary kludge to call through to the HW-specific recovery
- engine fault mapping info determined at load time, not on every fault
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
- merges gf100/gk104- NV_PFIFO_INTR_0_PBDMA and NV_PPBDMA_INTR_0 code
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
- bumps pbdma timeout to value RM uses on newer HW
- bumps fb timeout to max from boot default
- one/both of these greatly improves stability on // piglit runs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
NVGPU and RM both program this value.
Fixes a bunch of random hangs running parallel piglit.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
- removes a layer of indirection in the intr handling
- prevents non-stall ctrl racing with unknown intrs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
More control, and shallower call-chain to get to the point.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
Engine context tracking will move to nvkm_cgrp in later commits, so we
create SW-only channel groups on HW without support for them.
- switches to nvkm_chid for TSG/channel ID allocation
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
DRM uses this to setup fence-related items.
- nouveau_chan.runlist will always be "0" for the moment, not an issue
as GPUs prior to ampere have system-wide channel IDs,
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
Previously only available from Kepler onwards.
- also fixes the info() queries causing fifo init()/fini() unnecessarily
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
Creates an nvkm_runl for each runlist on the GPU, and an nvkm_engn for
each engine that is reachable from a runlist.
- basically what gk104- already does, but extended to all chips
- adds per-runlist CHID allocators (Ampere)
- splits g98/gt2xx out from g84 (different target engines)
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
Creates an nvkm_runq for each PBDMA, these will be associated with the
relevant runlist(s) later.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
We need to be able to allocate TSG IDs as well as channel IDs, also,
Ampere has per-runlist channel IDs.
- holds per-ID private data, which will be used for/to protect lookup
- holds an nvkm_event which will be used for events tied to IDs
- not used yet beyond setup, and switching use of "fifo->nr - 1" for
channel ID mask to "chid->mask"
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
This makes it easier to transition everything.
- a couple of function renames for collisions
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
- will make subsequent patches more obvious
- no code changes
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
Adds the basic skeleton for common channel (group) interfaces.
- common behaviour between <gk104 and >=gk104 impl's
- separates priv/user channel objects
- passthrough to existing object for now, kludges removed later
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
- reads channel count from GPU from gm200 onwards
- removes gm20b/gp10b (they become identical to gm200/gp100)
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
Will be used to init client-allocated USERD to default values.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
Displays both owner/user of the falcon (when they differ), and takes
both subdevs' debug levels into account when deciding whether to log
the message.
- runlist debugging will use one of the alternate macros added here
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
This wasn't really needed before; the main place this could race is with
channel recovery, but (through potentially fragile means) shouldn't have
been possible.
However, a number of upcoming patches benefit from having better control
over subdev init, necessitating some improvements here.
- allows subdev/engine oneinit() without init() (host/fifo patches)
- merges engine use locking/tracking into subdev, and extends it to fix
some issues that will arise with future usage patterns (acr patches)
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
- NV_PMC_ENABLE still exists, but we don't touch anything in it yet
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
Ampere needs different handling here, most of what we touch has moved.
We probably want to refactor these interfaces in general, but I'm not
yet sure how they should look, this will get the job done for now.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
- new-style handlers can now be used here too
- decent clean-up
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
TU102 implementation should be OK for Ampere now.
v2. fixup for ga103 early merge
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
- reads vectors from HW, rather than being hardcoded
- removes hacks to support routing via old interfaces
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
- switches ampere over now, and removes its hack mc implementation
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
- uses proper class IDs for Turing/Ampere
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
Initially for NV_USERMODE class, and Turing/Ampere's new interrupt tree.
v2. fixup for ga103 early merge
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
It's quite a lot of tedious and error-prone work to switch over all the
subdevs at once, so allow an nvkm_intr to request new-style handlers to
be created that wrap the existing interfaces.
This will allow a more gradual transition.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
Turing adds a second top-level interrupt tree in HW, in addition to the
trees available via NV_PMC. Most of the interrupts we care about are
exposed in both trees, but not all of them, and we have some rather
nasty hacks to route the fault buffer interrupts.
Ampere removes the NV_PMC trees entirely.
Here we add some infrastructure to be able to handle all of this more
cleanly, as well as providing more explicit control over handlers.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
Unifies the handling between PCI-based and Tegra GPUs, and makes more
explicit/obvious where device interrupts can be expected.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
We're going to want this information available earlier than it is now.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
The vblank and nonstall events have some annoying interactions with DRM
locking, and aren't able to do certain things as a result.
However, other uses of event notifications don't have such requirements,
and upcoming patches take advantage of this for various improvements.
Having separate classes for each nvkm_event's spinlocks allows lockdep
to distinguish between them and avoid false-positives.
v2: __always_inline + comment
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
|
|
This removes support for accelerated fbcon rendering, and fixes a number
of races/crashes/issues around suspend/resume/module unload etc.
Losing HW accelerated rendering isn't ideal, but it's been significantly
reduced in performance since the removal of accelerated scrolling in the
kernel anyway - not to mention, can be racey (skips cpu<->gpu sync) from
certain contexts.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|