author     Daniel Vetter <[email protected]>   2023-04-06 14:37:14 +0200
committer  Daniel Vetter <[email protected]>   2023-04-06 14:37:15 +0200
commit     52b113e968be66b57f792b2e2a9b8b77f382bd5f (patch)
tree       b0d29f82fe76a7078422fcd3326d2591718af3b9
parent     f86286569e92a260fbf8a1975f9421b4a66581d8 (diff)
parent     e44f18c6ff8beef7b2b10592287f0a9766376d9b (diff)
Merge tag 'drm-misc-next-2023-04-06' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for v6.4-rc1:
UAPI Changes:
Cross-subsystem Changes:
- Document port and rotation dt bindings better.
- For panel timing DT bindings, document that vsync and hsync are
first, rather than last, in the image.
- Fix video/aperture typos.
Core Changes:
- Reject prime DMA-Buf attachment if get_sg_table is missing.
(For self-importing dma-buf only.)
- Add prime import/export to vram-helper.
- Fix oops in drm/vblank when init is not called.
- Fixup xres/yres_virtual and other fixes in fb helper.
- Improve SCDC debugs.
- Skip setting deadline on modesets.
- Assorted TTM fixes.
Driver Changes:
- Add lima usage stats.
- Assorted fixes to bridge/lt8912b, tc358767, ivpu,
bridge/ti-sn65dsi83, ps8640.
- Use pci aperture helpers in drm/ast, lynxfb, radeonfb.
- Revert some lima patches, as they required a commit that has been
reverted upstream.
- Add AUO NE135FBM-N41 v8.1 eDP panel.
- Add QAIC accel driver.
Signed-off-by: Daniel Vetter <[email protected]>
From: Maarten Lankhorst <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
51 files changed, 7004 insertions, 231 deletions
diff --git a/Documentation/accel/index.rst b/Documentation/accel/index.rst index 2b43c9a7f67b..e94a0160b6a0 100644 --- a/Documentation/accel/index.rst +++ b/Documentation/accel/index.rst @@ -8,6 +8,7 @@ Compute Accelerators :maxdepth: 1 introduction + qaic/index .. only:: subproject and html diff --git a/Documentation/accel/qaic/aic100.rst b/Documentation/accel/qaic/aic100.rst new file mode 100644 index 000000000000..c80d0f1307db --- /dev/null +++ b/Documentation/accel/qaic/aic100.rst @@ -0,0 +1,510 @@ +.. SPDX-License-Identifier: GPL-2.0-only + +=============================== + Qualcomm Cloud AI 100 (AIC100) +=============================== + +Overview +======== + +The Qualcomm Cloud AI 100/AIC100 family of products (including SA9000P - part of +Snapdragon Ride) are PCIe adapter cards which contain a dedicated SoC ASIC for +the purpose of efficiently running Artificial Intelligence (AI) Deep Learning +inference workloads. They are AI accelerators. + +The PCIe interface of AIC100 is capable of PCIe Gen4 speeds over eight lanes +(x8). An individual SoC on a card can have up to 16 NSPs for running workloads. +Each SoC has an A53 management CPU. On card, there can be up to 32 GB of DDR. + +Multiple AIC100 cards can be hosted in a single system to scale overall +performance. AIC100 cards are multi-user capable and able to execute workloads +from multiple users in a concurrent manner. + +Hardware Description +==================== + +An AIC100 card consists of an AIC100 SoC, on-card DDR, and a set of misc +peripherals (PMICs, etc). + +An AIC100 card can either be a PCIe HHHL form factor (a traditional PCIe card), +or a Dual M.2 card. Both use PCIe to connect to the host system. + +As a PCIe endpoint/adapter, AIC100 uses the standard VendorID(VID)/ +DeviceID(DID) combination to uniquely identify itself to the host. AIC100 +uses the standard Qualcomm VID (0x17cb). All AIC100 SKUs use the same +AIC100 DID (0xa100). + +AIC100 does not implement FLR (function level reset). + +AIC100 implements MSI but does not implement MSI-X. AIC100 requires 17 MSIs to +operate (1 for MHI, 16 for the DMA Bridge). + +As a PCIe device, AIC100 utilizes BARs to provide host interfaces to the device +hardware. AIC100 provides 3, 64-bit BARs. + +* The first BAR is 4K in size, and exposes the MHI interface to the host. + +* The second BAR is 2M in size, and exposes the DMA Bridge interface to the + host. + +* The third BAR is variable in size based on an individual AIC100's + configuration, but defaults to 64K. This BAR currently has no purpose. + +From the host perspective, AIC100 has several key hardware components - + +* MHI (Modem Host Interface) +* QSM (QAIC Service Manager) +* NSPs (Neural Signal Processor) +* DMA Bridge +* DDR + +MHI +--- + +AIC100 has one MHI interface over PCIe. MHI itself is documented at +Documentation/mhi/index.rst MHI is the mechanism the host uses to communicate +with the QSM. Except for workload data via the DMA Bridge, all interaction with +the device occurs via MHI. + +QSM +--- + +QAIC Service Manager. This is an ARM A53 CPU that runs the primary +firmware of the card and performs on-card management tasks. It also +communicates with the host via MHI. Each AIC100 has one of +these. + +NSP +--- + +Neural Signal Processor. Each AIC100 has up to 16 of these. These are +the processors that run the workloads on AIC100. Each NSP is a Qualcomm Hexagon +(Q6) DSP with HVX and HMX. Each NSP can only run one workload at a time, but +multiple NSPs may be assigned to a single workload. 
Since each NSP can only run +one workload, AIC100 is limited to 16 concurrent workloads. Workload +"scheduling" is under the purview of the host. AIC100 does not automatically +timeslice. + +DMA Bridge +---------- + +The DMA Bridge is custom DMA engine that manages the flow of data +in and out of workloads. AIC100 has one of these. The DMA Bridge has 16 +channels, each consisting of a set of request/response FIFOs. Each active +workload is assigned a single DMA Bridge channel. The DMA Bridge exposes +hardware registers to manage the FIFOs (head/tail pointers), but requires host +memory to store the FIFOs. + +DDR +--- + +AIC100 has on-card DDR. In total, an AIC100 can have up to 32 GB of DDR. +This DDR is used to store workloads, data for the workloads, and is used by the +QSM for managing the device. NSPs are granted access to sections of the DDR by +the QSM. The host does not have direct access to the DDR, and must make +requests to the QSM to transfer data to the DDR. + +High-level Use Flow +=================== + +AIC100 is a multi-user, programmable accelerator typically used for running +neural networks in inferencing mode to efficiently perform AI operations. +AIC100 is not intended for training neural networks. AIC100 can be utilized +for generic compute workloads. + +Assuming a user wants to utilize AIC100, they would follow these steps: + +1. Compile the workload into an ELF targeting the NSP(s) +2. Make requests to the QSM to load the workload and related artifacts into the + device DDR +3. Make a request to the QSM to activate the workload onto a set of idle NSPs +4. Make requests to the DMA Bridge to send input data to the workload to be + processed, and other requests to receive processed output data from the + workload. +5. Once the workload is no longer required, make a request to the QSM to + deactivate the workload, thus putting the NSPs back into an idle state. +6. Once the workload and related artifacts are no longer needed for future + sessions, make requests to the QSM to unload the data from DDR. This frees + the DDR to be used by other users. + + +Boot Flow +========= + +AIC100 uses a flashless boot flow, derived from Qualcomm MSMs. + +When AIC100 is first powered on, it begins executing PBL (Primary Bootloader) +from ROM. PBL enumerates the PCIe link, and initializes the BHI (Boot Host +Interface) component of MHI. + +Using BHI, the host points PBL to the location of the SBL (Secondary Bootloader) +image. The PBL pulls the image from the host, validates it, and begins +execution of SBL. + +SBL initializes MHI, and uses MHI to notify the host that the device has entered +the SBL stage. SBL performs a number of operations: + +* SBL initializes the majority of hardware (anything PBL left uninitialized), + including DDR. +* SBL offloads the bootlog to the host. +* SBL synchronizes timestamps with the host for future logging. +* SBL uses the Sahara protocol to obtain the runtime firmware images from the + host. + +Once SBL has obtained and validated the runtime firmware, it brings the NSPs out +of reset, and jumps into the QSM. + +The QSM uses MHI to notify the host that the device has entered the QSM stage +(AMSS in MHI terms). At this point, the AIC100 device is fully functional, and +ready to process workloads. 
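
Each boot stage corresponds to an MHI execution environment (EE) that the host
can query. As a minimal sketch (not taken from the driver, but reusing the
mhi_get_exec_env() helper and the EE constants that the qaic MHI controller
code further down in this patch also uses), a host-side wait for the device to
reach the fully functional QSM/AMSS stage might look like:

.. code-block:: c

    #include <linux/delay.h>
    #include <linux/errno.h>
    #include <linux/mhi.h>

    /*
     * Illustrative only: poll the MHI execution environment until the device
     * reports AMSS (QSM running and ready for workloads), or give up.
     */
    static int wait_for_qsm(struct mhi_controller *mhi_cntrl)
    {
            int retries = 25;

            while (retries--) {
                    switch (mhi_get_exec_env(mhi_cntrl)) {
                    case MHI_EE_PBL:        /* ROM bootloader: PCIe link + BHI only */
                    case MHI_EE_SBL:        /* SBL: DDR init, bootlog, Sahara fw download */
                            msleep(1000);
                            break;
                    case MHI_EE_AMSS:       /* QSM running, device fully functional */
                            return 0;
                    default:                /* unexpected state */
                            return -EIO;
                    }
            }
            return -ETIMEDOUT;
    }
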
+ +Userspace components +==================== + +Compiler +-------- + +An open compiler for AIC100 based on upstream LLVM can be found at: +https://github.com/quic/software-kit-for-qualcomm-cloud-ai-100-cc + +Usermode Driver (UMD) +--------------------- + +An open UMD that interfaces with the qaic kernel driver can be found at: +https://github.com/quic/software-kit-for-qualcomm-cloud-ai-100 + +Sahara loader +------------- + +An open implementation of the Sahara protocol called kickstart can be found at: +https://github.com/andersson/qdl + +MHI Channels +============ + +AIC100 defines a number of MHI channels for different purposes. This is a list +of the defined channels, and their uses. + ++----------------+---------+----------+----------------------------------------+ +| Channel name | IDs | EEs | Purpose | ++================+=========+==========+========================================+ +| QAIC_LOOPBACK | 0 & 1 | AMSS | Any data sent to the device on this | +| | | | channel is sent back to the host. | ++----------------+---------+----------+----------------------------------------+ +| QAIC_SAHARA | 2 & 3 | SBL | Used by SBL to obtain the runtime | +| | | | firmware from the host. | ++----------------+---------+----------+----------------------------------------+ +| QAIC_DIAG | 4 & 5 | AMSS | Used to communicate with QSM via the | +| | | | DIAG protocol. | ++----------------+---------+----------+----------------------------------------+ +| QAIC_SSR | 6 & 7 | AMSS | Used to notify the host of subsystem | +| | | | restart events, and to offload SSR | +| | | | crashdumps. | ++----------------+---------+----------+----------------------------------------+ +| QAIC_QDSS | 8 & 9 | AMSS | Used for the Qualcomm Debug Subsystem. | ++----------------+---------+----------+----------------------------------------+ +| QAIC_CONTROL | 10 & 11 | AMSS | Used for the Neural Network Control | +| | | | (NNC) protocol. This is the primary | +| | | | channel between host and QSM for | +| | | | managing workloads. | ++----------------+---------+----------+----------------------------------------+ +| QAIC_LOGGING | 12 & 13 | SBL | Used by the SBL to send the bootlog to | +| | | | the host. | ++----------------+---------+----------+----------------------------------------+ +| QAIC_STATUS | 14 & 15 | AMSS | Used to notify the host of Reliability,| +| | | | Accessibility, Serviceability (RAS) | +| | | | events. | ++----------------+---------+----------+----------------------------------------+ +| QAIC_TELEMETRY | 16 & 17 | AMSS | Used to get/set power/thermal/etc | +| | | | attributes. | ++----------------+---------+----------+----------------------------------------+ +| QAIC_DEBUG | 18 & 19 | AMSS | Not used. | ++----------------+---------+----------+----------------------------------------+ +| QAIC_TIMESYNC | 20 & 21 | SBL/AMSS | Used to synchronize timestamps in the | +| | | | device side logs with the host time | +| | | | source. | ++----------------+---------+----------+----------------------------------------+ + +DMA Bridge +========== + +Overview +-------- + +The DMA Bridge is one of the main interfaces to the host from the device +(the other being MHI). As part of activating a workload to run on NSPs, the QSM +assigns that network a DMA Bridge channel. A workload's DMA Bridge channel +(DBC for short) is solely for the use of that workload and is not shared with +other workloads. + +Each DBC is a pair of FIFOs that manage data in and out of the workload. One +FIFO is the request FIFO. 
The other FIFO is the response FIFO. + +Each DBC contains 4 registers in hardware: + +* Request FIFO head pointer (offset 0x0). Read only by the host. Indicates the + latest item in the FIFO the device has consumed. +* Request FIFO tail pointer (offset 0x4). Read/write by the host. Host + increments this register to add new items to the FIFO. +* Response FIFO head pointer (offset 0x8). Read/write by the host. Indicates + the latest item in the FIFO the host has consumed. +* Response FIFO tail pointer (offset 0xc). Read only by the host. Device + increments this register to add new items to the FIFO. + +The values in each register are indexes in the FIFO. To get the location of the +FIFO element pointed to by the register: FIFO base address + register * element +size. + +DBC registers are exposed to the host via the second BAR. Each DBC consumes +4KB of space in the BAR. + +The actual FIFOs are backed by host memory. When sending a request to the QSM +to activate a network, the host must donate memory to be used for the FIFOs. +Due to internal mapping limitations of the device, a single contiguous chunk of +memory must be provided per DBC, which hosts both FIFOs. The request FIFO will +consume the beginning of the memory chunk, and the response FIFO will consume +the end of the memory chunk. + +Request FIFO +------------ + +A request FIFO element has the following structure: + +.. code-block:: c + + struct request_elem { + u16 req_id; + u8 seq_id; + u8 pcie_dma_cmd; + u32 reserved; + u64 pcie_dma_source_addr; + u64 pcie_dma_dest_addr; + u32 pcie_dma_len; + u32 reserved; + u64 doorbell_addr; + u8 doorbell_attr; + u8 reserved; + u16 reserved; + u32 doorbell_data; + u32 sem_cmd0; + u32 sem_cmd1; + u32 sem_cmd2; + u32 sem_cmd3; + }; + +Request field descriptions: + +req_id + request ID. A request FIFO element and a response FIFO element with + the same request ID refer to the same command. + +seq_id + sequence ID within a request. Ignored by the DMA Bridge. + +pcie_dma_cmd + describes the DMA element of this request. + + * Bit(7) is the force msi flag, which overrides the DMA Bridge MSI logic + and generates a MSI when this request is complete, and QSM + configures the DMA Bridge to look at this bit. + * Bits(6:5) are reserved. + * Bit(4) is the completion code flag, and indicates that the DMA Bridge + shall generate a response FIFO element when this request is + complete. + * Bit(3) indicates if this request is a linked list transfer(0) or a bulk + transfer(1). + * Bit(2) is reserved. + * Bits(1:0) indicate the type of transfer. No transfer(0), to device(1), + from device(2). Value 3 is illegal. + +pcie_dma_source_addr + source address for a bulk transfer, or the address of the linked list. + +pcie_dma_dest_addr + destination address for a bulk transfer. + +pcie_dma_len + length of the bulk transfer. Note that the size of this field + limits transfers to 4G in size. + +doorbell_addr + address of the doorbell to ring when this request is complete. + +doorbell_attr + doorbell attributes. + + * Bit(7) indicates if a write to a doorbell is to occur. + * Bits(6:2) are reserved. + * Bits(1:0) contain the encoding of the doorbell length. 0 is 32-bit, + 1 is 16-bit, 2 is 8-bit, 3 is reserved. The doorbell address + must be naturally aligned to the specified length. + +doorbell_data + data to write to the doorbell. Only the bits corresponding to + the doorbell length are valid. + +sem_cmdN + semaphore command. + + * Bit(31) indicates this semaphore command is enabled. 
+ * Bit(30) is the to-device DMA fence. Block this request until all + to-device DMA transfers are complete. + * Bit(29) is the from-device DMA fence. Block this request until all + from-device DMA transfers are complete. + * Bits(28:27) are reserved. + * Bits(26:24) are the semaphore command. 0 is NOP. 1 is init with the + specified value. 2 is increment. 3 is decrement. 4 is wait + until the semaphore is equal to the specified value. 5 is wait + until the semaphore is greater or equal to the specified value. + 6 is "P", wait until semaphore is greater than 0, then + decrement by 1. 7 is reserved. + * Bit(23) is reserved. + * Bit(22) is the semaphore sync. 0 is post sync, which means that the + semaphore operation is done after the DMA transfer. 1 is + presync, which gates the DMA transfer. Only one presync is + allowed per request. + * Bit(21) is reserved. + * Bits(20:16) is the index of the semaphore to operate on. + * Bits(15:12) are reserved. + * Bits(11:0) are the semaphore value to use in operations. + +Overall, a request is processed in 4 steps: + +1. If specified, the presync semaphore condition must be true +2. If enabled, the DMA transfer occurs +3. If specified, the postsync semaphore conditions must be true +4. If enabled, the doorbell is written + +By using the semaphores in conjunction with the workload running on the NSPs, +the data pipeline can be synchronized such that the host can queue multiple +requests of data for the workload to process, but the DMA Bridge will only copy +the data into the memory of the workload when the workload is ready to process +the next input. + +Response FIFO +------------- + +Once a request is fully processed, a response FIFO element is generated if +specified in pcie_dma_cmd. The structure of a response FIFO element: + +.. code-block:: c + + struct response_elem { + u16 req_id; + u16 completion_code; + }; + +req_id + matches the req_id of the request that generated this element. + +completion_code + status of this request. 0 is success. Non-zero is an error. + +The DMA Bridge will generate a MSI to the host as a reaction to activity in the +response FIFO of a DBC. The DMA Bridge hardware has an IRQ storm mitigation +algorithm, where it will only generate a MSI when the response FIFO transitions +from empty to non-empty (unless force MSI is enabled and triggered). In +response to this MSI, the host is expected to drain the response FIFO, and must +take care to handle any race conditions between draining the FIFO, and the +device inserting elements into the FIFO. + +Neural Network Control (NNC) Protocol +===================================== + +The NNC protocol is how the host makes requests to the QSM to manage workloads. +It uses the QAIC_CONTROL MHI channel. + +Each NNC request is packaged into a message. Each message is a series of +transactions. A passthrough type transaction can contain elements known as +commands. + +QSM requires NNC messages be little endian encoded and the fields be naturally +aligned. Since there are 64-bit elements in some NNC messages, 64-bit alignment +must be maintained. + +A message contains a header and then a series of transactions. A message may be +at most 4K in size from QSM to the host. From the host to the QSM, a message +can be at most 64K (maximum size of a single MHI packet), but there is a +continuation feature where message N+1 can be marked as a continuation of +message N. This is used for exceedingly large DMA xfer transactions. 
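
Before moving on to the individual transactions, the request FIFO encoding
described earlier can be made a little more concrete. The helpers below are
purely illustrative and are not the driver's definitions; they pack a
pcie_dma_cmd byte and a sem_cmdN word according to the bit positions listed
above, and compute where a FIFO element lives from its register index:

.. code-block:: c

    #include <linux/types.h>

    /*
     * Illustrative encoding helpers derived from the request FIFO field
     * descriptions above; the driver has its own definitions.
     */

    /* pcie_dma_cmd: bit 7 force MSI, bit 4 generate a response element,
     * bit 3 bulk(1)/linked-list(0), bits 1:0 transfer type.
     */
    #define DMA_CMD_FORCE_MSI   (1 << 7)
    #define DMA_CMD_GEN_RESP    (1 << 4)
    #define DMA_CMD_BULK        (1 << 3)
    #define DMA_CMD_NO_XFER     0
    #define DMA_CMD_TO_DEV      1
    #define DMA_CMD_FROM_DEV    2

    /* sem_cmdN: bit 31 enable, bits 26:24 command (0 NOP, 1 init, 2 inc,
     * 3 dec, 4 wait-equal, 5 wait-ge, 6 "P"), bit 22 presync, bits 20:16
     * semaphore index, bits 11:0 value.
     */
    static inline u32 sem_cmd(u32 cmd, u32 index, u32 value, bool presync)
    {
            return (1u << 31) | ((cmd & 0x7) << 24) |
                   ((presync ? 1u : 0u) << 22) |
                   ((index & 0x1f) << 16) | (value & 0xfff);
    }

    /* A register value is an index into the FIFO:
     * element address = FIFO base address + index * element size.
     */
    static inline u64 fifo_elem_addr(u64 fifo_base, u32 index, u32 elem_size)
    {
            return fifo_base + (u64)index * elem_size;
    }
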
+ +Transaction descriptions +------------------------ + +passthrough + Allows userspace to send an opaque payload directly to the QSM. + This is used for NNC commands. Userspace is responsible for managing + the QSM message requirements in the payload. + +dma_xfer + DMA transfer. Describes an object that the QSM should DMA into the + device via address and size tuples. + +activate + Activate a workload onto NSPs. The host must provide memory to be + used by the DBC. + +deactivate + Deactivate an active workload and return the NSPs to idle. + +status + Query the QSM about it's NNC implementation. Returns the NNC version, + and if CRC is used. + +terminate + Release a user's resources. + +dma_xfer_cont + Continuation of a previous DMA transfer. If a DMA transfer + cannot be specified in a single message (highly fragmented), this + transaction can be used to specify more ranges. + +validate_partition + Query to QSM to determine if a partition identifier is valid. + +Each message is tagged with a user id, and a partition id. The user id allows +QSM to track resources, and release them when the user goes away (eg the process +crashes). A partition id identifies the resource partition that QSM manages, +which this message applies to. + +Messages may have CRCs. Messages should have CRCs applied until the QSM +reports via the status transaction that CRCs are not needed. The QSM on the +SA9000P requires CRCs for black channel safing. + +Subsystem Restart (SSR) +======================= + +SSR is the concept of limiting the impact of an error. An AIC100 device may +have multiple users, each with their own workload running. If the workload of +one user crashes, the fallout of that should be limited to that workload and not +impact other workloads. SSR accomplishes this. + +If a particular workload crashes, QSM notifies the host via the QAIC_SSR MHI +channel. This notification identifies the workload by it's assigned DBC. A +multi-stage recovery process is then used to cleanup both sides, and get the +DBC/NSPs into a working state. + +When SSR occurs, any state in the workload is lost. Any inputs that were in +process, or queued by not yet serviced, are lost. The loaded artifacts will +remain in on-card DDR, but the host will need to re-activate the workload if +it desires to recover the workload. + +Reliability, Accessibility, Serviceability (RAS) +================================================ + +AIC100 is expected to be deployed in server systems where RAS ideology is +applied. Simply put, RAS is the concept of detecting, classifying, and +reporting errors. While PCIe has AER (Advanced Error Reporting) which factors +into RAS, AER does not allow for a device to report details about internal +errors. Therefore, AIC100 implements a custom RAS mechanism. When a RAS event +occurs, QSM will report the event with appropriate details via the QAIC_STATUS +MHI channel. A sysadmin may determine that a particular device needs +additional service based on RAS reports. + +Telemetry +========= + +QSM has the ability to report various physical attributes of the device, and in +some cases, to allow the host to control them. Examples include thermal limits, +thermal readings, and power readings. These items are communicated via the +QAIC_TELEMETRY MHI channel. diff --git a/Documentation/accel/qaic/index.rst b/Documentation/accel/qaic/index.rst new file mode 100644 index 000000000000..ad19b88d1a66 --- /dev/null +++ b/Documentation/accel/qaic/index.rst @@ -0,0 +1,13 @@ +.. 
SPDX-License-Identifier: GPL-2.0-only + +===================================== + accel/qaic Qualcomm Cloud AI driver +===================================== + +The accel/qaic driver supports the Qualcomm Cloud AI machine learning +accelerator cards. + +.. toctree:: + + qaic + aic100 diff --git a/Documentation/accel/qaic/qaic.rst b/Documentation/accel/qaic/qaic.rst new file mode 100644 index 000000000000..72a70ab6e3a8 --- /dev/null +++ b/Documentation/accel/qaic/qaic.rst @@ -0,0 +1,170 @@ +.. SPDX-License-Identifier: GPL-2.0-only + +============= + QAIC driver +============= + +The QAIC driver is the Kernel Mode Driver (KMD) for the AIC100 family of AI +accelerator products. + +Interrupts +========== + +While the AIC100 DMA Bridge hardware implements an IRQ storm mitigation +mechanism, it is still possible for an IRQ storm to occur. A storm can happen +if the workload is particularly quick, and the host is responsive. If the host +can drain the response FIFO as quickly as the device can insert elements into +it, then the device will frequently transition the response FIFO from empty to +non-empty and generate MSIs at a rate equivalent to the speed of the +workload's ability to process inputs. The lprnet (license plate reader network) +workload is known to trigger this condition, and can generate in excess of 100k +MSIs per second. It has been observed that most systems cannot tolerate this +for long, and will crash due to some form of watchdog due to the overhead of +the interrupt controller interrupting the host CPU. + +To mitigate this issue, the QAIC driver implements specific IRQ handling. When +QAIC receives an IRQ, it disables that line. This prevents the interrupt +controller from interrupting the CPU. Then AIC drains the FIFO. Once the FIFO +is drained, QAIC implements a "last chance" polling algorithm where QAIC will +sleep for a time to see if the workload will generate more activity. The IRQ +line remains disabled during this time. If no activity is detected, QAIC exits +polling mode and reenables the IRQ line. + +This mitigation in QAIC is very effective. The same lprnet usecase that +generates 100k IRQs per second (per /proc/interrupts) is reduced to roughly 64 +IRQs over 5 minutes while keeping the host system stable, and having the same +workload throughput performance (within run to run noise variation). + + +Neural Network Control (NNC) Protocol +===================================== + +The implementation of NNC is split between the KMD (QAIC) and UMD. In general +QAIC understands how to encode/decode NNC wire protocol, and elements of the +protocol which require kernel space knowledge to process (for example, mapping +host memory to device IOVAs). QAIC understands the structure of a message, and +all of the transactions. QAIC does not understand commands (the payload of a +passthrough transaction). + +QAIC handles and enforces the required little endianness and 64-bit alignment, +to the degree that it can. Since QAIC does not know the contents of a +passthrough transaction, it relies on the UMD to satisfy the requirements. + +The terminate transaction is of particular use to QAIC. QAIC is not aware of +the resources that are loaded onto a device since the majority of that activity +occurs within NNC commands. As a result, QAIC does not have the means to +roll back userspace activity. 
To ensure that a userspace client's resources +are fully released in the case of a process crash, or a bug, QAIC uses the +terminate command to let QSM know when a user has gone away, and the resources +can be released. + +QSM can report a version number of the NNC protocol it supports. This is in the +form of a Major number and a Minor number. + +Major number updates indicate changes to the NNC protocol which impact the +message format, or transactions (impacts QAIC). + +Minor number updates indicate changes to the NNC protocol which impact the +commands (does not impact QAIC). + +uAPI +==== + +QAIC defines a number of driver specific IOCTLs as part of the userspace API. +This section describes those APIs. + +DRM_IOCTL_QAIC_MANAGE + This IOCTL allows userspace to send a NNC request to the QSM. The call will + block until a response is received, or the request has timed out. + +DRM_IOCTL_QAIC_CREATE_BO + This IOCTL allows userspace to allocate a buffer object (BO) which can send + or receive data from a workload. The call will return a GEM handle that + represents the allocated buffer. The BO is not usable until it has been + sliced (see DRM_IOCTL_QAIC_ATTACH_SLICE_BO). + +DRM_IOCTL_QAIC_MMAP_BO + This IOCTL allows userspace to prepare an allocated BO to be mmap'd into the + userspace process. + +DRM_IOCTL_QAIC_ATTACH_SLICE_BO + This IOCTL allows userspace to slice a BO in preparation for sending the BO + to the device. Slicing is the operation of describing what portions of a BO + get sent where to a workload. This requires a set of DMA transfers for the + DMA Bridge, and as such, locks the BO to a specific DBC. + +DRM_IOCTL_QAIC_EXECUTE_BO + This IOCTL allows userspace to submit a set of sliced BOs to the device. The + call is non-blocking. Success only indicates that the BOs have been queued + to the device, but does not guarantee they have been executed. + +DRM_IOCTL_QAIC_PARTIAL_EXECUTE_BO + This IOCTL operates like DRM_IOCTL_QAIC_EXECUTE_BO, but it allows userspace + to shrink the BOs sent to the device for this specific call. If a BO + typically has N inputs, but only a subset of those is available, this IOCTL + allows userspace to indicate that only the first M bytes of the BO should be + sent to the device to minimize data transfer overhead. This IOCTL dynamically + recomputes the slicing, and therefore has some processing overhead before the + BOs can be queued to the device. + +DRM_IOCTL_QAIC_WAIT_BO + This IOCTL allows userspace to determine when a particular BO has been + processed by the device. The call will block until either the BO has been + processed and can be re-queued to the device, or a timeout occurs. + +DRM_IOCTL_QAIC_PERF_STATS_BO + This IOCTL allows userspace to collect performance statistics on the most + recent execution of a BO. This allows userspace to construct an end to end + timeline of the BO processing for a performance analysis. + +DRM_IOCTL_QAIC_PART_DEV + This IOCTL allows userspace to request a duplicate "shadow device". This extra + accelN device is associated with a specific partition of resources on the + AIC100 device and can be used for limiting a process to some subset of + resources. + +Userspace Client Isolation +========================== + +AIC100 supports multiple clients. Multiple DBCs can be consumed by a single +client, and multiple clients can each consume one or more DBCs. Workloads +may contain sensitive information therefore only the client that owns the +workload should be allowed to interface with the DBC. 
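
To give a concrete picture of how a single client drives the uAPI listed
above, the sketch below strings the per-workload IOCTLs together. It is
pseudocode more than a working program: the request structures are defined in
include/uapi/drm/qaic_accel.h and are passed here as opaque pointers, the
include path is an assumption, and fd is the client's own open() of the
device node.

.. code-block:: c

    #include <sys/ioctl.h>
    #include <drm/qaic_accel.h>     /* ioctl numbers; include path is an assumption */

    /*
     * Sketch of one client's per-workload call sequence. The request
     * structures (defined in include/uapi/drm/qaic_accel.h) are passed as
     * opaque pointers because their fields are not described in this
     * document.
     */
    static int run_one_inference(int fd, void *manage, void *create,
                                 void *mmap_req, void *slice, void *exec,
                                 void *wait)
    {
            if (ioctl(fd, DRM_IOCTL_QAIC_MANAGE, manage))          /* load/activate via NNC */
                    return -1;
            if (ioctl(fd, DRM_IOCTL_QAIC_CREATE_BO, create))       /* allocate a BO */
                    return -1;
            if (ioctl(fd, DRM_IOCTL_QAIC_MMAP_BO, mmap_req))       /* prepare the BO for mmap() */
                    return -1;
            if (ioctl(fd, DRM_IOCTL_QAIC_ATTACH_SLICE_BO, slice))  /* slice the BO; binds it to a DBC */
                    return -1;
            if (ioctl(fd, DRM_IOCTL_QAIC_EXECUTE_BO, exec))        /* queue input to the workload */
                    return -1;
            return ioctl(fd, DRM_IOCTL_QAIC_WAIT_BO, wait);        /* block until processed */
    }
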
+ +Clients are identified by the instance associated with their open(). A client +may only use memory they allocate, and DBCs that are assigned to their +workloads. Attempts to access resources assigned to other clients will be +rejected. + +Module parameters +================= + +QAIC supports the following module parameters: + +**datapath_polling (bool)** + +Configures QAIC to use a polling thread for datapath events instead of relying +on the device interrupts. Useful for platforms with broken multiMSI. Must be +set at QAIC driver initialization. Default is 0 (off). + +**mhi_timeout_ms (unsigned int)** + +Sets the timeout value for MHI operations in milliseconds (ms). Must be set +at the time the driver detects a device. Default is 2000 (2 seconds). + +**control_resp_timeout_s (unsigned int)** + +Sets the timeout value for QSM responses to NNC messages in seconds (s). Must +be set at the time the driver is sending a request to QSM. Default is 60 (one +minute). + +**wait_exec_default_timeout_ms (unsigned int)** + +Sets the default timeout for the wait_exec ioctl in milliseconds (ms). Must be +set prior to the waic_exec ioctl call. A value specified in the ioctl call +overrides this for that call. Default is 5000 (5 seconds). + +**datapath_poll_interval_us (unsigned int)** + +Sets the polling interval in microseconds (us) when datapath polling is active. +Takes effect at the next polling interval. Default is 100 (100 us). diff --git a/Documentation/devicetree/bindings/display/panel/elida,kd35t133.yaml b/Documentation/devicetree/bindings/display/panel/elida,kd35t133.yaml index 7adb83e2e8d9..265ab6d30572 100644 --- a/Documentation/devicetree/bindings/display/panel/elida,kd35t133.yaml +++ b/Documentation/devicetree/bindings/display/panel/elida,kd35t133.yaml @@ -17,7 +17,9 @@ properties: const: elida,kd35t133 reg: true backlight: true + port: true reset-gpios: true + rotation: true iovcc-supply: description: regulator that supplies the iovcc voltage vdd-supply: @@ -27,6 +29,7 @@ required: - compatible - reg - backlight + - port - iovcc-supply - vdd-supply @@ -43,6 +46,12 @@ examples: backlight = <&backlight>; iovcc-supply = <&vcc_1v8>; vdd-supply = <&vcc3v3_lcd>; + + port { + mipi_in_panel: endpoint { + remote-endpoint = <&mipi_out_panel>; + }; + }; }; }; diff --git a/Documentation/devicetree/bindings/display/panel/feiyang,fy07024di26a30d.yaml b/Documentation/devicetree/bindings/display/panel/feiyang,fy07024di26a30d.yaml index 1cf84c8dd85e..92df69e80a82 100644 --- a/Documentation/devicetree/bindings/display/panel/feiyang,fy07024di26a30d.yaml +++ b/Documentation/devicetree/bindings/display/panel/feiyang,fy07024di26a30d.yaml @@ -26,6 +26,7 @@ properties: dvdd-supply: description: 3v3 digital regulator + port: true reset-gpios: true backlight: true @@ -35,6 +36,7 @@ required: - reg - avdd-supply - dvdd-supply + - port additionalProperties: false @@ -53,5 +55,11 @@ examples: dvdd-supply = <®_dldo2>; reset-gpios = <&pio 3 24 GPIO_ACTIVE_HIGH>; /* LCD-RST: PD24 */ backlight = <&backlight>; + + port { + mipi_in_panel: endpoint { + remote-endpoint = <&mipi_out_panel>; + }; + }; }; }; diff --git a/Documentation/devicetree/bindings/display/panel/panel-timing.yaml b/Documentation/devicetree/bindings/display/panel/panel-timing.yaml index 0d317e61edd8..aea69b84ca5d 100644 --- a/Documentation/devicetree/bindings/display/panel/panel-timing.yaml +++ b/Documentation/devicetree/bindings/display/panel/panel-timing.yaml @@ -17,29 +17,29 @@ description: | The parameters are defined as seen in the following 
illustration. - +----------+-------------------------------------+----------+-------+ - | | ^ | | | - | | |vback_porch | | | - | | v | | | - +----------#######################################----------+-------+ - | # ^ # | | - | # | # | | - | hback # | # hfront | hsync | - | porch # | hactive # porch | len | - |<-------->#<-------+--------------------------->#<-------->|<----->| - | # | # | | - | # |vactive # | | - | # | # | | - | # v # | | - +----------#######################################----------+-------+ - | | ^ | | | - | | |vfront_porch | | | - | | v | | | - +----------+-------------------------------------+----------+-------+ - | | ^ | | | - | | |vsync_len | | | - | | v | | | - +----------+-------------------------------------+----------+-------+ + +-------+----------+-------------------------------------+----------+ + | | | ^ | | + | | | |vsync_len | | + | | | v | | + +-------+----------+-------------------------------------+----------+ + | | | ^ | | + | | | |vback_porch | | + | | | v | | + +-------+----------#######################################----------+ + | | # ^ # | + | | # | # | + | hsync | hback # | # hfront | + | len | porch # | hactive # porch | + |<----->|<-------->#<-------+--------------------------->#<-------->| + | | # | # | + | | # |vactive # | + | | # | # | + | | # v # | + +-------+----------#######################################----------+ + | | | ^ | | + | | | |vfront_porch | | + | | | v | | + +-------+----------+-------------------------------------+----------+ The following is the panel timings shown with time on the x-axis. diff --git a/Documentation/devicetree/bindings/display/panel/sitronix,st7701.yaml b/Documentation/devicetree/bindings/display/panel/sitronix,st7701.yaml index 83d30eadf7d9..4dc0cd4a6a77 100644 --- a/Documentation/devicetree/bindings/display/panel/sitronix,st7701.yaml +++ b/Documentation/devicetree/bindings/display/panel/sitronix,st7701.yaml @@ -42,7 +42,9 @@ properties: IOVCC-supply: description: I/O system regulator + port: true reset-gpios: true + rotation: true backlight: true @@ -51,6 +53,7 @@ required: - reg - VCC-supply - IOVCC-supply + - port - reset-gpios additionalProperties: false @@ -70,5 +73,11 @@ examples: IOVCC-supply = <®_dldo2>; reset-gpios = <&pio 3 24 GPIO_ACTIVE_HIGH>; /* LCD-RST: PD24 */ backlight = <&backlight>; + + port { + mipi_in_panel: endpoint { + remote-endpoint = <&mipi_out_panel>; + }; + }; }; }; diff --git a/Documentation/devicetree/bindings/display/panel/sitronix,st7789v.yaml b/Documentation/devicetree/bindings/display/panel/sitronix,st7789v.yaml index d984b59daa4a..fa6556363cca 100644 --- a/Documentation/devicetree/bindings/display/panel/sitronix,st7789v.yaml +++ b/Documentation/devicetree/bindings/display/panel/sitronix,st7789v.yaml @@ -26,6 +26,10 @@ properties: spi-cpha: true spi-cpol: true + dc-gpios: + maxItems: 1 + description: DCX pin, Display data/command selection pin in parallel interface + required: - compatible - reg diff --git a/Documentation/devicetree/bindings/display/panel/xinpeng,xpp055c272.yaml b/Documentation/devicetree/bindings/display/panel/xinpeng,xpp055c272.yaml index d5c46a3cc2b0..c407deb6afb1 100644 --- a/Documentation/devicetree/bindings/display/panel/xinpeng,xpp055c272.yaml +++ b/Documentation/devicetree/bindings/display/panel/xinpeng,xpp055c272.yaml @@ -17,6 +17,7 @@ properties: const: xinpeng,xpp055c272 reg: true backlight: true + port: true reset-gpios: true iovcc-supply: description: regulator that supplies the iovcc voltage @@ -27,6 +28,7 @@ required: - compatible - reg - 
backlight + - port - iovcc-supply - vci-supply @@ -44,6 +46,12 @@ examples: backlight = <&backlight>; iovcc-supply = <&vcc_1v8>; vci-supply = <&vcc3v3_lcd>; + + port { + mipi_in_panel: endpoint { + remote-endpoint = <&mipi_out_panel>; + }; + }; }; }; diff --git a/MAINTAINERS b/MAINTAINERS index 9736e04d3bd3..d037504a5748 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -17265,6 +17265,16 @@ F: Documentation/devicetree/bindings/clock/qcom,* F: drivers/clk/qcom/ F: include/dt-bindings/clock/qcom,* +QUALCOMM CLOUD AI (QAIC) DRIVER +M: Jeffrey Hugo <[email protected]> +S: Supported +T: git git://anongit.freedesktop.org/drm/drm-misc +F: Documentation/accel/qaic/ +F: drivers/accel/qaic/ +F: include/uapi/drm/qaic_accel.h + QUALCOMM CORE POWER REDUCTION (CPR) AVS DRIVER M: Bjorn Andersson <[email protected]> M: Konrad Dybcio <[email protected]> diff --git a/drivers/accel/Kconfig b/drivers/accel/Kconfig index c437206aa3f1..64065fb8922b 100644 --- a/drivers/accel/Kconfig +++ b/drivers/accel/Kconfig @@ -26,5 +26,6 @@ menuconfig DRM_ACCEL source "drivers/accel/habanalabs/Kconfig" source "drivers/accel/ivpu/Kconfig" +source "drivers/accel/qaic/Kconfig" endif diff --git a/drivers/accel/Makefile b/drivers/accel/Makefile index f22fd44d586b..ab3df932937f 100644 --- a/drivers/accel/Makefile +++ b/drivers/accel/Makefile @@ -2,3 +2,4 @@ obj-$(CONFIG_DRM_ACCEL_HABANALABS) += habanalabs/ obj-$(CONFIG_DRM_ACCEL_IVPU) += ivpu/ +obj-$(CONFIG_DRM_ACCEL_QAIC) += qaic/ diff --git a/drivers/accel/ivpu/ivpu_drv.c b/drivers/accel/ivpu/ivpu_drv.c index 231f29bb5025..eb6405f9bf6b 100644 --- a/drivers/accel/ivpu/ivpu_drv.c +++ b/drivers/accel/ivpu/ivpu_drv.c @@ -433,6 +433,10 @@ static int ivpu_pci_init(struct ivpu_device *vdev) /* Clear any pending errors */ pcie_capability_clear_word(pdev, PCI_EXP_DEVSTA, 0x3f); + /* VPU MTL does not require PCI spec 10m D3hot delay */ + if (ivpu_is_mtl(vdev)) + pdev->d3hot_delay = 0; + ret = pcim_enable_device(pdev); if (ret) { ivpu_err(vdev, "Failed to enable PCI device: %d\n", ret); diff --git a/drivers/accel/qaic/Kconfig b/drivers/accel/qaic/Kconfig new file mode 100644 index 000000000000..a9f866230058 --- /dev/null +++ b/drivers/accel/qaic/Kconfig @@ -0,0 +1,23 @@ +# SPDX-License-Identifier: GPL-2.0-only +# +# Qualcomm Cloud AI accelerators driver +# + +config DRM_ACCEL_QAIC + tristate "Qualcomm Cloud AI accelerators" + depends on DRM_ACCEL + depends on PCI && HAS_IOMEM + depends on MHI_BUS + depends on MMU + select CRC32 + help + Enables driver for Qualcomm's Cloud AI accelerator PCIe cards that are + designed to accelerate Deep Learning inference workloads. + + The driver manages the PCIe devices and provides an IOCTL interface + for users to submit workloads to the devices. + + If unsure, say N. + + To compile this driver as a module, choose M here: the + module will be called qaic. 
diff --git a/drivers/accel/qaic/Makefile b/drivers/accel/qaic/Makefile new file mode 100644 index 000000000000..d5f4952ae79a --- /dev/null +++ b/drivers/accel/qaic/Makefile @@ -0,0 +1,13 @@ +# SPDX-License-Identifier: GPL-2.0-only +# +# Makefile for Qualcomm Cloud AI accelerators driver +# + +obj-$(CONFIG_DRM_ACCEL_QAIC) := qaic.o + +qaic-y := \ + mhi_controller.o \ + mhi_qaic_ctrl.o \ + qaic_control.o \ + qaic_data.o \ + qaic_drv.o diff --git a/drivers/accel/qaic/mhi_controller.c b/drivers/accel/qaic/mhi_controller.c new file mode 100644 index 000000000000..5036e58e7235 --- /dev/null +++ b/drivers/accel/qaic/mhi_controller.c @@ -0,0 +1,563 @@ +// SPDX-License-Identifier: GPL-2.0-only + +/* Copyright (c) 2019-2021, The Linux Foundation. All rights reserved. */ +/* Copyright (c) 2021-2023 Qualcomm Innovation Center, Inc. All rights reserved. */ + +#include <linux/delay.h> +#include <linux/err.h> +#include <linux/memblock.h> +#include <linux/mhi.h> +#include <linux/moduleparam.h> +#include <linux/pci.h> +#include <linux/sizes.h> + +#include "mhi_controller.h" +#include "qaic.h" + +#define MAX_RESET_TIME_SEC 25 + +static unsigned int mhi_timeout_ms = 2000; /* 2 sec default */ +module_param(mhi_timeout_ms, uint, 0600); +MODULE_PARM_DESC(mhi_timeout_ms, "MHI controller timeout value"); + +static struct mhi_channel_config aic100_channels[] = { + { + .name = "QAIC_LOOPBACK", + .num = 0, + .num_elements = 32, + .local_elements = 0, + .event_ring = 0, + .dir = DMA_TO_DEVICE, + .ee_mask = MHI_CH_EE_AMSS, + .pollcfg = 0, + .doorbell = MHI_DB_BRST_DISABLE, + .lpm_notify = false, + .offload_channel = false, + .doorbell_mode_switch = false, + .auto_queue = false, + .wake_capable = false, + }, + { + .name = "QAIC_LOOPBACK", + .num = 1, + .num_elements = 32, + .local_elements = 0, + .event_ring = 0, + .dir = DMA_FROM_DEVICE, + .ee_mask = MHI_CH_EE_AMSS, + .pollcfg = 0, + .doorbell = MHI_DB_BRST_DISABLE, + .lpm_notify = false, + .offload_channel = false, + .doorbell_mode_switch = false, + .auto_queue = false, + .wake_capable = false, + }, + { + .name = "QAIC_SAHARA", + .num = 2, + .num_elements = 32, + .local_elements = 0, + .event_ring = 0, + .dir = DMA_TO_DEVICE, + .ee_mask = MHI_CH_EE_SBL, + .pollcfg = 0, + .doorbell = MHI_DB_BRST_DISABLE, + .lpm_notify = false, + .offload_channel = false, + .doorbell_mode_switch = false, + .auto_queue = false, + .wake_capable = false, + }, + { + .name = "QAIC_SAHARA", + .num = 3, + .num_elements = 32, + .local_elements = 0, + .event_ring = 0, + .dir = DMA_FROM_DEVICE, + .ee_mask = MHI_CH_EE_SBL, + .pollcfg = 0, + .doorbell = MHI_DB_BRST_DISABLE, + .lpm_notify = false, + .offload_channel = false, + .doorbell_mode_switch = false, + .auto_queue = false, + .wake_capable = false, + }, + { + .name = "QAIC_DIAG", + .num = 4, + .num_elements = 32, + .local_elements = 0, + .event_ring = 0, + .dir = DMA_TO_DEVICE, + .ee_mask = MHI_CH_EE_AMSS, + .pollcfg = 0, + .doorbell = MHI_DB_BRST_DISABLE, + .lpm_notify = false, + .offload_channel = false, + .doorbell_mode_switch = false, + .auto_queue = false, + .wake_capable = false, + }, + { + .name = "QAIC_DIAG", + .num = 5, + .num_elements = 32, + .local_elements = 0, + .event_ring = 0, + .dir = DMA_FROM_DEVICE, + .ee_mask = MHI_CH_EE_AMSS, + .pollcfg = 0, + .doorbell = MHI_DB_BRST_DISABLE, + .lpm_notify = false, + .offload_channel = false, + .doorbell_mode_switch = false, + .auto_queue = false, + .wake_capable = false, + }, + { + .name = "QAIC_SSR", + .num = 6, + .num_elements = 32, + .local_elements = 0, + .event_ring = 0, + .dir = 
DMA_TO_DEVICE, + .ee_mask = MHI_CH_EE_AMSS, + .pollcfg = 0, + .doorbell = MHI_DB_BRST_DISABLE, + .lpm_notify = false, + .offload_channel = false, + .doorbell_mode_switch = false, + .auto_queue = false, + .wake_capable = false, + }, + { + .name = "QAIC_SSR", + .num = 7, + .num_elements = 32, + .local_elements = 0, + .event_ring = 0, + .dir = DMA_FROM_DEVICE, + .ee_mask = MHI_CH_EE_AMSS, + .pollcfg = 0, + .doorbell = MHI_DB_BRST_DISABLE, + .lpm_notify = false, + .offload_channel = false, + .doorbell_mode_switch = false, + .auto_queue = false, + .wake_capable = false, + }, + { + .name = "QAIC_QDSS", + .num = 8, + .num_elements = 32, + .local_elements = 0, + .event_ring = 0, + .dir = DMA_TO_DEVICE, + .ee_mask = MHI_CH_EE_AMSS, + .pollcfg = 0, + .doorbell = MHI_DB_BRST_DISABLE, + .lpm_notify = false, + .offload_channel = false, + .doorbell_mode_switch = false, + .auto_queue = false, + .wake_capable = false, + }, + { + .name = "QAIC_QDSS", + .num = 9, + .num_elements = 32, + .local_elements = 0, + .event_ring = 0, + .dir = DMA_FROM_DEVICE, + .ee_mask = MHI_CH_EE_AMSS, + .pollcfg = 0, + .doorbell = MHI_DB_BRST_DISABLE, + .lpm_notify = false, + .offload_channel = false, + .doorbell_mode_switch = false, + .auto_queue = false, + .wake_capable = false, + }, + { + .name = "QAIC_CONTROL", + .num = 10, + .num_elements = 128, + .local_elements = 0, + .event_ring = 0, + .dir = DMA_TO_DEVICE, + .ee_mask = MHI_CH_EE_AMSS, + .pollcfg = 0, + .doorbell = MHI_DB_BRST_DISABLE, + .lpm_notify = false, + .offload_channel = false, + .doorbell_mode_switch = false, + .auto_queue = false, + .wake_capable = false, + }, + { + .name = "QAIC_CONTROL", + .num = 11, + .num_elements = 128, + .local_elements = 0, + .event_ring = 0, + .dir = DMA_FROM_DEVICE, + .ee_mask = MHI_CH_EE_AMSS, + .pollcfg = 0, + .doorbell = MHI_DB_BRST_DISABLE, + .lpm_notify = false, + .offload_channel = false, + .doorbell_mode_switch = false, + .auto_queue = false, + .wake_capable = false, + }, + { + .name = "QAIC_LOGGING", + .num = 12, + .num_elements = 32, + .local_elements = 0, + .event_ring = 0, + .dir = DMA_TO_DEVICE, + .ee_mask = MHI_CH_EE_SBL, + .pollcfg = 0, + .doorbell = MHI_DB_BRST_DISABLE, + .lpm_notify = false, + .offload_channel = false, + .doorbell_mode_switch = false, + .auto_queue = false, + .wake_capable = false, + }, + { + .name = "QAIC_LOGGING", + .num = 13, + .num_elements = 32, + .local_elements = 0, + .event_ring = 0, + .dir = DMA_FROM_DEVICE, + .ee_mask = MHI_CH_EE_SBL, + .pollcfg = 0, + .doorbell = MHI_DB_BRST_DISABLE, + .lpm_notify = false, + .offload_channel = false, + .doorbell_mode_switch = false, + .auto_queue = false, + .wake_capable = false, + }, + { + .name = "QAIC_STATUS", + .num = 14, + .num_elements = 32, + .local_elements = 0, + .event_ring = 0, + .dir = DMA_TO_DEVICE, + .ee_mask = MHI_CH_EE_AMSS, + .pollcfg = 0, + .doorbell = MHI_DB_BRST_DISABLE, + .lpm_notify = false, + .offload_channel = false, + .doorbell_mode_switch = false, + .auto_queue = false, + .wake_capable = false, + }, + { + .name = "QAIC_STATUS", + .num = 15, + .num_elements = 32, + .local_elements = 0, + .event_ring = 0, + .dir = DMA_FROM_DEVICE, + .ee_mask = MHI_CH_EE_AMSS, + .pollcfg = 0, + .doorbell = MHI_DB_BRST_DISABLE, + .lpm_notify = false, + .offload_channel = false, + .doorbell_mode_switch = false, + .auto_queue = false, + .wake_capable = false, + }, + { + .name = "QAIC_TELEMETRY", + .num = 16, + .num_elements = 32, + .local_elements = 0, + .event_ring = 0, + .dir = DMA_TO_DEVICE, + .ee_mask = MHI_CH_EE_AMSS, + .pollcfg = 0, + .doorbell = 
MHI_DB_BRST_DISABLE, + .lpm_notify = false, + .offload_channel = false, + .doorbell_mode_switch = false, + .auto_queue = false, + .wake_capable = false, + }, + { + .name = "QAIC_TELEMETRY", + .num = 17, + .num_elements = 32, + .local_elements = 0, + .event_ring = 0, + .dir = DMA_FROM_DEVICE, + .ee_mask = MHI_CH_EE_AMSS, + .pollcfg = 0, + .doorbell = MHI_DB_BRST_DISABLE, + .lpm_notify = false, + .offload_channel = false, + .doorbell_mode_switch = false, + .auto_queue = false, + .wake_capable = false, + }, + { + .name = "QAIC_DEBUG", + .num = 18, + .num_elements = 32, + .local_elements = 0, + .event_ring = 0, + .dir = DMA_TO_DEVICE, + .ee_mask = MHI_CH_EE_AMSS, + .pollcfg = 0, + .doorbell = MHI_DB_BRST_DISABLE, + .lpm_notify = false, + .offload_channel = false, + .doorbell_mode_switch = false, + .auto_queue = false, + .wake_capable = false, + }, + { + .name = "QAIC_DEBUG", + .num = 19, + .num_elements = 32, + .local_elements = 0, + .event_ring = 0, + .dir = DMA_FROM_DEVICE, + .ee_mask = MHI_CH_EE_AMSS, + .pollcfg = 0, + .doorbell = MHI_DB_BRST_DISABLE, + .lpm_notify = false, + .offload_channel = false, + .doorbell_mode_switch = false, + .auto_queue = false, + .wake_capable = false, + }, + { + .name = "QAIC_TIMESYNC", + .num = 20, + .num_elements = 32, + .local_elements = 0, + .event_ring = 0, + .dir = DMA_TO_DEVICE, + .ee_mask = MHI_CH_EE_SBL | MHI_CH_EE_AMSS, + .pollcfg = 0, + .doorbell = MHI_DB_BRST_DISABLE, + .lpm_notify = false, + .offload_channel = false, + .doorbell_mode_switch = false, + .auto_queue = false, + .wake_capable = false, + }, + { + .num = 21, + .name = "QAIC_TIMESYNC", + .num_elements = 32, + .local_elements = 0, + .event_ring = 0, + .dir = DMA_FROM_DEVICE, + .ee_mask = MHI_CH_EE_SBL | MHI_CH_EE_AMSS, + .pollcfg = 0, + .doorbell = MHI_DB_BRST_DISABLE, + .lpm_notify = false, + .offload_channel = false, + .doorbell_mode_switch = false, + .auto_queue = false, + .wake_capable = false, + }, +}; + +static struct mhi_event_config aic100_events[] = { + { + .num_elements = 32, + .irq_moderation_ms = 0, + .irq = 0, + .channel = U32_MAX, + .priority = 1, + .mode = MHI_DB_BRST_DISABLE, + .data_type = MHI_ER_CTRL, + .hardware_event = false, + .client_managed = false, + .offload_channel = false, + }, +}; + +static struct mhi_controller_config aic100_config = { + .max_channels = 128, + .timeout_ms = 0, /* controlled by mhi_timeout */ + .buf_len = 0, + .num_channels = ARRAY_SIZE(aic100_channels), + .ch_cfg = aic100_channels, + .num_events = ARRAY_SIZE(aic100_events), + .event_cfg = aic100_events, + .use_bounce_buf = false, + .m2_no_db = false, +}; + +static int mhi_read_reg(struct mhi_controller *mhi_cntrl, void __iomem *addr, u32 *out) +{ + u32 tmp = readl_relaxed(addr); + + if (tmp == U32_MAX) + return -EIO; + + *out = tmp; + + return 0; +} + +static void mhi_write_reg(struct mhi_controller *mhi_cntrl, void __iomem *addr, u32 val) +{ + writel_relaxed(val, addr); +} + +static int mhi_runtime_get(struct mhi_controller *mhi_cntrl) +{ + return 0; +} + +static void mhi_runtime_put(struct mhi_controller *mhi_cntrl) +{ +} + +static void mhi_status_cb(struct mhi_controller *mhi_cntrl, enum mhi_callback reason) +{ + struct qaic_device *qdev = pci_get_drvdata(to_pci_dev(mhi_cntrl->cntrl_dev)); + + /* this event occurs in atomic context */ + if (reason == MHI_CB_FATAL_ERROR) + pci_err(qdev->pdev, "Fatal error received from device. 
Attempting to recover\n"); + /* this event occurs in non-atomic context */ + if (reason == MHI_CB_SYS_ERROR) + qaic_dev_reset_clean_local_state(qdev, true); +} + +static int mhi_reset_and_async_power_up(struct mhi_controller *mhi_cntrl) +{ + u8 time_sec = 1; + int current_ee; + int ret; + + /* Reset the device to bring the device in PBL EE */ + mhi_soc_reset(mhi_cntrl); + + /* + * Keep checking the execution environment(EE) after every 1 second + * interval. + */ + do { + msleep(1000); + current_ee = mhi_get_exec_env(mhi_cntrl); + } while (current_ee != MHI_EE_PBL && time_sec++ <= MAX_RESET_TIME_SEC); + + /* If the device is in PBL EE retry power up */ + if (current_ee == MHI_EE_PBL) + ret = mhi_async_power_up(mhi_cntrl); + else + ret = -EIO; + + return ret; +} + +struct mhi_controller *qaic_mhi_register_controller(struct pci_dev *pci_dev, void __iomem *mhi_bar, + int mhi_irq) +{ + struct mhi_controller *mhi_cntrl; + int ret; + + mhi_cntrl = devm_kzalloc(&pci_dev->dev, sizeof(*mhi_cntrl), GFP_KERNEL); + if (!mhi_cntrl) + return ERR_PTR(-ENOMEM); + + mhi_cntrl->cntrl_dev = &pci_dev->dev; + + /* + * Covers the entire possible physical ram region. Remote side is + * going to calculate a size of this range, so subtract 1 to prevent + * rollover. + */ + mhi_cntrl->iova_start = 0; + mhi_cntrl->iova_stop = PHYS_ADDR_MAX - 1; + mhi_cntrl->status_cb = mhi_status_cb; + mhi_cntrl->runtime_get = mhi_runtime_get; + mhi_cntrl->runtime_put = mhi_runtime_put; + mhi_cntrl->read_reg = mhi_read_reg; + mhi_cntrl->write_reg = mhi_write_reg; + mhi_cntrl->regs = mhi_bar; + mhi_cntrl->reg_len = SZ_4K; + mhi_cntrl->nr_irqs = 1; + mhi_cntrl->irq = devm_kmalloc(&pci_dev->dev, sizeof(*mhi_cntrl->irq), GFP_KERNEL); + + if (!mhi_cntrl->irq) + return ERR_PTR(-ENOMEM); + + mhi_cntrl->irq[0] = mhi_irq; + mhi_cntrl->fw_image = "qcom/aic100/sbl.bin"; + + /* use latest configured timeout */ + aic100_config.timeout_ms = mhi_timeout_ms; + ret = mhi_register_controller(mhi_cntrl, &aic100_config); + if (ret) { + pci_err(pci_dev, "mhi_register_controller failed %d\n", ret); + return ERR_PTR(ret); + } + + ret = mhi_prepare_for_power_up(mhi_cntrl); + if (ret) { + pci_err(pci_dev, "mhi_prepare_for_power_up failed %d\n", ret); + goto prepare_power_up_fail; + } + + ret = mhi_async_power_up(mhi_cntrl); + /* + * If EIO is returned it is possible that device is in SBL EE, which is + * undesired. SOC reset the device and try to power up again. + */ + if (ret == -EIO && MHI_EE_SBL == mhi_get_exec_env(mhi_cntrl)) { + pci_err(pci_dev, "Found device in SBL at MHI init. 
Attempting a reset.\n"); + ret = mhi_reset_and_async_power_up(mhi_cntrl); + } + + if (ret) { + pci_err(pci_dev, "mhi_async_power_up failed %d\n", ret); + goto power_up_fail; + } + + return mhi_cntrl; + +power_up_fail: + mhi_unprepare_after_power_down(mhi_cntrl); +prepare_power_up_fail: + mhi_unregister_controller(mhi_cntrl); + return ERR_PTR(ret); +} + +void qaic_mhi_free_controller(struct mhi_controller *mhi_cntrl, bool link_up) +{ + mhi_power_down(mhi_cntrl, link_up); + mhi_unprepare_after_power_down(mhi_cntrl); + mhi_unregister_controller(mhi_cntrl); +} + +void qaic_mhi_start_reset(struct mhi_controller *mhi_cntrl) +{ + mhi_power_down(mhi_cntrl, true); +} + +void qaic_mhi_reset_done(struct mhi_controller *mhi_cntrl) +{ + struct pci_dev *pci_dev = container_of(mhi_cntrl->cntrl_dev, struct pci_dev, dev); + int ret; + + ret = mhi_async_power_up(mhi_cntrl); + if (ret) + pci_err(pci_dev, "mhi_async_power_up failed after reset %d\n", ret); +} diff --git a/drivers/accel/qaic/mhi_controller.h b/drivers/accel/qaic/mhi_controller.h new file mode 100644 index 000000000000..2ae45d768e24 --- /dev/null +++ b/drivers/accel/qaic/mhi_controller.h @@ -0,0 +1,16 @@ +/* SPDX-License-Identifier: GPL-2.0-only + * + * Copyright (c) 2019-2020, The Linux Foundation. All rights reserved. + * Copyright (c) 2023 Qualcomm Innovation Center, Inc. All rights reserved. + */ + +#ifndef MHICONTROLLERQAIC_H_ +#define MHICONTROLLERQAIC_H_ + +struct mhi_controller *qaic_mhi_register_controller(struct pci_dev *pci_dev, void __iomem *mhi_bar, + int mhi_irq); +void qaic_mhi_free_controller(struct mhi_controller *mhi_cntrl, bool link_up); +void qaic_mhi_start_reset(struct mhi_controller *mhi_cntrl); +void qaic_mhi_reset_done(struct mhi_controller *mhi_cntrl); + +#endif /* MHICONTROLLERQAIC_H_ */ diff --git a/drivers/accel/qaic/mhi_qaic_ctrl.c b/drivers/accel/qaic/mhi_qaic_ctrl.c new file mode 100644 index 000000000000..0c7e571f1f12 --- /dev/null +++ b/drivers/accel/qaic/mhi_qaic_ctrl.c @@ -0,0 +1,569 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved. */ + +#include <linux/kernel.h> +#include <linux/mhi.h> +#include <linux/mod_devicetable.h> +#include <linux/module.h> +#include <linux/poll.h> +#include <linux/xarray.h> +#include <uapi/linux/eventpoll.h> + +#include "mhi_qaic_ctrl.h" +#include "qaic.h" + +#define MHI_QAIC_CTRL_DRIVER_NAME "mhi_qaic_ctrl" +#define MHI_QAIC_CTRL_MAX_MINORS 128 +#define MHI_MAX_MTU 0xffff +static DEFINE_XARRAY_ALLOC(mqc_xa); +static struct class *mqc_dev_class; +static int mqc_dev_major; + +/** + * struct mqc_buf - Buffer structure used to receive data from device + * @data: Address of data to read from + * @odata: Original address returned from *alloc() API. Used to free this buf. 
+ * @len: Length of data in byte + * @node: This buffer will be part of list managed in struct mqc_dev + */ +struct mqc_buf { + void *data; + void *odata; + size_t len; + struct list_head node; +}; + +/** + * struct mqc_dev - MHI QAIC Control Device + * @minor: MQC device node minor number + * @mhi_dev: Associated mhi device object + * @mtu: Max TRE buffer length + * @enabled: Flag to track the state of the MQC device + * @lock: Mutex lock to serialize access to open_count + * @read_lock: Mutex lock to serialize readers + * @write_lock: Mutex lock to serialize writers + * @ul_wq: Wait queue for writers + * @dl_wq: Wait queue for readers + * @dl_queue_lock: Spin lock to serialize access to download queue + * @dl_queue: Queue of downloaded buffers + * @open_count: Track open counts + * @ref_count: Reference count for this structure + */ +struct mqc_dev { + u32 minor; + struct mhi_device *mhi_dev; + size_t mtu; + bool enabled; + struct mutex lock; + struct mutex read_lock; + struct mutex write_lock; + wait_queue_head_t ul_wq; + wait_queue_head_t dl_wq; + spinlock_t dl_queue_lock; + struct list_head dl_queue; + unsigned int open_count; + struct kref ref_count; +}; + +static void mqc_dev_release(struct kref *ref) +{ + struct mqc_dev *mqcdev = container_of(ref, struct mqc_dev, ref_count); + + mutex_destroy(&mqcdev->read_lock); + mutex_destroy(&mqcdev->write_lock); + mutex_destroy(&mqcdev->lock); + kfree(mqcdev); +} + +static int mhi_qaic_ctrl_fill_dl_queue(struct mqc_dev *mqcdev) +{ + struct mhi_device *mhi_dev = mqcdev->mhi_dev; + struct mqc_buf *ctrlbuf; + int rx_budget; + int ret = 0; + void *data; + + rx_budget = mhi_get_free_desc_count(mhi_dev, DMA_FROM_DEVICE); + if (rx_budget < 0) + return -EIO; + + while (rx_budget--) { + data = kzalloc(mqcdev->mtu + sizeof(*ctrlbuf), GFP_KERNEL); + if (!data) + return -ENOMEM; + + ctrlbuf = data + mqcdev->mtu; + ctrlbuf->odata = data; + + ret = mhi_queue_buf(mhi_dev, DMA_FROM_DEVICE, data, mqcdev->mtu, MHI_EOT); + if (ret) { + kfree(data); + dev_err(&mhi_dev->dev, "Failed to queue buffer\n"); + return ret; + } + } + + return ret; +} + +static int mhi_qaic_ctrl_dev_start_chan(struct mqc_dev *mqcdev) +{ + struct device *dev = &mqcdev->mhi_dev->dev; + int ret = 0; + + ret = mutex_lock_interruptible(&mqcdev->lock); + if (ret) + return ret; + if (!mqcdev->enabled) { + ret = -ENODEV; + goto release_dev_lock; + } + if (!mqcdev->open_count) { + ret = mhi_prepare_for_transfer(mqcdev->mhi_dev); + if (ret) { + dev_err(dev, "Error starting transfer channels\n"); + goto release_dev_lock; + } + + ret = mhi_qaic_ctrl_fill_dl_queue(mqcdev); + if (ret) { + dev_err(dev, "Error filling download queue.\n"); + goto mhi_unprepare; + } + } + mqcdev->open_count++; + mutex_unlock(&mqcdev->lock); + + return 0; + +mhi_unprepare: + mhi_unprepare_from_transfer(mqcdev->mhi_dev); +release_dev_lock: + mutex_unlock(&mqcdev->lock); + return ret; +} + +static struct mqc_dev *mqc_dev_get_by_minor(unsigned int minor) +{ + struct mqc_dev *mqcdev; + + xa_lock(&mqc_xa); + mqcdev = xa_load(&mqc_xa, minor); + if (mqcdev) + kref_get(&mqcdev->ref_count); + xa_unlock(&mqc_xa); + + return mqcdev; +} + +static int mhi_qaic_ctrl_open(struct inode *inode, struct file *filp) +{ + struct mqc_dev *mqcdev; + int ret; + + mqcdev = mqc_dev_get_by_minor(iminor(inode)); + if (!mqcdev) { + pr_debug("mqc: minor %d not found\n", iminor(inode)); + return -EINVAL; + } + + ret = mhi_qaic_ctrl_dev_start_chan(mqcdev); + if (ret) { + kref_put(&mqcdev->ref_count, mqc_dev_release); + return ret; + } + + 
filp->private_data = mqcdev; + + return 0; +} + +static void mhi_qaic_ctrl_buf_free(struct mqc_buf *ctrlbuf) +{ + list_del(&ctrlbuf->node); + kfree(ctrlbuf->odata); +} + +static void __mhi_qaic_ctrl_release(struct mqc_dev *mqcdev) +{ + struct mqc_buf *ctrlbuf, *tmp; + + mhi_unprepare_from_transfer(mqcdev->mhi_dev); + wake_up_interruptible(&mqcdev->ul_wq); + wake_up_interruptible(&mqcdev->dl_wq); + /* + * Free the dl_queue. As we have already unprepared mhi transfers, we + * do not expect any callback functions that update dl_queue hence no need + * to grab dl_queue lock. + */ + mutex_lock(&mqcdev->read_lock); + list_for_each_entry_safe(ctrlbuf, tmp, &mqcdev->dl_queue, node) + mhi_qaic_ctrl_buf_free(ctrlbuf); + mutex_unlock(&mqcdev->read_lock); +} + +static int mhi_qaic_ctrl_release(struct inode *inode, struct file *file) +{ + struct mqc_dev *mqcdev = file->private_data; + + mutex_lock(&mqcdev->lock); + mqcdev->open_count--; + if (!mqcdev->open_count && mqcdev->enabled) + __mhi_qaic_ctrl_release(mqcdev); + mutex_unlock(&mqcdev->lock); + + kref_put(&mqcdev->ref_count, mqc_dev_release); + + return 0; +} + +static __poll_t mhi_qaic_ctrl_poll(struct file *file, poll_table *wait) +{ + struct mqc_dev *mqcdev = file->private_data; + struct mhi_device *mhi_dev; + __poll_t mask = 0; + + mhi_dev = mqcdev->mhi_dev; + + poll_wait(file, &mqcdev->ul_wq, wait); + poll_wait(file, &mqcdev->dl_wq, wait); + + mutex_lock(&mqcdev->lock); + if (!mqcdev->enabled) { + mutex_unlock(&mqcdev->lock); + return EPOLLERR; + } + + spin_lock_bh(&mqcdev->dl_queue_lock); + if (!list_empty(&mqcdev->dl_queue)) + mask |= EPOLLIN | EPOLLRDNORM; + spin_unlock_bh(&mqcdev->dl_queue_lock); + + if (mutex_lock_interruptible(&mqcdev->write_lock)) { + mutex_unlock(&mqcdev->lock); + return EPOLLERR; + } + if (mhi_get_free_desc_count(mhi_dev, DMA_TO_DEVICE) > 0) + mask |= EPOLLOUT | EPOLLWRNORM; + mutex_unlock(&mqcdev->write_lock); + mutex_unlock(&mqcdev->lock); + + dev_dbg(&mhi_dev->dev, "Client attempted to poll, returning mask 0x%x\n", mask); + + return mask; +} + +static int mhi_qaic_ctrl_tx(struct mqc_dev *mqcdev) +{ + int ret; + + ret = wait_event_interruptible(mqcdev->ul_wq, !mqcdev->enabled || + mhi_get_free_desc_count(mqcdev->mhi_dev, DMA_TO_DEVICE) > 0); + + if (!mqcdev->enabled) + return -ENODEV; + + return ret; +} + +static ssize_t mhi_qaic_ctrl_write(struct file *file, const char __user *buf, size_t count, + loff_t *offp) +{ + struct mqc_dev *mqcdev = file->private_data; + struct mhi_device *mhi_dev; + size_t bytes_xfered = 0; + struct device *dev; + int ret, nr_desc; + + mhi_dev = mqcdev->mhi_dev; + dev = &mhi_dev->dev; + + if (!mhi_dev->ul_chan) + return -EOPNOTSUPP; + + if (!buf || !count) + return -EINVAL; + + dev_dbg(dev, "Request to transfer %zu bytes\n", count); + + ret = mhi_qaic_ctrl_tx(mqcdev); + if (ret) + return ret; + + if (mutex_lock_interruptible(&mqcdev->write_lock)) + return -EINTR; + + nr_desc = mhi_get_free_desc_count(mhi_dev, DMA_TO_DEVICE); + if (nr_desc * mqcdev->mtu < count) { + ret = -EMSGSIZE; + dev_dbg(dev, "Buffer too big to transfer\n"); + goto unlock_mutex; + } + + while (count != bytes_xfered) { + enum mhi_flags flags; + size_t to_copy; + void *kbuf; + + to_copy = min_t(size_t, count - bytes_xfered, mqcdev->mtu); + kbuf = kmalloc(to_copy, GFP_KERNEL); + if (!kbuf) { + ret = -ENOMEM; + goto unlock_mutex; + } + + ret = copy_from_user(kbuf, buf + bytes_xfered, to_copy); + if (ret) { + kfree(kbuf); + ret = -EFAULT; + goto unlock_mutex; + } + + if (bytes_xfered + to_copy == count) + flags = MHI_EOT; + 
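+		/*
+		 * All but the final chunk are chained; MHI_EOT on the last
+		 * chunk marks the end of this logical transfer to the device.
+		 */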
else + flags = MHI_CHAIN; + + ret = mhi_queue_buf(mhi_dev, DMA_TO_DEVICE, kbuf, to_copy, flags); + if (ret) { + kfree(kbuf); + dev_err(dev, "Failed to queue buf of size %zu\n", to_copy); + goto unlock_mutex; + } + + bytes_xfered += to_copy; + } + + mutex_unlock(&mqcdev->write_lock); + dev_dbg(dev, "bytes xferred: %zu\n", bytes_xfered); + + return bytes_xfered; + +unlock_mutex: + mutex_unlock(&mqcdev->write_lock); + return ret; +} + +static int mhi_qaic_ctrl_rx(struct mqc_dev *mqcdev) +{ + int ret; + + ret = wait_event_interruptible(mqcdev->dl_wq, + !mqcdev->enabled || !list_empty(&mqcdev->dl_queue)); + + if (!mqcdev->enabled) + return -ENODEV; + + return ret; +} + +static ssize_t mhi_qaic_ctrl_read(struct file *file, char __user *buf, size_t count, loff_t *ppos) +{ + struct mqc_dev *mqcdev = file->private_data; + struct mqc_buf *ctrlbuf; + size_t to_copy; + int ret; + + if (!mqcdev->mhi_dev->dl_chan) + return -EOPNOTSUPP; + + ret = mhi_qaic_ctrl_rx(mqcdev); + if (ret) + return ret; + + if (mutex_lock_interruptible(&mqcdev->read_lock)) + return -EINTR; + + ctrlbuf = list_first_entry_or_null(&mqcdev->dl_queue, struct mqc_buf, node); + if (!ctrlbuf) { + mutex_unlock(&mqcdev->read_lock); + ret = -ENODEV; + goto error_out; + } + + to_copy = min_t(size_t, count, ctrlbuf->len); + if (copy_to_user(buf, ctrlbuf->data, to_copy)) { + mutex_unlock(&mqcdev->read_lock); + dev_dbg(&mqcdev->mhi_dev->dev, "Failed to copy data to user buffer\n"); + ret = -EFAULT; + goto error_out; + } + + ctrlbuf->len -= to_copy; + ctrlbuf->data += to_copy; + + if (!ctrlbuf->len) { + spin_lock_bh(&mqcdev->dl_queue_lock); + mhi_qaic_ctrl_buf_free(ctrlbuf); + spin_unlock_bh(&mqcdev->dl_queue_lock); + mhi_qaic_ctrl_fill_dl_queue(mqcdev); + dev_dbg(&mqcdev->mhi_dev->dev, "Read buf freed\n"); + } + + mutex_unlock(&mqcdev->read_lock); + return to_copy; + +error_out: + mutex_unlock(&mqcdev->read_lock); + return ret; +} + +static const struct file_operations mhidev_fops = { + .owner = THIS_MODULE, + .open = mhi_qaic_ctrl_open, + .release = mhi_qaic_ctrl_release, + .read = mhi_qaic_ctrl_read, + .write = mhi_qaic_ctrl_write, + .poll = mhi_qaic_ctrl_poll, +}; + +static void mhi_qaic_ctrl_ul_xfer_cb(struct mhi_device *mhi_dev, struct mhi_result *mhi_result) +{ + struct mqc_dev *mqcdev = dev_get_drvdata(&mhi_dev->dev); + + dev_dbg(&mhi_dev->dev, "%s: status: %d xfer_len: %zu\n", __func__, + mhi_result->transaction_status, mhi_result->bytes_xferd); + + kfree(mhi_result->buf_addr); + + if (!mhi_result->transaction_status) + wake_up_interruptible(&mqcdev->ul_wq); +} + +static void mhi_qaic_ctrl_dl_xfer_cb(struct mhi_device *mhi_dev, struct mhi_result *mhi_result) +{ + struct mqc_dev *mqcdev = dev_get_drvdata(&mhi_dev->dev); + struct mqc_buf *ctrlbuf; + + dev_dbg(&mhi_dev->dev, "%s: status: %d receive_len: %zu\n", __func__, + mhi_result->transaction_status, mhi_result->bytes_xferd); + + if (mhi_result->transaction_status && + mhi_result->transaction_status != -EOVERFLOW) { + kfree(mhi_result->buf_addr); + return; + } + + ctrlbuf = mhi_result->buf_addr + mqcdev->mtu; + ctrlbuf->data = mhi_result->buf_addr; + ctrlbuf->len = mhi_result->bytes_xferd; + spin_lock_bh(&mqcdev->dl_queue_lock); + list_add_tail(&ctrlbuf->node, &mqcdev->dl_queue); + spin_unlock_bh(&mqcdev->dl_queue_lock); + + wake_up_interruptible(&mqcdev->dl_wq); +} + +static int mhi_qaic_ctrl_probe(struct mhi_device *mhi_dev, const struct mhi_device_id *id) +{ + struct mqc_dev *mqcdev; + struct device *dev; + int ret; + + mqcdev = kzalloc(sizeof(*mqcdev), GFP_KERNEL); + if (!mqcdev) 
+ return -ENOMEM; + + kref_init(&mqcdev->ref_count); + mutex_init(&mqcdev->lock); + mqcdev->mhi_dev = mhi_dev; + + ret = xa_alloc(&mqc_xa, &mqcdev->minor, mqcdev, XA_LIMIT(0, MHI_QAIC_CTRL_MAX_MINORS), + GFP_KERNEL); + if (ret) { + kfree(mqcdev); + return ret; + } + + init_waitqueue_head(&mqcdev->ul_wq); + init_waitqueue_head(&mqcdev->dl_wq); + mutex_init(&mqcdev->read_lock); + mutex_init(&mqcdev->write_lock); + spin_lock_init(&mqcdev->dl_queue_lock); + INIT_LIST_HEAD(&mqcdev->dl_queue); + mqcdev->mtu = min_t(size_t, id->driver_data, MHI_MAX_MTU); + mqcdev->enabled = true; + mqcdev->open_count = 0; + dev_set_drvdata(&mhi_dev->dev, mqcdev); + + dev = device_create(mqc_dev_class, &mhi_dev->dev, MKDEV(mqc_dev_major, mqcdev->minor), + mqcdev, "%s", dev_name(&mhi_dev->dev)); + if (IS_ERR(dev)) { + xa_erase(&mqc_xa, mqcdev->minor); + dev_set_drvdata(&mhi_dev->dev, NULL); + kfree(mqcdev); + return PTR_ERR(dev); + } + + return 0; +}; + +static void mhi_qaic_ctrl_remove(struct mhi_device *mhi_dev) +{ + struct mqc_dev *mqcdev = dev_get_drvdata(&mhi_dev->dev); + + device_destroy(mqc_dev_class, MKDEV(mqc_dev_major, mqcdev->minor)); + + mutex_lock(&mqcdev->lock); + mqcdev->enabled = false; + if (mqcdev->open_count) + __mhi_qaic_ctrl_release(mqcdev); + mutex_unlock(&mqcdev->lock); + + xa_erase(&mqc_xa, mqcdev->minor); + kref_put(&mqcdev->ref_count, mqc_dev_release); +} + +/* .driver_data stores max mtu */ +static const struct mhi_device_id mhi_qaic_ctrl_match_table[] = { + { .chan = "QAIC_SAHARA", .driver_data = SZ_32K}, + {}, +}; +MODULE_DEVICE_TABLE(mhi, mhi_qaic_ctrl_match_table); + +static struct mhi_driver mhi_qaic_ctrl_driver = { + .id_table = mhi_qaic_ctrl_match_table, + .remove = mhi_qaic_ctrl_remove, + .probe = mhi_qaic_ctrl_probe, + .ul_xfer_cb = mhi_qaic_ctrl_ul_xfer_cb, + .dl_xfer_cb = mhi_qaic_ctrl_dl_xfer_cb, + .driver = { + .name = MHI_QAIC_CTRL_DRIVER_NAME, + }, +}; + +int mhi_qaic_ctrl_init(void) +{ + int ret; + + ret = register_chrdev(0, MHI_QAIC_CTRL_DRIVER_NAME, &mhidev_fops); + if (ret < 0) + return ret; + + mqc_dev_major = ret; + mqc_dev_class = class_create(THIS_MODULE, MHI_QAIC_CTRL_DRIVER_NAME); + if (IS_ERR(mqc_dev_class)) { + ret = PTR_ERR(mqc_dev_class); + goto unregister_chrdev; + } + + ret = mhi_driver_register(&mhi_qaic_ctrl_driver); + if (ret) + goto destroy_class; + + return 0; + +destroy_class: + class_destroy(mqc_dev_class); +unregister_chrdev: + unregister_chrdev(mqc_dev_major, MHI_QAIC_CTRL_DRIVER_NAME); + return ret; +} + +void mhi_qaic_ctrl_deinit(void) +{ + mhi_driver_unregister(&mhi_qaic_ctrl_driver); + class_destroy(mqc_dev_class); + unregister_chrdev(mqc_dev_major, MHI_QAIC_CTRL_DRIVER_NAME); + xa_destroy(&mqc_xa); +} diff --git a/drivers/accel/qaic/mhi_qaic_ctrl.h b/drivers/accel/qaic/mhi_qaic_ctrl.h new file mode 100644 index 000000000000..930b3ace1a59 --- /dev/null +++ b/drivers/accel/qaic/mhi_qaic_ctrl.h @@ -0,0 +1,12 @@ +/* SPDX-License-Identifier: GPL-2.0-only + * + * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights reserved. + */ + +#ifndef __MHI_QAIC_CTRL_H__ +#define __MHI_QAIC_CTRL_H__ + +int mhi_qaic_ctrl_init(void); +void mhi_qaic_ctrl_deinit(void); + +#endif /* __MHI_QAIC_CTRL_H__ */ diff --git a/drivers/accel/qaic/qaic.h b/drivers/accel/qaic/qaic.h new file mode 100644 index 000000000000..f2bd637a0d4e --- /dev/null +++ b/drivers/accel/qaic/qaic.h @@ -0,0 +1,282 @@ +/* SPDX-License-Identifier: GPL-2.0-only + * + * Copyright (c) 2019-2021, The Linux Foundation. All rights reserved. 
+ * Copyright (c) 2021-2023 Qualcomm Innovation Center, Inc. All rights reserved. + */ + +#ifndef _QAIC_H_ +#define _QAIC_H_ + +#include <linux/interrupt.h> +#include <linux/kref.h> +#include <linux/mhi.h> +#include <linux/mutex.h> +#include <linux/pci.h> +#include <linux/spinlock.h> +#include <linux/srcu.h> +#include <linux/wait.h> +#include <linux/workqueue.h> +#include <drm/drm_device.h> +#include <drm/drm_gem.h> + +#define QAIC_DBC_BASE SZ_128K +#define QAIC_DBC_SIZE SZ_4K + +#define QAIC_NO_PARTITION -1 + +#define QAIC_DBC_OFF(i) ((i) * QAIC_DBC_SIZE + QAIC_DBC_BASE) + +#define to_qaic_bo(obj) container_of(obj, struct qaic_bo, base) + +extern bool datapath_polling; + +struct qaic_user { + /* Uniquely identifies this user for the device */ + int handle; + struct kref ref_count; + /* Char device opened by this user */ + struct qaic_drm_device *qddev; + /* Node in list of users that opened this drm device */ + struct list_head node; + /* SRCU used to synchronize this user during cleanup */ + struct srcu_struct qddev_lock; + atomic_t chunk_id; +}; + +struct dma_bridge_chan { + /* Pointer to device strcut maintained by driver */ + struct qaic_device *qdev; + /* ID of this DMA bridge channel(DBC) */ + unsigned int id; + /* Synchronizes access to xfer_list */ + spinlock_t xfer_lock; + /* Base address of request queue */ + void *req_q_base; + /* Base address of response queue */ + void *rsp_q_base; + /* + * Base bus address of request queue. Response queue bus address can be + * calculated by adding request queue size to this variable + */ + dma_addr_t dma_addr; + /* Total size of request and response queue in byte */ + u32 total_size; + /* Capacity of request/response queue */ + u32 nelem; + /* The user that opened this DBC */ + struct qaic_user *usr; + /* + * Request ID of next memory handle that goes in request queue. One + * memory handle can enqueue more than one request elements, all + * this requests that belong to same memory handle have same request ID + */ + u16 next_req_id; + /* true: DBC is in use; false: DBC not in use */ + bool in_use; + /* + * Base address of device registers. Used to read/write request and + * response queue's head and tail pointer of this DBC. + */ + void __iomem *dbc_base; + /* Head of list where each node is a memory handle queued in request queue */ + struct list_head xfer_list; + /* Synchronizes DBC readers during cleanup */ + struct srcu_struct ch_lock; + /* + * When this DBC is released, any thread waiting on this wait queue is + * woken up + */ + wait_queue_head_t dbc_release; + /* Head of list where each node is a bo associated with this DBC */ + struct list_head bo_lists; + /* The irq line for this DBC. Used for polling */ + unsigned int irq; + /* Polling work item to simulate interrupts */ + struct work_struct poll_work; +}; + +struct qaic_device { + /* Pointer to base PCI device struct of our physical device */ + struct pci_dev *pdev; + /* Req. 
ID of request that will be queued next in MHI control device */ + u32 next_seq_num; + /* Base address of bar 0 */ + void __iomem *bar_0; + /* Base address of bar 2 */ + void __iomem *bar_2; + /* Controller structure for MHI devices */ + struct mhi_controller *mhi_cntrl; + /* MHI control channel device */ + struct mhi_device *cntl_ch; + /* List of requests queued in MHI control device */ + struct list_head cntl_xfer_list; + /* Synchronizes MHI control device transactions and its xfer list */ + struct mutex cntl_mutex; + /* Array of DBC struct of this device */ + struct dma_bridge_chan *dbc; + /* Work queue for tasks related to MHI control device */ + struct workqueue_struct *cntl_wq; + /* Synchronizes all the users of device during cleanup */ + struct srcu_struct dev_lock; + /* true: Device under reset; false: Device not under reset */ + bool in_reset; + /* + * true: A tx MHI transaction has failed and a rx buffer is still queued + * in control device. Such a buffer is considered lost rx buffer + * false: No rx buffer is lost in control device + */ + bool cntl_lost_buf; + /* Maximum number of DBC supported by this device */ + u32 num_dbc; + /* Reference to the drm_device for this device when it is created */ + struct qaic_drm_device *qddev; + /* Generate the CRC of a control message */ + u32 (*gen_crc)(void *msg); + /* Validate the CRC of a control message */ + bool (*valid_crc)(void *msg); +}; + +struct qaic_drm_device { + /* Pointer to the root device struct driven by this driver */ + struct qaic_device *qdev; + /* + * The physical device can be partition in number of logical devices. + * And each logical device is given a partition id. This member stores + * that id. QAIC_NO_PARTITION is a sentinel used to mark that this drm + * device is the actual physical device + */ + s32 partition_id; + /* Pointer to the drm device struct of this drm device */ + struct drm_device *ddev; + /* Head in list of users who have opened this drm device */ + struct list_head users; + /* Synchronizes access to users list */ + struct mutex users_mutex; +}; + +struct qaic_bo { + struct drm_gem_object base; + /* Scatter/gather table for allocate/imported BO */ + struct sg_table *sgt; + /* BO size requested by user. GEM object might be bigger in size. */ + u64 size; + /* Head in list of slices of this BO */ + struct list_head slices; + /* Total nents, for all slices of this BO */ + int total_slice_nents; + /* + * Direction of transfer. It can assume only two value DMA_TO_DEVICE and + * DMA_FROM_DEVICE. + */ + int dir; + /* The pointer of the DBC which operates on this BO */ + struct dma_bridge_chan *dbc; + /* Number of slice that belongs to this buffer */ + u32 nr_slice; + /* Number of slice that have been transferred by DMA engine */ + u32 nr_slice_xfer_done; + /* true = BO is queued for execution, true = BO is not queued */ + bool queued; + /* + * If true then user has attached slicing information to this BO by + * calling DRM_IOCTL_QAIC_ATTACH_SLICE_BO ioctl. + */ + bool sliced; + /* Request ID of this BO if it is queued for execution */ + u16 req_id; + /* Handle assigned to this BO */ + u32 handle; + /* Wait on this for completion of DMA transfer of this BO */ + struct completion xfer_done; + /* + * Node in linked list where head is dbc->xfer_list. + * This link list contain BO's that are queued for DMA transfer. + */ + struct list_head xfer_list; + /* + * Node in linked list where head is dbc->bo_lists. + * This link list contain BO's that are associated with the DBC it is + * linked to. 
+ */ + struct list_head bo_list; + struct { + /* + * Latest timestamp(ns) at which kernel received a request to + * execute this BO + */ + u64 req_received_ts; + /* + * Latest timestamp(ns) at which kernel enqueued requests of + * this BO for execution in DMA queue + */ + u64 req_submit_ts; + /* + * Latest timestamp(ns) at which kernel received a completion + * interrupt for requests of this BO + */ + u64 req_processed_ts; + /* + * Number of elements already enqueued in DMA queue before + * enqueuing requests of this BO + */ + u32 queue_level_before; + } perf_stats; + +}; + +struct bo_slice { + /* Mapped pages */ + struct sg_table *sgt; + /* Number of requests required to queue in DMA queue */ + int nents; + /* See enum dma_data_direction */ + int dir; + /* Actual requests that will be copied in DMA queue */ + struct dbc_req *reqs; + struct kref ref_count; + /* true: No DMA transfer required */ + bool no_xfer; + /* Pointer to the parent BO handle */ + struct qaic_bo *bo; + /* Node in list of slices maintained by parent BO */ + struct list_head slice; + /* Size of this slice in bytes */ + u64 size; + /* Offset of this slice in buffer */ + u64 offset; +}; + +int get_dbc_req_elem_size(void); +int get_dbc_rsp_elem_size(void); +int get_cntl_version(struct qaic_device *qdev, struct qaic_user *usr, u16 *major, u16 *minor); +int qaic_manage_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); +void qaic_mhi_ul_xfer_cb(struct mhi_device *mhi_dev, struct mhi_result *mhi_result); + +void qaic_mhi_dl_xfer_cb(struct mhi_device *mhi_dev, struct mhi_result *mhi_result); + +int qaic_control_open(struct qaic_device *qdev); +void qaic_control_close(struct qaic_device *qdev); +void qaic_release_usr(struct qaic_device *qdev, struct qaic_user *usr); + +irqreturn_t dbc_irq_threaded_fn(int irq, void *data); +irqreturn_t dbc_irq_handler(int irq, void *data); +int disable_dbc(struct qaic_device *qdev, u32 dbc_id, struct qaic_user *usr); +void enable_dbc(struct qaic_device *qdev, u32 dbc_id, struct qaic_user *usr); +void wakeup_dbc(struct qaic_device *qdev, u32 dbc_id); +void release_dbc(struct qaic_device *qdev, u32 dbc_id); + +void wake_all_cntl(struct qaic_device *qdev); +void qaic_dev_reset_clean_local_state(struct qaic_device *qdev, bool exit_reset); + +struct drm_gem_object *qaic_gem_prime_import(struct drm_device *dev, struct dma_buf *dma_buf); + +int qaic_create_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); +int qaic_mmap_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); +int qaic_attach_slice_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); +int qaic_execute_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); +int qaic_partial_execute_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); +int qaic_wait_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); +int qaic_perf_stats_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); +void irq_polling_work(struct work_struct *work); + +#endif /* _QAIC_H_ */ diff --git a/drivers/accel/qaic/qaic_control.c b/drivers/accel/qaic/qaic_control.c new file mode 100644 index 000000000000..9f216eb6f76e --- /dev/null +++ b/drivers/accel/qaic/qaic_control.c @@ -0,0 +1,1526 @@ +// SPDX-License-Identifier: GPL-2.0-only + +/* Copyright (c) 2019-2021, The Linux Foundation. All rights reserved. */ +/* Copyright (c) 2021-2023 Qualcomm Innovation Center, Inc. All rights reserved. 
*/ + +#include <asm/byteorder.h> +#include <linux/completion.h> +#include <linux/crc32.h> +#include <linux/delay.h> +#include <linux/dma-mapping.h> +#include <linux/kref.h> +#include <linux/list.h> +#include <linux/mhi.h> +#include <linux/mm.h> +#include <linux/moduleparam.h> +#include <linux/mutex.h> +#include <linux/pci.h> +#include <linux/scatterlist.h> +#include <linux/types.h> +#include <linux/uaccess.h> +#include <linux/workqueue.h> +#include <linux/wait.h> +#include <drm/drm_device.h> +#include <drm/drm_file.h> +#include <uapi/drm/qaic_accel.h> + +#include "qaic.h" + +#define MANAGE_MAGIC_NUMBER ((__force __le32)0x43494151) /* "QAIC" in little endian */ +#define QAIC_DBC_Q_GAP SZ_256 +#define QAIC_DBC_Q_BUF_ALIGN SZ_4K +#define QAIC_MANAGE_EXT_MSG_LENGTH SZ_64K /* Max DMA message length */ +#define QAIC_WRAPPER_MAX_SIZE SZ_4K +#define QAIC_MHI_RETRY_WAIT_MS 100 +#define QAIC_MHI_RETRY_MAX 20 + +static unsigned int control_resp_timeout_s = 60; /* 60 sec default */ +module_param(control_resp_timeout_s, uint, 0600); +MODULE_PARM_DESC(control_resp_timeout_s, "Timeout for NNC responses from QSM"); + +struct manage_msg { + u32 len; + u32 count; + u8 data[]; +}; + +/* + * wire encoding structures for the manage protocol. + * All fields are little endian on the wire + */ +struct wire_msg_hdr { + __le32 crc32; /* crc of everything following this field in the message */ + __le32 magic_number; + __le32 sequence_number; + __le32 len; /* length of this message */ + __le32 count; /* number of transactions in this message */ + __le32 handle; /* unique id to track the resources consumed */ + __le32 partition_id; /* partition id for the request (signed) */ + __le32 padding; /* must be 0 */ +} __packed; + +struct wire_msg { + struct wire_msg_hdr hdr; + u8 data[]; +} __packed; + +struct wire_trans_hdr { + __le32 type; + __le32 len; +} __packed; + +/* Each message sent from driver to device are organized in a list of wrapper_msg */ +struct wrapper_msg { + struct list_head list; + struct kref ref_count; + u32 len; /* length of data to transfer */ + struct wrapper_list *head; + union { + struct wire_msg msg; + struct wire_trans_hdr trans; + }; +}; + +struct wrapper_list { + struct list_head list; + spinlock_t lock; /* Protects the list state during additions and removals */ +}; + +struct wire_trans_passthrough { + struct wire_trans_hdr hdr; + u8 data[]; +} __packed; + +struct wire_addr_size_pair { + __le64 addr; + __le64 size; +} __packed; + +struct wire_trans_dma_xfer { + struct wire_trans_hdr hdr; + __le32 tag; + __le32 count; + __le32 dma_chunk_id; + __le32 padding; + struct wire_addr_size_pair data[]; +} __packed; + +/* Initiated by device to continue the DMA xfer of a large piece of data */ +struct wire_trans_dma_xfer_cont { + struct wire_trans_hdr hdr; + __le32 dma_chunk_id; + __le32 padding; + __le64 xferred_size; +} __packed; + +struct wire_trans_activate_to_dev { + struct wire_trans_hdr hdr; + __le64 req_q_addr; + __le64 rsp_q_addr; + __le32 req_q_size; + __le32 rsp_q_size; + __le32 buf_len; + __le32 options; /* unused, but BIT(16) has meaning to the device */ +} __packed; + +struct wire_trans_activate_from_dev { + struct wire_trans_hdr hdr; + __le32 status; + __le32 dbc_id; + __le64 options; /* unused */ +} __packed; + +struct wire_trans_deactivate_from_dev { + struct wire_trans_hdr hdr; + __le32 status; + __le32 dbc_id; +} __packed; + +struct wire_trans_terminate_to_dev { + struct wire_trans_hdr hdr; + __le32 handle; + __le32 padding; +} __packed; + +struct wire_trans_terminate_from_dev { + 
struct wire_trans_hdr hdr; + __le32 status; + __le32 padding; +} __packed; + +struct wire_trans_status_to_dev { + struct wire_trans_hdr hdr; +} __packed; + +struct wire_trans_status_from_dev { + struct wire_trans_hdr hdr; + __le16 major; + __le16 minor; + __le32 status; + __le64 status_flags; +} __packed; + +struct wire_trans_validate_part_to_dev { + struct wire_trans_hdr hdr; + __le32 part_id; + __le32 padding; +} __packed; + +struct wire_trans_validate_part_from_dev { + struct wire_trans_hdr hdr; + __le32 status; + __le32 padding; +} __packed; + +struct xfer_queue_elem { + /* + * Node in list of ongoing transfer request on control channel. + * Maintained by root device struct. + */ + struct list_head list; + /* Sequence number of this transfer request */ + u32 seq_num; + /* This is used to wait on until completion of transfer request */ + struct completion xfer_done; + /* Received data from device */ + void *buf; +}; + +struct dma_xfer { + /* Node in list of DMA transfers which is used for cleanup */ + struct list_head list; + /* SG table of memory used for DMA */ + struct sg_table *sgt; + /* Array pages used for DMA */ + struct page **page_list; + /* Number of pages used for DMA */ + unsigned long nr_pages; +}; + +struct ioctl_resources { + /* List of all DMA transfers which is used later for cleanup */ + struct list_head dma_xfers; + /* Base address of request queue which belongs to a DBC */ + void *buf; + /* + * Base bus address of request queue which belongs to a DBC. Response + * queue base bus address can be calculated by adding size of request + * queue to base bus address of request queue. + */ + dma_addr_t dma_addr; + /* Total size of request queue and response queue in byte */ + u32 total_size; + /* Total number of elements that can be queued in each of request and response queue */ + u32 nelem; + /* Base address of response queue which belongs to a DBC */ + void *rsp_q_base; + /* Status of the NNC message received */ + u32 status; + /* DBC id of the DBC received from device */ + u32 dbc_id; + /* + * DMA transfer request messages can be big in size and it may not be + * possible to send them in one shot. In such cases the messages are + * broken into chunks, this field stores ID of such chunks. + */ + u32 dma_chunk_id; + /* Total number of bytes transferred for a DMA xfer request */ + u64 xferred_dma_size; + /* Header of transaction message received from user. Used during DMA xfer request. */ + void *trans_hdr; +}; + +struct resp_work { + struct work_struct work; + struct qaic_device *qdev; + void *buf; +}; + +/* + * Since we're working with little endian messages, its useful to be able to + * increment without filling a whole line with conversions back and forth just + * to add one(1) to a message count. + */ +static __le32 incr_le32(__le32 val) +{ + return cpu_to_le32(le32_to_cpu(val) + 1); +} + +static u32 gen_crc(void *msg) +{ + struct wrapper_list *wrappers = msg; + struct wrapper_msg *w; + u32 crc = ~0; + + list_for_each_entry(w, &wrappers->list, list) + crc = crc32(crc, &w->msg, w->len); + + return crc ^ ~0; +} + +static u32 gen_crc_stub(void *msg) +{ + return 0; +} + +static bool valid_crc(void *msg) +{ + struct wire_msg_hdr *hdr = msg; + bool ret; + u32 crc; + + /* + * The output of this algorithm is always converted to the native + * endianness. 
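+ *
+ * The CRC is expected to have been computed with the crc32 field itself
+ * zeroed, so the check below clears the field, recomputes crc32 over
+ * hdr->len bytes of the message, and then restores the original value.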
+ */ + crc = le32_to_cpu(hdr->crc32); + hdr->crc32 = 0; + ret = (crc32(~0, msg, le32_to_cpu(hdr->len)) ^ ~0) == crc; + hdr->crc32 = cpu_to_le32(crc); + return ret; +} + +static bool valid_crc_stub(void *msg) +{ + return true; +} + +static void free_wrapper(struct kref *ref) +{ + struct wrapper_msg *wrapper = container_of(ref, struct wrapper_msg, ref_count); + + list_del(&wrapper->list); + kfree(wrapper); +} + +static void save_dbc_buf(struct qaic_device *qdev, struct ioctl_resources *resources, + struct qaic_user *usr) +{ + u32 dbc_id = resources->dbc_id; + + if (resources->buf) { + wait_event_interruptible(qdev->dbc[dbc_id].dbc_release, !qdev->dbc[dbc_id].in_use); + qdev->dbc[dbc_id].req_q_base = resources->buf; + qdev->dbc[dbc_id].rsp_q_base = resources->rsp_q_base; + qdev->dbc[dbc_id].dma_addr = resources->dma_addr; + qdev->dbc[dbc_id].total_size = resources->total_size; + qdev->dbc[dbc_id].nelem = resources->nelem; + enable_dbc(qdev, dbc_id, usr); + qdev->dbc[dbc_id].in_use = true; + resources->buf = NULL; + } +} + +static void free_dbc_buf(struct qaic_device *qdev, struct ioctl_resources *resources) +{ + if (resources->buf) + dma_free_coherent(&qdev->pdev->dev, resources->total_size, resources->buf, + resources->dma_addr); + resources->buf = NULL; +} + +static void free_dma_xfers(struct qaic_device *qdev, struct ioctl_resources *resources) +{ + struct dma_xfer *xfer; + struct dma_xfer *x; + int i; + + list_for_each_entry_safe(xfer, x, &resources->dma_xfers, list) { + dma_unmap_sgtable(&qdev->pdev->dev, xfer->sgt, DMA_TO_DEVICE, 0); + sg_free_table(xfer->sgt); + kfree(xfer->sgt); + for (i = 0; i < xfer->nr_pages; ++i) + put_page(xfer->page_list[i]); + kfree(xfer->page_list); + list_del(&xfer->list); + kfree(xfer); + } +} + +static struct wrapper_msg *add_wrapper(struct wrapper_list *wrappers, u32 size) +{ + struct wrapper_msg *w = kzalloc(size, GFP_KERNEL); + + if (!w) + return NULL; + list_add_tail(&w->list, &wrappers->list); + kref_init(&w->ref_count); + w->head = wrappers; + return w; +} + +static int encode_passthrough(struct qaic_device *qdev, void *trans, struct wrapper_list *wrappers, + u32 *user_len) +{ + struct qaic_manage_trans_passthrough *in_trans = trans; + struct wire_trans_passthrough *out_trans; + struct wrapper_msg *trans_wrapper; + struct wrapper_msg *wrapper; + struct wire_msg *msg; + u32 msg_hdr_len; + + wrapper = list_first_entry(&wrappers->list, struct wrapper_msg, list); + msg = &wrapper->msg; + msg_hdr_len = le32_to_cpu(msg->hdr.len); + + if (in_trans->hdr.len % 8 != 0) + return -EINVAL; + + if (msg_hdr_len + in_trans->hdr.len > QAIC_MANAGE_EXT_MSG_LENGTH) + return -ENOSPC; + + trans_wrapper = add_wrapper(wrappers, + offsetof(struct wrapper_msg, trans) + in_trans->hdr.len); + if (!trans_wrapper) + return -ENOMEM; + trans_wrapper->len = in_trans->hdr.len; + out_trans = (struct wire_trans_passthrough *)&trans_wrapper->trans; + + memcpy(out_trans->data, in_trans->data, in_trans->hdr.len - sizeof(in_trans->hdr)); + msg->hdr.len = cpu_to_le32(msg_hdr_len + in_trans->hdr.len); + msg->hdr.count = incr_le32(msg->hdr.count); + *user_len += in_trans->hdr.len; + out_trans->hdr.type = cpu_to_le32(QAIC_TRANS_PASSTHROUGH_TO_DEV); + out_trans->hdr.len = cpu_to_le32(in_trans->hdr.len); + + return 0; +} + +/* returns error code for failure, 0 if enough pages alloc'd, 1 if dma_cont is needed */ +static int find_and_map_user_pages(struct qaic_device *qdev, + struct qaic_manage_trans_dma_xfer *in_trans, + struct ioctl_resources *resources, struct dma_xfer *xfer) +{ + unsigned long 
need_pages; + struct page **page_list; + unsigned long nr_pages; + struct sg_table *sgt; + u64 xfer_start_addr; + int ret; + int i; + + xfer_start_addr = in_trans->addr + resources->xferred_dma_size; + + need_pages = DIV_ROUND_UP(in_trans->size + offset_in_page(xfer_start_addr) - + resources->xferred_dma_size, PAGE_SIZE); + + nr_pages = need_pages; + + while (1) { + page_list = kmalloc_array(nr_pages, sizeof(*page_list), GFP_KERNEL | __GFP_NOWARN); + if (!page_list) { + nr_pages = nr_pages / 2; + if (!nr_pages) + return -ENOMEM; + } else { + break; + } + } + + ret = get_user_pages_fast(xfer_start_addr, nr_pages, 0, page_list); + if (ret < 0 || ret != nr_pages) { + ret = -EFAULT; + goto free_page_list; + } + + sgt = kmalloc(sizeof(*sgt), GFP_KERNEL); + if (!sgt) { + ret = -ENOMEM; + goto put_pages; + } + + ret = sg_alloc_table_from_pages(sgt, page_list, nr_pages, + offset_in_page(xfer_start_addr), + in_trans->size - resources->xferred_dma_size, GFP_KERNEL); + if (ret) { + ret = -ENOMEM; + goto free_sgt; + } + + ret = dma_map_sgtable(&qdev->pdev->dev, sgt, DMA_TO_DEVICE, 0); + if (ret) + goto free_table; + + xfer->sgt = sgt; + xfer->page_list = page_list; + xfer->nr_pages = nr_pages; + + return need_pages > nr_pages ? 1 : 0; + +free_table: + sg_free_table(sgt); +free_sgt: + kfree(sgt); +put_pages: + for (i = 0; i < nr_pages; ++i) + put_page(page_list[i]); +free_page_list: + kfree(page_list); + return ret; +} + +/* returns error code for failure, 0 if everything was encoded, 1 if dma_cont is needed */ +static int encode_addr_size_pairs(struct dma_xfer *xfer, struct wrapper_list *wrappers, + struct ioctl_resources *resources, u32 msg_hdr_len, u32 *size, + struct wire_trans_dma_xfer **out_trans) +{ + struct wrapper_msg *trans_wrapper; + struct sg_table *sgt = xfer->sgt; + struct wire_addr_size_pair *asp; + struct scatterlist *sg; + struct wrapper_msg *w; + unsigned int dma_len; + u64 dma_chunk_len; + void *boundary; + int nents_dma; + int nents; + int i; + + nents = sgt->nents; + nents_dma = nents; + *size = QAIC_MANAGE_EXT_MSG_LENGTH - msg_hdr_len - sizeof(**out_trans); + for_each_sgtable_sg(sgt, sg, i) { + *size -= sizeof(*asp); + /* Save 1K for possible follow-up transactions. */ + if (*size < SZ_1K) { + nents_dma = i; + break; + } + } + + trans_wrapper = add_wrapper(wrappers, QAIC_WRAPPER_MAX_SIZE); + if (!trans_wrapper) + return -ENOMEM; + *out_trans = (struct wire_trans_dma_xfer *)&trans_wrapper->trans; + + asp = (*out_trans)->data; + boundary = (void *)trans_wrapper + QAIC_WRAPPER_MAX_SIZE; + *size = 0; + + dma_len = 0; + w = trans_wrapper; + dma_chunk_len = 0; + for_each_sg(sgt->sgl, sg, nents_dma, i) { + asp->size = cpu_to_le64(dma_len); + dma_chunk_len += dma_len; + if (dma_len) { + asp++; + if ((void *)asp + sizeof(*asp) > boundary) { + w->len = (void *)asp - (void *)&w->msg; + *size += w->len; + w = add_wrapper(wrappers, QAIC_WRAPPER_MAX_SIZE); + if (!w) + return -ENOMEM; + boundary = (void *)w + QAIC_WRAPPER_MAX_SIZE; + asp = (struct wire_addr_size_pair *)&w->msg; + } + } + asp->addr = cpu_to_le64(sg_dma_address(sg)); + dma_len = sg_dma_len(sg); + } + /* finalize the last segment */ + asp->size = cpu_to_le64(dma_len); + w->len = (void *)asp + sizeof(*asp) - (void *)&w->msg; + *size += w->len; + dma_chunk_len += dma_len; + resources->xferred_dma_size += dma_chunk_len; + + return nents_dma < nents ? 
1 : 0; +} + +static void cleanup_xfer(struct qaic_device *qdev, struct dma_xfer *xfer) +{ + int i; + + dma_unmap_sgtable(&qdev->pdev->dev, xfer->sgt, DMA_TO_DEVICE, 0); + sg_free_table(xfer->sgt); + kfree(xfer->sgt); + for (i = 0; i < xfer->nr_pages; ++i) + put_page(xfer->page_list[i]); + kfree(xfer->page_list); +} + +static int encode_dma(struct qaic_device *qdev, void *trans, struct wrapper_list *wrappers, + u32 *user_len, struct ioctl_resources *resources, struct qaic_user *usr) +{ + struct qaic_manage_trans_dma_xfer *in_trans = trans; + struct wire_trans_dma_xfer *out_trans; + struct wrapper_msg *wrapper; + struct dma_xfer *xfer; + struct wire_msg *msg; + bool need_cont_dma; + u32 msg_hdr_len; + u32 size; + int ret; + + wrapper = list_first_entry(&wrappers->list, struct wrapper_msg, list); + msg = &wrapper->msg; + msg_hdr_len = le32_to_cpu(msg->hdr.len); + + if (msg_hdr_len > (UINT_MAX - QAIC_MANAGE_EXT_MSG_LENGTH)) + return -EINVAL; + + /* There should be enough space to hold at least one ASP entry. */ + if (msg_hdr_len + sizeof(*out_trans) + sizeof(struct wire_addr_size_pair) > + QAIC_MANAGE_EXT_MSG_LENGTH) + return -ENOMEM; + + if (in_trans->addr + in_trans->size < in_trans->addr || !in_trans->size) + return -EINVAL; + + xfer = kmalloc(sizeof(*xfer), GFP_KERNEL); + if (!xfer) + return -ENOMEM; + + ret = find_and_map_user_pages(qdev, in_trans, resources, xfer); + if (ret < 0) + goto free_xfer; + + need_cont_dma = (bool)ret; + + ret = encode_addr_size_pairs(xfer, wrappers, resources, msg_hdr_len, &size, &out_trans); + if (ret < 0) + goto cleanup_xfer; + + need_cont_dma = need_cont_dma || (bool)ret; + + msg->hdr.len = cpu_to_le32(msg_hdr_len + size); + msg->hdr.count = incr_le32(msg->hdr.count); + + out_trans->hdr.type = cpu_to_le32(QAIC_TRANS_DMA_XFER_TO_DEV); + out_trans->hdr.len = cpu_to_le32(size); + out_trans->tag = cpu_to_le32(in_trans->tag); + out_trans->count = cpu_to_le32((size - sizeof(*out_trans)) / + sizeof(struct wire_addr_size_pair)); + + *user_len += in_trans->hdr.len; + + if (resources->dma_chunk_id) { + out_trans->dma_chunk_id = cpu_to_le32(resources->dma_chunk_id); + } else if (need_cont_dma) { + while (resources->dma_chunk_id == 0) + resources->dma_chunk_id = atomic_inc_return(&usr->chunk_id); + + out_trans->dma_chunk_id = cpu_to_le32(resources->dma_chunk_id); + } + resources->trans_hdr = trans; + + list_add(&xfer->list, &resources->dma_xfers); + return 0; + +cleanup_xfer: + cleanup_xfer(qdev, xfer); +free_xfer: + kfree(xfer); + return ret; +} + +static int encode_activate(struct qaic_device *qdev, void *trans, struct wrapper_list *wrappers, + u32 *user_len, struct ioctl_resources *resources) +{ + struct qaic_manage_trans_activate_to_dev *in_trans = trans; + struct wire_trans_activate_to_dev *out_trans; + struct wrapper_msg *trans_wrapper; + struct wrapper_msg *wrapper; + struct wire_msg *msg; + dma_addr_t dma_addr; + u32 msg_hdr_len; + void *buf; + u32 nelem; + u32 size; + int ret; + + wrapper = list_first_entry(&wrappers->list, struct wrapper_msg, list); + msg = &wrapper->msg; + msg_hdr_len = le32_to_cpu(msg->hdr.len); + + if (msg_hdr_len + sizeof(*out_trans) > QAIC_MANAGE_MAX_MSG_LENGTH) + return -ENOSPC; + + if (!in_trans->queue_size) + return -EINVAL; + + if (in_trans->pad) + return -EINVAL; + + nelem = in_trans->queue_size; + size = (get_dbc_req_elem_size() + get_dbc_rsp_elem_size()) * nelem; + if (size / nelem != get_dbc_req_elem_size() + get_dbc_rsp_elem_size()) + return -EINVAL; + + if (size + QAIC_DBC_Q_GAP + QAIC_DBC_Q_BUF_ALIGN < size) + return -EINVAL; + 
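+	/*
+	 * Both queues share one coherent allocation: the request queue at
+	 * the start and the response queue at the end (see rsp_q_addr
+	 * below), with at least QAIC_DBC_Q_GAP bytes in between and the
+	 * total size rounded up to QAIC_DBC_Q_BUF_ALIGN.
+	 */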
+ size = ALIGN((size + QAIC_DBC_Q_GAP), QAIC_DBC_Q_BUF_ALIGN); + + buf = dma_alloc_coherent(&qdev->pdev->dev, size, &dma_addr, GFP_KERNEL); + if (!buf) + return -ENOMEM; + + trans_wrapper = add_wrapper(wrappers, + offsetof(struct wrapper_msg, trans) + sizeof(*out_trans)); + if (!trans_wrapper) { + ret = -ENOMEM; + goto free_dma; + } + trans_wrapper->len = sizeof(*out_trans); + out_trans = (struct wire_trans_activate_to_dev *)&trans_wrapper->trans; + + out_trans->hdr.type = cpu_to_le32(QAIC_TRANS_ACTIVATE_TO_DEV); + out_trans->hdr.len = cpu_to_le32(sizeof(*out_trans)); + out_trans->buf_len = cpu_to_le32(size); + out_trans->req_q_addr = cpu_to_le64(dma_addr); + out_trans->req_q_size = cpu_to_le32(nelem); + out_trans->rsp_q_addr = cpu_to_le64(dma_addr + size - nelem * get_dbc_rsp_elem_size()); + out_trans->rsp_q_size = cpu_to_le32(nelem); + out_trans->options = cpu_to_le32(in_trans->options); + + *user_len += in_trans->hdr.len; + msg->hdr.len = cpu_to_le32(msg_hdr_len + sizeof(*out_trans)); + msg->hdr.count = incr_le32(msg->hdr.count); + + resources->buf = buf; + resources->dma_addr = dma_addr; + resources->total_size = size; + resources->nelem = nelem; + resources->rsp_q_base = buf + size - nelem * get_dbc_rsp_elem_size(); + return 0; + +free_dma: + dma_free_coherent(&qdev->pdev->dev, size, buf, dma_addr); + return ret; +} + +static int encode_deactivate(struct qaic_device *qdev, void *trans, + u32 *user_len, struct qaic_user *usr) +{ + struct qaic_manage_trans_deactivate *in_trans = trans; + + if (in_trans->dbc_id >= qdev->num_dbc || in_trans->pad) + return -EINVAL; + + *user_len += in_trans->hdr.len; + + return disable_dbc(qdev, in_trans->dbc_id, usr); +} + +static int encode_status(struct qaic_device *qdev, void *trans, struct wrapper_list *wrappers, + u32 *user_len) +{ + struct qaic_manage_trans_status_to_dev *in_trans = trans; + struct wire_trans_status_to_dev *out_trans; + struct wrapper_msg *trans_wrapper; + struct wrapper_msg *wrapper; + struct wire_msg *msg; + u32 msg_hdr_len; + + wrapper = list_first_entry(&wrappers->list, struct wrapper_msg, list); + msg = &wrapper->msg; + msg_hdr_len = le32_to_cpu(msg->hdr.len); + + if (msg_hdr_len + in_trans->hdr.len > QAIC_MANAGE_MAX_MSG_LENGTH) + return -ENOSPC; + + trans_wrapper = add_wrapper(wrappers, sizeof(*trans_wrapper)); + if (!trans_wrapper) + return -ENOMEM; + + trans_wrapper->len = sizeof(*out_trans); + out_trans = (struct wire_trans_status_to_dev *)&trans_wrapper->trans; + + out_trans->hdr.type = cpu_to_le32(QAIC_TRANS_STATUS_TO_DEV); + out_trans->hdr.len = cpu_to_le32(in_trans->hdr.len); + msg->hdr.len = cpu_to_le32(msg_hdr_len + in_trans->hdr.len); + msg->hdr.count = incr_le32(msg->hdr.count); + *user_len += in_trans->hdr.len; + + return 0; +} + +static int encode_message(struct qaic_device *qdev, struct manage_msg *user_msg, + struct wrapper_list *wrappers, struct ioctl_resources *resources, + struct qaic_user *usr) +{ + struct qaic_manage_trans_hdr *trans_hdr; + struct wrapper_msg *wrapper; + struct wire_msg *msg; + u32 user_len = 0; + int ret; + int i; + + if (!user_msg->count) { + ret = -EINVAL; + goto out; + } + + wrapper = list_first_entry(&wrappers->list, struct wrapper_msg, list); + msg = &wrapper->msg; + + msg->hdr.len = cpu_to_le32(sizeof(msg->hdr)); + + if (resources->dma_chunk_id) { + ret = encode_dma(qdev, resources->trans_hdr, wrappers, &user_len, resources, usr); + msg->hdr.count = cpu_to_le32(1); + goto out; + } + + for (i = 0; i < user_msg->count; ++i) { + if (user_len >= user_msg->len) { + ret = -EINVAL; + break; 
+ } + trans_hdr = (struct qaic_manage_trans_hdr *)(user_msg->data + user_len); + if (user_len + trans_hdr->len > user_msg->len) { + ret = -EINVAL; + break; + } + + switch (trans_hdr->type) { + case QAIC_TRANS_PASSTHROUGH_FROM_USR: + ret = encode_passthrough(qdev, trans_hdr, wrappers, &user_len); + break; + case QAIC_TRANS_DMA_XFER_FROM_USR: + ret = encode_dma(qdev, trans_hdr, wrappers, &user_len, resources, usr); + break; + case QAIC_TRANS_ACTIVATE_FROM_USR: + ret = encode_activate(qdev, trans_hdr, wrappers, &user_len, resources); + break; + case QAIC_TRANS_DEACTIVATE_FROM_USR: + ret = encode_deactivate(qdev, trans_hdr, &user_len, usr); + break; + case QAIC_TRANS_STATUS_FROM_USR: + ret = encode_status(qdev, trans_hdr, wrappers, &user_len); + break; + default: + ret = -EINVAL; + break; + } + + if (ret) + break; + } + + if (user_len != user_msg->len) + ret = -EINVAL; +out: + if (ret) { + free_dma_xfers(qdev, resources); + free_dbc_buf(qdev, resources); + return ret; + } + + return 0; +} + +static int decode_passthrough(struct qaic_device *qdev, void *trans, struct manage_msg *user_msg, + u32 *msg_len) +{ + struct qaic_manage_trans_passthrough *out_trans; + struct wire_trans_passthrough *in_trans = trans; + u32 len; + + out_trans = (void *)user_msg->data + user_msg->len; + + len = le32_to_cpu(in_trans->hdr.len); + if (len % 8 != 0) + return -EINVAL; + + if (user_msg->len + len > QAIC_MANAGE_MAX_MSG_LENGTH) + return -ENOSPC; + + memcpy(out_trans->data, in_trans->data, len - sizeof(in_trans->hdr)); + user_msg->len += len; + *msg_len += len; + out_trans->hdr.type = le32_to_cpu(in_trans->hdr.type); + out_trans->hdr.len = len; + + return 0; +} + +static int decode_activate(struct qaic_device *qdev, void *trans, struct manage_msg *user_msg, + u32 *msg_len, struct ioctl_resources *resources, struct qaic_user *usr) +{ + struct qaic_manage_trans_activate_from_dev *out_trans; + struct wire_trans_activate_from_dev *in_trans = trans; + u32 len; + + out_trans = (void *)user_msg->data + user_msg->len; + + len = le32_to_cpu(in_trans->hdr.len); + if (user_msg->len + len > QAIC_MANAGE_MAX_MSG_LENGTH) + return -ENOSPC; + + user_msg->len += len; + *msg_len += len; + out_trans->hdr.type = le32_to_cpu(in_trans->hdr.type); + out_trans->hdr.len = len; + out_trans->status = le32_to_cpu(in_trans->status); + out_trans->dbc_id = le32_to_cpu(in_trans->dbc_id); + out_trans->options = le64_to_cpu(in_trans->options); + + if (!resources->buf) + /* how did we get an activate response without a request? */ + return -EINVAL; + + if (out_trans->dbc_id >= qdev->num_dbc) + /* + * The device assigned an invalid resource, which should never + * happen. Return an error so the user can try to recover. + */ + return -ENODEV; + + if (out_trans->status) + /* + * Allocating resources failed on device side. This is not an + * expected behaviour, user is expected to handle this situation. + */ + return -ECANCELED; + + resources->status = out_trans->status; + resources->dbc_id = out_trans->dbc_id; + save_dbc_buf(qdev, resources, usr); + + return 0; +} + +static int decode_deactivate(struct qaic_device *qdev, void *trans, u32 *msg_len, + struct qaic_user *usr) +{ + struct wire_trans_deactivate_from_dev *in_trans = trans; + u32 dbc_id = le32_to_cpu(in_trans->dbc_id); + u32 status = le32_to_cpu(in_trans->status); + + if (dbc_id >= qdev->num_dbc) + /* + * The device assigned an invalid resource, which should never + * happen. Inject an error so the user can try to recover. 
+ */ + return -ENODEV; + + if (status) { + /* + * Releasing resources failed on the device side, which puts + * us in a bind since they may still be in use, so enable the + * dbc. User is expected to retry deactivation. + */ + enable_dbc(qdev, dbc_id, usr); + return -ECANCELED; + } + + release_dbc(qdev, dbc_id); + *msg_len += sizeof(*in_trans); + + return 0; +} + +static int decode_status(struct qaic_device *qdev, void *trans, struct manage_msg *user_msg, + u32 *user_len, struct wire_msg *msg) +{ + struct qaic_manage_trans_status_from_dev *out_trans; + struct wire_trans_status_from_dev *in_trans = trans; + u32 len; + + out_trans = (void *)user_msg->data + user_msg->len; + + len = le32_to_cpu(in_trans->hdr.len); + if (user_msg->len + len > QAIC_MANAGE_MAX_MSG_LENGTH) + return -ENOSPC; + + out_trans->hdr.type = QAIC_TRANS_STATUS_FROM_DEV; + out_trans->hdr.len = len; + out_trans->major = le16_to_cpu(in_trans->major); + out_trans->minor = le16_to_cpu(in_trans->minor); + out_trans->status_flags = le64_to_cpu(in_trans->status_flags); + out_trans->status = le32_to_cpu(in_trans->status); + *user_len += le32_to_cpu(in_trans->hdr.len); + user_msg->len += len; + + if (out_trans->status) + return -ECANCELED; + if (out_trans->status_flags & BIT(0) && !valid_crc(msg)) + return -EPIPE; + + return 0; +} + +static int decode_message(struct qaic_device *qdev, struct manage_msg *user_msg, + struct wire_msg *msg, struct ioctl_resources *resources, + struct qaic_user *usr) +{ + u32 msg_hdr_len = le32_to_cpu(msg->hdr.len); + struct wire_trans_hdr *trans_hdr; + u32 msg_len = 0; + int ret; + int i; + + if (msg_hdr_len > QAIC_MANAGE_MAX_MSG_LENGTH) + return -EINVAL; + + user_msg->len = 0; + user_msg->count = le32_to_cpu(msg->hdr.count); + + for (i = 0; i < user_msg->count; ++i) { + trans_hdr = (struct wire_trans_hdr *)(msg->data + msg_len); + if (msg_len + le32_to_cpu(trans_hdr->len) > msg_hdr_len) + return -EINVAL; + + switch (le32_to_cpu(trans_hdr->type)) { + case QAIC_TRANS_PASSTHROUGH_FROM_DEV: + ret = decode_passthrough(qdev, trans_hdr, user_msg, &msg_len); + break; + case QAIC_TRANS_ACTIVATE_FROM_DEV: + ret = decode_activate(qdev, trans_hdr, user_msg, &msg_len, resources, usr); + break; + case QAIC_TRANS_DEACTIVATE_FROM_DEV: + ret = decode_deactivate(qdev, trans_hdr, &msg_len, usr); + break; + case QAIC_TRANS_STATUS_FROM_DEV: + ret = decode_status(qdev, trans_hdr, user_msg, &msg_len, msg); + break; + default: + return -EINVAL; + } + + if (ret) + return ret; + } + + if (msg_len != (msg_hdr_len - sizeof(msg->hdr))) + return -EINVAL; + + return 0; +} + +static void *msg_xfer(struct qaic_device *qdev, struct wrapper_list *wrappers, u32 seq_num, + bool ignore_signal) +{ + struct xfer_queue_elem elem; + struct wire_msg *out_buf; + struct wrapper_msg *w; + int retry_count; + long ret; + + if (qdev->in_reset) { + mutex_unlock(&qdev->cntl_mutex); + return ERR_PTR(-ENODEV); + } + + elem.seq_num = seq_num; + elem.buf = NULL; + init_completion(&elem.xfer_done); + if (likely(!qdev->cntl_lost_buf)) { + /* + * The max size of request to device is QAIC_MANAGE_EXT_MSG_LENGTH. + * The max size of response from device is QAIC_MANAGE_MAX_MSG_LENGTH. 
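+ *
+ * A receive buffer is queued before the request is transmitted so the
+ * device's response always has somewhere to land. If transmitting the
+ * request then fails, cntl_lost_buf records that this buffer is still
+ * queued so it can be reused by the next transaction instead of being
+ * leaked.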
+ */ + out_buf = kmalloc(QAIC_MANAGE_MAX_MSG_LENGTH, GFP_KERNEL); + if (!out_buf) { + mutex_unlock(&qdev->cntl_mutex); + return ERR_PTR(-ENOMEM); + } + + ret = mhi_queue_buf(qdev->cntl_ch, DMA_FROM_DEVICE, out_buf, + QAIC_MANAGE_MAX_MSG_LENGTH, MHI_EOT); + if (ret) { + mutex_unlock(&qdev->cntl_mutex); + return ERR_PTR(ret); + } + } else { + /* + * we lost a buffer because we queued a recv buf, but then + * queuing the corresponding tx buf failed. To try to avoid + * a memory leak, lets reclaim it and use it for this + * transaction. + */ + qdev->cntl_lost_buf = false; + } + + list_for_each_entry(w, &wrappers->list, list) { + kref_get(&w->ref_count); + retry_count = 0; +retry: + ret = mhi_queue_buf(qdev->cntl_ch, DMA_TO_DEVICE, &w->msg, w->len, + list_is_last(&w->list, &wrappers->list) ? MHI_EOT : MHI_CHAIN); + if (ret) { + if (ret == -EAGAIN && retry_count++ < QAIC_MHI_RETRY_MAX) { + msleep_interruptible(QAIC_MHI_RETRY_WAIT_MS); + if (!signal_pending(current)) + goto retry; + } + + qdev->cntl_lost_buf = true; + kref_put(&w->ref_count, free_wrapper); + mutex_unlock(&qdev->cntl_mutex); + return ERR_PTR(ret); + } + } + + list_add_tail(&elem.list, &qdev->cntl_xfer_list); + mutex_unlock(&qdev->cntl_mutex); + + if (ignore_signal) + ret = wait_for_completion_timeout(&elem.xfer_done, control_resp_timeout_s * HZ); + else + ret = wait_for_completion_interruptible_timeout(&elem.xfer_done, + control_resp_timeout_s * HZ); + /* + * not using _interruptable because we have to cleanup or we'll + * likely cause memory corruption + */ + mutex_lock(&qdev->cntl_mutex); + if (!list_empty(&elem.list)) + list_del(&elem.list); + if (!ret && !elem.buf) + ret = -ETIMEDOUT; + else if (ret > 0 && !elem.buf) + ret = -EIO; + mutex_unlock(&qdev->cntl_mutex); + + if (ret < 0) { + kfree(elem.buf); + return ERR_PTR(ret); + } else if (!qdev->valid_crc(elem.buf)) { + kfree(elem.buf); + return ERR_PTR(-EPIPE); + } + + return elem.buf; +} + +/* Add a transaction to abort the outstanding DMA continuation */ +static int abort_dma_cont(struct qaic_device *qdev, struct wrapper_list *wrappers, u32 dma_chunk_id) +{ + struct wire_trans_dma_xfer *out_trans; + u32 size = sizeof(*out_trans); + struct wrapper_msg *wrapper; + struct wrapper_msg *w; + struct wire_msg *msg; + + wrapper = list_first_entry(&wrappers->list, struct wrapper_msg, list); + msg = &wrapper->msg; + + /* Remove all but the first wrapper which has the msg header */ + list_for_each_entry_safe(wrapper, w, &wrappers->list, list) + if (!list_is_first(&wrapper->list, &wrappers->list)) + kref_put(&wrapper->ref_count, free_wrapper); + + wrapper = add_wrapper(wrappers, offsetof(struct wrapper_msg, trans) + sizeof(*out_trans)); + + if (!wrapper) + return -ENOMEM; + + out_trans = (struct wire_trans_dma_xfer *)&wrapper->trans; + out_trans->hdr.type = cpu_to_le32(QAIC_TRANS_DMA_XFER_TO_DEV); + out_trans->hdr.len = cpu_to_le32(size); + out_trans->tag = cpu_to_le32(0); + out_trans->count = cpu_to_le32(0); + out_trans->dma_chunk_id = cpu_to_le32(dma_chunk_id); + + msg->hdr.len = cpu_to_le32(size + sizeof(*msg)); + msg->hdr.count = cpu_to_le32(1); + wrapper->len = size; + + return 0; +} + +static struct wrapper_list *alloc_wrapper_list(void) +{ + struct wrapper_list *wrappers; + + wrappers = kmalloc(sizeof(*wrappers), GFP_KERNEL); + if (!wrappers) + return NULL; + INIT_LIST_HEAD(&wrappers->list); + spin_lock_init(&wrappers->lock); + + return wrappers; +} + +static int qaic_manage_msg_xfer(struct qaic_device *qdev, struct qaic_user *usr, + struct manage_msg *user_msg, struct 
ioctl_resources *resources, + struct wire_msg **rsp) +{ + struct wrapper_list *wrappers; + struct wrapper_msg *wrapper; + struct wrapper_msg *w; + bool all_done = false; + struct wire_msg *msg; + int ret; + + wrappers = alloc_wrapper_list(); + if (!wrappers) + return -ENOMEM; + + wrapper = add_wrapper(wrappers, sizeof(*wrapper)); + if (!wrapper) { + kfree(wrappers); + return -ENOMEM; + } + + msg = &wrapper->msg; + wrapper->len = sizeof(*msg); + + ret = encode_message(qdev, user_msg, wrappers, resources, usr); + if (ret && resources->dma_chunk_id) + ret = abort_dma_cont(qdev, wrappers, resources->dma_chunk_id); + if (ret) + goto encode_failed; + + ret = mutex_lock_interruptible(&qdev->cntl_mutex); + if (ret) + goto lock_failed; + + msg->hdr.magic_number = MANAGE_MAGIC_NUMBER; + msg->hdr.sequence_number = cpu_to_le32(qdev->next_seq_num++); + + if (usr) { + msg->hdr.handle = cpu_to_le32(usr->handle); + msg->hdr.partition_id = cpu_to_le32(usr->qddev->partition_id); + } else { + msg->hdr.handle = 0; + msg->hdr.partition_id = cpu_to_le32(QAIC_NO_PARTITION); + } + + msg->hdr.padding = cpu_to_le32(0); + msg->hdr.crc32 = cpu_to_le32(qdev->gen_crc(wrappers)); + + /* msg_xfer releases the mutex */ + *rsp = msg_xfer(qdev, wrappers, qdev->next_seq_num - 1, false); + if (IS_ERR(*rsp)) + ret = PTR_ERR(*rsp); + +lock_failed: + free_dma_xfers(qdev, resources); +encode_failed: + spin_lock(&wrappers->lock); + list_for_each_entry_safe(wrapper, w, &wrappers->list, list) + kref_put(&wrapper->ref_count, free_wrapper); + all_done = list_empty(&wrappers->list); + spin_unlock(&wrappers->lock); + if (all_done) + kfree(wrappers); + + return ret; +} + +static int qaic_manage(struct qaic_device *qdev, struct qaic_user *usr, struct manage_msg *user_msg) +{ + struct wire_trans_dma_xfer_cont *dma_cont = NULL; + struct ioctl_resources resources; + struct wire_msg *rsp = NULL; + int ret; + + memset(&resources, 0, sizeof(struct ioctl_resources)); + + INIT_LIST_HEAD(&resources.dma_xfers); + + if (user_msg->len > QAIC_MANAGE_MAX_MSG_LENGTH || + user_msg->count > QAIC_MANAGE_MAX_MSG_LENGTH / sizeof(struct qaic_manage_trans_hdr)) + return -EINVAL; + +dma_xfer_continue: + ret = qaic_manage_msg_xfer(qdev, usr, user_msg, &resources, &rsp); + if (ret) + return ret; + /* dma_cont should be the only transaction if present */ + if (le32_to_cpu(rsp->hdr.count) == 1) { + dma_cont = (struct wire_trans_dma_xfer_cont *)rsp->data; + if (le32_to_cpu(dma_cont->hdr.type) != QAIC_TRANS_DMA_XFER_CONT) + dma_cont = NULL; + } + if (dma_cont) { + if (le32_to_cpu(dma_cont->dma_chunk_id) == resources.dma_chunk_id && + le64_to_cpu(dma_cont->xferred_size) == resources.xferred_dma_size) { + kfree(rsp); + goto dma_xfer_continue; + } + + ret = -EINVAL; + goto dma_cont_failed; + } + + ret = decode_message(qdev, user_msg, rsp, &resources, usr); + +dma_cont_failed: + free_dbc_buf(qdev, &resources); + kfree(rsp); + return ret; +} + +int qaic_manage_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv) +{ + struct qaic_manage_msg *user_msg; + struct qaic_device *qdev; + struct manage_msg *msg; + struct qaic_user *usr; + u8 __user *user_data; + int qdev_rcu_id; + int usr_rcu_id; + int ret; + + usr = file_priv->driver_priv; + + usr_rcu_id = srcu_read_lock(&usr->qddev_lock); + if (!usr->qddev) { + srcu_read_unlock(&usr->qddev_lock, usr_rcu_id); + return -ENODEV; + } + + qdev = usr->qddev->qdev; + + qdev_rcu_id = srcu_read_lock(&qdev->dev_lock); + if (qdev->in_reset) { + srcu_read_unlock(&qdev->dev_lock, qdev_rcu_id); + 
srcu_read_unlock(&usr->qddev_lock, usr_rcu_id); + return -ENODEV; + } + + user_msg = data; + + if (user_msg->len > QAIC_MANAGE_MAX_MSG_LENGTH) { + ret = -EINVAL; + goto out; + } + + msg = kzalloc(QAIC_MANAGE_MAX_MSG_LENGTH + sizeof(*msg), GFP_KERNEL); + if (!msg) { + ret = -ENOMEM; + goto out; + } + + msg->len = user_msg->len; + msg->count = user_msg->count; + + user_data = u64_to_user_ptr(user_msg->data); + + if (copy_from_user(msg->data, user_data, user_msg->len)) { + ret = -EFAULT; + goto free_msg; + } + + ret = qaic_manage(qdev, usr, msg); + + /* + * If the qaic_manage() is successful then we copy the message onto + * userspace memory but we have an exception for -ECANCELED. + * For -ECANCELED, it means that device has NACKed the message with a + * status error code which userspace would like to know. + */ + if (ret == -ECANCELED || !ret) { + if (copy_to_user(user_data, msg->data, msg->len)) { + ret = -EFAULT; + } else { + user_msg->len = msg->len; + user_msg->count = msg->count; + } + } + +free_msg: + kfree(msg); +out: + srcu_read_unlock(&qdev->dev_lock, qdev_rcu_id); + srcu_read_unlock(&usr->qddev_lock, usr_rcu_id); + return ret; +} + +int get_cntl_version(struct qaic_device *qdev, struct qaic_user *usr, u16 *major, u16 *minor) +{ + struct qaic_manage_trans_status_from_dev *status_result; + struct qaic_manage_trans_status_to_dev *status_query; + struct manage_msg *user_msg; + int ret; + + user_msg = kmalloc(sizeof(*user_msg) + sizeof(*status_result), GFP_KERNEL); + if (!user_msg) { + ret = -ENOMEM; + goto out; + } + user_msg->len = sizeof(*status_query); + user_msg->count = 1; + + status_query = (struct qaic_manage_trans_status_to_dev *)user_msg->data; + status_query->hdr.type = QAIC_TRANS_STATUS_FROM_USR; + status_query->hdr.len = sizeof(status_query->hdr); + + ret = qaic_manage(qdev, usr, user_msg); + if (ret) + goto kfree_user_msg; + status_result = (struct qaic_manage_trans_status_from_dev *)user_msg->data; + *major = status_result->major; + *minor = status_result->minor; + + if (status_result->status_flags & BIT(0)) { /* device is using CRC */ + /* By default qdev->gen_crc is programmed to generate CRC */ + qdev->valid_crc = valid_crc; + } else { + /* By default qdev->valid_crc is programmed to bypass CRC */ + qdev->gen_crc = gen_crc_stub; + } + +kfree_user_msg: + kfree(user_msg); +out: + return ret; +} + +static void resp_worker(struct work_struct *work) +{ + struct resp_work *resp = container_of(work, struct resp_work, work); + struct qaic_device *qdev = resp->qdev; + struct wire_msg *msg = resp->buf; + struct xfer_queue_elem *elem; + struct xfer_queue_elem *i; + bool found = false; + + mutex_lock(&qdev->cntl_mutex); + list_for_each_entry_safe(elem, i, &qdev->cntl_xfer_list, list) { + if (elem->seq_num == le32_to_cpu(msg->hdr.sequence_number)) { + found = true; + list_del_init(&elem->list); + elem->buf = msg; + complete_all(&elem->xfer_done); + break; + } + } + mutex_unlock(&qdev->cntl_mutex); + + if (!found) + /* request must have timed out, drop packet */ + kfree(msg); + + kfree(resp); +} + +static void free_wrapper_from_list(struct wrapper_list *wrappers, struct wrapper_msg *wrapper) +{ + bool all_done = false; + + spin_lock(&wrappers->lock); + kref_put(&wrapper->ref_count, free_wrapper); + all_done = list_empty(&wrappers->list); + spin_unlock(&wrappers->lock); + + if (all_done) + kfree(wrappers); +} + +void qaic_mhi_ul_xfer_cb(struct mhi_device *mhi_dev, struct mhi_result *mhi_result) +{ + struct wire_msg *msg = mhi_result->buf_addr; + struct wrapper_msg *wrapper = 
container_of(msg, struct wrapper_msg, msg); + + free_wrapper_from_list(wrapper->head, wrapper); +} + +void qaic_mhi_dl_xfer_cb(struct mhi_device *mhi_dev, struct mhi_result *mhi_result) +{ + struct qaic_device *qdev = dev_get_drvdata(&mhi_dev->dev); + struct wire_msg *msg = mhi_result->buf_addr; + struct resp_work *resp; + + if (mhi_result->transaction_status || msg->hdr.magic_number != MANAGE_MAGIC_NUMBER) { + kfree(msg); + return; + } + + resp = kmalloc(sizeof(*resp), GFP_ATOMIC); + if (!resp) { + kfree(msg); + return; + } + + INIT_WORK(&resp->work, resp_worker); + resp->qdev = qdev; + resp->buf = msg; + queue_work(qdev->cntl_wq, &resp->work); +} + +int qaic_control_open(struct qaic_device *qdev) +{ + if (!qdev->cntl_ch) + return -ENODEV; + + qdev->cntl_lost_buf = false; + /* + * By default qaic should assume that device has CRC enabled. + * Qaic comes to know if device has CRC enabled or disabled during the + * device status transaction, which is the first transaction performed + * on control channel. + * + * So CRC validation of first device status transaction response is + * ignored (by calling valid_crc_stub) and is done later during decoding + * if device has CRC enabled. + * Now that qaic knows whether device has CRC enabled or not it acts + * accordingly. + */ + qdev->gen_crc = gen_crc; + qdev->valid_crc = valid_crc_stub; + + return mhi_prepare_for_transfer(qdev->cntl_ch); +} + +void qaic_control_close(struct qaic_device *qdev) +{ + mhi_unprepare_from_transfer(qdev->cntl_ch); +} + +void qaic_release_usr(struct qaic_device *qdev, struct qaic_user *usr) +{ + struct wire_trans_terminate_to_dev *trans; + struct wrapper_list *wrappers; + struct wrapper_msg *wrapper; + struct wire_msg *msg; + struct wire_msg *rsp; + + wrappers = alloc_wrapper_list(); + if (!wrappers) + return; + + wrapper = add_wrapper(wrappers, sizeof(*wrapper) + sizeof(*msg) + sizeof(*trans)); + if (!wrapper) + return; + + msg = &wrapper->msg; + + trans = (struct wire_trans_terminate_to_dev *)msg->data; + + trans->hdr.type = cpu_to_le32(QAIC_TRANS_TERMINATE_TO_DEV); + trans->hdr.len = cpu_to_le32(sizeof(*trans)); + trans->handle = cpu_to_le32(usr->handle); + + mutex_lock(&qdev->cntl_mutex); + wrapper->len = sizeof(msg->hdr) + sizeof(*trans); + msg->hdr.magic_number = MANAGE_MAGIC_NUMBER; + msg->hdr.sequence_number = cpu_to_le32(qdev->next_seq_num++); + msg->hdr.len = cpu_to_le32(wrapper->len); + msg->hdr.count = cpu_to_le32(1); + msg->hdr.handle = cpu_to_le32(usr->handle); + msg->hdr.padding = cpu_to_le32(0); + msg->hdr.crc32 = cpu_to_le32(qdev->gen_crc(wrappers)); + + /* + * msg_xfer releases the mutex + * We don't care about the return of msg_xfer since we will not do + * anything different based on what happens. + * We ignore pending signals since one will be set if the user is + * killed, and we need give the device a chance to cleanup, otherwise + * DMA may still be in progress when we return. 
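+ * The response, if the device sends one, carries nothing the host acts
+ * on, so it is freed below without being examined.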
+ */ + rsp = msg_xfer(qdev, wrappers, qdev->next_seq_num - 1, true); + if (!IS_ERR(rsp)) + kfree(rsp); + free_wrapper_from_list(wrappers, wrapper); +} + +void wake_all_cntl(struct qaic_device *qdev) +{ + struct xfer_queue_elem *elem; + struct xfer_queue_elem *i; + + mutex_lock(&qdev->cntl_mutex); + list_for_each_entry_safe(elem, i, &qdev->cntl_xfer_list, list) { + list_del_init(&elem->list); + complete_all(&elem->xfer_done); + } + mutex_unlock(&qdev->cntl_mutex); +} diff --git a/drivers/accel/qaic/qaic_data.c b/drivers/accel/qaic/qaic_data.c new file mode 100644 index 000000000000..c0a574cd1b35 --- /dev/null +++ b/drivers/accel/qaic/qaic_data.c @@ -0,0 +1,1902 @@ +// SPDX-License-Identifier: GPL-2.0-only + +/* Copyright (c) 2019-2021, The Linux Foundation. All rights reserved. */ +/* Copyright (c) 2021-2023 Qualcomm Innovation Center, Inc. All rights reserved. */ + +#include <linux/bitfield.h> +#include <linux/bits.h> +#include <linux/completion.h> +#include <linux/delay.h> +#include <linux/dma-buf.h> +#include <linux/dma-mapping.h> +#include <linux/interrupt.h> +#include <linux/kref.h> +#include <linux/list.h> +#include <linux/math64.h> +#include <linux/mm.h> +#include <linux/moduleparam.h> +#include <linux/scatterlist.h> +#include <linux/spinlock.h> +#include <linux/srcu.h> +#include <linux/types.h> +#include <linux/uaccess.h> +#include <linux/wait.h> +#include <drm/drm_file.h> +#include <drm/drm_gem.h> +#include <drm/drm_print.h> +#include <uapi/drm/qaic_accel.h> + +#include "qaic.h" + +#define SEM_VAL_MASK GENMASK_ULL(11, 0) +#define SEM_INDEX_MASK GENMASK_ULL(4, 0) +#define BULK_XFER BIT(3) +#define GEN_COMPLETION BIT(4) +#define INBOUND_XFER 1 +#define OUTBOUND_XFER 2 +#define REQHP_OFF 0x0 /* we read this */ +#define REQTP_OFF 0x4 /* we write this */ +#define RSPHP_OFF 0x8 /* we write this */ +#define RSPTP_OFF 0xc /* we read this */ + +#define ENCODE_SEM(val, index, sync, cmd, flags) \ + ({ \ + FIELD_PREP(GENMASK(11, 0), (val)) | \ + FIELD_PREP(GENMASK(20, 16), (index)) | \ + FIELD_PREP(BIT(22), (sync)) | \ + FIELD_PREP(GENMASK(26, 24), (cmd)) | \ + FIELD_PREP(GENMASK(30, 29), (flags)) | \ + FIELD_PREP(BIT(31), (cmd) ? 1 : 0); \ + }) +#define NUM_EVENTS 128 +#define NUM_DELAYS 10 + +static unsigned int wait_exec_default_timeout_ms = 5000; /* 5 sec default */ +module_param(wait_exec_default_timeout_ms, uint, 0600); +MODULE_PARM_DESC(wait_exec_default_timeout_ms, "Default timeout for DRM_IOCTL_QAIC_WAIT_BO"); + +static unsigned int datapath_poll_interval_us = 100; /* 100 usec default */ +module_param(datapath_poll_interval_us, uint, 0600); +MODULE_PARM_DESC(datapath_poll_interval_us, + "Amount of time to sleep between activity when datapath polling is enabled"); + +struct dbc_req { + /* + * A request ID is assigned to each memory handle going in DMA queue. + * As a single memory handle can enqueue multiple elements in DMA queue + * all of them will have the same request ID. 
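+ * The device echoes this request ID back in the dbc_rsp completion
+ * element, which is how the host matches a completion to the BO that
+ * was submitted.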
+ */ + __le16 req_id; + /* Future use */ + __u8 seq_id; + /* + * Special encoded variable + * 7 0 - Do not force to generate MSI after DMA is completed + * 1 - Force to generate MSI after DMA is completed + * 6:5 Reserved + * 4 1 - Generate completion element in the response queue + * 0 - No Completion Code + * 3 0 - DMA request is a Link list transfer + * 1 - DMA request is a Bulk transfer + * 2 Reserved + * 1:0 00 - No DMA transfer involved + * 01 - DMA transfer is part of inbound transfer + * 10 - DMA transfer has outbound transfer + * 11 - NA + */ + __u8 cmd; + __le32 resv; + /* Source address for the transfer */ + __le64 src_addr; + /* Destination address for the transfer */ + __le64 dest_addr; + /* Length of transfer request */ + __le32 len; + __le32 resv2; + /* Doorbell address */ + __le64 db_addr; + /* + * Special encoded variable + * 7 1 - Doorbell(db) write + * 0 - No doorbell write + * 6:2 Reserved + * 1:0 00 - 32 bit access, db address must be aligned to 32bit-boundary + * 01 - 16 bit access, db address must be aligned to 16bit-boundary + * 10 - 8 bit access, db address must be aligned to 8bit-boundary + * 11 - Reserved + */ + __u8 db_len; + __u8 resv3; + __le16 resv4; + /* 32 bit data written to doorbell address */ + __le32 db_data; + /* + * Special encoded variable + * All the fields of sem_cmdX are passed from user and all are ORed + * together to form sem_cmd. + * 0:11 Semaphore value + * 15:12 Reserved + * 20:16 Semaphore index + * 21 Reserved + * 22 Semaphore Sync + * 23 Reserved + * 26:24 Semaphore command + * 28:27 Reserved + * 29 Semaphore DMA out bound sync fence + * 30 Semaphore DMA in bound sync fence + * 31 Enable semaphore command + */ + __le32 sem_cmd0; + __le32 sem_cmd1; + __le32 sem_cmd2; + __le32 sem_cmd3; +} __packed; + +struct dbc_rsp { + /* Request ID of the memory handle whose DMA transaction is completed */ + __le16 req_id; + /* Status of the DMA transaction. 0 : Success otherwise failure */ + __le16 status; +} __packed; + +inline int get_dbc_req_elem_size(void) +{ + return sizeof(struct dbc_req); +} + +inline int get_dbc_rsp_elem_size(void) +{ + return sizeof(struct dbc_rsp); +} + +static void free_slice(struct kref *kref) +{ + struct bo_slice *slice = container_of(kref, struct bo_slice, ref_count); + + list_del(&slice->slice); + drm_gem_object_put(&slice->bo->base); + sg_free_table(slice->sgt); + kfree(slice->sgt); + kfree(slice->reqs); + kfree(slice); +} + +static int clone_range_of_sgt_for_slice(struct qaic_device *qdev, struct sg_table **sgt_out, + struct sg_table *sgt_in, u64 size, u64 offset) +{ + int total_len, len, nents, offf = 0, offl = 0; + struct scatterlist *sg, *sgn, *sgf, *sgl; + struct sg_table *sgt; + int ret, j; + + /* find out number of relevant nents needed for this mem */ + total_len = 0; + sgf = NULL; + sgl = NULL; + nents = 0; + + size = size ? 
size : PAGE_SIZE; + for (sg = sgt_in->sgl; sg; sg = sg_next(sg)) { + len = sg_dma_len(sg); + + if (!len) + continue; + if (offset >= total_len && offset < total_len + len) { + sgf = sg; + offf = offset - total_len; + } + if (sgf) + nents++; + if (offset + size >= total_len && + offset + size <= total_len + len) { + sgl = sg; + offl = offset + size - total_len; + break; + } + total_len += len; + } + + if (!sgf || !sgl) { + ret = -EINVAL; + goto out; + } + + sgt = kzalloc(sizeof(*sgt), GFP_KERNEL); + if (!sgt) { + ret = -ENOMEM; + goto out; + } + + ret = sg_alloc_table(sgt, nents, GFP_KERNEL); + if (ret) + goto free_sgt; + + /* copy relevant sg node and fix page and length */ + sgn = sgf; + for_each_sgtable_sg(sgt, sg, j) { + memcpy(sg, sgn, sizeof(*sg)); + if (sgn == sgf) { + sg_dma_address(sg) += offf; + sg_dma_len(sg) -= offf; + sg_set_page(sg, sg_page(sgn), sg_dma_len(sg), offf); + } else { + offf = 0; + } + if (sgn == sgl) { + sg_dma_len(sg) = offl - offf; + sg_set_page(sg, sg_page(sgn), offl - offf, offf); + sg_mark_end(sg); + break; + } + sgn = sg_next(sgn); + } + + *sgt_out = sgt; + return ret; + +free_sgt: + kfree(sgt); +out: + *sgt_out = NULL; + return ret; +} + +static int encode_reqs(struct qaic_device *qdev, struct bo_slice *slice, + struct qaic_attach_slice_entry *req) +{ + __le64 db_addr = cpu_to_le64(req->db_addr); + __le32 db_data = cpu_to_le32(req->db_data); + struct scatterlist *sg; + __u8 cmd = BULK_XFER; + int presync_sem; + u64 dev_addr; + __u8 db_len; + int i; + + if (!slice->no_xfer) + cmd |= (slice->dir == DMA_TO_DEVICE ? INBOUND_XFER : OUTBOUND_XFER); + + if (req->db_len && !IS_ALIGNED(req->db_addr, req->db_len / 8)) + return -EINVAL; + + presync_sem = req->sem0.presync + req->sem1.presync + req->sem2.presync + req->sem3.presync; + if (presync_sem > 1) + return -EINVAL; + + presync_sem = req->sem0.presync << 0 | req->sem1.presync << 1 | + req->sem2.presync << 2 | req->sem3.presync << 3; + + switch (req->db_len) { + case 32: + db_len = BIT(7); + break; + case 16: + db_len = BIT(7) | 1; + break; + case 8: + db_len = BIT(7) | 2; + break; + case 0: + db_len = 0; /* doorbell is not active for this command */ + break; + default: + return -EINVAL; /* should never hit this */ + } + + /* + * When we end up splitting up a single request (ie a buf slice) into + * multiple DMA requests, we have to manage the sync data carefully. + * There can only be one presync sem. That needs to be on every xfer + * so that the DMA engine doesn't transfer data before the receiver is + * ready. We only do the doorbell and postsync sems after the xfer. + * To guarantee previous xfers for the request are complete, we use a + * fence. + */ + dev_addr = req->dev_addr; + for_each_sgtable_sg(slice->sgt, sg, i) { + slice->reqs[i].cmd = cmd; + slice->reqs[i].src_addr = cpu_to_le64(slice->dir == DMA_TO_DEVICE ? + sg_dma_address(sg) : dev_addr); + slice->reqs[i].dest_addr = cpu_to_le64(slice->dir == DMA_TO_DEVICE ? + dev_addr : sg_dma_address(sg)); + /* + * sg_dma_len(sg) returns size of a DMA segment, maximum DMA + * segment size is set to UINT_MAX by qaic and hence return + * values of sg_dma_len(sg) can never exceed u32 range. So, + * by down sizing we are not corrupting the value. 
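+ * (The UINT_MAX segment limit is the one set via dma_set_max_seg_size()
+ * when the PCI device is initialized.)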
+ */ + slice->reqs[i].len = cpu_to_le32((u32)sg_dma_len(sg)); + switch (presync_sem) { + case BIT(0): + slice->reqs[i].sem_cmd0 = cpu_to_le32(ENCODE_SEM(req->sem0.val, + req->sem0.index, + req->sem0.presync, + req->sem0.cmd, + req->sem0.flags)); + break; + case BIT(1): + slice->reqs[i].sem_cmd1 = cpu_to_le32(ENCODE_SEM(req->sem1.val, + req->sem1.index, + req->sem1.presync, + req->sem1.cmd, + req->sem1.flags)); + break; + case BIT(2): + slice->reqs[i].sem_cmd2 = cpu_to_le32(ENCODE_SEM(req->sem2.val, + req->sem2.index, + req->sem2.presync, + req->sem2.cmd, + req->sem2.flags)); + break; + case BIT(3): + slice->reqs[i].sem_cmd3 = cpu_to_le32(ENCODE_SEM(req->sem3.val, + req->sem3.index, + req->sem3.presync, + req->sem3.cmd, + req->sem3.flags)); + break; + } + dev_addr += sg_dma_len(sg); + } + /* add post transfer stuff to last segment */ + i--; + slice->reqs[i].cmd |= GEN_COMPLETION; + slice->reqs[i].db_addr = db_addr; + slice->reqs[i].db_len = db_len; + slice->reqs[i].db_data = db_data; + /* + * Add a fence if we have more than one request going to the hardware + * representing the entirety of the user request, and the user request + * has no presync condition. + * Fences are expensive, so we try to avoid them. We rely on the + * hardware behavior to avoid needing one when there is a presync + * condition. When a presync exists, all requests for that same + * presync will be queued into a fifo. Thus, since we queue the + * post xfer activity only on the last request we queue, the hardware + * will ensure that the last queued request is processed last, thus + * making sure the post xfer activity happens at the right time without + * a fence. + */ + if (i && !presync_sem) + req->sem0.flags |= (slice->dir == DMA_TO_DEVICE ? + QAIC_SEM_INSYNCFENCE : QAIC_SEM_OUTSYNCFENCE); + slice->reqs[i].sem_cmd0 = cpu_to_le32(ENCODE_SEM(req->sem0.val, req->sem0.index, + req->sem0.presync, req->sem0.cmd, + req->sem0.flags)); + slice->reqs[i].sem_cmd1 = cpu_to_le32(ENCODE_SEM(req->sem1.val, req->sem1.index, + req->sem1.presync, req->sem1.cmd, + req->sem1.flags)); + slice->reqs[i].sem_cmd2 = cpu_to_le32(ENCODE_SEM(req->sem2.val, req->sem2.index, + req->sem2.presync, req->sem2.cmd, + req->sem2.flags)); + slice->reqs[i].sem_cmd3 = cpu_to_le32(ENCODE_SEM(req->sem3.val, req->sem3.index, + req->sem3.presync, req->sem3.cmd, + req->sem3.flags)); + + return 0; +} + +static int qaic_map_one_slice(struct qaic_device *qdev, struct qaic_bo *bo, + struct qaic_attach_slice_entry *slice_ent) +{ + struct sg_table *sgt = NULL; + struct bo_slice *slice; + int ret; + + ret = clone_range_of_sgt_for_slice(qdev, &sgt, bo->sgt, slice_ent->size, slice_ent->offset); + if (ret) + goto out; + + slice = kmalloc(sizeof(*slice), GFP_KERNEL); + if (!slice) { + ret = -ENOMEM; + goto free_sgt; + } + + slice->reqs = kcalloc(sgt->nents, sizeof(*slice->reqs), GFP_KERNEL); + if (!slice->reqs) { + ret = -ENOMEM; + goto free_slice; + } + + slice->no_xfer = !slice_ent->size; + slice->sgt = sgt; + slice->nents = sgt->nents; + slice->dir = bo->dir; + slice->bo = bo; + slice->size = slice_ent->size; + slice->offset = slice_ent->offset; + + ret = encode_reqs(qdev, slice, slice_ent); + if (ret) + goto free_req; + + bo->total_slice_nents += sgt->nents; + kref_init(&slice->ref_count); + drm_gem_object_get(&bo->base); + list_add_tail(&slice->slice, &bo->slices); + + return 0; + +free_req: + kfree(slice->reqs); +free_slice: + kfree(slice); +free_sgt: + sg_free_table(sgt); + kfree(sgt); +out: + return ret; +} + +static int create_sgt(struct qaic_device *qdev, 
struct sg_table **sgt_out, u64 size) +{ + struct scatterlist *sg; + struct sg_table *sgt; + struct page **pages; + int *pages_order; + int buf_extra; + int max_order; + int nr_pages; + int ret = 0; + int i, j, k; + int order; + + if (size) { + nr_pages = DIV_ROUND_UP(size, PAGE_SIZE); + /* + * calculate how much extra we are going to allocate, to remove + * later + */ + buf_extra = (PAGE_SIZE - size % PAGE_SIZE) % PAGE_SIZE; + max_order = min(MAX_ORDER - 1, get_order(size)); + } else { + /* allocate a single page for book keeping */ + nr_pages = 1; + buf_extra = 0; + max_order = 0; + } + + pages = kvmalloc_array(nr_pages, sizeof(*pages) + sizeof(*pages_order), GFP_KERNEL); + if (!pages) { + ret = -ENOMEM; + goto out; + } + pages_order = (void *)pages + sizeof(*pages) * nr_pages; + + /* + * Allocate requested memory using alloc_pages. It is possible to allocate + * the requested memory in multiple chunks by calling alloc_pages + * multiple times. Use SG table to handle multiple allocated pages. + */ + i = 0; + while (nr_pages > 0) { + order = min(get_order(nr_pages * PAGE_SIZE), max_order); + while (1) { + pages[i] = alloc_pages(GFP_KERNEL | GFP_HIGHUSER | + __GFP_NOWARN | __GFP_ZERO | + (order ? __GFP_NORETRY : __GFP_RETRY_MAYFAIL), + order); + if (pages[i]) + break; + if (!order--) { + ret = -ENOMEM; + goto free_partial_alloc; + } + } + + max_order = order; + pages_order[i] = order; + + nr_pages -= 1 << order; + if (nr_pages <= 0) + /* account for over allocation */ + buf_extra += abs(nr_pages) * PAGE_SIZE; + i++; + } + + sgt = kmalloc(sizeof(*sgt), GFP_KERNEL); + if (!sgt) { + ret = -ENOMEM; + goto free_partial_alloc; + } + + if (sg_alloc_table(sgt, i, GFP_KERNEL)) { + ret = -ENOMEM; + goto free_sgt; + } + + /* Populate the SG table with the allocated memory pages */ + sg = sgt->sgl; + for (k = 0; k < i; k++, sg = sg_next(sg)) { + /* Last entry requires special handling */ + if (k < i - 1) { + sg_set_page(sg, pages[k], PAGE_SIZE << pages_order[k], 0); + } else { + sg_set_page(sg, pages[k], (PAGE_SIZE << pages_order[k]) - buf_extra, 0); + sg_mark_end(sg); + } + } + + kvfree(pages); + *sgt_out = sgt; + return ret; + +free_sgt: + kfree(sgt); +free_partial_alloc: + for (j = 0; j < i; j++) + __free_pages(pages[j], pages_order[j]); + kvfree(pages); +out: + *sgt_out = NULL; + return ret; +} + +static bool invalid_sem(struct qaic_sem *sem) +{ + if (sem->val & ~SEM_VAL_MASK || sem->index & ~SEM_INDEX_MASK || + !(sem->presync == 0 || sem->presync == 1) || sem->pad || + sem->flags & ~(QAIC_SEM_INSYNCFENCE | QAIC_SEM_OUTSYNCFENCE) || + sem->cmd > QAIC_SEM_WAIT_GT_0) + return true; + return false; +} + +static int qaic_validate_req(struct qaic_device *qdev, struct qaic_attach_slice_entry *slice_ent, + u32 count, u64 total_size) +{ + int i; + + for (i = 0; i < count; i++) { + if (!(slice_ent[i].db_len == 32 || slice_ent[i].db_len == 16 || + slice_ent[i].db_len == 8 || slice_ent[i].db_len == 0) || + invalid_sem(&slice_ent[i].sem0) || invalid_sem(&slice_ent[i].sem1) || + invalid_sem(&slice_ent[i].sem2) || invalid_sem(&slice_ent[i].sem3)) + return -EINVAL; + + if (slice_ent[i].offset + slice_ent[i].size > total_size) + return -EINVAL; + } + + return 0; +} + +static void qaic_free_sgt(struct sg_table *sgt) +{ + struct scatterlist *sg; + + for (sg = sgt->sgl; sg; sg = sg_next(sg)) + if (sg_page(sg)) + __free_pages(sg_page(sg), get_order(sg->length)); + sg_free_table(sgt); + kfree(sgt); +} + +static void qaic_gem_print_info(struct drm_printer *p, unsigned int indent, + const struct drm_gem_object *obj) +{ + 
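+ /* bo->size is the size the user asked for; it may be smaller than the page-aligned GEM object size */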
struct qaic_bo *bo = to_qaic_bo(obj); + + drm_printf_indent(p, indent, "user requested size=%llu\n", bo->size); +} + +static const struct vm_operations_struct drm_vm_ops = { + .open = drm_gem_vm_open, + .close = drm_gem_vm_close, +}; + +static int qaic_gem_object_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma) +{ + struct qaic_bo *bo = to_qaic_bo(obj); + unsigned long offset = 0; + struct scatterlist *sg; + int ret; + + if (obj->import_attach) + return -EINVAL; + + for (sg = bo->sgt->sgl; sg; sg = sg_next(sg)) { + if (sg_page(sg)) { + ret = remap_pfn_range(vma, vma->vm_start + offset, page_to_pfn(sg_page(sg)), + sg->length, vma->vm_page_prot); + if (ret) + goto out; + offset += sg->length; + } + } + +out: + return ret; +} + +static void qaic_free_object(struct drm_gem_object *obj) +{ + struct qaic_bo *bo = to_qaic_bo(obj); + + if (obj->import_attach) { + /* DMABUF/PRIME Path */ + dma_buf_detach(obj->import_attach->dmabuf, obj->import_attach); + dma_buf_put(obj->import_attach->dmabuf); + } else { + /* Private buffer allocation path */ + qaic_free_sgt(bo->sgt); + } + + drm_gem_object_release(obj); + kfree(bo); +} + +static const struct drm_gem_object_funcs qaic_gem_funcs = { + .free = qaic_free_object, + .print_info = qaic_gem_print_info, + .mmap = qaic_gem_object_mmap, + .vm_ops = &drm_vm_ops, +}; + +static struct qaic_bo *qaic_alloc_init_bo(void) +{ + struct qaic_bo *bo; + + bo = kzalloc(sizeof(*bo), GFP_KERNEL); + if (!bo) + return ERR_PTR(-ENOMEM); + + INIT_LIST_HEAD(&bo->slices); + init_completion(&bo->xfer_done); + complete_all(&bo->xfer_done); + + return bo; +} + +int qaic_create_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv) +{ + struct qaic_create_bo *args = data; + int usr_rcu_id, qdev_rcu_id; + struct drm_gem_object *obj; + struct qaic_device *qdev; + struct qaic_user *usr; + struct qaic_bo *bo; + size_t size; + int ret; + + if (args->pad) + return -EINVAL; + + usr = file_priv->driver_priv; + usr_rcu_id = srcu_read_lock(&usr->qddev_lock); + if (!usr->qddev) { + ret = -ENODEV; + goto unlock_usr_srcu; + } + + qdev = usr->qddev->qdev; + qdev_rcu_id = srcu_read_lock(&qdev->dev_lock); + if (qdev->in_reset) { + ret = -ENODEV; + goto unlock_dev_srcu; + } + + size = PAGE_ALIGN(args->size); + if (size == 0) { + ret = -EINVAL; + goto unlock_dev_srcu; + } + + bo = qaic_alloc_init_bo(); + if (IS_ERR(bo)) { + ret = PTR_ERR(bo); + goto unlock_dev_srcu; + } + obj = &bo->base; + + drm_gem_private_object_init(dev, obj, size); + + obj->funcs = &qaic_gem_funcs; + ret = create_sgt(qdev, &bo->sgt, size); + if (ret) + goto free_bo; + + bo->size = args->size; + + ret = drm_gem_handle_create(file_priv, obj, &args->handle); + if (ret) + goto free_sgt; + + bo->handle = args->handle; + drm_gem_object_put(obj); + srcu_read_unlock(&qdev->dev_lock, qdev_rcu_id); + srcu_read_unlock(&usr->qddev_lock, usr_rcu_id); + + return 0; + +free_sgt: + qaic_free_sgt(bo->sgt); +free_bo: + kfree(bo); +unlock_dev_srcu: + srcu_read_unlock(&qdev->dev_lock, qdev_rcu_id); +unlock_usr_srcu: + srcu_read_unlock(&usr->qddev_lock, usr_rcu_id); + return ret; +} + +int qaic_mmap_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv) +{ + struct qaic_mmap_bo *args = data; + int usr_rcu_id, qdev_rcu_id; + struct drm_gem_object *obj; + struct qaic_device *qdev; + struct qaic_user *usr; + int ret; + + usr = file_priv->driver_priv; + usr_rcu_id = srcu_read_lock(&usr->qddev_lock); + if (!usr->qddev) { + ret = -ENODEV; + goto unlock_usr_srcu; + } + + qdev = usr->qddev->qdev; + 
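+ /*
+ * Also hold the device-level SRCU. A concurrent reset sets in_reset and
+ * then synchronizes on dev_lock, so we either observe the flag below or
+ * finish this ioctl before teardown proceeds.
+ */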
qdev_rcu_id = srcu_read_lock(&qdev->dev_lock); + if (qdev->in_reset) { + ret = -ENODEV; + goto unlock_dev_srcu; + } + + obj = drm_gem_object_lookup(file_priv, args->handle); + if (!obj) { + ret = -ENOENT; + goto unlock_dev_srcu; + } + + ret = drm_gem_create_mmap_offset(obj); + if (ret == 0) + args->offset = drm_vma_node_offset_addr(&obj->vma_node); + + drm_gem_object_put(obj); + +unlock_dev_srcu: + srcu_read_unlock(&qdev->dev_lock, qdev_rcu_id); +unlock_usr_srcu: + srcu_read_unlock(&usr->qddev_lock, usr_rcu_id); + return ret; +} + +struct drm_gem_object *qaic_gem_prime_import(struct drm_device *dev, struct dma_buf *dma_buf) +{ + struct dma_buf_attachment *attach; + struct drm_gem_object *obj; + struct qaic_bo *bo; + size_t size; + int ret; + + bo = qaic_alloc_init_bo(); + if (IS_ERR(bo)) { + ret = PTR_ERR(bo); + goto out; + } + + obj = &bo->base; + get_dma_buf(dma_buf); + + attach = dma_buf_attach(dma_buf, dev->dev); + if (IS_ERR(attach)) { + ret = PTR_ERR(attach); + goto attach_fail; + } + + size = PAGE_ALIGN(attach->dmabuf->size); + if (size == 0) { + ret = -EINVAL; + goto size_align_fail; + } + + drm_gem_private_object_init(dev, obj, size); + /* + * skipping dma_buf_map_attachment() as we do not know the direction + * just yet. Once the direction is known in the subsequent IOCTL to + * attach slicing, we can do it then. + */ + + obj->funcs = &qaic_gem_funcs; + obj->import_attach = attach; + obj->resv = dma_buf->resv; + + return obj; + +size_align_fail: + dma_buf_detach(dma_buf, attach); +attach_fail: + dma_buf_put(dma_buf); + kfree(bo); +out: + return ERR_PTR(ret); +} + +static int qaic_prepare_import_bo(struct qaic_bo *bo, struct qaic_attach_slice_hdr *hdr) +{ + struct drm_gem_object *obj = &bo->base; + struct sg_table *sgt; + int ret; + + if (obj->import_attach->dmabuf->size < hdr->size) + return -EINVAL; + + sgt = dma_buf_map_attachment(obj->import_attach, hdr->dir); + if (IS_ERR(sgt)) { + ret = PTR_ERR(sgt); + return ret; + } + + bo->sgt = sgt; + bo->size = hdr->size; + + return 0; +} + +static int qaic_prepare_export_bo(struct qaic_device *qdev, struct qaic_bo *bo, + struct qaic_attach_slice_hdr *hdr) +{ + int ret; + + if (bo->size != hdr->size) + return -EINVAL; + + ret = dma_map_sgtable(&qdev->pdev->dev, bo->sgt, hdr->dir, 0); + if (ret) + return -EFAULT; + + return 0; +} + +static int qaic_prepare_bo(struct qaic_device *qdev, struct qaic_bo *bo, + struct qaic_attach_slice_hdr *hdr) +{ + int ret; + + if (bo->base.import_attach) + ret = qaic_prepare_import_bo(bo, hdr); + else + ret = qaic_prepare_export_bo(qdev, bo, hdr); + + if (ret == 0) + bo->dir = hdr->dir; + + return ret; +} + +static void qaic_unprepare_import_bo(struct qaic_bo *bo) +{ + dma_buf_unmap_attachment(bo->base.import_attach, bo->sgt, bo->dir); + bo->sgt = NULL; + bo->size = 0; +} + +static void qaic_unprepare_export_bo(struct qaic_device *qdev, struct qaic_bo *bo) +{ + dma_unmap_sgtable(&qdev->pdev->dev, bo->sgt, bo->dir, 0); +} + +static void qaic_unprepare_bo(struct qaic_device *qdev, struct qaic_bo *bo) +{ + if (bo->base.import_attach) + qaic_unprepare_import_bo(bo); + else + qaic_unprepare_export_bo(qdev, bo); + + bo->dir = 0; +} + +static void qaic_free_slices_bo(struct qaic_bo *bo) +{ + struct bo_slice *slice, *temp; + + list_for_each_entry_safe(slice, temp, &bo->slices, slice) + kref_put(&slice->ref_count, free_slice); +} + +static int qaic_attach_slicing_bo(struct qaic_device *qdev, struct qaic_bo *bo, + struct qaic_attach_slice_hdr *hdr, + struct qaic_attach_slice_entry *slice_ent) +{ + int ret, i; + + 
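+ /*
+ * Map each requested slice. If any slice fails, or the total number of
+ * queue elements needed exceeds what the DBC ring can hold, undo all of
+ * the slices created so far.
+ */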
for (i = 0; i < hdr->count; i++) { + ret = qaic_map_one_slice(qdev, bo, &slice_ent[i]); + if (ret) { + qaic_free_slices_bo(bo); + return ret; + } + } + + if (bo->total_slice_nents > qdev->dbc[hdr->dbc_id].nelem) { + qaic_free_slices_bo(bo); + return -ENOSPC; + } + + bo->sliced = true; + bo->nr_slice = hdr->count; + list_add_tail(&bo->bo_list, &qdev->dbc[hdr->dbc_id].bo_lists); + + return 0; +} + +int qaic_attach_slice_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv) +{ + struct qaic_attach_slice_entry *slice_ent; + struct qaic_attach_slice *args = data; + struct dma_bridge_chan *dbc; + int usr_rcu_id, qdev_rcu_id; + struct drm_gem_object *obj; + struct qaic_device *qdev; + unsigned long arg_size; + struct qaic_user *usr; + u8 __user *user_data; + struct qaic_bo *bo; + int ret; + + usr = file_priv->driver_priv; + usr_rcu_id = srcu_read_lock(&usr->qddev_lock); + if (!usr->qddev) { + ret = -ENODEV; + goto unlock_usr_srcu; + } + + qdev = usr->qddev->qdev; + qdev_rcu_id = srcu_read_lock(&qdev->dev_lock); + if (qdev->in_reset) { + ret = -ENODEV; + goto unlock_dev_srcu; + } + + if (args->hdr.count == 0) { + ret = -EINVAL; + goto unlock_dev_srcu; + } + + arg_size = args->hdr.count * sizeof(*slice_ent); + if (arg_size / args->hdr.count != sizeof(*slice_ent)) { + ret = -EINVAL; + goto unlock_dev_srcu; + } + + if (args->hdr.dbc_id >= qdev->num_dbc) { + ret = -EINVAL; + goto unlock_dev_srcu; + } + + if (args->hdr.size == 0) { + ret = -EINVAL; + goto unlock_dev_srcu; + } + + if (!(args->hdr.dir == DMA_TO_DEVICE || args->hdr.dir == DMA_FROM_DEVICE)) { + ret = -EINVAL; + goto unlock_dev_srcu; + } + + dbc = &qdev->dbc[args->hdr.dbc_id]; + if (dbc->usr != usr) { + ret = -EINVAL; + goto unlock_dev_srcu; + } + + if (args->data == 0) { + ret = -EINVAL; + goto unlock_dev_srcu; + } + + user_data = u64_to_user_ptr(args->data); + + slice_ent = kzalloc(arg_size, GFP_KERNEL); + if (!slice_ent) { + ret = -EINVAL; + goto unlock_dev_srcu; + } + + ret = copy_from_user(slice_ent, user_data, arg_size); + if (ret) { + ret = -EFAULT; + goto free_slice_ent; + } + + ret = qaic_validate_req(qdev, slice_ent, args->hdr.count, args->hdr.size); + if (ret) + goto free_slice_ent; + + obj = drm_gem_object_lookup(file_priv, args->hdr.handle); + if (!obj) { + ret = -ENOENT; + goto free_slice_ent; + } + + bo = to_qaic_bo(obj); + + ret = qaic_prepare_bo(qdev, bo, &args->hdr); + if (ret) + goto put_bo; + + ret = qaic_attach_slicing_bo(qdev, bo, &args->hdr, slice_ent); + if (ret) + goto unprepare_bo; + + if (args->hdr.dir == DMA_TO_DEVICE) + dma_sync_sgtable_for_cpu(&qdev->pdev->dev, bo->sgt, args->hdr.dir); + + bo->dbc = dbc; + drm_gem_object_put(obj); + srcu_read_unlock(&qdev->dev_lock, qdev_rcu_id); + srcu_read_unlock(&usr->qddev_lock, usr_rcu_id); + + return 0; + +unprepare_bo: + qaic_unprepare_bo(qdev, bo); +put_bo: + drm_gem_object_put(obj); +free_slice_ent: + kfree(slice_ent); +unlock_dev_srcu: + srcu_read_unlock(&qdev->dev_lock, qdev_rcu_id); +unlock_usr_srcu: + srcu_read_unlock(&usr->qddev_lock, usr_rcu_id); + return ret; +} + +static inline int copy_exec_reqs(struct qaic_device *qdev, struct bo_slice *slice, u32 dbc_id, + u32 head, u32 *ptail) +{ + struct dma_bridge_chan *dbc = &qdev->dbc[dbc_id]; + struct dbc_req *reqs = slice->reqs; + u32 tail = *ptail; + u32 avail; + + avail = head - tail; + if (head <= tail) + avail += dbc->nelem; + + --avail; + + if (avail < slice->nents) + return -EAGAIN; + + if (tail + slice->nents > dbc->nelem) { + avail = dbc->nelem - tail; + avail = min_t(u32, avail, 
slice->nents); + memcpy(dbc->req_q_base + tail * get_dbc_req_elem_size(), reqs, + sizeof(*reqs) * avail); + reqs += avail; + avail = slice->nents - avail; + if (avail) + memcpy(dbc->req_q_base, reqs, sizeof(*reqs) * avail); + } else { + memcpy(dbc->req_q_base + tail * get_dbc_req_elem_size(), reqs, + sizeof(*reqs) * slice->nents); + } + + *ptail = (tail + slice->nents) % dbc->nelem; + + return 0; +} + +/* + * Based on the value of resize we may only need to transmit first_n + * entries and the last entry, with last_bytes to send from the last entry. + * Note that first_n could be 0. + */ +static inline int copy_partial_exec_reqs(struct qaic_device *qdev, struct bo_slice *slice, + u64 resize, u32 dbc_id, u32 head, u32 *ptail) +{ + struct dma_bridge_chan *dbc = &qdev->dbc[dbc_id]; + struct dbc_req *reqs = slice->reqs; + struct dbc_req *last_req; + u32 tail = *ptail; + u64 total_bytes; + u64 last_bytes; + u32 first_n; + u32 avail; + int ret; + int i; + + avail = head - tail; + if (head <= tail) + avail += dbc->nelem; + + --avail; + + total_bytes = 0; + for (i = 0; i < slice->nents; i++) { + total_bytes += le32_to_cpu(reqs[i].len); + if (total_bytes >= resize) + break; + } + + if (total_bytes < resize) { + /* User space should have used the full buffer path. */ + ret = -EINVAL; + return ret; + } + + first_n = i; + last_bytes = i ? resize + le32_to_cpu(reqs[i].len) - total_bytes : resize; + + if (avail < (first_n + 1)) + return -EAGAIN; + + if (first_n) { + if (tail + first_n > dbc->nelem) { + avail = dbc->nelem - tail; + avail = min_t(u32, avail, first_n); + memcpy(dbc->req_q_base + tail * get_dbc_req_elem_size(), reqs, + sizeof(*reqs) * avail); + last_req = reqs + avail; + avail = first_n - avail; + if (avail) + memcpy(dbc->req_q_base, last_req, sizeof(*reqs) * avail); + } else { + memcpy(dbc->req_q_base + tail * get_dbc_req_elem_size(), reqs, + sizeof(*reqs) * first_n); + } + } + + /* Copy over the last entry. Here we need to adjust len to the left over + * size, and set src and dst to the entry it is copied to. + */ + last_req = dbc->req_q_base + (tail + first_n) % dbc->nelem * get_dbc_req_elem_size(); + memcpy(last_req, reqs + slice->nents - 1, sizeof(*reqs)); + + /* + * last_bytes holds size of a DMA segment, maximum DMA segment size is + * set to UINT_MAX by qaic and hence last_bytes can never exceed u32 + * range. So, by down sizing we are not corrupting the value. + */ + last_req->len = cpu_to_le32((u32)last_bytes); + last_req->src_addr = reqs[first_n].src_addr; + last_req->dest_addr = reqs[first_n].dest_addr; + + *ptail = (tail + first_n + 1) % dbc->nelem; + + return 0; +} + +static int send_bo_list_to_device(struct qaic_device *qdev, struct drm_file *file_priv, + struct qaic_execute_entry *exec, unsigned int count, + bool is_partial, struct dma_bridge_chan *dbc, u32 head, + u32 *tail) +{ + struct qaic_partial_execute_entry *pexec = (struct qaic_partial_execute_entry *)exec; + struct drm_gem_object *obj; + struct bo_slice *slice; + unsigned long flags; + struct qaic_bo *bo; + bool queued; + int i, j; + int ret; + + for (i = 0; i < count; i++) { + /* + * ref count will be decremented when the transfer of this + * buffer is complete. It is inside dbc_irq_threaded_fn(). + */ + obj = drm_gem_object_lookup(file_priv, + is_partial ? 
pexec[i].handle : exec[i].handle); + if (!obj) { + ret = -ENOENT; + goto failed_to_send_bo; + } + + bo = to_qaic_bo(obj); + + if (!bo->sliced) { + ret = -EINVAL; + goto failed_to_send_bo; + } + + if (is_partial && pexec[i].resize > bo->size) { + ret = -EINVAL; + goto failed_to_send_bo; + } + + spin_lock_irqsave(&dbc->xfer_lock, flags); + queued = bo->queued; + bo->queued = true; + if (queued) { + spin_unlock_irqrestore(&dbc->xfer_lock, flags); + ret = -EINVAL; + goto failed_to_send_bo; + } + + bo->req_id = dbc->next_req_id++; + + list_for_each_entry(slice, &bo->slices, slice) { + /* + * If this slice does not fall under the given + * resize then skip this slice and continue the loop + */ + if (is_partial && pexec[i].resize && pexec[i].resize <= slice->offset) + continue; + + for (j = 0; j < slice->nents; j++) + slice->reqs[j].req_id = cpu_to_le16(bo->req_id); + + /* + * If it is a partial execute ioctl call then check if + * resize has cut this slice short then do a partial copy + * else do complete copy + */ + if (is_partial && pexec[i].resize && + pexec[i].resize < slice->offset + slice->size) + ret = copy_partial_exec_reqs(qdev, slice, + pexec[i].resize - slice->offset, + dbc->id, head, tail); + else + ret = copy_exec_reqs(qdev, slice, dbc->id, head, tail); + if (ret) { + bo->queued = false; + spin_unlock_irqrestore(&dbc->xfer_lock, flags); + goto failed_to_send_bo; + } + } + reinit_completion(&bo->xfer_done); + list_add_tail(&bo->xfer_list, &dbc->xfer_list); + spin_unlock_irqrestore(&dbc->xfer_lock, flags); + dma_sync_sgtable_for_device(&qdev->pdev->dev, bo->sgt, bo->dir); + } + + return 0; + +failed_to_send_bo: + if (likely(obj)) + drm_gem_object_put(obj); + for (j = 0; j < i; j++) { + spin_lock_irqsave(&dbc->xfer_lock, flags); + bo = list_last_entry(&dbc->xfer_list, struct qaic_bo, xfer_list); + obj = &bo->base; + bo->queued = false; + list_del(&bo->xfer_list); + spin_unlock_irqrestore(&dbc->xfer_lock, flags); + dma_sync_sgtable_for_cpu(&qdev->pdev->dev, bo->sgt, bo->dir); + drm_gem_object_put(obj); + } + return ret; +} + +static void update_profiling_data(struct drm_file *file_priv, + struct qaic_execute_entry *exec, unsigned int count, + bool is_partial, u64 received_ts, u64 submit_ts, u32 queue_level) +{ + struct qaic_partial_execute_entry *pexec = (struct qaic_partial_execute_entry *)exec; + struct drm_gem_object *obj; + struct qaic_bo *bo; + int i; + + for (i = 0; i < count; i++) { + /* + * Since we already committed the BO to hardware, the only way + * this should fail is a pending signal. We can't cancel the + * submit to hardware, so we have to just skip the profiling + * data. In case the signal is not fatal to the process, we + * return success so that the user doesn't try to resubmit. + */ + obj = drm_gem_object_lookup(file_priv, + is_partial ? 
pexec[i].handle : exec[i].handle); + if (!obj) + break; + bo = to_qaic_bo(obj); + bo->perf_stats.req_received_ts = received_ts; + bo->perf_stats.req_submit_ts = submit_ts; + bo->perf_stats.queue_level_before = queue_level; + queue_level += bo->total_slice_nents; + drm_gem_object_put(obj); + } +} + +static int __qaic_execute_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv, + bool is_partial) +{ + struct qaic_partial_execute_entry *pexec; + struct qaic_execute *args = data; + struct qaic_execute_entry *exec; + struct dma_bridge_chan *dbc; + int usr_rcu_id, qdev_rcu_id; + struct qaic_device *qdev; + struct qaic_user *usr; + u8 __user *user_data; + unsigned long n; + u64 received_ts; + u32 queue_level; + u64 submit_ts; + int rcu_id; + u32 head; + u32 tail; + u64 size; + int ret; + + received_ts = ktime_get_ns(); + + size = is_partial ? sizeof(*pexec) : sizeof(*exec); + + n = (unsigned long)size * args->hdr.count; + if (args->hdr.count == 0 || n / args->hdr.count != size) + return -EINVAL; + + user_data = u64_to_user_ptr(args->data); + + exec = kcalloc(args->hdr.count, size, GFP_KERNEL); + pexec = (struct qaic_partial_execute_entry *)exec; + if (!exec) + return -ENOMEM; + + if (copy_from_user(exec, user_data, n)) { + ret = -EFAULT; + goto free_exec; + } + + usr = file_priv->driver_priv; + usr_rcu_id = srcu_read_lock(&usr->qddev_lock); + if (!usr->qddev) { + ret = -ENODEV; + goto unlock_usr_srcu; + } + + qdev = usr->qddev->qdev; + qdev_rcu_id = srcu_read_lock(&qdev->dev_lock); + if (qdev->in_reset) { + ret = -ENODEV; + goto unlock_dev_srcu; + } + + if (args->hdr.dbc_id >= qdev->num_dbc) { + ret = -EINVAL; + goto unlock_dev_srcu; + } + + dbc = &qdev->dbc[args->hdr.dbc_id]; + + rcu_id = srcu_read_lock(&dbc->ch_lock); + if (!dbc->usr || dbc->usr->handle != usr->handle) { + ret = -EPERM; + goto release_ch_rcu; + } + + head = readl(dbc->dbc_base + REQHP_OFF); + tail = readl(dbc->dbc_base + REQTP_OFF); + + if (head == U32_MAX || tail == U32_MAX) { + /* PCI link error */ + ret = -ENODEV; + goto release_ch_rcu; + } + + queue_level = head <= tail ? tail - head : dbc->nelem - (head - tail); + + ret = send_bo_list_to_device(qdev, file_priv, exec, args->hdr.count, is_partial, dbc, + head, &tail); + if (ret) + goto release_ch_rcu; + + /* Finalize commit to hardware */ + submit_ts = ktime_get_ns(); + writel(tail, dbc->dbc_base + REQTP_OFF); + + update_profiling_data(file_priv, exec, args->hdr.count, is_partial, received_ts, + submit_ts, queue_level); + + if (datapath_polling) + schedule_work(&dbc->poll_work); + +release_ch_rcu: + srcu_read_unlock(&dbc->ch_lock, rcu_id); +unlock_dev_srcu: + srcu_read_unlock(&qdev->dev_lock, qdev_rcu_id); +unlock_usr_srcu: + srcu_read_unlock(&usr->qddev_lock, usr_rcu_id); +free_exec: + kfree(exec); + return ret; +} + +int qaic_execute_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv) +{ + return __qaic_execute_bo_ioctl(dev, data, file_priv, false); +} + +int qaic_partial_execute_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv) +{ + return __qaic_execute_bo_ioctl(dev, data, file_priv, true); +} + +/* + * Our interrupt handling is a bit more complicated than a simple ideal, but + * sadly necessary. + * + * Each dbc has a completion queue. Entries in the queue correspond to DMA + * requests which the device has processed. The hardware already has a built + * in irq mitigation. When the device puts an entry into the queue, it will + * only trigger an interrupt if the queue was empty. 
Therefore, when adding + * the Nth event to a non-empty queue, the hardware doesn't trigger an + * interrupt. This means the host doesn't get additional interrupts signaling + * the same thing - the queue has something to process. + * This behavior can be overridden in the DMA request. + * This means that when the host receives an interrupt, it is required to + * drain the queue. + * + * This behavior is what NAPI attempts to accomplish, although we can't use + * NAPI as we don't have a netdev. We use threaded irqs instead. + * + * However, there is a situation where the host drains the queue fast enough + * that every event causes an interrupt. Typically this is not a problem as + * the rate of events would be low. However, that is not the case with + * lprnet for example. On an Intel Xeon D-2191 where we run 8 instances of + * lprnet, the host receives roughly 80k interrupts per second from the device + * (per /proc/interrupts). While NAPI documentation indicates the host should + * just chug along, sadly that behavior causes instability in some hosts. + * + * Therefore, we implement an interrupt disable scheme similar to NAPI. The + * key difference is that we will delay after draining the queue for a small + * time to allow additional events to come in via polling. Using the above + * lprnet workload, this reduces the number of interrupts processed from + * ~80k/sec to about 64 in 5 minutes and appears to solve the system + * instability. + */ +irqreturn_t dbc_irq_handler(int irq, void *data) +{ + struct dma_bridge_chan *dbc = data; + int rcu_id; + u32 head; + u32 tail; + + rcu_id = srcu_read_lock(&dbc->ch_lock); + + if (!dbc->usr) { + srcu_read_unlock(&dbc->ch_lock, rcu_id); + return IRQ_HANDLED; + } + + head = readl(dbc->dbc_base + RSPHP_OFF); + if (head == U32_MAX) { /* PCI link error */ + srcu_read_unlock(&dbc->ch_lock, rcu_id); + return IRQ_NONE; + } + + tail = readl(dbc->dbc_base + RSPTP_OFF); + if (tail == U32_MAX) { /* PCI link error */ + srcu_read_unlock(&dbc->ch_lock, rcu_id); + return IRQ_NONE; + } + + if (head == tail) { /* queue empty */ + srcu_read_unlock(&dbc->ch_lock, rcu_id); + return IRQ_NONE; + } + + disable_irq_nosync(irq); + srcu_read_unlock(&dbc->ch_lock, rcu_id); + return IRQ_WAKE_THREAD; +} + +void irq_polling_work(struct work_struct *work) +{ + struct dma_bridge_chan *dbc = container_of(work, struct dma_bridge_chan, poll_work); + unsigned long flags; + int rcu_id; + u32 head; + u32 tail; + + rcu_id = srcu_read_lock(&dbc->ch_lock); + + while (1) { + if (dbc->qdev->in_reset) { + srcu_read_unlock(&dbc->ch_lock, rcu_id); + return; + } + if (!dbc->usr) { + srcu_read_unlock(&dbc->ch_lock, rcu_id); + return; + } + spin_lock_irqsave(&dbc->xfer_lock, flags); + if (list_empty(&dbc->xfer_list)) { + spin_unlock_irqrestore(&dbc->xfer_lock, flags); + srcu_read_unlock(&dbc->ch_lock, rcu_id); + return; + } + spin_unlock_irqrestore(&dbc->xfer_lock, flags); + + head = readl(dbc->dbc_base + RSPHP_OFF); + if (head == U32_MAX) { /* PCI link error */ + srcu_read_unlock(&dbc->ch_lock, rcu_id); + return; + } + + tail = readl(dbc->dbc_base + RSPTP_OFF); + if (tail == U32_MAX) { /* PCI link error */ + srcu_read_unlock(&dbc->ch_lock, rcu_id); + return; + } + + if (head != tail) { + irq_wake_thread(dbc->irq, dbc); + srcu_read_unlock(&dbc->ch_lock, rcu_id); + return; + } + + cond_resched(); + usleep_range(datapath_poll_interval_us, 2 * datapath_poll_interval_us); + } +} + +irqreturn_t dbc_irq_threaded_fn(int irq, void *data) +{ + struct dma_bridge_chan *dbc = data; + int event_count = 
NUM_EVENTS; + int delay_count = NUM_DELAYS; + struct qaic_device *qdev; + struct qaic_bo *bo, *i; + struct dbc_rsp *rsp; + unsigned long flags; + int rcu_id; + u16 status; + u16 req_id; + u32 head; + u32 tail; + + rcu_id = srcu_read_lock(&dbc->ch_lock); + + head = readl(dbc->dbc_base + RSPHP_OFF); + if (head == U32_MAX) /* PCI link error */ + goto error_out; + + qdev = dbc->qdev; +read_fifo: + + if (!event_count) { + event_count = NUM_EVENTS; + cond_resched(); + } + + /* + * if this channel isn't assigned or gets unassigned during processing + * we have nothing further to do + */ + if (!dbc->usr) + goto error_out; + + tail = readl(dbc->dbc_base + RSPTP_OFF); + if (tail == U32_MAX) /* PCI link error */ + goto error_out; + + if (head == tail) { /* queue empty */ + if (delay_count) { + --delay_count; + usleep_range(100, 200); + goto read_fifo; /* check for a new event */ + } + goto normal_out; + } + + delay_count = NUM_DELAYS; + while (head != tail) { + if (!event_count) + break; + --event_count; + rsp = dbc->rsp_q_base + head * sizeof(*rsp); + req_id = le16_to_cpu(rsp->req_id); + status = le16_to_cpu(rsp->status); + if (status) + pci_dbg(qdev->pdev, "req_id %d failed with status %d\n", req_id, status); + spin_lock_irqsave(&dbc->xfer_lock, flags); + /* + * A BO can receive multiple interrupts, since a BO can be + * divided into multiple slices and a buffer receives as many + * interrupts as slices. So until it receives interrupts for + * all the slices we cannot mark that buffer complete. + */ + list_for_each_entry_safe(bo, i, &dbc->xfer_list, xfer_list) { + if (bo->req_id == req_id) + bo->nr_slice_xfer_done++; + else + continue; + + if (bo->nr_slice_xfer_done < bo->nr_slice) + break; + + /* + * At this point we have received all the interrupts for + * BO, which means BO execution is complete. + */ + dma_sync_sgtable_for_cpu(&qdev->pdev->dev, bo->sgt, bo->dir); + bo->nr_slice_xfer_done = 0; + bo->queued = false; + list_del(&bo->xfer_list); + bo->perf_stats.req_processed_ts = ktime_get_ns(); + complete_all(&bo->xfer_done); + drm_gem_object_put(&bo->base); + break; + } + spin_unlock_irqrestore(&dbc->xfer_lock, flags); + head = (head + 1) % dbc->nelem; + } + + /* + * Update the head pointer of response queue and let the device know + * that we have consumed elements from the queue. 
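+ * RSPTP, by contrast, is advanced by the device as it produces
+ * completions and is only ever read by the host.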
+ */ + writel(head, dbc->dbc_base + RSPHP_OFF); + + /* elements might have been put in the queue while we were processing */ + goto read_fifo; + +normal_out: + if (likely(!datapath_polling)) + enable_irq(irq); + else + schedule_work(&dbc->poll_work); + /* checking the fifo and enabling irqs is a race, missed event check */ + tail = readl(dbc->dbc_base + RSPTP_OFF); + if (tail != U32_MAX && head != tail) { + if (likely(!datapath_polling)) + disable_irq_nosync(irq); + goto read_fifo; + } + srcu_read_unlock(&dbc->ch_lock, rcu_id); + return IRQ_HANDLED; + +error_out: + srcu_read_unlock(&dbc->ch_lock, rcu_id); + if (likely(!datapath_polling)) + enable_irq(irq); + else + schedule_work(&dbc->poll_work); + + return IRQ_HANDLED; +} + +int qaic_wait_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv) +{ + struct qaic_wait *args = data; + int usr_rcu_id, qdev_rcu_id; + struct dma_bridge_chan *dbc; + struct drm_gem_object *obj; + struct qaic_device *qdev; + unsigned long timeout; + struct qaic_user *usr; + struct qaic_bo *bo; + int rcu_id; + int ret; + + usr = file_priv->driver_priv; + usr_rcu_id = srcu_read_lock(&usr->qddev_lock); + if (!usr->qddev) { + ret = -ENODEV; + goto unlock_usr_srcu; + } + + qdev = usr->qddev->qdev; + qdev_rcu_id = srcu_read_lock(&qdev->dev_lock); + if (qdev->in_reset) { + ret = -ENODEV; + goto unlock_dev_srcu; + } + + if (args->pad != 0) { + ret = -EINVAL; + goto unlock_dev_srcu; + } + + if (args->dbc_id >= qdev->num_dbc) { + ret = -EINVAL; + goto unlock_dev_srcu; + } + + dbc = &qdev->dbc[args->dbc_id]; + + rcu_id = srcu_read_lock(&dbc->ch_lock); + if (dbc->usr != usr) { + ret = -EPERM; + goto unlock_ch_srcu; + } + + obj = drm_gem_object_lookup(file_priv, args->handle); + if (!obj) { + ret = -ENOENT; + goto unlock_ch_srcu; + } + + bo = to_qaic_bo(obj); + timeout = args->timeout ? 
args->timeout : wait_exec_default_timeout_ms; + timeout = msecs_to_jiffies(timeout); + ret = wait_for_completion_interruptible_timeout(&bo->xfer_done, timeout); + if (!ret) { + ret = -ETIMEDOUT; + goto put_obj; + } + if (ret > 0) + ret = 0; + + if (!dbc->usr) + ret = -EPERM; + +put_obj: + drm_gem_object_put(obj); +unlock_ch_srcu: + srcu_read_unlock(&dbc->ch_lock, rcu_id); +unlock_dev_srcu: + srcu_read_unlock(&qdev->dev_lock, qdev_rcu_id); +unlock_usr_srcu: + srcu_read_unlock(&usr->qddev_lock, usr_rcu_id); + return ret; +} + +int qaic_perf_stats_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv) +{ + struct qaic_perf_stats_entry *ent = NULL; + struct qaic_perf_stats *args = data; + int usr_rcu_id, qdev_rcu_id; + struct drm_gem_object *obj; + struct qaic_device *qdev; + struct qaic_user *usr; + struct qaic_bo *bo; + int ret, i; + + usr = file_priv->driver_priv; + usr_rcu_id = srcu_read_lock(&usr->qddev_lock); + if (!usr->qddev) { + ret = -ENODEV; + goto unlock_usr_srcu; + } + + qdev = usr->qddev->qdev; + qdev_rcu_id = srcu_read_lock(&qdev->dev_lock); + if (qdev->in_reset) { + ret = -ENODEV; + goto unlock_dev_srcu; + } + + if (args->hdr.dbc_id >= qdev->num_dbc) { + ret = -EINVAL; + goto unlock_dev_srcu; + } + + ent = kcalloc(args->hdr.count, sizeof(*ent), GFP_KERNEL); + if (!ent) { + ret = -EINVAL; + goto unlock_dev_srcu; + } + + ret = copy_from_user(ent, u64_to_user_ptr(args->data), args->hdr.count * sizeof(*ent)); + if (ret) { + ret = -EFAULT; + goto free_ent; + } + + for (i = 0; i < args->hdr.count; i++) { + obj = drm_gem_object_lookup(file_priv, ent[i].handle); + if (!obj) { + ret = -ENOENT; + goto free_ent; + } + bo = to_qaic_bo(obj); + /* + * perf stats ioctl is called before wait ioctl is complete then + * the latency information is invalid. + */ + if (bo->perf_stats.req_processed_ts < bo->perf_stats.req_submit_ts) { + ent[i].device_latency_us = 0; + } else { + ent[i].device_latency_us = div_u64((bo->perf_stats.req_processed_ts - + bo->perf_stats.req_submit_ts), 1000); + } + ent[i].submit_latency_us = div_u64((bo->perf_stats.req_submit_ts - + bo->perf_stats.req_received_ts), 1000); + ent[i].queue_level_before = bo->perf_stats.queue_level_before; + ent[i].num_queue_element = bo->total_slice_nents; + drm_gem_object_put(obj); + } + + if (copy_to_user(u64_to_user_ptr(args->data), ent, args->hdr.count * sizeof(*ent))) + ret = -EFAULT; + +free_ent: + kfree(ent); +unlock_dev_srcu: + srcu_read_unlock(&qdev->dev_lock, qdev_rcu_id); +unlock_usr_srcu: + srcu_read_unlock(&usr->qddev_lock, usr_rcu_id); + return ret; +} + +static void empty_xfer_list(struct qaic_device *qdev, struct dma_bridge_chan *dbc) +{ + unsigned long flags; + struct qaic_bo *bo; + + spin_lock_irqsave(&dbc->xfer_lock, flags); + while (!list_empty(&dbc->xfer_list)) { + bo = list_first_entry(&dbc->xfer_list, typeof(*bo), xfer_list); + bo->queued = false; + list_del(&bo->xfer_list); + spin_unlock_irqrestore(&dbc->xfer_lock, flags); + dma_sync_sgtable_for_cpu(&qdev->pdev->dev, bo->sgt, bo->dir); + complete_all(&bo->xfer_done); + drm_gem_object_put(&bo->base); + spin_lock_irqsave(&dbc->xfer_lock, flags); + } + spin_unlock_irqrestore(&dbc->xfer_lock, flags); +} + +int disable_dbc(struct qaic_device *qdev, u32 dbc_id, struct qaic_user *usr) +{ + if (!qdev->dbc[dbc_id].usr || qdev->dbc[dbc_id].usr->handle != usr->handle) + return -EPERM; + + qdev->dbc[dbc_id].usr = NULL; + synchronize_srcu(&qdev->dbc[dbc_id].ch_lock); + return 0; +} + +/** + * enable_dbc - Enable the DBC. 
DBCs are disabled by removing the context of + * user. Add user context back to DBC to enable it. This function trusts the + * DBC ID passed and expects the DBC to be disabled. + * @qdev: Qranium device handle + * @dbc_id: ID of the DBC + * @usr: User context + */ +void enable_dbc(struct qaic_device *qdev, u32 dbc_id, struct qaic_user *usr) +{ + qdev->dbc[dbc_id].usr = usr; +} + +void wakeup_dbc(struct qaic_device *qdev, u32 dbc_id) +{ + struct dma_bridge_chan *dbc = &qdev->dbc[dbc_id]; + + dbc->usr = NULL; + empty_xfer_list(qdev, dbc); + synchronize_srcu(&dbc->ch_lock); +} + +void release_dbc(struct qaic_device *qdev, u32 dbc_id) +{ + struct bo_slice *slice, *slice_temp; + struct qaic_bo *bo, *bo_temp; + struct dma_bridge_chan *dbc; + + dbc = &qdev->dbc[dbc_id]; + if (!dbc->in_use) + return; + + wakeup_dbc(qdev, dbc_id); + + dma_free_coherent(&qdev->pdev->dev, dbc->total_size, dbc->req_q_base, dbc->dma_addr); + dbc->total_size = 0; + dbc->req_q_base = NULL; + dbc->dma_addr = 0; + dbc->nelem = 0; + dbc->usr = NULL; + + list_for_each_entry_safe(bo, bo_temp, &dbc->bo_lists, bo_list) { + list_for_each_entry_safe(slice, slice_temp, &bo->slices, slice) + kref_put(&slice->ref_count, free_slice); + bo->sliced = false; + INIT_LIST_HEAD(&bo->slices); + bo->total_slice_nents = 0; + bo->dir = 0; + bo->dbc = NULL; + bo->nr_slice = 0; + bo->nr_slice_xfer_done = 0; + bo->queued = false; + bo->req_id = 0; + init_completion(&bo->xfer_done); + complete_all(&bo->xfer_done); + list_del(&bo->bo_list); + bo->perf_stats.req_received_ts = 0; + bo->perf_stats.req_submit_ts = 0; + bo->perf_stats.req_processed_ts = 0; + bo->perf_stats.queue_level_before = 0; + } + + dbc->in_use = false; + wake_up(&dbc->dbc_release); +} diff --git a/drivers/accel/qaic/qaic_drv.c b/drivers/accel/qaic/qaic_drv.c new file mode 100644 index 000000000000..1106ad88a5b6 --- /dev/null +++ b/drivers/accel/qaic/qaic_drv.c @@ -0,0 +1,647 @@ +// SPDX-License-Identifier: GPL-2.0-only + +/* Copyright (c) 2019-2021, The Linux Foundation. All rights reserved. */ +/* Copyright (c) 2021-2023 Qualcomm Innovation Center, Inc. All rights reserved. 
*/ + +#include <linux/delay.h> +#include <linux/dma-mapping.h> +#include <linux/idr.h> +#include <linux/interrupt.h> +#include <linux/list.h> +#include <linux/kref.h> +#include <linux/mhi.h> +#include <linux/module.h> +#include <linux/msi.h> +#include <linux/mutex.h> +#include <linux/pci.h> +#include <linux/spinlock.h> +#include <linux/workqueue.h> +#include <linux/wait.h> +#include <drm/drm_accel.h> +#include <drm/drm_drv.h> +#include <drm/drm_file.h> +#include <drm/drm_gem.h> +#include <drm/drm_ioctl.h> +#include <uapi/drm/qaic_accel.h> + +#include "mhi_controller.h" +#include "mhi_qaic_ctrl.h" +#include "qaic.h" + +MODULE_IMPORT_NS(DMA_BUF); + +#define PCI_DEV_AIC100 0xa100 +#define QAIC_NAME "qaic" +#define QAIC_DESC "Qualcomm Cloud AI Accelerators" +#define CNTL_MAJOR 5 +#define CNTL_MINOR 0 + +bool datapath_polling; +module_param(datapath_polling, bool, 0400); +MODULE_PARM_DESC(datapath_polling, "Operate the datapath in polling mode"); +static bool link_up; +static DEFINE_IDA(qaic_usrs); + +static int qaic_create_drm_device(struct qaic_device *qdev, s32 partition_id); +static void qaic_destroy_drm_device(struct qaic_device *qdev, s32 partition_id); + +static void free_usr(struct kref *kref) +{ + struct qaic_user *usr = container_of(kref, struct qaic_user, ref_count); + + cleanup_srcu_struct(&usr->qddev_lock); + ida_free(&qaic_usrs, usr->handle); + kfree(usr); +} + +static int qaic_open(struct drm_device *dev, struct drm_file *file) +{ + struct qaic_drm_device *qddev = dev->dev_private; + struct qaic_device *qdev = qddev->qdev; + struct qaic_user *usr; + int rcu_id; + int ret; + + rcu_id = srcu_read_lock(&qdev->dev_lock); + if (qdev->in_reset) { + ret = -ENODEV; + goto dev_unlock; + } + + usr = kmalloc(sizeof(*usr), GFP_KERNEL); + if (!usr) { + ret = -ENOMEM; + goto dev_unlock; + } + + usr->handle = ida_alloc(&qaic_usrs, GFP_KERNEL); + if (usr->handle < 0) { + ret = usr->handle; + goto free_usr; + } + usr->qddev = qddev; + atomic_set(&usr->chunk_id, 0); + init_srcu_struct(&usr->qddev_lock); + kref_init(&usr->ref_count); + + ret = mutex_lock_interruptible(&qddev->users_mutex); + if (ret) + goto cleanup_usr; + + list_add(&usr->node, &qddev->users); + mutex_unlock(&qddev->users_mutex); + + file->driver_priv = usr; + + srcu_read_unlock(&qdev->dev_lock, rcu_id); + return 0; + +cleanup_usr: + cleanup_srcu_struct(&usr->qddev_lock); +free_usr: + kfree(usr); +dev_unlock: + srcu_read_unlock(&qdev->dev_lock, rcu_id); + return ret; +} + +static void qaic_postclose(struct drm_device *dev, struct drm_file *file) +{ + struct qaic_user *usr = file->driver_priv; + struct qaic_drm_device *qddev; + struct qaic_device *qdev; + int qdev_rcu_id; + int usr_rcu_id; + int i; + + qddev = usr->qddev; + usr_rcu_id = srcu_read_lock(&usr->qddev_lock); + if (qddev) { + qdev = qddev->qdev; + qdev_rcu_id = srcu_read_lock(&qdev->dev_lock); + if (!qdev->in_reset) { + qaic_release_usr(qdev, usr); + for (i = 0; i < qdev->num_dbc; ++i) + if (qdev->dbc[i].usr && qdev->dbc[i].usr->handle == usr->handle) + release_dbc(qdev, i); + } + srcu_read_unlock(&qdev->dev_lock, qdev_rcu_id); + + mutex_lock(&qddev->users_mutex); + if (!list_empty(&usr->node)) + list_del_init(&usr->node); + mutex_unlock(&qddev->users_mutex); + } + + srcu_read_unlock(&usr->qddev_lock, usr_rcu_id); + kref_put(&usr->ref_count, free_usr); + + file->driver_priv = NULL; +} + +DEFINE_DRM_ACCEL_FOPS(qaic_accel_fops); + +static const struct drm_ioctl_desc qaic_drm_ioctls[] = { + DRM_IOCTL_DEF_DRV(QAIC_MANAGE, qaic_manage_ioctl, 0), + 
DRM_IOCTL_DEF_DRV(QAIC_CREATE_BO, qaic_create_bo_ioctl, 0), + DRM_IOCTL_DEF_DRV(QAIC_MMAP_BO, qaic_mmap_bo_ioctl, 0), + DRM_IOCTL_DEF_DRV(QAIC_ATTACH_SLICE_BO, qaic_attach_slice_bo_ioctl, 0), + DRM_IOCTL_DEF_DRV(QAIC_EXECUTE_BO, qaic_execute_bo_ioctl, 0), + DRM_IOCTL_DEF_DRV(QAIC_PARTIAL_EXECUTE_BO, qaic_partial_execute_bo_ioctl, 0), + DRM_IOCTL_DEF_DRV(QAIC_WAIT_BO, qaic_wait_bo_ioctl, 0), + DRM_IOCTL_DEF_DRV(QAIC_PERF_STATS_BO, qaic_perf_stats_bo_ioctl, 0), +}; + +static const struct drm_driver qaic_accel_driver = { + .driver_features = DRIVER_GEM | DRIVER_COMPUTE_ACCEL, + + .name = QAIC_NAME, + .desc = QAIC_DESC, + .date = "20190618", + + .fops = &qaic_accel_fops, + .open = qaic_open, + .postclose = qaic_postclose, + + .ioctls = qaic_drm_ioctls, + .num_ioctls = ARRAY_SIZE(qaic_drm_ioctls), + .prime_fd_to_handle = drm_gem_prime_fd_to_handle, + .gem_prime_import = qaic_gem_prime_import, +}; + +static int qaic_create_drm_device(struct qaic_device *qdev, s32 partition_id) +{ + struct qaic_drm_device *qddev; + struct drm_device *ddev; + struct device *pdev; + int ret; + + /* Hold off implementing partitions until the uapi is determined */ + if (partition_id != QAIC_NO_PARTITION) + return -EINVAL; + + pdev = &qdev->pdev->dev; + + qddev = kzalloc(sizeof(*qddev), GFP_KERNEL); + if (!qddev) + return -ENOMEM; + + ddev = drm_dev_alloc(&qaic_accel_driver, pdev); + if (IS_ERR(ddev)) { + ret = PTR_ERR(ddev); + goto ddev_fail; + } + + ddev->dev_private = qddev; + qddev->ddev = ddev; + + qddev->qdev = qdev; + qddev->partition_id = partition_id; + INIT_LIST_HEAD(&qddev->users); + mutex_init(&qddev->users_mutex); + + qdev->qddev = qddev; + + ret = drm_dev_register(ddev, 0); + if (ret) { + pci_dbg(qdev->pdev, "%s: drm_dev_register failed %d\n", __func__, ret); + goto drm_reg_fail; + } + + return 0; + +drm_reg_fail: + mutex_destroy(&qddev->users_mutex); + qdev->qddev = NULL; + drm_dev_put(ddev); +ddev_fail: + kfree(qddev); + return ret; +} + +static void qaic_destroy_drm_device(struct qaic_device *qdev, s32 partition_id) +{ + struct qaic_drm_device *qddev; + struct qaic_user *usr; + + qddev = qdev->qddev; + + /* + * Existing users get unresolvable errors till they close FDs. + * Need to sync carefully with users calling close(). The + * list of users can be modified elsewhere when the lock isn't + * held here, but the sync'ing the srcu with the mutex held + * could deadlock. Grab the mutex so that the list will be + * unmodified. The user we get will exist as long as the + * lock is held. Signal that the qcdev is going away, and + * grab a reference to the user so they don't go away for + * synchronize_srcu(). Then release the mutex to avoid + * deadlock and make sure the user has observed the signal. + * With the lock released, we cannot maintain any state of the + * user list. 
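+ * That is why the loop below re-takes the mutex and re-reads the first
+ * entry on every iteration instead of using a list iterator.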
+ */ + mutex_lock(&qddev->users_mutex); + while (!list_empty(&qddev->users)) { + usr = list_first_entry(&qddev->users, struct qaic_user, node); + list_del_init(&usr->node); + kref_get(&usr->ref_count); + usr->qddev = NULL; + mutex_unlock(&qddev->users_mutex); + synchronize_srcu(&usr->qddev_lock); + kref_put(&usr->ref_count, free_usr); + mutex_lock(&qddev->users_mutex); + } + mutex_unlock(&qddev->users_mutex); + + if (qddev->ddev) { + drm_dev_unregister(qddev->ddev); + drm_dev_put(qddev->ddev); + } + + kfree(qddev); +} + +static int qaic_mhi_probe(struct mhi_device *mhi_dev, const struct mhi_device_id *id) +{ + struct qaic_device *qdev; + u16 major, minor; + int ret; + + /* + * Invoking this function indicates that the control channel to the + * device is available. We use that as a signal to indicate that + * the device side firmware has booted. The device side firmware + * manages the device resources, so we need to communicate with it + * via the control channel in order to utilize the device. Therefore + * we wait until this signal to create the drm dev that userspace will + * use to control the device, because without the device side firmware, + * userspace can't do anything useful. + */ + + qdev = pci_get_drvdata(to_pci_dev(mhi_dev->mhi_cntrl->cntrl_dev)); + + qdev->in_reset = false; + + dev_set_drvdata(&mhi_dev->dev, qdev); + qdev->cntl_ch = mhi_dev; + + ret = qaic_control_open(qdev); + if (ret) { + pci_dbg(qdev->pdev, "%s: control_open failed %d\n", __func__, ret); + return ret; + } + + ret = get_cntl_version(qdev, NULL, &major, &minor); + if (ret || major != CNTL_MAJOR || minor > CNTL_MINOR) { + pci_err(qdev->pdev, "%s: Control protocol version (%d.%d) not supported. Supported version is (%d.%d). Ret: %d\n", + __func__, major, minor, CNTL_MAJOR, CNTL_MINOR, ret); + ret = -EINVAL; + goto close_control; + } + + ret = qaic_create_drm_device(qdev, QAIC_NO_PARTITION); + + return ret; + +close_control: + qaic_control_close(qdev); + return ret; +} + +static void qaic_mhi_remove(struct mhi_device *mhi_dev) +{ +/* This is redundant since we have already observed the device crash */ +} + +static void qaic_notify_reset(struct qaic_device *qdev) +{ + int i; + + qdev->in_reset = true; + /* wake up any waiters to avoid waiting for timeouts at sync */ + wake_all_cntl(qdev); + for (i = 0; i < qdev->num_dbc; ++i) + wakeup_dbc(qdev, i); + synchronize_srcu(&qdev->dev_lock); +} + +void qaic_dev_reset_clean_local_state(struct qaic_device *qdev, bool exit_reset) +{ + int i; + + qaic_notify_reset(qdev); + + /* remove drmdevs to prevent new users from coming in */ + qaic_destroy_drm_device(qdev, QAIC_NO_PARTITION); + + /* start tearing things down */ + for (i = 0; i < qdev->num_dbc; ++i) + release_dbc(qdev, i); + + if (exit_reset) + qdev->in_reset = false; +} + +static struct qaic_device *create_qdev(struct pci_dev *pdev, const struct pci_device_id *id) +{ + struct qaic_device *qdev; + int i; + + qdev = devm_kzalloc(&pdev->dev, sizeof(*qdev), GFP_KERNEL); + if (!qdev) + return NULL; + + if (id->device == PCI_DEV_AIC100) { + qdev->num_dbc = 16; + qdev->dbc = devm_kcalloc(&pdev->dev, qdev->num_dbc, sizeof(*qdev->dbc), GFP_KERNEL); + if (!qdev->dbc) + return NULL; + } + + qdev->cntl_wq = alloc_workqueue("qaic_cntl", WQ_UNBOUND, 0); + if (!qdev->cntl_wq) + return NULL; + + pci_set_drvdata(pdev, qdev); + qdev->pdev = pdev; + + mutex_init(&qdev->cntl_mutex); + INIT_LIST_HEAD(&qdev->cntl_xfer_list); + init_srcu_struct(&qdev->dev_lock); + + for (i = 0; i < qdev->num_dbc; ++i) { + 
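		/*
		 * Per-DBC state: each DMA Bridge channel (up to 16 on AIC100)
		 * gets its own locks, transfer list and release waitqueue here.
		 * For orientation, later in this file init_msi() attaches dbc[i]
		 * to MSI vector i + 1 (vector 0 is reserved for MHI), and
		 * qaic_pci_probe() points dbc[i].dbc_base at BAR 2 +
		 * QAIC_DBC_OFF(i); e.g. dbc[0] uses vector 1 and dbc[15] uses
		 * vector 16.
		 */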
spin_lock_init(&qdev->dbc[i].xfer_lock); + qdev->dbc[i].qdev = qdev; + qdev->dbc[i].id = i; + INIT_LIST_HEAD(&qdev->dbc[i].xfer_list); + init_srcu_struct(&qdev->dbc[i].ch_lock); + init_waitqueue_head(&qdev->dbc[i].dbc_release); + INIT_LIST_HEAD(&qdev->dbc[i].bo_lists); + } + + return qdev; +} + +static void cleanup_qdev(struct qaic_device *qdev) +{ + int i; + + for (i = 0; i < qdev->num_dbc; ++i) + cleanup_srcu_struct(&qdev->dbc[i].ch_lock); + cleanup_srcu_struct(&qdev->dev_lock); + pci_set_drvdata(qdev->pdev, NULL); + destroy_workqueue(qdev->cntl_wq); +} + +static int init_pci(struct qaic_device *qdev, struct pci_dev *pdev) +{ + int bars; + int ret; + + bars = pci_select_bars(pdev, IORESOURCE_MEM); + + /* make sure the device has the expected BARs */ + if (bars != (BIT(0) | BIT(2) | BIT(4))) { + pci_dbg(pdev, "%s: expected BARs 0, 2, and 4 not found in device. Found 0x%x\n", + __func__, bars); + return -EINVAL; + } + + ret = pcim_enable_device(pdev); + if (ret) + return ret; + + ret = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64)); + if (ret) + return ret; + ret = dma_set_max_seg_size(&pdev->dev, UINT_MAX); + if (ret) + return ret; + + qdev->bar_0 = devm_ioremap_resource(&pdev->dev, &pdev->resource[0]); + if (IS_ERR(qdev->bar_0)) + return PTR_ERR(qdev->bar_0); + + qdev->bar_2 = devm_ioremap_resource(&pdev->dev, &pdev->resource[2]); + if (IS_ERR(qdev->bar_2)) + return PTR_ERR(qdev->bar_2); + + /* Managed release since we use pcim_enable_device above */ + pci_set_master(pdev); + + return 0; +} + +static int init_msi(struct qaic_device *qdev, struct pci_dev *pdev) +{ + int mhi_irq; + int ret; + int i; + + /* Managed release since we use pcim_enable_device */ + ret = pci_alloc_irq_vectors(pdev, 1, 32, PCI_IRQ_MSI); + if (ret < 0) + return ret; + + if (ret < 32) { + pci_err(pdev, "%s: Requested 32 MSIs. 
Obtained %d MSIs which is less than the 32 required.\n", + __func__, ret); + return -ENODEV; + } + + mhi_irq = pci_irq_vector(pdev, 0); + if (mhi_irq < 0) + return mhi_irq; + + for (i = 0; i < qdev->num_dbc; ++i) { + ret = devm_request_threaded_irq(&pdev->dev, pci_irq_vector(pdev, i + 1), + dbc_irq_handler, dbc_irq_threaded_fn, IRQF_SHARED, + "qaic_dbc", &qdev->dbc[i]); + if (ret) + return ret; + + if (datapath_polling) { + qdev->dbc[i].irq = pci_irq_vector(pdev, i + 1); + disable_irq_nosync(qdev->dbc[i].irq); + INIT_WORK(&qdev->dbc[i].poll_work, irq_polling_work); + } + } + + return mhi_irq; +} + +static int qaic_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id) +{ + struct qaic_device *qdev; + int mhi_irq; + int ret; + int i; + + qdev = create_qdev(pdev, id); + if (!qdev) + return -ENOMEM; + + ret = init_pci(qdev, pdev); + if (ret) + goto cleanup_qdev; + + for (i = 0; i < qdev->num_dbc; ++i) + qdev->dbc[i].dbc_base = qdev->bar_2 + QAIC_DBC_OFF(i); + + mhi_irq = init_msi(qdev, pdev); + if (mhi_irq < 0) { + ret = mhi_irq; + goto cleanup_qdev; + } + + qdev->mhi_cntrl = qaic_mhi_register_controller(pdev, qdev->bar_0, mhi_irq); + if (IS_ERR(qdev->mhi_cntrl)) { + ret = PTR_ERR(qdev->mhi_cntrl); + goto cleanup_qdev; + } + + return 0; + +cleanup_qdev: + cleanup_qdev(qdev); + return ret; +} + +static void qaic_pci_remove(struct pci_dev *pdev) +{ + struct qaic_device *qdev = pci_get_drvdata(pdev); + + if (!qdev) + return; + + qaic_dev_reset_clean_local_state(qdev, false); + qaic_mhi_free_controller(qdev->mhi_cntrl, link_up); + cleanup_qdev(qdev); +} + +static void qaic_pci_shutdown(struct pci_dev *pdev) +{ + /* see qaic_exit for what link_up is doing */ + link_up = true; + qaic_pci_remove(pdev); +} + +static pci_ers_result_t qaic_pci_error_detected(struct pci_dev *pdev, pci_channel_state_t error) +{ + return PCI_ERS_RESULT_NEED_RESET; +} + +static void qaic_pci_reset_prepare(struct pci_dev *pdev) +{ + struct qaic_device *qdev = pci_get_drvdata(pdev); + + qaic_notify_reset(qdev); + qaic_mhi_start_reset(qdev->mhi_cntrl); + qaic_dev_reset_clean_local_state(qdev, false); +} + +static void qaic_pci_reset_done(struct pci_dev *pdev) +{ + struct qaic_device *qdev = pci_get_drvdata(pdev); + + qdev->in_reset = false; + qaic_mhi_reset_done(qdev->mhi_cntrl); +} + +static const struct mhi_device_id qaic_mhi_match_table[] = { + { .chan = "QAIC_CONTROL", }, + {}, +}; + +static struct mhi_driver qaic_mhi_driver = { + .id_table = qaic_mhi_match_table, + .remove = qaic_mhi_remove, + .probe = qaic_mhi_probe, + .ul_xfer_cb = qaic_mhi_ul_xfer_cb, + .dl_xfer_cb = qaic_mhi_dl_xfer_cb, + .driver = { + .name = "qaic_mhi", + }, +}; + +static const struct pci_device_id qaic_ids[] = { + { PCI_DEVICE(PCI_VENDOR_ID_QCOM, PCI_DEV_AIC100), }, + { } +}; +MODULE_DEVICE_TABLE(pci, qaic_ids); + +static const struct pci_error_handlers qaic_pci_err_handler = { + .error_detected = qaic_pci_error_detected, + .reset_prepare = qaic_pci_reset_prepare, + .reset_done = qaic_pci_reset_done, +}; + +static struct pci_driver qaic_pci_driver = { + .name = QAIC_NAME, + .id_table = qaic_ids, + .probe = qaic_pci_probe, + .remove = qaic_pci_remove, + .shutdown = qaic_pci_shutdown, + .err_handler = &qaic_pci_err_handler, +}; + +static int __init qaic_init(void) +{ + int ret; + + ret = mhi_driver_register(&qaic_mhi_driver); + if (ret) { + pr_debug("qaic: mhi_driver_register failed %d\n", ret); + return ret; + } + + ret = pci_register_driver(&qaic_pci_driver); + if (ret) { + pr_debug("qaic: pci_register_driver failed %d\n", ret); + goto 
free_mhi; + } + + ret = mhi_qaic_ctrl_init(); + if (ret) { + pr_debug("qaic: mhi_qaic_ctrl_init failed %d\n", ret); + goto free_pci; + } + + return 0; + +free_pci: + pci_unregister_driver(&qaic_pci_driver); +free_mhi: + mhi_driver_unregister(&qaic_mhi_driver); + return ret; +} + +static void __exit qaic_exit(void) +{ + /* + * We assume that qaic_pci_remove() is called due to a hotplug event + * which would mean that the link is down, and thus + * qaic_mhi_free_controller() should not try to access the device during + * cleanup. + * We call pci_unregister_driver() below, which also triggers + * qaic_pci_remove(), but since this is module exit, we expect the link + * to the device to be up, in which case qaic_mhi_free_controller() + * should try to access the device during cleanup to put the device in + * a sane state. + * For that reason, we set link_up here to let qaic_mhi_free_controller + * know the expected link state. Since the module is going to be + * removed at the end of this, we don't need to worry about + * reinitializing the link_up state after the cleanup is done. + */ + link_up = true; + mhi_qaic_ctrl_deinit(); + pci_unregister_driver(&qaic_pci_driver); + mhi_driver_unregister(&qaic_mhi_driver); +} + +module_init(qaic_init); +module_exit(qaic_exit); + +MODULE_AUTHOR(QAIC_DESC " Kernel Driver Team"); +MODULE_DESCRIPTION(QAIC_DESC " Accel Driver"); +MODULE_LICENSE("GPL"); diff --git a/drivers/gpu/drm/ast/ast_drv.c b/drivers/gpu/drm/ast/ast_drv.c index 3a7af6d5aa79..e1224ef4ad83 100644 --- a/drivers/gpu/drm/ast/ast_drv.c +++ b/drivers/gpu/drm/ast/ast_drv.c @@ -89,27 +89,13 @@ static const struct pci_device_id ast_pciidlist[] = { MODULE_DEVICE_TABLE(pci, ast_pciidlist); -static int ast_remove_conflicting_framebuffers(struct pci_dev *pdev) -{ - bool primary = false; - resource_size_t base, size; - - base = pci_resource_start(pdev, 0); - size = pci_resource_len(pdev, 0); -#ifdef CONFIG_X86 - primary = pdev->resource[PCI_ROM_RESOURCE].flags & IORESOURCE_ROM_SHADOW; -#endif - - return drm_aperture_remove_conflicting_framebuffers(base, size, primary, &ast_driver); -} - static int ast_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent) { struct ast_device *ast; struct drm_device *dev; int ret; - ret = ast_remove_conflicting_framebuffers(pdev); + ret = drm_aperture_remove_conflicting_pci_framebuffers(pdev, &ast_driver); if (ret) return ret; diff --git a/drivers/gpu/drm/bridge/fsl-ldb.c b/drivers/gpu/drm/bridge/fsl-ldb.c index 450b352914f4..682623369498 100644 --- a/drivers/gpu/drm/bridge/fsl-ldb.c +++ b/drivers/gpu/drm/bridge/fsl-ldb.c @@ -84,10 +84,16 @@ struct fsl_ldb { struct drm_bridge *panel_bridge; struct clk *clk; struct regmap *regmap; - bool lvds_dual_link; const struct fsl_ldb_devdata *devdata; + bool ch0_enabled; + bool ch1_enabled; }; +static bool fsl_ldb_is_dual(const struct fsl_ldb *fsl_ldb) +{ + return (fsl_ldb->ch0_enabled && fsl_ldb->ch1_enabled); +} + static inline struct fsl_ldb *to_fsl_ldb(struct drm_bridge *bridge) { return container_of(bridge, struct fsl_ldb, bridge); @@ -95,7 +101,7 @@ static inline struct fsl_ldb *to_fsl_ldb(struct drm_bridge *bridge) static unsigned long fsl_ldb_link_frequency(struct fsl_ldb *fsl_ldb, int clock) { - if (fsl_ldb->lvds_dual_link) + if (fsl_ldb_is_dual(fsl_ldb)) return clock * 3500; else return clock * 7000; @@ -170,35 +176,28 @@ static void fsl_ldb_atomic_enable(struct drm_bridge *bridge, configured_link_freq = clk_get_rate(fsl_ldb->clk); if (configured_link_freq != requested_link_freq) - dev_warn(fsl_ldb->dev, 
"Configured LDB clock (%lu Hz) does not match requested LVDS clock: %lu Hz", + dev_warn(fsl_ldb->dev, "Configured LDB clock (%lu Hz) does not match requested LVDS clock: %lu Hz\n", configured_link_freq, requested_link_freq); clk_prepare_enable(fsl_ldb->clk); /* Program LDB_CTRL */ - reg = LDB_CTRL_CH0_ENABLE; + reg = (fsl_ldb->ch0_enabled ? LDB_CTRL_CH0_ENABLE : 0) | + (fsl_ldb->ch1_enabled ? LDB_CTRL_CH1_ENABLE : 0) | + (fsl_ldb_is_dual(fsl_ldb) ? LDB_CTRL_SPLIT_MODE : 0); - if (fsl_ldb->lvds_dual_link) - reg |= LDB_CTRL_CH1_ENABLE | LDB_CTRL_SPLIT_MODE; + if (lvds_format_24bpp) + reg |= (fsl_ldb->ch0_enabled ? LDB_CTRL_CH0_DATA_WIDTH : 0) | + (fsl_ldb->ch1_enabled ? LDB_CTRL_CH1_DATA_WIDTH : 0); - if (lvds_format_24bpp) { - reg |= LDB_CTRL_CH0_DATA_WIDTH; - if (fsl_ldb->lvds_dual_link) - reg |= LDB_CTRL_CH1_DATA_WIDTH; - } + if (lvds_format_jeida) + reg |= (fsl_ldb->ch0_enabled ? LDB_CTRL_CH0_BIT_MAPPING : 0) | + (fsl_ldb->ch1_enabled ? LDB_CTRL_CH1_BIT_MAPPING : 0); - if (lvds_format_jeida) { - reg |= LDB_CTRL_CH0_BIT_MAPPING; - if (fsl_ldb->lvds_dual_link) - reg |= LDB_CTRL_CH1_BIT_MAPPING; - } - - if (mode->flags & DRM_MODE_FLAG_PVSYNC) { - reg |= LDB_CTRL_DI0_VSYNC_POLARITY; - if (fsl_ldb->lvds_dual_link) - reg |= LDB_CTRL_DI1_VSYNC_POLARITY; - } + if (mode->flags & DRM_MODE_FLAG_PVSYNC) + reg |= (fsl_ldb->ch0_enabled ? LDB_CTRL_DI0_VSYNC_POLARITY : 0) | + (fsl_ldb->ch1_enabled ? LDB_CTRL_DI1_VSYNC_POLARITY : 0); regmap_write(fsl_ldb->regmap, fsl_ldb->devdata->ldb_ctrl, reg); @@ -210,9 +209,8 @@ static void fsl_ldb_atomic_enable(struct drm_bridge *bridge, /* Wait for VBG to stabilize. */ usleep_range(15, 20); - reg |= LVDS_CTRL_CH0_EN; - if (fsl_ldb->lvds_dual_link) - reg |= LVDS_CTRL_CH1_EN; + reg |= (fsl_ldb->ch0_enabled ? LVDS_CTRL_CH0_EN : 0) | + (fsl_ldb->ch1_enabled ? LVDS_CTRL_CH1_EN : 0); regmap_write(fsl_ldb->regmap, fsl_ldb->devdata->lvds_ctrl, reg); } @@ -265,7 +263,7 @@ fsl_ldb_mode_valid(struct drm_bridge *bridge, { struct fsl_ldb *fsl_ldb = to_fsl_ldb(bridge); - if (mode->clock > (fsl_ldb->lvds_dual_link ? 160000 : 80000)) + if (mode->clock > (fsl_ldb_is_dual(fsl_ldb) ? 160000 : 80000)) return MODE_CLOCK_HIGH; return MODE_OK; @@ -286,7 +284,7 @@ static int fsl_ldb_probe(struct platform_device *pdev) { struct device *dev = &pdev->dev; struct device_node *panel_node; - struct device_node *port1, *port2; + struct device_node *remote1, *remote2; struct drm_panel *panel; struct fsl_ldb *fsl_ldb; int dual_link; @@ -311,10 +309,23 @@ static int fsl_ldb_probe(struct platform_device *pdev) if (IS_ERR(fsl_ldb->regmap)) return PTR_ERR(fsl_ldb->regmap); - /* Locate the panel DT node. */ - panel_node = of_graph_get_remote_node(dev->of_node, 1, 0); - if (!panel_node) - return -ENXIO; + /* Locate the remote ports and the panel node */ + remote1 = of_graph_get_remote_node(dev->of_node, 1, 0); + remote2 = of_graph_get_remote_node(dev->of_node, 2, 0); + fsl_ldb->ch0_enabled = (remote1 != NULL); + fsl_ldb->ch1_enabled = (remote2 != NULL); + panel_node = of_node_get(remote1 ? remote1 : remote2); + of_node_put(remote1); + of_node_put(remote2); + + if (!fsl_ldb->ch0_enabled && !fsl_ldb->ch1_enabled) { + of_node_put(panel_node); + return dev_err_probe(dev, -ENXIO, "No panel node found"); + } + + dev_dbg(dev, "Using %s\n", + fsl_ldb_is_dual(fsl_ldb) ? "dual-link mode" : + fsl_ldb->ch0_enabled ? 
"channel 0" : "channel 1"); panel = of_drm_find_panel(panel_node); of_node_put(panel_node); @@ -325,20 +336,26 @@ static int fsl_ldb_probe(struct platform_device *pdev) if (IS_ERR(fsl_ldb->panel_bridge)) return PTR_ERR(fsl_ldb->panel_bridge); - /* Determine whether this is dual-link configuration */ - port1 = of_graph_get_port_by_id(dev->of_node, 1); - port2 = of_graph_get_port_by_id(dev->of_node, 2); - dual_link = drm_of_lvds_get_dual_link_pixel_order(port1, port2); - of_node_put(port1); - of_node_put(port2); - if (dual_link == DRM_LVDS_DUAL_LINK_EVEN_ODD_PIXELS) { - dev_err(dev, "LVDS channel pixel swap not supported.\n"); - return -EINVAL; - } + if (fsl_ldb_is_dual(fsl_ldb)) { + struct device_node *port1, *port2; - if (dual_link == DRM_LVDS_DUAL_LINK_ODD_EVEN_PIXELS) - fsl_ldb->lvds_dual_link = true; + port1 = of_graph_get_port_by_id(dev->of_node, 1); + port2 = of_graph_get_port_by_id(dev->of_node, 2); + dual_link = drm_of_lvds_get_dual_link_pixel_order(port1, port2); + of_node_put(port1); + of_node_put(port2); + + if (dual_link < 0) + return dev_err_probe(dev, dual_link, + "Error getting dual link configuration\n"); + + /* Only DRM_LVDS_DUAL_LINK_ODD_EVEN_PIXELS is supported */ + if (dual_link == DRM_LVDS_DUAL_LINK_EVEN_ODD_PIXELS) { + dev_err(dev, "LVDS channel pixel swap not supported.\n"); + return -EINVAL; + } + } platform_set_drvdata(pdev, fsl_ldb); diff --git a/drivers/gpu/drm/bridge/lontium-lt8912b.c b/drivers/gpu/drm/bridge/lontium-lt8912b.c index b40baced1331..13c131ade268 100644 --- a/drivers/gpu/drm/bridge/lontium-lt8912b.c +++ b/drivers/gpu/drm/bridge/lontium-lt8912b.c @@ -504,7 +504,6 @@ static int lt8912_attach_dsi(struct lt8912 *lt) dsi->format = MIPI_DSI_FMT_RGB888; dsi->mode_flags = MIPI_DSI_MODE_VIDEO | - MIPI_DSI_MODE_VIDEO_BURST | MIPI_DSI_MODE_LPM | MIPI_DSI_MODE_NO_EOT_PACKET; diff --git a/drivers/gpu/drm/bridge/parade-ps8640.c b/drivers/gpu/drm/bridge/parade-ps8640.c index b823e55650b1..c3eb45179405 100644 --- a/drivers/gpu/drm/bridge/parade-ps8640.c +++ b/drivers/gpu/drm/bridge/parade-ps8640.c @@ -184,7 +184,7 @@ static int _ps8640_wait_hpd_asserted(struct ps8640 *ps_bridge, unsigned long wai * actually connected to GPIO9). 
*/ ret = regmap_read_poll_timeout(map, PAGE2_GPIO_H, status, - status & PS_GPIO9, wait_us / 10, wait_us); + status & PS_GPIO9, 20000, wait_us); /* * The first time we see HPD go high after a reset we delay an extra diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c index aa51c61a78c7..603bb3c51027 100644 --- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c +++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c @@ -1426,9 +1426,9 @@ void dw_hdmi_set_high_tmds_clock_ratio(struct dw_hdmi *hdmi, /* Control for TMDS Bit Period/TMDS Clock-Period Ratio */ if (dw_hdmi_support_scdc(hdmi, display)) { if (mtmdsclock > HDMI14_MAX_TMDSCLK) - drm_scdc_set_high_tmds_clock_ratio(hdmi->ddc, 1); + drm_scdc_set_high_tmds_clock_ratio(&hdmi->connector, 1); else - drm_scdc_set_high_tmds_clock_ratio(hdmi->ddc, 0); + drm_scdc_set_high_tmds_clock_ratio(&hdmi->connector, 0); } } EXPORT_SYMBOL_GPL(dw_hdmi_set_high_tmds_clock_ratio); @@ -2116,7 +2116,7 @@ static void hdmi_av_composer(struct dw_hdmi *hdmi, min_t(u8, bytes, SCDC_MIN_SOURCE_VERSION)); /* Enabled Scrambling in the Sink */ - drm_scdc_set_scrambling(hdmi->ddc, 1); + drm_scdc_set_scrambling(&hdmi->connector, 1); /* * To activate the scrambler feature, you must ensure @@ -2132,7 +2132,7 @@ static void hdmi_av_composer(struct dw_hdmi *hdmi, hdmi_writeb(hdmi, 0, HDMI_FC_SCRAMBLER_CTRL); hdmi_writeb(hdmi, (u8)~HDMI_MC_SWRSTZ_TMDSSWRST_REQ, HDMI_MC_SWRSTZ); - drm_scdc_set_scrambling(hdmi->ddc, 0); + drm_scdc_set_scrambling(&hdmi->connector, 0); } } diff --git a/drivers/gpu/drm/bridge/tc358767.c b/drivers/gpu/drm/bridge/tc358767.c index 6d16ec45ea61..91f7cb56a654 100644 --- a/drivers/gpu/drm/bridge/tc358767.c +++ b/drivers/gpu/drm/bridge/tc358767.c @@ -1896,10 +1896,10 @@ static int tc_mipi_dsi_host_attach(struct tc_data *tc) "failed to create dsi device\n"); tc->dsi = dsi; - dsi->lanes = dsi_lanes; dsi->format = MIPI_DSI_FMT_RGB888; - dsi->mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_SYNC_PULSE; + dsi->mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_BURST | + MIPI_DSI_MODE_LPM | MIPI_DSI_CLOCK_NON_CONTINUOUS; ret = mipi_dsi_attach(dsi); if (ret < 0) { diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi83.c b/drivers/gpu/drm/bridge/ti-sn65dsi83.c index 91ecfbe45bf9..75286c9afbb9 100644 --- a/drivers/gpu/drm/bridge/ti-sn65dsi83.c +++ b/drivers/gpu/drm/bridge/ti-sn65dsi83.c @@ -642,7 +642,9 @@ static int sn65dsi83_host_attach(struct sn65dsi83 *ctx) dsi->lanes = dsi_lanes; dsi->format = MIPI_DSI_FMT_RGB888; - dsi->mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_BURST; + dsi->mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_BURST | + MIPI_DSI_MODE_VIDEO_NO_HFP | MIPI_DSI_MODE_VIDEO_NO_HBP | + MIPI_DSI_MODE_VIDEO_NO_HSA | MIPI_DSI_MODE_NO_EOT_PACKET; ret = devm_mipi_dsi_attach(dev, dsi); if (ret < 0) { @@ -698,8 +700,10 @@ static int sn65dsi83_probe(struct i2c_client *client) drm_bridge_add(&ctx->bridge); ret = sn65dsi83_host_attach(ctx); - if (ret) + if (ret) { + dev_err_probe(dev, ret, "failed to attach DSI host\n"); goto err_remove_bridge; + } return 0; diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi86.c b/drivers/gpu/drm/bridge/ti-sn65dsi86.c index 1e26fa63845a..7a748785c545 100644 --- a/drivers/gpu/drm/bridge/ti-sn65dsi86.c +++ b/drivers/gpu/drm/bridge/ti-sn65dsi86.c @@ -363,7 +363,7 @@ static int __maybe_unused ti_sn65dsi86_resume(struct device *dev) /* td2: min 100 us after regulators before enabling the GPIO */ usleep_range(100, 110); - gpiod_set_value(pdata->enable_gpio, 1); + 
gpiod_set_value_cansleep(pdata->enable_gpio, 1); /* * If we have a reference clock we can enable communication w/ the @@ -386,7 +386,7 @@ static int __maybe_unused ti_sn65dsi86_suspend(struct device *dev) if (pdata->refclk) ti_sn65dsi86_disable_comms(pdata); - gpiod_set_value(pdata->enable_gpio, 0); + gpiod_set_value_cansleep(pdata->enable_gpio, 0); ret = regulator_bulk_disable(SN_REGULATOR_SUPPLY_NUM, pdata->supplies); if (ret) diff --git a/drivers/gpu/drm/display/drm_scdc_helper.c b/drivers/gpu/drm/display/drm_scdc_helper.c index c3ad4ab2b456..6d2f244e5830 100644 --- a/drivers/gpu/drm/display/drm_scdc_helper.c +++ b/drivers/gpu/drm/display/drm_scdc_helper.c @@ -26,6 +26,8 @@ #include <linux/delay.h> #include <drm/display/drm_scdc_helper.h> +#include <drm/drm_connector.h> +#include <drm/drm_device.h> #include <drm/drm_print.h> /** @@ -140,7 +142,7 @@ EXPORT_SYMBOL(drm_scdc_write); /** * drm_scdc_get_scrambling_status - what is status of scrambling? - * @adapter: I2C adapter for DDC channel + * @connector: connector * * Reads the scrambler status over SCDC, and checks the * scrambling status. @@ -148,14 +150,16 @@ EXPORT_SYMBOL(drm_scdc_write); * Returns: * True if the scrambling is enabled, false otherwise. */ -bool drm_scdc_get_scrambling_status(struct i2c_adapter *adapter) +bool drm_scdc_get_scrambling_status(struct drm_connector *connector) { u8 status; int ret; - ret = drm_scdc_readb(adapter, SCDC_SCRAMBLER_STATUS, &status); + ret = drm_scdc_readb(connector->ddc, SCDC_SCRAMBLER_STATUS, &status); if (ret < 0) { - DRM_DEBUG_KMS("Failed to read scrambling status: %d\n", ret); + drm_dbg_kms(connector->dev, + "[CONNECTOR:%d:%s] Failed to read scrambling status: %d\n", + connector->base.id, connector->name, ret); return false; } @@ -165,7 +169,7 @@ EXPORT_SYMBOL(drm_scdc_get_scrambling_status); /** * drm_scdc_set_scrambling - enable scrambling - * @adapter: I2C adapter for DDC channel + * @connector: connector * @enable: bool to indicate if scrambling is to be enabled/disabled * * Writes the TMDS config register over SCDC channel, and: @@ -175,14 +179,17 @@ EXPORT_SYMBOL(drm_scdc_get_scrambling_status); * Returns: * True if scrambling is set/reset successfully, false otherwise. 
*/ -bool drm_scdc_set_scrambling(struct i2c_adapter *adapter, bool enable) +bool drm_scdc_set_scrambling(struct drm_connector *connector, + bool enable) { u8 config; int ret; - ret = drm_scdc_readb(adapter, SCDC_TMDS_CONFIG, &config); + ret = drm_scdc_readb(connector->ddc, SCDC_TMDS_CONFIG, &config); if (ret < 0) { - DRM_DEBUG_KMS("Failed to read TMDS config: %d\n", ret); + drm_dbg_kms(connector->dev, + "[CONNECTOR:%d:%s] Failed to read TMDS config: %d\n", + connector->base.id, connector->name, ret); return false; } @@ -191,9 +198,11 @@ bool drm_scdc_set_scrambling(struct i2c_adapter *adapter, bool enable) else config &= ~SCDC_SCRAMBLING_ENABLE; - ret = drm_scdc_writeb(adapter, SCDC_TMDS_CONFIG, config); + ret = drm_scdc_writeb(connector->ddc, SCDC_TMDS_CONFIG, config); if (ret < 0) { - DRM_DEBUG_KMS("Failed to enable scrambling: %d\n", ret); + drm_dbg_kms(connector->dev, + "[CONNECTOR:%d:%s] Failed to enable scrambling: %d\n", + connector->base.id, connector->name, ret); return false; } @@ -203,7 +212,7 @@ EXPORT_SYMBOL(drm_scdc_set_scrambling); /** * drm_scdc_set_high_tmds_clock_ratio - set TMDS clock ratio - * @adapter: I2C adapter for DDC channel + * @connector: connector * @set: ret or reset the high clock ratio * * @@ -230,14 +239,17 @@ EXPORT_SYMBOL(drm_scdc_set_scrambling); * Returns: * True if write is successful, false otherwise. */ -bool drm_scdc_set_high_tmds_clock_ratio(struct i2c_adapter *adapter, bool set) +bool drm_scdc_set_high_tmds_clock_ratio(struct drm_connector *connector, + bool set) { u8 config; int ret; - ret = drm_scdc_readb(adapter, SCDC_TMDS_CONFIG, &config); + ret = drm_scdc_readb(connector->ddc, SCDC_TMDS_CONFIG, &config); if (ret < 0) { - DRM_DEBUG_KMS("Failed to read TMDS config: %d\n", ret); + drm_dbg_kms(connector->dev, + "[CONNECTOR:%d:%s] Failed to read TMDS config: %d\n", + connector->base.id, connector->name, ret); return false; } @@ -246,9 +258,11 @@ bool drm_scdc_set_high_tmds_clock_ratio(struct i2c_adapter *adapter, bool set) else config &= ~SCDC_TMDS_BIT_CLOCK_RATIO_BY_40; - ret = drm_scdc_writeb(adapter, SCDC_TMDS_CONFIG, config); + ret = drm_scdc_writeb(connector->ddc, SCDC_TMDS_CONFIG, config); if (ret < 0) { - DRM_DEBUG_KMS("Failed to set TMDS clock ratio: %d\n", ret); + drm_dbg_kms(connector->dev, + "[CONNECTOR:%d:%s] Failed to set TMDS clock ratio: %d\n", + connector->base.id, connector->name, ret); return false; } diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c index d4d2a2ce40f8..2c2c9caf0be5 100644 --- a/drivers/gpu/drm/drm_atomic_helper.c +++ b/drivers/gpu/drm/drm_atomic_helper.c @@ -1528,6 +1528,12 @@ static void set_fence_deadline(struct drm_device *dev, for_each_new_crtc_in_state (state, crtc, new_crtc_state, i) { ktime_t v; + if (drm_atomic_crtc_needs_modeset(new_crtc_state)) + continue; + + if (!new_crtc_state->active) + continue; + if (drm_crtc_next_vblank_start(crtc, &v)) continue; diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c index 63ec95e86d0e..64458982be40 100644 --- a/drivers/gpu/drm/drm_fb_helper.c +++ b/drivers/gpu/drm/drm_fb_helper.c @@ -1537,6 +1537,27 @@ static void drm_fb_helper_fill_pixel_fmt(struct fb_var_screeninfo *var, } } +static void __fill_var(struct fb_var_screeninfo *var, + struct drm_framebuffer *fb) +{ + int i; + + var->xres_virtual = fb->width; + var->yres_virtual = fb->height; + var->accel_flags = FB_ACCELF_TEXT; + var->bits_per_pixel = drm_format_info_bpp(fb->format, 0); + + var->height = var->width = 0; + var->left_margin = 
var->right_margin = 0; + var->upper_margin = var->lower_margin = 0; + var->hsync_len = var->vsync_len = 0; + var->sync = var->vmode = 0; + var->rotate = 0; + var->colorspace = 0; + for (i = 0; i < 4; i++) + var->reserved[i] = 0; +} + /** * drm_fb_helper_check_var - implementation for &fb_ops.fb_check_var * @var: screeninfo to check @@ -1589,6 +1610,23 @@ int drm_fb_helper_check_var(struct fb_var_screeninfo *var, return -EINVAL; } + __fill_var(var, fb); + + /* + * fb_pan_display() validates this, but fb_set_par() doesn't and just + * falls over. Note that __fill_var above adjusts y/res_virtual. + */ + if (var->yoffset > var->yres_virtual - var->yres || + var->xoffset > var->xres_virtual - var->xres) + return -EINVAL; + + /* We neither support grayscale nor FOURCC (also stored in here). */ + if (var->grayscale > 0) + return -EINVAL; + + if (var->nonstd) + return -EINVAL; + /* * Workaround for SDL 1.2, which is known to be setting all pixel format * fields values to zero in some cases. We treat this situation as a @@ -1604,11 +1642,6 @@ int drm_fb_helper_check_var(struct fb_var_screeninfo *var, } /* - * Likewise, bits_per_pixel should be rounded up to a supported value. - */ - var->bits_per_pixel = bpp; - - /* * drm fbdev emulation doesn't support changing the pixel format at all, * so reject all pixel format changing requests. */ @@ -1638,11 +1671,6 @@ int drm_fb_helper_set_par(struct fb_info *info) if (oops_in_progress) return -EBUSY; - if (var->pixclock != 0) { - drm_err(fb_helper->dev, "PIXEL CLOCK SET\n"); - return -EINVAL; - } - /* * Normally we want to make sure that a kms master takes precedence over * fbdev, to avoid fbdev flickering and occasionally stealing the @@ -2036,12 +2064,9 @@ static void drm_fb_helper_fill_var(struct fb_info *info, } info->pseudo_palette = fb_helper->pseudo_palette; - info->var.xres_virtual = fb->width; - info->var.yres_virtual = fb->height; - info->var.bits_per_pixel = drm_format_info_bpp(format, 0); - info->var.accel_flags = FB_ACCELF_TEXT; info->var.xoffset = 0; info->var.yoffset = 0; + __fill_var(&info->var, fb); info->var.activate = FB_ACTIVATE_NOW; drm_fb_helper_fill_pixel_fmt(&info->var, format); diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c index 149cd4ff6a3b..d29dafce9bb0 100644 --- a/drivers/gpu/drm/drm_prime.c +++ b/drivers/gpu/drm/drm_prime.c @@ -544,7 +544,8 @@ int drm_prime_handle_to_fd_ioctl(struct drm_device *dev, void *data, * Optional pinning of buffers is handled at dma-buf attach and detach time in * drm_gem_map_attach() and drm_gem_map_detach(). Backing storage itself is * handled by drm_gem_map_dma_buf() and drm_gem_unmap_dma_buf(), which relies on - * &drm_gem_object_funcs.get_sg_table. + * &drm_gem_object_funcs.get_sg_table. If &drm_gem_object_funcs.get_sg_table is + * unimplemented, exports into another device are rejected. * * For kernel-internal access there's drm_gem_dmabuf_vmap() and * drm_gem_dmabuf_vunmap(). 
Userspace mmap support is provided by @@ -583,6 +584,9 @@ int drm_gem_map_attach(struct dma_buf *dma_buf, { struct drm_gem_object *obj = dma_buf->priv; + if (!obj->funcs->get_sg_table) + return -ENOSYS; + return drm_gem_pin(obj); } EXPORT_SYMBOL(drm_gem_map_attach); diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c index 299fa2a19a90..877e2067534f 100644 --- a/drivers/gpu/drm/drm_vblank.c +++ b/drivers/gpu/drm/drm_vblank.c @@ -996,10 +996,16 @@ EXPORT_SYMBOL(drm_crtc_vblank_count_and_time); int drm_crtc_next_vblank_start(struct drm_crtc *crtc, ktime_t *vblanktime) { unsigned int pipe = drm_crtc_index(crtc); - struct drm_vblank_crtc *vblank = &crtc->dev->vblank[pipe]; - struct drm_display_mode *mode = &vblank->hwmode; + struct drm_vblank_crtc *vblank; + struct drm_display_mode *mode; u64 vblank_start; + if (!drm_dev_has_vblank(crtc->dev)) + return -EINVAL; + + vblank = &crtc->dev->vblank[pipe]; + mode = &vblank->hwmode; + if (!vblank->framedur_ns || !vblank->linedur_ns) return -EINVAL; diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c b/drivers/gpu/drm/i915/display/intel_ddi.c index 73240cf78c8b..d8a9790f9d36 100644 --- a/drivers/gpu/drm/i915/display/intel_ddi.c +++ b/drivers/gpu/drm/i915/display/intel_ddi.c @@ -3988,8 +3988,8 @@ static int intel_hdmi_reset_link(struct intel_encoder *encoder, ret = drm_scdc_readb(adapter, SCDC_TMDS_CONFIG, &config); if (ret < 0) { - drm_err(&dev_priv->drm, "Failed to read TMDS config: %d\n", - ret); + drm_err(&dev_priv->drm, "[CONNECTOR:%d:%s] Failed to read TMDS config: %d\n", + connector->base.base.id, connector->base.name, ret); return 0; } diff --git a/drivers/gpu/drm/i915/display/intel_hdmi.c b/drivers/gpu/drm/i915/display/intel_hdmi.c index c7e9e1fbed37..a690a5616506 100644 --- a/drivers/gpu/drm/i915/display/intel_hdmi.c +++ b/drivers/gpu/drm/i915/display/intel_hdmi.c @@ -2646,11 +2646,8 @@ bool intel_hdmi_handle_sink_scrambling(struct intel_encoder *encoder, bool scrambling) { struct drm_i915_private *dev_priv = to_i915(encoder->base.dev); - struct intel_hdmi *intel_hdmi = enc_to_intel_hdmi(encoder); struct drm_scrambling *sink_scrambling = &connector->display_info.hdmi.scdc.scrambling; - struct i2c_adapter *adapter = - intel_gmbus_get_adapter(dev_priv, intel_hdmi->ddc_bus); if (!sink_scrambling->supported) return true; @@ -2661,9 +2658,8 @@ bool intel_hdmi_handle_sink_scrambling(struct intel_encoder *encoder, str_yes_no(scrambling), high_tmds_clock_ratio ? 40 : 10); /* Set TMDS bit clock ratio to 1/40 or 1/10, and enable/disable scrambling */ - return drm_scdc_set_high_tmds_clock_ratio(adapter, - high_tmds_clock_ratio) && - drm_scdc_set_scrambling(adapter, scrambling); + return drm_scdc_set_high_tmds_clock_ratio(connector, high_tmds_clock_ratio) && + drm_scdc_set_scrambling(connector, scrambling); } static u8 chv_port_to_ddc_pin(struct drm_i915_private *dev_priv, enum port port) diff --git a/drivers/gpu/drm/lima/lima_drv.c b/drivers/gpu/drm/lima/lima_drv.c index 7b8d7178d09a..39cab4a55f57 100644 --- a/drivers/gpu/drm/lima/lima_drv.c +++ b/drivers/gpu/drm/lima/lima_drv.c @@ -392,8 +392,10 @@ static int lima_pdev_probe(struct platform_device *pdev) /* Allocate and initialize the DRM device. 
*/ ddev = drm_dev_alloc(&lima_drm_driver, &pdev->dev); - if (IS_ERR(ddev)) - return PTR_ERR(ddev); + if (IS_ERR(ddev)) { + err = PTR_ERR(ddev); + goto err_out0; + } ddev->dev_private = ldev; ldev->ddev = ddev; diff --git a/drivers/gpu/drm/panel/panel-edp.c b/drivers/gpu/drm/panel/panel-edp.c index 926906ca2304..e23ddab2126e 100644 --- a/drivers/gpu/drm/panel/panel-edp.c +++ b/drivers/gpu/drm/panel/panel-edp.c @@ -1879,6 +1879,7 @@ static const struct edp_panel_entry edp_panels[] = { EDP_PANEL_ENTRY('B', 'O', 'E', 0x07d1, &boe_nv133fhm_n61.delay, "NV133FHM-N61"), EDP_PANEL_ENTRY('B', 'O', 'E', 0x082d, &boe_nv133fhm_n61.delay, "NV133FHM-N62"), EDP_PANEL_ENTRY('B', 'O', 'E', 0x094b, &delay_200_500_e50, "NT116WHM-N21"), + EDP_PANEL_ENTRY('B', 'O', 'E', 0x095f, &delay_200_500_e50, "NE135FBM-N41 v8.1"), EDP_PANEL_ENTRY('B', 'O', 'E', 0x098d, &boe_nv110wtm_n61.delay, "NV110WTM-N61"), EDP_PANEL_ENTRY('B', 'O', 'E', 0x09dd, &delay_200_500_e50, "NT116WHM-N21"), EDP_PANEL_ENTRY('B', 'O', 'E', 0x0a5d, &delay_200_500_e50, "NV116WHM-N45"), diff --git a/drivers/gpu/drm/tegra/sor.c b/drivers/gpu/drm/tegra/sor.c index 8af632740673..34af6724914f 100644 --- a/drivers/gpu/drm/tegra/sor.c +++ b/drivers/gpu/drm/tegra/sor.c @@ -2140,10 +2140,8 @@ static void tegra_sor_hdmi_disable_scrambling(struct tegra_sor *sor) static void tegra_sor_hdmi_scdc_disable(struct tegra_sor *sor) { - struct i2c_adapter *ddc = sor->output.ddc; - - drm_scdc_set_high_tmds_clock_ratio(ddc, false); - drm_scdc_set_scrambling(ddc, false); + drm_scdc_set_high_tmds_clock_ratio(&sor->output.connector, false); + drm_scdc_set_scrambling(&sor->output.connector, false); tegra_sor_hdmi_disable_scrambling(sor); } @@ -2168,10 +2166,8 @@ static void tegra_sor_hdmi_enable_scrambling(struct tegra_sor *sor) static void tegra_sor_hdmi_scdc_enable(struct tegra_sor *sor) { - struct i2c_adapter *ddc = sor->output.ddc; - - drm_scdc_set_high_tmds_clock_ratio(ddc, true); - drm_scdc_set_scrambling(ddc, true); + drm_scdc_set_high_tmds_clock_ratio(&sor->output.connector, true); + drm_scdc_set_scrambling(&sor->output.connector, true); tegra_sor_hdmi_enable_scrambling(sor); } @@ -2179,9 +2175,8 @@ static void tegra_sor_hdmi_scdc_enable(struct tegra_sor *sor) static void tegra_sor_hdmi_scdc_work(struct work_struct *work) { struct tegra_sor *sor = container_of(work, struct tegra_sor, scdc.work); - struct i2c_adapter *ddc = sor->output.ddc; - if (!drm_scdc_get_scrambling_status(ddc)) { + if (!drm_scdc_get_scrambling_status(&sor->output.connector)) { DRM_DEBUG_KMS("SCDC not scrambled\n"); tegra_sor_hdmi_scdc_enable(sor); } diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c index ca7744b852f5..4bca6b54520a 100644 --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c @@ -218,14 +218,21 @@ vm_fault_t ttm_bo_vm_fault_reserved(struct vm_fault *vmf, prot = ttm_io_prot(bo, bo->resource, prot); if (!bo->resource->bus.is_iomem) { struct ttm_operation_ctx ctx = { - .interruptible = false, + .interruptible = true, .no_wait_gpu = false, .force_alloc = true }; ttm = bo->ttm; - if (ttm_tt_populate(bdev, bo->ttm, &ctx)) - return VM_FAULT_OOM; + err = ttm_tt_populate(bdev, bo->ttm, &ctx); + if (err) { + if (err == -EINTR || err == -ERESTARTSYS || + err == -EAGAIN) + return VM_FAULT_NOPAGE; + + pr_debug("TTM fault hit %pe.\n", ERR_PTR(err)); + return VM_FAULT_SIGBUS; + } } else { /* Iomem should not be marked encrypted */ prot = pgprot_decrypted(prot); diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c index 
aa116a7bbae3..18c342a919a2 100644 --- a/drivers/gpu/drm/ttm/ttm_pool.c +++ b/drivers/gpu/drm/ttm/ttm_pool.c @@ -47,6 +47,11 @@ #include "ttm_module.h" +#define TTM_MAX_ORDER (PMD_SHIFT - PAGE_SHIFT) +#define __TTM_DIM_ORDER (TTM_MAX_ORDER + 1) +/* Some architectures have a weird PMD_SHIFT */ +#define TTM_DIM_ORDER (__TTM_DIM_ORDER <= MAX_ORDER ? __TTM_DIM_ORDER : MAX_ORDER) + /** * struct ttm_pool_dma - Helper object for coherent DMA mappings * @@ -65,11 +70,11 @@ module_param(page_pool_size, ulong, 0644); static atomic_long_t allocated_pages; -static struct ttm_pool_type global_write_combined[MAX_ORDER]; -static struct ttm_pool_type global_uncached[MAX_ORDER]; +static struct ttm_pool_type global_write_combined[TTM_DIM_ORDER]; +static struct ttm_pool_type global_uncached[TTM_DIM_ORDER]; -static struct ttm_pool_type global_dma32_write_combined[MAX_ORDER]; -static struct ttm_pool_type global_dma32_uncached[MAX_ORDER]; +static struct ttm_pool_type global_dma32_write_combined[TTM_DIM_ORDER]; +static struct ttm_pool_type global_dma32_uncached[TTM_DIM_ORDER]; static spinlock_t shrinker_lock; static struct list_head shrinker_list; @@ -368,6 +373,43 @@ static int ttm_pool_page_allocated(struct ttm_pool *pool, unsigned int order, } /** + * ttm_pool_free_range() - Free a range of TTM pages + * @pool: The pool used for allocating. + * @tt: The struct ttm_tt holding the page pointers. + * @caching: The page caching mode used by the range. + * @start_page: index for first page to free. + * @end_page: index for last page to free + 1. + * + * During allocation the ttm_tt page-vector may be populated with ranges of + * pages with different attributes if allocation hit an error without being + * able to completely fulfill the allocation. This function can be used + * to free these individual ranges. + */ +static void ttm_pool_free_range(struct ttm_pool *pool, struct ttm_tt *tt, + enum ttm_caching caching, + pgoff_t start_page, pgoff_t end_page) +{ + struct page **pages = tt->pages; + unsigned int order; + pgoff_t i, nr; + + for (i = start_page; i < end_page; i += nr, pages += nr) { + struct ttm_pool_type *pt = NULL; + + order = ttm_pool_page_order(pool, *pages); + nr = (1UL << order); + if (tt->dma_address) + ttm_pool_unmap(pool, tt->dma_address[i], nr); + + pt = ttm_pool_select_type(pool, caching, order); + if (pt) + ttm_pool_type_give(pt, *pages); + else + ttm_pool_free_page(pool, caching, order, *pages); + } +} + +/** * ttm_pool_alloc - Fill a ttm_tt object * * @pool: ttm_pool to use @@ -382,12 +424,14 @@ static int ttm_pool_page_allocated(struct ttm_pool *pool, unsigned int order, int ttm_pool_alloc(struct ttm_pool *pool, struct ttm_tt *tt, struct ttm_operation_ctx *ctx) { - unsigned long num_pages = tt->num_pages; + pgoff_t num_pages = tt->num_pages; dma_addr_t *dma_addr = tt->dma_address; struct page **caching = tt->pages; struct page **pages = tt->pages; + enum ttm_caching page_caching; gfp_t gfp_flags = GFP_USER; - unsigned int i, order; + pgoff_t caching_divide; + unsigned int order; struct page *p; int r; @@ -405,11 +449,12 @@ int ttm_pool_alloc(struct ttm_pool *pool, struct ttm_tt *tt, else gfp_flags |= GFP_HIGHUSER; - for (order = min_t(unsigned int, MAX_ORDER - 1, __fls(num_pages)); + for (order = min_t(unsigned int, TTM_MAX_ORDER, __fls(num_pages)); num_pages; order = min_t(unsigned int, order, __fls(num_pages))) { struct ttm_pool_type *pt; + page_caching = tt->caching; pt = ttm_pool_select_type(pool, tt->caching, order); p = pt ? 
ttm_pool_type_take(pt) : NULL; if (p) { @@ -418,6 +463,7 @@ int ttm_pool_alloc(struct ttm_pool *pool, struct ttm_tt *tt, if (r) goto error_free_page; + caching = pages; do { r = ttm_pool_page_allocated(pool, order, p, &dma_addr, @@ -426,14 +472,15 @@ int ttm_pool_alloc(struct ttm_pool *pool, struct ttm_tt *tt, if (r) goto error_free_page; + caching = pages; if (num_pages < (1 << order)) break; p = ttm_pool_type_take(pt); } while (p); - caching = pages; } + page_caching = ttm_cached; while (num_pages >= (1 << order) && (p = ttm_pool_alloc_page(pool, gfp_flags, order))) { @@ -442,6 +489,7 @@ int ttm_pool_alloc(struct ttm_pool *pool, struct ttm_tt *tt, tt->caching); if (r) goto error_free_page; + caching = pages; } r = ttm_pool_page_allocated(pool, order, p, &dma_addr, &num_pages, &pages); @@ -468,15 +516,13 @@ int ttm_pool_alloc(struct ttm_pool *pool, struct ttm_tt *tt, return 0; error_free_page: - ttm_pool_free_page(pool, tt->caching, order, p); + ttm_pool_free_page(pool, page_caching, order, p); error_free_all: num_pages = tt->num_pages - num_pages; - for (i = 0; i < num_pages; ) { - order = ttm_pool_page_order(pool, tt->pages[i]); - ttm_pool_free_page(pool, tt->caching, order, tt->pages[i]); - i += 1 << order; - } + caching_divide = caching - tt->pages; + ttm_pool_free_range(pool, tt, tt->caching, 0, caching_divide); + ttm_pool_free_range(pool, tt, ttm_cached, caching_divide, num_pages); return r; } @@ -492,27 +538,7 @@ EXPORT_SYMBOL(ttm_pool_alloc); */ void ttm_pool_free(struct ttm_pool *pool, struct ttm_tt *tt) { - unsigned int i; - - for (i = 0; i < tt->num_pages; ) { - struct page *p = tt->pages[i]; - unsigned int order, num_pages; - struct ttm_pool_type *pt; - - order = ttm_pool_page_order(pool, p); - num_pages = 1ULL << order; - if (tt->dma_address) - ttm_pool_unmap(pool, tt->dma_address[i], num_pages); - - pt = ttm_pool_select_type(pool, tt->caching, order); - if (pt) - ttm_pool_type_give(pt, tt->pages[i]); - else - ttm_pool_free_page(pool, tt->caching, order, - tt->pages[i]); - - i += num_pages; - } + ttm_pool_free_range(pool, tt, tt->caching, 0, tt->num_pages); while (atomic_long_read(&allocated_pages) > page_pool_size) ttm_pool_shrink(); @@ -542,7 +568,7 @@ void ttm_pool_init(struct ttm_pool *pool, struct device *dev, if (use_dma_alloc) { for (i = 0; i < TTM_NUM_CACHING_TYPES; ++i) - for (j = 0; j < MAX_ORDER; ++j) + for (j = 0; j < TTM_DIM_ORDER; ++j) ttm_pool_type_init(&pool->caching[i].orders[j], pool, i, j); } @@ -562,7 +588,7 @@ void ttm_pool_fini(struct ttm_pool *pool) if (pool->use_dma_alloc) { for (i = 0; i < TTM_NUM_CACHING_TYPES; ++i) - for (j = 0; j < MAX_ORDER; ++j) + for (j = 0; j < TTM_DIM_ORDER; ++j) ttm_pool_type_fini(&pool->caching[i].orders[j]); } @@ -616,7 +642,7 @@ static void ttm_pool_debugfs_header(struct seq_file *m) unsigned int i; seq_puts(m, "\t "); - for (i = 0; i < MAX_ORDER; ++i) + for (i = 0; i < TTM_DIM_ORDER; ++i) seq_printf(m, " ---%2u---", i); seq_puts(m, "\n"); } @@ -627,7 +653,7 @@ static void ttm_pool_debugfs_orders(struct ttm_pool_type *pt, { unsigned int i; - for (i = 0; i < MAX_ORDER; ++i) + for (i = 0; i < TTM_DIM_ORDER; ++i) seq_printf(m, " %8u", ttm_pool_type_count(&pt[i])); seq_puts(m, "\n"); } @@ -730,13 +756,16 @@ int ttm_pool_mgr_init(unsigned long num_pages) { unsigned int i; + BUILD_BUG_ON(TTM_DIM_ORDER > MAX_ORDER); + BUILD_BUG_ON(TTM_DIM_ORDER < 1); + if (!page_pool_size) page_pool_size = num_pages; spin_lock_init(&shrinker_lock); INIT_LIST_HEAD(&shrinker_list); - for (i = 0; i < MAX_ORDER; ++i) { + for (i = 0; i < 
TTM_DIM_ORDER; ++i) { ttm_pool_type_init(&global_write_combined[i], NULL, ttm_write_combined, i); ttm_pool_type_init(&global_uncached[i], NULL, ttm_uncached, i); @@ -769,7 +798,7 @@ void ttm_pool_mgr_fini(void) { unsigned int i; - for (i = 0; i < MAX_ORDER; ++i) { + for (i = 0; i < TTM_DIM_ORDER; ++i) { ttm_pool_type_fini(&global_write_combined[i]); ttm_pool_type_fini(&global_uncached[i]); diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c index 464c3cc8e6fb..06713d8b82b5 100644 --- a/drivers/gpu/drm/vc4/vc4_hdmi.c +++ b/drivers/gpu/drm/vc4/vc4_hdmi.c @@ -885,7 +885,8 @@ static void vc4_hdmi_set_infoframes(struct drm_encoder *encoder) static void vc4_hdmi_enable_scrambling(struct drm_encoder *encoder) { struct vc4_hdmi *vc4_hdmi = encoder_to_vc4_hdmi(encoder); - struct drm_device *drm = vc4_hdmi->connector.dev; + struct drm_connector *connector = &vc4_hdmi->connector; + struct drm_device *drm = connector->dev; const struct drm_display_mode *mode = &vc4_hdmi->saved_adjusted_mode; unsigned long flags; int idx; @@ -903,8 +904,8 @@ static void vc4_hdmi_enable_scrambling(struct drm_encoder *encoder) if (!drm_dev_enter(drm, &idx)) return; - drm_scdc_set_high_tmds_clock_ratio(vc4_hdmi->ddc, true); - drm_scdc_set_scrambling(vc4_hdmi->ddc, true); + drm_scdc_set_high_tmds_clock_ratio(connector, true); + drm_scdc_set_scrambling(connector, true); spin_lock_irqsave(&vc4_hdmi->hw_lock, flags); HDMI_WRITE(HDMI_SCRAMBLER_CTL, HDMI_READ(HDMI_SCRAMBLER_CTL) | @@ -922,7 +923,8 @@ static void vc4_hdmi_enable_scrambling(struct drm_encoder *encoder) static void vc4_hdmi_disable_scrambling(struct drm_encoder *encoder) { struct vc4_hdmi *vc4_hdmi = encoder_to_vc4_hdmi(encoder); - struct drm_device *drm = vc4_hdmi->connector.dev; + struct drm_connector *connector = &vc4_hdmi->connector; + struct drm_device *drm = connector->dev; unsigned long flags; int idx; @@ -944,8 +946,8 @@ static void vc4_hdmi_disable_scrambling(struct drm_encoder *encoder) ~VC5_HDMI_SCRAMBLER_CTL_ENABLE); spin_unlock_irqrestore(&vc4_hdmi->hw_lock, flags); - drm_scdc_set_scrambling(vc4_hdmi->ddc, false); - drm_scdc_set_high_tmds_clock_ratio(vc4_hdmi->ddc, false); + drm_scdc_set_scrambling(connector, false); + drm_scdc_set_high_tmds_clock_ratio(connector, false); drm_dev_exit(idx); } @@ -955,12 +957,13 @@ static void vc4_hdmi_scrambling_wq(struct work_struct *work) struct vc4_hdmi *vc4_hdmi = container_of(to_delayed_work(work), struct vc4_hdmi, scrambling_work); + struct drm_connector *connector = &vc4_hdmi->connector; - if (drm_scdc_get_scrambling_status(vc4_hdmi->ddc)) + if (drm_scdc_get_scrambling_status(connector)) return; - drm_scdc_set_high_tmds_clock_ratio(vc4_hdmi->ddc, true); - drm_scdc_set_scrambling(vc4_hdmi->ddc, true); + drm_scdc_set_high_tmds_clock_ratio(connector, true); + drm_scdc_set_scrambling(connector, true); queue_delayed_work(system_wq, &vc4_hdmi->scrambling_work, msecs_to_jiffies(SCRAMBLING_POLLING_DELAY_MS)); diff --git a/drivers/staging/sm750fb/sm750.c b/drivers/staging/sm750fb/sm750.c index effc7fcc3703..22ace3168723 100644 --- a/drivers/staging/sm750fb/sm750.c +++ b/drivers/staging/sm750fb/sm750.c @@ -989,20 +989,6 @@ release_fb: return err; } -static int lynxfb_kick_out_firmware_fb(struct pci_dev *pdev) -{ - resource_size_t base = pci_resource_start(pdev, 0); - resource_size_t size = pci_resource_len(pdev, 0); - bool primary = false; - -#ifdef CONFIG_X86 - primary = pdev->resource[PCI_ROM_RESOURCE].flags & - IORESOURCE_ROM_SHADOW; -#endif - - return aperture_remove_conflicting_devices(base, 
size, primary, "sm750_fb1"); -} - static int lynxfb_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent) { @@ -1011,7 +997,7 @@ static int lynxfb_pci_probe(struct pci_dev *pdev, int fbidx; int err; - err = lynxfb_kick_out_firmware_fb(pdev); + err = aperture_remove_conflicting_pci_devices(pdev, "sm750_fb1"); if (err) return err; diff --git a/drivers/video/aperture.c b/drivers/video/aperture.c index 41e77de1ea82..b009468ffdff 100644 --- a/drivers/video/aperture.c +++ b/drivers/video/aperture.c @@ -20,7 +20,7 @@ * driver can be active at any given time. Many systems load a generic * graphics drivers, such as EFI-GOP or VESA, early during the boot process. * During later boot stages, they replace the generic driver with a dedicated, - * hardware-specific driver. To take over the device the dedicated driver + * hardware-specific driver. To take over the device, the dedicated driver * first has to remove the generic driver. Aperture functions manage * ownership of framebuffer memory and hand-over between drivers. * @@ -76,7 +76,7 @@ * generic EFI or VESA drivers, have to register themselves as owners of their * framebuffer apertures. Ownership of the framebuffer memory is achieved * by calling devm_aperture_acquire_for_platform_device(). If successful, the - * driveris the owner of the framebuffer range. The function fails if the + * driver is the owner of the framebuffer range. The function fails if the * framebuffer is already owned by another driver. See below for an example. * * .. code-block:: c @@ -126,7 +126,7 @@ * et al for the registered framebuffer range, the aperture helpers call * platform_device_unregister() and the generic driver unloads itself. The * generic driver also has to provide a remove function to make this work. - * Once hot unplugged fro mhardware, it may not access the device's + * Once hot unplugged from hardware, it may not access the device's * registers, framebuffer memory, ROM, etc afterwards. */ @@ -203,7 +203,7 @@ static void aperture_detach_platform_device(struct device *dev) /* * Remove the device from the device hierarchy. This is the right thing - * to do for firmware-based DRM drivers, such as EFI, VESA or VGA. After + * to do for firmware-based fb drivers, such as EFI, VESA or VGA. After * the new driver takes over the hardware, the firmware device's state * will be lost. 
* diff --git a/drivers/video/fbdev/aty/radeon_base.c b/drivers/video/fbdev/aty/radeon_base.c index 657064227de8..972c4bbedfa3 100644 --- a/drivers/video/fbdev/aty/radeon_base.c +++ b/drivers/video/fbdev/aty/radeon_base.c @@ -2238,14 +2238,6 @@ static const struct bin_attribute edid2_attr = { .read = radeon_show_edid2, }; -static int radeon_kick_out_firmware_fb(struct pci_dev *pdev) -{ - resource_size_t base = pci_resource_start(pdev, 0); - resource_size_t size = pci_resource_len(pdev, 0); - - return aperture_remove_conflicting_devices(base, size, false, KBUILD_MODNAME); -} - static int radeonfb_pci_register(struct pci_dev *pdev, const struct pci_device_id *ent) { @@ -2296,7 +2288,7 @@ static int radeonfb_pci_register(struct pci_dev *pdev, rinfo->fb_base_phys = pci_resource_start (pdev, 0); rinfo->mmio_base_phys = pci_resource_start (pdev, 2); - ret = radeon_kick_out_firmware_fb(pdev); + ret = aperture_remove_conflicting_pci_devices(pdev, KBUILD_MODNAME); if (ret) goto err_release_fb; diff --git a/include/drm/display/drm_scdc_helper.h b/include/drm/display/drm_scdc_helper.h index ded01fd948b4..34600476a1b9 100644 --- a/include/drm/display/drm_scdc_helper.h +++ b/include/drm/display/drm_scdc_helper.h @@ -28,6 +28,7 @@ #include <drm/display/drm_scdc.h> +struct drm_connector; struct i2c_adapter; ssize_t drm_scdc_read(struct i2c_adapter *adapter, u8 offset, void *buffer, @@ -71,9 +72,9 @@ static inline int drm_scdc_writeb(struct i2c_adapter *adapter, u8 offset, return drm_scdc_write(adapter, offset, &value, sizeof(value)); } -bool drm_scdc_get_scrambling_status(struct i2c_adapter *adapter); +bool drm_scdc_get_scrambling_status(struct drm_connector *connector); -bool drm_scdc_set_scrambling(struct i2c_adapter *adapter, bool enable); -bool drm_scdc_set_high_tmds_clock_ratio(struct i2c_adapter *adapter, bool set); +bool drm_scdc_set_scrambling(struct drm_connector *connector, bool enable); +bool drm_scdc_set_high_tmds_clock_ratio(struct drm_connector *connector, bool set); #endif diff --git a/include/drm/drm_gem_vram_helper.h b/include/drm/drm_gem_vram_helper.h index d3e8920c0b64..f4aab64411d8 100644 --- a/include/drm/drm_gem_vram_helper.h +++ b/include/drm/drm_gem_vram_helper.h @@ -160,7 +160,9 @@ void drm_gem_vram_simple_display_pipe_cleanup_fb( .debugfs_init = drm_vram_mm_debugfs_init, \ .dumb_create = drm_gem_vram_driver_dumb_create, \ .dumb_map_offset = drm_gem_ttm_dumb_map_offset, \ - .gem_prime_mmap = drm_gem_prime_mmap + .gem_prime_mmap = drm_gem_prime_mmap, \ + .prime_handle_to_fd = drm_gem_prime_handle_to_fd, \ + .prime_fd_to_handle = drm_gem_prime_fd_to_handle /* * VRAM memory manager diff --git a/include/uapi/drm/qaic_accel.h b/include/uapi/drm/qaic_accel.h new file mode 100644 index 000000000000..2d348744a853 --- /dev/null +++ b/include/uapi/drm/qaic_accel.h @@ -0,0 +1,397 @@ +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note + * + * Copyright (c) 2019-2020, The Linux Foundation. All rights reserved. + * Copyright (c) 2021-2023 Qualcomm Innovation Center, Inc. All rights reserved. 
+ */ + +#ifndef QAIC_ACCEL_H_ +#define QAIC_ACCEL_H_ + +#include "drm.h" + +#if defined(__cplusplus) +extern "C" { +#endif + +/* The length(4K) includes len and count fields of qaic_manage_msg */ +#define QAIC_MANAGE_MAX_MSG_LENGTH SZ_4K + +/* semaphore flags */ +#define QAIC_SEM_INSYNCFENCE 2 +#define QAIC_SEM_OUTSYNCFENCE 1 + +/* Semaphore commands */ +#define QAIC_SEM_NOP 0 +#define QAIC_SEM_INIT 1 +#define QAIC_SEM_INC 2 +#define QAIC_SEM_DEC 3 +#define QAIC_SEM_WAIT_EQUAL 4 +#define QAIC_SEM_WAIT_GT_EQ 5 /* Greater than or equal */ +#define QAIC_SEM_WAIT_GT_0 6 /* Greater than 0 */ + +#define QAIC_TRANS_UNDEFINED 0 +#define QAIC_TRANS_PASSTHROUGH_FROM_USR 1 +#define QAIC_TRANS_PASSTHROUGH_TO_USR 2 +#define QAIC_TRANS_PASSTHROUGH_FROM_DEV 3 +#define QAIC_TRANS_PASSTHROUGH_TO_DEV 4 +#define QAIC_TRANS_DMA_XFER_FROM_USR 5 +#define QAIC_TRANS_DMA_XFER_TO_DEV 6 +#define QAIC_TRANS_ACTIVATE_FROM_USR 7 +#define QAIC_TRANS_ACTIVATE_FROM_DEV 8 +#define QAIC_TRANS_ACTIVATE_TO_DEV 9 +#define QAIC_TRANS_DEACTIVATE_FROM_USR 10 +#define QAIC_TRANS_DEACTIVATE_FROM_DEV 11 +#define QAIC_TRANS_STATUS_FROM_USR 12 +#define QAIC_TRANS_STATUS_TO_USR 13 +#define QAIC_TRANS_STATUS_FROM_DEV 14 +#define QAIC_TRANS_STATUS_TO_DEV 15 +#define QAIC_TRANS_TERMINATE_FROM_DEV 16 +#define QAIC_TRANS_TERMINATE_TO_DEV 17 +#define QAIC_TRANS_DMA_XFER_CONT 18 +#define QAIC_TRANS_VALIDATE_PARTITION_FROM_DEV 19 +#define QAIC_TRANS_VALIDATE_PARTITION_TO_DEV 20 + +/** + * struct qaic_manage_trans_hdr - Header for a transaction in a manage message. + * @type: In. Identifies this transaction. See QAIC_TRANS_* defines. + * @len: In. Length of this transaction, including this header. + */ +struct qaic_manage_trans_hdr { + __u32 type; + __u32 len; +}; + +/** + * struct qaic_manage_trans_passthrough - Defines a passthrough transaction. + * @hdr: In. Header to identify this transaction. + * @data: In. Payload of this ransaction. Opaque to the driver. Userspace must + * encode in little endian and align/pad to 64-bit. + */ +struct qaic_manage_trans_passthrough { + struct qaic_manage_trans_hdr hdr; + __u8 data[]; +}; + +/** + * struct qaic_manage_trans_dma_xfer - Defines a DMA transfer transaction. + * @hdr: In. Header to identify this transaction. + * @tag: In. Identified this transfer in other transactions. Opaque to the + * driver. + * @pad: Structure padding. + * @addr: In. Address of the data to DMA to the device. + * @size: In. Length of the data to DMA to the device. + */ +struct qaic_manage_trans_dma_xfer { + struct qaic_manage_trans_hdr hdr; + __u32 tag; + __u32 pad; + __u64 addr; + __u64 size; +}; + +/** + * struct qaic_manage_trans_activate_to_dev - Defines an activate request. + * @hdr: In. Header to identify this transaction. + * @queue_size: In. Number of elements for DBC request and response queues. + * @eventfd: Unused. + * @options: In. Device specific options for this activate. + * @pad: Structure padding. Must be 0. + */ +struct qaic_manage_trans_activate_to_dev { + struct qaic_manage_trans_hdr hdr; + __u32 queue_size; + __u32 eventfd; + __u32 options; + __u32 pad; +}; + +/** + * struct qaic_manage_trans_activate_from_dev - Defines an activate response. + * @hdr: Out. Header to identify this transaction. + * @status: Out. Return code of the request from the device. + * @dbc_id: Out. Id of the assigned DBC for successful request. + * @options: Out. Device specific options for this activate. 
+ */ +struct qaic_manage_trans_activate_from_dev { + struct qaic_manage_trans_hdr hdr; + __u32 status; + __u32 dbc_id; + __u64 options; +}; + +/** + * struct qaic_manage_trans_deactivate - Defines a deactivate request. + * @hdr: In. Header to identify this transaction. + * @dbc_id: In. Id of assigned DBC. + * @pad: Structure padding. Must be 0. + */ +struct qaic_manage_trans_deactivate { + struct qaic_manage_trans_hdr hdr; + __u32 dbc_id; + __u32 pad; +}; + +/** + * struct qaic_manage_trans_status_to_dev - Defines a status request. + * @hdr: In. Header to identify this transaction. + */ +struct qaic_manage_trans_status_to_dev { + struct qaic_manage_trans_hdr hdr; +}; + +/** + * struct qaic_manage_trans_status_from_dev - Defines a status response. + * @hdr: Out. Header to identify this transaction. + * @major: Out. NNC protocol version major number. + * @minor: Out. NNC protocol version minor number. + * @status: Out. Return code from device. + * @status_flags: Out. Flags from device. Bit 0 indicates if CRCs are required. + */ +struct qaic_manage_trans_status_from_dev { + struct qaic_manage_trans_hdr hdr; + __u16 major; + __u16 minor; + __u32 status; + __u64 status_flags; +}; + +/** + * struct qaic_manage_msg - Defines a message to the device. + * @len: In. Length of all the transactions contained within this message. + * @count: In. Number of transactions in this message. + * @data: In. Address to an array where the transactions can be found. + */ +struct qaic_manage_msg { + __u32 len; + __u32 count; + __u64 data; +}; + +/** + * struct qaic_create_bo - Defines a request to create a buffer object. + * @size: In. Size of the buffer in bytes. + * @handle: Out. GEM handle for the BO. + * @pad: Structure padding. Must be 0. + */ +struct qaic_create_bo { + __u64 size; + __u32 handle; + __u32 pad; +}; + +/** + * struct qaic_mmap_bo - Defines a request to prepare a BO for mmap(). + * @handle: In. Handle of the GEM BO to prepare for mmap(). + * @pad: Structure padding. Must be 0. + * @offset: Out. Offset value to provide to mmap(). + */ +struct qaic_mmap_bo { + __u32 handle; + __u32 pad; + __u64 offset; +}; + +/** + * struct qaic_sem - Defines a semaphore command for a BO slice. + * @val: In. Only lower 12 bits are valid. + * @index: In. Only lower 5 bits are valid. + * @presync: In. 1 if presync operation, 0 if postsync. + * @cmd: In. One of QAIC_SEM_*. + * @flags: In. Bitfield. See QAIC_SEM_INSYNCFENCE and QAIC_SEM_OUTSYNCFENCE + * @pad: Structure padding. Must be 0. + */ +struct qaic_sem { + __u16 val; + __u8 index; + __u8 presync; + __u8 cmd; + __u8 flags; + __u16 pad; +}; + +/** + * struct qaic_attach_slice_entry - Defines a single BO slice. + * @size: In. Size of this slice in bytes. + * @sem0: In. Semaphore command 0. Must be 0 is not valid. + * @sem1: In. Semaphore command 1. Must be 0 is not valid. + * @sem2: In. Semaphore command 2. Must be 0 is not valid. + * @sem3: In. Semaphore command 3. Must be 0 is not valid. + * @dev_addr: In. Device address this slice pushes to or pulls from. + * @db_addr: In. Address of the doorbell to ring. + * @db_data: In. Data to write to the doorbell. + * @db_len: In. Size of the doorbell data in bits - 32, 16, or 8. 0 is for + * inactive doorbells. + * @offset: In. Start of this slice as an offset from the start of the BO. 
+/**
+ * struct qaic_sem - Defines a semaphore command for a BO slice.
+ * @val: In. Only lower 12 bits are valid.
+ * @index: In. Only lower 5 bits are valid.
+ * @presync: In. 1 if presync operation, 0 if postsync.
+ * @cmd: In. One of QAIC_SEM_*.
+ * @flags: In. Bitfield. See QAIC_SEM_INSYNCFENCE and QAIC_SEM_OUTSYNCFENCE.
+ * @pad: Structure padding. Must be 0.
+ */
+struct qaic_sem {
+	__u16 val;
+	__u8 index;
+	__u8 presync;
+	__u8 cmd;
+	__u8 flags;
+	__u16 pad;
+};
+
+/**
+ * struct qaic_attach_slice_entry - Defines a single BO slice.
+ * @size: In. Size of this slice in bytes.
+ * @sem0: In. Semaphore command 0. Must be 0 if not valid.
+ * @sem1: In. Semaphore command 1. Must be 0 if not valid.
+ * @sem2: In. Semaphore command 2. Must be 0 if not valid.
+ * @sem3: In. Semaphore command 3. Must be 0 if not valid.
+ * @dev_addr: In. Device address this slice pushes to or pulls from.
+ * @db_addr: In. Address of the doorbell to ring.
+ * @db_data: In. Data to write to the doorbell.
+ * @db_len: In. Size of the doorbell data in bits - 32, 16, or 8. 0 is for
+ *	    inactive doorbells.
+ * @offset: In. Start of this slice as an offset from the start of the BO.
+ */
+struct qaic_attach_slice_entry {
+	__u64 size;
+	struct qaic_sem sem0;
+	struct qaic_sem sem1;
+	struct qaic_sem sem2;
+	struct qaic_sem sem3;
+	__u64 dev_addr;
+	__u64 db_addr;
+	__u32 db_data;
+	__u32 db_len;
+	__u64 offset;
+};
+
+/**
+ * struct qaic_attach_slice_hdr - Defines metadata for a set of BO slices.
+ * @count: In. Number of slices for this BO.
+ * @dbc_id: In. Associate the sliced BO with this DBC.
+ * @handle: In. GEM handle of the BO to slice.
+ * @dir: In. Direction of data flow. 1 = DMA_TO_DEVICE, 2 = DMA_FROM_DEVICE
+ * @size: In. Total length of the BO.
+ *	  If the BO is imported (DMABUF/PRIME), this size must not exceed the
+ *	  size of the provided DMABUF.
+ *	  If the BO was allocated using DRM_IOCTL_QAIC_CREATE_BO, this size
+ *	  must be exactly the size provided during DRM_IOCTL_QAIC_CREATE_BO.
+ */
+struct qaic_attach_slice_hdr {
+	__u32 count;
+	__u32 dbc_id;
+	__u32 handle;
+	__u32 dir;
+	__u64 size;
+};
+
+/**
+ * struct qaic_attach_slice - Defines a set of BO slices.
+ * @hdr: In. Metadata of the set of slices.
+ * @data: In. Pointer to an array containing the slice definitions.
+ */
+struct qaic_attach_slice {
+	struct qaic_attach_slice_hdr hdr;
+	__u64 data;
+};
+
+/**
+ * struct qaic_execute_entry - Defines a BO to submit to the device.
+ * @handle: In. GEM handle of the BO to commit to the device.
+ * @dir: In. Direction of data. 1 = to device, 2 = from device.
+ */
+struct qaic_execute_entry {
+	__u32 handle;
+	__u32 dir;
+};
+
+/**
+ * struct qaic_partial_execute_entry - Defines a BO to resize and submit.
+ * @handle: In. GEM handle of the BO to commit to the device.
+ * @dir: In. Direction of data. 1 = to device, 2 = from device.
+ * @resize: In. New size of the BO. Must be <= the original BO size. 0 means
+ *	    no resize.
+ */
+struct qaic_partial_execute_entry {
+	__u32 handle;
+	__u32 dir;
+	__u64 resize;
+};
+
+/**
+ * struct qaic_execute_hdr - Defines metadata for BO submission.
+ * @count: In. Number of BOs to submit.
+ * @dbc_id: In. DBC to submit the BOs on.
+ */
+struct qaic_execute_hdr {
+	__u32 count;
+	__u32 dbc_id;
+};
+
+/**
+ * struct qaic_execute - Defines a list of BOs to submit to the device.
+ * @hdr: In. BO list metadata.
+ * @data: In. Pointer to an array of BOs to submit.
+ */
+struct qaic_execute {
+	struct qaic_execute_hdr hdr;
+	__u64 data;
+};
+
+/**
+ * struct qaic_wait - Defines a blocking wait for BO execution.
+ * @handle: In. GEM handle of the BO to wait on.
+ * @timeout: In. Maximum time in ms to wait for the BO.
+ * @dbc_id: In. DBC the BO is submitted to.
+ * @pad: Structure padding. Must be 0.
+ */
+struct qaic_wait {
+	__u32 handle;
+	__u32 timeout;
+	__u32 dbc_id;
+	__u32 pad;
+};
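
The slicing, execute and wait structures above are meant to be chained. The sketch below describes a whole BO as a single slice on one DBC, queues it toward the device, and blocks until it completes. It assumes a local copy of this header named qaic_accel.h; the dbc_id, direction, timeout, device address and doorbell values are placeholders, since in a real client they come from the activate response and the workload metadata.

    #include <stdint.h>
    #include <sys/ioctl.h>

    #include "qaic_accel.h" /* assumed local copy of this header */

    /* Attach a one-slice description to the BO, submit it once, then wait
     * for completion. Returns 0 on success, -1 on any ioctl failure. */
    static int run_bo_once(int fd, uint32_t handle, uint64_t size,
                           uint32_t dbc_id)
    {
            struct qaic_attach_slice_entry entry = {
                    .size = size,
                    .dev_addr = 0,  /* placeholder device address */
                    .db_len = 0,    /* 0 = inactive doorbell */
                    .offset = 0,    /* slice starts at the beginning of the BO */
            };
            struct qaic_attach_slice attach = {
                    .hdr = {
                            .count = 1,
                            .dbc_id = dbc_id,
                            .handle = handle,
                            .dir = 1,       /* 1 = DMA_TO_DEVICE */
                            .size = size,
                    },
                    .data = (uint64_t)(uintptr_t)&entry,
            };
            struct qaic_execute_entry exec_entry = { .handle = handle, .dir = 1 };
            struct qaic_execute exec = {
                    .hdr = { .count = 1, .dbc_id = dbc_id },
                    .data = (uint64_t)(uintptr_t)&exec_entry,
            };
            struct qaic_wait wait = {
                    .handle = handle,
                    .timeout = 5000,        /* arbitrary 5 second timeout */
                    .dbc_id = dbc_id,
            };

            if (ioctl(fd, DRM_IOCTL_QAIC_ATTACH_SLICE_BO, &attach))
                    return -1;
            if (ioctl(fd, DRM_IOCTL_QAIC_EXECUTE_BO, &exec))
                    return -1;
            return ioctl(fd, DRM_IOCTL_QAIC_WAIT_BO, &wait);
    }

Attaching slices is a one-time operation per BO; subsequent submissions of the same BO only need the execute and wait steps.
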
+/**
+ * struct qaic_perf_stats_hdr - Defines metadata for getting BO perf info.
+ * @count: In. Number of BOs requested.
+ * @pad: Structure padding. Must be 0.
+ * @dbc_id: In. DBC the BOs are associated with.
+ */
+struct qaic_perf_stats_hdr {
+	__u16 count;
+	__u16 pad;
+	__u32 dbc_id;
+};
+
+/**
+ * struct qaic_perf_stats - Defines a request for getting BO perf info.
+ * @hdr: In. Request metadata.
+ * @data: In. Pointer to array of stats structures that will receive the data.
+ */
+struct qaic_perf_stats {
+	struct qaic_perf_stats_hdr hdr;
+	__u64 data;
+};
+
+/**
+ * struct qaic_perf_stats_entry - Defines a BO perf info.
+ * @handle: In. GEM handle of the BO to get perf stats for.
+ * @queue_level_before: Out. Number of elements in the queue before this BO
+ *			was submitted.
+ * @num_queue_element: Out. Number of elements added to the queue to submit
+ *		       this BO.
+ * @submit_latency_us: Out. Time taken by the driver to submit this BO.
+ * @device_latency_us: Out. Time taken by the device to execute this BO.
+ * @pad: Structure padding. Must be 0.
+ */
+struct qaic_perf_stats_entry {
+	__u32 handle;
+	__u32 queue_level_before;
+	__u32 num_queue_element;
+	__u32 submit_latency_us;
+	__u32 device_latency_us;
+	__u32 pad;
+};
+
+#define DRM_QAIC_MANAGE			0x00
+#define DRM_QAIC_CREATE_BO		0x01
+#define DRM_QAIC_MMAP_BO		0x02
+#define DRM_QAIC_ATTACH_SLICE_BO	0x03
+#define DRM_QAIC_EXECUTE_BO		0x04
+#define DRM_QAIC_PARTIAL_EXECUTE_BO	0x05
+#define DRM_QAIC_WAIT_BO		0x06
+#define DRM_QAIC_PERF_STATS_BO		0x07
+
+#define DRM_IOCTL_QAIC_MANAGE			DRM_IOWR(DRM_COMMAND_BASE + DRM_QAIC_MANAGE, struct qaic_manage_msg)
+#define DRM_IOCTL_QAIC_CREATE_BO		DRM_IOWR(DRM_COMMAND_BASE + DRM_QAIC_CREATE_BO, struct qaic_create_bo)
+#define DRM_IOCTL_QAIC_MMAP_BO			DRM_IOWR(DRM_COMMAND_BASE + DRM_QAIC_MMAP_BO, struct qaic_mmap_bo)
+#define DRM_IOCTL_QAIC_ATTACH_SLICE_BO		DRM_IOW(DRM_COMMAND_BASE + DRM_QAIC_ATTACH_SLICE_BO, struct qaic_attach_slice)
+#define DRM_IOCTL_QAIC_EXECUTE_BO		DRM_IOW(DRM_COMMAND_BASE + DRM_QAIC_EXECUTE_BO, struct qaic_execute)
+#define DRM_IOCTL_QAIC_PARTIAL_EXECUTE_BO	DRM_IOW(DRM_COMMAND_BASE + DRM_QAIC_PARTIAL_EXECUTE_BO, struct qaic_execute)
+#define DRM_IOCTL_QAIC_WAIT_BO			DRM_IOW(DRM_COMMAND_BASE + DRM_QAIC_WAIT_BO, struct qaic_wait)
+#define DRM_IOCTL_QAIC_PERF_STATS_BO		DRM_IOWR(DRM_COMMAND_BASE + DRM_QAIC_PERF_STATS_BO, struct qaic_perf_stats)
+
+#if defined(__cplusplus)
+}
+#endif
+
+#endif /* QAIC_ACCEL_H_ */
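
As a usage sketch for the perf stats ioctl defined above, the following assumes a file descriptor already opened on the device's accel node (for example /dev/accel/accel0), a local copy of this header named qaic_accel.h, and a BO that has already been executed on the given DBC; the helper name and output format are illustrative.

    #include <stdint.h>
    #include <stdio.h>
    #include <sys/ioctl.h>

    #include "qaic_accel.h" /* assumed local copy of this header */

    /* Query per-BO timing for one previously executed BO on a given DBC and
     * print the driver- and device-side latencies. */
    static int print_bo_stats(int fd, uint32_t handle, uint32_t dbc_id)
    {
            struct qaic_perf_stats_entry entry = { .handle = handle };
            struct qaic_perf_stats req = {
                    .hdr = { .count = 1, .dbc_id = dbc_id },
                    .data = (uint64_t)(uintptr_t)&entry,
            };

            if (ioctl(fd, DRM_IOCTL_QAIC_PERF_STATS_BO, &req))
                    return -1;

            printf("submit %u us, device %u us, queue level %u\n",
                   entry.submit_latency_us, entry.device_latency_us,
                   entry.queue_level_before);
            return 0;
    }

Because DRM_IOCTL_QAIC_PERF_STATS_BO is declared DRM_IOWR, the entries pointed to by data are both read and written by the driver, which is how the Out fields above come back to userspace.
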