Age | Commit message (Collapse) | Author | Files | Lines |
|
Remove pointers to ASIC-specific functions and instead call the functions
explicitly as they are not accessed from outside the ASIC-specific files.
Signed-off-by: Tomer Tayar <[email protected]>
Signed-off-by: Oded Gabbay <[email protected]>
|
|
Move duplicated PCI-related code from ASIC-specific files into the common
pci.c file.
Signed-off-by: Tomer Tayar <[email protected]>
Signed-off-by: Oded Gabbay <[email protected]>
|
|
At the start of some IOCTLs we check if the device is disabled or in reset.
If it is, we return -EBUSY and print a message to kernel log.
Because these IOCTLs can be called at very high frequency, use ratelimit
to avoid spamming the kernel log. Also use the same type of message -
dev_warn - in all the relevant IOCTLs.
Signed-off-by: Oded Gabbay <[email protected]>
|
|
This patch removes some old defines which are not in use anymore.
Signed-off-by: Oded Gabbay <[email protected]>
|
|
The Event Queue MSI/X ID is different per ASIC. This patch renames the
current define to have the GOYA_ prefix to mark it only for Goya. It also
moves it from the common armcp_if.h file to the ASIC specific goya_fw_if.h
file.
Signed-off-by: Oded Gabbay <[email protected]>
|
|
This patch moves the code that is responsible of the communication
vs. the F/W to a dedicated file. This will allow us to share the code
between different ASICs.
Signed-off-by: Tomer Tayar <[email protected]>
Signed-off-by: Oded Gabbay <[email protected]>
|
|
This will prevent unneeded include of header files, which may increase
compilation time.
Signed-off-by: Dotan Barak <[email protected]>
Signed-off-by: Oded Gabbay <[email protected]>
|
|
The goya_non_fatal_events array actually contains all the possible events
the driver can receive from the F/W. Therefore, use a proper name
for the array.
The patch also adds missing event Ids to the goya_async_event_id enum.
Signed-off-by: Oded Gabbay <[email protected]>
|
|
This patch adds a definition of a new status in the device CPU boot stages
and add the handling of the new status.
Signed-off-by: Igor Grinberg <[email protected]>
Signed-off-by: Oded Gabbay <[email protected]>
|
|
We want the char-misc fixes in here as well.
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
In preparation to enabling -Wimplicit-fallthrough, mark switch
cases where we are expecting to fall through.
This patch fixes the following warning:
drivers/misc/sgi-xp/xpc_uv.c: In function ‘xpc_handle_activate_mq_msg_uv’:
drivers/misc/sgi-xp/xpc_uv.c:573:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
xpc_wakeup_channel_mgr(part);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/misc/sgi-xp/xpc_uv.c:575:2: note: here
case XPC_ACTIVATE_MQ_MSG_MARK_ENGAGED_UV:
^~~~
Warning level 3 was used: -Wimplicit-fallthrough=3
This patch is part of the ongoing efforts to enable
-Wimplicit-fallthrough.
Signed-off-by: Gustavo A. R. Silva <[email protected]>
Acked-by: Robin Holt<[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
In some cases where Neural Processing is required the size of init process
exceeds default size of 2MB, increase this size to 64MB which is required
for QCS404 CDSP Neural Processing.
Signed-off-by: Srinivas Kandagatla <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Remote page size should be calculated based on address and size, fix this!
Without this we will endup with one page less in cases where the buffer
is across 3 pages.
Fixes: c68cfb718c8f ("misc: fastrpc: Add support for context Invoke method")
Reported-by: Krishnaiah Tadakamalla <[email protected]>
Signed-off-by: Srinivas Kandagatla <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Argument buffers that are passed could be derived from a big buffer,
and some of the arguments buffers could overlap each other.
Take care of such instanaces.
This is optimization that DSP expects while sending buffers
which overlap. So make the DSP happy doing it.
Without which DSP seems to crash.
Fixes: c68cfb718c8f ("misc: fastrpc: Add support for context Invoke method")
Signed-off-by: Srinivas Kandagatla <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
While passing address phy address to DSP, take care of the offset
calculated from virtual address vma.
Fixes: c68cfb718c8f ("misc: fastrpc: Add support for context Invoke method")
Signed-off-by: Srinivas Kandagatla <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
context spin lock can be interrupted from callback path so use correct spinlock
so that we do not hit spinlock recursion.
Fixes: c68cfb718c8f ("misc: fastrpc: Add support for context Invoke method")
Signed-off-by: Srinivas Kandagatla <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
dma_alloc_coherent buffers could have writes queued in store buffers so
commit them before sending buffer to DSP using correct dma barriers.
Same with vice-versa.
Fixes: c68cfb718c8f ("misc: fastrpc: Add support for context Invoke method")
Signed-off-by: Srinivas Kandagatla <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
This patch fixes the error exit path of fastrpc_init_create_process().
If the DMA allocation or the DSP invoke fails the fastrpc_map was freed
but not removed from the mapping list leading to a double free once the
mapping list is emptied in fastrpc_device_release().
[srinivas kandagatla]: Cleaned up error path labels and reset init mem
to NULL after free
Fixes: d73f71c7c6ee("misc: fastrpc: Add support for create remote init process")
Signed-off-by: Thierry Escande <[email protected]>
Signed-off-by: Srinivas Kandagatla <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
When the remote DSP invocation is interrupted by the user, the
associated DMA buffer can be freed in interrupt context causing a kernel
BUG.
This patch adds a worker thread associated to the fastrpc context. It
is scheduled in the rpmsg callback to decrease its refcount out of the
interrupt context.
Fixes: c68cfb718c8f ("misc: fastrpc: Add support for context Invoke method")
Signed-off-by: Thierry Escande <[email protected]>
Signed-off-by: Srinivas Kandagatla <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
These lines weren't indented far enough.
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Use unified version of the copyright notice in the files
Update copyright years according the year the files
were touched, except this patch and SPDX conversions.
Signed-off-by: Tomas Winkler <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
1. Remove redundant parentheses around single license
2. Fix the license to GPL-2.0 and not GPL-2.0+ in mei_hdcp.h
Cc: Ramalingam C <[email protected]>
Signed-off-by: Tomas Winkler <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Replace boiler plate licenses texts with the SPDX license
identifiers in the mei files header.
Signed-off-by: Tomas Winkler <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Add SPDX tag with GPLv2 license to mei Kconfig.
Signed-off-by: Tomas Winkler <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
The driver uses the DMA_BUF module which is built only if
DMA_SHARED_BUFFER is selected. DMA_SHARED_BUFFER doesn't have any
dependencies so it is ok to select it (as done by many other components).
Reported-by: kbuild test robot <[email protected]>
Signed-off-by: Oded Gabbay <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
send_cpu_message() doesn't update the result parameter when an error
occurs in its code. Therefore, callers of send_cpu_message() shouldn't use
the result value when the return code indicates error.
This patch fixes a static checker warning in goya_test_cpu_queue(), where
that function did print the result even though the return code from
send_cpu_message() indicated error.
Signed-off-by: Oded Gabbay <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
A spin lock is taken here so we should use GFP_ATOMIC.
Fixes: 0feaf86d4e69 ("habanalabs: add virtual memory and MMU modules")
Signed-off-by: Wei Yongjun <[email protected]>
Reviewed-by: Oded Gabbay <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Merge hch's big DMA rework series. This is in a topic branch in case he
wants to merge it to minimise conflicts.
|
|
Add the layerscape EP device support in pci_endpoint_test driver.
Signed-off-by: Xiaowei Bao <[email protected]>
Signed-off-by: Lorenzo Pieralisi <[email protected]>
Reviewed-by: Minghuan Lian <[email protected]>
Reviewed-by: Zhiqiang Hou <[email protected]>
Reviewed-by: Greg KH <[email protected]>
|
|
The list of supported functions can be altered upon link reset,
clean the flags to allow correct selections of supported
features.
Cc: <[email protected]> v4.19+
Signed-off-by: Alexander Usyskin <[email protected]>
Signed-off-by: Tomas Winkler <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
In case of error, the function dma_buf_get() returns ERR_PTR() and never
returns NULL. The NULL test in the return value check should be replaced
with IS_ERR().
Signed-off-by: Wei Yongjun <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
This change fixes fastrpc_device_open() when no session is available and
return an error in such case.
Signed-off-by: Thierry Escande <[email protected]>
Signed-off-by: Srinivas Kandagatla <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Fastrpc is a dma buf exporter as well, so select the corresponding
DMA_SHARED_BUFFER config to fix below compilation errors on platforms
without this config.
ld: drivers/misc/fastrpc.o: in function 'fastrpc_free_map':
fastrpc.c:(.text+0xbe): undefined reference to 'dma_buf_unmap_attachment'
ld: fastrpc.c:(.text+0xcb): undefined reference to 'dma_buf_detach'
ld: fastrpc.c:(.text+0xd4): undefined reference to 'dma_buf_put'
ld: drivers/misc/fastrpc.o: in function 'fastrpc_map_create':
fastrpc.c:(.text+0xb2b): undefined reference to 'dma_buf_get'
ld: fastrpc.c:(.text+0xb47): undefined reference to 'dma_buf_attach'
ld: fastrpc.c:(.text+0xb61): undefined reference to 'dma_buf_map_attachment'
ld: fastrpc.c:(.text+0xc36): undefined reference to 'dma_buf_put'
ld: fastrpc.c:(.text+0xc48): undefined reference to 'dma_buf_detach'
ld: drivers/misc/fastrpc.o: in function 'fastrpc_device_ioctl':
fastrpc.c:(.text+0x1756): undefined reference to 'dma_buf_get'
ld: fastrpc.c:(.text+0x1776): undefined reference to 'dma_buf_put'
ld: fastrpc.c:(.text+0x1780): undefined reference to 'dma_buf_put'
ld: fastrpc.c:(.text+0x1abf): undefined reference to 'dma_buf_export'
ld: fastrpc.c:(.text+0x1ae7): undefined reference to 'dma_buf_fd'
ld: fastrpc.c:(.text+0x1cb5): undefined reference to 'dma_buf_put'
ld: fastrpc.c:(.text+0x1cca): undefined reference to 'dma_buf_put'
Reported-by: Randy Dunlap <[email protected]>
Signed-off-by: Srinivas Kandagatla <[email protected]>
Acked-by: Randy Dunlap <[email protected]> # build-tested
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
There is no good reason for this helper, just opencode it.
Signed-off-by: Christoph Hellwig <[email protected]>
Tested-by: Christian Zigotzky <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Now that we've switched all the powerpc nommu and swiotlb methods to
use the generic dma_direct_* calls we can remove these ops vectors
entirely and rely on the common direct mapping bypass that avoids
indirect function calls entirely. This also allows to remove a whole
lot of boilerplate code related to setting up these operations.
Signed-off-by: Christoph Hellwig <[email protected]>
Tested-by: Christian Zigotzky <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
This patch adds debugfs support to the driver. It allows the user-space to
display information that is contained in the internal structures of the
driver, such as:
- active command submissions
- active user virtual memory mappings
- number of allocated command buffers
It also enables the user to perform reads and writes through Goya's PCI
bars.
Reviewed-by: Mike Rapoport <[email protected]>
Signed-off-by: Oded Gabbay <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
This patch implements the INFO IOCTL. That IOCTL is used by the user to
query information that is relevant/needed by the user in order to submit
deep learning jobs to Goya.
The information is divided into several categories, such as H/W IP, Events
that happened, DDR usage and more.
Reviewed-by: Mike Rapoport <[email protected]>
Signed-off-by: Oded Gabbay <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
This patch adds the Virtual Memory and MMU modules.
Goya has an internal MMU which provides process isolation on the internal
DDR. The internal MMU also performs translations for transactions that go
from Goya to the Host.
The driver is responsible for allocating and freeing memory on the DDR
upon user request. It also provides an interface to map and unmap DDR and
Host memory to the device address space.
The MMU in Goya supports 3-level and 4-level page tables. With 3-level, the
size of each page is 2MB, while with 4-level the size of each page is 4KB.
In the DDR, the physical pages are always 2MB.
Reviewed-by: Mike Rapoport <[email protected]>
Signed-off-by: Omer Shpigelman <[email protected]>
Signed-off-by: Oded Gabbay <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
This patch adds the main flow for the user to submit work to the device.
Each work is described by a command submission object (CS). The CS contains
3 arrays of command buffers: One for execution, and two for context-switch
(store and restore).
For each CB, the user specifies on which queue to put that CB. In case of
an internal queue, the entry doesn't contain a pointer to the CB but the
address in the on-chip memory that the CB resides at.
The driver parses some of the CBs to enforce security restrictions.
The user receives a sequence number that represents the CS object. The user
can then query the driver regarding the status of the CS, using that
sequence number.
In case the CS doesn't finish before the timeout expires, the driver will
perform a soft-reset of the device.
Reviewed-by: Mike Rapoport <[email protected]>
Signed-off-by: Oded Gabbay <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
This patch adds support for doing various on-the-fly reset of Goya.
The driver supports two types of resets:
1. soft-reset
2. hard-reset
Soft-reset is done when the device detects a timeout of a command
submission that was given to the device. The soft-reset process only resets
the engines that are relevant for the submission of compute jobs, i.e. the
DMA channels, the TPCs and the MME. The purpose is to bring the device as
fast as possible to a working state.
Hard-reset is done in several cases:
1. After soft-reset is done but the device is not responding
2. When fatal errors occur inside the device, e.g. ECC error
3. When the driver is removed
Hard-reset performs a reset of the entire chip except for the PCI
controller and the PLLs. It is a much longer process then soft-reset but it
helps to recover the device without the need to reboot the Host.
After hard-reset, the driver will restore the max power attribute and in
case of manual power management, the frequencies that were set.
This patch also adds two entries to the sysfs, which allows the root user
to initiate a soft or hard reset.
Reviewed-by: Mike Rapoport <[email protected]>
Signed-off-by: Oded Gabbay <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
This patch add the sysfs and hwmon entries that are exposed by the driver.
Goya has several sensors, from various categories such as temperature,
voltage, current, etc. The driver exposes those sensors in the standard
hwmon mechanism.
In addition, the driver exposes a couple of interfaces in sysfs, both for
configuration and for providing status of the device or driver.
The configuration attributes is for Power Management:
- Automatic or manual
- Frequency value when moving to high frequency mode
- Maximum power the device is allowed to consume
The rest of the attributes are read-only and provide the following
information:
- Versions of the various firmwares running on the device
- Contents of the device's EEPROM
- The device type (currently only Goya is supported)
- PCI address of the device (to allow user-space to connect between
/dev/hlX to PCI address)
- Status of the device (operational, malfunction, in_reset)
- How many processes are open on the device's file
Reviewed-by: Mike Rapoport <[email protected]>
Signed-off-by: Oded Gabbay <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
This patch adds support for receiving events from Goya's control CPU and
for receiving MSI-X interrupts from Goya's DMA engines and CPU.
Goya's PCI controller supports up to 8 MSI-X interrupts, which only 6 of
them are currently used. The first 5 interrupts are dedicated for Goya's
DMA engine queues. The 6th interrupt is dedicated for Goya's control CPU.
The DMA queue will signal its MSI-X entry upon each completion of a command
buffer that was placed on its primary queue. The driver will then mark that
CB as completed and free the related resources. It will also update the
command submission object which that CB belongs to.
There is a dedicated event queue (EQ) between the driver and Goya's control
CPU. The EQ is located on the Host memory. The control CPU writes a new
entry to the EQ for various reasons, such as ECC error, MMU page fault, Hot
temperature. After writing the new entry to the EQ, the control CPU will
trigger its dedicated MSI-X entry to signal the driver that there is a new
entry in the EQ. The driver will then read the entry and act accordingly.
Reviewed-by: Mike Rapoport <[email protected]>
Signed-off-by: Oded Gabbay <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
This patch adds the H/W queues module and the code to initialize Goya's
various compute and DMA engines and their queues.
Goya has 5 DMA channels, 8 TPC engines and a single MME engine. For each
channel/engine, there is a H/W queue logic which is used to pass commands
from the user to the H/W. That logic is called QMAN.
There are two types of QMANs: external and internal. The DMA QMANs are
considered external while the TPC and MME QMANs are considered internal.
For each external queue there is a completion queue, which is located on
the Host memory.
The differences between external and internal QMANs are:
1. The location of the queue's memory. External QMANs are located on the
Host memory while internal QMANs are located on the on-chip memory.
2. The external QMAN write an entry to a completion queue and sends an
MSI-X interrupt upon completion of a command buffer that was given to
it. The internal QMAN doesn't do that.
Reviewed-by: Mike Rapoport <[email protected]>
Signed-off-by: Oded Gabbay <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
This patch adds the basic part of Goya's H/W initialization. It adds code
that initializes Goya's internal CPU, various registers that are related to
internal routing, scrambling, workarounds for H/W bugs, etc.
It also initializes Goya's security scheme that prevents the user from
abusing Goya to steal data from the host, crash the host, change
Goya's F/W, etc.
Reviewed-by: Mike Rapoport <[email protected]>
Signed-off-by: Oded Gabbay <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
This patch adds the command buffer (CB) module, which allows the user to
create and destroy CBs and to map them to the user's process
address-space.
A command buffer is a memory blocks that reside in DMA-able address-space
and is physically contiguous so it can be accessed by the device without
MMU translation. The command buffer memory is allocated using the
coherent DMA API.
When creating a new CB, the IOCTL returns a handle of it, and the
user-space process needs to use that handle to mmap the buffer to get a VA
in the user's address-space.
Before destroying (freeing) a CB, the user must unmap the CB's VA using the
CB handle.
Each CB has a reference counter, which tracks its usage in command
submissions and also its mmaps (only a single mmap is allowed).
The driver maintains a pool of pre-allocated CBs in order to reduce
latency during command submissions. In case the pool is empty, the driver
will go to the slow-path of allocating a new CB, i.e. calling
dma_alloc_coherent.
Reviewed-by: Mike Rapoport <[email protected]>
Signed-off-by: Oded Gabbay <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
This patch adds two modules - ASID and context.
Each user process that opens a device's file must have at least one
context before it is able to "work" with the device. Each context has its
own device address-space and contains information about its runtime state
(its active command submissions).
To have address-space separation between contexts, each context is assigned
a unique ASID, which stands for "address-space id". Goya supports up to
1024 ASIDs.
Currently, the driver doesn't support multiple contexts. Therefore, the
user doesn't need to actively create a context. A "primary context" is
created automatically when the user opens the device's file.
Reviewed-by: Mike Rapoport <[email protected]>
Signed-off-by: Oded Gabbay <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
This patch adds a basic support for the Goya device. The code initializes
the device's PCI controller and PCI bars. It also initializes various S/W
structures and adds some basic helper functions.
Reviewed-by: Mike Rapoport <[email protected]>
Signed-off-by: Oded Gabbay <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
This patch just adds a lot of header files that contain description of
Goya's registers.
Signed-off-by: Oded Gabbay <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
This patch adds the habanalabs skeleton driver. The driver does nothing at
this stage except very basic operations. It contains the minimal code to
insmod and rmmod the driver and to create a /dev/hlX file per PCI device.
Reviewed-by: Mike Rapoport <[email protected]>
Signed-off-by: Oded Gabbay <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
If the device node defines 'num-addresses', let it override the default
behavior.
Signed-off-by: Bartosz Golaszewski <[email protected]>
|