path: root/drivers/misc/habanalabs/include
Age | Commit message | Author | Files | Lines
2021-04-09 | habanalabs: update to latest F/W communication header | Ohad Sharabi | 2 | -1/+200
Update files to the latest version from the F/W team. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-04-09 | habanalabs: send dynamic msi-x indexes to f/w | Ohad Sharabi | 1 | -17/+58
In order to minimize hard coded values between F/W and the driver, we send msi-x indexes dynamically to the F/W. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-04-09 | habanalabs: support DEVICE_UNUSABLE error indication from FW | Koby Elbaz | 1 | -0/+4
In case of multiple ECC errors, FW will set the DEVICE_UNUSABLE bit. On boot-up, the driver will therefore fail inserting the device. Signed-off-by: Koby Elbaz <kelbaz@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-04-09 | habanalabs: support legacy and new pll indexes | Ohad Sharabi | 4 | -25/+47
To minimize the hard-coded values shared by the LKD and the F/W, this patch introduces a dynamic method for working with PLLs. The PLL numbering, formerly ASIC-specific, is now common to all ASICs. For backward compatibility, a bit is defined in the device status: if the bit is not set, the LKD keeps working with the old PLL numbering. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
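The backward-compatible PLL selection described in this commit message could look roughly like the sketch below. Every name here (the status bit, the enum, the legacy table and its values) is hypothetical and chosen only for illustration; the real driver identifiers and numbering are not shown in this log.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical status bit: set when the F/W accepts the common PLL numbering. */
#define EXAMPLE_STS_DYNAMIC_PLL (1u << 0)

/* Hypothetical common PLL numbering shared by all ASICs. */
enum example_pll { EXAMPLE_PLL_CPU, EXAMPLE_PLL_PCI, EXAMPLE_PLL_MAX };

/* Hypothetical legacy, ASIC-specific numbering for a single ASIC. */
static const int example_legacy_pll_map[EXAMPLE_PLL_MAX] = {
    [EXAMPLE_PLL_CPU] = 3,
    [EXAMPLE_PLL_PCI] = 7,
};

/* Return the index to send to the F/W, or -1 for an unknown PLL. */
static int example_pll_to_fw_index(uint32_t dev_status, int pll)
{
    if (pll < 0 || pll >= EXAMPLE_PLL_MAX)
        return -1;

    /* New F/W: the common index is passed through unchanged. */
    if (dev_status & EXAMPLE_STS_DYNAMIC_PLL)
        return pll;

    /* Old F/W: translate to the legacy ASIC-specific index. */
    return example_legacy_pll_map[pll];
}
```

Keeping the translation in one helper means callers always use the common numbering and never need to know which F/W generation is running.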
2021-04-09 | habanalabs/gaudi: Update async events header | Ofir Bitton | 2 | -13/+24
Update with latest version from the Firmware team. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-04-09 | habanalabs/gaudi: reset device upon BMC request | Ofir Bitton | 2 | -1/+2
In case the BMC of the devices' box wants to initiate a reset of a specific device, it must go through the driver. Once the driver receives the request, it initiates a hard reset flow. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-04-09 | habanalabs: update hl_boot_if.h | Ohad Sharabi | 1 | -0/+11
Update to the latest version of the file as supplied by the F/W. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-04-09 | habanalabs/gaudi: update extended async event header | Ofir Bitton | 1 | -5/+5
Update to the latest definition of the firmware. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-04-09 | habanalabs: return current power via INFO IOCTL | Sagiv Ozeri | 1 | -0/+5
Add driver implementation for reading the current power from the device CPU F/W. Signed-off-by: Sagiv Ozeri <sozeri@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-04-09 | habanalabs: reset device in case of sync error | Ohad Sharabi | 3 | -0/+11
As the F/W is the first to detect an out-of-sync event, a new event is added to notify the driver of it, in which case the driver performs a hard reset. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-04-09 | habanalabs: set max asid to 2 | farah kassabri | 2 | -2/+2
Currently we support only 2 ASIDs in all ASICs: ASID 0 for the driver and ASID 1 for the user. There is no need to set up 1024 ASID configurations at the init phase. Signed-off-by: farah kassabri <fkassabri@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-02-08 | habanalabs: improve communication protocol with cpucp | Ofir Bitton | 1 | -0/+5
The current messaging protocol with cpucp can get out of sync due to coherency issues. To improve the protocol's reliability, we modify it to expect a different acknowledgment for every packet sent to cpucp. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
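One way to picture "a different acknowledgment for every packet" is to derive the expected acknowledgment value from a running packet index, so that a stale acknowledgment left over from an earlier packet can never be mistaken for the current one. The sketch below is only an illustration of that idea; the base value, queue depth, and all names are assumptions and do not come from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_ACK_BASE  0x1000u /* hypothetical base acknowledgment value */
#define EXAMPLE_QUEUE_LEN 64u     /* hypothetical queue depth */

/* Hypothetical per-channel state: index of the next packet to send. */
struct example_channel {
    uint32_t next_index;
};

/* Compute the acknowledgment expected for the next packet and advance. */
static uint32_t example_expected_ack(struct example_channel *ch)
{
    assert(ch != NULL);

    uint32_t ack = EXAMPLE_ACK_BASE + (ch->next_index % EXAMPLE_QUEUE_LEN);

    ch->next_index++;
    return ack;
}

/* A packet is complete only when the value written back matches exactly. */
static bool example_ack_matches(uint32_t expected, uint32_t received)
{
    return expected == received;
}
```

With a single fixed acknowledgment value, a late write for packet N could be read as the completion of packet N+1; with distinct values, consecutive packets cannot be confused.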
2021-01-27 | habanalabs: update to latest hl_boot_if.h spec from F/W | Oded Gabbay | 1 | -1/+7
It adds the definition for indication that the F/W handles HBM ECC events. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27 | habanalabs: update SyncManager interrupt handling | Oded Gabbay | 1 | -2/+9
The firmware provides more information about SyncManager events. Adjust the code to the latest firmware interface file. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27 | habanalabs: fix ETR security issue | Ohad Sharabi | 2 | -2/+8
ETR should always be non-secured as it is used by the users to record profiling/trace data. This patch fixes the configuration to match those requirements. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27 | habanalabs/gaudi: print sync manager SEI interrupt info | Ofir Bitton | 2 | -0/+11
Driver must print sync manager SEI information upon receiving interrupt from FW. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27 | habanalabs/gaudi: remove PCI access to SM block | Ofir Bitton | 1 | -0/+3
Due to a HW limitation we must remove all direct access to SM registers. Instead, we access the SM registers using the HW QMANs. When possible and no user context is present, we can directly access the HW QMANs. Whenever there is an active user, the driver prepares a pending command buffer list which is sent upon user submissions. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27 | habanalabs: read device boot errors after cpucp is up | Ofir Bitton | 1 | -0/+3
The boot CPU can report errors in various boot stages. The current implementation does not take into consideration errors reported in late stages, hence we will check for errors at the latest stage, when fetching cpucp information. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27 | habanalabs: update to latest hl_boot_if.h | Oded Gabbay | 1 | -4/+4
Update to the latest version of this file that the F/W exports. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27 | habanalabs/gaudi: remove duplicated gaudi packets masks | Ofir Bitton | 1 | -24/+0
As all packets use the same CTL register masks, we remove duplicated masks and use common masks instead. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2021-01-27 | habanalabs: update firmware boot interface | Oded Gabbay | 1 | -0/+5
Update to latest firmware hl_boot_if.h file. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-12-28 | habanalabs: update comment in hl_boot_if.h | Oded Gabbay | 1 | -1/+1
Hard-reset flag is updated in many stages of the boot sequence of the firmware. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-12-28 | habanalabs/gaudi: disable CGM at HW initialization | Oded Gabbay | 1 | -0/+5
In case the clock gating was enabled in preboot we need to disable it at the H/W initialization stage before touching the MME/TPC registers. Otherwise, the ASIC can get stuck. If the security is enabled in the firmware level, the CGM is always disabled and the driver can't enable it. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-12-28 | habanalabs: preboot hard reset support | Ofir Bitton | 1 | -0/+2
The FW hard reset capability indication is now moved to the preboot stage. The driver checks whether the HW is dirty only after it has validated that preboot is up. If the HW is dirty, the driver performs a hard reset according to the FW capability. In addition, the FW defines a new message which the driver needs to send in order to initiate a hard reset. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 | habanalabs: add ull to PLL masks | Alon Mizrahi | 1 | -4/+4
These defines are 64-bit defines so they need ull suffix. Signed-off-by: Alon Mizrahi <amizrahi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
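The reason the suffix matters: an unsuffixed constant such as `0xFFFF` has type `int`, so shifting it left by 32 or more is evaluated in (typically 32-bit) `int` arithmetic, which is undefined behavior in C. The sketch below shows a correctly suffixed 64-bit mask; the field position and all names are hypothetical and are not the PLL masks touched by the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 64-bit register field occupying bits 32..47. */
#define EXAMPLE_FIELD_SHIFT 32
#define EXAMPLE_FIELD_MASK  (0xFFFFull << EXAMPLE_FIELD_SHIFT)

/* The ull suffix makes the shift happen in 64-bit arithmetic. */
static_assert(EXAMPLE_FIELD_MASK == 0x0000FFFF00000000ull,
              "mask must cover bits 32..47");

/* Extract the field from a 64-bit register value. */
static uint64_t example_extract(uint64_t reg)
{
    return (reg & EXAMPLE_FIELD_MASK) >> EXAMPLE_FIELD_SHIFT;
}
```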
2020-11-30 | habanalabs: update firmware files | Oded Gabbay | 3 | -2/+21
Update various firmware header files with new defines. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 | habanalabs: fetch pll frequency from firmware | Alon Mizrahi | 5 | -237/+49
Once firmware security is enabled, driver must fetch pll frequencies through the firmware message interface instead of reading the registers directly. Signed-off-by: Alon Mizrahi <amizrahi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 | habanalabs/gaudi: fetch HBM ecc info from FW | Ofir Bitton | 1 | -0/+32
Once FW security is enabled there is no access to the HBM ECC registers, so the driver needs to read the values from the FW using a dedicated interface. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 | habanalabs: fetch hard reset capability from FW | Ofir Bitton | 1 | -11/+19
Driver must fetch FW hard reset capability during boot time, in order to skip the hard reset flow if necessary. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 | habanalabs: fetch security indication from FW | Ofir Bitton | 4 | -0/+108
Add support for fetching security indication from FW. This indication is needed in order to skip unnecessary initializations done by FW. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 | habanalabs/gaudi: add NIC firmware-related definitions | Oded Gabbay | 2 | -4/+54
Add new structures and messages that the driver uses to interact with the firmware to receive information and events (errors) about GAUDI's NIC. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-30 | habanalabs/gaudi: add NIC QMAN H/W and registers definitions | Oded Gabbay | 13 | -1/+9168
Add auto-generated header files that describe the NIC QMANs registers used by the driver. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-11-04 | habanalabs/gaudi: mask WDT error in QMAN | Oded Gabbay | 1 | -1/+0
This interrupt cause is not relevant because of how the user uses the QMAN arbitration mechanism. We must mask it because it floods the log. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2020-09-22 | habanalabs: update scratchpad register map | Oded Gabbay | 2 | -0/+2
Our firmware uses some scratchpad registers in the device for different roles. Update the file to the latest version of the firmware code. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 | habanalabs: update firmware interface file | Oded Gabbay | 1 | -0/+25
Add a new packet to fetch PLL information from the firmware. This will be needed in the future, when the driver won't be able to access the PLL registers directly. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 | habanalabs: add num_hops to hl_mmu_properties | Moti Haimovski | 1 | -0/+2
This commit adds the number of HOPs supported by the device to the device MMU properties. Signed-off-by: Moti Haimovski <mhaimovski@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 | habanalabs: increase PQ COMP_OFFSET by one nibble | Oded Gabbay | 1 | -1/+1
For future ASICs, we increase this field by one nibble. This field was not used by the current ASICs so this change doesn't break anything. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 | habanalabs: Fix alignment issue in cpucp_info structure | Ofir Bitton | 1 | -0/+1
Because the device CPU compiler aligns structures to 8 bytes, struct cpucp_info has an alignment issue: some members of the structure are not aligned to 8 bytes. To avoid confusion, it is preferable to insert explicit placeholders inside the structure. To validate this scenario, we printed the addresses of two members: __u8 cpucp_version[VERSION_MAX_LEN]; (0xffff899c67ed4cbc) __le64 dram_size; (0xffff899c67ed4d40). The difference is 132 bytes although the first array is only 128 bytes long, meaning the compiler added 4 bytes of padding. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
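The 132-byte gap can be reproduced with a much smaller structure. The sketch below is not the real struct cpucp_info; the member names and the leading 4-byte member are assumptions chosen so that the 128-byte array starts at an offset that is 4 modulo 8, as the printed address 0x...4cbc suggests. It assumes an ABI in which a 64-bit integer is 8-byte aligned inside a struct (true on x86-64 and arm64).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_VERSION_MAX_LEN 128

/* Layout left to the compiler: padding is inserted silently. */
struct example_info_implicit {
    uint32_t flags;                            /* offset 0 */
    uint8_t  version[EXAMPLE_VERSION_MAX_LEN]; /* offset 4, ends at 132 */
    uint64_t dram_size;                        /* offset 136 if 8-byte aligned */
};

/* Same layout with the padding spelled out as a placeholder member. */
struct example_info_explicit {
    uint32_t flags;
    uint8_t  version[EXAMPLE_VERSION_MAX_LEN];
    uint32_t pad;                              /* explicit 4-byte placeholder */
    uint64_t dram_size;
};

/* With the placeholder, the offset no longer depends on the compiler. */
static_assert(offsetof(struct example_info_explicit, dram_size) == 136,
              "explicit layout places dram_size at offset 136");

/* Distance from the start of the array to dram_size, implicit layout. */
static size_t example_gap_implicit(void)
{
    return offsetof(struct example_info_implicit, dram_size) -
           offsetof(struct example_info_implicit, version);
}

/* Same distance for the explicit layout. */
static size_t example_gap_explicit(void)
{
    return offsetof(struct example_info_explicit, dram_size) -
           offsetof(struct example_info_explicit, version);
}
```

On an ABI that aligns 64-bit integers to only 4 bytes, the implicit layout would yield a gap of 128 instead, which is exactly the kind of host/device disagreement an explicit placeholder removes.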
2020-09-22 | habanalabs: replace armcp with the generic cpucp | Oded Gabbay | 1 | -137/+136
ArmCP mandates that the device CPU is always an ARM processor, which might be wrong in the future. Most of this change is an internal renaming of variables, functions and defines but there are two entries in sysfs which have armcp in their names. Add identical cpucp entries but don't remove yet the armcp entries. Those will be deprecated next year. Add the documentation about it in sysfs documentation. Signed-off-by: Moti Haimovski <mhaimovski@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 | habanalabs: update GAUDI hardware specs | Oded Gabbay | 1 | -0/+2
Add define for the 2 MME slave engines. Reviewed-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 | habanalabs: add support for getting device total energy | farah kassabri | 1 | -0/+1
Add driver implementation for reading the total energy consumption from the device ARM FW. Signed-off-by: farah kassabri <fkassabri@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 | habanalabs: Include linux/bitfield.h only in habanalabs.h | Tomer Tayar | 1 | -1/+0
Include linux/bitfield.h only in habanalabs.h, instead of in each and every file that needs it, as habanalabs.h is already included by all. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-22 | habanalabs: use FIELD_PREP() instead of << | Oded Gabbay | 1 | -152/+122
Use the standard FIELD_PREP() macro instead of the << operator to perform bitmask operations. This ensures type check safety and eliminates compiler warnings. Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
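To illustrate the pattern, here is a simplified userspace stand-in for the mask-driven style that FIELD_PREP() enables: the shift amount is derived from the mask, so a field is described by one constant instead of a separate mask and shift. The `EXAMPLE_*` macros are not the kernel macros and omit the kernel's compile-time checks; the control-word layout is hypothetical, and `__builtin_ctzll` assumes GCC or Clang.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins: the shift is the number of trailing zeros in the mask. */
#define EXAMPLE_FIELD_SHIFT(mask) (__builtin_ctzll(mask))
#define EXAMPLE_FIELD_PREP(mask, val) \
    (((uint64_t)(val) << EXAMPLE_FIELD_SHIFT(mask)) & (uint64_t)(mask))
#define EXAMPLE_FIELD_GET(mask, reg) \
    (((uint64_t)(reg) & (uint64_t)(mask)) >> EXAMPLE_FIELD_SHIFT(mask))

/* Hypothetical packet control word layout. */
#define EXAMPLE_CTL_OPCODE_MASK 0x1F000000ull /* bits 24..28 */
#define EXAMPLE_CTL_EB_MASK     0x20000000ull /* bit 29 */

/* Build a control word from its fields without any hand-written shifts. */
static uint64_t example_build_ctl(unsigned int opcode, unsigned int eb)
{
    /* Runtime check here; the kernel macro rejects bad constants at build time. */
    assert(opcode <= 0x1F && eb <= 1);

    return EXAMPLE_FIELD_PREP(EXAMPLE_CTL_OPCODE_MASK, opcode) |
           EXAMPLE_FIELD_PREP(EXAMPLE_CTL_EB_MASK, eb);
}
```

Compared with writing `(opcode << 24) | (eb << 29)` by hand, the mask is the single source of truth, so a shift and a mask can no longer drift apart.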
2020-09-22 | habanalabs: add information about PCIe controller | Ofir Bitton | 1 | -0/+10
Update the firmware header with a new API for getting PCIe info such as tx/rx throughput and the replay counter. These counters are needed by customers for monitoring and maintenance of multiple devices. Add new opcodes to the INFO ioctl to retrieve these counters. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-08-31 | habanalabs: fix report of RAZWI initiator coordinates | Ofir Bitton | 1 | -16/+16
All initiator coordinates received upon an 'MMU page fault RAZWI event' should be the router coordinates. The only exception is the DMA initiators, for which the reported coordinates correspond to their actual location. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 | habanalabs: update hl_boot_if.h from firmware | Oded Gabbay | 1 | -0/+14
Update the boot interface file from the latest version from firmware. Defines for secure boot were added. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
2020-07-24 | habanalabs: create common folder | Oded Gabbay | 3 | -0/+0
For internal needs of our CI we need to move all the common code into a common folder instead of putting it in the root folder of the driver. The same applies to the common header files under include/. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
2020-07-24 | habanalabs: halt device CPU only upon certain reset | Oded Gabbay | 2 | -0/+4
Currently the driver halts the device CPU in the halt engines function, which halts all the engines of the ASIC. The problem is that if later on we stop the reset process (due to inability to clean memory mappings in time), the CPU will remain in halt mode. This creates many issues, such as thermal/power control and FLR handling. Therefore, move the halting of the device CPU to the very end of the reset process, just before writing to the registers to initiate the reset. In addition, the driver now needs to send a message to the device F/W to disable it from sending interrupts to the host machine because during halt engines function the driver disables the MSI/MSI-X interrupts. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai>
2020-07-24 | habanalabs: Extract ECC information from FW | Oded Gabbay | 2 | -12/+19
ECC (Error Correcting Code) interrupts are going to be handled by the FW. Hence, we define an interface in which the driver can obtain the relevant ECC information. This information is needed for monitoring and can also lead to a hard reset if ECC error is not correctable. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 | habanalabs: calculate trace frequency from PLL | Adam Aharon | 2 | -0/+115
The profiler needs to know the PLL values for correctly showing the profiling data. Because our firmware can use different PLL configurations, we need to read the PLL values from the ASIC to pass them to the profiler. Signed-off-by: Adam Aharon <aaharon@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>