aboutsummaryrefslogtreecommitdiff
path: root/drivers/misc
AgeCommit message (Collapse)AuthorFilesLines
2021-06-18habanalabs: add debug flag to prevent failure on timeoutYuri Nudelman2-5/+25
Sometimes it is useful to allow the command to continue running despite the timeout occurred, to differentiate between really stuck or just very time consuming commands. This can be achieved by passing a new debug flag alongside the cs, HL_CS_FLAGS_SKIP_RESET_ON_TIMEOUT. Anyway, if the timeout occurred, a warning print shall be issued, however this shall not fail the submission. Signed-off-by: Yuri Nudelman <[email protected]> Reviewed-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs/gaudi: add FW alive event supportOfir Bitton4-1/+32
In order for driver to be aware of process or thread crashes inside GAUDI's CPU, we introduce a new event which contains all relevant information. Upon event reception, driver will dump information and will reset the device. Signed-off-by: Ofir Bitton <[email protected]> Reviewed-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs/gaudi: don't use disabled ports in collective waitOfir Bitton1-148/+71
In the collective wait, we put jobs on the QMANs of all the NICs. The code takes into account if a port is disabled only in case of PCI card. When this info arrives from the f/w, the code doesn't take it into account, and it tries to schedule jobs on NICs that aren't enabled and thats a bug. To fix this, after the f/w sends us the list of disabled ports, we update the state of the QMANs according to that list. In addition, we need to update the HW_CAP bits so the collective wait operation will not try to use those QMANs. We also need to update the collective master monitor mask. Moreover, we need to add a protection for such future cases and in case the user will try to submit work to those QMANs. Signed-off-by: Ofir Bitton <[email protected]> Reviewed-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs/gaudi: update to latest f/w specsOded Gabbay2-10/+33
Update the firmware interface files to their latest version. Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs/gaudi: split host irq interfaces towards FWOfir Bitton4-8/+44
Current implementation uses a single interrupt interface towards FW, this interface is causing races between interrupt types. We split this interface to interface per interrupt type. Signed-off-by: Ofir Bitton <[email protected]> Reviewed-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs: prefer ASYNC device probingOded Gabbay1-1/+5
There is no dependency when probing multiple devices so indicate to the kernel that it can probe our devices in ASYNC fashion. This shortens insmod of the driver from ~2 minutes to 20 seconds on a system with 8 devices. Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs/gaudi: add ARB to QM stop on error masksTomer Tayar2-15/+17
Update the QM stop on error masks to also stop on ARB errors. Signed-off-by: Tomer Tayar <[email protected]> Reviewed-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs/gaudi: don't use nic_ports_mask in computeOded Gabbay1-12/+12
nic_ports_mask is used by the networking part of the driver. In the compute part, we use the HW_CAP bits to select what is active and what is not. Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs/gaudi: set the correct cpu_id on MME2_QM failureKoby Elbaz1-1/+1
This fix was applied since there was an incorrect reported CPU ID to GIC such that an error in MME2 QMAN aliased to be an arriving from DMA0_QM. Signed-off-by: Koby Elbaz <[email protected]> Reviewed-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs/gaudi: refactor reset codeOded Gabbay2-18/+34
After all the latest changes to the reset code, there were some redundancy and errors in the flows. If the Linux FIT is loaded to the ASIC CPU, we need to communicate with it only via GIC. If it is not loaded, we need to either use COMMS protocol (for newer f/w) or MSG_TO_CPU register (for older f/w). In addition, if we halted the device CPU then we need to mark that the driver will do the reset, regardless of the capabilities. Also, to prevent false errors, we need to keep track whether the device CPU was already halted. If so, we shouldn't try to halt it again. Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs: track security status using positive logicOhad Sharabi8-50/+51
Using negative logic (i.e. fw_security_disabled) is confusing. Modify the flag to use positive logic (fw_security_enabled). Signed-off-by: Ohad Sharabi <[email protected]> Reviewed-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs/gaudi: use COMMS to reset device / halt CPUKoby Elbaz3-4/+39
This is needed because legacy FW 'communication' protocol will soon become obsolete. Because COMMS is a boot protocol, communicating through it is supported only until Linux is loaded to the device CPU, where in that case we will fallback to the former implementation. Signed-off-by: Koby Elbaz <[email protected]> Reviewed-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs/gaudi: disable GIC usage if security is enabledKoby Elbaz3-15/+29
Security is set based on PCI ID, and after reading preboot status bits. GIC usage is set in both scenarios since GIC can't be used when security is enabled. Moreover, writing to GIC/SP is enabled only after Linux is fully loaded. Signed-off-by: Koby Elbaz <[email protected]> Reviewed-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs: read preboot status bits in an earlier stageKoby Elbaz1-8/+2
On newer releases, host won't be able to trigger an interrupt directly to the ASIC GIC controller. To be able to decide whether GIC can/not be used, we must read device's preboot status bits in a stage that precedes the possible first use of GIC (when device is in dirty state). Signed-off-by: Koby Elbaz <[email protected]> Reviewed-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs: check running index in eqe controlOded Gabbay5-5/+49
To harden the event queue mechanism, we add a running index to the control header of the entry. The firmware writes the index in each entry and the driver verifies that the index of the current entry is larger by 1 of the index of the previous entry. In case it isn't, the driver will treat the entry as if it wasn't valid (it won't process it but won't skip it). Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs: set memory scrubbing to disabled by defaultOded Gabbay1-2/+2
Scrubbing memory after every unmap is very costly in terms of performance. If a user wants it he can enable it but the default should prioritize performance. Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs/gaudi: do not move HBM bar if iATU done by FWOfir Bitton1-0/+20
As iATU configuration is done by FW, driver should not try and move HBM bar. Signed-off-by: Ofir Bitton <[email protected]> Reviewed-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs/gaudi: read GIC sts after FW is loadedKoby Elbaz3-28/+57
Reading of GIC privileged status will be done after F/W is loaded, because privileged GIC capability is only available with the correct ARMCP version, and after it's loaded. Such versions necessarily support COMMS, so GIC alternatives (SP regs) will be read directly from dynamic regs. As well, initiation of DMA QMANs will occur after F/W is loaded since it depends on GIC configuration. In case F/W isn't loaded there's no problem since either way there won't be any GIC IRQ handling. Signed-off-by: Koby Elbaz <[email protected]> Reviewed-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs: check if asic secured with asic typeOhad Sharabi1-1/+1
Fix issue in which the input to the function is_asic_secured was device PCI_IDS number instead of the asic_type enumeration. Signed-off-by: Ohad Sharabi <[email protected]> Reviewed-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs/gaudi: send hard reset cause to prebootKoby Elbaz6-8/+208
LKD should provide hard reset cause to preboot prior to loading any FW components (in case needed). Current implementation is based on the new FW 'COMMS' protocol In cased 'COMMS' is disabled - reset cause won't be sent. Currently, only 2 reset causes are shared: HEARTBEAT & TDR. Sending the reset cause will provide the missing watchdog info that the firmware needs to provide to the BMC. Signed-off-by: Koby Elbaz <[email protected]> Reviewed-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs: notify before f/w loadingOded Gabbay1-0/+3
An information print notifying on starting to load the f/w was removed by mistake when moving to the new dynamic f/w loading mechanism. Restore that print as the F/W loading usually takes between 10 to 20 seconds and this print helps the user know the status of the driver load. Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs/gaudi: use scratchpad regs instead of GIC controllerKoby Elbaz6-36/+104
Due to new security restrictions, GIC controller can no longer be accessed from user/kernel. To monitor that, a new status bit will be read from preboot caps, indicating whether direct access to GIC is blocked. In case it is blocked, driver will use scratchpad registers instead of using GIC interface on two main scenarios: The first of which LKD triggers interrupts to F/W through GIC, and the second of when LKD configures all engines/QMANs to write to GIC when they want to report an error. From F/W perspective, it will poll on all SPs, and once IRQ number is retrieved, SP register is cleared, and it will perform the write to the GIC to trigger the IRQ handler. Signed-off-by: Koby Elbaz <[email protected]> Reviewed-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs: read f/w's 2-nd sts and err registersOhad Sharabi4-133/+291
Maintain both STS1 and ERR1 registers used for status communication with F/W. Those are not maintained as we currently have less than 31 statuses/error defined and so LKD did not refer to those register. The reason to read them now is to try to support future f/w versions with current driver. Signed-off-by: Ohad Sharabi <[email protected]> Reviewed-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs: avoid using uninitialized pointerOhad Sharabi1-0/+1
When attempting to read FW component's version we should break if input FW component is invalid in order to avoid using uninitialized destination pointer. Signed-off-by: Ohad Sharabi <[email protected]> Reviewed-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs: set dma mask from fw once fw done iatu configOhad Sharabi3-10/+8
When setting "DMA mask from FW" we are reading PSOC_GLOBAL_CONF register which is allowed only once FW has done it's iATU configuration. Signed-off-by: Ohad Sharabi <[email protected]> Reviewed-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs: better error print for pin failureOded Gabbay1-1/+2
Print the user given pointer and error code on failure to get user pages for easier debugging. Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs: add missing space after castingOmer Shpigelman1-1/+1
Change casting code according to kernel coding style. Signed-off-by: Omer Shpigelman <[email protected]> Reviewed-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs: ignore device unusable statusOded Gabbay1-3/+3
Some users might want to implement their own policy of when the device is unusable so we need to ignore this status in the driver and continue loading as normal. Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs: load linux image to deviceOhad Sharabi4-81/+249
Implementing dynamic linux image load to the device. This patch also implements the FW communication steps during the boot-fit. This patch also enables the dynamic protocol based on the compatibility flag. Signed-off-by: Ohad Sharabi <[email protected]> Reviewed-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs: load boot fit to deviceOhad Sharabi4-69/+615
Implementing dynamic boot fit image load to the device. Note that some necessary adjustment were added to the static loader as well so that both loaders can co-exist. as this is not the final FW load stage the dynamic FW load is still forced to be non functional. Signed-off-by: Ohad Sharabi <[email protected]> Reviewed-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs: use dev_dbg upon hint address failureOded Gabbay1-2/+4
Hint address failure that results in a valid mapping with an address that was allocated by the driver is not a real failure. Therefore, the driver shouldn't notify about this in kernel log. The user is responsible to check the returned address. Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs: modify progress status messagesGuy Nisan1-10/+10
Indicate "progress" instead of "error" when reporting progress status. Change "u-boot stopped by user" to "Cannot boot" message as CPU_BOOT_STATUS_UBOOT_NOT_READY may indicate a fatal error that prevent u-boot from loading firmware. Signed-off-by: Guy Nisan <[email protected]> Reviewed-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs: give FW a grace time for configuring iATUOfir Bitton1-0/+4
iATU (internal Address Translation Unit of the PCI controller) configuration is being done by FW right after driver enables the PCI device. Hence, driver must add a minor sleep afterwards in order to make sure FW finishes configuring iATU regions. Signed-off-by: Ofir Bitton <[email protected]> Reviewed-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs: update to latest f/w headersOded Gabbay5-43/+88
Update the common and GAUDI firmware header files to the latest version. The latest version use the correct endianness types so this commit also contains minor changes to the code to use the correct conversions when reading/writing to the firmware structures. Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs: expose ASIC specific PCI info to common codeOhad Sharabi4-0/+124
LKD has interfaces in which it receives device address. For instance the debugfs_read/write variants receives device address for CFG/SRAM/DRAM for read/write and need to translate to the mapped PCI BAR address. In addition, the dynamic FW load protocol dictates that the address to which the LKD will copy the image for the next FW component will be received as a device address and can be placed either in SRAM or DRAM. We need to distinguish those regions as the access methods to those regions are different (in DRAM we possibly need to set the BAR base). Looking forward this code will be used to remove duplicated code in the debugfs_read/write that search the memory region for the input device address. Signed-off-by: Ohad Sharabi <[email protected]> Reviewed-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs: dynamic fw load reset protocolOhad Sharabi4-40/+360
First stage of the dynamic FW load protocol is to reset the protocol to avoid residues from former load cycles. Signed-off-by: Ohad Sharabi <[email protected]> Reviewed-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs: use common fw_version readOhad Sharabi4-93/+74
Instead of using multiple ASIC specific copies of functions to read the FW version use single common one that gets ASIC specific arguments. Signed-off-by: Ohad Sharabi <[email protected]> Reviewed-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs: use mmu cache range invalidationAlon Mizrahi4-94/+16
Use mmu cache range invalidation instead of entire cache invalidation because it yields better performance. In GOYA and GAUDI, always use entire cache invalidation because these ASICs don't support range invalidation. Signed-off-by: Alon Mizrahi <[email protected]> Reviewed-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs: refactor init device cpu codeOhad Sharabi4-28/+82
Replace multiple arguments to init device CPU function by passing firmware loader managing structure that is initialized per ASIC with the loader parameters. In addition, the FW loader management structure is now part of the habanalabs device, this way the loader parameters will be able to be communicated across various boot stages. Signed-off-by: Ohad Sharabi <[email protected]> Reviewed-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs: request f/w in separate functionOhad Sharabi1-21/+44
This refactor is needed due to the dynamic FW load in which requesting the FW file (and getting its attributes) is not immediately followed by copying FW file content. Signed-off-by: Ohad Sharabi <[email protected]> Reviewed-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs: prepare preboot stage to dynamic f/w loadOhad Sharabi2-15/+71
Start the skeleton for the dynamic F/W load by marking current preboot code path as legacy. Signed-off-by: Ohad Sharabi <[email protected]> Reviewed-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs: update firmware files to latestOded Gabbay2-2/+4
Update the firmware files to the latest from the firmware team. Signed-off-by: Oded Gabbay <[email protected]>
2021-06-18habanalabs: increase ELBI reset timeout for PLDMMoti Haimovski1-1/+1
On PLDM, in case of NIC hangs, the ELBI reset to take much longer than expected. As a result an increase in the ELBI reset timeout is required. Signed-off-by: Moti Haimovski <[email protected]> Reviewed-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2021-06-15mei: hdcp: SPDX tag should be the first lineTom Rix1-1/+0
checkpatch looks for the tag on the first line. So delete empty first line Signed-off-by: Tom Rix <[email protected]> Link: https://lore.kernel.org/r/[email protected] Acked-by: Tomas Winkler <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2021-06-15arm64: Add ARM64_PTR_AUTH_KERNEL config optionDaniel Kiss1-3/+3
This patch add the ARM64_PTR_AUTH_KERNEL config and deals with the build aspect of it. Userspace support has no dependency on the toolchain therefore all toolchain checks and build flags are controlled the new config option. The default config behavior will not be changed. Signed-off-by: Daniel Kiss <[email protected]> Acked-by: Will Deacon <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2021-06-14Merge tag 'v5.13-rc6' into tty-nextGreg Kroah-Hartman9-13/+42
We want the tty fixes in here as well. Signed-off-by: Greg Kroah-Hartman <[email protected]>
2021-06-14Merge tag 'v5.13-rc6' into driver-core-nextGreg Kroah-Hartman9-13/+42
We need the driver core fix in here as well. Signed-off-by: Greg Kroah-Hartman <[email protected]>
2021-06-14Merge tag 'v5.13-rc6' into char-misc-nextGreg Kroah-Hartman9-13/+42
We need the fixes in here as well. Signed-off-by: Greg Kroah-Hartman <[email protected]>
2021-06-12nvmem: eeprom: at25: fram discovery simplificationJiri Prchal1-8/+5
Changed "is_fram" to bool and set it based on compatible string. Signed-off-by: Jiri Prchal <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2021-06-11nvmem: eeprom: at25: fix type compiler warningsJiri Prchal1-3/+3
Fixes: drivers/misc/eeprom/at25.c:181:28: warning: field width should have type 'int', but argument has type 'unsigned long' drivers/misc/eeprom/at25.c:386:13: warning: cast to smaller integer type 'int' from 'const void *' Reported-by: kernel test robot <[email protected]> Signed-off-by: Jiri Prchal <[email protected]> Fixes: fd307a4ad332 ("nvmem: prepare basics for FRAM support") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>