Customers demand high reliability from the FPGAs (field-programmable gate arrays) built into products for mission-critical use and into products that must be certified against functional safety standards.
Hitachi Information & Telecommunication Engineering, Ltd. draws on technical expertise and experience accumulated over many years to resolve FPGA-related problems. To firmly support your important systems and business operations that cannot be allowed to stop, we provide the following services:
When data in configuration RAM (CRAM) in an FPGA is inverted due to the effects of cosmic-ray neutrons, the circuit configuration changes, which can cause the device to malfunction. This type of malfunction is called a “soft error”.
When a soft error occurs, the malfunction continues until the power is cycled. Once the power is turned back on, the FPGA is reconfigured and system operation returns to normal. As a result, investigators cannot identify the cause of the malfunction.
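The effect of a single inverted configuration bit can be pictured with a small model. The sketch below assumes a 4-input look-up table (LUT) whose truth table is held in a 16-bit value; the names `lut_output`, `flip_bit`, and `AND4_INIT` are illustrative and do not come from any vendor tool.

```python
def lut_output(init: int, inputs: tuple[int, int, int, int]) -> int:
    """Evaluate a 4-input LUT whose 16-bit truth table is stored in `init`."""
    a, b, c, d = inputs
    index = a | (b << 1) | (c << 2) | (d << 3)
    return (init >> index) & 1


def flip_bit(init: int, bit: int) -> int:
    """Model a soft error by inverting one configuration bit."""
    return init ^ (1 << bit)


# Truth table of a 4-input AND gate: only entry 15 (all inputs 1) outputs 1.
AND4_INIT = 0x8000

golden = AND4_INIT
upset = flip_bit(golden, 0)  # bit 0 is the entry for inputs (0, 0, 0, 0)

print(lut_output(golden, (0, 0, 0, 0)))  # 0
print(lut_output(upset, (0, 0, 0, 0)))   # 1: the logic is no longer an AND gate

# Reloading the configuration at power-on restores the original truth table,
# which is why the symptom disappears after a restart.
reloaded = AND4_INIT
print(lut_output(reloaded, (0, 0, 0, 0)))  # 0
```

The flipped bit changes the implemented logic itself rather than a data value passing through it, so the wrong behavior persists until the configuration is rewritten.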
For details about FPGA-related soft errors, see the following document:
Applications that emphasize safety and reliability must meet failsafe requirements and physical security requirements. To meet these requirements, the propagation of errors must be suppressed.
Normally, unlike an ASIC layout, an FPGA layout is not arranged by functional module. For this reason, the gates and wiring of multiple functional modules might be adjacent to each other. In such a layout, when a physical failure occurs, the error might propagate to multiple functions at the same time.
Measures and IP (intellectual property) against soft errors are provided by FPGA vendors (Xilinx and Intel). However, the specifications of the vendor-provided IP are difficult to understand, and implementation is difficult for users who do not have a good understanding of FPGAs.
Our company has been working on measures against soft errors in collaboration with research laboratories of Hitachi, Ltd. since 2010. Drawing on the technical expertise and experience Hitachi has accumulated over many years, we can incorporate the vendor-provided IP and implement functionality against soft errors. With this functionality, inverted CRAM data can be corrected (soft error correction) so that system operation continues. Alternatively, the system can be stopped safely when a soft error is detected (soft error detection).
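The difference between soft error correction and soft error detection can be sketched with a toy model. The code below is only an illustration under stated assumptions: configuration memory is modeled as a list of byte frames, a CRC-32 checksum stands in for whatever checking scheme the vendor IP actually uses, and `ConfigMemory` and `handle_soft_error` are hypothetical names.

```python
import zlib


class ConfigMemory:
    """Toy model of configuration RAM split into fixed-size frames."""

    def __init__(self, frames: list[bytes]):
        self.frames = [bytearray(f) for f in frames]
        # Reference copy and per-frame checksums captured at configuration time.
        self._golden = [bytes(f) for f in frames]
        self._crc = [zlib.crc32(f) for f in frames]

    def detect(self) -> list[int]:
        """Return indices of frames that no longer match their checksum."""
        return [
            i for i, frame in enumerate(self.frames)
            if zlib.crc32(bytes(frame)) != self._crc[i]
        ]

    def correct(self) -> list[int]:
        """Rewrite corrupted frames from the reference copy; return their indices."""
        bad = self.detect()
        for i in bad:
            self.frames[i][:] = self._golden[i]
        return bad


def handle_soft_error(mem: ConfigMemory, policy: str) -> str:
    """Apply one of two policies: 'correct' (keep running) or 'stop' (safe stop)."""
    if policy not in ("correct", "stop"):
        raise ValueError("policy must be 'correct' or 'stop'")
    if not mem.detect():
        return "ok"
    if policy == "correct":
        mem.correct()
        return "corrected"
    return "safe_stop"


mem = ConfigMemory([bytes(8) for _ in range(4)])
mem.frames[2][5] ^= 0x10                   # simulate one inverted bit in frame 2
print(mem.detect())                        # [2]
print(handle_soft_error(mem, "correct"))   # corrected
print(mem.detect())                        # []
```

Which policy fits depends on the product: correction favors continued operation, while detection with a safe stop favors a known, controlled state.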
We can implement isolation in FPGAs for products where the propagation of errors must be suppressed.
In an isolation implementation, the layout is divided by functional module. Enclosing gates and wires inside fences prevents a functional module from being mixed with others, thereby suppressing the propagation of errors.
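The idea behind fences can be illustrated with a simplified geometric model. The sketch below assumes each functional module occupies a rectangular region of the device; real non-isolated layouts interleave modules at the gate and wire level rather than as overlapping rectangles, and `Fence`, `is_isolated`, and `modules_at` are hypothetical names used only for this example.

```python
from itertools import combinations
from typing import NamedTuple


class Fence(NamedTuple):
    """Rectangular region covering x0 <= x < x1 and y0 <= y < y1."""
    x0: int
    y0: int
    x1: int
    y1: int

    def contains(self, site: tuple[int, int]) -> bool:
        x, y = site
        return self.x0 <= x < self.x1 and self.y0 <= y < self.y1

    def overlaps(self, other: "Fence") -> bool:
        return (self.x0 < other.x1 and other.x0 < self.x1
                and self.y0 < other.y1 and other.y0 < self.y1)


def is_isolated(fences: dict[str, Fence]) -> bool:
    """True when no two module regions share any site."""
    return not any(a.overlaps(b) for a, b in combinations(fences.values(), 2))


def modules_at(site: tuple[int, int], fences: dict[str, Fence]) -> list[str]:
    """Modules whose resources could be affected by a physical failure at `site`."""
    return [name for name, fence in fences.items() if fence.contains(site)]


mixed = {"control": Fence(0, 0, 6, 6), "monitor": Fence(4, 4, 10, 10)}
fenced = {"control": Fence(0, 0, 5, 10), "monitor": Fence(6, 0, 10, 10)}

print(is_isolated(mixed), modules_at((4, 5), mixed))    # False ['control', 'monitor']
print(is_isolated(fenced), modules_at((4, 5), fenced))  # True ['control']
```

With disjoint fences, a failure at any single site can belong to at most one module, which is the property the isolation implementation aims for.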
To make functionality against soft errors easy to incorporate, we are developing “wrapping IP” (which supports Xilinx IP only) based on technologies accumulated over many years. An error injection function is also built into the wrapping IP, so pseudo soft errors can be generated intentionally.
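Why an error injection function is useful can be shown with a small verification sketch. This is not the interface of the wrapping IP; it assumes a byte frame protected by a CRC-32 checksum, and `detector` and `injection_campaign` are hypothetical names. The campaign inverts every bit in turn and counts how many injected errors the detection logic reports.

```python
import zlib


def detector(frame: bytes, expected_crc: int) -> bool:
    """Detection logic under test: True when the frame fails its checksum."""
    return zlib.crc32(frame) != expected_crc


def injection_campaign(frame: bytes) -> float:
    """Inject a single-bit error at every position; return the detected fraction."""
    expected = zlib.crc32(frame)
    total_bits = len(frame) * 8
    detected = 0
    for bit in range(total_bits):
        corrupted = bytearray(frame)
        corrupted[bit // 8] ^= 1 << (bit % 8)  # pseudo soft error
        if detector(bytes(corrupted), expected):
            detected += 1
    return detected / total_bits


frame = bytes(range(16))
print(detector(frame, zlib.crc32(frame)))  # False: a clean frame raises no alarm
print(injection_campaign(frame))           # 1.0: every single-bit flip is reported
```

Running a campaign like this on the actual machine is what makes it possible to confirm that the incorporated measures respond to an error that cannot otherwise be produced on demand.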
Measures against soft errors: Wrapping IP (SERES*1)
*1: SERES refers to soft error restoration support.
The FPGA vendor provides soft error measures for configuration RAM (CRAM). Figure 1 shows the circuit configuration when IPs (intellectual properties) for soft error measures are used. The user must connect IP1 to IP3. Furthermore, to control IP3, the user also needs to design and incorporate control logic. Therefore, unless the user correctly understands the functions, interface, and usage of each piece of IP, soft error measures cannot be implemented.
Figure 1: A circuit configuration using IP provided by the vendor to resolve a soft error problem
Our company developed the wrapping IP, which combines the IPs provided by the FPGA vendor with IP control logic, based on technologies accumulated over many years (see Figure 2). This wrapping IP provides the following functions against soft errors and simplifies the difficult IP control.
Functions of the wrapping IP:
*2: A highly reliable original function for this IP
Figure 2: The wrapping IP developed by our company
When a soft error occurs in an FPGA and the system is then restarted, the FPGA is initialized at power-on and the evidence of the error is lost. This is why the abnormality cannot be identified afterward. With the soft error logging function developed by our company, the occurrence of a soft error remains recorded even after the system is restarted.
Soft error logging function
To decide whether a soft error is the cause of a system stopping or restarting, the error detected by implementing soft error measures must be clearly recorded. However, when the system restarts, the FPGA is initialized and the result of the detection disappears.
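The requirement described above, keeping the detection result across a restart, can be pictured with a minimal sketch. It is not a description of any product's logging function: a file stands in for non-volatile storage, and `SoftErrorLog`, its fields, and `demo` are hypothetical names chosen for this example.

```python
import json
import os
import tempfile


class SoftErrorLog:
    """Append-only event log kept in storage that survives a restart.

    A file stands in for non-volatile memory here (an assumption).
    """

    def __init__(self, path: str):
        self.path = path

    def record(self, frame: int, bit: int, action: str) -> None:
        entry = {"frame": frame, "bit": bit, "action": action}
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(entry) + "\n")
            f.flush()
            os.fsync(f.fileno())  # push the entry to storage before continuing

    def entries(self) -> list[dict]:
        if not os.path.exists(self.path):
            return []
        with open(self.path, encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]


def demo() -> list[dict]:
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "soft_error.log")
        SoftErrorLog(path).record(frame=12, bit=3, action="detected")
        # "Restart": all in-memory state is gone; only the stored log remains.
        return SoftErrorLog(path).entries()


print(demo())  # [{'frame': 12, 'bit': 3, 'action': 'detected'}]
```

The key design point is that the record is written outside the state that is reinitialized at restart, so an investigator can still see that a soft error was detected.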
We developed a soft error logging function so that the results of soft error detection are retained after a system restart. The soft error logging function has the following features:
Features of the soft error logging function:
For know-how regarding soft error measures, see the following document:
Know-how regarding soft error measures:
1. Soft errors might occur even though soft error measures are implemented.
Soft error measures of the FPGA do not suppress the occurrence of soft errors or reduce their frequency. The measures correct the data in the configuration RAM (CRAM) in which data inversion due to neutron rays occurred. Customers need to be aware that soft errors cannot be stopped.
2. System malfunctions might occur even though soft error measures are implemented.
Soft error measures for FPGAs monitor data inversions in the CRAM, and then correct the data at the location where the soft error occurs. For this reason, a malfunction might occur in the period from the time a soft error occurs to the time the data is corrected. To ensure system operation continues without interruption, measures against soft errors at the system level also need to be considered, and not just incorporating soft error measures for the FPGA.
3. Prior consideration of soft error measures is important.
A soft error cannot be intentionally generated in normal circumstances, so it is extremely difficult to check whether the logic incorporated for soft error measures is effective. To check the operation by performing evaluations on an actual machine, a function that generates pseudo soft errors needs to be installed. In addition, the logging function must also be installed to record soft errors after the system restarts because of a soft error. It is important to consider these functions in advance based on your system requirements.
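Point 2 above, the period between the occurrence of a soft error and its correction, can be put into rough numbers. The sketch below assumes the checking logic visits every CRAM location once per fixed scan period and that an error can occur at any moment within that period with equal likelihood; `exposure_window_ms` and its parameters are hypothetical, and the values in the example are not figures for any device.

```python
def exposure_window_ms(scan_period_ms: float, repair_ms: float) -> tuple[float, float]:
    """Return (mean, worst-case) time in ms that an inverted bit stays in effect.

    Assumes periodic scanning and an error time uniformly distributed
    within one scan period.
    """
    if scan_period_ms <= 0 or repair_ms < 0:
        raise ValueError("scan_period_ms must be > 0 and repair_ms >= 0")
    mean = scan_period_ms / 2 + repair_ms   # on average, half a scan passes first
    worst = scan_period_ms + repair_ms      # error lands just after its location was checked
    return mean, worst


# Hypothetical values: 10 ms to scan the whole CRAM, 1 ms to rewrite the data.
print(exposure_window_ms(10.0, 1.0))  # (6.0, 11.0)
```

Whatever the real figures are, the window is never zero, which is why system-level measures are needed in addition to the FPGA-level ones.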
As a method for checking the impact of soft errors on products, we provide, in collaboration with Hitachi, Ltd., a neutron irradiation test service that uses particle accelerators owned by universities and other institutions.
We accept contracts to support neutron irradiation tests and to provide consulting and design services related to measures against FPGA soft errors.
1. Planning the test
2. Conducting the test and analyzing
3. Taking measures and performing design
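How results from an accelerated neutron test are commonly turned into a field error rate can be sketched as follows. The arithmetic is the standard one (cross section = observed errors / fluence; FIT = failures per 10^9 device-hours), but the function names and every number in the example are hypothetical, and the flux value is an assumption that must be replaced with the reference value that applies to the product's standard and location.

```python
def cross_section_cm2(observed_errors: int, fluence_n_per_cm2: float) -> float:
    """Error cross section: errors observed per unit neutron fluence."""
    if fluence_n_per_cm2 <= 0:
        raise ValueError("fluence must be positive")
    return observed_errors / fluence_n_per_cm2


def fit_rate(sigma_cm2: float, flux_n_per_cm2_h: float) -> float:
    """Expected failures per 1e9 device-hours at the given neutron flux."""
    return sigma_cm2 * flux_n_per_cm2_h * 1e9


# Hypothetical test result: 200 errors after a fluence of 1e10 neutrons/cm^2.
sigma = cross_section_cm2(200, 1e10)

# Assumed field flux of 13 neutrons/cm^2/h (an often-quoted sea-level
# reference value; treat it as a placeholder, not a measured figure).
fit = fit_rate(sigma, 13.0)

print(f"{sigma:.1e} cm^2, about {fit:.0f} FIT")
```

The same two steps apply whether the errors counted are raw bit inversions or system-level malfunctions; only the meaning of the resulting rate changes.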
We are working on QASS (General Incorporated Association Quantum beam Applications for Safe and Smart society) activities together with the Research and Development Group of Hitachi, Ltd., which participates in QASS.
The following table provides an overview of supported FPGAs and services provided by our company.
Overview of supported FPGAs | |||||
---|---|---|---|---|---|
FPGA vendors | Xilinx | Intel | |||
Supported devices |
|
|
|
||
FPGA development tools | Vivado | Quartus Prime | |||
Design language |
|
||||
Engineering services | |||||
Soft error detection | Y | Y | Y | ||
Soft error correction | Y | Y | N | ||
Isolation implementation | Y | Y | Y | ||
Our unique services | |||||
Providing the wrapping IP | Y | N | N | ||
Installing the soft error logging function | Y | Y | Y | ||
Performing neutron irradiation tests* | Y | Y | Y |
Legend: Y: Available, N: Not available
There are international standards for terrestrial telecommunication equipment. Our company has experience in developing FPGA designs that comply with international standards for telecommunication equipment.
Establishment of Soft Error Measures for Telecommunication Equipment and ITU-T International Standards.
Source: News Release of Hitachi, Ltd.
One of Hitachi’s Lumada customer cases for which these services were provided is “Evaluation of the frequency of unreproducible failures in electronic systems caused by neutrons and suggestions on countermeasures based on our know-how”. In this customer case, we measured and evaluated neutron irradiation, tolerance, and error information.
Lumada use case: Evaluation of the frequency of unreproducible failures in electronic systems caused by neutrons and suggestions on countermeasures based on our know-how.
Source: Lumada, Hitachi, Ltd.