In this paper, we propose fault-tolerant field-programmable gate array (FPGA) architectures and a design framework for them, targeting intellectual property (IP) cores in system-on-chip (SoC) designs. Unlike discrete FPGAs, in which the integration scale can be made relatively large, programmable IP cores must support arrays of various sizes. The key features of our architectures are a regular tile structure, spare modules and bypass wires for fault avoidance, and a configuration mechanism for single-cycle reconfiguration. In addition, we use a routing tool, EasyRouter, for the proposed architecture. This tool can handle the various array sizes required by the developed programmable IP cores. In the evaluation, we compare the performance of conventional FPGAs and the proposed fault-tolerant FPGA architectures. On average, our architectures have less than 1.82 times the area and 1.11 times the delay of traditional island-style FPGAs. At the same time, our FPGA shows higher fault tolerance.
Introduction
In the era of system-on-chip (SoC) design, intellectual property (IP) cores have become important components of embedded systems because they reduce product cycle time and development cost. Programmable logic IP cores are particularly attractive for building programmable SoCs and avoiding the additional cost of chip re-spins. There are two main approaches to realizing programmable SoCs: hard-IP-based and soft-IP-based design. Field-programmable gate array (FPGA) vendors provide devices comprising a microprocessor combined with programmable logic as hard-IP cores [1]. In contrast, the soft-IP core approach [2], [3] works the other way around, which has the advantages of providing freedom to target any process and of using a standard cell design flow with state-of-the-art application-specific integrated circuit (ASIC) design tools. Soft-IP core design allows designers to easily make modifications according to their needs. Figure 1 shows an image of programmable logic IP in an SoC. The logic function of a programmable logic IP can be changed rapidly by downloading a configuration bitstream.
The reliability of programmable logic is very important in SoC design. Programmable logic is easily affected by temporary faults (soft errors) caused by radiation. A number of studies have been conducted on soft errors [4]-[6]. Scrubbing is effective for eliminating soft errors in configuration memory bits. In contrast, other than chip replacement, there are no effective measures against physical faults (hard errors). Since SoCs are used in a variety of environments, there is a strong possibility that they may suffer temporary faults caused by radiation or severe temperature cycle fluctuation. A long interruption of operation is critical in highly reliable systems, such as medical equipment and devices operating in remote locations. Continuous operation is possible if the scale of the device allows hard errors to be masked by triple modular redundancy (TMR). However, unlike discrete FPGAs, in which the integration scale can be relatively large, the area constraints on IP cores are severe.
In this paper, we enhance an FPGA architecture [7] featuring high tolerance to physical errors for soft-IP-based programmable logic design, and we present an actual TEG chip manufactured using a 65-nm CMOS technology. In our study, we focus on the following two constraints: scalable design techniques depending on target applications and fault avoidance techniques at the device level. The first constraint requires scalable design techniques since the size of the programmable logic IP varies depending on the target application. With regard to the second constraint, the question is whether to execute placement and routing again or to incorporate a physical-fault avoidance mechanism into the architecture itself. Since implementing avoidance techniques by placement and routing requires time to recompile the configuration bitstream depending on the fault location [8], [9], avoidance time becomes an issue. Meanwhile, if an avoidance mechanism is incorporated into the architecture, the system can continue operating when a fault occurs because of its capability to perform rapid fault avoidance.
The rest of the paper is organized as follows. Section 2 presents related research, and Sect. 3 presents a strategy for programmable IP design. Sections 4 and 5 introduce the proposed FPGA architecture and CAD tools, and Sect. 6 compares the proposed architecture with traditional FPGA architectures. Section 7 presents the results of a detailed evaluation of an actual TEG chip manufactured with our architecture. The conclusion is presented in Sect. 8.
Related Research
Fault tolerance methods for FPGAs can be divided into two groups [10] based on the level of abstraction at which faults are tolerated: configuration-level (CL) and device-level (DL). The first group of methods takes a higher-level approach, namely fault tolerance implemented at the level of the FPGA configuration [8], [9]. CL methods regard the FPGA as a set of abstract resources, often represented as a graph structure, without considering the actual physical structure of the device. When a circuit is placed and routed, fault-free resources are selected from the set of available resources. CL methods are extremely flexible with respect to fault patterns, and most CL approaches can tolerate large numbers of faults. However, CL methods are implemented by tools, and therefore tolerance to new faults requires additional recompile time.
Methods of the second group consider faults at the level of the FPGA hardware [11]-[13]. The main advantage of DL methods is that no recompilation is needed. However, since DL methods employ a larger array so that a fault-free region of redundant hardware can be used, the area overhead is considerable. DL methods are also less flexible and therefore less capable of improving reliability than CL methods. In addition, a process for detecting the fault position is required. Thus, these methods are generally used only for manufacturing test. The proposed method can detect fault positions completely and has better area, delay, and fault tolerance performance than TMR methods.
Both classes of fault tolerance have their pros and cons. However, since the proposed FPGA architecture can be applied after manufacturing, the system can continue operating because of its capability to perform fault avoidance quickly.
Strategy for Programmable IP Design

Design Requirements
Our target is soft-IP-based FPGA cores, and we consider stuck-at faults. To avoid faults inside an FPGA core without recompiling, the configuration data of an implemented circuit must be moved quickly to fault-free regions. Thus, we consider the following points.
(1) Fault avoidance with embedded spare modules. Unlike discrete FPGAs, FPGA cores are bound by severe area constraints. Thus, the designer cannot easily use hardware redundancy techniques such as TMR. Additional hardware modules have to be kept to a minimum to ensure high circuit performance.
(2) Handling cores of various sizes. The design technique must support design scalability since target applications require different core sizes and IO operations. It is also highly desirable to automate the process from exploration of the FPGA architecture to implementation of this architecture on a chip.
(3) Automation of both fault detection and fault avoidance. To perform proper fault avoidance, it is important to detect the fault location. Thus, automation of both fault detection and avoidance is desirable.
If the FPGA has an irregular architecture, moving configuration data can be rather difficult. FPGA arrays consist of a number of logic tiles, each of which has a connection block (CB), a switch block (SB), and a logic block (LB) (Fig. 2). Since island-style FPGAs can have various tile topologies [14], configuration data is not the same for each logic tile. As a result, configuration data cannot be simply moved from faulty tiles to spare tiles. Thus, [14] proposed a uniform tile structure referred to as a homogeneous FPGA. Also, fault detection techniques using the SB topology have been proposed [8] that allow detection of the locations of any faults in FPGA cores (see the next subsection for details on these methods). For these reasons, our fault-tolerant FPGA (FT-FPGA) is based on a homogeneous tile structure. Note that we assume that our FT-FPGA is combined with a processor into an SoC. The processor performs both fault detection and fault avoidance according to the constraints on the FPGA core. Since these cores are implemented on a single chip, this technique essentially implements on-chip fault tolerance.
Our Previous Research
In this subsection, we present previous research related to the topic of this paper. Most FPGAs have a complex structure to achieve high programmability, which also makes manufacturing tests difficult. Thus, automatic test pattern generation tools for ASICs cannot be used directly for FPGAs. To address this problem, we proposed a simple and regular FPGA structure [14] suitable for design for test (Fig. 2). In this architecture, all tiles have a uniform structure, unlike the traditional island-style FPGA architecture, which is composed of several different types of tiles.
We also implement aligners to simplify the connections of wire segments. The configuration patterns for fault detection are generated using the regularity of Wilton-type SBs [15] . Wilton-type SBs can be divided into three types of paths: (a) orthogonal, (b) clockwise, and (c) counterclockwise (Fig. 3) . A desirable feature of this approach is that each path can be tested individually. For example, when all SBs are configured as clockwise paths, the propagation paths are formed as a single stroke path (Fig. 4) . In this case, test signals enter from LBs or input/output blocks, and all clockwise paths are tested. This method can detect the locations of all stuck-at faults in O(N) cycles for an N × N size FPGA array [8] .
In addition, we developed an FPGA design framework that is focused on improving FPGA IP design efficiency. A novel routing tool (EasyRouter), written in the C# language, is combined with traditional academic tools for architecture exploration and circuit implementation. EasyRouter can automatically generate HDL designs of FPGA cores and configuration bitstreams. Once the HDL design of the FPGA core is generated, we can combine it with commercial very-large-scale integration (VLSI) CAD tools.
Fault-Tolerant FPGA Architecture
Proposed Architecture
In this section, we propose an FT-FPGA architecture (Fig. 5) . FT-FPGA is based on the homogeneous routing architecture in [14] , which has a simple regular structure. To implement physical fault avoidance, spare tiles are introduced to the homogeneous FPGA. In this architecture, the entire region is divided into several groups of tiles referred to as tile arrays (TAs). Each TA includes spare tiles, and fault avoidance is performed in individual TAs.
A single column of spare tiles is placed to the right of the main array. In the absence of faults, a circuit is configured with normal tiles and the spare tiles are not used (Fig. 6(a)). If a fault occurs, part of the implemented circuit is moved to fault-free tiles to the right of the column containing faulty tiles (Fig. 6(b)). However, moving circuits causes connection mismatches between TAs. To address these mismatches, we implement TA interfaces, which are constructed as selectors between TAs, to update the connections. Note that tiles in the circuit can be moved column by column only. If the FPGA had to shift tiles row by row as well, a number of additional interfaces would be necessary to update connections between TAs. This would cause significant degradation in circuit speed and increase the overall area of the FPGA.
Connections between tiles must be rerouted after implementing fault avoidance. Therefore, we introduce bypass wires. Figure 7(a) shows conventional wire segments between SBs. In contrast, we implement bypass wires as shown in Fig. 7(b). All segments have bypass wires, each of which is connected to a selector. When performing fault avoidance, these selectors switch the connections between SBs so as to skip faulty tiles. Moreover, all connections within a TA are composed of single-wire segments, and the connections between TAs are composed of both single-wire and multiple-wire segments. Single-wire segments are used to connect neighboring TAs as well as individual tiles, and multiple-wire segments connect distant TAs. Since fault avoidance is implemented within individual TAs, multiple-wire segments do not have bypass wires.
Configuration Technique
Figure 8 shows the proposed configuration structure, which consists of the configuration memory bits of each tile and its multiplexers. These configuration memory bits are also connected serially. In the proposed fault tolerance technique, configuration data can be moved to the tile to the right for the purpose of fault avoidance. However, serial shifting of the configuration data requires a considerable amount of time. To reduce the avoidance time, our architecture uses multiplexers to switch between different tile configurations. These multiplexers can select the configuration data of either one of two neighboring tiles. "L" is chosen by default, and "H" is chosen when a fault occurs. This method can switch the configuration data quickly. Moreover, configuration data for spare tiles is unnecessary because spare tiles are not used in the absence of faults and the configuration data of neighboring tiles is used during fault avoidance. This method can achieve fast fault avoidance and high area efficiency.
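The multiplexer-based configuration switch can be illustrated with a small Python model of one row of a TA. This is only a sketch under stated assumptions, not the authors' implementation: the function names, the list-based representation, and the rule that every column to the right of the faulty column selects "H" (its left neighbor's configuration) are hypothetical choices made for illustration.

```python
from typing import List, Optional


def mux_selects(num_normal_cols: int, faulty_col: Optional[int]) -> List[str]:
    """Select value ("L" or "H") for each physical column of one TA row.

    Physical columns 0..num_normal_cols-1 are normal tiles; the last
    column is the spare tile. "L" keeps the column's own configuration,
    "H" takes the configuration stored for the left neighbor (assumed rule).
    """
    total = num_normal_cols + 1
    if faulty_col is None:
        return ["L"] * total
    if not 0 <= faulty_col < total:
        raise ValueError("faulty_col is outside the tile array")
    return ["H" if col > faulty_col else "L" for col in range(total)]


def effective_config(stored: List[str], faulty_col: Optional[int]) -> List[Optional[str]]:
    """Configuration actually applied to each physical column.

    `stored` holds one entry per normal column; the spare column has no
    configuration memory of its own. None marks a bypassed or unused tile.
    """
    selects = mux_selects(len(stored), faulty_col)
    applied: List[Optional[str]] = []
    for col, sel in enumerate(selects):
        if col == faulty_col:
            applied.append(None)             # faulty tile is skipped via bypass wires
        elif sel == "H":
            applied.append(stored[col - 1])  # shifted one column to the right
        elif col < len(stored):
            applied.append(stored[col])      # default: own configuration
        else:
            applied.append(None)             # spare tile stays unused
    return applied
```

For a 4-column row with stored configurations A to D, a fault in column 1 leaves column 0 unchanged and shifts B, C, and D one column to the right, so the spare column ends up carrying D. Because only select values change, no configuration data has to be shifted serially in this model.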
Fault Detection and Avoidance Flow
Fault tolerance requires testing and fault detection. For the proposed architectures, we utilize the testing and detection techniques previously proposed in [8]. These techniques were developed for homogeneous FPGAs and therefore allow us to identify the locations of all defective tiles in the fault-tolerant architecture. Figure 9 shows the fault detection and avoidance flow. Test mode includes two fault detection phases and one fault avoidance phase. First, when the system enters test mode, the entire system is checked by downloading the bitstream for fault detection. These test patterns [8] can detect the fault position. If no fault is found, the system returns to normal operation. On the other hand, when faults exist, the FT-FPGA uses spare tiles according to the fault position information. After fault avoidance, the system executes fault detection again. If faults remain, fault avoidance has failed and chip replacement is needed. Note that these fault detection methods do not cover the configuration memory itself. To detect faults in configuration memory bits, we only have to input a toggle-pattern bitstream and observe the serial output data by scrubbing.
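The test-mode flow described above (detect, avoid, detect again) can be written as a short control routine. The following Python sketch is illustrative only: `detect` and `avoid` are hypothetical callbacks standing in for the bitstream-based detection of [8] and the spare-tile switching, and the `Outcome` names are assumptions introduced here.

```python
from enum import Enum
from typing import Callable, List, Tuple

Tile = Tuple[int, int]  # (row, column) of a tile; representation is assumed


class Outcome(Enum):
    NO_FAULT = "no fault found, return to normal operation"
    AVOIDED = "fault avoided with spare tiles"
    REPLACE_CHIP = "avoidance failed, chip replacement needed"


def run_test_mode(
    detect: Callable[[], List[Tile]],
    avoid: Callable[[List[Tile]], None],
) -> Outcome:
    """One pass of test mode: two detection phases and one avoidance phase."""
    faults = detect()              # first fault detection phase
    if not faults:
        return Outcome.NO_FAULT
    avoid(faults)                  # switch to spare tiles using fault positions
    remaining = detect()           # second fault detection phase
    return Outcome.AVOIDED if not remaining else Outcome.REPLACE_CHIP
```

In an SoC setting such as the one assumed in this paper, the on-chip processor would be the natural place to run a routine of this shape.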
CAD Tool Set for FT-FPGA Architecture Exploration
Design Flow
The design flow used in this paper consists of three steps: minimum channel width (CW) exploration, physical information extraction, and performance evaluation. First, minimum CW exploration is performed with a selected benchmark set (Step 1 in Fig. 10(a)). We use the conventional FPGA CAD flow to perform synthesis (with ODIN II [16]), technology mapping (with ABC [17]), clustering (with T-VPack [18]), and placement (with VPR 5.0 [19]). Routing is performed with a novel routing tool (EasyRouter [20]), which offers greater flexibility in the development of new architectures than conventional VPR tools. Next, when the FPGA architecture is decided, we design one tile and extract its physical information with commercially available VLSI design CAD tools (Step 2 in Fig. 10(a)). HDL code for the programmable core is generated by EasyRouter, the area is derived from the GDSII layout, and representative path delays are extracted by static timing analysis (STA).
Finally, we evaluate the performance using EasyRouter (Step 3 in Fig. 10(b)). In addition to the files necessary for routing, such as the clustered netlist and placement results, EasyRouter generates the bitstreams of applications. Using the tile area and delay information derived in the second step, EasyRouter combined with VLSI CAD tools can provide reports on the FPGA area and critical path delay for a fixed CW and array size. If the full-chip standard delay format (SDF) file is available, commercial STA tools can also be used to obtain a highly accurate delay report.
Routing Tool
We use the EasyRouter tool [20] instead of a conventional VPR tool to perform routing. EasyRouter is based on routing algorithms similar to those in VPR but offers a number of improvements. We have developed a script-based architecture definition mechanism by taking the code file itself as the architecture definition file. Unlike conventional VPR, in which XML-based architecture description files support only parameter changes for island-style FPGAs, EasyRouter lets users freely implement any functions for their new architectures in the architecture script file. This mechanism offers users maximum flexibility for designing new FPGA architectures. We have also developed an HDL and a bitstream generator to facilitate the evaluation of the designed FPGA architectures with commercially available VLSI CAD tools. Moreover, EasyRouter is coded in C# in the .NET framework with full object-oriented programming support, which reduces the amount of code and its complexity, facilitating the implementation of new architectures. Owing to the benefits of the open-source Mono runtime environment [21], EasyRouter can be executed under most operating systems. Using EasyRouter, we implement and evaluate various FPGA architectures.
Figure 11 shows a block diagram of the EasyRouter operation. First, the architecture script execution block reads, compiles, and executes an architecture script file. All architecture-dependent functions (such as architecture parameters, physical information setup, netlist and placement import, and routing resource graph (RRGraph) build) are included in this script file. Note that physical information about tiles as derived using the flow in Fig. 10 is included in the script file. Then, using the RRGraph generated by the architecture script code, the routing block performs routing. The algorithm for determining channel width is a breadth-first search. Finally, the report generation block produces area and delay reports.
Evaluation
In this section, we introduce the evaluation conditions and flows. Then, we present the area, critical path delay, and fault tolerance performance of the fault-tolerant architectures with different tile array sizes, multiple-wire segment lengths, and multiple-wire segment ratios. Using the evaluation results, we analyze and discuss the characteristics and tradeoffs of the proposed architectures.
Evaluation Conditions
In the evaluation, we compare the performance of conventional island-style FPGAs, homogeneous FPGAs, and FT-FPGAs. Island-style and homogeneous FPGAs are also examined using the TMR technique. FT-FPGA architectures are examined using three architectural parameters, namely TA size (2, 3, or 4), multiple-wire segment length (0, 1, or 2), and multiple-wire segment ratio (25% or 50%). The proposed architectures are labeled in the format TxMxRxx, where x is a digit; for example, T2M1R25 indicates 2×2 TAs, a multiple-wire segment length of 1 (indicating that the connection passes through one tile array only), and a 25% multiple-wire segment ratio.
For all target FPGAs, the lookup table size is six, and each logic block contains four logic elements. The number of input pins in one logic block is 15 [22]. The SBs have a Wilton-type structure. We use unidirectional single- and multiple-wire segments for routing tracks. The parameter Fc_in is set to 0.5, which indicates that a CB connects half of the routing wires in the routing track to one input pin of a logic block. The outputs of four logic blocks are connected to the multiplexers of SBs.
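As a small aid for reading the architecture labels, the following Python sketch decodes a label into its three parameters. The class and function names are hypothetical, and treating a label without an R field (such as T2M0, used later for architectures without multiple-wire segments) as a 0% ratio is an assumption made here.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class FtArch:
    ta_size: int            # T: tile array size (2, 3, or 4 in the evaluation)
    seg_length: int         # M: multiple-wire segment length (0, 1, or 2)
    seg_ratio_percent: int  # R: multiple-wire segment ratio in percent

# T<digit>M<digit> with an optional R<two digits> suffix
_LABEL = re.compile(r"T(\d)M(\d)(?:R(\d{2}))?")


def parse_label(label: str) -> FtArch:
    """Decode a label such as 'T2M1R25'; stray spaces are ignored."""
    match = _LABEL.fullmatch(label.replace(" ", ""))
    if match is None:
        raise ValueError(f"not an architecture label: {label!r}")
    ta, length, ratio = match.groups()
    # Assumption: a missing R field means no multiple-wire segments (0%).
    return FtArch(int(ta), int(length), int(ratio) if ratio else 0)
```

For example, `parse_label("T2M1R25")` yields a TA size of 2, a segment length of 1, and a ratio of 25%.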
The FPGAs for evaluation are designed using a 65-nm CMOS technology, and the synthesis tool is Synopsys Design Compiler F-2011.09-SP2. The layout is implemented using the Cadence EDI system 10.13, and STA is performed using Synopsys PrimeTime F-2011.12-SP1.
Evaluation Flow
We introduced the CAD flow in Sect. 5. In this section, we present the evaluation flow in detail. First, the 10 largest MCNC [23] benchmarks are mapped, clustered, and placed with ABC, T-VPack, and VPR, respectively. Next, routing and examination of minimum channel width for each benchmark are performed using EasyRouter. Then, the channel width of each architecture is fixed to 1.2 times the minimum channel width of all benchmark tests on that architecture. The physical information required for performance evaluation is calculated using a fixed channel width. Finally, performance is analyzed using EasyRouter, after which area and timing reports are generated.
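The channel width rule in this flow can be made concrete with a short Python sketch. Two points are assumptions rather than statements from the flow itself: the "minimum channel width of all benchmark tests" is read as the smallest width that routes every benchmark (the largest of the per-benchmark minima), and the 1.2x margin is rounded up to an integer number of tracks. The function name and the example values are hypothetical.

```python
from typing import Mapping


def fixed_channel_width(min_cw_per_benchmark: Mapping[str, int]) -> int:
    """Channel width fixed for one architecture with a 1.2x margin.

    Assumption: the base value is the largest per-benchmark minimum,
    i.e. the smallest width that can route every benchmark.
    """
    if not min_cw_per_benchmark:
        raise ValueError("at least one benchmark result is required")
    base = max(min_cw_per_benchmark.values())
    # ceil(1.2 * base) in integer arithmetic to avoid floating-point rounding
    return (6 * base + 4) // 5
```

With illustrative minima of 20 and 28 tracks for two benchmarks, the fixed width under these assumptions would be 34 tracks (1.2 × 28 = 33.6, rounded up).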
To evaluate the fault tolerance, we use a simulation-based fault injection program which randomly selects a tile as a designated faulty tile. The proposed fault-tolerant mechanism is subsequently activated. Finally, the simulator analyzes whether all faults are tolerated. This process is repeated 100,000 times, and the success rate is evaluated. The array size is set according to the size of each benchmark. Note that fault avoidance is not performed when failures occur in unused tiles.
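A fault injection experiment of this kind can be sketched as a small Monte Carlo simulation in Python. This is not the authors' simulator. It assumes a simplified model in which every TA has `ta_size` rows and `ta_size + 1` columns (the last being the spare column), faults hit distinct tiles chosen uniformly at random, every tile counts as used, and a TA tolerates its faults only if they all lie in one column. All names are hypothetical.

```python
import random
from typing import Dict, Iterable, Optional


def tolerated(fault_ids: Iterable[int], ta_size: int) -> bool:
    """True if, in every TA, all faulty tiles fall into a single column."""
    cols = ta_size + 1                  # normal columns plus one spare column
    tiles_per_ta = ta_size * cols
    faulty_col_of_ta: Dict[int, int] = {}
    for tile_id in fault_ids:
        ta, local = divmod(tile_id, tiles_per_ta)
        col = local % cols              # row-major numbering inside a TA
        if faulty_col_of_ta.setdefault(ta, col) != col:
            return False                # a second faulty column in the same TA
    return True


def success_rate(num_tas: int, ta_size: int, num_faults: int,
                 trials: int = 100_000, seed: Optional[int] = None) -> float:
    """Fraction of trials in which all injected faults are tolerated."""
    rng = random.Random(seed)
    total_tiles = num_tas * ta_size * (ta_size + 1)
    successes = sum(
        tolerated(rng.sample(range(total_tiles), num_faults), ta_size)
        for _ in range(trials)
    )
    return successes / trials
```

Under this model a single fault is always tolerated, and the success rate drops as more faults are injected or as fewer TAs share the array. The numbers it produces depend on the stated assumptions and should not be expected to match the reported results exactly.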
Fault Tolerance Performance
In the fault tolerance evaluation, it is necessary to consider circuits that cannot be covered by redundant protection. We assume that the bitstream has been verified to download correctly. We first run the error injection simulation 100,000 times and count the fault avoidance rate. The final fault tolerability, taking circuit area into account, is then calculated by the following equation:
where Area avoidance is the area ratio of protected circuits in one tile, N is the number of injected errors, and Rate avoidance is the fault avoidance rate obtained from the error injection simulation. The only circuits within one tile that cannot be covered by spare tiles are the interface blocks shown in Fig. 6 and other non-redundant circuits. The area ratio of these circuits is less than 1% of the tile area, as listed in Table 2. Because placement and routing are performed on a flattened design, the area data are taken from the synthesis report. Figure 12 shows benchmark-independent fault tolerance performance results. For example, the circuit "pdc", which is the largest circuit in this evaluation, requires tile arrays totaling 864 tiles (576 normal and 288 spare tiles) in the T2 architecture (Table 1). Island-style and homogeneous FPGAs tolerate faults only when the faulty tiles are not used. The two TMR FPGAs can tolerate up to one faulty tile with a 99% success rate. In contrast, although the performance of the three FT-FPGA architectures differs, each of them shows a higher success rate than the TMR FPGAs. This is because each TA can tolerate faults on a single column of tiles. Fault tolerance decreases significantly with increasing TA size. For the case shown in Fig. 12, if we inject seven errors into the pdc circuit implementation, the fault tolerability is calculated as follows:
The T2 architecture can tolerate up to seven faulty tiles with an 87.8% success rate, and other benchmark circuits display similar trends. Note that a fault in a scan wire cannot be treated as a single fault, because correct configuration can no longer be executed. However, since it is not possible to calculate the area of these wires from the synthesis report, these faults are excluded from this evaluation.
Area and Delay
Figure 13 shows the area evaluation results normalized by the value for island-style FPGAs. Compared with island-style FPGAs, the areas of the proposed architectures with tile array sizes of 2, 3, and 4 are, on average, 1.90x, 1.79x, and 1.76x, respectively (the total average area of FT-FPGAs is 1.82x). This is because a larger Tx has fewer spare tiles in total, as shown in Table 1. On the other hand, an island-style FPGA with TMR is 2.59x larger in area than an island-style FPGA.
Figure 14 shows the critical path delay evaluation results normalized by that for island-style FPGAs. Compared with island-style FPGAs, the delays of the proposed architectures with tile array sizes of 2, 3, and 4 are, on average, 1.18x, 1.13x, and 1.03x, respectively (the total average delay of FT-FPGAs is 1.11x). In the delay evaluation, we first analyze architectures without multiple-wire segments, in which island-style and homogeneous FPGAs display similar performance. The critical path delay of FT-FPGAs T2M0, T3M0, and T4M0 is approximately 1.3x that of island-style FPGAs owing to the use of TA interfaces and spare tiles. We can also see from the results that architectures with larger TAs have shorter critical path delays because fewer fault-tolerant resources are used.
Next, we investigate the performance of FPGAs with multiple-wire segments. Overall, multiple-wire segments of types M1 and M2 improve the performance for all architectures. Higher multiple-wire segment ratios also improve the performance. We can see that the effect of type M2 is more favorable on T4 architectures than on the other architectures. Multiple-wire segments are therefore particularly effective for improving the delay performance of architectures with larger TAs.
TMR circuits implemented on conventional FPGAs require on average 2.59x the area and have 1.90x the critical path delay of island-style FPGAs. In comparison, on average, our architectures require less than 1.82x the area and have 1.11x the delay compared with island-style FPGAs. We are confident that the compactness and fault tolerance of our architectures are suitable for designing high-performance programmable IP cores.
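The averages quoted in this section can be cross-checked with a few lines of Python. The per-TA-size values are taken from the text; that the reported totals are a plain mean of these three values is an assumption, although the numbers are consistent with it. The variable names are chosen here for illustration.

```python
# Normalized to island-style FPGAs (values quoted in this section)
area = {"T2": 1.90, "T3": 1.79, "T4": 1.76}
delay = {"T2": 1.18, "T3": 1.13, "T4": 1.03}
tmr_area, tmr_delay = 2.59, 1.90


def mean(values):
    values = list(values)
    return sum(values) / len(values)


avg_area = round(mean(area.values()), 2)    # compare with the reported 1.82x
avg_delay = round(mean(delay.values()), 2)  # compare with the reported 1.11x

# FT-FPGA relative to the TMR implementation on an island-style FPGA
area_vs_tmr = avg_area / tmr_area
delay_vs_tmr = avg_delay / tmr_delay

print(avg_area, avg_delay)
print(f"{area_vs_tmr:.2f} {delay_vs_tmr:.2f}")
```

Under this reading, the FT-FPGA needs roughly 70% of the area and under 60% of the critical path delay of the TMR implementation.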
Physical Validation
Chip Layout
Although the proposed target FPGA is provided as a soft IP, it must eventually be physically placed and routed and integrated into an SoC. To perform physical validation, we design an actual FT-FPGA TEG chip using a 65-nm TSMC standard cell library. In this evaluation, we explore the FT-FPGA architecture using the architecture exploration mode of EasyRouter [20], targeting DES (6.5K 2-NAND gates) and 8-bit MAC (1.5K 2-NAND gates) circuits. To emulate stuck-at faults, we prepare five error injection circuits for each tile. The error injection circuit is shown in Fig. 17. If we want to inject a stuck-at error on wire 1 in Fig. 17(a), we place a MUX on this wire as shown in Fig. 17(b). This MUX is controlled by FFs that are used as an error injection memory. All error injection memories are connected in a serial chain, which is controllable from outside the chip. Each tile contains three stuck-at-1 injection circuits; the other two injection circuits emulate stuck-at-0 errors. Table 3 shows the characteristics of the TEG chip, and Fig. 15(a) shows a top view of the chip layout. The TEG chip has 5×5 TAs, each with 4×4 normal tiles and 4×1 spare tiles. The total number of normal and spare tiles is 400 (20×20 tiles) and 100, respectively. Moreover, the total number of configuration memory bits is 205,520, and the total die area is 12 mm² (3.46 mm × 3.46 mm). Note that we utilize scan-type flip-flops (FFs) as configuration memory cells.
Figure 15(b) shows the layout of the tile array; its area is 0.34 mm². Figures 15(c) and (d) show the layouts of normal and spare tiles. Since spare tiles do not have configuration memory bits, their area is about one-seventh that of a normal tile. The total spare tile area is only 4.3% of the total die area. Figure 16 shows a photograph of the FT-FPGA TEG chip. In this evaluation, we generate bitstreams of the DES and 8-bit MAC circuits by using the CAD flow introduced in Sect. 5. All functions, including normal operation and fault detection/avoidance, are examined thoroughly by downloading the bitstreams.
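The tile counts of the TEG chip follow directly from the TA organization, as the short Python check below shows. The function name is hypothetical, and a square arrangement of TAs is assumed.

```python
from typing import Tuple


def tile_counts(tas_per_side: int, ta_size: int) -> Tuple[int, int]:
    """(normal, spare) tile counts for a square array of TAs.

    Each TA holds ta_size x ta_size normal tiles plus one spare column
    of ta_size tiles.
    """
    num_tas = tas_per_side ** 2
    normal = num_tas * ta_size * ta_size
    spare = num_tas * ta_size
    return normal, spare


print(tile_counts(5, 4))    # TEG chip: 5x5 TAs with 4x4 normal tiles each
print(round(3.46 ** 2, 2))  # die area in mm^2 from the quoted side length
```

The first line gives 400 normal and 100 spare tiles, matching the chip description, and the second gives about 11.97 mm², which rounds to the quoted 12 mm².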
Delay Performance
In this section, we first present STA results for the delay of one-segment wires, which show the delay distribution of logically equivalent routing paths of the normal tile module. Then, we show the critical path delay results of the MAC circuit and the DES circuit derived from full-chip STA performed with PrimeTime and from the EasyRouter fast evaluation mode (Fig. 10(b)). The full-chip STA is performed with SDF data extracted from the full chip layout. However, when evaluating a number of FPGAs with different architectures and scales, the full chip layout process is time consuming. Instead, the EasyRouter fast evaluation mode calculates the critical path delay from the SDF of one tile because all tile structures in our FPGA are the same. The accuracy of this delay model is also shown in this section.
One important physical layout goal in the case of FPGAs is to ensure that logically equivalent paths, such as one-segment wires in routing channels, have the same physical delay. If this requirement is fulfilled, FPGA electronic design automation tools can predict path delay accurately, without having to take into account differences between equivalent paths, thereby ensuring a stable implementation performance of user circuits. However, because the tile layout of FT-FPGA is determined by automatic placement and routing, it is difficult to meet this design condition. Although we impose constraints to ensure that equivalent paths have similar physical delays, the delays of all 168 one-segment wires (28 tracks and F s = 0.3 with 4 direction channels) in the tile are still in the range from 0.32 to 0.70 ns, with an average delay of 0.50 ns (Fig. 18). To determine the highest safe clock frequency, EasyRouter uses the worst delay of the equivalent paths to calculate the critical path delay for a user circuit.
Figure 19 shows the critical path delays of the MAC circuit and the DES circuit as calculated by PrimeTime and EasyRouter, respectively. This figure shows the maximum, average, and minimum delay values of 20 placement seed implementations for each benchmark. The PrimeTime results are taken to represent the actual delay during the 20 seed implementations, where the critical path delays of the MAC and DES circuits are around 6.57 and 10.89 ns, respectively. The DES circuit, which is larger, has a broader distribution range. Figure 19 also shows the accuracy of the EasyRouter fast evaluation mode, which uses the worst delay of representative equivalent paths of a single tile to evaluate the delay for the entire FPGA. From the results we can see that the average critical path delays of the MAC and DES circuits, as calculated by EasyRouter, are respectively 1.28x and 1.38x the PrimeTime results.
Furthermore, the PrimeTime results show that the average critical path delay of the standard DES circuit is 1.36x that of the MAC circuit, while EasyRouter reports an increase of 1.47x. Therefore, the fast delay evaluation in EasyRouter can be considered a reliable tool for fast evaluation of critical path delay.
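The quoted ratios are mutually consistent, which can be verified with a short calculation. If EasyRouter overestimates the MAC delay by a factor of 1.28 and the DES delay by a factor of 1.38, then the DES-to-MAC ratio it reports is the PrimeTime ratio scaled by 1.38/1.28. The Python lines below are only a worked check of this arithmetic; the variable names are chosen here for illustration.

```python
pt_des_over_mac = 1.36  # DES/MAC critical path ratio from PrimeTime
er_over_pt_mac = 1.28   # EasyRouter/PrimeTime ratio for the MAC circuit
er_over_pt_des = 1.38   # EasyRouter/PrimeTime ratio for the DES circuit

# (ER_des / ER_mac) = (PT_des / PT_mac) * (ER_des / PT_des) / (ER_mac / PT_mac)
er_des_over_mac = pt_des_over_mac * er_over_pt_des / er_over_pt_mac

print(round(er_des_over_mac, 2))
```

The result, about 1.47, agrees with the increase reported by EasyRouter, so the pessimism of the fast evaluation mode shifts the relative ordering of the two circuits only slightly.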
Conclusions
In this paper, we proposed an FPGA architecture featuring high tolerance to physical errors for soft-IP-based programmable logic design, and we presented an actual TEG chip to perform physical validation. The key features of our architectures are a regular tile structure, spare modules and bypass wires for fault avoidance, and a configuration mechanism for single-cycle reconfiguration. Our FT-FPGA has a smaller area and a shorter delay than an island-style FPGA using the TMR method. Nevertheless, the FT-FPGA shows higher fault tolerance.
Next, we fabricated an actual FT-FPGA TEG chip using a 65-nm TSMC standard cell library. We then showed the critical path delay results of the MAC circuit and the DES circuit derived from full-chip STA performed with PrimeTime and from the EasyRouter fast evaluation mode. From the results we can see that the average critical path delays of the MAC and DES circuits, as calculated by EasyRouter, are respectively 1.28x and 1.38x the PrimeTime results. Furthermore, the PrimeTime results show that the average critical path delay of the standard DES circuit is 1.36x that of the MAC circuit, while EasyRouter reports an increase of 1.47x. Therefore, the fast delay evaluation in EasyRouter can be considered a reliable tool for fast evaluation of critical path delay. We are confident that the compactness and fault tolerance of our architectures are suitable for designing high-performance programmable IP cores.
