Abstract
This paper describes a tunable transient filter (TTF) design for soft error rate reduction in combinational logic circuits. TTFs can be inserted into combinational circuits to suppress propagated single-event transients (SETs) before they can be captured in latches or flip-flops. TTFs are tuned by adjusting the maximum width of the propagated SET that can be suppressed. A TTF requires 6-14 transistors, making it an attractive cost-effective option to reduce the soft error rate in combinational circuits. A global optimization approach based on geometric programming that integrates TTF insertion with dual-V DD and gate sizing is described. Simulation results for the 65 nm process technology indicate that a 17-48× reduction in the soft error rate can be achieved with this approach.
reduced logic depth are projected to increase the soft error rate in sub-100 nm integrated circuits [3, 30]. Soft errors occur as a result of single-event transients (SETs) caused by high-energy neutron or alpha particle strikes in integrated circuits. Although soft errors cause no permanent damage, they can severely limit the reliability of electronic systems.
Several solutions including ones based on error detection and correction have been proposed to enhance the reliability of memories, flip-flops, and latches to soft errors (e.g., [20, 21] ). Error detection techniques include techniques that (i) require addition of extra hardware [23] and (ii) use time redundancy [25] to reduce hardware overhead. Besides error detection, techniques that use logical masking to increase robustness to SETs based on redundancy addition and removal [32] and rewiring [2] have also been proposed. However, the applicability of these techniques to combinational circuits is limited owing to the irregular multi-level structure of combinational circuits that leads to very high design overhead.
Various gate level hardening techniques [8, 10, 15, 19, 27, 35, 36] have also been proposed in the literature. These techniques use design parameters like gate size and supply voltage V DD to optimally achieve SET robustness for the gates in a circuit. In [36], selective sizing of critical gates is used to achieve robustness to SETs. In [10], a technique based on addition of capacitive load to the primary outputs followed by modification of size and V DD assignment of internal gates is used to increase robustness to SETs. In [19], the authors propose soft error hardening techniques that can be applied during layout using an automatic layout generator. In [8], an optimal assignment of size and V DD for SET robustness is obtained using an optimization framework based on geometric programming. A simultaneous gate sizing and flip-flop selection technique has been proposed in [27].
Various circuit level techniques that filter SETs [12, 17, 29] by addition of filtering circuitry have also been proposed. A technique for filtering glitches using pass transistors is proposed in [17]. In this technique, the most vulnerable (observable) logic gates are identified and isolated using complementary pass gates. The complementary pass gates act as a low pass filter and suppress transient pulses due to radiation strikes. In [12], a technique based on duplicating vulnerable gates (called shadow gates) is proposed. The output of the shadow gate is tied to the output of the original gate using a voltage clamping circuit. This prevents the output voltage of the gate from deviating due to SETs. A Schmitt trigger circuit is proposed in [29] to mask SETs before they are latched at the outputs of a combinational circuit. Although filtering techniques have a lower area and power overhead, these techniques incur a performance penalty because the addition of the filtering circuitry is performed post-synthesis and is not incorporated into the design flow.
This paper describes a tunable transient filter (TTF) design for soft error rate reduction in combinational circuits. When inserted into combinational circuits, TTFs suppress propagated SETs before they can be captured in latches or flip-flops. TTFs are tunable during design, since it is possible to adjust the maximum width of the propagated SET that a TTF can suppress. Since practical TTFs are implemented using only 6-14 transistors, their area and power costs are negligible in comparison to traditional fault avoidance and tolerance techniques. TTFs also (i) do not incur any overhead for error detection, correction, and recovery and (ii) can complement SET robustness techniques based on circuit optimization and silicon-on-insulator substrates. TTFs are also advantageous over techniques that require explicit redesign of the flip-flops and latches to tolerate specific propagated SET widths (e.g., [20, 22, 26, 27] ), since TTFs can be customized to the characteristics of the propagated SET widths to be suppressed.
The performance (i.e., delay) penalty of TTF insertion is proportional to the width of the maximum propagated SET that they are designed to suppress. Judicious use of TTFs on non-critical paths, i.e., paths with slack, may reduce or nullify this penalty, making them an attractive cost-effective option to reduce the soft error rate in combinational circuits. Further, TTF insertion into combinational circuits can be combined with any of the circuit optimization techniques proposed in the literature to reduce the soft error rate (e.g., [7-10, 13, 16, 37]). This paper describes a global optimization approach based on geometric programming for robust combinational circuit design. The proposed approach combines TTF insertion with gate sizing and dual-V DD optimization, subject to performance and power constraints, to optimize combinational circuits for robustness to SETs. SPICE-based Monte Carlo simulations of 5 benchmark circuits of 50-600 gates in the 65 nm process technology indicate a 17-48× reduction in the soft error rate for an average power overhead of 30.6%. Further simulation results for 11 benchmark circuits at three performance points illustrate the tradeoffs that can be achieved. This paper is an extended version of [34].
The rest of this paper is organized as follows. Section 2 describes the design of the TTF. Section 3 presents simulation and validation results for the basic TTF design. Section 4 describes the global optimization framework that integrates TTF insertion with simultaneous dual-V DD and gate sizing. Section 5 presents simulation results. Section 6 concludes the paper.
Tunable Transient Filter Design
The design of the TTF is motivated by the filtering effect of logic gates [1]. Logic gates have a non-zero inertial delay, and they prevent input pulses narrower than the inertial delay from passing unattenuated through the gate. The TTF leverages this observation: it is designed to eliminate a propagated SET altogether, or to suppress it in magnitude and duration so that the latch or flip-flop is not affected by the propagated SET.
Let Δ SET be the maximum width of the propagated SET that the TTF is designed to suppress. By design, the TTF allows input signals of duration larger than Δ SET to pass through unattenuated. In other words, the TTF acts as a strong low-pass filter designed to block high frequency noise inputs in the form of propagated SETs. Figure 1 illustrates the design of the proposed TTF using two inverters and two structures called the filter gates. The two inverters are essential elements in the design of the TTF. The two filter gates in Fig. 1 function as the low pass filter in the TTF design. The strong filtering effect of the TTF is attributable to the use of the input signal N 0 to drive the inputs of the two filter gates. The propagation delay through the series filter gates subject to a load capacitance C L is given by
where n FG is the number of serial filter gates internal to the TTF and R eq is the effective on-resistance of a single filter gate in series. A TTF requires (2n FG + 4) transistors for implementation. By varying n FG and the transistor widths in the filter gates, both the propagation delay and Δ SET can be varied at fine granularity to design TTFs capable of suppressing propagated SETs of different widths. It is thus possible to trade off delay of the TTF for the capability to suppress larger propagated SETs.
Fig. 1 Basic TTF structure, where the filter gates FG 1 and FG 2 are driven by the input node N 0
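The relation between the number of filter gates, the transistor count, and the delay can be sketched as follows. The transistor count follows the (2n FG + 4) expression in the text. The delay function is only an assumption: it uses the standard lumped-RC step-response estimate (50% crossing at ln 2 times RC) with the n FG on-resistances adding in series, and the function and parameter names are illustrative.

```python
import math


def ttf_transistor_count(n_fg: int) -> int:
    """Transistors needed by a TTF with n_fg series filter gates (2*n_fg + 4)."""
    if n_fg < 1:
        raise ValueError("a TTF needs at least one filter gate")
    return 2 * n_fg + 4


def filter_chain_delay(n_fg: int, r_eq_ohm: float, c_load_farad: float) -> float:
    """First-order delay estimate in seconds.

    ASSUMPTION: lumped-RC step response, whose 50% crossing is ln(2)*R*C,
    with the n_fg on-resistances adding in series.
    """
    return math.log(2) * n_fg * r_eq_ohm * c_load_farad


# n_fg = 1..5 covers the 6-14 transistor range quoted for practical TTFs.
counts = [ttf_transistor_count(n) for n in range(1, 6)]
print(counts)  # [6, 8, 10, 12, 14]
```

Under this assumed model the delay grows linearly with n FG, while the transistor widths (through R eq) provide the finer tuning knob.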
TTF Design and Validation
TTFs were designed in the 65 nm process technology using the predictive technology model [6], and simulated using SPICE. The propagated SETs were modeled by trapezoidal waveforms with rise and fall times of 15 ps at the inputs to the TTF. The width of the trapezoidal waveform Δ SET is measured at 0.5V DD . In this paper, a 0→1→0 (1→0→1) input to the TTF is considered suppressed if the magnitude of the resulting output pulse is less (greater) than 0.2V DD (0.8V DD ). Similarly, a 0→1→0 (1→0→1) input to the TTF is considered preserved if the magnitude of the resulting output pulse is greater (less) than 0.8V DD (0.2V DD ). Consider a propagated 0→1→0 SET at the input N 0 of the TTF shown in Fig. 1 with Δ SET = 60 ps. The waveforms of nodes from N 0 to N 4 in the filter are plotted in Fig. 2a. It is clear from the waveform at node N 4 that the propagated SET was suppressed by the TTF. Similarly, a negative propagated 1→0→1 SET input to the TTF is suppressed as illustrated in Fig. 2b.
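The suppression and preservation criteria can be written as a small classifier. This is a minimal sketch: the 0.2V DD and 0.8V DD thresholds come from the text, while the function name, its arguments, and the "indeterminate" label for pulses that fall between the two thresholds are assumptions made for illustration.

```python
SUPPRESS_LEVEL = 0.2  # fraction of V_DD, from the text
PRESERVE_LEVEL = 0.8  # fraction of V_DD, from the text


def classify_output_pulse(extreme_v: float, vdd: float, positive_pulse: bool) -> str:
    """Classify the pulse seen at the TTF output.

    extreme_v is the pulse peak for a 0->1->0 input (positive_pulse=True)
    and the pulse trough for a 1->0->1 input (positive_pulse=False).
    """
    level = extreme_v / vdd
    if not positive_pulse:
        # Mirror a negative-going pulse so one comparison handles both polarities.
        level = 1.0 - level
    if level < SUPPRESS_LEVEL:
        return "suppressed"
    if level > PRESERVE_LEVEL:
        return "preserved"
    return "indeterminate"  # hypothetical label for the band between thresholds


print(classify_output_pulse(0.1, 1.0, positive_pulse=True))    # suppressed
print(classify_output_pulse(0.95, 1.0, positive_pulse=False))  # suppressed
```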
When an input to the TTF has a width as large as 79 ps, the input is preserved at the output with its magnitude larger than 0.8V DD as shown in Fig. 3 . The filter actually functions as a delay element, with Δ prop = 89 ps. This desirable property ensures that normal switching activity by legitimate signals of large duration is passed with minimum loss in quality, instead of being eliminated or degraded by the TTF.
In Table 1, we present the simulation results of characterization of TTFs that differ in the number of filter gates n FG used to suppress propagated SETs. For each TTF, i.e., for each n FG in row 1, the propagation delay Δ prop is reported in row 2; the maximum suppressed propagated SET width Δ SET is reported in row 3; and the peak voltage of the filtered SET is reported in row 4. It is clear from these results that the input SETs of width less than Δ SET are suppressed, since the magnitude of the filtered SET is consistently less than 0.2V DD in all cases. In row 5, we show the durations of signals that can pass through the TTF with magnitude greater than 0.8V DD . The actual magnitudes are given in row 6. It follows from the entries in rows 2 and 3 of the table that there is a linear relationship between the propagation delay of the TTF and the duration of the propagated SET that can be suppressed by the TTF. For an individual filter, it is possible to determine the maximum width of propagated SETs that can be suppressed safely, and the minimum width for signals that can be preserved. The difference between these two values is defined as the filter margin. For the TTFs in Table 1, the filter margin is 33% of the maximum suppressed propagated SET width Δ SET , indicating the clean cutoff properties of the proposed TTFs. Besides the number of filter gate stages, the transistor sizes also affect the performance of a TTF and can be used to finely tune the TTF to obtain desirable delay values and filtering effects.
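The filter margin defined above can be computed as in the following sketch. The function name and the two widths used in the example are hypothetical values chosen for illustration; they are not entries of Table 1.

```python
def filter_margin(max_suppressed_ps: float, min_preserved_ps: float) -> float:
    """Filter margin as a fraction of the maximum suppressed SET width.

    max_suppressed_ps: widest input pulse that is still suppressed.
    min_preserved_ps: narrowest input pulse that is still preserved.
    """
    if min_preserved_ps <= max_suppressed_ps:
        raise ValueError("preserved width must exceed suppressed width")
    return (min_preserved_ps - max_suppressed_ps) / max_suppressed_ps


# Hypothetical widths (not taken from Table 1): 30 ps suppressed, 40 ps preserved.
margin = filter_margin(30.0, 40.0)
print(f"{margin:.0%}")  # 33%
```

A smaller margin means a sharper cutoff between pulses that are removed and signals that pass.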
TTF Insertion for Robustness to SETs
A single TTF can be used to suppress all propagated SETs of width less than Δ SET originating in its transitive fanin cone. This paper proposes the insertion of TTFs at the primary outputs of combinational circuits to reduce the soft error rate. Since primary outputs have large transitive fanin cones in comparison to internal nodes, a TTF at a primary output can suppress more propagated SETs than a TTF inserted at an internal node. Several approaches to TTF insertion and the tradeoffs involved are discussed below.
(i) Brute-force TTF insertion followed by circuit optimization. The simplest approach to TTF insertion for SET robustness would be to add TTFs at all the primary outputs. TTF insertion at all primary outputs potentially provides full coverage and requires minimum effort in soft error modeling and analysis. The primary challenge is the design of TTFs with a Δ SET such that all SETs that propagate to the primary outputs are suppressed. A major disadvantage of this approach is that the delay of the critical path is increased by Δ prop . One solution to offset the Δ prop delay penalty would be to perform circuit optimization. However, TTF insertion at all primary outputs followed by circuit optimization may return sub-optimal designs.
(ii) Selective TTF insertion followed by circuit optimization. Brute-force insertion of TTFs at all primary outputs can be replaced by selective TTF insertion, where TTFs are inserted only at those primary outputs that have sufficient slack. For instance, in a high speed design, it is not economical to pay a delay penalty of 100 ps by inserting TTFs on the critical paths. Depending on the available slack, selective TTF insertion will have negligible delay penalty. Since TTFs are only inserted at some primary outputs, exposed gates, i.e., gates in the circuit that have propagation paths to primary outputs not protected by TTFs, must be made robust. This is done by following selective TTF insertion with circuit optimization to achieve delay and SET robustness at the exposed gates.
(iii) Simultaneous TTF insertion and circuit optimization. The width of propagated SETs in combinational logic circuits is of the order of 100 ps [3, 31]. Even if selective TTF insertion is adopted, a Δ prop of the order of 100 ps may be too high a penalty to pay for SET robustness. Although pure circuit optimization techniques may be used in such circumstances, they may result in large area and power overhead as well.
This paper proposes a middle-ground approach that combines selective insertion of TTFs with smaller Δ prop (of the order of 50 ps) with circuit optimization to get the best of both approaches. In this approach, the task of suppressing a propagated SET is shared between TTF insertion and circuit optimization based on gate sizing and dual-V DD techniques. Sizing and V DD assignments at the gates are used to partially suppress propagated SETs at the site of the strike. This partial suppression allows the use of a TTF with Δ prop of 50 ps and Δ SET of 32 ps (from Table 1), whose Δ prop penalty is acceptable for most circuits. The remainder of this section describes how two sets of SET robustness constraints can be specified at gates to realize simultaneous TTF insertion, gate sizing, and dual-V DD optimization for SET robustness in a global optimization framework based on geometric programming.
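The slack test behind selective insertion (approach (ii)) can be sketched as below. This is only an illustration of the slack criterion, not the simultaneous optimization of approach (iii). The function name, the output names, and the arrival times are hypothetical; the 50 ps Δ prop is the value quoted from Table 1.

```python
from typing import Dict, List


def select_ttf_outputs(arrival_ps: Dict[str, float],
                       t_spec_ps: float,
                       delta_prop_ps: float) -> List[str]:
    """Return the primary outputs whose slack can absorb the TTF delay."""
    selected = []
    for name, arrival in arrival_ps.items():
        slack = t_spec_ps - arrival
        if slack >= delta_prop_ps:
            selected.append(name)
    return sorted(selected)


# Hypothetical arrival times for two primary outputs, in ps.
arrivals = {"y": 55.0, "z": 100.0}

print(select_ttf_outputs(arrivals, t_spec_ps=110.0, delta_prop_ps=50.0))  # ['y']
print(select_ttf_outputs(arrivals, t_spec_ps=150.0, delta_prop_ps=50.0))  # ['y', 'z']
```

Relaxing the delay specification increases the slack on every output, so more outputs qualify for a TTF.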
Circuit Optimization Background
Geometric Programming (GP) for Minimum Power
GP-based formulations for the problem of design optimization to minimize power (both static and dynamic) using gate sizing and dual-V DD techniques, subject to performance constraints on delay T spec at the primary outputs, are well described in the literature, e.g., [4, 28]. In this paper, this is called algorithm PD for power-delay optimization. The size W i and supply voltage V DD,i of the ith gate are the design variables of algorithm PD. The GP formulation requires that dynamic power, static power, and delay be expressible as posynomial functions in the variables (W and V DD ) of the GP. We limit ourselves to symmetric gate sizing and use W i to refer to the transistor sizes. Thus, scaling a single gate through W i is equivalent to scaling all transistors (nMOS and pMOS) in the gate by the same ratio. Also, the solution to algorithm PD results in the supply V DD,i assuming continuous values over the available range. Our implementation of PD uses standard branch-and-bound techniques from the literature to solve this GP problem and obtain discrete values for V DD,i [5, 18].
Integrating SET Robustness Constraints into PD
The common metric used to evaluate the SET robustness of memories and logic gates is based on critical charge Q crit . Q crit is the minimum charge that needs to be deposited by a particle strike to produce a SET [11] . For a process technology, memory cells and gates with smaller Q crit are considered more vulnerable to SETs. The inverse-exponential relation between particle flux and energy results in an exponential dependence of SET robustness on Q crit , as described by empirically verified models in literature [14] . The common practice to make memories and logic gates robust to SETs is to raise Q crit .
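The exponential dependence on Q crit can be illustrated with a short sketch. It assumes a commonly used empirical form in which the soft error rate scales as exp(−Q crit /Q s ), where Q s is a technology-dependent charge collection parameter; this form, the function names, and the numerical value of Q s are assumptions for illustration and are not taken from this paper.

```python
import math


def relative_ser(q_crit_fc: float, q_s_fc: float) -> float:
    """Relative soft error rate under the ASSUMED model SER ~ exp(-Q_crit / Q_s)."""
    return math.exp(-q_crit_fc / q_s_fc)


def ser_reduction_factor(q_before_fc: float, q_after_fc: float, q_s_fc: float) -> float:
    """Factor by which the SER drops when Q_crit is raised from q_before to q_after."""
    return relative_ser(q_before_fc, q_s_fc) / relative_ser(q_after_fc, q_s_fc)


# Hypothetical Q_s of 5 fC: raising Q_crit from 10 fC to 20 fC.
factor = ser_reduction_factor(10.0, 20.0, q_s_fc=5.0)
print(round(factor, 2))  # 7.39
```

Under this assumed model, each additional Q s of critical charge lowers the rate by a factor of e, which is why raising Q crit is the common hardening practice.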
This paper builds on a simple, highly accurate, and comprehensive model for the Q crit of a logic gate that was described in [8] . In [8] , the authors proposed a model for Q crit that integrates factors such as W, V DD , V T , load capacitance C out , and the available noise margin ηV DD and relates them to the Q crit at the gate as
where k and β are calibration constants for the nMOS/pMOS transistor networks in a logic gate and τ α is a process-dependent parameter that models charge collection. Based upon this model, an additional constraint for SET-robustness at each gate can be incorporated into the PD optimization framework to obtain the power-delay-SET (PDS) optimization algorithm as follows. Let Q rob be the desired minimum Q crit for all (or a subset) of the gates in a design. Note that Q rob can also be the nominal or the maximum charge deposited by particle strikes for a process technology. Q rob can be determined using actual measurements with test structures or by 3-dimensional device simulations. For a given Q rob and when η is set to 0.5 for balanced noise margins, Eq. 2 can be simplified to produce constraints of the form
where k i is a constant for each type of gate (inverter, 2-input NAND, etc.), C out,i is the total capacitance (load and parasitic) at the ith gate, and n is the total number of gates in the design. Note that k i is derived from the expression for Q crit in Eq. 2 above. Note also that β 0 and β 1 are calibration constants (0.5 ≤ { β 0 , β 1 } ≤ 0.8) that further refine β and are obtained from circuit simulations for each type of gate. Algorithm PDS determines globally optimal assignments for size and supply voltage for all the gates of the design. However, the power and area overhead of the robust design may be very high in comparison to the base design, especially if the performance (i.e., delay) constraint is not demanding because the base PD-optimized design will use minimum-sized gates. Such gates will be sized significantly by algorithm PDS, resulting in large overhead. In contrast, inserting TTFs on such paths with high slack can reduce this overhead as described below.
Relaxed SET Robustness Constraints
When all primary outputs in the transitive fanout cone of gate i, denoted by tfo-cone(i), are protected by TTFs, the SET robustness constraints can be relaxed at i and the sizing and V DD assignments to gate i can be made in a less aggressive manner to attain robustness to SETs. Upon relaxation, the new SET robustness constraints for gate i are defined such that the V DD and size W limit the duration and not the peak of the propagated SET to less than a specific value determined by the Δ SET of the TTFs. Such propagated SETs of width less than or equal to Δ SET are then eliminated by the TTFs that are present on every propagation path from the gate to the outputs.
The relaxed constraints for SET robustness can be derived as follows. When a SET occurs at a gate, it follows from the principle of charge conservation that a part of the deposited charge is dissipated by the drain current and that the rest of the charge is temporarily stored in the node capacitance C out . For a given robustness charge Q rob , charge conservation yields
where W I D (t) is the drain current through the transistors dissipating the deposited charge and V is the peak of the propagated SET. Since I D (t) is a non-linear function that depends on the region of operation of the transistors, the above equation has no closed-form solution. However, a simplifying assumption can be made as follows. Let λ be the duration of the propagated SET about V DD (= V). Let ξ Q rob (ξ ≤ 1) be the fraction of Q rob that is dissipated by the saturated drain current I D,sat during time period λ. Then,
Rearranging terms, a constraint on the duration of the propagated SET of the form λ < λ * can be derived.
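The simplifying assumption can be sketched numerically. The sketch assumes that the dissipated fraction satisfies ξ Q rob = W · I D,sat · λ, where I D,sat is the saturation current of the unit transistor, so that λ = ξ Q rob /(W · I D,sat ). This relation is inferred from the wording of the assumption rather than quoted from the paper, and the function names and the values of ξ and I D,sat are hypothetical.

```python
import math


def propagated_set_duration(xi: float, q_rob_c: float,
                            width: float, i_dsat_unit_a: float) -> float:
    """Duration (s) of the propagated SET under the ASSUMED charge balance
    xi * Q_rob = W * I_D,sat * lambda."""
    return xi * q_rob_c / (width * i_dsat_unit_a)


def min_width_for_duration(xi: float, q_rob_c: float,
                           i_dsat_unit_a: float, lambda_star_s: float) -> float:
    """Gate size W (in unit-transistor multiples) at the boundary lambda = lambda*."""
    return xi * q_rob_c / (i_dsat_unit_a * lambda_star_s)


# Q_rob = 20 fC and lambda* = 32 ps come from the paper's setup;
# xi = 0.8 and I_D,sat = 100 uA per unit transistor are hypothetical.
w_min = min_width_for_duration(xi=0.8, q_rob_c=20e-15,
                               i_dsat_unit_a=100e-6, lambda_star_s=32e-12)
print(round(w_min, 3))  # 5.0
```

Any size larger than this boundary value shortens the propagated SET below λ * , which is the form of constraint the derivation is after.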
Let λ * equal Δ SET for the TTFs. Then, for given Q rob and λ * , the above equation can be calibrated for each gate in the library using SPICE simulations. Since I D,sat through the unit transistor is roughly proportional to V 2 DD , this observation can be used to simplify the above expression for calibration and to derive the following closed-form SET robustness constraints for each gate i:
The algorithm PDS returns values for the filter variable t j of each primary output j in the continuous interval [1, t MAX ]. This value is discretized to 1 or t MAX using a greedy assignment approach. The PDS algorithm with TTF insertion has multiple SET robustness constraints for each gate, and the optimization problem is set up such that the values of the filter variables determine which SET robustness constraint dominates the size and V DD assignments for each gate in the circuit.
In the first set of SET robustness constraints for a gate i, the original SET robustness constraint in Eq. 3 is scaled by the filter variable t −1 j to obtain a separate constraint for each primary output j in the transitive fanout cone of i
where tfo-cone(i) refers to all primary outputs in the transitive fanout cone of gate i and n is the total number of gates. The term t −1 j is multiplied so that these constraints dominate the size and V DD assignments when at least one primary output j in the tfo-cone(i) is not protected by a TTF, i.e., when t j = 1. Adding TTFs to all the primary outputs in tfo-cone(i) forces all t j , j ∈ tfo-cone(i), to t MAX for the constraints in Eq. 5, and hence these constraints are trivially satisfied. In this case, the second relaxed SET robustness constraint given by Eq. 4 dominates the size and V DD assignments of gate i. Note that the delay constraints in PDS must be modified to incorporate the propagation delay of the TTF given by Δ prop . This is done by adding the term (t i /t MAX )Δ prop to the arrival time of the ith primary output. Note also that the term C out,i must be replaced by a monomial term M i in Eqs. 4 and 5, as described in [8], for compatibility with the GP-based formulation in PDS.
Illustrative example
Figure 4 illustrates the advantages of TTF insertion at two performance points: (a) T spec = 85 ps and (b) T spec = 110 ps. There are two sizes associated with each gate in Fig. 4. The first gate size is for a PDS-optimized design without TTF insertion and the second gate size is for a PDS-optimized design with TTF insertion. For the high performance point of T spec = 85 ps, a TTF is inserted only on output y and not on output z. This is because the delay constraints for output y are not as tight as the delay constraints for output z. Hence, it is beneficial to pay the delay penalty for inserting a TTF at output y in return for relaxing the SET robustness constraints of gates exclusively in the fanin cone of y (G 4 , G 7 , G 8 and G 11 ). Thus, the sizes of gates G 4 , G 7 , G 8 and G 11 with TTF insertion are smaller than the sizes without TTF insertion. Note that the size of gate G 3 remains unchanged even though G 3 is in the fanin cone of output y. This is because G 3 also has a path to output z that is not protected by a TTF. Hence, the SET robustness constraints of G 3 cannot be relaxed and its size remains unchanged.
For a lower performance point of T spec = 110 ps, the delay constraints at both outputs y and z are sufficiently relaxed so that it is beneficial to insert a TTF at both outputs. Hence, the SET robustness constraints for all gates can be relaxed and the sizes of all gates are smaller as compared to the sizes without TTF insertion.
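The role of the filter variables in this example can be sketched as follows. After discretization, each t j is either 1 (no TTF) or t MAX (TTF inserted); a gate falls under the relaxed constraint only if every primary output in its transitive fanout cone has t j = t MAX . The value chosen for t MAX , the function names, and the fanout cones listed for G 3 and G 4 are assumptions consistent with the description of Fig. 4, not data from the paper.

```python
import math
from typing import Dict, List

T_MAX = 100.0  # hypothetical upper bound of the filter variable


def dominant_constraint(cone_outputs: List[str], t: Dict[str, float]) -> str:
    """'original' if any output in the cone is unprotected (t_j == 1), else 'relaxed'."""
    if any(t[j] == 1.0 for j in cone_outputs):
        return "original"
    return "relaxed"


def ttf_delay_term(t_j: float, delta_prop_ps: float, t_max: float = T_MAX) -> float:
    """Delay added to the arrival time of output j: (t_j / t_MAX) * delta_prop."""
    return (t_j / t_max) * delta_prop_ps


# High performance point of the example: TTF on y only.
t = {"y": T_MAX, "z": 1.0}
cones = {"G3": ["y", "z"], "G4": ["y"]}  # assumed cones, per the text

for gate, cone in cones.items():
    print(gate, dominant_constraint(cone, t))  # G3 original, then G4 relaxed

print(ttf_delay_term(t["y"], delta_prop_ps=50.0))  # 50.0
```

With t j = 1 the delay term shrinks to Δ prop /t MAX , so a large t MAX keeps the penalty on unprotected outputs small.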
The two performance points described above illustrate that the benefit of TTFs in reducing the area and power overhead required for achieving SET robustness depends on the performance point. The continuous trend of area and power savings obtained using TTFs over a wide range of T spec is illustrated for benchmark circuit cu in Fig. 5. Figure 5 shows the power-delay and area-delay trade-off curves for three designs: (i) PD-optimized design, i.e., without considering SET robustness, (ii) PDS-optimized design without TTF insertion, and (iii) PDS-optimized design with TTF insertion. The PDS curves are obtained when cu is optimized for SET robustness with charge Q rob of 20 fC. For a low performance point, i.e., a high value of T spec , TTFs are inserted on all primary outputs and significant area and power savings can be obtained in comparison to a PDS-optimized design without TTF insertion. As T spec decreases, the delay constraints on some primary outputs become tighter and the number of primary outputs for which TTF insertion is beneficial decreases. Thus, the area and power savings obtained using TTF insertion also decrease. For high performance points, i.e., low T spec values, the delay penalty of TTFs hinders their insertion at most primary outputs, and the area-delay and power-delay trade-off curves for PDS-optimized design without TTF insertion and PDS-optimized design with TTF insertion overlap. Note that there are dips in the area-delay trade-off curves because when V DD assignments are made to optimize power, the sizes of gates can be reduced to meet timing constraints, thus causing the area of the circuit to drop.
Simulation Results
The simulation results in this section begin with the calibration of TTFs to eliminate propagated SETs for a specific robustness charge Q rob in Section 5.1. Section 5.2 presents results for validation of the robustness models given by Eqs. 3 and 4. The results of circuit optimization using TTF insertion in combination with simultaneous gate sizing and dual-V DD optimization are described in Section 5.3.
TTF Design and Calibration
TTFs are calibrated using the simulation setup shown in Fig. 6. Input SETs of both polarities (0→1→0 and 1→0→1) are injected at the output of the first inverter in a chain of three inverters. The TTF with the single filter gate in column 2 of Table 1, with a Δ prop of 50 ps and a Δ SET of 32 ps, is then calibrated to suppress these propagated SETs at N 4 .
Model Validation
Figure 7 presents the results of simulations that were performed to validate the robustness models given by Eqs. 3 and 4. Simulations were performed on 2-input NAND gates over a range of load capacitance and two supply voltages (1.0 V and 1.2 V). In all cases, the gate size required to (i) limit the peak of the propagated SET to 0.5V DD and (ii) suppress propagated SETs to less than Δ SET of 32 ps was determined using the compact models as well as using SPICE simulations. The minimum and maximum load capacitances chosen for model validation cover fanout-of-1 to fanout-of-4 circuits with gate sizes ranging from 2 to 10 units. From the figure, it is clear that optimizing for SET width suppression for TTFs requires less overhead at both values of V DD in comparison to optimizing for SET magnitude suppression. The maximum error in size for SET robustness determined using the models was 0.5 times the size of the unit-scaled 2-input NAND gate. Similar results were observed for the other logic gates that were used for synthesis of the benchmarks.
Circuit Optimization
This section presents results for TTF insertion combined with circuit optimization based on sizing and dual-V DD techniques for SET robustness. The GP framework for circuit optimization was implemented using the MOSEK software [24]. The SPICE library for the 65 nm technology node was obtained from the Berkeley predictive technology model [6]. Eleven combinational benchmark circuits, which were purely logic or a mixture of logic and control, were chosen from the ISCAS85 and LGSynth91 suites [33]. We used τ α = 50 ps and a robustness charge Q rob of 20 fC in all our simulations. We built a technology library comprising inverters and 2-input and 3-input NAND and NOR gates of different drive strengths for initial synthesis of the benchmarks. The optimization for SET robustness was performed on these synthesized netlists.
Validation Using Monte Carlo Runs
In order to estimate the reduction in soft error rate achieved using the proposed optimization techniques, a Monte Carlo simulation framework for soft error analysis was implemented. For each circuit, the charge used to simulate particle strikes was chosen from a uniform random distribution over the interval [10, 20] fC. The site for particle strikes and the input pattern were also randomly generated. For each strike, the original PD-optimized design and the PDS-optimized design with inserted TTFs are simulated with the same input pattern and site of strike. The outputs of both circuits are observed for propagated SETs that deviate from 0 (V DD ) by ηV DD ((1 − η)V DD ) at the primary outputs for η = 0.5. Only the 5 smallest benchmark circuits were evaluated in this manner, since it takes on the order of 24 h to simulate a circuit with 100,000 patterns. Note that the contribution of the TTFs to the soft error rate of the optimized circuit is neglected since the TTFs occupy negligible area and since they are inherently robust to particle strikes. Table 2 presents soft error rate reduction results based on the observed errors at the primary outputs of the PD-optimized and PDS-optimized designs. It is clear from the table that the proposed technique provides significant reduction in the soft error rate.
Design Overhead for SET Robustness
Table 3 presents area and power overhead (in %) when the benchmarks are optimized using algorithm PDS with TTF insertion for delay constraints of T spec , 1.15 T spec , and 1.3 T spec on all outputs. In all cases, the overhead is reported with respect to the total area (power) of the design optimized using the PD algorithm. The value of T spec for each benchmark was set to Δ min + 0.1(Δ max − Δ min ), where Δ min and Δ max are the minimum and maximum delays for the design. The T spec values are then relaxed by 15% and 30% from this point to obtain results for 1.15 T spec and 1.3 T spec .
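The three performance points follow directly from the expression for T spec . The sketch below uses that expression as given in the text; the function name and the example delays are hypothetical.

```python
import math
from typing import Tuple


def performance_points(delta_min_ps: float, delta_max_ps: float) -> Tuple[float, float, float]:
    """Return (T_spec, 1.15*T_spec, 1.3*T_spec) with
    T_spec = delta_min + 0.1 * (delta_max - delta_min)."""
    t_spec = delta_min_ps + 0.1 * (delta_max_ps - delta_min_ps)
    return (t_spec, 1.15 * t_spec, 1.3 * t_spec)


# Hypothetical design with minimum delay 100 ps and maximum delay 200 ps.
points = performance_points(100.0, 200.0)
print([round(p, 1) for p in points])  # [110.0, 126.5, 143.0]
```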
The first column is the name of the circuit and the second column reports the number of gates, primary inputs, and primary outputs of the circuit. Under the next major heading, the power overhead for simultaneous dual-V DD and gate sizing for a 20 fC Q rob and delay constraints of T spec , 1.15 T spec , and 1.3 T spec are reported. Under the final major heading, the results of optimization for TTF insertion and simultaneous dual-V DD and gate sizing (number of TTFs inserted and power overhead) are reported for a 20 fC Q rob . The average area (power) overhead for SET robustness at T spec , 1.15 T spec , and 1.3 T spec was 23.5% (22.4%), 32% (30.6%), and 39.3% (39.8%), respectively. The maximum runtime for the largest benchmark c7552 with 2919 gates was approximately 200 minutes on a 2.4 GHz Opteron processor with 6 GB of memory. The large runtime can be attributed to the branch-and-bound technique to discretize V DD and the greedy assignment approach to discretize TTF variables.
Some circuits have balanced path delays on all the outputs. Thus, the greedy filter insertion algorithm either causes TTFs to be inserted on all the outputs or on none of the outputs, depending on whether a TTF is inserted on the first output processed by the algorithm or not. TTF insertion on all the outputs may drive the algorithm into infeasibility for such circuits. In such cases, the algorithm returns a circuit with no TTFs.
There are several observations that can be made from the results. First, the number of TTFs that are inserted decreases (or remains constant) from 1.3 T spec to T spec in all cases. This is because the delay constraints are tighter in faster designs, with fewer opportunities for TTF insertion. In slower designs, the delay constraints are not as tight, which creates more opportunities for TTF insertion.
Second, the search space for design with TTF insertion is a superset of the search space for design without TTF insertion. Thus, the power overhead for optimization with TTF insertion is always less than or equal to the power overhead for optimization without TTF insertion. This is seen by comparing the power overhead for the same circuit and performance point in the table.
Third, the power overhead required for SET robustness increases monotonically from high performance (delay = T spec ) to low performance (delay = 1.3 T spec ) designs. This is because when the design is optimized for T spec , a significant number of gates in the design have larger sizes and high V DD in the base case. Hence, the overhead required to satisfy SET robustness constraints is a smaller fraction of the power of the baseline design. As we relax T spec , there is a decrease in the average size of the gates and fewer gates use high V DD to meet delay constraints when PD is run. When PDS is run, it has to increase the W i and make more assignments to high-V DD to meet SET robustness requirements. Even if a TTF is added, the SET robustness constraints are only relaxed, but not completely removed. This is observed in the higher power overhead with respect to the baseline case for slow designs at 1.15T spec and 1.3 T spec .
Last, some benchmark circuits like c499 and c1355 have balanced path delays and thus TTF insertion is absent, i.e., no TTFs are inserted at all three values of T spec considered in our simulations. Further simulations indicate that TTF insertion is abrupt in these designs, with TTFs inserted at all the primary outputs for delays greater than 1.5T spec . Such designs exhibit a critical T spec below which no TTFs are inserted.
Conclusion
There is significant interest in low-cost solutions for soft error rate reduction in combinational circuits. This paper described TTFs to suppress propagated SETs in combinational circuits. In combination with circuit optimization based on gate sizing and dual-V DD techniques, TTF insertion is an attractive option to achieve significant reduction in the soft error rate at modest cost. An area of future research is to investigate selective optimization of the most vulnerable gates to further trade off overhead against SET robustness at very fine granularity.
