Abstract-We explore the use of signatures, i.e., partial truth tables generated via bit-parallel functional simulation, during soft error analysis and logic synthesis. We first present a signature-based CAD framework that incorporates tools for the logic-level Analysis of Soft Error Rate (AnSER) and for Signature-based Design for Reliability (SiDeR). We observe that the soft error rate (SER) of a logic circuit is closely related to various testability parameters, such as signal observability and probability. We show that these parameters can be computed very efficiently (in linear time) by means of signatures. Consequently, AnSER evaluates logic masking two to three orders of magnitude faster than other SER evaluators while maintaining accuracy. AnSER can also compute SER efficiently in sequential circuits by approximating steady-state probabilities and sequential signal observabilities. In the second part of this paper, we incorporate AnSER into logic synthesis design flows aimed at reliable circuit design. SiDeR identifies and exploits redundancy already present in a circuit via signature comparison to decrease SER. We show that SiDeR reduces SER by 40% with only 13% area overhead. We also describe a second signature-based synthesis strategy that employs local rewriting to simultaneously improve area and decrease SER. This technique yields 13% reduction in SER with a 2% area decrease. We show that combining the two synthesis approaches can result in further area-reliability improvements.
Signature-Based SER Analysis and Design of Logic Circuits
I. INTRODUCTION
SOFT (transient) errors are becoming an important concern in digital integrated circuits. It has long been known that many soft faults are masked and do not lead to observable circuit errors. Therefore, analyzers are needed to assess the impact of masking mechanisms on the soft error rate (SER) of a circuit. Furthermore, deliberately increasing masking is a key to low-SER designs. Hence, SER analysis can effectively guide and evaluate synthesis by accounting for relevant masking mechanisms.
In this paper, we present a methodology to guide the logic synthesis process toward greater design robustness. First, we develop an SER analyzer, Analysis of Soft Error Rate (AnSER), which estimates logic masking efficiently and accurately. When a fault occurs in a portion of the circuit that is logically unsensitized, it is said to be "logically masked." This type of masking originates during logic synthesis and remains in effect through the rest of the design flow. The difficulty in estimating logic masking is the input (state) space explosion problem when considering the different paths of sensitization invoked by different input patterns (states). We use signature-based analysis to efficiently solve this problem.
A signature is a partial truth table of a Boolean function, computed by bit-parallel functional simulation of the circuit on applying random inputs. Signatures allow us to compute probabilistic information about a circuit that is relevant to SER computation. First, we compute signatures for all nodes in a circuit. Then, we compute observability don't-care (ODC) masks from the signatures to estimate node observability and testability. This information is used, in turn, to compute the SER. Fig. 1 shows our SER-aware synthesis methodology, which exploits the intimate relations between logic masking, simulation signatures, ODCs, and the testability of stuck-at (SA) faults. We also extend our techniques to evaluate the impact of soft faults on sequential circuits. We find that signatures offer a way to overcome computational challenges associated with sequential-circuit analysis, including steady-state probability computation, reachability analysis, and Markov-chain analysis, which often tax other analysis methodologies.
0278-0070/$25.00 © 2009 IEEE
The second part of this paper focuses on the design for decreased SER. Soft-error reliability can be improved by increasing masking opportunities. In the past, researchers have resorted to massive functional redundancy in schemes such as triple modular redundancy (TMR) to improve reliability. However, these methods require a substantial increase in area and power consumption. In contrast, as we show, closely coupling synthesis and SER analysis can reduce overhead with significant improvements in reliability. We present a novel synthesis technique called Signature-based Design for Reliability (SiDeR) that uses signatures and ODCs to identify partial redundancies among critical nodes of the circuit. Then, SiDeR can protect logic in the fan-in cone of the critical node with the addition of a single gate. This paper's main contributions are as follows:
1) a fast incremental SER analyzer, AnSER, that can be used stand-alone or integrated with logic synthesis;
2) a novel synthesis technique, SiDeR, that decreases SER by exploiting logical covering relationships and ODCs.
In addition, we demonstrate the use of AnSER in a general logic synthesis flow to guide a local restructuring technique known as rewriting. This technique rewrites small windows of logic throughout the circuit in order to improve area. The use of AnSER can simultaneously improve area and SER.
This paper is organized as follows. Section II discusses previous work on SER analysis and SER-aware synthesis. Section III covers the background on bit-parallel simulation and signatures. Section IV introduces our SER analysis methodology, and Section V extends the analysis to sequential circuits. Section VI describes two strategies for synthesis to decrease SER. Section VII presents empirical results. Finally, Section VIII concludes this paper.
II. PREVIOUS WORK
Recent SER evaluators include SERA [28], FASER [27], SERD [23], and MARS-C [18] along with its sequential extension MARS-S [19]. These tools estimate the SER of a technology-mapped circuit by accounting for three masking mechanisms with varying levels of detail. The three masking mechanisms [24] are as follows: 1) logic masking (the glitch occurs in a nonsensitized portion of the circuit), 2) electrical masking (the glitch is attenuated and blocked by the electrical characteristics of CMOS gates), and 3) temporal masking (the glitch occurs in a nonlatching portion of the clock cycle). Logic masking is accounted for by explicit enumeration of the input vector (or state) space in decision diagram-based methods [18], [27] or by fault simulation on specific vectors [23], [28]. Electrical masking is assessed using SPICE-based precharacterization of the gate library. Timing masking is either approximated as a derating factor proportional to the latching time of flip-flops in the design [27] or based on timing analysis information [18]. In addition, MARS-S [19] uses Markov-chain analysis and symbolic simulation to analyze SER in sequential circuits.
While these methods offer detailed analysis of SER, they can be difficult to use during logic design for the following reasons: 1) they require complete information such as electrical characterization and timing analysis, which may be unavailable during logic design, and 2) they use unscalable methods for logic masking analysis. Some tools [13], [18], [27] use ADDs (decision diagrams with multiple real-valued terminals) to completely enumerate input patterns and calculate pattern-dependent error probabilities for logic masking analysis; this has exponential worst-case complexity. This use of ADDs in SER analysis is different from the use of BDDs in logic synthesis to represent Boolean functions. The latter is generally much more efficient. Other tools electrically simulate circuits vector-by-vector, which can slow down SER analysis and become a bottleneck in circuit optimization as well.
Several techniques are known to reduce the impact of soft errors on logic circuits. Rao et al. [22] use the algorithm from [23] to selectively resize gates and flip-flops. Low-energy particle strikes are less likely to cause a glitch in larger gates due to the increased internal capacitance of the gate. Larger gates also imply that glitches are less likely to appear at gate outputs, and those that do appear are often electrically masked.
Soft errors can also be mitigated by adding redundant logic. Classic techniques such as TMR and quadded logic [25] achieve this by systematically replicating logic. In quadded logic, each gate is replaced by a network of four gates which logically mask single faults. TMR triplicates the entire circuit and uses voters to mask faults. Mohanram and Touba [20] reduce the cost of TMR by replicating only the most susceptible gates. However, even partial replication of this kind is quite expensive.
Almukhaizim et al. [2] proposed SER reduction via guided rewiring. In the form of rewiring that they use [26], one of four design errors is introduced into the circuit at each step. These are as follows: 1) removing a target wire; 2) changing the gate driven by the target wire; 3) adding an extra input to the gate driven by the target wire; and 4) replacing the target wire with a different wire. Then, the algorithm from [26] is called to provide a list of possible single-operation corrections for the introduced errors. The correction that improves SER the most is chosen based on reevaluation of SER by the tool SERA [28]. ATPG-based rewiring [5] has been used in other contexts such as in optimizing sequential circuits [16] and timing optimization [8]. However, the work in [2] appears to be the first example of reliability-guided circuit restructuring without the explicit addition of redundancy.
Almukhaizim and Makris [3] have recently proposed improving SER through the addition of redundant wires. Redundant wires are identified by deriving relations between wires using logical implication. Others have used logic implication to add (and remove) redundancies in circuits for logic optimization [14], [16]. The approach in [3] is related to our SiDeR approach originally proposed in [12]. However, in contrast to logic implication analysis, we identify redundancy using signature comparison. Empirical results show that our methods improve SER by a wider margin.
III. SIGNATURES AND ODC MASKS
In this paper, we systematically use node signatures to compute the SER, to target error-sensitive areas of a circuit, and to identify redundant nodes for resynthesis. A circuit node g can be labeled by a signature $\mathrm{sig}(g) = F_g(X_1) F_g(X_2) \cdots F_g(X_K)$, defined as the sequence of logic values observed at g in response to a sequence of K input vectors $X_1, X_2, \ldots, X_K$. Here, $F_g(X_i) \in \{0, 1\}$ indicates the value appearing at g in response to $X_i$. The signature sig(g) thus partially specifies the Boolean function $F_g$ realized by g. Applying all possible input vectors (i.e., simulating exhaustively) generates a signature that corresponds to a full truth table.
In general, sig(g) can be seen as a kind of "supersignal" appearing on g. It is composed of individual binary signals that are defined by some current set of vectors. Like the individual signals, sig(g) can be processed by EDA tools such as simulators and synthesizers as a single entity. It can be propagated through a sequence of logic gates and combined with other signatures via Boolean operations. This processing can take advantage of bitwise operations to speed up the overall computation compared to processing the signals that compose sig(g) one at a time. Signatures with thousands of bits can be useful in pruning nonequivalent nodes during equivalence checking [21], [29]. A related speedup technique is also the basis for "parallel" fault simulation [4].
The basic algorithm for computing signatures is shown for reference in Fig. 2. Here, $\mathrm{Op}_g$ refers to the operation of gate g. This operation is applied to the signatures of the input nodes of gate g, denoted as inputsigs(g). Fig. 3 shows a five-input circuit where each of the ten nodes is labeled by an 8-b signature SIG computed with eight input vectors. These vectors are randomly generated, and conventional functional simulation propagates signatures to the internal and output nodes.
In a typical implementation such as ours, signatures are stored as logical words and manipulated with 64-b logical operations, ensuring high simulation throughput. Therefore, 64 vector simulations are conducted in parallel with each word processed. Generating K-bit signatures in an N-node circuit takes O(NK) time.
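As an illustration of the bit-parallel simulation described above, the following is a minimal sketch, not the implementation of Fig. 2. It assumes a hypothetical netlist format (a topologically ordered list of `(output, op, inputs)` tuples) and uses Python integers as bit vectors in place of 64-b machine words; the names `compute_signatures`, `OPS`, and `GATES` are ours.

```python
import random

# Two-input gate operations applied to whole signatures at once.
OPS = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
}

def compute_signatures(primary_inputs, gates, K, seed=0):
    """Return {node: K-bit signature}; gates must be in topological order."""
    rng = random.Random(seed)
    mask = (1 << K) - 1
    # Random K-bit signatures stand in for K random input vectors.
    sigs = {name: rng.getrandbits(K) for name in primary_inputs}
    for out, op, ins in gates:
        if op == "NOT":
            sigs[out] = ~sigs[ins[0]] & mask
        else:
            sigs[out] = OPS[op](sigs[ins[0]], sigs[ins[1]])
    return sigs

# Toy circuit z = AND(a, OR(a, b)); one pass simulates 64 vectors.
GATES = [("t", "OR", ("a", "b")), ("z", "AND", ("a", "t"))]
SIGS = compute_signatures(["a", "b"], GATES, K=64)
```

Each gate is visited once and each visit touches K bits, which matches the O(NK) bound stated above. In this toy circuit, z is functionally equal to a, so their signatures coincide for any choice of input vectors.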
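ODCs occur at node g for certain input vectors when the values at g do not affect the primary outputs. For example, in the circuit AND(a, OR(a, b)), the output of the OR gate is inconsequential when a = 0. Corresponding to the K-bit signature sig(g), we define ODCmask(g) as the K-bit sequence whose ith bit is 0 if input vector $X_i$ is in the don't-care set of g; otherwise, the ith bit is 1. Formally,
$$\mathrm{ODCmask}(g) = [X_1 \notin \mathrm{ODC}(F_g)]\,[X_2 \notin \mathrm{ODC}(F_g)] \cdots [X_K \notin \mathrm{ODC}(F_g)]$$
where $\mathrm{ODC}(F_g)$ denotes the don't-care set of g and each bracket evaluates to 1 if its condition holds and to 0 otherwise.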
ODCmask is computed by bitwise negating sig(g) and resimulating the circuit through the fan-out of g to check if the changes are propagated to any of the primary outputs. This algorithm is shown as compute_odc_exact in Fig. 4 and has complexity $O(N^2)$ for a circuit with N gates. Its practical implementations may truncate resimulation at gates where signatures do not change, thus achieving an additional speed-up.
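The flip-and-resimulate procedure can be sketched as follows. This is an illustrative sketch rather than the compute_odc_exact routine of Fig. 4: it assumes the same hypothetical `(output, op, inputs)` netlist format, resimulates the whole circuit instead of only the fan-out of the flipped node, and uses names of our own choosing.

```python
def simulate(gates, input_sigs, K, flip=None):
    """Simulate signatures; if `flip` names a node, its signature is negated."""
    mask = (1 << K) - 1
    sigs = dict(input_sigs)
    if flip in sigs:                      # flipped node is a primary input
        sigs[flip] ^= mask
    for out, op, ins in gates:
        vals = [sigs[i] for i in ins]
        if op == "AND":
            value = vals[0] & vals[1]
        elif op == "OR":
            value = vals[0] | vals[1]
        elif op == "NOT":
            value = ~vals[0] & mask
        else:
            raise ValueError("unsupported gate: " + op)
        if out == flip:                   # flipped node is an internal gate
            value ^= mask
        sigs[out] = value
    return sigs

def odc_mask_exact(gates, input_sigs, outputs, node, K):
    """Bit i is 1 iff flipping `node` under vector i changes some output."""
    good = simulate(gates, input_sigs, K)
    bad = simulate(gates, input_sigs, K, flip=node)
    odc = 0
    for out in outputs:
        odc |= good[out] ^ bad[out]
    return odc

# z = AND(a, OR(a, b)) simulated exhaustively with four vectors.
GATES = [("t", "OR", ("a", "b")), ("z", "AND", ("a", "t"))]
INPUTS = {"a": 0b0011, "b": 0b0101}
ODC_T = odc_mask_exact(GATES, INPUTS, ["z"], "t", 4)
```

For the example circuit given earlier, the mask of the OR gate output equals the signature of a: the OR output is observable exactly when a = 1. Calling the routine once per node gives the quadratic cost mentioned above.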
We found the heuristic algorithm from [21], which has only O(N) complexity, to be particularly convenient to use. This algorithm is also shown in Fig. 4. Here, the circuit is traversed in reverse topological order, and for each node, a local ODC mask is computed for its immediate downstream gates. The local ODC mask is derived by flipping each value in the input signatures of a gate to see if the output of the gate changes. The local ODC mask is then bitwise-ANDed with the respective global ODC mask at the output of the gate to produce the ODC mask of the gate for a particular fan-out branch. The ODC masks for all fan-out branches are then ORed to produce the final ODC mask for the node. The ORing takes into account the fact that a node is observable for an input vector if it is observable along any of its fan-out branches. Reconvergent fan-out can eventually lead to incorrect values. The masks can be corrected by performing exact simulation downstream from the converging nodes. This step is not strictly necessary for SER evaluation, as we show later.
Example 1: Fig. 3 shows a sample 8-b signature and the accompanying ODC mask for each node of a ten-node circuit. The ODC mask at c, for instance, is derived by computing ODC masks for paths through nodes f and g, respectively, and then ORing the two. The local ODC mask of c for the gate through f is 01110101. When this is ANDed with the ODC mask of f , we find the global ODC, 01110001, of c on paths through f . Similarly, the local ODC mask of c for the gate with output g is 11101100, and the global ODC mask for paths through g is 01000100. We get the ODC mask of c by ORing the ODC masks for paths through f and g, which yields 01110101.
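The local-mask, AND, and OR steps of the heuristic can be sketched as below. This is a minimal sketch with hypothetical helper names (`local_odc`, `branch_odc`); the bit patterns are those quoted in Example 1, and the per-branch mask through f is taken directly from the example because the ODC mask of f itself is not stated in the text.

```python
def local_odc(op, in_sigs, idx, K):
    """Local ODC mask of input `idx`: 1 where flipping it changes the gate output."""
    mask = (1 << K) - 1
    flipped = list(in_sigs)
    flipped[idx] ^= mask
    return (op(*in_sigs) ^ op(*flipped)) & mask

def branch_odc(local_mask, downstream_odc):
    """ODC mask along one fan-out branch: locally and globally observable."""
    return local_mask & downstream_odc

# Example 1, node c: branch through g (local mask ANDed with ODCmask(g)).
VIA_G = branch_odc(0b11101100, 0b01000100)
# Branch through f, as stated in Example 1.
VIA_F = 0b01110001
# A node is observable if it is observable along any fan-out branch.
ODC_C = VIA_F | VIA_G

# Local mask of input a of a two-input AND gate equals the signature of b.
LOCAL_A = local_odc(lambda a, b: a & b, [0b0011, 0b0101], 0, 4)
```

Here `ODC_C` evaluates to 01110101, in agreement with Example 1.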
IV. ANALYSIS OF SER
We now present the SER analyzer AnSER which was specifically designed for use in logic synthesis. In this section, we focus on combinational logic; in the next section, we cover SER in sequential circuits.
A. Fault Model for Soft Errors
Integrating SER analysis efficiently into logic synthesis requires scalability and logic-level fault models that are technology independent. Other existing tools typically use complex SPICE-based electrical characterization to model soft faults. For example, Rao et al. [23] model such faults by averaging glitch waveforms defined by Weibull probability distributions. Some existing tools only work with a single process technology and very small gate libraries [23], [27].
AnSER uses a probabilistic logic-level single-fault model to reason efficiently about the resulting errors. As clock frequency increases and threshold voltages decrease, logical masking also tends to dominate over electrical and timing masking. Hence, SER optimization need not be delayed until layout and electrical information are available. By leveraging fast bit-parallel simulation, AnSER offers linear-time SER analysis and fast incremental updates after circuit transformations.
We propose a fault model based on the standard SA fault model. For every clock cycle, we assume that each circuit node g has a temporary single stuck-at-1 (TSA-1) fault with occurrence probability $P_{\mathrm{err}1}(g)$ and a temporary single stuck-at-0 (TSA-0) fault with occurrence probability $P_{\mathrm{err}0}(g)$. While this TSA model focuses on logic masking, it can also incorporate the other masking mechanisms, if desired. For example, electrical masking can be approximated by derating $P_{\mathrm{err}0}$ and $P_{\mathrm{err}1}$ by a factor dependent on adjacent gates [23]. Miskov-Zivanov and Marculescu [18] and Zhang et al. [27] demonstrate the incorporation of timing masking by dividing error probabilities by a constant dependent on the clock period.
By using the TSA fault model, AnSER computes the SER of the entire circuit as a probability of error per cycle by considering primarily logic masking. The results can easily be converted into units of FIT or failures per $10^9$ s. If the soft error probability per cycle is p, then the FIT is simply $p \times \mathrm{freq} \times 10^9$, where freq is the clock frequency. Assuming that only one fault occurs in each cycle, $G_{\mathrm{err}0}(g)$ is the FIT rate of the gate g and is related to the probability of error $P_{\mathrm{err}0}(g)$ by a constant.
B. SER Evaluation
AnSER computes the SER by counting the number of test vectors that propagate the effects of a soft fault to the output(s). Test-vector counting was also used in [10] to compute SER, although the authors also used BDD-based techniques. Intuitively, if a fault is detected by a large number of input vectors, then it will be propagated to the outputs often. It should be noted that SER computation is inherently more difficult than test generation. Testing involves generating a vector that sensitizes the fault on a node and propagates the resulting error to the output. SER evaluation involves counting the number of vectors that detect each fault and is therefore in the #P-hard complexity class.
Next, we describe how AnSER uses signatures and ODC masks to derive several metrics that are necessary for our SER computation. These metrics are based on signal probability (controllability), observability, and testability parameters commonly used in ATPG [4]. Fig. 5 summarizes the algorithm used by AnSER for SER computation. It involves two topological traversals of the target circuit: one to propagate signatures forward and another to propagate ODC masks backward. The ratio of 0s and 1s in a node's signature is taken as a measure of signal probability, while the relative proportion of 1s in an ODC mask indicates observability. These two measures are combined to obtain a testability figure-of-merit for each node of interest, which is then multiplied by the probability of the associated TSA to obtain the SER for the node. This SER for the node captures the probability that a fault occurs and its effects are propagated to the output. Our estimate can be contrasted with technology-dependent SER estimates that include timing and electrical masking.
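We define the probability of node g having logic value 1, denoted as P[g = 1], as the fraction of 1s in the signature sig(g)
$$P[g = 1] = \frac{\mathrm{ones}(\mathrm{sig}(g))}{K} \qquad (1)$$
where ones(s) denotes the number of 1s in the bit sequence s. The corresponding 0-controllability metric is $P[g = 0] = 1 - P[g = 1]$. The observability of a node is defined as the number of 1s in its ODC mask, normalized by K
$$\mathrm{obs}(g) = \frac{\mathrm{ones}(\mathrm{ODCmask}(g))}{K} \qquad (2)$$
This observability metric is an estimate of the probability that g's value is propagated to a primary output. The 1-testability of g, denoted as $\mathrm{test}_1(g)$, is the number of bit positions where g's ODC mask and signature both are 1, again normalized by K
$$\mathrm{test}_1(g) = \frac{\mathrm{ones}(\mathrm{sig}(g) \,\&\, \mathrm{ODCmask}(g))}{K} \qquad (3)$$
Similarly, the 0-testability $\mathrm{test}_0(g)$ is the normalized number of positions where the ODC mask is 1 and the signature is 0. In other words, 0-testability is an estimate of the fraction of vectors that test for stuck-at-1 faults, since such a fault is excited only when g = 0.
Example 2: Consider again the circuit in Fig. 3. Node g has signature sig(g) = 01011011 and ODC mask ODCmask(g) = 01000100. Hence, $P[g = 1] = 5/8$, $P[g = 0] = 3/8$, $\mathrm{obs}(g) = 2/8$, $\mathrm{test}_0(g) = 1/8$, and $\mathrm{test}_1(g) = 1/8$.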
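The metrics above reduce to population counts on signatures and ODC masks. The following minimal sketch recomputes them for node g of Example 2; the function names and the dictionary keys are ours, and the counts are normalized by K on the assumption that the metrics are meant as probability estimates.

```python
from fractions import Fraction

def ones(x):
    """Number of 1 bits in a non-negative integer."""
    return bin(x).count("1")

def node_metrics(sig, odc, K):
    """Signal probability, observability, and testability from K-bit words."""
    mask = (1 << K) - 1
    return {
        "p1": Fraction(ones(sig), K),                      # P[g = 1]
        "p0": Fraction(K - ones(sig), K),                  # P[g = 0]
        "obs": Fraction(ones(odc), K),                     # observability
        "test1": Fraction(ones(sig & odc), K),             # g = 1 and observable
        "test0": Fraction(ones(~sig & mask & odc), K),     # g = 0 and observable
    }

# Node g of Example 2: sig(g) = 01011011, ODCmask(g) = 01000100.
M = node_metrics(0b01011011, 0b01000100, 8)
```

For these inputs, the sketch gives P[g = 1] = 5/8, P[g = 0] = 3/8, an observability of 2/8, and both testabilities equal to 1/8.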
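Suppose that each node g in a circuit C has fault probabilities $P_{\mathrm{err}0}(g)$ and $P_{\mathrm{err}1}(g)$ for TSA-0 and TSA-1 faults, respectively. Then, the SER of C is the sum of SER contributions at each gate g in the circuit. Here, we weight intrinsic gate fault probabilities by the testability of the gate for the particular TSA
$$P_{\mathrm{err}}(C) = \sum_{g \in C} \left[ P_{\mathrm{err}0}(g)\,\mathrm{test}_1(g) + P_{\mathrm{err}1}(g)\,\mathrm{test}_0(g) \right] \qquad (4)$$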
Example 3: The $\mathrm{test}_0$ and $\mathrm{test}_1$ measures for each gate in the circuit are shown in Fig. 3. If each gate has TSA-0 probability $P_{\mathrm{err}0} = p$ and TSA-1 probability $P_{\mathrm{err}1} = q$, then the SER is given by $P_{\mathrm{err}}(C) = 2p + (13/8)q$.
The metrics $\mathrm{test}_0$ and $\mathrm{test}_1$ implicitly incorporate fault sensitization and propagation conditions. Hence, (4) accounts for the possibility of a fault being logically masked. Note that a TSA-0 fault at g can cause an error only when g = 1; $P_{\mathrm{err}0}(g)$ is therefore tied to the 1-controllability of g and is weighted by the 1-testability. Similarly, $P_{\mathrm{err}1}(g)$ is weighted by the 0-testability.
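The summation over gates can be sketched as follows, using the weighting stated above (the TSA-0 probability with the 1-testability, and the TSA-1 probability with the 0-testability). This is an illustrative sketch only; the node tuples and probabilities are toy values of our own, not data from Fig. 3.

```python
from fractions import Fraction

def ones(x):
    return bin(x).count("1")

def circuit_ser(nodes, K):
    """Sum per-node SER contributions.

    nodes: iterable of (sig, odc, p_err0, p_err1), where p_err0 / p_err1 are
    the assumed TSA-0 / TSA-1 probabilities of the node.
    """
    mask = (1 << K) - 1
    total = Fraction(0)
    for sig, odc, p_err0, p_err1 in nodes:
        test1 = Fraction(ones(sig & odc), K)           # g = 1 and observable
        test0 = Fraction(ones(~sig & mask & odc), K)   # g = 0 and observable
        total += p_err0 * test1 + p_err1 * test0
    return total

# Two hypothetical nodes with 4-bit signatures and ODC masks.
P0, P1 = Fraction(1, 10), Fraction(1, 5)
NODES = [(0b1110, 0b1111, P0, P1), (0b1100, 0b1010, P0, P1)]
SER = circuit_ser(NODES, 4)
```

The first node contributes (1/10)(3/4) + (1/5)(1/4) = 1/8 and the second contributes 3/40, giving a total of 1/5 for this toy input.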
V. SER ANALYSIS IN SEQUENTIAL LOGIC
In this section, we extend our SER analysis to include sequential circuits, which have memory elements (D flip-flops) in addition to primary inputs and outputs. Recall that the values stored in flip-flops collectively form the state of the circuit. The combinational logic computes state information and primary outputs as a function of the current state and primary inputs. In the following, we list three factors to consider while analyzing sequential circuit SER.
1) Steady-state probability distribution: It has been shown that under normal operation, most sequential circuits exhibit convergence to particular state distributions [9]. Discovering the steady-state probability is useful for accurately computing the SER.
2) State reachability: Some states cannot be reached from a given initial state; therefore, only the reachable part of the state space should account for the SER.
3) Sequential observability: The effects of a fault can persist in the circuit state for several cycles before they reach a primary output or are masked.
The following two sections develop a simulation-based framework to address these issues. In Section V-A, we perform steady-state and reachability analysis through sequential simulation. In Section V-B, we assess sequential observability by applying techniques from Section IV-B to time-frame-expanded circuits. In addition, we explain how these relatively simple solutions handle subtle concerns in sequential circuit SER analysis.
A. Steady-State and Reachability Analysis
Usually, the primary input distribution is assumed to be uniform, or is explicitly given by the user, while the state distribution has to be derived. Cho et al. [6] and Hachtel et al. [9] show that aperiodic finite-state machines (FSMs) with strongly connected state spaces eventually reach a steady-state distribution. An FSM is periodic if its states can be visited only at regular intervals, and aperiodic otherwise. Periodic FSMs do not reach steady state. A modulo-d counter is an example of such an FSM. In [6], it is shown that most ISCAS and other benchmark circuits reach steady state because they are synchronizable; in other words, they can be taken to a reset state starting from any state, using a specific fixed-length sequence. This indicates that the circuits are aperiodic (otherwise, different length sequences would have to be used from each state) and strongly connected (otherwise, some states could not be taken to the reset state).
In order to approximate the steady-state distribution, we perform sequential simulation using signatures, starting from a reset state and simulating the circuit for n cycles. We claim that for a large enough n, the states reached are sampled from a steady-state probability distribution. Empirical results suggest that most ISCAS benchmarks reach a steady state in ten cycles or less [19] under the aforementioned operating conditions.
Our method can also handle systems that are decomposable. Such systems pass through some transient states and are then confined to a set of strongly connected closed (SCC) states. That is, states in the system are partitioned into transient states and sets of SCC states. For such systems, the steady state is heavily dependent on the initial states. We address this by implicitly performing reachability analysis starting in a reset state. Thus, each bit of the signature corresponds to a simulation that 1) starts from a reset state and propagates through the combinational logic; 2) moves to adjacent reachable states; and 3) for large enough n, reaches steady state within the partition. By using our method, simulating a circuit with g gates for n simulation cycles and K-bit signatures takes O(Kng) time. Fig. 6 summarizes our simulation algorithm for sequential circuits. Note that it does not require matrix analysis, which is often the bottleneck in other methods. Markov matrices usually encode state transition probabilities explicitly and can be prohibitively expensive due to the problem of state-space explosion [9], [19].
Fig. 7 shows a sequential simulation with 3-b signatures. The flip-flops with outputs x and y are initialized to 000 in cycle 0, labeled as $T_0$. Then, the combinational logic is simulated. In $T_1$, the inputs of x and y are transferred to the outputs, and the process continues. At the conclusion of the simulation, the values for x and y at $T_3$ are saved for sequential error analysis, which is explained in the next section.
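The multicycle simulation described above can be sketched as follows. This is a minimal sketch and not the algorithm of Fig. 6: it assumes an all-zero reset state, fresh random primary-input signatures in every cycle, and a caller-supplied function `step` (a hypothetical interface) that evaluates the combinational next-state logic on signatures.

```python
import random

def sequential_signatures(step, num_inputs, num_flops, n, K, seed=0):
    """Return the flip-flop signatures after n cycles of K parallel runs."""
    rng = random.Random(seed)
    mask = (1 << K) - 1
    state = [0] * num_flops            # every one of the K runs starts in reset
    for _ in range(n):
        inputs = [rng.getrandbits(K) for _ in range(num_inputs)]
        # Simulate the combinational logic, then clock the flip-flops.
        state = [s & mask for s in step(inputs, state, mask)]
    return state

def toy_step(inputs, state, mask):
    """Hypothetical two-flip-flop machine: x' = x OR a, y' = x AND y."""
    (a,) = inputs
    x, y = state
    return [x | a, x & y]

X_SIG, Y_SIG = sequential_signatures(toy_step, 1, 2, n=10, K=64)
```

In this toy machine, no state with y = 1 is reachable from the reset state, so the signature of y remains all zeros. The unreachable states are never sampled, which is the implicit reachability analysis referred to above. Each cycle costs one pass over the logic on K-bit words, consistent with the O(Kng) bound.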
Although we only considered aperiodic systems, we observe that for a periodic system, the SER would need to be analyzed for the maximum period D, since the state distribution oscillates over the period. If the FSM is periodic with period D, we can average the SER over D simulation cycles.
B. Error Persistence and Sequential Observability
In order to assess the impact of soft faults on sequential circuits, we consider several cycles through which faults persist using time-frame expansion. This involves making n copies of the circuit, $C_0, C_1, \ldots, C_{n-1}$, thereby converting a sequential circuit into a pseudocombinational circuit. In the expanded circuit, flip-flops are treated as buffers. The outputs for the flip-flops of the kth frame are connected to the primary inputs of frame k + 1 (as appropriate) for $0 \le k < n - 1$. Flip-flop outputs that feed into the 0th frame are treated as primary inputs, and flip-flop inputs of frame n are treated as primary outputs. Fig. 8 shows a three-time-frame circuit that corresponds to that of Fig. 7. Here, the primary inputs and outputs of each frame are marked by their frame numbers. Furthermore, new primary inputs and outputs are created corresponding to the inputs from flip-flops for frame 0 and outputs of flip-flops for frame 3. Intermediate flip-flops are represented by buffers (shaded).
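Time-frame expansion amounts to copying the combinational netlist once per frame and stitching consecutive frames together with buffers. The sketch below illustrates this under assumptions of our own: a hypothetical `(output, op, inputs)` netlist format, frames numbered 0 to n − 1, and node names suffixed with `@k` for frame k.

```python
def expand_time_frames(gates, flops, n):
    """Unroll a sequential circuit into n combinational frames.

    gates: topologically ordered list of (out, op, ins) for one frame.
    flops: dict mapping each flip-flop output to the node driving its D input.
    """
    expanded = []
    for k in range(n):
        for out, op, ins in gates:
            renamed = tuple("%s@%d" % (i, k) for i in ins)
            expanded.append(("%s@%d" % (out, k), op, renamed))
        if k + 1 < n:
            # An intermediate flip-flop becomes a buffer into the next frame.
            for q, d in flops.items():
                expanded.append(("%s@%d" % (q, k + 1), "BUF", ("%s@%d" % (d, k),)))
    return expanded

# Toy circuit: one flip-flop q whose next state is d = XOR(a, q).
FRAMES = expand_time_frames([("d", "XOR", ("a", "q"))], {"q": "d"}, 3)
```

In the result, `q@0` has no driver and so acts as a new primary input, while the D input of the last frame (`d@2`) is left as a new primary output.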
Observability is analyzed by considering all n frames together as a single combinational circuit, thus allowing the single-fault SER analysis described in the previous section to be applied to sequential circuits. Other useful information such as the average number of cycles during which faults persist can also be determined using time-frame expansion.
After the multicycle sequential simulation described in the previous section, we store the signatures of the flip-flops and use these signatures to stimulate the newly created primary inputs (corresponding to frame 0 flip-flops) in the time-frame-expanded circuit. For instance, the $x_0$ and $y_0$ inputs of the circuit in Fig. 8 are simulated with the corresponding signatures marked by $T_3$ (the final signature after multicycle simulation is finished) from Fig. 7. Randomly generated signatures are used for primary inputs not corresponding to flip-flops (such as $a_0$ and $b_0$ in Fig. 8).
After simulation, we perform ODC analysis starting from the primary outputs and flip-flop inputs of the nth frame all the way to the inputs of the 0th frame. In other words, primary outputs and any flip-flops with errors after n cycles are considered to be observable. Fig. 9 shows our algorithm for sequential SER computation. The value of n can be varied until the SER stabilizes, i.e., does not change appreciably from an n-frame analysis to an (n + 1)-frame analysis.
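The n-frame ODC analysis can lead to different gates being shown as critical for SER. For instance, the designer can deem errors that persist longer than n cycles as more critical than errors that are quickly flushed at primary outputs. In this case, the ODC analysis only considers the fan-in cones of the primary outputs of $C_n$. The SER of the circuit with respect to n cycles of sequential masking is the SER computed on the $C_0$ frame as follows:
$$P_{\mathrm{err}}(C) = \sum_{g \in C_0} \left[ P_{\mathrm{err}0}(g)\,\mathrm{test}_1(g) + P_{\mathrm{err}1}(g)\,\mathrm{test}_0(g) \right]$$
where the testabilities are computed from the signatures and ODC masks of the n-frame expanded circuit.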
The SER algorithm in Fig. 9 still runs in linear time with respect to the size of the circuit, since each simulation is linear and ODC analysis (even with an n-frame analysis) runs in linear time as well. The SER values and run times of some ISCAS-89 benchmark circuits are given later (Section VII).
VI. DESIGN FOR RELIABILITY
We now present methods for logic synthesis that leverage the fast signature-based SER analysis method embodied in AnSER to improve the resilience of a given circuit with respect to soft errors. First, we discuss an SER-aware design method, SiDeR, which involves utilizing redundancy within the circuit identified using precomputed signatures. This is a global restructuring technique in that connections can be made between nodes in any part of the circuit based on their informational redundancy. The second method locally restructures small portions of the circuit to improve area and SER. As we show in Section VII, these techniques can be combined or used individually.
A. SiDeR
Our SiDeR method is aimed at increasing logic masking at high-impact nodes by exploiting redundancy already present in the circuit. This redundancy is identified using signatures. Pairs of signatures are checked for functional relations, which when verified can be used to increase reliability through the use of a single gate. Compared to techniques such as partial TMR that replicate vulnerable signals, SiDeR incurs a smaller area overhead since it increases logic masking by adding gates one at a time.
In order to limit the area overhead, the only functional relations that we consider are covering relationships between nodes. We say that g covers f, denoted as f ⊆ g, if g is 1 for every input vector that makes f = 1 (here, we are equating nodes with the Boolean functions they realize in the usual manner). In the presence of ODCs, this relation can be generalized using bitwise operations to the following:
$$f \subseteq g \iff f \,\&\, \overline{g} \,\&\, \mathrm{ODCmask}(g) = 0$$
In other words, g covers f if and only if g is 1 or a don't-care wherever f is 1. We define node g to be an anticover of node f when
$$\overline{f} \,\&\, g \,\&\, \mathrm{ODCmask}(g) = 0$$
that is, when g is 0 or a don't-care wherever f is 0.
The covering relation can be extended naturally to signatures and bit-parallel simulation. For instance, suppose that x has signature sig(x) = 11000 and sig(y) = 11001. By definition, sig(x) ⊆ sig(y); therefore, x can be replaced by AND(x, y). In this case, all 0-to-1 flips of the third and fourth input vectors will be masked as long as they are not propagated through both x and y. If y is replaced by OR(x, y), then all 1-to-0 flips of the first two bits will be masked.
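The signature-level covering test is a single bitwise operation. The following minimal sketch checks the example above; the function name `covers` and the optional ODC argument are ours, and the ODC variant follows the reading that g need only be 1 wherever f is 1 and g is observable. A match on signatures only identifies a candidate, which must still be verified on the full functions.

```python
def covers(sig_g, sig_f, odc_g=None):
    """True if sig_g is 1 wherever sig_f is 1, ignoring don't-care bits of g."""
    uncovered = sig_f & ~sig_g          # positions where f = 1 but g = 0
    if odc_g is not None:
        uncovered &= odc_g              # keep only positions where g is observable
    return uncovered == 0

SIG_X, SIG_Y = 0b11000, 0b11001

Y_COVERS_X = covers(SIG_Y, SIG_X)       # sig(x) is contained in sig(y)
X_COVERS_Y = covers(SIG_X, SIG_Y)       # fails on the last vector
```

Because sig(x) ⊆ sig(y), the signature of AND(x, y) equals sig(x) and the signature of OR(x, y) equals sig(y), so neither replacement changes the simulated behavior. If the last vector were a don't-care for x (for instance, a hypothetical mask 11110), x would also cover y in the generalized sense.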
For a node x, we find other nodes that it covers or anticovers. Given a candidate node y covered by x, we add redundant logic by transforming node x into OR(x, y) because y ⊆ x implies that OR(x, y) = x. Similarly, if x is an anticover of y, we transform node x into AND(x, y). To generalize, we identify nodes $y_1, y_2, \ldots, y_n$ such that $x = f(x, y_1, y_2, \ldots, y_n)$, where f denotes an arbitrary Boolean function. Replacing x by f results in errors being masked for cases where x does not have a controlling value for f.
In the trivial case where x is chosen as a candidate cover for itself, the redundant logic generated by x = f(x, x) will not decrease SER. At the other extreme, if x and y have disjoint fan-in cones and x = y, then all faults that cause x to flip from 0 to 1 will be masked when x is replaced by AND(x, y). Similarly, all 1-to-0 faults will be masked by OR(x, y). In the general case of x = f(x, y), where x and y are different nodes, the impact of x and the portion of its fan-in cone that is disjoint from y will be reduced as determined by f. This occurs because sensitized paths in the fan-in cone that include x but not y will benefit from the extra logic masking generated by f(x, y).
Signatures provide an especially effective method for identifying partial redundancy in the form of covering relationships. For instance, if OR(x, y) = x, then it follows that sig(x) ≥ sig(y) lexicographically (otherwise, sig(y) has a 1 in a position where sig(x) does not). Therefore, sorting all of the signatures can narrow the search for candidate signals y. In addition, sig(x) must have at least as many ones as sig(y), so that |sig(x)| ≥ |sig(y)|, where |sig(x)| is the size of the signature, i.e., its number of ones. This means that we can narrow the candidate set further by keeping a size-sorted list of signatures and intersecting the candidates found using the two lists. The search candidates can be pruned even more by performing multiple lexicographical sorts and multiple size sorts of signatures starting from different bit positions. For instance, if we sort the signatures lexicographically from the ith bit, sig(x) must still occur before sig(y) for the same reason. As a result, signature-based redundancy identification can be an efficient alternative to logic implication analysis, which is used in [3].
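The two pruning filters described above can be sketched as follows. This is an illustrative sketch only, assuming fixed-width signatures stored as integers (so that integer order coincides with lexicographic order); the function `find_covered` and the example signatures are hypothetical, and the sketch uses a single lexicographic sort and a single size sort rather than the multiple sorts mentioned in the text.

```python
from bisect import bisect_right

def popcount(sig: int) -> int:
    """Number of ones in a signature."""
    return bin(sig).count("1")

def find_covered(sigs: dict, x: str) -> list:
    """Names of nodes y (other than x) whose signature is covered by sig(x)."""
    sx = sigs[x]
    # One list sorted lexicographically, one sorted by number of ones.
    by_value = sorted(sigs, key=lambda n: sigs[n])
    by_size = sorted(sigs, key=lambda n: popcount(sigs[n]))
    values = [sigs[n] for n in by_value]
    sizes = [popcount(sigs[n]) for n in by_size]
    # Necessary conditions: sig(y) is not larger in order and has no more ones.
    cand_value = set(by_value[:bisect_right(values, sx)])
    cand_size = set(by_size[:bisect_right(sizes, popcount(sx))])
    candidates = (cand_value & cand_size) - {x}
    # Exact bitwise containment test on the surviving candidates only.
    return sorted(y for y in candidates if sigs[y] & ~sx == 0)

sigs = {"x": 0b11011, "a": 0b11000, "b": 0b00111, "c": 0b01010, "d": 0b11111}
print(find_covered(sigs, "x"))   # ['a', 'c']
```

In this example, d is discarded by the lexicographic filter alone, while b passes both filters and is rejected only by the final containment test.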
1) Node Impact Analysis: In order to quickly identify candidates for resynthesis, we calculate a measure called impact that describes the node's influence on the overall circuit SER. Intuitively, this influence should be proportional to the probability that faults arrive at the node and the probability that those faults are observed as errors at the output. In other words, a node has high impact if many errors "flow" through it.
We propose a linear-time algorithm for computing impact, as shown in Fig. 10 . It employs a notion of the observability of one node g relative to another node f , embodied in the following definition of
As computed in Fig. 10, the impact measure is precise in cases where all gate error probabilities are equal. The algorithm works by keeping a running signature called impactsig(f) at each node f, which is an indication of the faults propagated to f through paths from its fan-out cone. In general, nodes closer to the primary outputs are more observable than those closer to the primary inputs. However, a node g in the fan-in cone F of node f may have observability greater than f due to fan-out in F. For the circuit in Fig. 3, relODCmask(g, h) = (01000100 & 01110110) = 01000100. If P_err = p, then, including faults on h itself, the impact of h is 5p/8 + 2p/8 = 7p/8. In cases where some gates have higher intrinsic error probabilities than others, an average value of P_err can be used.
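The arithmetic of this example can be reproduced directly from the two 8-bit masks. The sketch below is not the algorithm of Fig. 10; it only recomputes the numbers quoted in the text. It assumes that 01110110 is the ODCmask of h (its five ones match the 5p/8 term) and that 01000100 is the mask associated with g; the variable names are hypothetical.

```python
from fractions import Fraction

def popcount(mask: int) -> int:
    """Number of ones in a mask."""
    return bin(mask).count("1")

K = 8                       # number of simulated vectors in the example
mask_g = 0b01000100         # assumed: mask associated with node g
mask_h = 0b01110110         # assumed: ODCmask of node h

# Relative observability mask, as the bitwise AND quoted in the text.
rel_odcmask_g_h = mask_g & mask_h

# Coefficients of p: faults on h itself plus faults arriving from g.
own_term = Fraction(popcount(mask_h), K)            # 5/8
via_g_term = Fraction(popcount(rel_odcmask_g_h), K) # 2/8
impact_coefficient = own_term + via_g_term          # 7/8

print(format(rel_odcmask_g_h, "08b"))   # 01000100
print(impact_coefficient)               # 7/8
```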
This impact measure does not have to be used with SiDeR, but it can guide other techniques such as gate hardening, which rely on finding error-critical parts of a circuit [22]. We note, however, that our measure does not take into account second-order effects, i.e., changes to the signatures and ODCmasks of other nodes, in cases where the additional logic actually changes the functionality of other nodes (through the use of ODCs).
Since AnSER maintains signatures and ODC information for each node, we can quickly find covers for resynthesis. Fig. 11 shows replicated logic for node a derived by utilizing don't-care values stored with its signature. Signature-based replication must be verified since signatures do not fully capture Boolean functions. Moreover, the use of SAT-based verification allows for the use of approximate ODC computation (rather than exact) to identify candidates as well. We use a SAT solver (MiniSAT) to check equivalence by constructing miters along a cut in the fan-out cone of x between the original circuit and the new circuit with cover f (x, y) (for further details on SAT-based verification of logic optimizations, see [21] and [29] ).
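The need for verification can be illustrated on a toy example. The sketch below replaces the SAT-based miter check with exhaustive enumeration, which is feasible only for very small functions; it is a stand-in for illustration, not the MiniSAT flow used in the paper, and the functions shown are hypothetical.

```python
from itertools import product

def miter_counterexample(original, modified, n_inputs):
    """Return an input vector on which the two functions differ, or None."""
    for vec in product((0, 1), repeat=n_inputs):
        if original(*vec) != modified(*vec):
            return vec
    return None

def original(a, b, c):
    return a & b                 # node x = AND(a, b)

def with_true_cover(a, b, c):
    return (a & b) & a           # x replaced by AND(x, y) with y = a, a real cover

def with_false_cover(a, b, c):
    return (a & b) & c           # y = c may look like a cover on a few samples

# A signature built only from vectors with c = 1 would suggest that c covers x,
# but exhaustive comparison exposes a distinguishing vector.
print(miter_counterexample(original, with_true_cover, 3))    # None
print(miter_counterexample(original, with_false_cover, 3))   # (1, 1, 0)
```

A SAT solver answers the same question (is there an input on which the miter output is 1?) without enumerating the input space.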
B. Guided Local Rewriting
In this section, we demonstrate the use of AnSER to guide an external logic synthesis technique known as logic rewriting. Rewriting is a general technique that optimizes small subcircuits to obtain overall area improvements [17] . We optimize circuits simultaneously for SER and area by using AnSER to accept or reject rewrites.
The rewriting technique relies on the fact that different irredundant topologies corresponding to the same Boolean function can exhibit different SER characteristics. For instance, the circuit AND(A, AND(B, C)) is more reliable than the circuit AND(B, AND(A, C)) if P[A = 0] > P[B = 0] since A will mask more errors than B in this case. Due to the heavy dependence on signal probability, enumerating such cases is difficult. This is precisely where AnSER's speed can aid in deciding between certain optimizations for a particular subcircuit.
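The AND-ordering example can be made concrete by computing how often a fault on the inner gate reaches the output. The following sketch assumes independent inputs with hypothetical signal probabilities and considers only faults on the inner AND gate's output; the function name `inner_flip_observability` is illustrative.

```python
from itertools import product

def inner_flip_observability(p_one, outer, inner_pair):
    """Probability that flipping the inner AND output changes AND(outer, inner)."""
    names = sorted(p_one)
    total = 0.0
    for bits in product((0, 1), repeat=len(names)):
        v = dict(zip(names, bits))
        # Probability of this input vector under independent inputs.
        prob = 1.0
        for n in names:
            prob *= p_one[n] if v[n] else 1.0 - p_one[n]
        inner = v[inner_pair[0]] & v[inner_pair[1]]
        fault_free = v[outer] & inner
        faulty = v[outer] & (inner ^ 1)      # inner gate output flipped
        if fault_free != faulty:
            total += prob
    return total

# Hypothetical probabilities of each input being 1: P[A=0] = 0.8 > P[B=0] = 0.3.
p_one = {"A": 0.2, "B": 0.7, "C": 0.5}

obs_a_outer = inner_flip_observability(p_one, "A", ("B", "C"))  # AND(A, AND(B, C))
obs_b_outer = inner_flip_observability(p_one, "B", ("A", "C"))  # AND(B, AND(A, C))
print(obs_a_outer < obs_b_outer)   # True
```

The inner fault is observable exactly when the outer input is 1, so placing the input that is more often 0 at the outer gate masks more of these faults.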
The implementation of rewriting reported in [1] and [17] first derives a four-input cut for a selected node, defining a one-output subcircuit. Functionally equivalent replacement candidates are found using lookup tables. To extend the algorithms described in [1] to improve SER and area, we rewrite four-input subcircuits to improve their reliability. To ensure global SER improvement, we resimulate the circuit and update SER estimates. Computational efficiency is achieved through fast incremental updates by AnSER. Furthermore, we quickly prune candidate rewrites based on how they change the impact of the rewritten subcircuit. By extending the notion of impact to one-output subcircuits, we require only local computations. Fig. 11(a) shows two candidate rewrites. The original subcircuit with three gates can be rewritten with two gates. New nodal equivalences for the rewritten circuit can quickly be identified using structural hashing to further reduce area. In Fig. 11(a), we also observe that the contributions of the two equivalent subcircuits to the SER are different. The larger circuit, which has a redundant input a, allows for more logic masking.
C. Remarks
Without proper extensions, our resynthesis techniques may negatively affect delay and testability of a given circuit. Since SiDeR decreases the testability of nodes, more test patterns may be necessary for testing, or some nodes may become untestable. Fortunately, AnSER maintains testability and signature information for every node in the circuit. Indeed, for a node g, the test_0(g) and test_1(g) measures are an approximation of the random pattern testability of node g. Therefore, we can output both the testability and test vectors for any circuit node. In addition, if we are given a set of test vectors to preserve, we can avoid node mergers that render a signal untestable by any of the given test vectors. Since we reevaluate signatures and ODCmasks after each change in the circuit, the updated signatures and ODCmasks indicate the new test vectors for a particular node. If these vectors are not among the given test vectors, then the change can be rejected.
The precise analysis of circuit delay requires technology mapping and interconnect lengths, which are not available during technology-independent logic synthesis. If critical path information were available, the delay overhead could be decreased in SiDeR by prohibiting gate additions along those paths. For our rewriting technique, we could modify the objective function for selecting rewrites by requiring that modifications along critical paths either maintain or decrease delay.
It is also possible to annotate SiDeR transformations on the netlist, allowing a downstream physical synthesis tool to undo optimizations when accurate timing analysis can be performed. Since fewer than 10% of gates and wires are timing-critical in heavily optimized ICs, this "undo" functionality should be able to meet original timing constraints while also preserving the improvements in reliability achieved by SiDeR.
VII. EMPIRICAL VALIDATION
We now report empirical results for SER analysis using AnSER and our two SER-aware synthesis techniques. The experiments were conducted on a 2.4-GHz AMD Athlon 4000+ workstation with 2 GB of RAM. The algorithms were implemented in C++.
For validation purposes, we compare AnSER, which computes SER under the TSA fault model, with complete test-vector enumeration using the ATPG tool ATALANTA [15]. We provided ATALANTA with a list of all possible SA faults in the circuit to generate tests in "diagnostic mode," which generates all test vectors for each fault. We used an intrinsic gate fault value of G_err = 1 × 10^{-6} on all faults. Since TSA faults are SA faults that last only for one cycle, the probability of a TSA fault causing an output error is equal to the number of test vectors for the corresponding SA fault weighted by their frequency. Assuming a uniform input distribution, the fraction of vectors that detect a fault provides an exact measure of its testability. Then, we computed the SER by weighting the testability with a small gate fault probability as in (4). While the exact computation can be performed only for small benchmarks, Table I suggests that our algorithm is accurate to about 3% for 2048 simulation vectors. More test vectors can be used if desired.
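The exact reference computation can be illustrated on a circuit small enough to enumerate by hand. The sketch below is not ATALANTA and does not reproduce (4); it assumes a hypothetical three-input circuit, a uniform input distribution, and an illustrative gate fault probability, and it simply counts the vectors that detect each stuck-at fault on one internal node.

```python
from itertools import product

def circuit(a, b, c, stuck=None):
    """Toy circuit OR(AND(a, b), c); `stuck` forces the AND output to 0 or 1."""
    n = a & b
    if stuck is not None:
        n = stuck
    return n | c

def detecting_fraction(stuck):
    """Fraction of all input vectors on which the stuck-at fault is observed."""
    vectors = list(product((0, 1), repeat=3))
    detected = sum(circuit(*v) != circuit(*v, stuck=stuck) for v in vectors)
    return detected / len(vectors)

frac_sa0 = detecting_fraction(0)   # observed only for a = b = 1, c = 0
frac_sa1 = detecting_fraction(1)   # observed when AND(a, b) = 0 and c = 0

G_ERR = 1e-6                       # illustrative gate fault probability (assumption)
# Per-fault output error probability under a uniform input distribution.
err_sa0 = G_ERR * frac_sa0
err_sa1 = G_ERR * frac_sa1

print(frac_sa0, frac_sa1)          # 0.125 0.375
```

A signature-based estimate replaces the full enumeration with a sample of input vectors, which is the source of the sampling inaccuracy discussed next.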
We isolate the effects of the two possible sources of inaccuracy: 1) sampling inaccuracy and 2) inaccuracy due to approximate ODC computation. Sampling inaccuracy is due to the incomplete enumeration of the input space. Approximate ODCs computed using the algorithm from [21] incur inaccuracy due to mutual masking: when a fault is propagated through two reconvergent paths, its effects may cancel each other out. However, the results in Table I indicate that most of the inaccuracy is due to sampling and not due to approximate ODCs. The last two columns of Table I, corresponding to exact ODC computation, show an average of 2.65% error. Therefore, only 0.41% of the error is due to the approximate ODC computation. On the other hand, while enumerating the entire input space is intractable, our use of bit-parallel computation enables significantly more vectors to be sampled than other techniques [2], [23], [28] given the same amount of time.
To obtain accurate gate characterization information for the experiments, we adapted data from [23], where several gate types are analyzed in a 130-nm technology with V_DD = 1.2 V via SPICE simulations. We use an average SER value of 8 × 10^{-5} for all gates. However, the SER analyzers from [23], [27], and [28] all report error rates that differ by orders of magnitude. SERA tends to report error rates on the order of 10^{-3} for 180-nm technology nodes, and FASER reports error rates on the order of 10^{-5} for 100 nm. Furthermore, although our focus is logic masking, we approximate electrical masking by scaling our fault probabilities at nodes by a small derating factor to obtain trends similar to those in [23]. In Fig. 12, we compare AnSER and SERD when computing SER for inverter chains of varying lengths. Since there is only one path in this circuit, and it is always sensitized, it helps us estimate the derating factor.
Table II compares AnSER with the prior art on ISCAS-85 benchmarks, using similar or identical host CPUs. While the runtimes in [7] include 50 runs, the runtimes in [23] are reported per input vector. Thus, we multiply data from [23] by the number of vectors (2048) used there. Our runtimes appear better by several orders of magnitude. We believe that our faster runtimes are in large part due to the use of bit-parallel functional simulation to determine logic masking, which has a strong input-vector dependence. Most other works use fault simulation or symbolic methods.
Table III shows SER and runtime results for IWLS benchmarks that were evaluated when we implemented AnSER within the OAGear package [11]. Note that our algorithm scales linearly in the size of the circuit, unlike the majority of prior algorithms. We assume a uniform input distribution in these experiments, although AnSER is not limited to any particular input distribution.
An input distribution supplied by a user, a sequential gate-level simulator, or a Verilog simulator can be used directly, even if it includes repeated vectors. In the latter case, all calculations based on signatures will remain correct. SER and runtime results with exact and approximate ODCs are shown on the ISCAS-85 benchmarks in Table IV. Again, the results show that approximate ODCs are sufficient for most benchmark circuits since the loss of accuracy due to ODC approximation is negligible.
Table V compares the multicycle simulation runtimes of AnSER with those of MARS-S, the sequential circuit SER analyzer from [19]. MARS-S employs symbolic simulations using a BDD-/ADD-based framework to compute steady-state probability distributions, while we use signature-based bit-parallel functional simulations. The number of cycles needed to [10]. In other words, flip-flops propagate few errors to the outputs in later cycles due to sequential circuit masking. The latched errors tend to quickly dissipate with the number of cycles, leaving the SER for multiple-cycle analysis close to the error rate of the current cycle's primary outputs.
Table VII shows improvements in SER and area overhead obtained by SiDeR. The first set of results is for exact covers, i.e., covers that do not consider ODCs. The second set uses ODCs to increase the number of candidates as well as to maintain testability. Since the use of ODCs results in candidate-target pairs that are not identical, faults at the output of either gate can still be propagated in most cases. In both cases, AND/OR gates are used according to the covering relationship. For exact covers, we see an average of 29.1% SER improvement with only 5% area overhead. The improvements for the ODC covers are 39.8% with area overhead of 13.1%.
Table VIII illustrates the use of AnSER to guide the local rewriting implementation in the ABC logic-synthesis package [1]. AnSER calculates the global SER impact of each local change to decide whether to accept this change. After checking hundreds of circuit rewriting possibilities, those that improve SER and have limited area overhead are retained. The data indicate that, on average, SER decreases by 10.7%, while area decreases by 2.3%. For instance, in the case of alu4, a circuit with 740 gates, we achieve 29% lower SER while reducing the area by 0.5%.
Although area optimization is often thought to hurt reliability, these results show that carefully guided logic transformations can eliminate this problem.
Table IX shows the results of combining SiDeR and local rewriting. In this experiment, we first used SiDeR followed by two passes of rewriting (in area-unconstrained and area-constrained modes) to improve both area and SER. This particular combination of the two techniques yields 68% improvement in SER with 26% area overhead. The improvements seen are not necessarily additive, as one optimization may change the starting point and available options for the other. Furthermore, different interleavings of the two optimizations can provide different results.
ATPG-based rewiring. SEROnly refers to optimizing only the SER, while JointOpt refers to joint optimization for SER, area, power, and delay. The next three techniques are from [3]. Here, logic implication analysis is used to identify redundant wires. IndirImply, BackJustify, and DirImply are techniques that incorporate indirect implications, backward justification, and direct implications, respectively. The next two techniques from [20] are variants of partial fault masking. Nodes identified as susceptible are triplicated in the PartialMask technique, while these nodes are only duplicated in DomValue. The final four techniques are ours: the basic SiDeR technique, SiDeR augmented with ODCs, guided local rewriting, and combined SiDeR/rewriting.
Generally, these results show that guided logic optimization techniques such as those in [2] and [3] incur less overhead than explicit replication [20]. Furthermore, our results indicate that SiDeR is able to identify more inherent redundancy than implication-based analysis [3]. A possible explanation for this is that implication analysis is restricted to certain types of implications and certain parts of the circuit. Furthermore, it is difficult to incorporate ODCs and SDCs (satisfiability don't cares) in implication analysis. Our signature-based analysis, on the other hand, implicitly incorporates SDCs, and we explicitly compute ODCmasks to incorporate don't cares in our synthesis techniques.
VIII. CONCLUSION
We have presented an SER-aware design framework which includes a logic-level fault model, SER evaluation algorithms, and logic synthesis techniques. AnSER, our technology-independent SER analyzer, accurately evaluates the logic masking in combinational and sequential logic circuits. It achieves very high speed (2-3 orders of magnitude faster than previous methods) through the efficient use of node signatures and ODC masks. We also proposed a new SER-aware resynthesis strategy, SiDeR, which manipulates node signatures to find redundancies within the circuit. Then, with the addition of a few fault-masking gates, SiDeR protects the fan-in cones of vulnerable nodes. On average, SiDeR decreases SER by 40% with only 13% area overhead. Finally, we successfully applied AnSER to local rewriting and showed that this approach simultaneously improves area and SER. Combining the two techniques gives a 68% SER improvement.
