Abstract-High energy particles from cosmic rays or packaging materials can generate a glitch or a current transient (single event transient or SET) in a logic circuit. This SET can eventually get captured in a register, resulting in a flip of the register content, which is known as a soft error or single-event upset (SEU). A soft error is typically modeled with a probabilistic single bit-flip model. In developing such abstract fault models, an important issue to consider is the likelihood of multiple bit errors caused by particle strikes. In an earlier related work, we performed a characterization study of the impact of an SET on a logic circuit to quantify the extent to which this model is accurate. We found that a substantial fraction of SEU outcomes had multiple register flips. In this paper, we perform a deeper analysis to understand the individual contributions of a 'strike on a register' and a 'strike on a logic gate' to multiple flips. We use post-layout circuit simulations and a Monte Carlo sampling scheme to study the impact of an SET. We perform our simulations on ISCAS'85, ISCAS'89 and ITC'99 benchmarks in 180nm and 65nm technologies. We find that, amongst the erroneous outcomes, the probability of multiple bit-flips for 'gate-strike' cases was substantial and went up to 50%, whereas that for 'register-strike' cases was just about 2%. This implies that, if we were to use hardened flip-flop designs to eliminate the flips due to register strikes, a large fraction of the remaining flips, which would be due to 'gate strikes', would be multiple flips. Hardening the flip-flops is not going to help in this scenario. Thus, gate strikes are problematic and complex error correction schemes at the circuit level are needed to eliminate the multiple errors due to 'gate strikes'.
Index Terms-soft error, gate strike, multiple bit flips, fault model, logic circuits
I. INTRODUCTION

SOFT errors have a significant impact on circuit reliability at nanoscale technologies. High energy particles from either cosmic rays or packaging materials are known to be the major contributors to soft errors. When such particles strike a semiconductor substrate, they generate charge, which in turn results in a glitch or a transient current in a circuit. Such a glitch is called a single event transient (SET). The SET can occur on a register or on a logic gate and can eventually get captured in a register, altering the stored bit (0 → 1 or 1 → 0). Such an error is known as a single event upset (SEU) or soft error. Cosmic rays interact with the earth's atmosphere, creating secondary particles, mainly neutrons, protons, muons and pions, as they penetrate down to sea level. The energy or flux of these particles increases at a rate of 5x per 5000 feet, reaching a maximum intensity at about 12-15 km from sea level [1], [2]. Thus, the errors caused by these particles were of specific interest in satellite, aircraft and space applications. For example, in-flight measurements reveal an upset rate of approximately $5 \times 10^{-3}$ upsets per hour per memory device [3].
Several studies were then conducted by IBM and also summarized by the Boeing Defense and Space group, which reveal that a significant amount of neutron flux is found even at sea level [1], [4]. The average low energy neutron flux (around 10 MeV) at sea level is reported to be nearly 100000 neutrons/sq.cm per year [1]. They report that thousands of upsets happen every year at ground level, which are mainly recorded by computer systems having error detection and correction logs. For instance, the average error rate in memories observed in several computers was reported to be nearly 1.5e-12 upsets/bit-hr, and most of the upsets are reported to be due to atmospheric neutrons [4]. Thus, SEUs caused by neutrons gained significant importance. Most early studies focused on SEUs only in memories. SEUs in logic circuits were not considered because such circuits exhibit inherent masking phenomena, which prevented the SETs from getting captured in a register/flip-flop. However, as technology scales, the impact of masking phenomena tends to reduce [5], [6], [7], [8] and it is important to study SEUs in logic circuits.
At the architectural level, soft errors are commonly modeled by a probabilistic single bit-flip fault model [9], [10], [11], [12]. In developing such abstract fault models, an important issue to consider is the likelihood of multiple bit errors caused by particle strikes. This likelihood has been studied to a great extent in memories, but has not been understood to the same extent in logic circuits. The single bit-flip model has been challenged in [13], [14], [15], [16], which report that multiple bit flips do occur in logic circuits. However, the state-of-the-art fault model continues to be a single bit-flip, single cycle model for soft errors in logic circuits. Reliability estimates (such as mean time to failure, MTTF) and reliability enhancement techniques (such as error correction codes) are also based on the assumption that a single bit flip occurs due to a particle strike. In reality, if multiple errors occur, these MTTF estimates are likely to be optimistic and the error correction methods are likely to be insufficient. Most of the existing techniques that estimate the soft error rate (SER) use approximate modeling techniques to arrive at these conclusions, which reduces the accuracy of their results. Therefore, it is essential that our estimation technique be as accurate as possible, in order to increase the confidence in the conclusions on the multiple bit flip probability in logic circuits. So, we propose to perform a detailed characterization of the impact of an SET using post-layout circuit (SPICE) simulations in order to get accurate bit-flip statistics. Our goal is to evaluate the accuracy of the single bit-flip, single cycle fault model. In other words, we attempt to quantify the likelihood that an SET can cause multiple bit errors in logic circuits consisting of combinational gates and flip-flops.
We performed a circuit level study to this end earlier [17]. We established that the impact of an SET in a circuit can be understood as a two-cycle phenomenon, that is, the SEU outcomes need to be observed across two clock cycles (the cycle in which the SET was injected and the following clock cycle) in order to accurately capture the phenomenon. This leads to several possible SEU outcomes. We estimated the multiple flip probability given that there was at least one flip in that particular clock cycle. We found that a substantial fraction (24%) of such outcomes were multiple register flips. In this paper, our focus is still on evaluating the multiple flip probability. However, we revisit the previous analysis and estimate the multiple flip probability given that there is at least one erroneous outcome in either of the two clock cycles. In other words, given that an SET propagates and causes an error in a flip-flop, what is the probability that it can flip multiple flip-flops? We also delve deeper and evaluate this probability separately for strikes on logic gates and strikes on registers to understand their individual contributions. We run our experiments on the ISCAS'85, ISCAS'89 and ITC'99 benchmark circuits in 180nm and 65nm technologies. We evaluate the bit-flip statistics by comparing the SET-induced circuit simulation with a fault-free register-transfer-level (RTL) reference simulation. In our simulations, we assume that an SET affects a single transistor [18], [19].
We find that, overall, up to 8% of the erroneous outcomes result in multiple bit-flips. Although this probability is low, it can have a significant impact on error detection or correction schemes. It means that a single bit error correction scheme can go wrong in as many as 8% of the cases, and these errors will go down as silent undetected errors. The probability of multiple errors also increases as technology is scaled (based on 180nm and 65nm data). So, the traditional single bit-flip soft error model certainly needs to be reexamined to get more realistic circuit reliability estimates. A key observation is that, amongst the erroneous outcomes, the probability of multiple bit-flips for 'gate-strike' cases was substantial and went up to 50%; these errors are caused by the propagation of the SET from the logic gate to the flip-flop. On the other hand, out of the erroneous outcomes, the likelihood of multiple flips for 'register-strike' cases was just about 2%. This implies that, if we were to use hardened flip-flop designs to eliminate the flips due to register strikes, a large fraction of the remaining flips, which would be due to 'gate strikes', would be multiple flips. So, although the traditional circuit designs with hardened flip-flops will solve one problem, they will uncover a different problem. Since strikes on gates can be problematic, different types of error correction schemes such as delay-capture flip-flop designs or uneven paths with padded delays might be needed at the circuit level to eliminate such multiple errors due to 'gate strikes'.
We organize the rest of the paper as follows. In Section II, we provide a brief introduction to the two-cycle phenomenon and describe the possible SEU outcomes in a logic circuit. We describe the experimental setup in Section III. We present our simulation results on ISCAS'85, ISCAS'89 and ITC'99 benchmarks in Section IV. In Section V, we summarize our paper.
II. A BRIEF INTRODUCTION TO TWO-CYCLE PHENOMENON AND THE POSSIBLE SEU OUTCOMES
In a logic circuit, a high-energy particle can strike a logic gate or a register, resulting in an SET. We call these the 'gate-strike' and 'register-strike' respectively. We model the SET as a current injection at the drain of a transistor in a particular clock cycle 'k'. This SET can eventually propagate and get captured in a register or a flip-flop in the same clock cycle 'k' or in the subsequent clock cycle 'k+1', depending on the time instant at which the SET occurs in clock cycle 'k'. If the SET flips a register content early in the clock cycle 'k', it will have enough time to propagate and flip some register in clock cycle 'k+1'. However, if the SET flips a register later in the clock cycle 'k', the error will not have enough time to propagate and flip some other register in the subsequent clock cycle 'k+1'. Thus, the impact of an SET, in reality, needs to be viewed across two clock cycles to accurately capture the phenomenon. We call this the 'two-cycle phenomenon'. This time dependence of the strike is currently missing in the single bit-flip, single cycle fault model. This phenomenon is described in more detail in our earlier work [17]. We can thus come up with a systematic notation for the possible SEU outcomes across two clock cycles. This is shown in Figure 1. In the figure, 'N' stands for no-flip, 'F' stands for flip and '$F_m$' stands for multiple flips. Since we are interested in evaluating the accuracy of the single bit-flip model, we focus on the likelihood of multiple register flips. The ones caused directly as a result of the SET are denoted by '$NF_m$', '$F_mF$' and '$F_mN$'. The multiple flip caused by the propagation of the previous flip is denoted by '$FF_m$', which will anyway be captured by modeling the first flip using the traditional single bit-flip model. So, the multiple flips in this case are not a direct result of the SET and we do not consider this in our experiments.
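The two-cycle notation can be illustrated with a minimal sketch. The function names below are hypothetical, and the per-cycle flip counts are assumed to come from a comparison against a fault-free reference; combinations that the text does not name (for example both cycles showing multiple flips) are simply labeled and not treated as direct multiple flips here.

```python
def flip_symbol(num_flipped: int) -> str:
    """Map the number of flipped registers in one cycle to 'N', 'F' or 'Fm'."""
    if num_flipped == 0:
        return "N"
    return "F" if num_flipped == 1 else "Fm"


def classify_outcome(flips_k: int, flips_k1: int) -> str:
    """Label an outcome across cycles k and k+1, e.g. 'NFm' or 'FmN'."""
    return flip_symbol(flips_k) + flip_symbol(flips_k1)


def is_direct_multiple_flip(outcome: str) -> bool:
    """True for the outcomes the text attributes directly to the SET.

    'FFm' is excluded because its second-cycle flips follow from the
    first flip, which the single bit-flip model already covers.
    """
    return outcome in {"NFm", "FmF", "FmN"}


# Example: no flip in cycle k, three flipped registers in cycle k+1.
label = classify_outcome(0, 3)
```

Here `label` is `'NFm'`, which `is_direct_multiple_flip` counts as a multiple flip, whereas `classify_outcome(1, 2)` gives `'FFm'`, which it does not.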
III. SIMULATION SETUP
We model the SET as a current injection at the drain of a transistor [18] , [19] . We use a fixed glitch in our experiments and we assume that the probability that an SET affects a drain is proportional to its area. We use the scaling trends presented in [20] to arrive at the glitch height and width in our experiments.
We perform our simulations on the ISCAS'85, ISCAS'89 and ITC'99 benchmark circuits in 180nm and 65nm technologies. We add flip-flops to the inputs and outputs of the combinational circuits of the ISCAS'85 benchmarks. Each circuit on which we perform our experiments is described as a Verilog or VHDL netlist (RTL). We first perform a Verilog/VHDL simulation with a representative test-bench and store the input/output values of all the flip-flops (registers). The circuit is then implemented to layout using synthesis and place-and-route tools. The post-layout circuit netlist is then extracted. From this netlist, we generate 'sample circuit simulation decks' by picking a random clock cycle (t) for simulation, selecting a random drain (d) to inject the SET and selecting a random time instant (k) in the clock cycle to inject the SET. The sample circuit is simulated and the register outputs are recorded at two clock instants: 'k' and 'k+1'. The register values from the SET-injected circuit simulation are compared with the corresponding reference values from the fault-free RTL simulation. Differences between the sampled values in the circuit simulation and the reference trace are recorded as bit-flips.
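The sampling and comparison steps can be sketched as follows. This is a simplified illustration and not the scripts used in the experiments: the function names, the dictionary formats and the drain names are hypothetical, and the clock cycle and the injection instant are assumed to be drawn uniformly. The drain is drawn with probability proportional to its area, as stated above.

```python
import random


def draw_sample(drain_areas, num_cycles, clock_period, rng):
    """Draw one (d, k, t) injection sample.

    drain_areas: dict mapping a drain name to its area (hypothetical format).
    """
    drains = list(drain_areas)
    weights = [drain_areas[name] for name in drains]
    d = rng.choices(drains, weights=weights, k=1)[0]  # area-weighted drain choice
    t = rng.randrange(num_cycles)                     # clock cycle to simulate
    k = rng.uniform(0.0, clock_period)                # injection instant in the cycle
    return d, k, t


def count_flips(simulated, reference):
    """Count registers whose sampled value differs from the fault-free reference."""
    return sum(1 for reg, value in reference.items() if simulated[reg] != value)


rng = random.Random(0)
sample = draw_sample({"M1.d": 0.5, "M2.d": 1.5}, num_cycles=100,
                     clock_period=2e-9, rng=rng)
flips = count_flips({"r0": 1, "r1": 0}, {"r0": 0, "r1": 0})
```

In this toy comparison only register `r0` differs from the reference, so `flips` is 1.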
We generate several sample simulation decks with different (d,k,t) values and use the Ngspice or Hspice circuit simulators [21] to perform our simulations. The average number of sample circuit simulations run for each circuit was about 4000. We ran these simulations in parallel using GNU Parallel [22] on a high performance computing cluster at CDAC, Pune. The time taken to run this many simulations for each circuit was, on average, about 1 to 2 hours. We then classify these flips into one of the SEU outcomes shown in Figure 1. The experimental setup is shown in Figure 2. This entire process is automated using a set of Python and Perl scripts. We continue generating circuit samples until the standard error is reduced to less than 10% of the value of the estimate. The probabilities obtained from this Monte Carlo sampling experiment fall within the 95% confidence interval. A detailed description of the setup can be found in our earlier work [17].
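The stopping rule can be sketched as below. The exact standard error expression is not given in this text, so the sketch assumes the usual binomial form sqrt(p(1-p)/n) for an estimated probability p over n samples; the function names are hypothetical.

```python
import math


def standard_error(successes: int, trials: int) -> float:
    """Binomial standard error of the estimate p = successes / trials (assumed form)."""
    p = successes / trials
    return math.sqrt(p * (1.0 - p) / trials)


def enough_samples(successes: int, trials: int, rel_tol: float = 0.10) -> bool:
    """True once the standard error is below rel_tol times the estimate."""
    if trials == 0 or successes == 0:
        return False  # no usable estimate yet, keep sampling
    p = successes / trials
    return standard_error(successes, trials) < rel_tol * p


done_small = enough_samples(10, 100)     # p = 0.1, standard error 0.03
done_large = enough_samples(100, 1000)   # p = 0.1, standard error about 0.0095
```

With the same estimate of 0.1, 100 samples are not enough under this rule while 1000 samples are, which is why rarer outcomes need more simulation decks.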
IV. RESULTS
We perform our experiments on ISCAS'85 (c432 etc), ISCAS'89 (s344 etc) and ITC'99 (b01 etc) benchmark circuits in 180nm and 65nm technologies. Flip-flops (registers) are added to the inputs and outputs of the combinational circuits of the ISCAS'85 benchmarks. The clock frequency for each circuit is set to the maximum operable frequency of the post-layout netlist, which is determined by post-layout timing analysis. We calculate the probability of a 'gate-strike' and a 'register-strike' depending on whether the SET was injected on a gate or a register. Further, we classify the bit-flips into one of the SEU outcomes described in Figure 1. In this paper, we focus on the multiple bit flip probability independently due to 'gate-strike' and 'register-strike' scenarios.
A. Probability of multiple register flips given an erroneous outcome
Given that an error occurred due to an SET (that is, one of the cases NF*, F*N or FF* occurred), what is the likelihood that the first error event caused by the SET affects multiple registers? There are two possibilities:
• The first error event occurs at cycle $k+1$ and consists of multiple registers being in error, that is, $P_m = P(NF_m \mid \text{at least one flip})$.
• The first error event occurs at cycle $k$ and consists of multiple registers being in error, that is, $P(F_mN \mid \text{at least one flip})$. There is another scenario where the multiple flips lead to a flip in the next cycle: $P(F_mF \mid \text{at least one flip})$.
We do not count $P(FF_m \mid \text{at least one flip})$, because the multiple flips in cycle $k+1$ are caused by the propagation of the first flip, which will anyway be captured by modeling the first flip using the traditional single bit-flip model. So, the multiple flips in this case are not a direct result of the SET. Amongst the two cases mentioned above, we did not come across the scenario of $P(F_mN \mid \text{at least one flip})$ or $P(F_mF \mid \text{at least one flip})$ in our experiments, hence we only report the first probability: $P_m = P(NF_m \mid \text{at least one flip})$. This is calculated as follows for gate-strike and register-strike (denoted by reg-strike in the equation) cases:
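A form of this ratio consistent with the worked example that follows can be written with sample counts, where $N_s(X)$ (a notation assumed here) denotes the number of simulated samples of strike type $s$ that produced outcome $X$:

```latex
P_m = \frac{N_{\text{gate-strike}}(NF_m) + N_{\text{reg-strike}}(NF_m)}
           {N_{\text{gate-strike}}(\text{at least one flip}) + N_{\text{reg-strike}}(\text{at least one flip})}
```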
For instance, if we ran 5000 simulations, out of which there were 1000 cases in which at least one error occurred (this can include NF, FN or FF across gate and register strikes), out of which, say, 100 cases had multiple errors ($NF_m$), we calculate $P_m$ as 100/1000. This probability is shown in Table II for the benchmark circuits, across both register and gate strikes. From Table II, we see that the maximum probability of multiple flips $P_m$ that was observed in the circuits we simulated, across both register and gate strikes, was 8%. This probability is considerably higher in most of the 65nm implementations as compared to the 180nm implementations. For a particular circuit, it is likely to depend on various factors such as the structure of the circuit itself, the presence of balanced paths, logic depth, the kind of logic gates used and the input combinations. Although 8% seems low, it can still have a significant impact on error detection and correction schemes. This means that a single error correction scheme can go wrong 8% of the time and the errors can go down as silent undetected errors.
B. The contribution of gate strikes and register strikes to multiple register flips
In this section, we analyze our results further to understand the role of gate-strikes and register-strikes independently on the multiple flip probability. In other words, we address the following question. Given that an error occurred due to a strike on a logic gate, what is the probability that it was a multiple flip? This is denoted by $P_{GM}$. Similarly, we calculate the multiple flip probability $P_{RM}$ given that an error was caused by a register strike. We calculate $P_{GM}$ and $P_{RM}$ respectively as follows:
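Based on the description of the numerators and denominators in the discussion that follows, these two ratios (referred to there as equations 2.4 and 2.5) can plausibly be written with sample counts, where $N_s(X)$ is an assumed notation for the number of simulated samples of strike type $s$ with outcome $X$:

```latex
P_{GM} = \frac{N_{\text{gate-strike}}(NF_m)}{N_{\text{gate-strike}}(\text{at least one flip})},
\qquad
P_{RM} = \frac{N_{\text{reg-strike}}(NF_m)}{N_{\text{reg-strike}}(\text{at least one flip})}
```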
Here, we should note that the denominator in both fractions is the total number of cases which had erroneous outcomes (flips). For the gate-strike scenario, a major contributor to these erroneous outcomes is the 'NF' scenario, whereas for the register-strike case, errors are mainly contributed by the 'FN' and 'FF' scenarios, as already observed in [17]. This is because, when an SET occurs at a register, it is more likely to cause a flip in the same clock cycle (clock cycle 'k'). So, $P_{FN}$ and $P_{FF}$ are dominant as compared to $P_{NF}$ for register-strike scenarios. On the other hand, when an SET occurs on a logic gate in clock cycle 'k', the register flip is most likely to occur due to the propagation of the SET and hence, the flip is likely to occur in the subsequent clock cycle 'k+1'. So, $P_{FN}$ and $P_{FF}$ are extremely small as compared to $P_{NF}$ for gate-strike scenarios. This difference is seen to have a significant impact on the probabilities $P_{GM}$ and $P_{RM}$. We present these probabilities in Table II. We find that $P_{GM}$ is significantly greater than $P_{RM}$ and its value is found to be a maximum of 50%. This is mainly because of two reasons:
1) When an SET occurs on a register,
a) The probability of a flip or an error occurring is high (high probabilities of an FF or an FN) as compared to the SET occurring on a gate [17]. The flip probability for register-strike cases is found to be nearly 10x greater than that for the gate-strike cases [17]. This increases the denominator in equation 2.5 and results in a small value for $P_{RM}$.
b) The probability of an NF scenario is very low as compared to the FN and FF scenarios for a register strike, as already observed in [17]. Hence $NF_m$ scenarios are also rare, leading to a small value in the numerator in equation 2.5. This again leads to a low value of $P_{RM}$.
2) When an SET occurs on a logic gate,
a) The probability of a flip is low (low probabilities of an FF or an FN) as compared to the SET occurring on a register, as already observed in [17]. This decreases the denominator in equation 2.4.
b) The major contributor of flips for gate strikes is NF [17]. Hence $NF_m$ scenarios are also more frequent as compared to the register-strike cases, leading to a large value in the numerator in equation 2.4.
These lead to a high value of $P_{GM}$.
Overall, the summary from this section is described below:
• The probability that a flip or an error occurs is high in the register-strike cases. Robust flip-flop designs can prevent a flip due to a strike on a register and hence can reduce the overall flip probability.
• Given that there is an error, the probability of multiple errors is low (about 2%) in register-strike cases, whereas it is significant in the gate-strike cases (up to 50%). Thus, in a circuit with robust flip-flops, we can eliminate the flips due to 'register strike' cases, but out of the remaining flips, which are going to be due to 'gate strikes', multiple flips are extremely likely. This is depicted in Figure 3. Thus, gate strikes are problematic. Hardening the flip-flops is not going to help in this scenario. Alternate methods such as delay-capture flip-flop designs or uneven path delays may need to be used to prevent multiple flips due to gate strikes.
• The probability of multiple errors increases as technology is scaled, based on our 180nm and 65nm data.
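The effect of the two denominators can be illustrated with a small sketch that computes the per-strike conditional probabilities from classified samples. The function name and the sample format are hypothetical, and the counts are made-up illustrative numbers chosen only to mirror the reported extremes (about 50% and about 2%); they are not measured data.

```python
from collections import Counter


def conditional_multiple_flip_probs(samples):
    """Return {strike_type: P(NFm | at least one flip)} per strike type.

    samples: iterable of (strike_type, outcome) pairs, e.g. ('gate', 'NFm').
    Error-free samples ('NN') do not enter the denominator.
    """
    errors = Counter()
    multiples = Counter()
    for strike, outcome in samples:
        if outcome == "NN":
            continue
        errors[strike] += 1
        if outcome == "NFm":
            multiples[strike] += 1
    return {strike: multiples[strike] / errors[strike] for strike in errors}


# Illustrative counts only: few gate-strike errors, mostly in cycle k+1;
# many register-strike errors, mostly in cycle k.
samples = (
    [("gate", "NFm")] * 10 + [("gate", "NF")] * 10 + [("gate", "NN")] * 980
    + [("reg", "NFm")] * 4 + [("reg", "FN")] * 120 + [("reg", "FF")] * 76
    + [("reg", "NN")] * 800
)
probs = conditional_multiple_flip_probs(samples)
```

With these counts `probs["gate"]` is 0.5 and `probs["reg"]` is 0.02: the register-strike cases contribute many more errors overall, but only a small share of them are multiple flips.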
V. CONCLUSIONS
In this paper, we have extended our previous work, in which we performed a detailed characterization of the impact of an SET on a logic circuit. Our focus in this paper is on quantifying the extent to which the single bit-flip fault model for soft errors is accurate, by understanding the individual contributions of strikes on logic gates and strikes on registers. We find that, amongst the erroneous outcomes, the probability of multiple bit-flips for 'gate-strike' cases was substantial and went up to 50%, whereas that for 'register-strike' cases was just about 2%. This implies that, if we were to use hardened flip-flop designs to eliminate the flips due to register strikes, a large fraction of the remaining flips, which would be due to 'gate strikes', would be multiple flips. So, although the traditional circuit designs with hardened flip-flops will solve one problem, they will uncover a different problem. Since strikes on gates can be problematic, different approaches such as delay-capture flip-flop designs or uneven paths with padded delays might be needed at the circuit level to eliminate such multiple errors due to 'gate strikes'. The broad conclusion is that, in addition to re-examining the traditional single bit-flip soft error model, there is a need to focus on circuit techniques to eliminate multiple bit errors due to gate strikes.
