ABSTRACT Unlike conventional CMOS circuits, reversible circuits do not have latent faults, so faults occurring in internal circuit nodes always result in an error at the output. This feature is well suited to online or concurrent fault tolerance, and it is the main motivation of this paper, which aims at highly efficient fault-tolerant CMOS logic circuits. For this purpose, we first implement fault-tolerant reversible circuits. We develop two techniques to make a reversible circuit fault-tolerant by using multiple-control Toffoli gates. The first technique is based on single parity preserving and offers error detection for an odd number of errors at the output. The second technique is constructed on Hamming codes, which results in circuits either detecting any number of errors unless the number of errors at the output is a multiple of d, or correcting (d − 1)/2 bit errors, where d is the minimum Hamming distance between any pair of bit patterns. We select d = 3 in this paper. We also claim that 100% error detection is possible with conservative reversible gates, such as a Fredkin gate. For this purpose, we develop a greedy synthesis algorithm that implements an arbitrary reversible function with multiple-control Fredkin gates. As the next step, we realize the proposed reversible circuits with conventional CMOS gates, which supports the practical use of the proposed techniques. The effectiveness of our techniques is demonstrated on benchmark circuits, implemented by both reversible and CMOS gates, in terms of fault tolerance performance and area cost. Comparisons with the related studies in the literature as well as with dual-modular redundancy and triple-modular redundancy-based circuits clearly favor the proposed designs.
I. INTRODUCTION
In the literature, research on reversible computing has been mainly motivated by its low power capability, which even allows zero power dissipation in theory [1], [2], and by its direct relation with quantum computing, which is constructed on unitary matrix based reversible operations [3]. Our motivation is different. We exploit the fault tolerance capability of reversible computing to detect faults concurrently. Reversible gates do not have a "don't care" condition, and correspondingly any switching fault in a circuit node causing a 0→1 or 1→0 transition should change the output logic values. Therefore, reversible circuits do not have latent switching faults, defined as faults that do not cause an error at the output for the current operation but might be destructive for subsequent operations. This inference is based on two properties: 1) a reversible circuit should satisfy one-to-one matching between its input and output assignments, and 2) a subcircuit of a reversible circuit is also reversible.
Unlike reversible circuits, conventional CMOS logic circuits do have "don't care" conditions that result in latent faults. Consider a NAND gate having two inputs and an output. Suppose that a switching fault occurs in one of its inputs. Considering all four input assignments, the output changes in only 50% of the cases. It gets even worse for a three-input NAND gate; here, only 25% of the switching faults cause a change at the output. These low detection rates caused by latent faults are problematic, especially in online or concurrent fault tolerance for IoT and real-time applications as well as for reliability-critical aerospace and military applications; any problem should be solved immediately, without necessarily waiting for an error to occur at the output. Such latent faults can also ruin the fault tolerance scheme in use [4]. For example, consider a system using dual modular redundancy (DMR) or triple modular redundancy (TMR). A permanent latent fault in one of the replicas would disrupt the detecting/correcting mechanism of DMR/TMR. To deal with this problem, N-modular redundancy (NMR) and similar techniques can be used [5]-[7]. However, these techniques do not fully solve the problem due to the existence of latent faults. Additionally, the area cost increases significantly. We show that our reversible circuit based solutions with CMOS implementations are more efficient in terms of both area and fault tolerance performance.
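The 50% and 25% figures for the NAND gate can be checked by exhaustive enumeration. The following is a minimal sketch; the function names `nand` and `observable_fraction` are illustrative and not taken from the paper.

```python
from itertools import product


def nand(bits):
    """n-input NAND: the output is 0 only when every input is 1."""
    return 0 if all(bits) else 1


def observable_fraction(n_inputs, faulty_input=0):
    """Fraction of input assignments for which flipping one input changes the output."""
    assignments = list(product((0, 1), repeat=n_inputs))
    observed = 0
    for bits in assignments:
        flipped = list(bits)
        flipped[faulty_input] ^= 1  # inject the switching fault on one input
        if nand(bits) != nand(flipped):
            observed += 1
    return observed / len(assignments)


print(observable_fraction(2))  # 0.5
print(observable_fraction(3))  # 0.25
```

The fault is visible only when all other inputs are 1, which is why the observable fraction halves with every added input.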
Investigating the literature further, we see many works focusing on concurrent fault detection by means of various coding schemes such as Berger codes [8], weight-based codes [9], and Bose-Lin codes [10]. Almost perfect fault detection (99.5% fault coverage) is even achieved in [9]. However, this fault coverage is only for faults observable at the output, so latent faults are neglected. A similar treatment is used in [11]. Also, there are some works partially detecting and masking faults for the most susceptible nodes in the logic network [12], [13]. Although these approaches are area-efficient, they offer poor fault coverage rates.
In contrast to the mentioned works, our approach does consider latent faults by utilizing the reversible bijective feature, which allows faults occurring in any intermediate node to be reflected at the output. We first aim at achieving fault-tolerant reversible circuit implementations, and then at replacing reversible gates with their proposed CMOS counterparts. Note that since CMOS gates are not reversible, our final CMOS circuits are not reversible either.
We develop two techniques to make a reversible circuit fault-tolerant using multiple-control Toffoli gates. The first technique is for error detection and is based on single parity preserving. The idea is to preserve the input parity at the output, so any odd number of errors at the output can be detected by comparing input and output parities. This approach can be implemented in two possible ways. In the first way, the desired circuit can be synthesized by solely using custom parity-preserving building blocks that guarantee global parity preservation. For this purpose, different gates, such as Khan and Islam gates, are proposed [14]-[16]. Although these gates work properly in theory under the assumption that they are internally fault-free, this is not the case in practice. Indeed, these gates are generally too complex and large to be treated as simple fault-free gates. Therefore, any fault occurring internally in gate nodes, rather than in interconnections between gates, can violate their fault-tolerant property. The second way of implementing parity-preserving circuits is to add an extra input and an output [17]. Although this approach offers better fault tolerance compared to the first one, its implementation with reversible gates is not given in the referred study and might cause extremely large area overheads. In this regard, following the second way, we introduce a synthesis method that results in a fault-tolerant reversible circuit with doubled area in terms of reversible and quantum cost.
Our second technique can be used either for error detection or for error correction. We exploit Hamming codes to achieve detection of any number of errors at the output unless the number of errors is a multiple of d, or correction of (d − 1)/2 bit errors, where d is the minimum Hamming distance between any pair of bit patterns. We select d = 3 in this study. The idea of using Hamming codes in fault tolerance of reversible circuits was previously introduced in [18] and [19]. However, these studies focus on a constrained set of circuits for encoding and decoding purposes rather than presenting a generic method for converting any reversible circuit to a fault-tolerant one. In this paper, we present such a generic method.
We also claim that 100% error detection is possible with conservative reversible gates such as a Fredkin gate. For this purpose, we develop a greedy synthesis algorithm that implements an arbitrary reversible function with multiple-control Fredkin gates. Our algorithm first converts the truth table of a given function into a conservative form by adding 0 and 1 valued inputs and their corresponding outputs. Then row by row synthesis is performed with Fredkin gates. To our knowledge, there is no algorithm in the literature to synthesize any given reversible function with Fredkin gates. However, we realize that if we apply our initial conversion technique to a given function as a pre-processing step, then the algorithm given in [20] could perform a synthesis with only Fredkin gates. Nevertheless, the resulting area costs are generally much worse than ours.
Apart from all of the mentioned studies, we utilize the proposed reversible circuits with conventional CMOS gates including NOT, NAND, and XOR gates to show the circuits' potential for practical use. The effectiveness of our techniques is demonstrated on benchmark circuits, implemented by both reversible and CMOS gates, in terms of fault tolerance performances and area costs. Comparisons with the related studies in the literature as well as with DMR and TMR based circuits clearly favor the proposed designs.
The rest of the paper is organized as follows. In Section II, we discuss the basics of reversible logic and the reversible cost measures used in this paper. In Section III, we develop two techniques to make a reversible circuit fault-tolerant by using single parity preserving and Hamming code based approaches. Section IV presents our synthesis technique for 100% error detection using Fredkin gates. In Section V, we show how to realize the proposed reversible circuits with CMOS logic gates. In Section VI, we give experimental results to evaluate the proposed circuits. Finally, Section VII concludes this work with future directions.
II. PRELIMINARIES
While a conventional Boolean function always carries one bit of information (0 or 1), independent of the number of input bits, a reversible Boolean function carries information using the same number of input and output bits. For reversible functions, each input bit combination results in a unique output bit combination; the reverse is also true because of the reversibility. This means that the input values can be deduced by looking at the output values of the reversible function. A bijection in mathematics is a good analogy for understanding reversibility. In a bijection, the input and output sets have the same number of elements and each element has exactly one counterpart in the other set.
A. BASICS OF REVERSIBLE CIRCUITS
A reversible function can be realized by a reversible circuit consisting of reversible gates. In this study we use three types of gates: MCT (Multiple Control Toffoli), MPMCT (Mixed Polarity Multiple Control Toffoli), and MPMCF (Mixed Polarity Multiple Control Fredkin). Definitions of the gates are as follows, with corresponding symbols given in Figure 1 and Figure 2, where the symbols •, ◦, ⊕, and × denote positive control, negative control, Toffoli target lines, and Fredkin target lines, respectively.
• NOT: a 1-bit gate performing NOT operation.
• CNOT: a 2-bit gate performing 1 bit NOT operation on its target bit iff its control bit is 1.
• Toffoli: a 3-bit gate performing 1 bit NOT operation on its target bit iff its control bits are both 1.
• Multiple Control Toffoli: an n-bit gate, n = 1, 2, 3, 4, . . ., performing 1 bit NOT operation on its target bit iff all of its control bits are 1.
• Mixed Polarity Multiple Control Toffoli: an n-bit gate, n = 1, 2, 3, 4, . . ., performing 1 bit NOT operation on its target bit iff all of its positive control bits are 1 and all of its negative control bits are 0.
• Fredkin: a 3-bit gate performing swap operation on its target bits iff its control bit is 1.
• Multiple Control Fredkin: an n-bit gate, n = 1, 2, 3, 4, . . ., performing swap operation on its target bits iff all of its control bits are 1.
• Mixed Polarity Multiple Control Fredkin: an n-bit gate, n = 1, 2, 3, 4, . . ., performing swap operation on its target bits iff all of its positive control bits are 1 and all of its negative control bits are 0.
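The gate definitions above can be expressed as functions on bit tuples. This is a minimal sketch; the names `controls_active`, `mpmct`, and `mpmcf` and the tuple-based state representation are illustrative choices, not notation from the paper. NOT, CNOT, Toffoli, and MCT are the special cases of `mpmct` with no negative controls.

```python
def controls_active(state, pos, neg):
    """True iff all positive controls are 1 and all negative controls are 0."""
    return all(state[i] == 1 for i in pos) and all(state[i] == 0 for i in neg)


def mpmct(state, target, pos=(), neg=()):
    """Mixed-polarity multiple-control Toffoli: invert `target` when the controls are active."""
    out = list(state)
    if controls_active(state, pos, neg):
        out[target] ^= 1
    return tuple(out)


def mpmcf(state, targets, pos=(), neg=()):
    """Mixed-polarity multiple-control Fredkin: swap the two targets when the controls are active."""
    a, b = targets
    out = list(state)
    if controls_active(state, pos, neg):
        out[a], out[b] = out[b], out[a]
    return tuple(out)


print(mpmct((0,), target=0))                       # NOT     -> (1,)
print(mpmct((1, 0), target=1, pos=(0,)))           # CNOT    -> (1, 1)
print(mpmct((1, 1, 0), target=2, pos=(0, 1)))      # Toffoli -> (1, 1, 1)
print(mpmct((0, 0), target=1, neg=(0,)))           # negative control -> (0, 1)
print(mpmcf((1, 0, 1), targets=(1, 2), pos=(0,)))  # Fredkin -> (1, 1, 0)
```

Each gate is its own inverse: applying it twice to any state returns that state, since neither gate modifies its own control lines.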
B. AREA COSTS OF REVERSIBLE CIRCUITS
For quantum cost, we use a measure given in [21] and [22] because it is the most commonly used and accepted one compared to other measures in the literature [23], [24]. Table 1 summarizes the quantum costs used in this study. One can also consider reversible cost, or simply gate count [25]. However, this metric does not consider the complexity of a gate, including its bit size.
C. FAULT TOLERANCE IN REVERSIBLE CIRCUITS
The following lemma demonstrates why reversible circuits do not have latent switching faults. Such faults do not immediately cause an error at the output for the current operation, but they might be destructive for next operations [26] .
Lemma 1: A switching fault (0→1 or 1→0 transition) in a node of a reversible circuit always results in a change/transition at the output value.
Proof: The proof is by contradiction. Suppose that a transition in a node does not cause any change at the output. Since a subcircuit of a reversible circuit is also reversible, the node can be considered as an input node of a reversible circuit. We also know that a reversible circuit has a one-to-one matching between its inputs and outputs, so a change in an input should change the output. As a result, there is a contradiction.
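Lemma 1 can be checked exhaustively on a small circuit. The sketch below assumes a hypothetical three-line MPMCT circuit (`CIRCUIT` is not taken from the paper) and a fault model in which a single bit flips on one line between two gates; faults inside a gate are not modeled.

```python
from itertools import product


def apply_gate(state, gate):
    """Apply one MPMCT gate given as (positive controls, negative controls, target)."""
    pos, neg, target = gate
    out = list(state)
    if all(out[i] == 1 for i in pos) and all(out[i] == 0 for i in neg):
        out[target] ^= 1
    return tuple(out)


def run(state, gates):
    for gate in gates:
        state = apply_gate(state, gate)
    return state


def count_latent_faults(gates, n_lines):
    """Count (input, node) pairs where a bit flip leaves the circuit output unchanged."""
    latent = 0
    for inputs in product((0, 1), repeat=n_lines):
        for k in range(len(gates) + 1):          # fault position: before gate k
            node_state = run(inputs, gates[:k])
            golden = run(node_state, gates[k:])
            for line in range(n_lines):
                faulty = list(node_state)
                faulty[line] ^= 1                # switching fault on one node
                if run(tuple(faulty), gates[k:]) == golden:
                    latent += 1
    return latent


# Hypothetical example circuit: Toffoli, CNOT, and a mixed-polarity Toffoli.
CIRCUIT = [((0, 1), (), 2), ((0,), (), 1), ((1,), (0,), 2)]
print(count_latent_faults(CIRCUIT, 3))  # 0
```

The count is zero because the gates after the fault position form a bijection, so two different node states can never produce the same output.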
Consider a reversible circuit with inputs I_1, I_2, ..., I_n and outputs O_1, O_2, ..., O_n. The circuit is parity preservative iff I_1 ⊕ I_2 ⊕ ... ⊕ I_n = O_1 ⊕ O_2 ⊕ ... ⊕ O_n.
Proof: A transition does not scatter to multiple transitions at the output. Therefore, by XORing the outputs, one can always detect the fault.
We select the Fredkin gate since it preserves not only the XOR of the inputs but also the arithmetic sum of the input values. Therefore, along with XORing the outputs, one can also detect faults by counting the 1 or 0 valued outputs. Another reason for selecting the Fredkin gate is its simple, synthesis-friendly structure.
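The conservative property can be illustrated with a small check. The sketch below assumes a hypothetical four-line circuit of MPMCF gates (`FREDKIN_CIRCUIT` is not from the paper) and single bit flips on lines between gates; it counts faults that leave the number of 1s at the output equal to the number of 1s at the input.

```python
from itertools import product


def apply_fredkin(state, gate):
    """Apply one MPMCF gate given as (positive controls, negative controls, (target a, target b))."""
    pos, neg, (a, b) = gate
    out = list(state)
    if all(out[i] == 1 for i in pos) and all(out[i] == 0 for i in neg):
        out[a], out[b] = out[b], out[a]
    return tuple(out)


def run(state, gates):
    for gate in gates:
        state = apply_fredkin(state, gate)
    return state


def undetected_faults(gates, n_lines):
    """Count single bit flips that a count-the-ones checker would miss."""
    missed = 0
    for inputs in product((0, 1), repeat=n_lines):
        for k in range(len(gates) + 1):
            node_state = run(inputs, gates[:k])
            for line in range(n_lines):
                faulty = list(node_state)
                faulty[line] ^= 1
                outputs = run(tuple(faulty), gates[k:])
                if sum(outputs) == sum(inputs):  # checker sees matching weights
                    missed += 1
    return missed


# Hypothetical circuit built only from (mixed-polarity, multiple-control) Fredkin gates.
FREDKIN_CIRCUIT = [((0,), (), (1, 2)), ((2,), (0,), (1, 3)), ((1, 3), (), (0, 2))]
print(undetected_faults(FREDKIN_CIRCUIT, 4))  # 0
```

Swaps never change the number of 1s, so a single flipped bit shifts the output weight by exactly one no matter which gates follow it.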
III. MAKING A REVERSIBLE CIRCUIT FAULT-TOLERANT
In this section, we discuss our methods to make a given reversible circuit fault-tolerant in terms of error detection and correction. To illustrate them, an optimized 1-bit full adder synthesized in [19] is used as an example of a given circuit. The circuit is shown in Figure 3.
A. SINGLE PARITY BASED ERROR DETECTION
The single parity method is based on the parity preservative property. In order to satisfy the property for circuits consisting of MCT or MPMCT gates, which are not parity preservative gates, we add an extra bit line to the circuit and an extra gate for each gate of the circuit. The added gate shares the same control lines as the corresponding gate in the original circuit, with its target always on the added line. In this manner, we can satisfy the parity preservative equation by doubling the circuit area cost. Thus, we can detect an odd number of errors at the output. We elucidate our method with the following example.
Example 1: Let's make the full adder in Figure 3 parity preserving. The resulting circuit is shown in Figure 4.
Note that our method guarantees the parity preserving property not only for the given circuit, but also for any subcircuit of it, which can be used to locate faults. This cannot be done with the conventional DMR technique. Additionally, although the area overheads are the same in DMR and our technique, the resulting DMR circuit is not reversible, so there is no guarantee of keeping the fault information at the output.
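The single parity construction can be sketched as follows. The three-line circuit `ORIGINAL` is a hypothetical stand-in for the full adder of Figure 3, and the helper names are illustrative. Each gate is followed by a copy that has the same controls and targets the added parity line.

```python
from itertools import product


def apply_gate(state, gate):
    """Apply one MPMCT gate given as (positive controls, negative controls, target)."""
    pos, neg, target = gate
    out = list(state)
    if all(out[i] == 1 for i in pos) and all(out[i] == 0 for i in neg):
        out[target] ^= 1
    return tuple(out)


def run(state, gates):
    for gate in gates:
        state = apply_gate(state, gate)
    return state


def add_single_parity(gates, parity_line):
    """For each gate, add a gate with identical controls whose target is the parity line."""
    protected = []
    for pos, neg, target in gates:
        protected.append((pos, neg, target))
        protected.append((pos, neg, parity_line))
    return protected


def parity(bits):
    p = 0
    for b in bits:
        p ^= b
    return p


def is_parity_preserving(gates, n_lines):
    return all(parity(s) == parity(run(s, gates))
               for s in product((0, 1), repeat=n_lines))


ORIGINAL = [((0, 1), (), 2), ((0,), (), 1), ((1,), (0,), 2)]  # hypothetical circuit
PROTECTED = add_single_parity(ORIGINAL, parity_line=3)

print(is_parity_preserving(ORIGINAL, 3))   # False
print(is_parity_preserving(PROTECTED, 4))  # True
print(len(PROTECTED), len(ORIGINAL))       # 6 3
```

Because every gate pair flips either zero or two lines, every prefix or suffix made of whole pairs is also parity preserving, which matches the subcircuit property stated above.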
B. HAMMING 3 BASED ERROR DETECTION AND CORRECTION
The basic idea of Hamming 3 encoding is based on the following equations [27], [28].
Constructed on these equations, we present our two-step algorithm to make a reversible circuit fault-tolerant.
Input: A reversible circuit consisting of MPMCT gates having n bit/data lines
Output: A fault-tolerant circuit that can detect any number of errors unless the number of errors at the output is a multiple of 3, or correct a 1-bit error at the output.
1) Finalize Equations 1-5 by considering n. Thus, the needed parity lines are determined.
2) In order to satisfy the finalized equations, for each gate add extra gates having the same controls as those of the corresponding gate and the targets on the parity lines.
We elucidate our method with the following example.
Example 2: Again consider the full adder in Figure 3 as a given circuit. By using the first step we obtain the finalized equations:
In the second step, we start with the first gate on the left side.
Since it has a target bit on d_1, by using Equations 6 and 7 we should add two targets on p_1 and p_2. Therefore, two extra gates are needed. This is illustrated in Figure 5(a). After applying the procedure for the second, third, and fourth gates, we obtain the final form as shown in Figure 5(b).
The area overhead of our method is more or less the same as that of TMR. However, our method offers higher error correction rates. Also, while our method can correct or detect errors, TMR is only for correction. The error detection performance of our method is much better than that of DMR. A final note is that, similar to our single parity based method, the proposed error detection/correction scheme is valid for any subcircuit of the given circuit.
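The two-step procedure can be sketched for n = 4 data lines. The parity equations used here are the standard Hamming(7,4) ones (p1 = d1 ⊕ d2 ⊕ d4, p2 = d1 ⊕ d3 ⊕ d4, p3 = d2 ⊕ d3 ⊕ d4); this is an assumption, chosen because it agrees with the statement that a target on d_1 requires targets on p_1 and p_2, and it may differ from the finalized equations of the paper. The circuit `ORIGINAL` and all helper names are hypothetical.

```python
from itertools import product

# Assumed layout: lines 0-3 carry d1..d4, lines 4-6 carry p1..p3.
# COVERS[d] lists the parity lines whose equation contains data line d.
COVERS = {0: (4, 5), 1: (4, 6), 2: (5, 6), 3: (4, 5, 6)}


def apply_gate(state, gate):
    pos, neg, target = gate
    out = list(state)
    if all(out[i] == 1 for i in pos) and all(out[i] == 0 for i in neg):
        out[target] ^= 1
    return tuple(out)


def run(state, gates):
    for gate in gates:
        state = apply_gate(state, gate)
    return state


def add_hamming(gates):
    """Step 2: after each gate, add same-control gates targeting the affected parity lines."""
    protected = []
    for pos, neg, target in gates:
        protected.append((pos, neg, target))
        protected.extend((pos, neg, p) for p in COVERS[target])
    return protected


def encode(data):
    d1, d2, d3, d4 = data
    return tuple(data) + (d1 ^ d2 ^ d4, d1 ^ d3 ^ d4, d2 ^ d3 ^ d4)


def correct(word):
    """Locate and flip a single erroneous bit using the syndrome."""
    expected = encode(word[:4])
    syndrome = tuple(word[4 + j] ^ expected[4 + j] for j in range(3))
    if syndrome == (0, 0, 0):
        return tuple(word)
    signatures = {d: tuple(int(p in COVERS[d]) for p in (4, 5, 6)) for d in COVERS}
    signatures.update({4: (1, 0, 0), 5: (0, 1, 0), 6: (0, 0, 1)})
    bad = next(line for line, sig in signatures.items() if sig == syndrome)
    fixed = list(word)
    fixed[bad] ^= 1
    return tuple(fixed)


ORIGINAL = [((0, 1), (), 3), ((0,), (), 1), ((1, 2), (), 3), ((1,), (), 2)]  # hypothetical
PROTECTED = add_hamming(ORIGINAL)


def all_single_errors_corrected():
    for data in product((0, 1), repeat=4):
        golden = run(encode(data), PROTECTED)
        if golden != encode(golden[:4]):      # output must remain a valid codeword
            return False
        for line in range(7):
            corrupted = list(golden)
            corrupted[line] ^= 1
            if correct(tuple(corrupted)) != golden:
                return False
    return True


print(all_single_errors_corrected())  # True
```

In this sketch the added gates keep the seven lines a valid codeword after every original gate, so any single-bit error at the output yields a nonzero syndrome that identifies the erroneous line.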
C. SIMPLIFIED SINGLE PARITY AND HAMMING 3
For our methods introduced in the last two subsections, we add extra gates with their targets on parity bits. Here, we show that we can reduce the area overheads of the proposed techniques by investigating added gate pairs in adjacent stages, separated by dashed red lines in the figures. We have two cases for simplification: 1) the gates and their locations are identical, such that both gates have the same target and control bit lines; and 2) one of the gates shares all control and target lines of the other one and has one extra control. For the first case, we remove both gates since switching a parity bit twice results in no change. For the second case, we remove the gate having one fewer control line, and keep the other gate while negating its extra control. The reason is that a change in a parity bit occurs only if all of the shared controls are active and the extra control is inactive. Figure 6 shows an example of simplification applied to the circuit in Figure 5. For the first two stages, there are two gates satisfying the second case. This is illustrated in Figure 6(a). There is a similar case in the third and fourth stages. As a result, the simplified circuit is obtained as shown in Figure 6(b).
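Both simplification cases are functional identities on two consecutive gates and can be verified by truth table. This is a minimal sketch with hypothetical gates on four lines; the extra control is assumed to be a positive control, so it becomes a negative control after merging.

```python
from itertools import product


def apply_gate(state, gate):
    """Apply one MPMCT gate given as (positive controls, negative controls, target)."""
    pos, neg, target = gate
    out = list(state)
    if all(out[i] == 1 for i in pos) and all(out[i] == 0 for i in neg):
        out[target] ^= 1
    return tuple(out)


def run(state, gates):
    for gate in gates:
        state = apply_gate(state, gate)
    return state


def same_function(gates_a, gates_b, n_lines):
    """True iff both gate lists map every input state to the same output state."""
    return all(run(s, gates_a) == run(s, gates_b)
               for s in product((0, 1), repeat=n_lines))


G_SHORT = ((0,), (), 3)      # control on line 0, target on parity line 3
G_LONG = ((0, 1), (), 3)     # same gate with one extra positive control on line 1
G_MERGED = ((0,), (1,), 3)   # extra control kept, but negated

print(same_function([G_SHORT, G_SHORT], [], 4))         # case 1: True
print(same_function([G_SHORT, G_LONG], [G_MERGED], 4))  # case 2: True
```

The second identity holds because the parity line flips once when the shared control is active and the extra control is 0, and twice (no net change) when both are 1.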
IV. PERFECT ERROR DETECTION WITH FREDKIN GATES
From Lemma 2 given in the preliminaries section, we know that 100% error detection is possible with Fredkin gates. For this purpose we develop a greedy synthesis algorithm that implements an arbitrary reversible function with MPMCF gates. Our algorithm has four steps as follows.
Input: A reversible function with its truth table having n inputs and n outputs.
Output: A reversible circuit consisting of MPMCF gates that implements the given function.
1) Make the given truth table conservative by adding 0 and 1 valued inputs and their corresponding outputs.
• For each row of the truth table, find the difference value as the number of 1's in each output row minus the number of 1's in the corresponding input row.
• The number of added 1 valued inputs is the highest positive value of the difference, and the number of added 0 valued inputs is the absolute value of the lowest negative value of the difference.
• Based on the added input values, the output values must be set in a way to achieve same number of 1's in each input/output row of the truth table.
2) Sort input and output columns of the 
4) Starting from the row having the smallest number of unmatched bits, assign MPMCF gates row by row.
• Select the controls of the MPMCF gates such that each gate only changes the bits in the corresponding row, without disturbing other rows. In the worst-case scenario, this is satisfied by using all bits as controls except the two target bits.
• The number of used MPMCF gates in a row is the number of unmatched bits over two.
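Step 1 of the algorithm above can be sketched as follows. The function name `make_conservative` is illustrative, the CNOT truth table is a hypothetical example rather than the full adder of the paper, and placing the required 1s in the leftmost added outputs is an arbitrary choice made for this sketch.

```python
from itertools import product


def make_conservative(table):
    """Extend each (input row, output row) pair so both rows contain the same number of 1s."""
    diffs = [sum(out) - sum(inp) for inp, out in table]
    n_ones = max(0, max(diffs))    # added 1-valued inputs: highest positive difference
    n_zeros = max(0, -min(diffs))  # added 0-valued inputs: |lowest negative difference|
    extended = []
    for (inp, out), diff in zip(table, diffs):
        extra_in = (1,) * n_ones + (0,) * n_zeros
        ones_out = n_ones - diff   # 1s needed on the added outputs to balance the row
        extra_out = (1,) * ones_out + (0,) * (n_ones + n_zeros - ones_out)
        extended.append((tuple(inp) + extra_in, tuple(out) + extra_out))
    return extended


# Hypothetical example: CNOT, (a, b) -> (a, a XOR b).
CNOT_TABLE = [((a, b), (a, a ^ b)) for a, b in product((0, 1), repeat=2)]
for row_in, row_out in make_conservative(CNOT_TABLE):
    print(row_in, "->", row_out)
```

For CNOT the differences are 0, 0, +1, and −1, so one 1-valued and one 0-valued input are added, and every extended row has equal weight on both sides.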
As an example, we again use the reversible full-adder circuit in Figure 3 and its truth table; n = 4. The steps of the algorithm are summarized in Figure 7. First we determine the difference values, shown in Figure 7(a). Since the highest positive value is 2, we add two 1 valued inputs I_n1 and I_n2 as well as the corresponding outputs O_n1 and O_n2. This is illustrated in Figure 7(b). Note that since there is no negative difference value, we do not need to add 0 valued inputs. After performing sorting, we map MPMCF gates as shown in Figure 7(c) and (d).
V. CMOS LOGIC IMPLEMENTATIONS
Our algorithms given in Sections III and IV result in reversible circuits with MPMCT and MPMCF gates. We show how to convert these gates into CMOS gate based realizations.
Consider a conventional NAND gate. Since three input combinations are mapped to a single logical 1 value, the information regarding a possible fault at one of the inputs can be lost. The same problem occurs in NOR, OR, and AND gates. This is indeed related to "don't care" conditions. On the other hand, NOT, XOR, and XNOR gates perfectly preserve the awareness of an input fault. However, they do not form a universal set. We use NAND, XOR, and NOT gates for realizations such that any internal node of the resulting CMOS logic circuit is an input of an XOR or a NOT gate. Also, if an inverter is driving a NAND gate, then we replace the inverter with a cascaded inverter pair in a loop to prevent "don't care" conditions. As a result, we guarantee the elimination of any latent switching faults at the nodes.
The gate-level implementation of an MCT gate is shown in Figure 8(a). For an MPMCT gate, we only add cascaded inverter pairs to the inputs having negative controls. This is shown in Figure 8(b). Figure 9 shows the implementation for an MPMCF gate. Again, in the case of negative controls, we add cascaded inverter pairs to the corresponding inputs.
TABLE 3. Reversible and quantum costs and error detection and correction rates of the proposed Hamming 3 and single parity based methods.
Note that since CMOS logic is one directional, these implementations are not fully reversible anymore. They implement the reversible functions proceeding only from inputs to outputs.
VI. EXPERIMENTAL RESULTS
We use reversible benchmarks from [29]. We evaluate our methods in terms of area cost, power cost, and fault tolerance performance. We consider three measures for area costs: 1) reversible cost, 2) quantum cost, and 3) CMOS cost. As previously explained in the preliminaries section, for reversible cost we use reversible gate counts, and for quantum cost we use the measure in Table 1. For CMOS cost, we report an estimate of the occupied die area using TSMC 0.18 µm technology. In addition, estimated CMOS power values are reported. CMOS area and power results are obtained using the Genus tool from Cadence.
For fault tolerance analysis, in each try we inject a randomly placed switching fault into a circuit node, causing a bit flip, and check the resulting output errors. Using Monte Carlo simulation, detection and correction rates represent the ratio of the number of tries where errors are detected/corrected at the output to the total number of tries. For the single parity scheme, errors are detected iff they occur in odd numbers at the output. For the Hamming coded scheme, all of the errors are detected except those occurring in numbers that are multiples of d. In our study d is 3. The correction rate for the Hamming coded scheme is the ratio of the number of tries resulting in a single error at the output to the total number of tries.
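The fault injection procedure for the single parity scheme can be sketched as a small Monte Carlo loop. Everything here is an illustrative assumption: the three-gate circuit, the helper names, the fault model (one bit flip on a line between two gates of the reversible circuit), and the `boundaries_only` option, which restricts faults to the boundaries between gate pairs.

```python
import random


def apply_gate(state, gate):
    """Apply one MPMCT gate given as (positive controls, negative controls, target)."""
    pos, neg, target = gate
    out = list(state)
    if all(out[i] == 1 for i in pos) and all(out[i] == 0 for i in neg):
        out[target] ^= 1
    return tuple(out)


def run(state, gates):
    for gate in gates:
        state = apply_gate(state, gate)
    return state


def add_single_parity(gates, parity_line):
    protected = []
    for pos, neg, target in gates:
        protected += [(pos, neg, target), (pos, neg, parity_line)]
    return protected


def detection_rate(gates, n_lines, trials, rng, boundaries_only=False):
    """Ratio of tries in which the injected fault causes an odd number of output errors."""
    step = 2 if boundaries_only else 1
    positions = list(range(0, len(gates) + 1, step))
    detected = 0
    for _ in range(trials):
        inputs = tuple(rng.randint(0, 1) for _ in range(n_lines))
        golden = run(inputs, gates)
        k = rng.choice(positions)        # fault position: before gate k
        line = rng.randrange(n_lines)    # faulty line
        state = list(run(inputs, gates[:k]))
        state[line] ^= 1                 # inject the bit flip
        faulty = run(tuple(state), gates[k:])
        n_errors = sum(g != f for g, f in zip(golden, faulty))
        detected += n_errors % 2         # single parity detects odd error counts
    return detected / trials


ORIGINAL = [((0, 1), (), 2), ((0,), (), 1), ((1,), (0,), 2)]  # hypothetical circuit
PROTECTED = add_single_parity(ORIGINAL, parity_line=3)

print(detection_rate(PROTECTED, 4, 2000, random.Random(0), boundaries_only=True))  # 1.0
print(detection_rate(PROTECTED, 4, 2000, random.Random(0)))
```

Faults at pair boundaries are always detected because the remaining whole gate pairs preserve parity; the second call also samples nodes between a gate and its added parity gate, where detection is not guaranteed under this fault model.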
A. FAULT TOLERANCE WITH REVERSIBLE GATES
For our single parity and Hamming 3 based methods, we directly use the synthesized benchmarks from [29]. Then we make them fault-tolerant. To our knowledge, there is no similar study in the literature. As we discuss in the introduction, although there are different fault-tolerant approaches proposed for reversible circuits, they lack implementations with reversible gates. If we implement them with the known reversible synthesis techniques suitable for don't care inputs (error detection/correction necessarily requires don't care conditions), then the area costs become excessively large. Table 2 shows an example for a reversible 1-bit full adder synthesized with our techniques and with the transformation based synthesis (TBS) technique [30]. Since the area costs of TBS are much larger (even worse for larger benchmarks), we do not add further results of TBS in the following tables.
Area costs and error detection/correction rates of the proposed methods are shown in Table 3. By examining Table 3, we can conclude that for the Hamming 3 method, the simplification almost always reduces the area costs with a slight decrease in error detection/correction. On the other hand, for the single parity method, the simplification causes a major decrease in error detection, so it might not be preferable. This is due to the loss of the parity preservative feature in the simplified stages. Another inference is that, on average, our single parity and Hamming 3 based techniques make the original circuit area two and three times larger, respectively. One important point is that the detection rates of the single parity method are as good as those of the Hamming 3 based method.
B. FAULT TOLERANCE WITH CMOS GATES
To show practical usage of the proposed techniques, we perform CMOS implementations with NOT, NAND, and XOR-2 gates as explained in Section V. Here, we extensively apply our techniques in comparison with DMR and TMR solutions to reversible benchmark functions by reporting area and power results.
Results are shown in Table 4. Since the single parity method can only detect faults, we compare it with the DMR scheme. In most cases, the single parity method has much higher error detection rates with similar or better area and power consumption in comparison to the DMR scheme. Since the Hamming 3 technique has a correction capability, we compare it with the TMR scheme. Again in most cases, the Hamming 3 technique offers better performance in both area cost and error correction rates. However, the power consumption of the Hamming 3 technique is generally higher than that of TMR. On average, the single parity method consumes 8.2% less power and 59.4% less area in comparison to DMR, and the Hamming 3 technique consumes 32.5% more power and 56.6% less area compared to TMR.
C. FAULT TOLERANCE WITH FREDKIN GATES
Since the Fredkin gate is a conservative gate, a circuit synthesized using only this gate yields 100% error detection. In the literature, synthesis with Fredkin gates has not been proposed. However, by making any truth table conservative and then performing the Fredkin-enabled TBS scheme using Soeken's approach [20], we can obtain a circuit constructed from only Fredkin gates. The results shown in Table 5 clearly favor our synthesis technique for each of the three area cost measures as well as for power consumption.
VII. CONCLUSION
In this study, we have proposed methods to achieve latent-fault-free and error detecting/correcting CMOS circuits. For this purpose, we first implement fault-tolerant reversible circuits. Since our methods to make a reversible circuit fault-tolerant do not disturb the original circuit, they yield smaller area overhead in comparison to any other synthesis technique in the literature. Next, we convert our reversible circuits to CMOS realizations, and then compare our methods with conventional DMR and TMR techniques. In the quest to achieve perfect error detection, we also develop a greedy synthesis algorithm that implements an arbitrary reversible function with multiple-control Fredkin gates. As future work, we aim to find a better synthesis technique, based on Fredkin or other conservative gates and with much smaller CMOS area, to achieve 100% error detection and correction.
M. HÜSREV CILASUN is currently an Engineer with Aselsan A.Ş. He has authored several conference and journal papers on EEG processing and autonomous direction estimation, number theory, and robotics. His research interests include reversible and quantum circuits, FPGA/ASIC design, digital signal and image processing, and machine learning.
