Quantum-dot cellular automata (QCA) is
Introduction
According to Moore's law, introduced by Gordon Moore in 1965, the number of components on a chip doubles every 18 months [1]. Recently, conventional complementary metal-oxide-semiconductor (CMOS) technology has approached its physical limits and faces challenging problems such as high power consumption and the difficulty of further reducing the feature size. Much research has therefore been devoted to finding technologies that can replace conventional CMOS. One of the promising alternatives is the quantum-dot cellular automata (QCA), which offers faster switching speed and higher device density. The QCA, first introduced by Lent et al. in 1994 [2, 3], was expected to operate with densities of 10^12 devices/cm^2 in the 100 GHz domain, with switching times as low as 10 ps. This technology combines transmission and computation to build logic circuits at the nanoscale. These properties have encouraged many researchers to implement new circuits in the QCA.

This paper describes the QCA design of several reversible Gates: the CNOT, Toffoli, Feynman, Double Feynman, Fredkin, Peres, MCL and R Gates. These optimized QCA reversible Gates are then used to design a QCA 4-bit reversible parity checker and a 3-bit reversible binary to Gray converter. Many QCA-based designs of reversible logic Gates and circuits have been reported previously [4-23]. In [6], some optimized scalable reversible logic Gates have been designed in the QCA technology and compared with conventional CMOS technology. The designs of a QCA reversible Gray to binary converter and a QCA reversible binary to Gray converter, both based on the Feynman Gate, have been investigated in [4].
In [8], innovative designs of Peres and R Gates have been presented in both QCA and CMOS technologies, which can be used in the design of complex low-power nanoscale computing structures. Novel schemes of MCL, Fredkin, URG and BJN Gates based on QCA have been proposed in [12]. In [14], a novel approach to designing two different QCA layouts of the Toffoli Gate has been introduced. In [15], a QCA layout of the Peres Gate as a universal Gate has been used to implement all the basic QCA logic Gates. The design of an ultralow-power reversible n-bit binary incrementer in the QCA has been demonstrated in [22]. In [23], the design of a reversible 1-bit comparator in the QCA technology has been proposed using an optimized QCA layout of the Feynman Gate.

In this paper, all presented QCA layouts for reversible Gates and circuits are compared with the previous QCA and CMOS designs [6, 7, 25, 26], and we show that our QCA designs are more efficient in terms of the number of cells, occupied area, power consumption and input-to-output delay. The rest of the paper is organized as follows. In Section 2, the background of the QCA technology is summarized. Section 3 describes the basic reversible logic Gates. In Section 4, the proposed QCA reversible Gates and circuits are presented. Section 5 shows the simulation results and the comparisons between the proposed circuits and the previous QCA and CMOS works. Finally, Section 6 concludes the paper.
Quantum-dot cellular automata
The primary element of the QCA is the QCA cell shown in Fig. 1a [2, 3]. The QCA cell consists of four quantum dots located at the corners of a square and two free electrons. The electrons can tunnel between the dots under a proper clocking mechanism, but they cannot leave the cell. The Coulombic repulsion between the two electrons forces them into diagonally opposite dots, so only two stable arrangements are possible. These two states are defined as the cell polarizations P=+1 (representing logic "1") and P=-1 (representing logic "0").

All QCA logic circuits can be implemented using three fundamental elements: the 3-input majority Gate, the inverter Gate and the QCA wire [2]. A QCA wire is created by placing QCA cells in an array, in which the polarization of each cell is directly affected by the polarization of its neighboring cells through the Coulombic repulsion between them. In this way the wire carries a signal from one end to the other. The binary wire and the inverter chain are the two kinds of QCA wires, as shown in Fig. 1b. In a binary wire, the polarization of the first cell (the binary information) is propagated through the entire array. In an inverter chain, each cell is polarized opposite to its neighboring cells. The QCA inverter, shown in Fig. 1c, returns the inverted value of its input. The majority Gate or majority voter (MV), which is considered the most important Gate in the QCA technology, is shown in Fig. 1d. In this Gate, the polarization of the output f follows the polarization of the majority of the three input cells A, B and C. The logic function of the 3-input majority Gate is given by [3]:

f = M(A, B, C) = AB + BC + AC (1)
where A, B and C are the inputs and f is the output. By fixing the polarization of one input of the majority Gate, for example input C, to P=-1 (logic "0") or P=+1 (logic "1"), AND and OR Gates can be implemented as follows:

f(A, B, C=0) = AB (2)
f(A, B, C=1) = A + B (3)

In Fig. 1e, the circuit symbols of the QCA 3-input majority Gate and inverter Gate are shown. In the QCA, there are two types of crossovers (wire crossings): coplanar and multi-layer, as shown in Fig. 1f and Fig. 1g, respectively. The coplanar crossover uses one layer and two wire types (binary wire and inverter chain), whereas the multi-layer crossover uses the binary wire and more than one layer of QCA cells.

[16, 24], and reversible binary to Gray converter [4] have been introduced. In [27], a novel QCA reversible Gate has been presented, which is promising for constructing nanoscale low-power information processing systems. In [28], an efficient design of the QCA Fredkin Gate based on the QCA wire, the 3-input majority Gate and the QCA inverter Gate has been presented, in which the number of cells, the covered area and the latency are reduced compared with previous designs. In [29], a new layout of the multiply complements logic (MCL) Gate based on the QCA inverter, the QCA wire and the QCA majority voter (MV) Gate has been introduced. In [30], novel QCA-based 2-bit, 3-bit and 4-bit binary to Gray code converters have been implemented, with layouts in which the number of cells, the area and the input-to-output delay are reduced.
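The majority-gate relations described in this section can be checked with a short logical model. The sketch below is only a Boolean abstraction, assuming polarization P=+1 is encoded as 1 and P=-1 as 0. The function names (majority, and_gate, or_gate) are illustrative and do not come from the paper or from any QCA tool.

```python
from itertools import product


def majority(a: int, b: int, c: int) -> int:
    """3-input majority vote: the output follows at least two agreeing inputs."""
    return (a & b) | (b & c) | (a & c)


def and_gate(a: int, b: int) -> int:
    """AND obtained by fixing the third majority input to logic 0 (P=-1)."""
    return majority(a, b, 0)


def or_gate(a: int, b: int) -> int:
    """OR obtained by fixing the third majority input to logic 1 (P=+1)."""
    return majority(a, b, 1)


# Exhaustively compare against Python's bitwise AND / OR.
for a, b in product((0, 1), repeat=2):
    assert and_gate(a, b) == (a & b)
    assert or_gate(a, b) == (a | b)
```

The model says nothing about cell placement or clocking; it only confirms that one fixed input turns the majority function into AND or OR.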
Reversible gates
Reversible logic circuits have many applications in areas such as low-power CMOS architectures, quantum computing, optical and DNA computing, and nanotechnology [31-33]. In a reversible logic Gate, the number of outputs is equal to the number of inputs, and each input vector produces a unique output vector. Whenever the numbers of inputs and outputs need to be made identical, an extra input or output can be added. Therefore, in a reversible circuit, which should be composed of reversible Gates, a one-to-one correspondence exists between its inputs and outputs. To achieve good performance and low complexity, a reversible circuit should also have a low quantum cost and a minimal number of reversible Gates and garbage outputs. Many reversible logic Gates exist. Some of the most important ones are described as follows [31-33]:
NOT Gate
The NOT Gate is a 1*1 elementary reversible Gate with a quantum cost equal to zero. The equation between its input and output is given by:

P = a' (4)

where a is the input, P is the output and a' denotes the complement of a.
CNOT Gate
The CNOT (controlled-NOT) Gate is a 2*2 reversible Gate with a quantum cost equal to one. This Gate can be described as:

P = a, Q = a ⊕ b (5)

where a and b are the inputs, and P and Q are the outputs.
Feynman Gate
The Feynman Gate, like the CNOT Gate, is a 2*2 reversible Gate with a quantum cost equal to one, and it is widely used to provide fan-out in reversible circuits. The equation between its inputs and outputs is given by:

P = a, Q = a ⊕ b (6)

where a and b are the inputs, and P and Q are the outputs.
Double Feynman Gate
The Double Feynman Gate is a 3*3 reversible Gate with a quantum cost equal to two. This Gate can be described by the following equation:

P = a, Q = a ⊕ b, R = a ⊕ c (7)

where a, b and c are the inputs, and P, Q and R are the outputs.
Toffoli Gate
The Toffoli Gate is a 3*3 reversible Gate with a quantum cost equal to five. In this Gate, the outputs (P, Q and R) are related to the inputs (a, b and c) by:

P = a, Q = b, R = ab ⊕ c (8)
Fredkin Gate
The Fredkin Gate is a 3*3 reversible Gate with a quantum cost equal to five. The following equation describes this Gate:

P = a, Q = a'b ⊕ ac, R = a'c ⊕ ab (9)
Peres Gate
The Peres Gate is a 3*3 reversible Gate with a quantum cost equal to four, which is the minimum among all 3*3 reversible Gates. Its outputs are defined by:

P = a, Q = a ⊕ b, R = ab ⊕ c (10)

From Equation (10) it can be seen that the Peres Gate can be used as a half adder if c=0, since Q = a ⊕ b is then the sum and R = ab is the carry.
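The one-to-one property of the Gates described so far, and the half-adder behaviour of the Peres Gate, can be verified exhaustively. The following is a minimal sketch that assumes the commonly used textbook definitions of the CNOT/Feynman, Double Feynman, Toffoli, Fredkin and Peres Gates with input a as the control. The function names and the helper is_reversible are illustrative, not part of the paper.

```python
from itertools import product


def cnot(a, b):
    return a, a ^ b


feynman = cnot  # assumed to share the same logical mapping as the CNOT Gate


def double_feynman(a, b, c):
    return a, a ^ b, a ^ c


def toffoli(a, b, c):
    return a, b, (a & b) ^ c


def fredkin(a, b, c):
    # Controlled swap: b and c are exchanged when the control a is 1.
    return (a, c, b) if a else (a, b, c)


def peres(a, b, c):
    return a, a ^ b, (a & b) ^ c


def is_reversible(gate, n_inputs):
    """A gate is reversible if its 2**n input vectors map to 2**n distinct outputs."""
    outputs = {gate(*bits) for bits in product((0, 1), repeat=n_inputs)}
    return len(outputs) == 2 ** n_inputs


GATES = {
    "CNOT": (cnot, 2),
    "Feynman": (feynman, 2),
    "Double Feynman": (double_feynman, 3),
    "Toffoli": (toffoli, 3),
    "Fredkin": (fredkin, 3),
    "Peres": (peres, 3),
}

for name, (gate, n) in GATES.items():
    assert is_reversible(gate, n), name

# Peres Gate with c = 0 behaves as a half adder: Q is the sum, R is the carry.
for a, b in product((0, 1), repeat=2):
    _, s, carry = peres(a, b, 0)
    assert (s, carry) == ((a + b) % 2, (a + b) // 2)
```

The same is_reversible helper can be applied to any other n*n mapping, which makes it a convenient sanity check before a Gate is laid out in cells.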
R Gate
The R Gate is a 3*3 reversible Gate with the following equation:
MCL Gate
The MCL Gate is a 3*3 reversible Gate. The following equation describes this Gate:
Where, a, b and c are the inputs, and P, Q and R are the outputs.
Implementation in the QCA
This section presents the proposed QCA implementations of the reversible logic Gates and circuits. New efficient QCA implementations of the CNOT, Toffoli, Feynman, Double Feynman, Fredkin, Peres, MCL and R Gates are presented. The designs of a 4-bit reversible parity checker and a 3-bit reversible binary to Gray converter are also introduced using these optimized layouts. In all of the proposed layouts, except for the Fredkin Gate, a low-complexity and high-speed QCA XOR structure is used, which is designed based on the explicit interactions between the QCA cells, as shown in Fig. 2 [34]. In [34], a well-optimized structure for a three-input XOR Gate is proposed that is based on the QCA cell interaction. This XOR Gate is a 5-input majority-like device [35, 36], which is composed of 14 QCA cells and has an input-to-output delay of only two clock phases. In Fig. 2, the QCA layout and simulation results of this XOR Gate are shown. We use this XOR Gate for the first time to design and implement QCA reversible logic Gates and circuits. The QCA layout shown in Fig. 2 acts as a 3-input XOR Gate. By fixing one input, for example input c, to logic "0" or logic "1", a two-input XOR Gate or a two-input XNOR Gate is obtained, respectively [34].
Figure 2. Efficient layout of a 3-input XOR Gate and its simulation result
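The way a single 3-input XOR structure yields both two-input XOR and two-input XNOR can be illustrated at the logic level. This sketch models only the Boolean function, not the 14-cell layout or its clocking, and the names xor3, xor2 and xnor2 are chosen here for illustration.

```python
from itertools import product


def xor3(a: int, b: int, c: int) -> int:
    """Logical function of the 3-input XOR Gate."""
    return a ^ b ^ c


def xor2(a: int, b: int) -> int:
    """Two-input XOR: third input fixed to logic 0."""
    return xor3(a, b, 0)


def xnor2(a: int, b: int) -> int:
    """Two-input XNOR: third input fixed to logic 1."""
    return xor3(a, b, 1)


for a, b in product((0, 1), repeat=2):
    assert xor2(a, b) == (a ^ b)
    assert xnor2(a, b) == 1 - (a ^ b)
```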
In Fig. 3, the proposed QCA layouts for the NOT, CNOT, Toffoli, Feynman, Double Feynman, Fredkin, Peres, MCL and R Gates are shown. These layouts are implemented with the minimum number of cells, the minimum occupied area and the minimum input-to-output delay (in clock phases). As shown in Fig. 3, three logic components are used to implement these layouts: the three-input majority Gate (shown in Fig. 1d), the NOT Gate and the three-input XOR Gate (shown in Fig. 2). Although the layouts are highly integrated, none of the wire-crossing techniques shown in Fig. 1 is used, which avoids the problems associated with them. In the multi-layer method, the two wires of a crossover pass through different layers, which makes physical implementation difficult because of the complexity and cost of multi-layer manufacturing. The coplanar method also has drawbacks, such as the high sensitivity of the 45°-rotated cells.
Figure 3. The new proposed QCA layouts for (a) NOT, (b) CNOT, (c) Toffoli, (d) Feynman, (e) Double Feynman, (f) Fredkin, (g) Peres, (h) MCL and (i) R Gates.

Results and discussion
The simulation results of the proposed QCA layouts are shown in Fig. 4 and Fig. 5. The proposed layouts are simulated using the QCADesigner software, version 2.0.3 [37], with the default parameters of both the Bistable and Coherence Vector engines. QCADesigner is a powerful and fast tool for QCA layout design and simulation. Both simulation engines give the same results, which indicates the correctness of the proposed layouts. Fig. 4 and Fig. 5 show the results of the Bistable engine with the following (default) parameters: number of samples 12800, convergence tolerance 0.0001, radius of effect 65 nm, relative permittivity 12.9, clock high 9.8×10^-22 J, clock low 3.8×10^-23 J, clock amplitude factor 2, layer separation 11.5 nm, maximum iterations per sample 100, and randomized simulation order. The QCA cells are assumed to have a width and height of 18 nm, and their quantum dots have a diameter of 5 nm. As can be seen from Fig. 4 and Fig. 5, the proposed layouts work satisfactorily, and the outputs of all proposed QCA reversible Gates are highly polarized signals, which can provide high drivability for QCA reversible circuits.
Figure 4. Simulation results of the proposed QCA layouts for (a) NOT, (b) CNOT, (c) Toffoli, (d) Feynman and (e) Double Feynman Gates.
Figure 5. Simulation results of the proposed QCA layouts for (a) Fredkin, (b) Peres, (c) MCL and (d) R Gates.
We have also designed a 4-bit reversible parity checker and a 3-bit reversible binary to Gray converter using the proposed layouts. To implement the QCA 4-bit reversible parity checker, three CNOT Gates are used. Equations (13)-(15) describe the outputs of the first, the second and the third CNOT Gates, respectively, which together form the 4-bit reversible parity checker.
where a, b, c and d are the inputs, o is the output, and Gar1, Gar2 and Gar3 are the garbage outputs.
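A parity checker built from three CNOT Gates can be sketched at the logic level as follows. The wiring shown here is an assumption made for illustration: each new input drives the control of the next CNOT Gate, so the controls become the garbage outputs. The actual cascade used in the paper may connect the Gates differently, and the intermediate names x1 and x2 are hypothetical.

```python
from itertools import product


def cnot(a, b):
    """CNOT Gate: first output copies the control, second is a XOR b."""
    return a, a ^ b


def parity_checker(a, b, c, d):
    """Cascade of three CNOT Gates (assumed wiring, see the lead-in above)."""
    gar1, x1 = cnot(a, b)   # x1 = a XOR b
    gar2, x2 = cnot(c, x1)  # x2 = a XOR b XOR c
    gar3, o = cnot(d, x2)   # o  = a XOR b XOR c XOR d
    return o, (gar1, gar2, gar3)


seen = set()
for bits in product((0, 1), repeat=4):
    o, garbage = parity_checker(*bits)
    assert o == sum(bits) % 2      # o is 1 for an odd number of 1s
    seen.add((o, *garbage))

# Four inputs map to four outputs one-to-one, so the cascade stays reversible.
assert len(seen) == 16
```

With this wiring the circuit has four inputs and four outputs (one parity bit and three garbage bits), which matches the signal names used in the text.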
To implement a 3-bit reversible binary to Gray converter, two Feynman Gates can be used. Equations (16) and (17) describe the outputs of the first and the second Feynman Gates, respectively:

G2 = B2, G1 = B2 ⊕ B1 (16)
Gar = B1, G0 = B1 ⊕ B0 (17)
where B2, B1 and B0 are the three input bits (the binary code), G2, G1 and G0 are the corresponding three output bits (the Gray code) and Gar is the garbage output.

To compare the proposed designs with the previous works, a complete analysis is performed using the following comparison factors: occupied area, cell count, delay and type of wire-crossing method. Table 1 shows a detailed comparison between the proposed designs and the previous works. As can be seen in this table, some of the previous works [10, 18] use the coplanar wire-crossing scheme, and the QCA layouts in [4, 5, 11, 25] use the multi-layer wire-crossing method. Table 1 shows that the presented QCA layouts outperform all the listed previous works by a considerable margin. For example, a 50% increase in computation speed is achieved compared with the fastest previous Toffoli Gate [21]. The area of our Toffoli Gate is more than 1.7 times smaller than that of the previous Toffoli Gate in [20], and it uses 23 fewer cells than the Toffoli Gate presented in [13]. Compared with the previous Feynman Gate [23], the important improvements (IIs) achieved by the proposed Feynman Gate are 26% in the area and 56% in the cell count. Our Feynman Gate is also faster, and the previous design [23] uses multi-layer wire-crossing. The important improvement is given by:

II = ((Y - X) / Y) × 100% (18)
where X and Y are the area or cell count of our designs and of the previous works, respectively. In Fig. 6, the proposed QCA layouts of the 4-bit reversible parity checker and the 3-bit reversible binary to Gray converter are shown. This figure also presents the simulation results of these layouts, obtained using QCADesigner with the default parameters of the Bistable simulation engine (for the QCA 4-bit reversible parity checker, the number of samples is set to 30000). These proposed layouts are very dense designs, which are implemented with the minimum number of QCA cells and clock phases and without any wire-crossing method. From Fig. 6, it can be seen that the proposed layouts work satisfactorily and produce the correct outputs with highly polarized signals, which can provide high drivability for QCA reversible circuits. For the other Gates, the important improvements achieved by our QCA designs are as follows:
• In comparison to the Double Feynman Gate presented in [9]: 23% in the area and 56% in the cell count, with the same delay
• In comparison to the Fredkin Gate presented in [7]: 5% in the area and 3.8% in the cell count, with the same delay
• In comparison to the Peres Gate presented in [15]: 12% in the area and 58% in the cell count, and our design is 2 times faster
• In comparison to the MCL Gate presented in [12]: 15% in the area and 30% in the cell count, and our design is 2 times faster
• In comparison to the R Gate presented in [8]: 59% in the area and 60% in the cell count, and our design is faster
• In comparison to the reversible parity checker presented in [16]: 75% in the area and 75% in the cell count, and our design is 4 times faster
• In comparison to the reversible binary to Gray converter presented in [4]: 74% in the area and 71% in the cell count, and our design is 2 times faster

In the above list, references [16] and [4] use the multi-layer wire-crossing method. Furthermore, in comparison with conventional CMOS technology, the occupied areas of our QCA designs are about 1359 times (CMOS Peres Gate presented in [25]), 1747 times (CMOS Fredkin Gate presented in [7]) and 280 times (CMOS R Gate presented in [6]) smaller. In comparison to the Feynman Gate presented in [27], the proposed Feynman design has a 55% improvement in the area and a 52.9% improvement in the cell count, with the same input-to-output delay. In comparison to the Fredkin Gate presented in [28], the important improvements achieved by the proposed QCA design are 24% in the area and 18.2% in the cell count, with the same input-to-output delay. In comparison to the MCL Gate presented in [29], the proposed MCL Gate has a 55.2% improvement in the area and a 33.3% improvement in the cell count, and it is 2 times faster. Finally, in comparison to the 3-bit binary to Gray converter presented in [30], our design is faster and has a 95% improvement in the area and a 60.9% improvement in the cell count.
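The percentages quoted above can be reproduced with a small helper. This sketch assumes that the important improvement is the relative reduction (Y - X) / Y × 100%, which is consistent with the figures in the text. The previous-work cell counts (79 and 24) are taken from Table 1, whereas the proposed cell counts (76 and 16) are assumptions back-computed from the quoted percentages, because the rows for the proposed designs are not available here.

```python
def important_improvement(ours: float, previous: float) -> float:
    """Relative reduction of `ours` with respect to `previous`, in percent."""
    if previous <= 0:
        raise ValueError("previous value must be positive")
    return (previous - ours) / previous * 100.0


# Fredkin Gate: 79 cells in [7] (Table 1) vs. an assumed 76 cells for the proposed layout.
fredkin_cells = important_improvement(ours=76, previous=79)

# MCL Gate: 24 cells in [29] (Table 1) vs. an assumed 16 cells for the proposed layout.
mcl_cells = important_improvement(ours=16, previous=24)

print(f"Fredkin cell-count improvement: {fredkin_cells:.1f}%")  # 3.8%
print(f"MCL cell-count improvement: {mcl_cells:.1f}%")          # 33.3%
```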
Design                                     Area     Cell count   Delay   Wire-crossing
[19]                                       0.100    99           4       Without wire-crossing
Peres [21]                                 0.18     117          3       Without wire-crossing
Peres [22]                                 0.075    97           4       Without wire-crossing
Fredkin [7]                                0.07     79           3       Without wire-crossing
Fredkin [10]                               0.273    231          4       Coplanar
Fredkin [11]                               0.19     187          9       Multi-layer
Fredkin [12]                               0.09     81           4       Without wire-crossing
Fredkin [17]                               0.194    191          4       Multi-layer
Fredkin [18]                               0.375    246          4       Coplanar
Fredkin [21]                               0.10     97           3       Without wire-crossing
Fredkin [28]                               0.087    93           3       Without wire-crossing
MCL [12]                                   0.020    23           3       Without wire-crossing
MCL [21]                                   0.05     36           2       Without wire-crossing
MCL [29]                                   0.038    24           2       Without wire-crossing
Toffoli [13]                               0.081    44           5       Without wire-crossing
Toffoli [14]                               0.067    48           4       Without wire-crossing
Toffoli [20]                               0.043    101          5       Without wire-crossing
Toffoli [21]                               0.06     57           3       Without wire-crossing
CNOT [21]                                  0.06     49           3       Without wire-crossing
Reversible parity checker [16]             0.143    130          8       Without wire-crossing
4-Bit parity checker [24]                  0.051    97           6       Multi-layer
Reversible binary to Gray converter [4]    0.139    112          4       Multi-layer
3-Bit binary to Gray converter [30]        0.75     82           3       Without wire-crossing
R (CMOS technology) [6]                    12.3     ---          ---     ---
Fredkin (CMOS technology) [7]              122.3    ---          ---     ---
Peres (CMOS technology) [25]               59.8     ---          ---     ---
Conclusion
This paper presents new QCA implementations of the basic reversible logic Gates, together with efficient designs of a QCA 4-bit reversible parity checker and a 3-bit reversible binary to Gray converter. The proposed designs avoid wire crossings altogether and are built around a compact three-input XOR structure, which allows them to overcome the weaknesses of the previous QCA designs. We showed that, in comparison with the previous QCA layouts, the proposed designs are implemented with the minimum cell count, area and input-to-output delay. The proposed QCA layouts are also much denser than the conventional CMOS implementations. The simulation results obtained with the QCADesigner software showed that our QCA layouts perform correctly. These new layouts can be used as building blocks for designing low-power architectures in nanoscale reversible computing.
