Abstract-As CMOS technology reaches its physical limits, new technologies such as quantum-dot cellular automata, single electron tunneling, and tunneling-phase logic are being proposed as alternatives to CMOS technology. These technologies use either majority or minority logic to implement logic functions. Existing majority/minority logic synthesis methods, based on three-feasible networks, often result in suboptimal solutions. In this paper, an efficient algorithm to find the minimal majority gate mapping, along with a majority expression look-up table (MLUT), is developed. Based on the MLUT, a comprehensive majority/minority logic synthesis technique is proposed. A redundancy removal method is also developed to further optimize the synthesized circuit. The technique can be targeted at different optimization goals and results in fewer majority gates and fewer levels than previous methods. For the 29 MCNC benchmark circuits, when targeted to optimize the logic levels, there is an average reduction of 7.0% in the number of levels as well as 6.3% in the number of gates. For optimization targeted to reduce gate counts, there is an average reduction of 9.5% in the number of gates as well as 0.8% in the number of levels, as compared to the best available method.
I. INTRODUCTION

CONTINUING advances in semiconductor technology have led to the reduction of feature sizes in complementary metal-oxide semiconductor (CMOS) design. Attempts to further scale down the feature sizes face many hurdles due to the fundamental physical limits of CMOS technology. For example, as gate lengths are reduced below 10 nm, quantum effects are likely to dominate the performance of the device, resulting in increased gate leakage current, capacitive coupling, electromigration failures, doping fluctuations and increased difficulties in lithography [1]. Alternative technologies such as quantum-dot cellular automata (QCA) [2]-[5], single electron tunneling (SET) [6], [7] and tunneling phase logic (TPL) [8] are being considered as possible replacements for CMOS. Unlike CMOS technology, which uses "NAND/NOR/NOT" gates to implement circuits, QCA uses majority logic, SET uses both majority and minority logic, and TPL uses minority logic.
Conventional Boolean logic uses "AND", "OR" and "NOT" as basic units; therefore, traditional logic reduction methods produce simplified expressions in the two standard forms: sum of products and product of sums. However, in the next generation nanotechnologies, the basic logic units are three-input majority gates rather than the "NAND" or "NOR" gates used in CMOS. Therefore, existing logic synthesis methods that target CMOS technology are not efficient enough to take full advantage of majority gates. In order to automate circuit designs using emerging technologies, efficient synthesis algorithms for majority/minority logic are necessary.
Decades ago, methods such as Reduced-unitized-tables [9], K-maps [10], and Shannon's decomposition principles [11] were employed to synthesize majority logic functions. However, these methods were only suitable for small networks since they were used for solving problems manually. The majority logic synthesis methods in [12], [13] are based on a geometric interpretation of the Boolean functions and lead to thirteen standard functions. However, these functions are limited to synthesizing three-variable functions. Other approaches that can handle majority logic circuits with more than three variables are proposed in [14]-[18]. In these methods, standard logic synthesis tools, such as SIS [19], are initially used to decompose the circuit into three-feasible networks. The resulting three-feasible networks are then synthesized to obtain majority functions. In our earlier work [20], [21], Boolean functions are decomposed into four-feasible networks and standard functions are then used to convert each node in the network into the corresponding majority expression. However, the standard functions obtained in [20] and [21] are not a complete set. Later, using graph theory, the full set of standard functions was identified in [22]. However, the majority expressions of these standard functions are not optimal, thus leading to sub-optimal solutions.
In this paper, a comprehensive majority/minority logic synthesis method that is capable of processing both three-feasible and four-feasible networks is proposed. In most cases, a four-feasible network is naturally simpler than a three-feasible network since each node in the four-feasible network can accommodate one extra variable. However, five-feasible or more-feasible networks are not discussed because of the complexity of ensuring the optimal majority expression for each node in such networks. In the proposed method, we introduce the concept of a majority expression look-up table (MLUT). The MLUT is developed by generating optimized majority expressions for all four-variable Boolean logic functions. To the best of our knowledge, the MLUT concept has not previously been used in majority/minority logic synthesis or in any other setting. A comprehensive method is then provided so that any circuit, with an arbitrary number of inputs, can be synthesized to obtain its optimal majority expression by decomposing the circuit and targeting each decomposed node to the MLUT. The resulting majority network can be further simplified by applying the proposed redundancy removal process. A verification method is also developed to verify the functionality of the synthesized results. This approach results in fewer majority gates and fewer logic levels as compared to existing methods [16], [18].
The rest of the paper is organized as follows. Section II gives background information on QCA, SET and TPL technologies. Section III introduces the procedure for finding the MLUT. Section IV explains the details of the proposed synthesis method. Section V compares the new results with results obtained by prior methods. This is followed by conclusions in Section VI.
II. BACKGROUND MATERIAL
A. QCA Cells
The standard QCA cell contains four quantum dots that are confined by the cell boundary. The dots can be charged by free electrons that tunnel through each of the two diagonal dots. If two free electrons are placed in a cell, they occupy diagonal sites because the Coulomb repulsion pushes them far away from each other. Thus, there are two energetically stable states of a pair of electrons in a QCA cell. Logic 0 is represented by two electrons taking the upper-left and lower-right position. Logic 1 is represented by two electrons taking the upper right and lower left position. These two cases are shown in Fig. 1(a) and (b), respectively.
B. QCA Devices
This section introduces some basic QCA logic devices, including QCA wires, QCA inverters and QCA majority voters [23] - [26] .
A QCA wire is formed by placing QCA cells next to each other as shown in Fig. 2 . The signal propagates along the wire from left to right when excited from the leftmost cell. The figure shows how a logic 0 is propagated from left to right. The information flow can be bi-directional depending on the QCA clock.
A QCA majority gate can perform a three-input logic function as given in (1). Simple diagrams of a QCA majority gate and a QCA inverter are shown in Fig. 3.
$$ M(A, B, C) = AB + BC + AC \tag{1} $$
By forcing one of the three inputs of the majority gate to a constant logic 0 or 1, as shown in Fig. 4, a two-input "AND" or a two-input "OR" gate can be realized. These logic functions can also be derived from (1) by substituting "C = 0" and "C = 1", respectively. The resulting functions are given by
$$ M(A, B, 0) = AB, \qquad M(A, B, 1) = A + B. $$
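The reduction of a majority gate to "AND" and "OR" can be checked by enumerating the truth table. The following is a minimal sketch; the function name maj is illustrative and not taken from the paper.

```python
from itertools import product

def maj(a, b, c):
    """Three-input majority vote on bits: M(a, b, c) = ab + bc + ac."""
    return (a & b) | (b & c) | (a & c)

# Fixing the third input to a constant yields the two-input gates.
for a, b in product((0, 1), repeat=2):
    assert maj(a, b, 0) == (a & b)   # C = 0 gives AND
    assert maj(a, b, 1) == (a | b)   # C = 1 gives OR

# Full truth table of the majority gate, keyed by the input triple.
truth_table = {bits: maj(*bits) for bits in product((0, 1), repeat=3)}
```

The output is 1 exactly when at least two of the three inputs are 1, which is why a constant input acts as a tie-breaking vote.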
C. SET Technology
A SET minority gate is shown in Fig. 5(a). A SET majority configuration is realized by a balanced pair of single-electron boxes (SEBs) [7], as shown in Fig. 5(b). An electron tunnels through one of the SEBs to form a negative voltage and prevents other electron movements as Vdd increases. As a result, the stable voltage states for the two SEBs are (1, 0) and (0, 1), depending on the inputs. For example, if all inputs are 0, the voltage state is (0, 1) and node B has a negative voltage.
By forcing one of the three inputs to logic 0 or 1, a "NAND" gate and a "NOR" gate are obtained for the SET minority gate, while an "AND" gate and an "OR" gate are obtained for the SET majority gate [6].
D. TPL Technology
Two waveform phases are used to represent logic values in the TPL technology. If two inputs have different phases, they neutralize each other and the inverse of the third input determines the output. If all inputs have the same phase, the output equals the inverse of the inputs. A basic TPL minority configuration [8] is shown in Fig. 6, which can be converted into a "NAND" gate or a "NOR" gate by forcing one of the three inputs to logic 0 or 1.
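The behavior of a minority gate, and its relation to the majority gate, can be illustrated with a short sketch. The names maj and minority are illustrative; the code only assumes that a minority gate outputs the complement of the majority vote.

```python
from itertools import product

def maj(a, b, c):
    """Three-input majority vote on bits."""
    return (a & b) | (b & c) | (a & c)

def minority(a, b, c):
    """Three-input minority gate: complement of the majority vote."""
    return 1 - maj(a, b, c)

for a, b in product((0, 1), repeat=2):
    assert minority(a, b, 0) == 1 - (a & b)   # third input 0 gives NAND
    assert minority(a, b, 1) == 1 - (a | b)   # third input 1 gives NOR

# Majority is self-dual, so complementing all inputs of a majority gate
# gives the minority function (De Morgan's theorem applied to M).
for a, b, c in product((0, 1), repeat=3):
    assert minority(a, b, c) == maj(1 - a, 1 - b, 1 - c)
```

This duality is what allows a minority network to be derived from a synthesized majority network without a separate synthesis pass.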
Since minority logic networks can be easily derived from majority networks using De Morgan's theorem, we only focus on majority logic synthesis in this work. A straightforward way to implement majority logic is to first synthesize the Boolean functions using traditional methods to obtain simplified expressions in terms of "AND", "OR" and "NOT" logic. These logic units are then mapped one-to-one to majority "AND/OR" gates and inverters to produce the majority expression. However, this one-to-one mapping usually does not lead to optimized results in terms of the number of logic levels or the number of logic gates used in a circuit. The number of levels and gates are the most important performance criteria since they determine the circuit latency and size. Therefore, there is a strong need for developing an efficient majority logic synthesis method. The proposed method is described in detail in Sections III and IV.
III. TERMINOLOGY
A. Primitives
When trying to convert four-variable functions to corresponding majority expressions, the first step is to determine the Boolean logic functions that can be represented by only one majority gate. These Boolean logic functions, called primitives, are the basic units for constructing other majority functions.
Depending on the number of variable-inputs to a majority gate, five types of primitives can be defined. The first type has no variable-input at all; it is logic 1 or 0 and is referred to as a constant type primitive ("C" type). The second type has a single variable-input such as a or c and is called a single type primitive ("S" type). The third and fourth types of primitives both have two variable-inputs, which are called the "AND" type and the "OR" type depending on the third input being a 0 or 1. The last type of primitive has three variable-inputs and is called the "T" type primitive [16] , [20] . Primitives for three-variable Boolean logic functions can be defined in the same way, which are actually a subset of four-variable primitives. All five types of primitives can be realized with one majority gate. Their K-map implementations are shown in Fig. 7 .
Since any majority logic function can be constructed from primitives, it is important to enumerate all of them. Two primitives, logic 1 and logic 0, belong to the "C" type, and eight primitives belong to the "S" type: the single variable inputs a, b, c, d and their complements a', b', c' and d'. "AND/OR" type primitives are all based on majority gates with two variable-inputs. Therefore, any combination of two inputs from a, b, c, d, a', b', c' and d' can be used. The only combinations that are not allowed are the four pairs a&a', b&b', c&c' and d&d'. Each allowed combination appears once as an "AND" primitive and once as an "OR" primitive. The total number of "AND/OR" primitives is calculated as given in
$$ 2 \times \left[ \binom{8}{2} - 4 \right] = 2 \times 24 = 48. $$
All three inputs of the "T" type primitives are variables. So any three out of the four variables a, b, c and d can be selected as inputs, and each input to the majority gate can appear as itself or its complement. The number of possible different "T" type primitives is given in
$$ \binom{4}{3} \times 2^{3} = 4 \times 8 = 32. $$
Thus, the total number of primitives for four-variable functions is determined to be 90, as given in 10 + 48 + 32 = 90. 
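The count of 90 can be cross-checked by enumerating the primitives as 16-bit truth tables. This is a minimal sketch under assumed conventions: bit m of each integer holds the function value for minterm m, with variable a as the most significant index bit. The helper names are illustrative.

```python
from itertools import combinations, product

FULL = 0xFFFF  # truth table of constant 1 over four variables

def variable(bit):
    """16-bit truth table of the variable stored at the given index bit."""
    return sum(1 << m for m in range(16) if (m >> bit) & 1)

def maj(x, y, z):
    """Bitwise majority of three truth tables."""
    return (x & y) | (y & z) | (x & z)

def literals(v):
    """A variable and its complement."""
    return (v, ~v & FULL)

VARS = [variable(bit) for bit in (3, 2, 1, 0)]  # a, b, c, d

# "C" and "S" types: two constants and eight literals.
c_and_s = {0, FULL} | {lit for v in VARS for lit in literals(v)}

# "AND"/"OR" types: two literals of different variables plus a constant.
and_or = set()
for u, v in combinations(VARS, 2):
    for x, y in product(literals(u), literals(v)):
        and_or.add(maj(x, y, 0))
        and_or.add(maj(x, y, FULL))

# "T" type: three literals of three different variables.
t_type = set()
for u, v, w in combinations(VARS, 3):
    for x, y, z in product(literals(u), literals(v), literals(w)):
        t_type.add(maj(x, y, z))

all_primitives = c_and_s | and_or | t_type
```

Because the sets hold truth tables rather than expressions, the final union also confirms that no two primitives implement the same function.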
A list of primitives for four-variable functions is shown in Table I, in which the first 10 entries belong to the first two types of primitives, the entries labeled from 11 to 58 belong to the "AND/OR" type primitives, and the remaining entries are "T" type primitives.
B. Majority Expressions Lookup Table
In this section, a MLUT is developed which includes optimized majority expressions for all four-variable Boolean logic functions. Since there are a total of 2^16 possible functions, there are 65,536 entries in the table. The majority expression in each entry is guaranteed to be optimal since it is obtained through a process of exhaustive search and update. Of these, ten entries require no majority gate and 80 entries can be realized with a single majority gate. These 90 primitives are shown in Table I. Any four-variable function can be constructed using a combination of these entries. The combination rule discussed in [13], [20] is briefly reviewed here through an example:
This function can be mapped to a K-map with a series of on-set squares (minterms that are contained in the function) and off-set squares (minterms that are not contained in the function), as shown in Fig. 8. Any feasible majority gate based solution for F (e.g., f1, f2 and f3 in Fig. 8) will cover each on-set square in the K-map of F twice or more and will cover each off-set square at most once. For example, the cell "0111", an on-set cell in the K-map of F, is covered twice, by f2 and f3. The cell "1000", an off-set cell, is covered once, by f2. One possible majority expression for F is: F = M(M(a, b, 0), M(a, b, 1), M(a, c, d)).
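The covering rule can be checked numerically for the majority expression quoted above. The sketch below assumes that a K-map cell label such as "0111" lists the values of a, b, c, d from left to right; the helper names are illustrative.

```python
def maj(a, b, c):
    """Three-input majority vote on bits."""
    return (a & b) | (b & c) | (a & c)

coverage = {}   # cell label -> number of sub-functions equal to 1 there
on_set = set()  # cells where F = M(f1, f2, f3) equals 1

for index in range(16):
    cell = format(index, "04b")        # "abcd", a is the leftmost bit
    a, b, c, d = (int(ch) for ch in cell)
    f1 = maj(a, b, 0)                  # a AND b
    f2 = maj(a, b, 1)                  # a OR b
    f3 = maj(a, c, d)
    coverage[cell] = f1 + f2 + f3
    if maj(f1, f2, f3):
        on_set.add(cell)

# On-set cells are covered at least twice, off-set cells at most once.
assert all(coverage[cell] >= 2 for cell in on_set)
assert all(coverage[cell] <= 1 for cell in coverage if cell not in on_set)
```

The two assertions restate the definition of a majority vote, which is why any valid choice of f1, f2 and f3 must satisfy the covering rule.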
The flowchart of the proposed algorithm for producing the MLUT is shown in Fig. 9. Definitions of terms used in the flowchart are as follows:
MLUT: optimized majority expressions lookup table;
L: order of iterations, initialized to 1;
N_L: number of entries in MLUT at iteration L;
P: the set of primitives;
f1, f2, f3: functions selected from P;
F: function constructed from f1, f2, f3;
i, j, k: loop variables.
The MLUT is initialized with the 90 primitives, i.e., L = 1, N_1 = 90, MLUT = 90 primitives. To obtain the function solutions using two levels of majority gates, all N_L entries in MLUT are copied to P. Three primitives f1, f2, f3 are then selected from P to construct a function F = M(f1, f2, f3). If F is not already a part of the MLUT, the new function F is added to the MLUT. If F is already in MLUT, it must have been constructed with other primitives: f1', f2' and f3'. If M(f1, f2, f3) is a better implementation, it replaces M(f1', f2', f3') for F in MLUT. The majority expressions are compared in terms of the following parameters:
1. minimal number of levels;
2. minimal number of majority gates;
3. minimal number of inverters;
4. minimal number of gate inputs.
Since the number of levels and gates determine the latency and the size of a circuit, they are typically given higher priority. The number of inverters also needs to be reduced because they may lead to additional complexity in building circuits. The gate input or fan-in is defined as the number of variable inputs to a gate. For example, M(a, b, c) has three gate inputs and M(a, b, 0) has only two gate inputs. In QCA circuits, the logic 1 and 0 are generated from external sources and they also have to be routed to their positions as the variable inputs do. However, 1 or 0 can sometimes be obtained from their neighboring 1s or 0s, which could reduce the routing complexity. As a result, the number of gate inputs is considered the lowest priority parameter.
Note that depending on particular characteristics of the implementation technology, it is always possible to reorder the priority of these criteria. For example, since inverters can be easily implemented in MQCA technology it is not critical to reduce the number of inverters.
When all possibilities of combining any three primitives in iteration L are exhausted, there are N_L functions in the MLUT. Since these N_L functions are obtained through an exhaustive search and update procedure, it is guaranteed that they are optimized in terms of the four parameters mentioned above and can be considered as primitives for next level functions. After the completion of each iteration, if N_L equals 65,536, all majority expressions are obtained in optimal form. Otherwise, the next iteration to identify other four-feasible functions with (L + 1) levels of majority gates must be explored by copying all N_L functions in MLUT to P and selecting f1, f2, f3 from the updated P. This process is repeated until all solutions of four-variable functions are obtained.
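The search-and-update loop can be sketched on a reduced problem. The code below is an illustration under stated assumptions, not the authors' implementation: it uses three variables instead of four so that the exhaustive search stays small, it seeds the table with constants and literals only (the one-gate primitives then appear in the first pass), and it compares candidates on a simplified cost (levels, gates) with gates counted per expression tree, ignoring inverters and gate inputs. All names are hypothetical.

```python
from itertools import combinations_with_replacement

N_VARS = 3
N_ROWS = 1 << N_VARS            # 8 truth-table rows
FULL = (1 << N_ROWS) - 1        # truth table of constant 1

def maj(x, y, z):
    """Bitwise majority of three truth tables."""
    return (x & y) | (y & z) | (x & z)

def variable(bit):
    """Truth table of the variable stored at the given index bit."""
    return sum(1 << row for row in range(N_ROWS) if (row >> bit) & 1)

def build_table():
    """Map each truth table to its best known (levels, gates) cost."""
    table = {0: (0, 0), FULL: (0, 0)}
    for bit in range(N_VARS):
        v = variable(bit)
        table[v] = (0, 0)
        table[~v & FULL] = (0, 0)
    # One pass per level: freeze the current entries (the set P), combine
    # every triple, then add new functions or update costlier entries.
    while len(table) < (1 << N_ROWS):
        snapshot = list(table.items())
        changed = False
        for (f1, c1), (f2, c2), (f3, c3) in combinations_with_replacement(snapshot, 3):
            f = maj(f1, f2, f3)
            cost = (1 + max(c1[0], c2[0], c3[0]), 1 + c1[1] + c2[1] + c3[1])
            if f not in table or cost < table[f]:
                table[f] = cost
                changed = True
        if not changed:
            break
    return table

table = build_table()
```

Tuples compare lexicographically, so levels take priority over gates; swapping the tuple order would model the gate-priority setting. A production version would also store the expression itself and the two remaining criteria.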
From this exhaustive search approach, 10,260 functions are obtained using two levels; 55,184 functions are obtained using three levels; and the remaining two functions require four levels. These two functions are the "exclusive OR" and "exclusive NOR" operations of four variables, as given in (7) and (8). Their K-map representations are shown in Fig. 10(a) and (b), from which it can be seen that the patterns of the K-maps are complex, so it is harder to select proper primitives to build them according to the combination rule. Equation (9) shows the optimized majority expression of (8) and Fig. 11 shows the corresponding K-maps. Since (7) and (8) are complementary to each other, the majority expression for (7) is the complementary form of (9)
Any synthesized node with four or fewer variables can then be targeted to the MLUT to obtain its optimally implemented majority expression. Although the table is built on four-variable functions, it can be used for nodes with fewer than four variables by treating one or more of the inputs as "don't care" values.
IV. METHODOLOGY AND VERIFICATION
A. Synthesis Method Overview
An overview of the proposed synthesis method is shown in Fig. 12 . The input to the algorithm is an arbitrary network of Boolean functions, and the output is an optimized majority expression network. The input network has to be preprocessed and decomposed to ensure each node contains four or fewer variables, i.e., a four-feasible network. Then the optimal majority expressions from MLUT can be mapped to the decomposed network to obtain the preliminary solution. The final step is to remove all redundancies that are present in the preliminary solution to obtain the final synthesized majority network. A verification method is also provided to ensure the correctness of the final output.
B. Preprocessing and Decomposition
Preprocessing provides a good start for decomposition by algebraically factoring out the common terms and removing the redundant terms from the Boolean equations. Decomposition will split the bigger nodes in the network into smaller nodes; thus it makes them easy to convert into majority expressions. Preprocessing and decomposition are done using SIS [19] by performing a sequence of operations. Improved preprocessing and decomposition methods based on [16] , [18] are proposed as shown in Figs. 13 and 14 , respectively. All "(x)" in these two figures can be replaced by "2", "3" and "4" to obtain two-feasible, three-feasible and four-feasible decomposed networks. None of these four decomposition methods present optimal results for all cases; therefore, all methods should be tried out to find the best solution. The decomposed network is not guaranteed to be optimal even if all of them are tried. However, these methods provide a substantial library of heuristic techniques for decomposition. 
C. Replacing Boolean Equations With Majority Expressions
After decomposition, each node in the network contains four or fewer variables, which means it has a corresponding optimized entry in MLUT. The initial majority expression network can be obtained by replacing all nodes with their majority expressions found in the MLUT. However this network can have redundancies that must be removed to obtain the final solution.
D. Redundancy Removal
When a large network is decomposed to a four-feasible network, a lot of redundant logic functions are generated during synthesis. The redundancy removal can greatly reduce the circuit complexity, and it can be done in several steps as shown in the example. Consider a network with three inputs a, b, c and one output f as given in
The first step is to remove repeated nodes. By selecting each node and comparing its original form and complementary form with the rest of the nodes, it can be found that [4] and [6] are identical while [4] and [5] are complementary to each other. So the content of node [6] is replaced by node name [4] and the content of node [5] is replaced by the complement of node [4]. Equation (11) is obtained by applying these substitutions
The second step is to simplify the nodes with duplicated inputs. By checking each node in the network, it can be seen that there is no such node.
The third step is to sweep nodes without any majority gate. Two nodes, [5] and [6], fall into this category. Since both of them are internal nodes rather than the primary output of the circuit, they can be eliminated and replaced. By deleting the last two rows in the network and substituting [5] and [6] with the complement of [4] and with [4], respectively, (12) is obtained
The last step is to optimize the use of inverters. In this network, one of the inputs in node [3] has two cascaded inverters and they cancel each other out. Another function of this step is to reduce the number of inverters. For example, f = M(0, a', b') will be rewritten as f = M(1, a, b)', which by De Morgan's theorem is the same function with one inverter fewer. However, this function is not reflected by this example since all the nodes are already in optimized forms. After completing the first iteration of redundancy removal, (13) is obtained
Apparently, the network is not yet fully simplified and more iterations of simplification are needed. In the second iteration, by removing duplicated inputs in node [3] , substituting [3] with [4] , and deleting nodes that contain no majority gates, the network becomes f = M (b, [4] , [4] ). In the third iteration, this circuit is finally optimized as given in
Compared to three levels and five majority gates with redundancies, the simplified function does not require any majority gate. It can be seen from this example that redundancy removal plays an important part in the synthesis process. When processing larger circuits, the removal process may take many iterations and be repeated until no further improvements can be made. After this step, a simplified majority logic network is generated.
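Some of the simplification rules used during redundancy removal can be sketched as a small rewriting function over expression trees. This is an illustrative sketch, not the authors' tool: expressions are assumed to be nested tuples of the form ("var", name), ("const", bit), ("not", e) and ("maj", e1, e2, e3), identical sub-expressions are detected only by structural equality, and the example network at the end is hypothetical rather than the one in the text.

```python
def neg(e):
    """Complement an expression, cancelling double inverters."""
    if e[0] == "not":
        return e[1]
    if e[0] == "const":
        return ("const", 1 - e[1])
    return ("not", e)

def simplify(e):
    """Apply M(x, x, y) = x and M(x, x', y) = y bottom-up."""
    kind = e[0]
    if kind in ("var", "const"):
        return e
    if kind == "not":
        return neg(simplify(e[1]))
    x, y, z = (simplify(arg) for arg in e[1:])
    for p, q, r in ((x, y, z), (x, z, y), (y, z, x)):
        if p == q:           # duplicated input decides the vote
            return p
        if p == neg(q):      # complementary inputs cancel each other
            return r
    return ("maj", x, y, z)

# Hypothetical example: a node that votes on a sub-network and its complement.
a, b = ("var", "a"), ("var", "b")
node4 = ("maj", a, b, ("const", 0))
f = ("maj", b, node4, ("not", node4))
simplified = simplify(f)
```

Here the two complementary inputs cancel and the output reduces to the remaining input, so no majority gate is needed. Because children are simplified before their parent, one bottom-up pass applies these particular rules exhaustively; node sharing and inverter minimization would need additional passes, as described in the text.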
E. Verification
In the proposed method, the majority expressions are first translated into a ".vhd" file. Then, with a proper test bench, simulations are used to verify the correctness of the majority equations. If the synthesized network generates the same outputs as the original specification for a comprehensive set of inputs, the two implementations are considered functionally equivalent.
The exhaustive test, using all possible combinations of inputs, is done on the circuits that include 15 or fewer inputs. For example, consider the benchmark circuit cm85a with 11 inputs and 3 outputs. Both its original "blif" file, which is directly obtained from the benchmark suite, and its synthesized majority equations are converted to ".vhd" files. The simulation waveforms are shown in Fig. 15. In the figure, the 11 bit signal cnt represents the 11 inputs. The signals l, m and n are the three outputs for the original "blif" file while output_l, output_m and output_n are the three outputs for the synthesized majority network. The 3 bit signal error indicates the errors of the three output signals (only a part of the simulation is shown in Fig. 15 in order to make the details visible). Any mismatch between the original outputs and the synthesized outputs will set the corresponding error signal to 1. For example, if there is a difference between output_n and n, the signal error[0] will be set to 1. The error signal is always 0 for all combinations of inputs of cm85a, showing that the synthesized result and the original specification match each other.
If there are too many inputs in a circuit, the exhaustive test is time consuming or even impossible to finish. In this case, pseudo-exhaustive testing provides a good balance between time and completeness by grouping the inputs to reduce the simulation possibilities. In this work, pseudo-exhaustive tests [27] are done on the circuits that have 16 or more inputs and each group includes a maximum of ten inputs. Consider a circuit with 24 inputs, in which the first 20 inputs will be put into the first two groups and the remaining four inputs will be included in the third group. When the simulation starts, the first group is selected to do the exhaustive test, while the remaining 14 inputs are set to 0. Then the next ten inputs are selected, while the other inputs are set to 0. The test ends when all groups have been selected once.
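The vector generation described above can be sketched as follows. This is a software illustration only (the paper performs the check by VHDL simulation); the function names and the callable-based interface are assumptions, while the thresholds of 15 inputs and ten inputs per group follow the text.

```python
from itertools import product

def maj(a, b, c):
    """Three-input majority vote on bits."""
    return (a & b) | (b & c) | (a & c)

def test_vectors(n_inputs, exhaustive_limit=15, group_size=10):
    """Yield input tuples: exhaustive for small circuits, grouped otherwise."""
    if n_inputs <= exhaustive_limit:
        yield from product((0, 1), repeat=n_inputs)
        return
    for start in range(0, n_inputs, group_size):
        width = min(group_size, n_inputs - start)
        for bits in product((0, 1), repeat=width):
            vector = [0] * n_inputs          # inputs outside the group stay 0
            vector[start:start + width] = bits
            yield tuple(vector)

def equivalent(reference, candidate, n_inputs):
    """Compare two single-output functions on the generated vectors."""
    return all(reference(v) == candidate(v) for v in test_vectors(n_inputs))

# Example: a two-input AND against its majority-gate realization.
and_matches = equivalent(lambda v: v[0] & v[1],
                         lambda v: maj(v[0], v[1], 0),
                         n_inputs=2)
```

For the 24-input example this produces 1024 + 1024 + 16 = 2064 vectors instead of 2^24, which shows the time saving and also why the grouped test is weaker than a truly exhaustive one.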
The exhaustive/pseudo-exhaustive functional test has been done on all benchmark solutions considered in this work and all verifications ran successfully.
V. RESULTS AND COMPARISON
In this section, an overall comparison between the best existing majority logic synthesis method [18] and the proposed method is demonstrated. The benchmark circuit cm85a from MCNC benchmark suite is explained in detail for the purpose of comparison. The circuit equations are shown in (15) . It includes three primary outputs: l, m and n, which are in braces and 11 primary inputs a, b, c, d, e, f, g, h, i, j and k. The rest of the variables are internal nodes.
The SIS tool is used to simplify and decompose the circuit as mentioned above. By applying the corresponding commands, a four-feasible network is produced as shown in (16) 
Each node in the decomposed network has a corresponding entry in the MLUT. By replacing all nodes with their majority expressions and removing all redundancies, the set of equations in (17) is obtained. The results from [18] are shown in (18). The equivalent majority gate circuits for the proposed work and [18] are shown in Figs. 16 and 17, respectively. The circuit in Fig. 16 uses six levels and 19 gates, whereas the circuit in Fig. 17 uses six levels and 26 gates.
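As a worked check of the cm85a figures quoted above, the gate and level reductions can be computed directly. The calculation assumes that reduction is measured relative to the result of [18]; the exact definition used in Table II is not stated in this section.

```latex
\text{gate reduction} = \frac{26 - 19}{26} = \frac{7}{26} \approx 26.9\%,
\qquad
\text{level reduction} = \frac{6 - 6}{6} = 0\%.
```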
Of the 40 benchmarks, the proposed method produces the same results as [18] for 11 benchmarks. For the remaining 29 benchmarks, the comparison results are shown in Table II. We also compare our results with the method in [16]. In Table II, the first column lists the names of the benchmarks. The columns under the titles "Method [18]" and "Method [16]" show the results for the corresponding benchmarks obtained from [18] and [16]. The columns titled "Priority: level" and "Priority: gate" show the results obtained from the proposed method using either the number of levels or the number of gates as the first priority. The "Reduction%" columns compare the proposed method with the methods in [18] and [16] and give the percentage reductions.
It can be seen from the table that when the priority is to reduce the number of levels, there is an average reduction of 7.0% in the number of levels. At the same time, the number of gates is reduced by 6.3% when compared to the method in [18]. When the priority is to optimize the number of gates, there is an average reduction of 9.5% in the number of gates as compared to [18]. This optimization priority also reduces the number of levels by 0.8%. When compared to [16], the savings from the proposed method are even more significant. For optimization targeted for levels, the average reduction in levels is 12.3% and the reduction in gate counts is 21.3%. For optimization targeted to gates, the average reduction in gate counts is 24.1% and the reduction in levels is 6.3%. Since the synthesis results are independent of the technology used, they are effective for any majority/minority based technologies including QCA, SET and TPL.
VI. CONCLUSION
A new approach to generate optimal majority expressions is presented in this paper. The corresponding minority network can be easily obtained by complementing the majority expression. An important part of this approach is the creation of a MLUT, which includes optimal majority expressions for Boolean functions with four or fewer variables. A majority expression network is obtained by decomposing Boolean expressions and replacing each node with the corresponding entry in the MLUT. Redundancy removal and verification methods are also 
