It is a well-known fact in logic design that synthesis of some special classes of Boolean functions is often easier than synthesis of a general unrestricted specification. In reversible logic, synthesis methods that scale well and yield implementations of reasonably small cost have been found for only a few classes of functions. These include multiple-output symmetric functions and reversible linear functions.
Introduction
Reversible logic implementations are those in which the values of the input variables can be deduced from the output values. Because no information is lost, information loss does not contribute to heat dissipation in these circuits [4, 13]. They can therefore help with at least two problems: overheating and power consumption, the latter implying longer battery life.
The reversible logic solution may be especially important in low-voltage designs of mobile systems, where both power saving and overheating are very important due to the need for light weight and independent power supply.
Reversible implementations have applications in quantum computing [12, 25] and nanotechnology [19, 20]. Quantum technology has received significantly more attention, and is often considered to be the most promising application for reversible computations. Consequently, in this paper, we calculate quantum costs for the presented designs. Numerous applications requiring reversible implementations have resulted in the appearance of multiple reversible synthesis papers, e.g. [2, 9, 10, 11, 15, 17, 21, 24, 26, 28, 30, 32]. Indeed, once a technology is discovered, the next step towards employing it is the creation of useful applications and the synthesis of the corresponding circuits.
General-purpose reversible synthesis methods [2, 9, 10, 11, 15, 17, 21, 24, 30, 32] usually produce large and thus, likely, technologically expensive specifications. These methods, especially their heuristic parts, scale poorly. In fact, the largest benchmark function for which a heuristically synthesized circuit has been reported has only 20 inputs and 20 outputs [17]. Most of the above methods, except [15, 24, 32], target the synthesis of reversible specifications only. This limits their applicability to the synthesis of useful benchmark functions, since the latter are usually specified irreversibly. Finding a reversible specification that contains a given irreversible specification and can be used effectively by one or another synthesis approach is difficult, and no reasonably good solution to this problem has yet been found. Synthesis approaches that work with irreversible specifications have been proposed [15, 24, 32]. The synthesis method suggested in [32] has been neither implemented nor tested. The synthesis approach in [15] works only with functions having up to 10 input variables. The methods discussed in [24] allow synthesis of larger irreversible specifications. However, their application sometimes results in a significant number of garbage bits and in wide, expensive gates (and hence a high overall technological cost), and it fails to exploit the interdependence of bits in a reversible circuit.
Given the above problems with the synthesis of a general non-restricted specification, it is a good idea to synthesize reversible circuits for particular classes of functions. Linear reversible functions with n input/output variables were synthesized with $O(n^2/\log n)$ reversible/quantum CNOT operations [26]. This synthesis requires no garbage and is asymptotically optimal.
Symmetric functions were first synthesized as a separate class in [28]. The reversible gate count for an n-input m-output symmetric function synthesized by this method is $n^2/2 + mn$. However, that implementation uses excessive garbage. This will likely prevent its synthesis results from being used in quantum technology: the most advanced of the existing quantum technologies is liquid NMR [1, 8], which imposes a strict limit on the number of qubits allowed in a single computation. Hence, minimization of garbage is an important synthesis criterion in such applications [8].
We also note that an inexpensive quantum realization of the Kerntopf gates used in [28] has never been found. In particular, it was recently shown [22] that an optimal quantum implementation of the Kerntopf gate requires 14 elementary operations in the well-studied [14] quantum gate library composed of NOT, CNOT, and controlled-sqrt-of-NOT gates (the NCV library). An optimal NCV quantum implementation of the Toffoli gate used in this work requires only 5 such operations.
[28] mentions an approach for extending a non-symmetric Boolean specification into a larger but symmetric one. This is very useful because it makes it possible to synthesize any function by first "symmetrizing" it through the addition of new input variables and then synthesizing the extended symmetric specification. Thus, a good reversible synthesis procedure for symmetric specifications may be of interest for general reversible synthesis. This further motivates research into ways of constructing inexpensive circuits for symmetric functions.
In this paper we present a synthesis method with the reversible gate count of at most
, quantum implementation cost of at most
mn) and at most 2n − 2 bits of garbage; and its modification with the reversible gate count of at most
+ mn * log(n)) and garbage of at most n + k − 1, where k = 2 log n .
Preliminaries

Reversible Circuits
Reversible logic design differs significantly from conventional logic design. A reversible circuit must be composed of reversible gates. In addition to the reversibility of gates, the "no fan-out" and "no feedback" restrictions [25] apply.
This leaves us with the cascade as the only possible structure. The circuit diagrams are built in the popular notations, such as those used in [25] . In short, horizontal 
The set C which controls the change of the j-th bit is called the set of controls and t is called the target. [6] is often termed a CNOT gate. In quantum technology, these gates are used as basic building blocks; therefore, we associate a quantum cost of one with each such gate.
is usually referred to as a Toffoli gate [31]. The Toffoli gate is a composite gate, and in quantum technology it is simulated with 5 elementary operations [25]. The NOT, CNOT and Toffoli gates are depicted in Figure 1. Quantum implementations of larger Toffoli gates have also been reported [3, 18]. For the purpose of later quantum cost calculations, we note that a Toffoli gate with 3 controls can be simulated with 13 (or 15, depending on the set of basic quantum operations chosen) quantum operations [3].
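The gate costs quoted above can be turned into a simple weighted gate count. The following is a minimal sketch, not the costing tool used for the paper: the circuit encoding as `(controls, target)` pairs and the names `COST_BY_CONTROLS` and `weighted_quantum_cost` are our own assumptions, and the NOT gate is assumed to cost one as an elementary operation of the NOT, CNOT, controlled-sqrt-of-NOT library.

```python
# Quantum cost keyed by the number of controls, using the figures quoted in
# the text: CNOT = 1, Toffoli = 5, Toffoli with 3 controls = 13.
# NOT = 1 is an assumption (it is an elementary gate of the library).
COST_BY_CONTROLS = {0: 1, 1: 1, 2: 5, 3: 13}


def weighted_quantum_cost(circuit):
    """Sum per-gate costs; circuit is a list of (controls, target) pairs,
    where controls is a tuple of line indices (hypothetical encoding)."""
    total = 0
    for controls, _target in circuit:
        total += COST_BY_CONTROLS[len(controls)]
    return total


# One gate of each kind: NOT, CNOT, Toffoli, Toffoli with 3 controls.
example = [((), 0), ((0,), 1), ((0, 1), 2), ((0, 1, 2), 3)]
print(weighted_quantum_cost(example))  # 1 + 1 + 5 + 13 = 20
```

Such a count is only an upper bound on the cost of a circuit, since it prices every gate in isolation.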
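Multiple Output Symmetric Functions
Definition 2. A multiple output Boolean function $\vec{F}(x_1, x_2, ..., x_n) = (y_1, y_2, ..., y_m)$ is symmetric if $\vec{F}(x_1, x_2, ..., x_n) = \vec{F}(\pi(x_1, x_2, ..., x_n))$ for any permutation π of its n inputs.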
A single output symmetric function with n inputs can be defined by its carry vector 
Example 1. Over the set of 3 variables {x 1 , x 2 , x 3 }, the σ-functions are those as listed below
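As an illustration, the sketch below enumerates σ-functions over three variables. It assumes that the degree-k σ-function is the EXOR of all product terms of degree k, which is how the proof of Lemma 2 describes it; the helper names `sigma_terms` and `sigma_value` are ours, and whether the constant σ-function of degree 0 belongs in the list is left open.

```python
from itertools import combinations


def sigma_terms(k, variables):
    """Product terms of the degree-k sigma-function: all k-subsets of the variables."""
    return [tuple(c) for c in combinations(variables, k)]


def sigma_value(k, bits):
    """Evaluate the degree-k sigma-function on a 0/1 assignment (EXOR of products)."""
    value = 0
    for idx in combinations(range(len(bits)), k):
        value ^= int(all(bits[i] for i in idx))
    return value


names = ["x1", "x2", "x3"]
for k in range(1, 4):
    print(k, " ⊕ ".join("".join(t) for t in sigma_terms(k, names)))
# 1 x1 ⊕ x2 ⊕ x3
# 2 x1x2 ⊕ x1x3 ⊕ x2x3
# 3 x1x2x3
```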
The following two lemmas are well-known results employing the σ-functions. A linear combination of the σ-functions is, in fact, equivalent to the Positive Polarity Reed-Muller expansion (PPRM) [29], an EXOR polynomial with all literals appearing in the positive polarity. A more general object, Fixed Polarity Reed- ) storage space [5]. This includes finding the PPRM, which is equivalent to a linear combination of σ-functions. Similarly, an FPRM is a linear combination of polarized σ-functions. However, we will use the PPRM when constructing the reversible implementations of symmetric functions. This is because the bottleneck of our approach is the construction of the largest-degree σ-function needed to implement a given symmetric function. Moreover, it can be shown that the degree of the largest-degree σ-function participating in the expansion does not decrease when FPRMs are considered instead of the plain PPRM.
Proof. Observe that the first part of the right-hand side has all the terms of degree k that include the variable $x_n$ as a factor. The second part has all the terms of degree k that do not include the variable $x_n$. Thus, the right-hand side has all the terms of degree k, which, according to the definition, form the left-hand side.
Example 3. We illustrate Lemma 2 for n = 4 and k = 2:
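Read off the proof above, the recurrence splits the degree-k terms over n variables into those containing $x_n$ and those that do not; for n = 4 and k = 2 this would read $\sigma^2_4 = x_4 (x_1 \oplus x_2 \oplus x_3) \oplus (x_1 x_2 \oplus x_1 x_3 \oplus x_2 x_3)$. The sketch below checks this reading exhaustively. It is our reconstruction of the recurrence from the proof, not a quotation of the lemma, and the names `sigma` and `recurrence_holds` are hypothetical.

```python
from itertools import combinations, product


def sigma(k, bits):
    """EXOR of all degree-k products of the given bits (degree 0 gives the constant 1)."""
    value = 0
    for idx in combinations(range(len(bits)), k):
        value ^= int(all(bits[i] for i in idx))
    return value


def recurrence_holds(n, k):
    """Check sigma^k_n = x_n * sigma^(k-1)_(n-1) XOR sigma^k_(n-1) on every input."""
    for bits in product((0, 1), repeat=n):
        head, last = bits[:-1], bits[-1]
        rhs = (last & sigma(k - 1, head)) ^ sigma(k, head)
        if sigma(k, bits) != rhs:
            return False
    return True


print(recurrence_holds(4, 2))  # True
```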
Each function $\sigma^k_n(x_1, x_2, ..., x_n)$ is symmetric. Therefore, it can be described by a subset $M_k \subset \{0, 1, 2, ..., n\}$ of the input weights for which its output equals 1. In other words, $M_k$ is the set of indices corresponding to the unit values of the carry vector
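On an input of weight w, exactly C(w, k) of the degree-k product terms evaluate to 1, so the degree-k σ-function outputs C(w, k) mod 2. The sketch below computes the set of such weights directly and via the standard binary-digit criterion for the parity of binomial coefficients; the function names are our own, and the example uses n = 7 purely for illustration.

```python
from math import comb


def unit_weights(k, n):
    """Weights w in 0..n on which the degree-k sigma-function equals 1,
    i.e. those for which C(w, k) is odd."""
    return [w for w in range(n + 1) if comb(w, k) % 2 == 1]


def unit_weights_bitwise(k, n):
    """Same set via binary digits: C(w, k) is odd iff every 1-bit of k is also set in w."""
    return [w for w in range(n + 1) if k & w == k]


for k in (1, 2, 3, 4):
    print(k, unit_weights(k, 7))
# 1 [1, 3, 5, 7]
# 2 [2, 3, 6, 7]
# 3 [3, 7]
# 4 [4, 5, 6, 7]
```

Note that for k a power of two the set consists of the weights whose corresponding binary digit is 1.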
According to Kummer's Theorem [7] maximal j : ( This observation allows us to formulate the following useful result. Note that, from the point of view of applications, it makes sense to simplify expression (1) by applying techniques such as EXORCISM [23]. In particular, the designs of functions sym12 and sym15 shown in Table 5 will benefit from such simplification.
5 Toffoli5 gates used in the design of sym12 can be replaced with 1 Toffoli5 gate and
). Analogously, 6 Toffoli5 gates used in the design of sym15 can be replaced with 2 Toffoli5 and 2 CNOT gates (becausex 1 x 2x3 x 4 ⊕x 1 x 2 x 3x4 ⊕
Reversible Synthesis of Multiple Output Symmetric Functions
Our approach to the reversible synthesis of symmetric Boolean functions is as follows.
We first consider a symmetric Boolean function defined by its carry vector. We next transform the carry vector into a linear combination of σ-functions (PPRM) using the procedure discussed in [5]. This is followed by the construction of all needed σ-functions and the synthesis of their linear combinations using simple modulo-2 addition (through application of CNOT gates or the technique discussed in [26]). The above explains everything except how to construct the σ-functions and which of them to construct. This is discussed next.
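The transformation from a carry vector to σ-function coefficients can be sketched as solving a triangular system over GF(2). This is not the procedure of [5], only a small stand-in under two assumptions: that entry w of the carry vector is the function value on inputs of weight w, and that the degree-0 σ-function is the constant 1. The name `sigma_coefficients` is hypothetical.

```python
from math import comb


def sigma_coefficients(carry):
    """Return a[0..n] such that f = XOR_k a[k] * sigma^k, given carry[w] = f on weight w.

    Uses the fact that sigma^k equals C(w, k) mod 2 on inputs of weight w,
    so the system is lower triangular with ones on the diagonal."""
    a = []
    for w, value in enumerate(carry):
        acc = value
        for k in range(w):
            acc ^= a[k] & (comb(w, k) % 2)
        a.append(acc)
    return a


# Majority of three inputs is 1 on weights 2 and 3; it equals sigma^2 alone.
print(sigma_coefficients([0, 0, 1, 1]))  # [0, 0, 1, 0]
# Parity of three inputs is 1 on odd weights; it equals sigma^1 alone.
print(sigma_coefficients([0, 1, 0, 1]))  # [0, 1, 0, 0]
```

The largest index k with a nonzero coefficient is the degree of the largest σ-function that has to be constructed.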
The statement of Lemma 2 also holds for k = 1, so the result for k = 1 can be used to calculate $\sigma^1_n(x_1, x_2, ..., x_n)$. We suggest using Lemma 2 as a basis for the following dynamic programming algorithm which calculates the set of σ-functions can be realized with:
• at most m NOT gates, at most n + mn − 1 CNOT gates, and at most
Toffoli gates;
• at most n + m − 1 garbage bits;
• quantum implementation cost of at most
This reduces the number of garbage bits to 2n − m − 1.
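A dynamic-programming construction in the spirit of Lemma 2 can be sketched as a gate-level simulation. This is our own illustrative layout, not the circuit of the paper: it keeps all inputs intact and places each σ-function on a fresh zero-initialised line, so its garbage count is higher than the bounds stated above. The names `sigma_circuit` and `run` and the `(controls, target)` encoding are assumptions.

```python
from itertools import product
from math import comb


def sigma_circuit(n, kmax):
    """Gate list that computes the sigma-functions of degree 1..kmax of lines
    0..n-1 onto lines n..n+kmax-1 (line n+k-1 ends up holding degree k)."""
    gates = []
    for i in range(n):  # fold in one more input variable per step
        for k in range(min(i + 1, kmax), 1, -1):
            # Toffoli: degree-k line ^= x_i AND degree-(k-1) line. Descending k
            # ensures the degree-(k-1) line still holds its previous-step value.
            gates.append(((i, n + k - 2), n + k - 1))
        gates.append(((i,), n))  # CNOT: degree-1 line ^= x_i
    return gates


def run(gates, lines):
    """Apply NOT/CNOT/Toffoli-style gates to a list of 0/1 line values."""
    lines = list(lines)
    for controls, target in gates:
        if all(lines[c] for c in controls):
            lines[target] ^= 1
    return lines


gates = sigma_circuit(4, 4)
print(len(gates))  # 4 CNOT gates + 6 Toffoli gates = 10
print(run(gates, [1, 1, 0, 1, 0, 0, 0, 0])[4:])  # weight 3 -> [1, 1, 1, 0]
```

Passing a smaller `kmax` stops the construction at the largest degree actually needed, which is where the savings for low-degree expansions come from.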
If all the outputs can be composed using the first k σ-functions, there is no need to create the remaining (n − k) σ-functions. This observation allows us to formulate the following result (the proof is analogous to that of the previous theorem).
Theorem 3. Every symmetric multiple output function
− → F (x 1 , x 2 , ..., x n ) = (y 1 , y 2 , ..., y m ),
such that its linear σ-function decomposition requires a σ-function of maximal degree k (2 ≤ k ≤ n) can be realized with:
Toffoli gates;
• at most n + k − 2 garbage bits;
• and quantum implementation cost of at most
Written as a Boolean formula this expression is ( can be reused for the output and thus new output rails $y_1$, $y_2$ and $y_3$ need not be introduced. Observe that a gate affecting a garbage bit whose changed value is not used by the design afterwards (the gate colored gray in Figure 2) can be deleted from the circuit without changing the output of the target function. The resulting circuit contains 12 gates.
In our further designs, if a gate affects a garbage bit whose changed value is not used in the circuit to affect useful output bits afterwards, it can be deleted from the design. This trivial procedure brings some simplification in almost every case.
Also note that the quantum cost of the boxed parts in the second circuit in Figure 2 is 4 (instead of 6 = 5 + 1 as one would expect), an implementation that was known to Peres [27]. Once these considerations are taken into account, the final quantum implementation cost will be lower than stated in the above theorems. We also suggest using the templates [18] to further reduce the quantum costs of all presented designs.
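The Peres saving can be folded into a weighted gate count with a simple greedy scan. The sketch below assumes that a Peres-type pair is a Toffoli gate adjacent to a CNOT gate acting on the Toffoli's two control lines, in either order; the encoding and the names `is_peres_pair` and `cost_with_peres` are hypothetical, and the scan does not reorder gates, so it may miss pairs that a template-based optimizer would find.

```python
# Per-gate costs by number of controls (NOT = 1 is an assumption).
COST_BY_CONTROLS = {0: 1, 1: 1, 2: 5, 3: 13}


def is_peres_pair(g1, g2):
    """True if one gate is a Toffoli and the other a CNOT whose control and
    target are exactly the Toffoli's two control lines."""
    (c1, _t1), (c2, t2) = g1, g2
    if len(c1) == 1 and len(c2) == 2:
        # Swap so that c1 describes the Toffoli and (c2, t2) the CNOT.
        (c1, _t1), (c2, t2) = g2, g1
    if len(c1) != 2 or len(c2) != 1:
        return False
    return set(c1) == {c2[0], t2}


def cost_with_peres(circuit):
    """Weighted gate count in which each adjacent Peres-type pair costs 4, not 6."""
    total, i = 0, 0
    while i < len(circuit):
        if i + 1 < len(circuit) and is_peres_pair(circuit[i], circuit[i + 1]):
            total += 4
            i += 2
        else:
            total += COST_BY_CONTROLS[len(circuit[i][0])]
            i += 1
    return total


print(cost_with_peres([((0, 1), 2), ((0,), 1)]))  # 4
print(cost_with_peres([((0, 1), 2), ((2,), 1)]))  # 6
```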
Next, observe that using large Toffoli gates allows synthesis of symmetric specifications with a small number of outputs m, m ≺ n log(n) with smaller reversible gate count, garbage, and sometimes smaller quantum cost. Using Theorem 1 allows to
, first and then use at most n Toffoli gates with log(n) controls and at most s control bit negations to construct each of the m outputs of a given symmetric specification. The following Theorem summarizes this result.
Theorem 4. Every symmetric multiple output function
can be realized with:
• at most m NOT gates, at most n CNOT gates, at most
Toffoli gates, and at most nm Toffoli gates with log(n) controls;
• at most n + k − 1 garbage bits;
• quantum implementation cost of at most
.
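The idea behind this modification can be sketched as follows, under our reading of the construction (the statement of Theorem 1 is not reproduced here, so this is an assumption): the σ-functions whose degree is a power of two spell the input weight in binary, and each output is then an EXOR of one minterm per weight in its on-set, each minterm corresponding to one multiple-control Toffoli gate with some controls negated. The names `weight_bits` and `output_from_weight_bits` are hypothetical.

```python
from itertools import product
from math import comb


def weight_bits(bits):
    """Values of the sigma-functions of degree 1, 2, 4, ... on the input,
    computed as C(w, 2^i) mod 2; least significant bit first."""
    n, w = len(bits), sum(bits)
    out, p = [], 1
    while p <= n:
        out.append(comb(w, p) % 2)
        p *= 2
    return out


def output_from_weight_bits(wbits, carry):
    """EXOR of one minterm over the weight bits per weight w with carry[w] == 1."""
    y = 0
    for w, value in enumerate(carry):
        if value:
            minterm = all(wbits[i] == (w >> i) & 1 for i in range(len(wbits)))
            y ^= int(minterm)
    return y


carry = [0, 0, 1, 0, 0, 1]  # illustrative 5-input function: 1 on weights 2 and 5
x = (1, 0, 1, 0, 0)         # weight 2
print(weight_bits(x))                                  # [0, 1, 0]
print(output_from_weight_bits(weight_bits(x), carry))  # 1
```

Since at most n + 1 weights exist and only about log(n) weight bits are needed, each output requires few wide gates, which is what makes this variant attractive when the number of outputs is small.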
Comparison of the Results
Several methods have been proposed in the literature for the reversible design of multiple output Boolean functions. We compare our results to those of the RPGA method by Perkowski et al. [28] (a method designed to synthesize symmetric functions with reversible gates), reversible wave cascades [24], Khan gate family synthesis [10, 11], the generalized Toffoli gates family [15] and the design of Toffoli circuits using templates [17]. The comparison consists of three parts: garbage, the number of gates in the reversible cascade, and quantum cost.
Unfortunately, [28] do not provide a (calculated in [15]), when the presented methods have the garbage of maximum (2n − 2). A good quantum realization of the Kerntopf gates used in [28] was never found; we therefore claim that, from the point of view of quantum cost, our method produces quantum circuits that are cheaper by a constant factor (= 14/5 in the quantum NOT, CNOT, controlled-sqrt-of-NOT gate library).
The comparison to the reversible wave cascades [24] (RWC columns), Khan gate family synthesis [11] (KGF columns) and generalized Toffoli gates family [15] (GT columns) reversible synthesis results is summarized in Table 5. Actual circuits for our designs can be found in [16].
The comparison in Table 5 is not quite fair. On the one hand, RWC, KGF and GT are general synthesis methods, which do not use special properties of the functions. On the other hand, the gate libraries of these methods are larger by an order of magnitude than the set of gates used in the presented method.
It can be seen that our method produces better results for larger functions in terms of both reversible cost and garbage. The presented method can never beat the generalized Toffoli gates family synthesis method in terms of the number of garbage bits, since the latter uses the theoretically minimal number of garbage bits. However, the GT method scales badly: it can produce circuits only for functions with no more than 10 inputs. The RWC and KGF circuits are synthesized heuristically, and these methods are expected to run into problems when scaled.
We were not able to compare the quantum costs of the presented designs to those of earlier methods because the actual circuits were impossible or hard to obtain. However, in a few cases the comparison of quantum costs could be made. [17] gives an example of a circuit for the rd53 function with 12 reversible gates, which seems to be the smallest (in terms of reversible gate count) among all known. The generalized Toffoli gates used in [17] are expensive (yet, generally, less expensive than the gates in RWC, KGF and GT), and the quantum cost calculation based on [16] reports a quantum cost of 120 for that realization. In the same costing metric, our 12-gate realization of rd53 has a quantum cost of only 36, which is more than 3 times lower.
[15] presents a circuit for rd53 with cost 232; again, our cost of 36 compares favorably.
Another interesting comparison can be made using the 2of5 function. The realization in [15] uses 7 reversible gates, compared to 12 in the present paper. However, the quantum cost of their realization is 158, compared to 32 for our circuit. Thus, we find our realization to be potentially more practical. This example also serves as a good illustration of the thesis [22] that a small number of reversible gates does not imply a cheap technological implementation.
Secondly, we synthesized reversible circuits for some symmetric functions or symmetric components of some benchmarks [5] whose reversible implementations were never reported before. The results can be found in Table 5. the quantum cost of the circuits. We stress that this is an upper bound, because the listed number is the weighted gate count, which, for instance, does not take into account that some Toffoli-CNOT sequences added to the total as 6 are Peres gates with cost 4. Further simplification could be achieved by dropping the gates that affect only garbage outputs and by applying local optimization techniques [18].
Conclusion
In this paper we presented an efficient reversible/quantum synthesis method for the class of multiple output symmetric Boolean functions. Compared to the best previously reported method targeting the synthesis of symmetric Boolean functions, our method uses simpler gates (resulting in technologically preferable circuit specifications) and requires significantly fewer garbage bits. We compared our designs to those presented previously and found that our circuits are smaller. We presented reversible implementations for some well-known symmetric benchmark functions whose reversible circuits were never reported before. Further development of our synthesis approach includes optimization of the presented circuits and synthesis of almost symmetric functions (which is, likely, a separate problem rather than a trivial extension of the presented technique).
