Efficient Synthesis of Linear Reversible Circuits by Patel, K. N. et al.
arXiv:quant-ph/0302002v1 3 Feb 2003
Efficient Synthesis of Linear Reversible Circuits
Ketan N. Patel, Igor L. Markov and John P. Hayes
University of Michigan, Ann Arbor 48109-2122
{knpatel,imarkov,jhayes}@eecs.umich.edu
Abstract. In this paper we consider circuit synthesis for n-wire linear reversible
circuits using the C-NOT gate library. These circuits are an important class of
reversible circuits with applications to quantum computation. Previous algorithms,
based on Gaussian elimination and LU-decomposition, yield circuits with $O(n^2)$
gates in the worst case. However, an information theoretic bound suggests that it
may be possible to reduce this to as few as $O(n^2/\log n)$ gates.
We present an algorithm that is optimal up to a multiplicative constant, as well
as Θ(log n) times faster than previous methods. While our results are primarily
asymptotic, simulation results show that even for relatively small n our algorithm
is faster and yields more efficient circuits than the standard method. Generically
our algorithm can be interpreted as a matrix decomposition algorithm, yielding
an asymptotically efficient decomposition of a binary matrix into a product of
elementary matrices.
1 Introduction
A reversible or information lossless circuit is one that implements a bijective function,
or loosely, a circuit where the inputs can be recovered from the outputs and all out-
put values are achievable. A major motivation for studying reversible circuits is the
emerging field of quantum computation [6]. A quantum circuit implements a unitary
function, and is therefore reversible. Circuit synthesis for reversible computations is an
active area of research [2,4,7,9]. The goal in circuit synthesis is, given a gate library, to
synthesize an efficient circuit performing a desired computation. In the quantum con-
text, the individual gates correspond to physical operations on quantum states called
qubits, and therefore reducing the number of gates in the synthesis generally leads to a
more efficient implementation.
Linear reversible classical circuits form an important sub-class of quantum circuits,
which can be generated by a single type of gate called a C-NOT gate (see Figure 1c).
This gate is an important primitive for quantum computation because it forms a univer-
sal gate set when augmented with single qubit rotations [5]. Moreover, current quan-
tum circuit synthesis algorithms can generate circuits with strings of C-NOT gates, and
therefore more efficient synthesis for these classical linear reversible sub-circuits would
imply more efficient synthesis for the overall quantum computation.
In this paper we consider the problem of efficiently synthesizing an arbitrary linear
reversible circuit on n wires using C-NOT gates. This problem can be mapped to the
problem of row reducing an n × n binary matrix. Until now the best synthesis methods
have been based on standard row reduction methods such as Gaussian elimination and
LU-decomposition, which yield circuits with $O(n^2)$ gates [3]. However, the best lower
bound leaves open the possibility that synthesis with as few as $O(n^2/\log n)$ gates in
the worst case may exist [9].

(a) AND gate
a  b   a’
0  0   0
0  1   0
1  0   0
1  1   1

(b) NOT gate
a   a’
0   1
1   0

(c) C-NOT gate
a  b   a’  b’
0  0   0   0
0  1   0   1
1  0   1   1
1  1   1   0

Fig. 1. Examples of reversible and irreversible logic gates with truth tables a) AND gate
b) NOT gate c) C-NOT gate. Both the NOT and C-NOT gates are reversible while the
AND gate is not.
We present a new synthesis algorithm that meets the lower bound, and is therefore
asymptotically optimal up to a multiplicative constant. Furthermore, our algorithm is
also asymptotically faster than previous methods. Simulation results show that the pro-
posed algorithm outperforms previous methods even for relatively small n. Generically
our algorithm can be interpreted as a matrix decomposition algorithm, that yields an
asymptotically efficient elementary matrix decomposition of a binary matrix. General-
izations to matrices over larger finite fields are straightforward.
2 Background
We can represent the action of an n-input m-output logic gate as a function mapping the
values of the inputs to those of the outputs: $f : \mathbb{F}_2^n \to \mathbb{F}_2^m$.
Here $\mathbb{F}_2$ is the two-element field, and $\mathbb{F}_2^n$ is the set of all
n-dimensional vectors over this field. A gate is reversible if this function is bijective,
that is, f is one-to-one and onto. Intuitively, this means that the inputs can be uniquely
determined from the outputs and all output values are achievable. For example, the AND
gate (Figure 1a) is not reversible since it maps three input values to the same output
value. The NOT gate (Figure 1b), on the other hand, is reversible since both possible
input values yield unique output values, and both possible output values are achievable.
The controlled-NOT or C-NOT gate, shown in Figure 1c, is another important reversible
gate. This gate passes the first input, called the control, through unchanged and inverts
the second, called the target, if the control is a one. As its truth table shows, this gate
is reversible since it maps each input vector to a unique output vector and all output
vectors are achievable.
A reversible circuit is a directed acyclic combinatorial logic circuit where all gates
are reversible and are interconnected without fanout [9]. An example of a reversible
circuit consisting of C-NOT gates is shown in Figure 2. Note that, as is the case for
reversible gates, the function computed by a reversible circuit is bijective.
[Figure 2: circuit diagram on four wires, Input 0-3 to Output 0-3, with C-NOT gates labeled G1 through G6.]
Fig. 2. Reversible circuit example.
We say a circuit or gate, computing the function f, is linear if
$f(x_1 \oplus x_2) = f(x_1) \oplus f(x_2)$ for all $x_1, x_2 \in \mathbb{F}_2^n$, where ⊕ is
the bitwise XOR operation. The C-NOT gate is an example of a linear gate:
\[
\begin{aligned}
f([0\ 0]) \oplus f(x) &= f(x), & f(x) \oplus f(x) &= f([0\ 0]), \\
f([0\ 1]) \oplus f([1\ 0]) &= f([1\ 1]), & f([0\ 1]) \oplus f([1\ 1]) &= f([1\ 0]), \\
f([1\ 0]) \oplus f([1\ 1]) &= f([0\ 1]).
\end{aligned}
\]
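The identities above can be checked exhaustively. The following is a minimal Python sketch, assuming the C-NOT gate is modeled as a function on (control, target) bit pairs; the names `cnot` and `xor` are illustrative and do not come from the paper.

```python
from itertools import product

def cnot(bits):
    """C-NOT on (control, target): control passes through, target is XORed with it."""
    a, b = bits
    return (a, a ^ b)

def xor(u, v):
    """Bitwise XOR of two equal-length bit tuples."""
    return tuple(x ^ y for x, y in zip(u, v))

inputs = list(product((0, 1), repeat=2))

# Linearity: f(x1 XOR x2) == f(x1) XOR f(x2) for every pair of inputs.
is_linear = all(
    cnot(xor(x1, x2)) == xor(cnot(x1), cnot(x2))
    for x1 in inputs
    for x2 in inputs
)

# Reversibility: the four inputs map to four distinct outputs.
is_bijective = len({cnot(x) for x in inputs}) == len(inputs)

print(is_linear, is_bijective)
```

Checking all 16 input pairs is enough here because the gate acts on only two bits.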
The action of any linear reversible circuit on n wires can be represented by a linear
transformation over $\mathbb{F}_2$. In particular, we can represent the action of the circuit as
multiplication by a non-singular n × n matrix A with elements in $\mathbb{F}_2$:
\[ Ax = y, \]
where x and y are n-dimensional vectors representing the values of the input and out-
put bits respectively. Using this representation, the action of a C-NOT gate corresponds
to multiplication by an elementary matrix, which is the identity matrix with one off-
diagonal entry set to one. Multiplication by an elementary matrix performs a row op-
eration, the addition of one row of a matrix or vector to another. Applying a series of
C-NOT gates corresponds to performing a series of these row operations on the input
vector or equivalently to multiplying it by a series of elementary matrices. For example,
the linear transform computed by the circuit in Figure 2 is given by
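\[
A =
\overset{G6}{\begin{bmatrix} 1&0&0&0 \\ 0&1&0&0 \\ 0&0&1&0 \\ 0&0&1&1 \end{bmatrix}}
\cdot
\overset{G5}{\begin{bmatrix} 1&1&0&0 \\ 0&1&0&0 \\ 0&0&1&0 \\ 0&0&0&1 \end{bmatrix}}
\cdot
\overset{G4}{\begin{bmatrix} 1&0&0&0 \\ 0&1&1&0 \\ 0&0&1&0 \\ 0&0&0&1 \end{bmatrix}}
\cdot
\overset{G3}{\begin{bmatrix} 1&0&0&0 \\ 0&1&0&0 \\ 0&1&1&0 \\ 0&0&0&1 \end{bmatrix}}
\cdot
\overset{G2}{\begin{bmatrix} 1&0&0&0 \\ 0&1&0&0 \\ 0&0&1&0 \\ 0&0&1&1 \end{bmatrix}}
\cdot
\overset{G1}{\begin{bmatrix} 1&0&0&0 \\ 1&1&0&0 \\ 0&0&1&0 \\ 0&0&0&1 \end{bmatrix}}
=
\begin{bmatrix} 1&0&1&0 \\ 0&0&1&0 \\ 1&1&1&0 \\ 1&1&0&1 \end{bmatrix}
\]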
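The product for the circuit of Figure 2 can be recomputed mechanically. The sketch below is illustrative only: the helper names `elementary` and `matmul_f2` are assumptions, and the (target, control) pairs are read off the off-diagonal entries of the six gate matrices, with wires numbered from 0 as in Figure 2.

```python
def elementary(n, target, control):
    """Identity matrix with an extra one at (target, control).

    Left-multiplying by it adds row `control` to row `target` over F_2.
    """
    E = [[int(i == j) for j in range(n)] for i in range(n)]
    E[target][control] = 1
    return E

def matmul_f2(X, Y):
    """Matrix product over the two-element field."""
    n = len(X)
    return [
        [sum(X[i][k] & Y[k][j] for k in range(n)) % 2 for j in range(n)]
        for i in range(n)
    ]

# (target, control) for G1..G6, taken from the off-diagonal ones.
GATES = [(1, 0), (3, 2), (2, 1), (1, 2), (0, 1), (3, 2)]

A = [[int(i == j) for j in range(4)] for i in range(4)]
for target, control in GATES:
    # G1 acts on the input first, so it ends up as the rightmost factor.
    A = matmul_f2(elementary(4, target, control), A)

for row in A:
    print(row)
```

Each left-multiplication is a single row addition, so the same result can be obtained by applying the six row operations to the identity matrix directly.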
We can use the matrix notation to count the number of different n-input linear reversible
transformations. In order for the transformation to be reversible, its matrix must
be non-singular; in other words, every nontrivial sum of the rows must be non-zero. There
are $2^n - 1$ possible choices for the first row: all vectors except the all-zeros vector.
There are $2^n - 2$ possible choices for the second row, since it cannot be equal to the
first row or the all-zeros vector. In general, there are $2^n - 2^{i-1}$ possible choices for the
ith row, since it cannot be any of the $2^{i-1}$ linear combinations of the previous i−1 rows
(otherwise the matrix would be singular). Therefore there are
\[ \prod_{i=0}^{n-1} \left( 2^n - 2^i \right) \]
unique n-input linear reversible transformations.
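For small n the count can be confirmed by brute force. The following Python sketch is an illustration under our own naming (`count_formula`, `is_nonsingular`, `count_brute_force`); rows are stored as integer bitmasks, which is a convenience of the sketch rather than anything prescribed by the text.

```python
from itertools import product

def count_formula(n):
    """Product over i = 0..n-1 of (2^n - 2^i)."""
    total = 1
    for i in range(n):
        total *= 2**n - 2**i
    return total

def is_nonsingular(rows):
    """Gauss-Jordan elimination over F_2 on rows given as n-bit integers."""
    rows = list(rows)
    n = len(rows)
    for col in range(n):
        # Find a row at or below position `col` with bit `col` set.
        pivot = next((r for r in range(col, n) if (rows[r] >> col) & 1), None)
        if pivot is None:
            return False
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(n):
            if r != col and (rows[r] >> col) & 1:
                rows[r] ^= rows[col]
    return True

def count_brute_force(n):
    """Enumerate all 2^(n*n) binary matrices and count the non-singular ones."""
    return sum(is_nonsingular(rows) for rows in product(range(2**n), repeat=n))

for n in range(1, 4):
    print(n, count_formula(n), count_brute_force(n))
```

The enumeration grows as $2^{n^2}$, so it is only practical for very small n; the formula is what the lower bound argument relies on.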
Since any non-singular matrix A can be reduced to the identity matrix using row
operations, we can write A as a product of elementary matrices. Therefore, any linear
reversible function can be synthesized from C-NOT gates. Moreover, the problem
of C-NOT circuit synthesis is equivalent to the problem of row reduction of a matrix A
representing the linear reversible function: any synthesis of the circuit can be written as
a product of elementary matrices equal to A and any such product yields a synthesis. The
length of the synthesized circuit is given by the number of elementary matrices in the
product. Standard Gaussian elimination and LU-decomposition based methods require
$O(n^2)$ gates in the worst case [3]. However, the best lower bound is only
$\Omega(n^2/\log n)$ gates [9].
Lemma 1 (Lower Bound). There are n-bit linear reversible transformations that cannot
be synthesized using fewer than $\Omega(n^2/\log n)$ C-NOT gates.
Proof Let d be the maximum number of C-NOT gates needed to synthesize any linear
reversible function on n wires. The number of different C-NOT gates which can act on
n wires is n(n−1). Therefore the number of unique C-NOT circuits with no more than d
gates must be no more than $(n^2 - n + 1)^d$, where we have included a do-nothing NOP
gate in addition to the $n^2 - n$ C-NOT gates to account for circuits with fewer than d
gates. Since the number of circuits with no more than d C-NOT gates must be at least
the number of unique linear reversible functions on n wires, we have the inequality
\[
\left( n^2 - n + 1 \right)^d \;\ge\; \prod_{i=0}^{n-1} \left( 2^n - 2^i \right) \;\ge\; 2^{n(n-1)}, \tag{1}
\]
where the last step holds because each of the n factors is at least $2^{n-1}$.
Taking the log of both the left and right sides of the inequality gives
\[
d \;\ge\; \frac{n(n-1)\log 2}{\log(n^2 - n + 1)} \;=\; \frac{n^2 - n}{\log_2 (n^2 - n + 1)} \;=\; \Omega\!\left( \frac{n^2}{\log n} \right). \tag{2}
\]
□
This lemma suggests a synthesis more efficient than standard Gaussian elimination may
be possible. The multiplicative constant in this lower bound is 1/2 (assuming logs are
taken base 2).
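To get a feel for the size of this bound at small n, both the counting inequality (1) and the closed form (2) can be evaluated directly. The Python sketch below is only an illustration; the function names are hypothetical, and the closed form is rounded up because d is an integer.

```python
from math import ceil, log2

def counting_bound(n):
    """Smallest d with (n^2 - n + 1)^d >= prod_{i<n} (2^n - 2^i), in exact integers."""
    target = 1
    for i in range(n):
        target *= 2**n - 2**i
    d, reach = 0, 1
    while reach < target:
        reach *= n * n - n + 1
        d += 1
    return d

def closed_form_bound(n):
    """ceil((n^2 - n) / log2(n^2 - n + 1)), the weaker bound of (2); needs n >= 2."""
    return ceil(n * (n - 1) / log2(n * n - n + 1))

for n in (2, 3, 8, 16, 32):
    print(n, counting_bound(n), closed_form_bound(n))
```

The first function uses the exact number of transformations, so it is never smaller than the second, which starts from the relaxed estimate $2^{n(n-1)}$.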
3 Efficient Synthesis
In this section we present our synthesis algorithm, which achieves the lower bound
given in the previous section. In Gaussian elimination, row operations are used to place
ones on the diagonal of the matrix and to eliminate any remaining ones. One row operation
is required for each entry in the matrix that is targeted. Since there are $n^2$ matrix
entries, $O(n^2)$ row operations are required in the worst case. If instead we group entries
together and use single row operations to change these groups, we can reduce the number
of row operations required, and therefore the number of gates needed to synthesize
the circuit.
The basic idea is as follows. We first partition the columns of the n × n matrix into
sections of no more than m columns each. We call the entries in a particular row and
section a sub-row. For each section we use row operations to eliminate sub-row patterns
that repeat in that section. This leaves relatively few (< $2^m$) non-zero sub-rows in the
section. These remaining entries are handled using Gaussian elimination. If m is small
enough (< $\log_2 n$), most of the row operations result from the first step, which requires
a factor of m fewer row operations than full Gaussian elimination. As with the Gaussian
elimination based method, our algorithm is applied in two steps: first the matrix is
reduced to an upper triangular matrix; the resulting matrix is then transposed, and the
process is repeated to reduce it to the identity. Detailed pseudo-code for our algorithm
is given in Algorithm 1. The following example illustrates our algorithm for a 6-wire
linear reversible circuit.
1) Choose m = 2 and partition matrix.
\[
\left[\begin{array}{cc|cc|cc}
1&1&0&0&0&0\\
1&0&0&1&1&0\\
0&1&0&0&1&0\\
1&1&1&1&1&1\\
1&1&0&1&1&1\\
0&0&1&1&1&0
\end{array}\right]
\]
2) (Step A - section 1) Eliminate duplicate sub-rows.
\[
\left[\begin{array}{cc|cc|cc}
1&1&0&0&0&0\\
1&0&0&1&1&0\\
0&1&0&0&1&0\\
1&1&1&1&1&1\\
1&1&0&1&1&1\\
0&0&1&1&1&0
\end{array}\right]
\xrightarrow{\;1\to 4,\;1\to 5\;}
\left[\begin{array}{cc|cc|cc}
1&1&0&0&0&0\\
1&0&0&1&1&0\\
0&1&0&0&1&0\\
0&0&1&1&1&1\\
0&0&0&1&1&1\\
0&0&1&1&1&0
\end{array}\right]
\]
3) (Step B - section 1, column 1) One already on diagonal.
4) (Step C - section 1, column 1) Remove remaining ones in column below diagonal.
\[
\left[\begin{array}{cc|cc|cc}
1&1&0&0&0&0\\
1&0&0&1&1&0\\
0&1&0&0&1&0\\
0&0&1&1&1&1\\
0&0&0&1&1&1\\
0&0&1&1&1&0
\end{array}\right]
\xrightarrow{\;1\to 2\;}
\left[\begin{array}{cc|cc|cc}
1&1&0&0&0&0\\
0&1&0&1&1&0\\
0&1&0&0&1&0\\
0&0&1&1&1&1\\
0&0&0&1&1&1\\
0&0&1&1&1&0
\end{array}\right]
\]
3) (Step B - section 1, column 2) One already on diagonal.
Algorithm 1: Efficient C-NOT Synthesis
[circuit] = CNOT_Synth(A, n, m)
{
    // synthesize lower/upper triangular part
    [A, circuit_l] = Lwr_CNOT_Synth(A, n, m);
    A = transpose(A);
    [A, circuit_u] = Lwr_CNOT_Synth(A, n, m);
    // combine lower/upper triangular synthesis
    switch control/target of C-NOT gates in circuit_u;
    circuit = [reverse(circuit_u) | circuit_l];
}
[A, circuit] = Lwr_CNOT_Synth(A, n, m)
{
    circuit = [];
    for (sec=1; sec<=ceil(n/m); sec++)   // iterate over column sections
    {
        // remove duplicate sub-rows in section sec
        for (i=0; i<2^m; i++)
            patt[i] = -1;   // marker for first positions of sub-row patterns
        for (row_ind=(sec-1)*m; row_ind<n; row_ind++)
        {
            // the last section may have fewer than m columns
            sub_row_patt = A[row_ind, (sec-1)*m : min(sec*m,n)-1];
            // an all-zero sub-row needs no elimination (cf. the worked example)
            if (sub_row_patt == 0)
                continue;
            // if first copy of pattern save, otherwise remove
            if (patt[sub_row_patt] == -1)
                patt[sub_row_patt] = row_ind;
            else
            {
                A[row_ind,:] += A[patt[sub_row_patt],:];
                circuit = [C-NOT(patt[sub_row_patt], row_ind) | circuit];   // Step A
            }
        }
        // use Gaussian elimination for remaining entries in column section
        for (col_ind=(sec-1)*m; col_ind<=min(sec*m,n)-1; col_ind++)
        {
            // check for 1 on diagonal
            diag_one = 1;
            if (A[col_ind,col_ind] == 0)
                diag_one = 0;
            // remove ones in rows below col_ind
            for (row_ind=col_ind+1; row_ind<n; row_ind++)
            {
                if (A[row_ind,col_ind] == 1)
                {
                    if (diag_one == 0)
                    {
                        A[col_ind,:] += A[row_ind,:];
                        circuit = [C-NOT(row_ind, col_ind) | circuit];   // Step B
                        diag_one = 1;
                    }
                    A[row_ind,:] += A[col_ind,:];
                    circuit = [C-NOT(col_ind, row_ind) | circuit];   // Step C
                }
            }
        }
    }
}
4) (Step C - section 1, column 2) Remove remaining ones in column below diagonal.
\[
\left[\begin{array}{cc|cc|cc}
1&1&0&0&0&0\\
0&1&0&1&1&0\\
0&1&0&0&1&0\\
0&0&1&1&1&1\\
0&0&0&1&1&1\\
0&0&1&1&1&0
\end{array}\right]
\xrightarrow{\;2\to 3\;}
\left[\begin{array}{cc|cc|cc}
1&1&0&0&0&0\\
0&1&0&1&1&0\\
0&0&0&1&0&0\\
0&0&1&1&1&1\\
0&0&0&1&1&1\\
0&0&1&1&1&0
\end{array}\right]
\]
5) (Step A - section 2) Eliminate duplicate sub-rows below row 2.
\[
\left[\begin{array}{cc|cc|cc}
1&1&0&0&0&0\\
0&1&0&1&1&0\\
0&0&0&1&0&0\\
0&0&1&1&1&1\\
0&0&0&1&1&1\\
0&0&1&1&1&0
\end{array}\right]
\xrightarrow{\;3\to 5,\;4\to 6\;}
\left[\begin{array}{cc|cc|cc}
1&1&0&0&0&0\\
0&1&0&1&1&0\\
0&0&0&1&0&0\\
0&0&1&1&1&1\\
0&0&0&0&1&1\\
0&0&0&0&0&1
\end{array}\right]
\]
6) (Step B - section 2, column 3) Place one on diagonal.
\[
\left[\begin{array}{cc|cc|cc}
1&1&0&0&0&0\\
0&1&0&1&1&0\\
0&0&0&1&0&0\\
0&0&1&1&1&1\\
0&0&0&0&1&1\\
0&0&0&0&0&1
\end{array}\right]
\xrightarrow{\;4\to 3\;}
\left[\begin{array}{cc|cc|cc}
1&1&0&0&0&0\\
0&1&0&1&1&0\\
0&0&1&0&1&1\\
0&0&1&1&1&1\\
0&0&0&0&1&1\\
0&0&0&0&0&1
\end{array}\right]
\]
7) (Step C - section 2, column 3) Remove remaining ones in column below diagonal.
\[
\left[\begin{array}{cc|cc|cc}
1&1&0&0&0&0\\
0&1&0&1&1&0\\
0&0&1&0&1&1\\
0&0&1&1&1&1\\
0&0&0&0&1&1\\
0&0&0&0&0&1
\end{array}\right]
\xrightarrow{\;3\to 4\;}
\left[\begin{array}{cc|cc|cc}
1&1&0&0&0&0\\
0&1&0&1&1&0\\
0&0&1&0&1&1\\
0&0&0&1&0&0\\
0&0&0&0&1&1\\
0&0&0&0&0&1
\end{array}\right]
\]
8) Matrix is now upper triangular. Transpose and continue.
\[
\left[\begin{array}{cc|cc|cc}
1&1&0&0&0&0\\
0&1&0&1&1&0\\
0&0&1&0&1&1\\
0&0&0&1&0&0\\
0&0&0&0&1&1\\
0&0&0&0&0&1
\end{array}\right]
\xrightarrow{\;\text{transpose}\;}
\left[\begin{array}{cc|cc|cc}
1&0&0&0&0&0\\
1&1&0&0&0&0\\
0&0&1&0&0&0\\
0&1&0&1&0&0\\
0&1&1&0&1&0\\
0&0&1&0&1&1
\end{array}\right]
\]
9) (Step A - section 1) Eliminate duplicate sub-rows.
\[
\left[\begin{array}{cc|cc|cc}
1&0&0&0&0&0\\
1&1&0&0&0&0\\
0&0&1&0&0&0\\
0&1&0&1&0&0\\
0&1&1&0&1&0\\
0&0&1&0&1&1
\end{array}\right]
\xrightarrow{\;4\to 5\;}
\left[\begin{array}{cc|cc|cc}
1&0&0&0&0&0\\
1&1&0&0&0&0\\
0&0&1&0&0&0\\
0&1&0&1&0&0\\
0&0&1&1&1&0\\
0&0&1&0&1&1
\end{array}\right]
\]
10) (Step B - section 1, column 1) Because matrix is triangular and non-singular there
will always be ones on the diagonal.
11) (Step C - section 1, column 1) Remove remaining ones in column.
\[
\left[\begin{array}{cc|cc|cc}
1&0&0&0&0&0\\
1&1&0&0&0&0\\
0&0&1&0&0&0\\
0&1&0&1&0&0\\
0&0&1&1&1&0\\
0&0&1&0&1&1
\end{array}\right]
\xrightarrow{\;1\to 2\;}
\left[\begin{array}{cc|cc|cc}
1&0&0&0&0&0\\
0&1&0&0&0&0\\
0&0&1&0&0&0\\
0&1&0&1&0&0\\
0&0&1&1&1&0\\
0&0&1&0&1&1
\end{array}\right]
\]
12) (Step C - section 1, column 2) Remove remaining ones in column.
\[
\left[\begin{array}{cc|cc|cc}
1&0&0&0&0&0\\
0&1&0&0&0&0\\
0&0&1&0&0&0\\
0&1&0&1&0&0\\
0&0&1&1&1&0\\
0&0&1&0&1&1
\end{array}\right]
\xrightarrow{\;2\to 4\;}
\left[\begin{array}{cc|cc|cc}
1&0&0&0&0&0\\
0&1&0&0&0&0\\
0&0&1&0&0&0\\
0&0&0&1&0&0\\
0&0&1&1&1&0\\
0&0&1&0&1&1
\end{array}\right]
\]
13) (Step A - section 2) Eliminate duplicate sub-rows.
\[
\left[\begin{array}{cc|cc|cc}
1&0&0&0&0&0\\
0&1&0&0&0&0\\
0&0&1&0&0&0\\
0&0&0&1&0&0\\
0&0&1&1&1&0\\
0&0&1&0&1&1
\end{array}\right]
\xrightarrow{\;3\to 6\;}
\left[\begin{array}{cc|cc|cc}
1&0&0&0&0&0\\
0&1&0&0&0&0\\
0&0&1&0&0&0\\
0&0&0&1&0&0\\
0&0&1&1&1&0\\
0&0&0&0&1&1
\end{array}\right]
\]
14) (Step C - section 2, column 1) Remove remaining ones in column.
\[
\left[\begin{array}{cc|cc|cc}
1&0&0&0&0&0\\
0&1&0&0&0&0\\
0&0&1&0&0&0\\
0&0&0&1&0&0\\
0&0&1&1&1&0\\
0&0&0&0&1&1
\end{array}\right]
\xrightarrow{\;3\to 5\;}
\left[\begin{array}{cc|cc|cc}
1&0&0&0&0&0\\
0&1&0&0&0&0\\
0&0&1&0&0&0\\
0&0&0&1&0&0\\
0&0&0&1&1&0\\
0&0&0&0&1&1
\end{array}\right]
\]
[Figure 3: circuit diagram on six wires, Input 1-6 to Output 1-6, with the C-NOT gates grouped into a left box and a right box.]
Fig. 3. Synthesized C-NOT circuit example. The gates in the right and left boxes correspond
to row operations before and after the transpose step respectively. Those in the
left box are in the same order the row operations were applied and their controls and
targets are switched. The gates in the right box are in the reverse order that the row
operations were applied.
15) (Step C - section 2, column 2) Remove remaining ones in column.
\[
\left[\begin{array}{cc|cc|cc}
1&0&0&0&0&0\\
0&1&0&0&0&0\\
0&0&1&0&0&0\\
0&0&0&1&0&0\\
0&0&0&1&1&0\\
0&0&0&0&1&1
\end{array}\right]
\xrightarrow{\;4\to 5\;}
\left[\begin{array}{cc|cc|cc}
1&0&0&0&0&0\\
0&1&0&0&0&0\\
0&0&1&0&0&0\\
0&0&0&1&0&0\\
0&0&0&0&1&0\\
0&0&0&0&1&1
\end{array}\right]
\]
16) (Step C - section 3, column 1) Remove remaining ones in column.
\[
\left[\begin{array}{cc|cc|cc}
1&0&0&0&0&0\\
0&1&0&0&0&0\\
0&0&1&0&0&0\\
0&0&0&1&0&0\\
0&0&0&0&1&0\\
0&0&0&0&1&1
\end{array}\right]
\xrightarrow{\;5\to 6\;}
\left[\begin{array}{cc|cc|cc}
1&0&0&0&0&0\\
0&1&0&0&0&0\\
0&0&1&0&0&0\\
0&0&0&1&0&0\\
0&0&0&0&1&0\\
0&0&0&0&0&1
\end{array}\right]
\]
The synthesized circuit is specified by the row operations and is shown in Figure 3.
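The procedure can also be exercised in executable form. The following Python sketch is one possible rendering of Algorithm 1, not a reference implementation: it uses 0-indexed wires, returns gates as (control, target) pairs in the order they act on the input, assumes a non-singular input matrix, and skips all-zero sub-rows in Step A, as the worked example does. The function names and the `EXAMPLE` constant are illustrative.

```python
from math import ceil

def lwr_cnot_synth(A, m):
    """Reduce A (list of 0/1 rows, modified in place) to upper triangular form.

    Returns the row operations as (source, destination) pairs in applied order.
    """
    n = len(A)
    ops = []

    def add_row(src, dst):
        A[dst] = [a ^ b for a, b in zip(A[dst], A[src])]
        ops.append((src, dst))

    for sec in range(ceil(n / m)):
        lo, hi = sec * m, min((sec + 1) * m, n)
        # Step A: clear sub-rows that repeat an earlier sub-row of this section.
        first_seen = {}
        for r in range(lo, n):
            patt = tuple(A[r][lo:hi])
            if not any(patt):
                continue  # assumption: zero sub-rows are left alone
            if patt in first_seen:
                add_row(first_seen[patt], r)
            else:
                first_seen[patt] = r
        # Steps B and C: Gaussian elimination on what remains in the section.
        for c in range(lo, hi):
            for r in range(c + 1, n):
                if A[r][c] == 1:
                    if A[c][c] == 0:
                        add_row(r, c)  # Step B: place a one on the diagonal
                    add_row(c, r)      # Step C: clear the one below the diagonal
    return ops

def cnot_synth(A, m):
    """Return C-NOT gates (control, target), first gate first, implementing A."""
    A = [row[:] for row in A]
    ops_l = lwr_cnot_synth(A, m)
    A = [list(col) for col in zip(*A)]  # transpose
    ops_u = lwr_cnot_synth(A, m)
    # Post-transpose operations keep their order with control/target swapped;
    # pre-transpose operations follow in reverse order.
    return [(dst, src) for (src, dst) in ops_u] + ops_l[::-1]

def circuit_matrix(circuit, n):
    """Matrix over F_2 computed by a list of (control, target) C-NOT gates."""
    M = [[int(i == j) for j in range(n)] for i in range(n)]
    for control, target in circuit:
        M[target] = [a ^ b for a, b in zip(M[target], M[control])]
    return M

EXAMPLE = [
    [1, 1, 0, 0, 0, 0],
    [1, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1],
    [1, 1, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 0],
]

circuit = cnot_synth(EXAMPLE, m=2)
print(len(circuit), circuit_matrix(circuit, 6) == EXAMPLE)
```

Run on the 6-wire example with m = 2, the sketch performs the same row operations as steps 2 through 16 and rebuilds the original matrix from the resulting gate list. A dictionary keyed by the sub-row pattern stands in for the `patt` array of the pseudo-code.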
In general, the length of the synthesized circuit is given by the number of row operations
used in the algorithm. By accounting for the maximum number of row operations
in each step, we can calculate an upper bound on the maximum number of gates that
could be required in synthesizing an n-wire linear reversible circuit. C-NOT gates are
added in the steps marked Step A-C in the algorithm. Step A is used to eliminate the
duplicates in the subsections. It is called fewer than n+m times per section (combined
for the upper/lower triangular stages of the algorithm), giving a total of no more than
$(n+m) \cdot \lceil n/m \rceil$ gates. Step B is used to place ones on the diagonal. It can be
called no more than n times. Step C is used to remove the ones remaining after all duplicate
sub-rows have been cleared. Since there are only $2^m$ m-bit words, there can be at most as
many non-zero sub-rows below the m×m sub-matrix on the diagonal. Therefore, Step
C is called fewer than $m \cdot (2^m + m)$ times per section, or fewer than
$2 \lceil n/m \rceil m \cdot (2^m + m)$ times in all. Adding these up we have
\[
\text{total row ops} \;\le\; (n+m) \cdot \left\lceil \frac{n}{m} \right\rceil + n + 2 \left\lceil \frac{n}{m} \right\rceil m \cdot (2^m + m) \tag{3}
\]
\[
\le\; \frac{n^2}{m} + n + n + m + n + 2n2^m + 2nm + 2m2^m + 2m^2. \tag{4}
\]
If $m = \alpha \log_2 n$,
\[
\text{total row ops} \;\le\; \frac{n^2}{\alpha \log_2 n} + 3n + \alpha \log_2 n + 2n^{1+\alpha} + 2n\alpha \log_2 n + 2\alpha \log_2 n \cdot n^{\alpha} + 2(\alpha \log_2 n)^2. \tag{5}
\]
If α < 1, the first term dominates as n gets large. Therefore the number of row operations
is $O(n^2/\log n)$. Combining this result with Lemma 1, we have the following theorem.
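The trade-off in the choice of m can be seen by evaluating the right-hand side of (3) numerically. The Python sketch below is illustrative; `row_op_bound` is a hypothetical helper name, and the value $n = 2^{16}$ is an arbitrary choice for the demonstration.

```python
from math import ceil

def row_op_bound(n, m):
    """Right-hand side of bound (3) for an n-wire circuit with section size m."""
    sections = ceil(n / m)
    step_a = (n + m) * sections               # duplicate sub-row removal
    step_b = n                                # placing ones on the diagonal
    step_c = 2 * sections * m * (2**m + m)    # clearing the remaining ones
    return step_a + step_b + step_c

n = 2**16
# Section size that minimises the bound among m = 1..16 (log2 n = 16 here).
best_m = min(range(1, 17), key=lambda m: row_op_bound(n, m))
print(best_m, row_op_bound(n, best_m), row_op_bound(n, 1))
```

Small m makes the Step A term large, since it scales as $n^2/m$, while m close to $\log_2 n$ makes the Step C term large, since it scales as $n 2^m$. The minimum lies in between, consistent with taking α < 1.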
Theorem 1. The worst-case length of an n-wire C-NOT circuit is $\Theta(n^2/\log n)$ gates.
In Equation 5, α can be chosen to be arbitrarily close to 1. In the limit, the multiplicative
constant in the $O(n^2/\log n)$ expression becomes 1 (assuming logs are taken base 2). By
contrast, the multiplicative constant in the lower bound in Lemma 1 is 1/2.
This algorithm, in addition to generating more efficient circuits than the standard
method, is also asymptotically more efficient in terms of run time. The execution time
of the algorithm is dominated by the row operations on the matrix, which are each O(n).
Therefore the overall execution time is $O(n^3/\log n)$ compared to $O(n^3)$ for standard
Gaussian elimination [8, p. 42].
Our algorithm is closely related to Kronrod’s Algorithm (also known as “The Four
Russians’ Algorithm”) for construction of the transitive closure of a graph [1]. One im-
portant difference between the two is that in their case the goal was a fast algorithm for
their application, which is only of secondary concern for our application. Our primary
goal is an algorithm that produces an efficient circuit synthesis. Generically, our algo-
rithm can be interpreted as producing an efficient elementary matrix decomposition of
a binary matrix.
4 Simulation Results
Though Algorithm 1 is asymptotically optimal, it would be of interest to know how
large n must be before the algorithm begins to outperform standard Gaussian elimination.
For this purpose we have synthesized linear reversible circuits using both our
method and Gaussian elimination for randomly generated non-singular 0-1 matrices.
The results of these simulations are summarized in Figure 4. Our algorithm shows an
improvement over Gaussian elimination for n as small as 8. The length of the circuit
synthesis produced by Algorithm 1 is dependent on the choice of m, the size of the column
sections. Here we have somewhat arbitrarily chosen $m = \mathrm{round}((\log_2 n)/2)$. The
performance for some values of n could be significantly improved by optimizing this
choice. This would also smooth out the performance curve in Figure 4 for Algorithm 1.

[Figure 4: plot of circuit length (0 to 3500) against number of wires (0 to 80), with one curve for Algorithm 1 and one for Gaussian Elimination.]
Fig. 4. Performance of Algorithm 1 vs. Gaussian elimination on randomly generated
linear reversible functions. Each point corresponds to the average length of the circuit
generated for 100 randomly generated matrices. The x-axis specifies n, the number
of inputs/outputs of the linear reversible circuit, and the y-axis specifies the average
number of gates in the circuit synthesis. For Algorithm 1, m was chosen to be
$\mathrm{round}((\log_2 n)/2)$.
5 Conclusions and Future Work
We have given an algorithm for linear reversible circuit synthesis that is asymptotically
optimal in the worst-case. We show that the algorithm is also asymptotically faster than
current methods. While our results are primarily asymptotic, simulation results show
that even in the finite case our algorithm outperforms the current synthesis method.
Applications of our work include circuit synthesis for quantum circuits.
While the primary motivations for the synthesis method we have given here are to
provide an asymptotic bound on circuit complexity and a practical method to synthesize
efficient circuits, another application is to bounds on circuit complexity for the finite
case. In particular, we can use our method to determine an upper bound on the maximum
number of gates required to synthesize any n-wire C-NOT circuit. For this application
the particular partitioning of the columns can be very important. For example, much
better bounds can be determined if the size of each section is a function of the location
of the section in the matrix. Sections to the left have more rows below the diagonal
and therefore should be larger than sections towards the right of the matrix, which have
fewer rows below the diagonal. An ongoing area of work is determining optimal column
partitioning methods.
Our algorithm yields an efficient decomposition for matrices with elements
in $\mathbb{F}_2$, and can be generalized in a straightforward manner to matrices over any
finite field. The asymptotic size of the generalized decomposition is $O(n^2/\log_{|F|} n)$,
where |F| is the order of the finite field. Our algorithm, particularly in this generalized
form, is quite generic and may lend itself to a wide range of other applications. Related
algorithms [1] have applications in finding the transitive closure of a graph, binary
matrix multiplication, and pattern matching.
A major area of future work is extending our results to more general reversible cir-
cuits, particularly quantum circuits. Currently, there is an asymptotic gap between the
best upper and lower bounds on the worst-case circuit complexity both for general clas-
sical reversible circuits and quantum circuits. The gap for classical reversible circuits
is the same logarithmic factor that previously existed for linear reversible circuits [9],
which suggests it may be possible to extend our methods to this problem.
References
1. V. L. Arlazarov, E. A. Dinic, M. A. Kronrod, and I. A. Faradžev. On economical construction
of the transitive closure of an oriented graph. Soviet Mathematics Doklady, pages 1209–10,
1970.
2. A. Barenco, C. H. Bennett, R. Cleve, D. P. DiVincenzo, N. Margolus, P. Shor, T. Sleator,
J. Smolin, and H. Weinfurter. Elementary gates for quantum computation. Physical Rev. A,
pages 3457–67, 1995.
3. T. Beth and M. Rötteler. Quantum algorithms: Applicable algebra and quantum physics. In
Quantum Information, pages 96–150. Springer, 2001.
4. G. Cybenko. Reducing quantum computations to elementary unitary operations. Quantum
Computation, 2001.
5. D. P. DiVincenzo. Two-bit gates are universal for quantum computation. Physical Rev. A,
pages 1015–22, 1995.
6. M. A. Nielsen and I. L. Chuang. Quantum Computation and Quantum Information. Cam-
bridge University Press, 2000.
7. M. Perkowski, L. Jozwiak, P. Kerntopf, A. Mishchenko, A. Al-Rabadi, A. Coppola, A. Buller,
X. Song, M. Khan, S. Yanushkevich, V. Shmerko, and M. Chrzanowska-Jeske. A general
decomposition for reversible logic. Reed-Muller Workshop, August 2001.
8. W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery. Numerical Recipes in C.
Cambridge University Press, 1992.
9. V. V. Shende, A. K. Prasad, I. L. Markov, and J. P. Hayes. Reversible logic circuit synthesis.
Proc. IEEE/ACM Intl. Conf. on Computer Aided Design, pages 353–60, November 2002.
