A Depth-Aware Swap Insertion Scheme for the Qubit Mapping Problem by Zhang, Chi et al.
Chi Zhang∗ Yanhao Chen† Yuwei Jin†
Wonsun Ahn∗ Youtao Zhang∗ Eddy Z. Zhang†
∗University of Pittsburgh
{chz54, wahn}@pitt.edu, zhangyt@cs.pitt.edu
†Rutgers University
chenyh64@gmail.com, yj243@scarletmail.rutgers.edu, eddy.zhengzhang@gmail.com
Abstract—The rapid progress of physical implementation of
quantum computers has paved the way for tools that help users
write quantum programs for any given quantum device. The
physical constraints inherent to current NISQ architectures
prevent most quantum algorithms from being executed directly
on quantum devices. To enable two-qubit gates in the
algorithm, existing works focus on inserting SWAP gates to
dynamically remap logical qubits to physical qubits. However,
these schemes do not consider the depth of the generated
quantum circuits. In this work, we propose a depth-aware SWAP
insertion scheme for the qubit mapping problem in the NISQ era.
Index Terms—Quantum Computing, Emerging languages and
compilers, Emerging Device Technologies
I. INTRODUCTION
Quantum computing has exhibited its theoretical advantage
over classical computing by showing impressive speedup on
applications including large integer factoring [1], database
search [2], and quantum simulation [3]. It is considered to
be a new computational model that may have a disruptive
impact on the future, and it has attracted major interest from
a large number of researchers and companies.
With the advent of advanced manufacturing technology,
the industry is able to build small-scale quantum computers
– Noisy Intermediate-Scale Quantum [4] (NISQ) devices. A
NISQ device is equipped with dozens to hundreds of qubits.
IBM [5] released its 53-qubit quantum computer in October
2019 and has made it available for commercial use. Google [6]
released the 72-qubit Bristlecone quantum computer in March
2018. Other companies including Intel [7], Rigetti [8], and
IonQ, have released their quantum computing devices with
dozens of qubits. The current NISQ technology may not be
perfect, but it’s a good first step towards the more powerful
quantum devices in the future.
In order to map high level quantum programs to NISQ
devices, it is important to overcome two obstacles. First, to
be able to execute a quantum circuit, it is necessary to map
logical qubits to physical qubits with respect to architecture
and program coupling constraints. Any quantum program can
be implemented using a universal gate set [9] of a small
number of elementary gates. For instance, the {H, CNOT, S,
T} set is a universal gate set, in which the {H, S, T} gates
are single-qubit gates and the CNOT gate is a two-qubit gate.
The two-qubit gate must be mapped to two qubits that are
physically connected. However, in real quantum architecture,
qubits may have limited connection and not every two qubits
are connected, as shown in the IBM QX2 architecture in Fig. 1
(a). For this reason, a quantum circuit is not directly executable
on a NISQ device, unless circuit transformation is performed.
The common practice is to insert SWAP operations to dynam-
ically remap the logical qubit such that the transformed circuit
is hardware-compliant for each (set of) two-qubit gate(s).
Second, it is critical that the depth of a quantum circuit
be minimized for the NISQ device. A qubit is volatile and
error prone. It gradually decays over time and may have phase
and bit flip errors. It may completely lose its state after a
certain period of time, called coherence time. Quantum error
correction (QEC) codes can detect error syndromes and fix
them. However, QEC needs to use a large number of redundant
physical qubits. A realistic QEC circuit may need more than
10,000 physical qubits, which is not possible for today’s NISQ
device. Without QEC, a program must terminate within a
threshold amount of time. The depth of the circuit, which is the
number of steps the circuit executes, must be optimized. IBM
proposed the metric of quantum volume [10] for evaluating
the effectiveness of quantum computers which accounts for
not only the width of the circuit (the number of qubits), but
also the depth, how many steps the circuit can execute.
Transforming the logical circuit into a hardware-compliant
one will inevitably result in increased gate count and circuit
depth. Most previous work on qubit mapping [11]–[16] focuses
on minimizing the number of inserted gates, but not the depth
of the transformed circuit. However, even if the gate count is
small, it does not necessarily mean the depth of the circuit
is small, due to the dependence between different gates. We
find that previous work that aims to minimize the number of
inserted gates may significantly increase the depth of the
circuit (see Section IV). For instance, compared with the
mapper of Zulehner et al. [14], the Sabre approach by Li et al.
[11] reduces the gate count of the 10-qubit QFT circuit by
1.1%, but increases its depth by 44.5%. The two studies
[17], [18] stress the importance of taking into consideration
the variability in the qubit (link) error rates, but they do not
directly address the issue of the increased circuit depth.
arXiv:2002.07289v1 [cs.ET] 17 Feb 2020
The depth of the circuit, as mentioned above, is critical and
determines if a quantum program is executable on a NISQ
device with respect to its physical limits. In this paper, we
propose the first depth-aware qubit mapping scheme for
quantum circuits running on hardware with arbitrary qubit
connectivity. Our depth-aware qubit mapper searches for the
mapping that minimizes the transformed circuit depth and keeps
the gate count within a reasonable range. Our results show we
can reduce the depth of the transformed circuit by up to 30%
compared with two best-known qubit mappers [11], [12], and
in the meantime, have on average less than 3% additional gates
over a large set of representative benchmarks.
II. BACKGROUND AND MOTIVATION
A. Quantum Computing Basics
1) Qubit: A quantum bit, or qubit, is the counterpart of the
classical bit in the realm of quantum computing. Unlike a
classical bit, which represents either ‘1’ or ‘0’, a qubit can
be in a coherent superposition of both states. It is considered
a two-state quantum system that exhibits the peculiarity of
quantum mechanics [9]. An example is the spin of an electron,
whose two states are spin up and spin down.
2) Quantum Gates: There are two types of basic quantum
gates. The first type is the single-qubit gate, a unitary
quantum operation that can be abstracted as a rotation around
an axis of the Bloch sphere [9], which represents the state
space of one qubit. A single-qubit gate can be parameterized
by rotation angles around the axes. There are several
elementary single-qubit gates including the Hadamard (H)
gate, the phase (S) gate, and the π/8 (T) gate [9]. The
other type is the multi-qubit gate. However, all
complex quantum gates can be decomposed into a sequence of
single-qubit gates H, S, T, and the two-qubit CNOT gate. Thus
we only focus on the two-qubit CNOT gate. The CNOT gate
operates on two qubits, which are distinguished as a control
qubit and a target qubit. If the control qubit is in state 1,
the CNOT gate flips the state of the target qubit; otherwise,
the target qubit remains the same.
3) Quantum Circuit: A quantum circuit is composed of a
set of qubits and a sequence of quantum operations on
these qubits. There are various ways to describe quantum
circuits. One way is to use the quantum assembly language
OpenQASM [19] released by IBM. Another way is
to use a circuit diagram, in which qubits are represented
as horizontal lines and quantum operations are the different
blocks on those lines. In Fig. 2 (a), we show a simple example
of a quantum circuit diagram. A single-qubit gate is denoted as
a square on the line, and a CNOT gate is represented by a
line connecting two qubits and a circle enclosing a plus sign.
B. Qubit Mapping and Depth-Awareness
To enable the execution of a quantum circuit, the logical
qubits in the circuit must be mapped to the physical qubits on
the target hardware. When applying a CNOT gate, the two
qubits connected by the CNOT gate need to be physically
connected to each other. Due to the irregular physical qubit
layout of existing devices, it is generally considered impossible
to find an initial mapping that makes the entire circuit
CNOT-compliant. The common practice is to insert SWAP operations
to remap the logical qubits. A SWAP operation exchanges the
states of its two input qubits. As shown in Fig. 3,
a SWAP operation is implemented using 3 CNOT gates for
architectures with bi-directional links, or 3 CNOT gates plus
4 Hadamard gates for architectures with single-direction links,
where a bi-directional link means either end of the link can be
the control or target qubit, while a single-direction link means
only one end of it can be the control qubit.
Fig. 1. (a) The connectivity structure of IBM QX2 (physical qubits
Q1 to Q5), (b) The coupling graph for logical qubits (q1 to q5)
in the motivation example in Fig. 2
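The three-CNOT construction for bi-directional links can be sanity-checked on computational basis states, where a CNOT simply XORs the control bit into the target bit. The Python sketch below is illustrative only: the helper names `cnot` and `swap_via_cnots` are our own, and it checks classical basis states rather than simulating full quantum states.

```python
from itertools import product

def cnot(bits, control, target):
    """Apply a CNOT to a tuple of classical bits: target ^= control."""
    out = list(bits)
    out[target] ^= out[control]
    return tuple(out)

def swap_via_cnots(bits, m, n):
    """SWAP(m, n) built from three CNOTs with alternating direction."""
    bits = cnot(bits, m, n)
    bits = cnot(bits, n, m)
    return cnot(bits, m, n)

# Check all four basis states of a two-qubit register.
for state in product((0, 1), repeat=2):
    assert swap_via_cnots(state, 0, 1) == (state[1], state[0])
```

Since a SWAP permutes basis states, agreement on every basis state fixes its action on superpositions by linearity. On single-direction links, the middle CNOT has to be reversed by placing Hadamard gates on both qubits before and after it, which accounts for the 4 additional H gates.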
IBM’s Qiskit [15] uses a stochastic method to insert SWAP
operations, which often results in a significant increase in
the number of inserted gates and the depth. Existing works [11], [14],
[16] are more efficient than IBM’s Qiskit mapper. They use
efficient heuristics to find the mapping rather than a stochastic
method. However, the main objective of these methods is to
reduce the gate count. It makes sense to minimize the gate
count, but it is more important to focus on the depth of the circuit,
since in the NISQ era the depth determines the estimated
execution time. Reducing the depth of the circuit reduces
the likelihood of the circuit failing before it completes.
We show a motivating example in Fig. 2. The hardware
model is shown in Fig. 1 (a). It has five qubits and the
connectivity is the same as the IBM QX2 architecture except
that the links are all bidirectional. There are five physical
qubits, Q1 to Q5, and six bi-directional edges. A CNOT gate can
only be applied on one of these edges.
In the example, the initial mapping between logical qubits
(denoted by lower case q) and physical qubits (denoted by
upper case Q) is shown next to each qubit (line), which is
{ {q1 → Q1}, {q2 → Q2}, {q3 → Q3}, {q4 → Q4}, {q5 →
Q5} }. With this initial mapping, the mapper schedules gates
one by one until it encounters a (set of) CNOT gate(s) which
cannot be scheduled due to physical constraints. We show the
interaction of logical qubits in Fig. 1 (b), where two logical
qubits are connected if there is a CNOT operation between
them. When we encounter the gate “CNOT q2, q5” (marked
red in the circuit diagram in Fig. 2 and as the dotted line in the
logical coupling graph in Fig. 1), the scheduling has to stop
since this translates into “CNOT Q2, Q5” on the hardware,
while no physical link exists between Q2 and Q5. SWAP
operations are therefore needed. When applying a SWAP
operation, the two input physical qubits exchange their
states. Fig. 2 (b) and (c) provide two options for transforming
the circuit. Fig. 2 (b) inserts 2 SWAPs (SWAP Q3, Q4 and
SWAP Q3, Q5) such that “CNOT q2, q5” becomes “CNOT Q2,
Q3”; the two SWAPs can run in parallel with existing
single-qubit gates in the circuit, without increasing the
depth of the circuit. Fig. 2 (c) inserts only 1 SWAP (SWAP
Q2, Q3) such that “CNOT q2, q5” becomes “CNOT Q3, Q5”,
but it cannot overlap with existing single-qubit gates in the
circuit and increases the depth of the circuit by 3
(assuming we use 3 gates to implement the SWAP operation
and each elementary gate takes 1 cycle in this example).
Fig. 2. Motivation Example: (a) the original logical circuit; (b) uses 2 swaps
but the depth of the circuit is not increased; (c) only uses 1 swap but the
depth of the circuit has been increased
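The remapping in the two options can be traced mechanically. The sketch below is a minimal illustration, not the authors' code: the edge list is an assumption reconstructed from the bow-tie layout of QX2 (six bi-directional links, with Q3 in the center), and the helpers `apply_swaps`, `physical_cnot`, and `executable` are hypothetical names.

```python
# Assumed QX2-style "bow-tie" coupling with all links bi-directional.
EDGES = {frozenset(e) for e in [
    ("Q1", "Q2"), ("Q1", "Q3"), ("Q2", "Q3"),
    ("Q3", "Q4"), ("Q3", "Q5"), ("Q4", "Q5"),
]}

def apply_swaps(mapping, swaps):
    """Return the logical-to-physical mapping after SWAPs on physical qubits."""
    new = dict(mapping)
    for a, b in swaps:
        for q, p in list(new.items()):
            if p == a:
                new[q] = b
            elif p == b:
                new[q] = a
    return new

def physical_cnot(mapping, control, target):
    """Physical qubits that a logical CNOT acts on under the given mapping."""
    return mapping[control], mapping[target]

def executable(pair):
    """True if the two physical qubits share a link."""
    return frozenset(pair) in EDGES

initial = {f"q{i}": f"Q{i}" for i in range(1, 6)}
option_b = apply_swaps(initial, [("Q3", "Q4"), ("Q3", "Q5")])
option_c = apply_swaps(initial, [("Q2", "Q3")])

for label, m in [("initial", initial), ("(b)", option_b), ("(c)", option_c)]:
    pair = physical_cnot(m, "q2", "q5")
    print(label, pair, executable(pair))
```

Both options make the CNOT executable; they differ only in how the inserted SWAPs interact with the rest of the schedule, which is what the depth metric in Section III captures.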
In this example, the two best-known approaches, by Zulehner
et al. [14] and Li et al. [11], will both choose to insert 1 SWAP
since they only optimize the number of gates inserted into the
circuit (or the depth of the inserted gates), but not the depth
of the entire transformed circuit. This example stresses the
importance of depth-awareness in SWAP insertion schemes
and motivates our work.
Fig. 3. Implementation of a SWAP operation (panels (a), (b), (c), acting on qubits m and n)
III. PROPOSED SOLUTION
A. Metric
As our work is a depth-aware SWAP insertion scheme, we
first precisely define the metric for characterizing the depth
of a circuit. In order to fully explain the metric, we need to
introduce the concepts of dependency graph and critical path.
The dependency graph represents the precedence relation
between quantum gates in a logical quantum circuit. It is
defined as follows:
Definition 1. Dependency Graph: The dependency graph of a
quantum circuit C with a set of gates $\Psi$ is a Directed Acyclic
Graph $G_\psi = (\Psi, E_\psi)$, $E_\psi \subseteq \Psi \times \Psi$. A directed edge from
node $\psi_1$ to node $\psi_2$ exists if and only if the output of gate $\psi_1$
is (part of) the input of gate $\psi_2$ in the quantum circuit C.
The critical path is the longest path in the
dependency graph. It is defined as follows:
Definition 2. Critical Path: Given a dependency graph
$G_\psi = (\Psi, E_\psi)$ of a quantum circuit, the critical path is
$CP = \max(\mathrm{Path}(\psi_1, \psi_2))$ s.t. $\psi_1, \psi_2 \in \Psi$ and $\psi_1 \neq \psi_2$.
The depth characterizes the number of execution steps
of a quantum circuit, which is tantamount to the critical path
length of the circuit. The longest path in the dependency
graph gives the minimal number of steps the circuit needs
for every gate’s data dependence to be resolved. In
Algorithm 1, we show how we calculate the critical path.
Algorithm 1: Calculate the Critical Path of a Circuit
Input : The circuit’s dependency graph G(V,E)
Output: The critical path length CP
earliest_start = {};
CP = 0;
for n ∈ V in topological order do
    temp = 0;
    for p ∈ n’s predecessors do
        if temp < earliest_start[p] + latency[p] then
            temp = earliest_start[p] + latency[p];
        end
    end
    earliest_start[n] = temp;
    if CP < temp + latency[n] then
        CP = temp + latency[n];
    end
end
return CP;
We first sort the nodes in the directed acyclic graph in
topological order. Then we process the nodes in that order.
For each node, we take the earliest start time of each of
its predecessors and add the latency of that predecessor;
the maximum of these sums is the earliest start
time of this node. The maximum over all nodes of the earliest
start time plus the latency is the critical path length.
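The procedure above can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions: the dependency graph is given as a dictionary from each gate to its predecessors, latencies are given per gate, and the gate names in the example are hypothetical. It uses the standard-library `graphlib` module (Python 3.9+) for the topological order.

```python
from graphlib import TopologicalSorter

def critical_path_length(preds, latency):
    """Return (critical path length, earliest start time of every gate).

    preds maps each gate to the gates it depends on; latency maps each
    gate to the number of cycles it takes.
    """
    earliest_start = {}
    cp = 0
    # static_order() yields every gate after all of its predecessors.
    for n in TopologicalSorter(preds).static_order():
        start = max(
            (earliest_start[p] + latency[p] for p in preds.get(n, ())),
            default=0,
        )
        earliest_start[n] = start
        cp = max(cp, start + latency[n])
    return cp, earliest_start

# h -> swap -> cx form a chain; s is independent of the others.
preds = {"h": [], "swap": ["h"], "cx": ["swap"], "s": []}
latency = {"h": 1, "swap": 3, "cx": 1, "s": 1}
cp, start = critical_path_length(preds, latency)
print(cp)  # 5
```

Here the SWAP is given a latency of 3, matching the three-CNOT implementation with one cycle per elementary gate.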
We use the critical path length as the metric for ranking
different swap insertion options.
B. Framework Design
With the metric defined in the previous section, we now
explain the workflow of our framework and
the intuitions behind it.
Before delving into the details of this framework, we need
to define the layer and the coupling graph.
Fig. 4. The Qubit Mapping Framework. The iterative mapper takes the
qubit connectivity and the original circuit with its initial layer as
input, processes one layer at a time, adds SWAPs if the layer needs
them, moves to the next layer, and outputs the transformed circuit.
Definition 3. Coupling Graph : The coupling graph of a
quantum architecture X with a set of physical qubits Q is
a directed graph G = (Q,E), E ⊆ Q × Q. The edge
Ex = (Q1, Q2) ∈ E if and only if a CNOT gate can be
applied to Q1 and Q2 in X with Q1 being the control qubit
and Q2 being the target qubit.
We can divide the set of quantum gates in a circuit into
layers, so that all gates in the same layer can be executed
concurrently. The formal definition of a layer is:
Definition 4. Layer: A quantum circuit C can be divided into
layers $L = \{l_1, l_2, l_3, \ldots, l_m\}$, where $\bigcup_{i=1}^{m} l_i = C$ and
$\bigcap_{i=1}^{m} l_i = \emptyset$. The set of gates at layer $l_i$ can run concurrently and act
on distinct sets of qubits.
To divide a circuit into layers, we group the gates that have
the same earliest start time (defined in Algorithm 1) into the
same layer. The order of the layers is thus determined by the
order of the earliest start times.
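A small sketch of this layering step follows. It assumes, for simplicity, that every gate has unit latency, so the earliest start time of a gate is just the first step at which all of its qubits are free. The function name `asap_layers` and the `(name, qubits)` gate encoding are our own.

```python
def asap_layers(gates):
    """Group gates by earliest start time, assuming unit latency per gate.

    gates is a list of (name, qubits) tuples in program order.
    """
    free_at = {}   # qubit -> first step at which it is free
    layers = []
    for name, qubits in gates:
        start = max(free_at.get(q, 0) for q in qubits)
        if start == len(layers):
            layers.append([])
        layers[start].append((name, qubits))
        for q in qubits:
            free_at[q] = start + 1
    return layers

circuit = [
    ("h", ("q1",)),
    ("cx", ("q1", "q2")),
    ("t", ("q3",)),
    ("cx", ("q2", "q3")),
]
for i, layer in enumerate(asap_layers(circuit)):
    print(i, layer)
```

Gates in the same layer act on disjoint qubits by construction, since a gate always starts after the previous gate on any of its qubits.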
We use an iterative process to find the mapping. Our
framework is depicted in Fig. 4, and the iterative process is
explained below. The framework takes as input
the coupling graph (also denoted as Qubit Connectivity) and
the original circuit with its initial layer.
We process the circuit layer by layer. Given a layer, we
perform the following steps.
• We check whether the layer is hardware-compliant
based on the coupling graph and the qubit mapping in effect
before the current layer is scheduled.
• If YES, we move on to the next layer.
• If NO, we invoke our mapping searcher to search for
(the set of) swaps that are necessary to resolve the current
layer. We consider depth-awareness during the selection
of the set of swap gates: we choose the set whose resulting
mapping yields the smallest critical path length (described
in Section III-C). After we find a hardware-compliant
mapping, we move to the next layer.
After all layers are processed, the mapping terminates.
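The layer-by-layer loop can be sketched as follows. This is an illustration under stated assumptions rather than the actual implementation: links are treated as bi-directional, the searcher is passed in as a hypothetical callback `find_swaps(layer, mapping, edges)` that returns a list of physical-qubit pairs, and all function names are our own.

```python
def is_compliant(layer, mapping, edges):
    """True if every two-qubit gate in the layer acts on linked physical qubits."""
    return all(
        frozenset(mapping[q] for q in qubits) in edges
        for _, qubits in layer
        if len(qubits) == 2
    )

def apply_swap(mapping, a, b):
    """Exchange the logical qubits held by physical qubits a and b (in place)."""
    for q, p in list(mapping.items()):
        if p == a:
            mapping[q] = b
        elif p == b:
            mapping[q] = a

def map_circuit(layers, initial_mapping, edges, find_swaps):
    """Process layers in order, inserting SWAPs when a layer is not compliant."""
    mapping = dict(initial_mapping)
    out = []
    for layer in layers:
        if not is_compliant(layer, mapping, edges):
            for a, b in find_swaps(layer, mapping, edges):
                out.append(("swap", (a, b)))
                apply_swap(mapping, a, b)
            if not is_compliant(layer, mapping, edges):
                raise ValueError("searcher returned a non-compliant mapping")
        out.extend((name, tuple(mapping[q] for q in qubits)) for name, qubits in layer)
    return out, mapping

# Toy run on a 3-qubit line Q1 - Q2 - Q3 with a fixed, hand-picked searcher.
edges = {frozenset(("Q1", "Q2")), frozenset(("Q2", "Q3"))}
layers = [[("h", ("q1",))], [("cx", ("q1", "q3"))]]
start = {"q1": "Q1", "q2": "Q2", "q3": "Q3"}
circuit, final = map_circuit(layers, start, edges, lambda l, m, e: [("Q2", "Q3")])
print(circuit)
```

Keeping the searcher behind a callback separates the framework loop from the choice of search strategy described in Section III-C.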
C. Circuit Mapping Searcher
Here we describe the specific mapping searcher we use to
overcome the coupling constraint for a given layer.
We build our method upon the A-star algorithm for finding
valid mappings that minimizes only the number of inserted
SWAP gates [14]. We extend it by changing the ranking
metric and allowing it to search for feasible mappings that
do not necessarily have the smallest SWAP gate count. This
helps us search in a way that minimizes the depth while
not significantly increasing the gate count.
We rank the swap options by the increase in the critical path
length. Since it is an iterative process that handles the gates
layer by layer, it is tempting to consider only minimizing the
depth of the already processed circuit when deciding which
swaps to use.
Fig. 5. (a) Layout of an example architecture with 4 physical qubits (b)
Example of a quantum circuit, the dashed line separates the processed circuit
and the remaining circuit (c) Inserted SWAP overlaps with remaining circuit
instead of existing processed circuit
But the example in Fig. 5 shows that not only the processed
circuit, but also the remaining circuit can help overlap the
SWAPs with existing gates in the circuit without affecting the
critical path. As shown in Fig. 5, for the CNOT gate (in red),
there is no way to overlap the necessary SWAPs with the
processed circuit (the circuit before the dashed line).
But when we look after the dashed line, the three single-qubit
gates can overlap with the inserted SWAP. This has less
impact on the depth of the resulting circuit than
inserting the SWAP on Q1 and Q2.
Based on this intuition, we design our scheme for choosing
the SWAP candidate as in Fig. 6. For each of the
hardware-compliant remapping candidates that we acquire from the
A-star searcher, we calculate the critical path after merging the
candidate (set of) swap(s) with both the processed circuit and
the remaining circuit. We choose the mapping that yields
the shortest critical path.
Fig. 6. Choose SWAP Candidates. For hardware-compliant candidates
S1, S2, S3, …, Sn, where S is the circuit with added SWAPs, PC is
the processed circuit, and RC is the remaining circuit, calculate
the critical path CP(X) = critical_path(PC, SX, RC) and select the
Sx with the smallest CP(X).
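The selection rule can be sketched as below. This is a simplified illustration, not the authors' implementation: the processed circuit is a list of gates on physical qubits, the remaining circuit is a list of gates on logical qubits that is simply re-expressed under the candidate's mapping (any further SWAPs it might need are ignored), and the toy circuit is hypothetical. The names `depth`, `remap`, and `select_candidate` are our own.

```python
def depth(gates, latency):
    """Critical path length of a gate list, scheduling each gate as early as possible."""
    free_at, total = {}, 0
    for name, qubits in gates:
        end = max(free_at.get(q, 0) for q in qubits) + latency.get(name, 1)
        for q in qubits:
            free_at[q] = end
        total = max(total, end)
    return total

def remap(mapping, swaps):
    """Logical-to-physical mapping after applying SWAPs on physical qubits."""
    new = dict(mapping)
    for a, b in swaps:
        swapped = {a: b, b: a}
        new = {q: swapped.get(p, p) for q, p in new.items()}
    return new

def select_candidate(pc, rc, mapping, candidates, latency):
    """Pick the candidate SWAP list whose merged circuit PC + S + RC is shallowest."""
    def cp(swaps):
        m = remap(mapping, swaps)
        merged = list(pc)
        merged += [("swap", pair) for pair in swaps]
        merged += [(name, tuple(m[q] for q in qubits)) for name, qubits in rc]
        return depth(merged, latency)
    return min(candidates, key=cp)

# Toy case: Q2 is busy for six cycles, so a SWAP touching Q2 cannot overlap.
mapping = {f"q{i}": f"Q{i}" for i in range(1, 6)}
pc = [("t", ("Q2",))] * 6
rc = [("cx", ("q2", "q5"))]
candidates = [[("Q2", "Q3")], [("Q3", "Q4"), ("Q3", "Q5")]]
best = select_candidate(pc, rc, mapping, candidates, {"swap": 3})
print(best)  # [('Q3', 'Q4'), ('Q3', 'Q5')]
```

In this toy case the two-SWAP candidate gives a merged depth of 7 against 10 for the single SWAP, so a gate-count-only ranking would pick the deeper circuit.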
D. Optimizations
We use two ways to optimize our proposed solution. One is
to expand more nodes during the A-star search, and the other
is to search into deeper levels.
1) Expand More Nodes: In the A-star search process,
the normal routine is to expand the single node of least cost at
each step. Here, we can expand more than one node at each
step and increase the search space. The number of nodes
expanded at a time can range from 1 to a larger number.
2) Deeper Search: We increase the depth of the A-star
search tree. Normally, the search process ends when
it finds the first node that minimizes the number of SWAPs,
which corresponds to a certain level of the A-star tree. The
second optimization is to
continue the search into deeper levels of the A-star tree. We
can specify and tune how much deeper the search goes.
By tuning these parameters, more possible nodes are
added to our search space. With a larger search space, we
are more likely to escape a local optimum and
reach the global optimum.
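The effect of the deeper-search parameter can be illustrated with a much simpler stand-in for A-star: a plain breadth-first enumeration of SWAP sequences that keeps going for `extra_levels` levels past the first level containing a compliant mapping. This sketch does not model the cost heuristic or the expansion width, and the function names, parameters, and edge list (the assumed QX2-style layout) are our own.

```python
def apply_swap(mapping, a, b):
    """Return a new mapping with the contents of physical qubits a and b exchanged."""
    swapped = {a: b, b: a}
    return {q: swapped.get(p, p) for q, p in mapping.items()}

def enumerate_candidates(pairs, mapping, edges, extra_levels=0, max_swaps=4):
    """Collect SWAP sequences after which every logical pair in `pairs` is linked."""
    edge_list = sorted(tuple(sorted(e)) for e in edges)
    frontier = [([], dict(mapping))]
    found, first_hit = [], None
    for level in range(max_swaps + 1):
        for swaps, m in frontier:
            if all(frozenset((m[a], m[b])) in edges for a, b in pairs):
                found.append(swaps)
        if found and first_hit is None:
            first_hit = level
        if first_hit is not None and level >= first_hit + extra_levels:
            break
        frontier = [
            (swaps + [e], apply_swap(m, *e))
            for swaps, m in frontier
            for e in edge_list
        ]
    return found

edges = {frozenset(e) for e in [
    ("Q1", "Q2"), ("Q1", "Q3"), ("Q2", "Q3"),
    ("Q3", "Q4"), ("Q3", "Q5"), ("Q4", "Q5"),
]}
mapping = {f"q{i}": f"Q{i}" for i in range(1, 6)}
shallow = enumerate_candidates([("q2", "q5")], mapping, edges, extra_levels=0)
deeper = enumerate_candidates([("q2", "q5")], mapping, edges, extra_levels=1)
print(len(shallow), len(deeper))
```

With `extra_levels=0` only the single-SWAP candidates are returned; one extra level also surfaces two-SWAP candidates such as SWAP Q3, Q4 followed by SWAP Q3, Q5, which a depth-aware ranking may prefer.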
IV. EVALUATION
In this section, we evaluate our depth-aware swap insertion
scheme (denoted as DPS) and compare it with the two state-
of-the-art qubit mappers. The experiment setup is listed below:
• Benchmarks: We use the quantum circuits from
RevLib [20], IBM Qiskit [15], and ScaffCC [21].
• Hardware Model: We use IBM’s 20-qubit Q20 Tokyo
architecture, which was also used in [11]. The qubit
connectivity graph is shown in Fig. 7.
• Evaluation Platform: The mapping experiments are
conducted on an Intel 2.4 GHz Core i5 machine with 8
GB of 1600 MHz DDR3 memory. The operating system is
macOS Mojave. We use IBM’s Qiskit [15] to evaluate
the depth of the transformed circuit.
• Baselines: We compare our work with two best-known
qubit mapping solutions: the work by Zulehner et al.
[14] (denoted as Zulehner) and the Sabre qubit mapper
from [11] (denoted as Sabre). Since IBM’s stochastic
mapper in Qiskit is significantly
worse in terms of gate count and depth than all other
mappers we evaluate, as also evidenced in the work by
Zulehner et al. [14], we do not present Qiskit results.
• Metrics: We compare the depth and gate count of
the transformed circuits for all strategies.
Fig. 7. IBM Q20 Tokyo Physical Layout [11]
Table I shows a summary of the experimental results. For gate
count, we compare the total gate count of the
transformed circuit. For depth, we compare the increased depth
for each benchmark, denoted as “Depth-delta” in Table I.
The improvement columns provide the ratio between one of
the two baselines’ depth-delta and our depth-delta. We use
the term minimum improvement to denote the improvement
over the better of the two baselines, and the term maximum
improvement to denote the improvement over the worse of the
two baselines.
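As a worked example of these two columns, the short Python sketch below recomputes the minimum and maximum improvement from the depth-delta values of two rows of Table I. The function name `improvement` is our own.

```python
def improvement(zulehner_delta, sabre_delta, dps_delta):
    """Return (min, max) improvement: baseline depth-delta divided by DPS depth-delta."""
    better, worse = sorted((zulehner_delta, sabre_delta))
    return round(better / dps_delta, 2), round(worse / dps_delta, 2)

print(improvement(117, 89, 39))  # mod10 row: (2.28, 3.0)
print(improvement(44, 44, 29))   # 4gt5 row: (1.52, 1.52)
```

The better baseline is the one with the smaller depth-delta, so the minimum improvement is always the smaller of the two ratios.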
We discuss our findings from the following three aspects:
depth reduction, gate count change, and the trade-off between
gate count and depth.
A. Depth Reduction
For depth reduction, as shown in Table I, our proposed
solution outperforms the two baselines Zulehner and Sabre.
Comparing depth-delta, the added depth of the circuit, our
approach outperforms the better of the two baselines by more
than 20% and up to 3X. For five out of the twenty-three
benchmarks, our improvement on depth-delta is less than 20%
compared with the better of the two baselines. However, for
these cases, our approach still achieves considerable improvement
over the worse of the two baselines. In these cases, it is
possible that one of the two baselines happens to achieve very
good depth in the transformed circuit and there is not much
potential to improve. But our approach is still able to find a
good mapping for these benchmarks and the performance is
on par with the better of the two baselines.
B. Gate Count Changes
The primary goal of our depth-aware qubit mapper is to
minimize the depth of the circuit. However, we find that
our qubit mapper can sometimes reduce the gate count. For
four out of the twenty-three (17%) benchmarks,
our qubit mapper yields the smallest number of gates among
all three qubit mappers. For 57% of the benchmarks,
our method is ranked among the top 2 of the three qubit
mappers in terms of gate count. For the benchmarks where our
method yields the largest gate count, the increase in gate count
is negligible. On average, our depth-aware qubit
mapper adds 3% to the gate count. From the experiment results, we
can see that our solution does not greatly increase the number
of gates while reducing the depth of the circuit.
C. Trade-off between Gate Count and Depth
While all previous works focus on reducing the total gate
count (and the depth among the inserted gates themselves)
after qubit mapping transformation, it is crucial to think about
the trade-off between the resulting gate count and depth.
Sometimes a choice made during the search process that favors
a reduced gate count might adversely affect the critical path.
In Table I, the Sabre mapper reduces the number of gates for
10-qubit QFT by 1.1% compared with Zulehner’s mapper, but
increases the depth by 44.5%. For the sym9_146 benchmark,
Sabre reduces the gate count by 3.8% compared with our
approach, but increases the depth by 25.5%. Therefore a
small reduction in the gate count may not be worthwhile if
it increases the circuit depth significantly.
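The QFT figures can be reproduced from the qft row of Table I. The sketch below assumes that the depth of a transformed circuit is the original depth plus the reported depth-delta; the helper `percent_change` is our own.

```python
def percent_change(new, old):
    """Relative change from old to new, in percent."""
    return 100.0 * (new - old) / old

original_depth = 63  # qft row, "Original" column
zulehner = {"gates": 266, "depth": original_depth + 47}
sabre = {"gates": 263, "depth": original_depth + 96}

gate_saving = -percent_change(sabre["gates"], zulehner["gates"])
depth_cost = percent_change(sabre["depth"], zulehner["depth"])
print(f"{gate_saving:.1f}% fewer gates, {depth_cost:.1f}% more depth")
# prints "1.1% fewer gates, 44.5% more depth"
```

Three fewer gates out of 266 come at the cost of 49 extra steps on a 110-step circuit.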
TABLE I
SUMMARY OF EXPERIMENT RESULTS
Benchmark             | Total Gate #           | Depth    | Depth-delta            | Improvement
name                n | Zulehner  Sabre    DPS | Original | Zulehner  Sabre    DPS |   Min    Max
4gt5_75             5 |      131    122    119 |       47 |       44     44     29 |  1.52   1.52
mini-alu_167        5 |      435    396    432 |      162 |      131    125    119 |  1.05   1.10
mod10_171           5 |      361    328    298 |      139 |      117     89     39 |  2.28   3.00
alu-v2_30           6 |      804    717    795 |      285 |      261    241    201 |  1.20   1.30
decod24-enable_126  6 |      533    476    509 |      190 |      187    150    141 |  1.06   1.33
mod5adder_127       6 |      849    780    858 |      302 |      256    256    222 |  1.15   1.15
4mod5-bdd_287       7 |       94     94     94 |       41 |       18     23     17 |  1.06   1.35
alu-bdd_288         7 |      126    117    135 |       48 |       36     36     30 |  1.20   1.20
majority_239        7 |      915    780    885 |      344 |      265    194    182 |  1.06   1.46
rd53_130            7 |     1619   1508   1619 |      569 |      529    482    384 |  1.26   1.38
rd53_135            7 |      419    410    422 |      159 |      116    112    109 |  1.03   1.06
rd53_138            8 |      186    183    174 |       56 |       37     40     21 |  1.76   1.90
cm82a_208           8 |      899    944   1007 |      337 |      219    295    213 |  1.03   1.38
qft_10             10 |      266    263    281 |       63 |       47     96     44 |  1.07   2.18
rd73_140           10 |      347    329    338 |       92 |       84     79     67 |  1.18   1.25
dc1_220            11 |     2868   2685   3129 |     1038 |      820    697    681 |  1.02   1.20
wim_266            11 |     1505   1415   1511 |      514 |      431    450    311 |  1.39   1.45
z4_268             11 |     4453   4477   4972 |     1644 |     1162   1492   1076 |  1.08   1.39
cycle10_2_110      12 |     9143   8666  10115 |     3386 |     2467   2640   2421 |  1.02   1.09
sym9_146           12 |      493    454    472 |      127 |      118    138     86 |  1.37   1.60
adr4_197           13 |     5299   5017   5530 |     1839 |     1439   1599   1210 |  1.19   1.32
rd53_311           13 |      467    413    446 |      124 |      138    157     87 |  1.59   1.80
cnt3-5_179         16 |      325    238    286 |       61 |       79     59     43 |  1.37   1.84
We compare the total gate count generated. For depth, we compare the increased depth for each benchmark, denoted as “Depth-delta” here. The
improvement represents the ratio of a baseline’s depth-delta divided by DPS’s depth-delta. Min/Max represents the improvement over the best/worst baseline.
V. CONCLUSION
The physical layout of contemporary quantum devices im-
poses limitations for mapping a high level quantum program to
the hardware. It is critical to develop an efficient qubit mapper
in the NISQ era. Existing studies aim to reduce the gate
count but are oblivious to the depth of the transformed circuit.
This paper presents the design of the first depth-aware swap
insertion scheme. Experiment results show that our proposed
solution generates hardware-compliant circuits with reduced
depth compared with state-of-the-art mapping schemes, with
negligible overhead of increased gate count.
REFERENCES
[1] P. W. Shor, “Algorithms for quantum computation: Discrete logarithms
and factoring,” in Proceedings 35th annual symposium on foundations
of computer science. IEEE, 1994, pp. 124–134.
[2] L. K. Grover, “A fast quantum mechanical algorithm for database
search,” in Proceedings of the Twenty-eighth Annual ACM Symposium
on Theory of Computing, ser. STOC ’96. New York, NY, USA:
ACM, 1996, pp. 212–219. [Online]. Available: http://doi.acm.org/10.
1145/237814.237866
[3] A. Peruzzo, J. McClean, P. Shadbolt, M.-H. Yung, X.-Q. Zhou,
P. J. Love, A. Aspuru-Guzik, and J. L. O’Brien, “A variational
eigenvalue solver on a photonic quantum processor,” in Nature
Communications, vol. 5, no. 1, 2014, p. 4213. [Online]. Available:
https://doi.org/10.1038/ncomms5213
[4] J. Preskill, “Quantum computing in the NISQ era and beyond,” Quantum,
vol. 2, p. 79, 2018.
[5] W. Knight, “IBM Raises the Bar with a 50-Qubit Quantum
Computer,” https://www.technologyreview.com/s/609451/
ibm-raises-the-bar-with-a-50-qubit-quantum-computer, 2017.
[6] J. Kelly, “A Preview of Bristlecone, Google’s New Quantum
Processor,” https://ai.googleblog.com/2018/03/a-preview-of-bristlecone-
googles-new.html, 2018.
[7] J. Hsu, “Intel’s 49-Qubit Chip Shoots for Quantum Supremacy,”
https://spectrum.ieee.org/tech-talk/computing/hardware/intels-49qubit-
chip-aims-for-quantum-supremacy, 2018.
[8] Rigetti, https://www.rigetti.com/.
[9] M. A. Nielsen and I. Chuang, “Quantum computation and quantum
information,” 2002.
[10] A. W. Cross, L. S. Bishop, S. Sheldon, P. D. Nation, and J. M.
Gambetta, “Validating quantum computers using randomized model
circuits,” Physical Review A, vol. 100, no. 3, Sep 2019. [Online].
Available: http://dx.doi.org/10.1103/PhysRevA.100.032328
[11] G. Li, Y. Ding, and Y. Xie, “Tackling the qubit mapping problem
for NISQ-era quantum devices,” in Proceedings of the Twenty-Fourth
International Conference on Architectural Support for Programming
Languages and Operating Systems. ACM, 2019, pp. 1001–1014.
[12] R. Wille, L. Burgholzer, and A. Zulehner, “Mapping quantum circuits
to IBM QX architectures using the minimal number of SWAP and H
operations,” in Proceedings of the 56th Annual Design Automation
Conference 2019. ACM, 2019, p. 142.
[13] A. Zulehner, S. Gasser, and R. Wille, “Exact global reordering for
nearest neighbor quantum circuits using A∗,” in International Conference
on Reversible Computation. Springer, 2017, pp. 185–201.
[14] A. Zulehner, A. Paler, and R. Wille, “Efficient mapping of quantum
circuits to the IBM QX architectures,” in 2018 Design, Automation &
Test in Europe Conference & Exhibition (DATE). IEEE, 2018, pp.
1135–1138.
[15] QISKit: Open Source Quantum Information Science Kit,
https://qiskit.org/.
[16] M. Y. Siraichi, V. F. d. Santos, S. Collange, and F. M. Q. Pereira,
“Qubit allocation,” in Proceedings of the 2018 International Symposium
on Code Generation and Optimization. ACM, 2018, pp. 113–125.
[17] S. S. Tannu and M. K. Qureshi, “Not all qubits are created equal:
A case for variability-aware policies for NISQ-era quantum computers,”
in Proceedings of the Twenty-Fourth International Conference on
Architectural Support for Programming Languages and Operating
Systems, ser. ASPLOS ’19. New York, NY, USA: ACM, 2019, pp. 987–
999. [Online]. Available: http://doi.acm.org/10.1145/3297858.3304007
[18] P. Murali, J. M. Baker, A. Javadi-Abhari, F. T. Chong, and M. Martonosi,
“Noise-adaptive compiler mappings for noisy intermediate-scale
quantum computers,” in Proceedings of the Twenty-Fourth International
Conference on Architectural Support for Programming Languages
and Operating Systems, ser. ASPLOS ’19. New York, NY,
USA: ACM, 2019, pp. 1015–1029. [Online]. Available: http:
//doi.acm.org/10.1145/3297858.3304075
[19] A. W. Cross, L. S. Bishop, J. A. Smolin, and J. M. Gambetta, “Open
quantum assembly language,” arXiv preprint arXiv:1707.03429, 2017.
[20] R. Wille, D. Große, L. Teuber, G. W. Dueck, and R. Drechsler, “RevLib:
An online resource for reversible functions and reversible circuits,” in
38th International Symposium on Multiple Valued Logic (ismvl 2008).
IEEE, 2008, pp. 220–225.
[21] A. JavadiAbhari, S. Patil, D. Kudrow, J. Heckey, A. Lvov, F. T. Chong,
and M. Martonosi, “ScaffCC: a framework for compilation and analysis
of quantum computing programs,” in Proceedings of the 11th ACM
Conference on Computing Frontiers. ACM, 2014, p. 1.
