Rapid advancement in the domain of quantum technologies has opened up to researchers the real possibility of experimenting with quantum circuits and simulating small-scale quantum programs. Nevertheless, the quality of currently available qubits and environmental noise pose a challenge to the smooth execution of quantum circuits. Therefore, efficient design automation flows for mapping a given algorithm to a Noisy Intermediate-Scale Quantum (NISQ) computer become of utmost importance. State-of-the-art quantum design automation tools are primarily focused on reducing logical depth, gate count and qubit count, with recent emphasis on topology-aware (nearest-neighbour compliant) mapping. In this work, we extend the technology mapping flows to simultaneously consider the topology and gate fidelity constraints while keeping logical depth and gate count as optimization objectives. We provide a comprehensive problem formulation and a multi-tier approach towards solving it. The proposed automation flow is compatible with commercial quantum computers, such as IBM QX and Rigetti. Our simulation results over 10 quantum circuit benchmarks show that the fidelity of the circuit can be improved up to 3.37× with an average improvement of 1.87×.
I. INTRODUCTION
Quantum computation [1] promises to expand the reach of computing beyond the classical, both theoretically and practically. It is conjectured that there are problems in the Bounded-Error Quantum Polynomial (BQP) class that cannot be solved efficiently on a classical computer. There are already several problems, most notably Shor's number factorization and Boson Sampling, that demonstrate superpolynomial speedup over the best known classical algorithms. Consequently, a massive research effort is underway to develop practical, scalable quantum computers. To enable convenient programming for quantum computers, Microsoft has open-sourced Q# [2] and IBM has provided selective remote access to small-scale quantum computers [3]. These frameworks present an opportunity for researchers to describe a quantum algorithm and test its outcome by running it on a practical system. However, the scalability of quantum computers is hindered by their extreme susceptibility to decoherence and noise. According to the threshold theorem, Quantum Error-Correcting Codes (QECC) can only provide robustness up to a certain level of local noise (i.e., the threshold) [4]. The resource requirement for QECC may scale up faster than the number of computing qubits, eventually posing a roadblock.
In the interim, there remains an interesting possibility to explore Noisy Intermediate-Scale Quantum (NISQ) computers [5] to solve 'useful' computations demonstrating the efficacy of scalable quantum computers, even though at a small scale. Needless to mention, the NISQ era also presents an opportunity for the software tool chain to mature and prepare well ahead of the arrival of large-scale quantum computing. A typical tool chain consists of the programming language to describe quantum circuits and algorithms, synthesis and technology mapping phases. We focus on the challenge of technology mapping.
While mapping a given quantum algorithm onto a NISQ computer, primarily two sets of constraints are considered: topology and fidelity. On one hand, topology-aware mapping in the state-of-the-art literature considers 2D topologies (barring a few exceptions). On the other hand, fidelity-aware mapping flows begin with a quantum circuit that is already topology-compliant. To the best of our knowledge, the interplay between these two constraints has not been explored. In this paper, we discuss all variants of topology and fidelity constraints and present MUQUT, a Multi-Constraint Quantum Circuit Mapping flow. We make the following novel contributions.
• A multi-constraint quantum circuit mapping problem formulation for NISQ-era quantum computers.
• A multi-tier technology mapping flow, combining optimal and heuristic solutions for various sub-problems (Section III).
• Demonstrative examples and benchmarking based on commercial quantum architectures (IBM, Rigetti) to validate our advances (Section IV).
II. BACKGROUND AND MOTIVATION
In quantum computing, operations take place on qubits, whose states are linear combinations of the conventional Boolean states in the two-dimensional complex Hilbert space. Each operation on these qubits can be defined by a unitary matrix [1] and is represented by means of quantum gates.
Definition 1. (Quantum gate) A quantum gate over the inputs $X = \{x_1, \ldots, x_n\}$ consists of a single target line $t \in X$ and one or more control line(s) $c \in X$ with $t \neq c$.
arXiv:1911.08559v1 [quant-ph] 16 Nov 2019
Definition 2 (Quantum circuit). A quantum circuit defined over $n$ qubits $q_1, q_2, \ldots, q_n$ is a series of levels $L_i$, where each level $L_i$ consists of a set of quantum gates $G_i^j$, with each gate $G_i^j$ operating on one or more qubits. Any two gates $G_i^j$ and $G_i^k$ in a level $L_i$ do not operate on any common qubit and can therefore be executed in parallel. At a logical level, we consider that each level $L_i$ takes one cycle to execute. A quantum circuit with $k$ levels has a delay of $k$ cycles.
Example II.1. Fig. 1a shows a quantum circuit with 5 CNOT gates. The circuit has 4 levels.
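The notion of levels in Definition 2 can be illustrated with a minimal sketch that greedily packs gates into the earliest level in which all of their qubits are free. The function name `levelize` and the gate list below are hypothetical; the gate list is not the circuit of Fig. 1a, and the paper does not state how its levels are computed.

```python
def levelize(gates):
    """Greedy as-soon-as-possible layering.

    gates: list of tuples of qubit indices, in program order.
    Returns a list of levels; gates in one level share no qubit.
    """
    levels = []
    next_free = {}  # qubit -> index of the first level in which it is free
    for gate in gates:
        lvl = max((next_free.get(q, 0) for q in gate), default=0)
        if lvl == len(levels):
            levels.append([])
        levels[lvl].append(gate)
        for q in gate:
            next_free[q] = lvl + 1
    return levels


# Hypothetical 5-CNOT circuit on 4 qubits, written as (control, target).
example = [(0, 1), (2, 3), (1, 2), (0, 3), (0, 1)]
levels = levelize(example)
delay = len(levels)  # one cycle per level
```

Under the one-cycle-per-level assumption of Definition 2, `delay` is the logical depth of the circuit.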
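Definition 3 (Clifford group). The Clifford group is the set of quantum gates $G$ that map Pauli operators to Pauli operators under conjugation, i.e., $G P G^{\dagger}$ is a Pauli operator (up to a phase) for every Pauli operator $P$ [6].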
A. Nearest Neighbor Compliance
A major challenge towards the realization of practical and scalable quantum computing is to achieve quantum error correction. Long-distance interacting qubits are particularly susceptible to the noise. Therefore, prominent quantum technologies and quantum error correction codes, e.g., surface codes [4] require that the quantum gates must be formed with a nearest neighbor interaction. In the resulting circuits, the interacting qubits may form a chain, as in a 1D qubit layout, and therefore, these circuits are referred to as Linear Nearest Neighbor (LNN) circuits.
Given a quantum gate with $m$ control lines $l_1, \ldots, l_m$ and target line $l_t$, qubits $q_l$ and $q_t$ have to be nearest neighbors for every control line $l$, $1 \le l \le m$. For level $L_i$, we define the interaction $I_i$ as the set of qubit pairs that must be nearest neighbors for all the gates in $L_i$.
Example II.2. For L 3 of Fig. 1a , the interaction I 3 is
Conversion of a quantum circuit to LNN can be achieved by inserting SWAP gates that make all control lines and target lines adjacent. More precisely, a cascade of adjacent SWAP gates can be inserted in front of each gate g with non-adjacent circuit lines in order to shift the control line of g towards the target line, or vice versa, until they are adjacent. This is shown using the following example.
Example II.3. Consider the circuit depicted in Fig. 1a consisting of gates g 1 , g 2 , g 3 , g 4 and g 5 , numbered from left to right. As can be seen, gates g 2 , g 4 , and g 5 are non-adjacent. Thus, in order to make this circuit nearest neighbour compliant, SWAP gates are inserted, as shown in Fig. 1b .
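The SWAP-cascade idea described above can be sketched for a 1D (LNN) layout as follows. This is a naive illustration only, not one of the heuristic or exact methods cited in this paper: it always moves the control towards the target, does not swap qubits back, and makes no attempt to minimize depth or gate count. The function name `make_lnn` and the tuple encoding of gates are assumptions.

```python
def make_lnn(gates, n):
    """Insert adjacent SWAPs so every CNOT acts on neighbouring lines.

    gates: list of (control, target) logical qubits, control != target.
    n: number of lines in the 1D layout.
    Returns a list of ("SWAP", line_a, line_b) and ("CNOT", ctrl_line, tgt_line).
    """
    pos = list(range(n))  # pos[q] = line currently holding logical qubit q
    out = []
    for c, t in gates:
        while abs(pos[c] - pos[t]) > 1:
            step = 1 if pos[t] > pos[c] else -1
            a, b = pos[c], pos[c] + step
            other = pos.index(b)          # logical qubit sitting on line b
            pos[c], pos[other] = b, a     # exchange the two qubits
            out.append(("SWAP", min(a, b), max(a, b)))
        out.append(("CNOT", pos[c], pos[t]))
    return out


# A CNOT between lines 0 and 2 needs one SWAP before it becomes adjacent.
ops = make_lnn([(0, 2)], 3)
```

Because the qubits are left in their permuted positions, later gates are mapped relative to the updated placement rather than the original one.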
Several recent works convert a quantum circuit to LNN by introducing additional gates, which naturally worsens circuit performance by increasing the logical depth and gate count. To limit that effect, heuristic approaches [7] - [12] and exact algorithms [13] have been proposed. It is pointed out in [10] that the problem of NN-compliant quantum circuit construction is NP-complete. Hence, it is unlikely that this problem can be solved optimally for large instances. To the best of our knowledge, [11] , [14] were the first to look into arbitrary topologies for quantum circuits with nearest neighbor constraints. Most of the other works in this domain have concentrated on 1D qubit layouts or 2D qubit lattice structures [7] , [9] . The work presented in [14] focuses on identifying the qubit topology best suited for a given quantum circuit placement. The topic of mapping onto available topologies is relatively unexplored. This particular problem has been dealt with in [11] , with liquid state NMR molecules taken as example topologies. There, a graph partitioning-based approach (claimed to be asymptotically optimal for the chain nearest neighbour architecture) is proposed.
Independently, efficient qubit topology identification and the mapping flows for specific interaction graphs have been done in [15] . It has been proved that for cyclic butterfly network, the depth overhead for mapping a given quantum circuit to nearest neighbor is 6 log n. Subsequently, the mapping algorithm is also derived. The commercial quantum computers, such as IBM QX and Rigetti do not operate on a linear array of qubits, as shown in Fig. 2a and Fig. 2b respectively. This drives the need for developing nearest-neighbor mapping techniques that can support arbitrary topologies.
Practical setups for diverse quantum circuit topologies have been made available through [3] . Formally, the topology of qubits can be described by means of a topology graph.
Definition 4. (Topology graph) A topology graph is an ordered pair $T = (T_V, T_E)$, where $T_V$ is the vertex set, in which each vertex is a qubit location, and $T_E$ is the edge set. An edge $e_{vw} \in T_E$ indicates that the qubits at locations (vertices) $v$ and $w$ can interact. In other words, qubits at locations $v$ and $w$ are nearest neighbors (NN).
Example II.4. Fig. 2a shows the topology of a 14-qubit IBM QX Melbourne quantum computer [16] .
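As a small illustration of Definition 4, the sketch below stores a topology graph as a vertex set and an undirected edge set, and checks whether an interaction is satisfied by a given placement of logical qubits. The 2×3 grid used here is hypothetical and is not the coupling map of the device in Fig. 2a; the helper names are assumptions as well.

```python
# Hypothetical 2x3 grid of qubit locations:
#   0 - 1 - 2
#   |   |   |
#   3 - 4 - 5
T_V = {0, 1, 2, 3, 4, 5}
T_E = {(0, 1), (1, 2), (3, 4), (4, 5), (0, 3), (1, 4), (2, 5)}


def is_nn(edges, v, w):
    """True if locations v and w share an (undirected) edge."""
    return (v, w) in edges or (w, v) in edges


def interaction_met(edges, placement, interaction):
    """placement: logical qubit -> location.
    interaction: iterable of logical qubit pairs that must be NN."""
    return all(is_nn(edges, placement[p], placement[q]) for p, q in interaction)


placement = {"q1": 0, "q2": 1, "q3": 5}
ok = interaction_met(T_E, placement, [("q1", "q2")])       # locations 0 and 1
not_ok = interaction_met(T_E, placement, [("q1", "q3")])   # locations 0 and 5
```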
B. Fidelity of Quantum Computers
Existing quantum computers are noisy and error-prone. The errors can be broadly classified into two classes: (a) gate errors and (b) decoherence. Due to gate errors, the final state of an operation deviates from the ideal state. Moreover, qubits lose their stored state due to decoherence. Therefore, a quantum circuit does not generate the correct result every time it is executed. There are several metrics to quantify the correctness of the output: (a) fidelity, (b) probability of basis/pure states, and (c) success rate. We briefly discuss these three metrics.
Fidelity: Mathematically, the fidelity between two quantum states $\rho$ and $\sigma$ in density matrix representation is expressed as $F(\rho, \sigma) = \mathrm{Tr}\sqrt{\rho^{1/2} \sigma \rho^{1/2}}$. This quantifies the closeness of the two density matrices. If the noise per operation is high, the actual output deviates more from the ideal one and the fidelity reduces. A higher fidelity indicates a more reliable quantum circuit.
Probability of pure states: In this approach, the output density matrix is eigendecomposed, with the possible pure states as the eigenvectors. The eigenvalues then denote the probabilities of the corresponding pure states.
Probability of success: This approach uses the single-qubit and multi-qubit gate error rates and the laws of probability to calculate the overall success rate of a quantum circuit. Suppose a quantum circuit consists of three gates, $G_1$, $G_2$, and $G_3$, with error rates $\eta_1$, $\eta_2$, and $\eta_3$, respectively. According to the laws of probability, with gate errors treated as independent, the success rate of this circuit is $(1 - \eta_1)(1 - \eta_2)(1 - \eta_3)$.
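The probability-of-success metric can be sketched in a few lines, under the assumption that gate errors are independent so that the circuit succeeds only if every gate succeeds. The function name and the three error rates below are hypothetical values chosen for illustration, not measurements from any device.

```python
from math import prod


def success_rate(error_rates):
    """Probability that no gate fails, assuming independent gate errors."""
    return prod(1.0 - eta for eta in error_rates)


# Three gates G1, G2, G3 with hypothetical error rates eta1, eta2, eta3.
p = success_rate([0.1, 0.2, 0.5])  # (1 - 0.1) * (1 - 0.2) * (1 - 0.5)
```

Since every factor is at most 1, adding a gate can never increase this estimate, which is why fewer gates and lower-error qubit pairs both help.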
Example II.5. Error rates for different qubits and qubit pairs of the 14-qubit IBM QX quantum computer are shown in Fig. 2c. It shows that the multi-qubit gate error (CNOT) is an order of magnitude higher than the single-qubit gate error. Moreover, there is a qubit-to-qubit (Q2Q) variation among the error rates. For example, qubit pair Q9 − Q8 has a substantially higher error rate (0.32) than most of the other qubit pairs. Therefore, running operations on this qubit pair will be more erroneous than on a qubit pair with a lower error rate, such as pair Q1 − Q0.
In recent times, the concept of noise-aware mapping has emerged [17] - [21] . There are qubit-to-qubit error-rate variations. The aforementioned works propose mapping and/or synthesis of a quantum circuit to less erroneous qubits for more reliable result. However, none of these works have considered all the constraints that we outline in this paper.
C. Motivation
The recent works that consider the fidelity of quantum gates take an LNN circuit as a starting point [20] . We present a motivational example in this subsection to demonstrate the range of design choices that remain unexplored when these two steps of mapping are decoupled.
Let us take the circuit presented in Fig. 3a as a representative example. The circuit comprises four qubits ($l_1$, $l_2$, $l_3$, $l_4$). For this circuit to be mapped to a QC such as the one presented in Fig. 2a , we first need to decide which subgraph the circuit should be mapped to and then make the circuit nearest-neighbour compliant.
1) We consider the 4-vertex non-isomorphic subgraphs extracted from Fig. 2a , as shown in Fig. 4 . Each of these subgraphs can be used as the 'host' platform for the original circuit. The original circuit does not satisfy the nearest neighbour constraint for any of the considered subgraphs and therefore needs insertion of swap gates. Each topology (subgraph) has different nearest neighbour constraints, which results in different NN-compliant circuits.
2) Given a subgraph G, we need to decide the starting configuration C, i.e., the mapping of each logical qubit ($l_1$, $l_2$, $l_3$, $l_4$) in the circuit to a unique vertex (a, b, c, d) in the subgraph. For example, Fig. 3b uses $l_1 \to a$, $l_2 \to b$, $l_3 \to c$ and $l_4 \to d$ as the configuration.
3) With the chosen subgraph G and initial configuration C of qubits, we construct the NN circuit compliant with the chosen subgraph. Figure 3 presents three different solutions, one for each topology subgraph. As evident from the figures, the NN circuits of Fig. 3b and Fig. 3d need a single SWAP gate to reach NN compliance, while Fig. 3c requires two SWAP gates.
4) These NN-compliant circuits are then subjected to further exploration of gate fidelity, which produces different overall circuit resilience, as can be observed in Fig. 5 .
As evident from Fig. 2c , qubits have different error rates. Instead of scheduling the NN-compliant circuit in an error-agnostic fashion on any set of qubits, qubits with better error rates can be selected to execute the circuit. For example, the qubits {a, b, c, d} of the T-topology in Fig. 3c can be assigned to a number of different sets of physical qubits (suppose, {0, 1, 2, 13} and {8, 9, 10, 5}) in the 14-qubit IBM computer ( Fig. 2a ). Although both assignments satisfy NN compliance, they are not equivalent in terms of operation fidelity. Gate operations involving qubit pairs Q9 − Q8 and Q9 − Q10 introduce a higher error than operations on pairs Q1 − Q0 and Q1 − Q2 due to the different gate error rates (Fig. 2c). Therefore, in this step the error-rate difference between qubits is taken into consideration, and the NN-compliant circuit is mapped to better quality qubits for better noise resilience.
III. METHODOLOGY
From the demonstrative example, we observe the following degrees of freedom, applicable to the technology mapping, when starting from a quantum circuit without any neighbourhood compliance.
1) Topology Subgraph Selection: A quantum computer with n qubits can host a quantum circuit with a set of gates operating on k qubits, such that k ≤ n, using different subgraphs in it, as depicted in Fig. 4 .
2) Logical Qubit to Topology Graph Node Mapping: For a given subgraph extracted from the 'host' quantum computer, there remains the possibility of different qubit to subgraph vertex mappings. We define this as a configuration, presented in Definition 5.
3) Nearest Neighbour (NN) Compliance: Given an initial configuration and topology subgraph, NN compliance can be achieved by inserting swap gates, with the objective of minimizing the delay as well as the number of swap gates required for NN compliance.
4) Fidelity-aware mapping of NN-compliant circuit to QC: We determine the mapping of the NN-compliant circuit to the qubits in the quantum computer, while considering the qubit and gate error rates to minimize the overall error rate.
The above degrees of freedom can be independently or jointly exercised to optimize one or more of the following targets: gate count, logical depth and circuit fidelity. In this work, we focus on achieving high circuit fidelity. The proposed technology mapping flow is shown in Fig. 6 . We describe the individual blocks in the following subsections.
A. Extracting Topology Graph
Given a quantum circuit over k qubits and a NISQ computer with n qubits, such that n > k, there are multiple possible embeddings of the circuit onto the NISQ computer. We use a probabilistic algorithm with a fixed number of attempts to extract topology sub-graphs from a given quantum computer topology graph T , as shown in Algorithm 1.
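Algorithm 1 is not reproduced in this text, so the following is only a sketch of one way a probabilistic extraction with a fixed number of attempts could look: each attempt grows a connected vertex set from a random seed vertex by repeatedly adding a random neighbouring vertex. The function names, the `attempts` and `seed` parameters, and the choice to collect distinct vertex sets (rather than non-isomorphic subgraphs) are all assumptions.

```python
import random


def random_connected_subgraph(adj, k, rng):
    """Grow a connected set of k vertices from a random start, or return None."""
    start = rng.choice(sorted(adj))
    chosen = {start}
    frontier = set(adj[start])
    while len(chosen) < k and frontier:
        v = rng.choice(sorted(frontier))
        chosen.add(v)
        frontier |= set(adj[v])
        frontier -= chosen
    return frozenset(chosen) if len(chosen) == k else None


def extract_subgraphs(adj, k, attempts=100, seed=0):
    """Collect the distinct connected k-vertex sets found in `attempts` tries."""
    rng = random.Random(seed)
    found = set()
    for _ in range(attempts):
        s = random_connected_subgraph(adj, k, rng)
        if s is not None:
            found.add(s)
    return found


# Hypothetical 4-qubit chain 0 - 1 - 2 - 3 as an adjacency map.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
candidates = extract_subgraphs(chain, 3)
```

Grouping the returned vertex sets into isomorphism classes, as in Fig. 4, would require an additional graph-isomorphism check that is omitted here.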
B. NN-compliant circuit technology mapping
In this subsection, we present an optimal technique based on an Integer Linear Programming (ILP) formulation for solving the NN-compliant circuit mapping problem for arbitrary topologies [12] . We consider an input circuit over, say, n qubits that is to be mapped to a topology graph with n vertices, extracted using the technique presented in the previous subsection. We then generate random configurations, each mapping every qubit to a vertex, and solve the NN-compliant technology mapping problem for each of them. Formally, a configuration can be defined as follows.
Definition 5. (Configuration) A configuration C t is the set of ordered tuples (q i , v), which indicates that in cycle t, qubit q i , is at location v, 1 ≤ i ≤ n and v ∈ T V . Configuration C 0 represents the initial configuration.
Given an initial configuration C of n inputs, a series of levels $L_1, L_2, \ldots, L_k$ and a topology graph T , the objective is to determine the series of swap gates needed to transform the locations of the qubits, starting from configuration C, such that all qubit pairs in interaction $I_1$ (corresponding to level $L_1$) are nearest neighbors, then all qubit pairs in interaction $I_2$ (corresponding to level $L_2$) are nearest neighbors, and so on, until interaction $I_k$ (corresponding to level $L_k$) is met, while the combined delay of the swap gates and the gates present in the original circuit is minimized.
Objective function: Minimize the combined delay of the inserted swap gates and the gates of the original circuit.
Level scheduling constraints: Each level can be scheduled/activated exactly once.
TABLE II: Variables used in the ILP formulation
(delay term) : Delay due to insertion of swap gates
$c_{v,q,t}$ : 1 indicates qubit q will move to new location v in cycle t
$m_{i,t}$ : 0 indicates interaction i met in cycle t
$a_{i,t}$ : 1 indicates gates in Level i are scheduled in cycle t
$n_{p,q,t}$ : 1 indicates qubits p and q are NN in cycle t
$eb_{I_i,t}$ : 1 indicates interaction $I_i$ has been met in cycle t and gates of level i can be placed in the current or following cycles
$p_{(p,v),(q,w),t}$ : 1 indicates qubit p is in location v and q is in location w in cycle t
$b_{q,t}$ : 1 indicates qubit q cannot be involved in a swap in cycle t
$x_{v,p,t}$ : 1 indicates qubit p is in location v in cycle t
$b_{v,q,t}$ : 1 indicates qubit q in location v cannot be involved in a swap in cycle t
$u_{v,q,t}$ : 1 indicates qubit q will remain in location v in cycle t
$sb_{m,n,t}$ : 1 indicates swap is not permitted between locations m and n in cycle t
Activation for a level i can happen only if corresponding interaction i is met.
Swap blocking constraints: If an interaction i is met and all the gates in every Level i' such that (i' < i) have been scheduled, then swaps involving the qubits in interaction i cannot be performed, and interaction i is blocked until Level i has been scheduled. A qubit involved in an interaction i cannot be swapped in the cycle in which Level i is scheduled.
Chronological interaction constraints: If an interaction is met in cycle t, then its status should not change to not met after that cycle. In addition, interaction i can be met only after interaction i − 1 is met.
Successful interaction constraints: An interaction is met if all the qubit pairs in the interaction are nearest neighbors. If an interaction has been met in cycle t, then in all cycles t' > t, the qubit positions no longer matter for that interaction.
Nearest neighbor constraints: Two qubits p and q are nearest neighbors if the qubits are in two locations v and w respectively, or in w and v respectively, such that $(v, w) \in G_E$ and $(p, q) \in I$.
$$n_{p,q,t} = \bigvee_{(v,w) \in G_E} \left( p_{(p,v),(q,w),t} \lor p_{(p,w),(q,v),t} \right) \tag{15}$$
Qubit position update constraints: A qubit q is at location v in cycle t + 1 if it was in location v in cycle t and no swap involving location v was performed, or if q was in a location w that is a nearest neighbor of v and a swap was performed between v and w.
Qubit location and swap constraints: A qubit q can be at exactly one position in any given cycle. In a given cycle, a location can be involved in at most one swap.
$$\sum_{v \in G_V} x_{v,q,t} = 1 \quad \forall q, t$$
Initialization constraints: A qubit q is at location v in cycle 0, based on the input configuration C.
$$x_{v,q,0} = 1 \quad \forall (q, v) \in C$$
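The position-update and swap rules stated above can be checked on a candidate swap schedule with a small simulation. This sketch is not the ILP itself and makes no use of a solver; it only applies one cycle of swaps to a placement while enforcing that each swap uses a topology edge and that each location takes part in at most one swap per cycle. The function name and data encoding are assumptions.

```python
def apply_swaps(placement, swaps, edges):
    """Return the placement in cycle t+1 given the swaps performed in cycle t.

    placement: logical qubit -> location (each qubit at exactly one location).
    swaps: list of (v, w) location pairs swapped in this cycle.
    edges: set of (v, w) pairs of nearest-neighbor locations.
    """
    used = set()
    moved = {}
    for v, w in swaps:
        if (v, w) not in edges and (w, v) not in edges:
            raise ValueError(f"locations {v} and {w} are not nearest neighbors")
        if v in used or w in used:
            raise ValueError("a location may be involved in at most one swap")
        used.update((v, w))
        moved[v], moved[w] = w, v
    # A qubit stays put unless its location took part in a swap.
    return {q: moved.get(v, v) for q, v in placement.items()}


# Hypothetical 3-location chain 0 - 1 - 2.
edges = {(0, 1), (1, 2)}
before = {"a": 0, "b": 1, "c": 2}
after = apply_swaps(before, [(0, 1)], edges)
```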
This completes the description of the ILP formulation of NN mapping of quantum circuits for arbitrary topologies. The variables used in the ILP formulation are summarized in Table II . For large circuits, the ILP solver takes a long time to find the optimal solution. Therefore, we split the input circuit into non-overlapping windows. A window is defined as a set of consecutive levels in a circuit. For example, a window size of 2 would split a circuit with l levels into $\lceil l/2 \rceil$ windows. We solve the NN compliance problem for each window separately.
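The windowing step can be sketched as a simple slicing of the list of levels. The function name and the placeholder level labels are hypothetical; how the final qubit placement of one window is passed on as the initial configuration of the next is not shown here.

```python
def split_into_windows(levels, w):
    """Split a list of levels into consecutive, non-overlapping windows of size w.

    The last window may hold fewer than w levels.
    """
    if w < 1:
        raise ValueError("window size must be at least 1")
    return [levels[i:i + w] for i in range(0, len(levels), w)]


# Hypothetical circuit with 5 levels and a window size of 2.
levels = ["L1", "L2", "L3", "L4", "L5"]
windows = split_into_windows(levels, 2)
```

A larger window gives the solver more levels to optimize jointly, at the cost of a larger ILP instance per window.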
C. Fidelity-aware mapping
The NN-compliant circuit generated in the previous step is further optimized by taking qubit error-rate variations into account. As demonstrated in the motivational example, the same NN-compliant circuit, when mapped to a different set of physical qubits on a quantum computer, can result in a different fidelity or probability of success. To find a mapping of the NN-compliant circuit to less erroneous physical qubits, we exploit the regularities in existing NISQ devices, which generally follow a grid-like architecture (e.g., IBMQ16).
Step 1. After obtaining the NN-compliant circuit, we extract from the coupling graph a rectangular grid that is sufficient to implement the given quantum circuit. We call this the H-Grid. As an example, we take the T-topology in Fig. 7(a) as the NN circuit and IBMQ16 as the target NISQ computer. Inside the H-Grid, the given workload can be mapped in at least 4 different ways, as shown in Fig. 7(c) . These are the horizontally and vertically mirrored reassignments of the qubits from the initial rectangular grid within the H-Grid.
Step 2. We determine the uniquely fitted H-Grids within the QC by sliding the rectangular grid over the QC. Let the numbers of qubits in the horizontal and vertical directions be QH and QV for the QC, and HQH and HQV for the H-Grid. Then the number of H-Grids ($N_{HG}$) within the QC is (QH − HQH + 1) × (QV − HQV + 1).
Step 3. We compute all possible mapping coordinates for the qubits within each unique H-Grid. The total number of unique mappings ($N_{ISG}$) in the target hardware for the given workload will be at least (4 × $N_{HG}$) for HQH, HQV > 1. Finally, for each possible mapping, the fidelity is computed taking the qubit error rates into consideration.
Fig. 8 : The number of CNOT gates and the fidelity of the QFT-5 circuit show variation with respect to initial topology graph and qubit configuration. One choice of initial topology can generate a smaller number of gates and a higher fidelity than others. Moreover, the fidelity of the initial placement can be further improved through fidelity-aware mapping (avg. 1.92x better).
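Steps 1 to 3 can be sketched as an enumeration of candidate placements: slide the H-Grid over the device grid, and for each position generate the four mirrored variants. The sketch below assumes the device is a full rectangular grid addressed by (row, column) and uses a hypothetical 2×8 device with a 3-wide, 2-tall H-Grid holding a T-shaped layout; real devices may lack some grid edges, and the fidelity evaluation of each candidate is omitted. All function names are assumptions.

```python
def count_hgrids(qh, qv, hqh, hqv):
    """Number of H-Grid positions: (QH - HQH + 1) * (QV - HQV + 1)."""
    if hqh > qh or hqv > qv:
        return 0
    return (qh - hqh + 1) * (qv - hqv + 1)


def mirrored_mappings(cells, hqh, hqv):
    """The 4 variants of a layout inside the H-Grid: identity, column mirror,
    row mirror, and both. cells: logical qubit -> (row, col) in the H-Grid."""
    variants = []
    for flip_rows in (False, True):
        for flip_cols in (False, True):
            variants.append({
                q: ((hqv - 1 - r) if flip_rows else r,
                    (hqh - 1 - c) if flip_cols else c)
                for q, (r, c) in cells.items()
            })
    return variants


def enumerate_mappings(cells, qh, qv, hqh, hqv):
    """All candidate placements in device coordinates."""
    result = []
    for r0 in range(qv - hqv + 1):
        for c0 in range(qh - hqh + 1):
            for variant in mirrored_mappings(cells, hqh, hqv):
                result.append({q: (r0 + r, c0 + c) for q, (r, c) in variant.items()})
    return result


# T-shaped layout: a, b, c in the top row, d below b.
t_layout = {"a": (0, 0), "b": (0, 1), "c": (0, 2), "d": (1, 1)}
n_hg = count_hgrids(8, 2, 3, 2)
mappings = enumerate_mappings(t_layout, 8, 2, 3, 2)
```

Each candidate in `mappings` would then be scored with the device's qubit and gate error rates, and the best-scoring placement kept.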
IV. EXPERIMENTAL RESULTS
The proposed multi-constrained technology mapping flow (https://github.com/debjyoti0891/quantum-chain) was implemented in Python, with Gurobi 8 as the ILP solver [22] .
Topology graph variation: We demonstrate the effect of topology graph variation with the QFT5 benchmark. Fig. 8 shows the number of CNOT gates in the final NN-compliant circuit and the fidelity of the initial random mapping for different topology graphs and qubit configurations (window size w = 1). Owing to space constraints and for clarity, we plot 39 of the 78 configurations for the QFT-5 benchmark. Results show that, depending on the qubit configuration and topology graph, the final circuit can have a different number of CNOT gates to satisfy NN compliance. A smaller number of CNOT gates, or of gates in general, results in a better fidelity. However, the initial placement of the NN-compliant circuit does not take the qubit-to-qubit error-rate variations into account. Therefore, the fidelity-aware mapping step further searches for a mapping with better fidelity. In Fig. 8 , the noise-aware fidelity bar shows that noise-awareness in conjunction with gate reduction can improve the fidelity. We observe on average 1.96× improvement for the QFT-5 benchmark.
Variants of NN topology solver: We explore the technology mapping flow with various window sizes in the NN topology solver, w = {1, 2, 4, 6}. The maximum, average, and minimum numbers of total gates are plotted for different benchmarks for these window sizes. The total number of gates counts only the noisy gates of the IBM QX quantum computer, i.e., U2, U3, and CNOT gates. With larger window sizes, the NN topology solver can reduce the overall number of gates required to find an NN-compliant circuit, owing to optimization across multiple levels in the circuit.
Putting it all together: The proposed flow offers the scope of optimizing multiple parameters. Providing a large window size as input to the topology solver results in an NN-compliant circuit with fewer gates, at the cost of higher run times. For all the benchmarks, the mapped circuit with maximum fidelity has a substantially smaller number of gates compared to the minimum fidelity mapping of the same circuit, as evident from Fig. 9b . This is intuitively correct, since the fidelity of a circuit is inversely related to the number of gates in the circuit. The simulation results with realistic error rates and connectivity of IBMQ16 show that the overall fidelity can be improved up to 6.76× among the simulated benchmarks, as shown in Fig. 9c . The largest improvement is observed for the QFT5 benchmark, which has the largest number of gates among the simulated benchmarks. Therefore, our proposed flow offers a myriad of design space choices (window size, initial topology, qubit configuration, and noise-aware mapping to physical qubits) to improve the fidelity of a given input quantum circuit.
Variants of quantum computer: The proposed flow also permits mapping to various quantum computers. To demonstrate this, we consider Rigetti's 16-qubit quantum computer, which has a topology graph as shown in Fig. 2b . Note that most of the qubits are connected to 2 other qubits. The native 2-qubit gate of the Rigetti system is Controlled-Z (CZ). Decomposing a SWAP gate into native Rigetti gates results in a quantum circuit with 18 gates, of which 11 are noisy RX and CZ gates and 7 are noise-free RZ gates. Due to the large number of native operations per SWAP gate, the total number of gates in the NN-compliant circuits is relatively larger for the Rigetti system. We generate NN-compliant circuits for the same benchmarks and plot the maximum, average, and minimum number of gates for each benchmark for different window sizes (w = {1, 2, 4}). Fig. 10b shows the maximum and minimum fidelity of the NN-compliant circuits for each benchmark. Rigetti reports mean 1-qubit (F1Q) and 2-qubit (F2Q) gate fidelities instead of qubit-wise specifications. Therefore, for the analysis in Fig. 10b all the qubits are considered identical, with the same gate fidelities (Accessed: 06-Aug-2019; F1Q = 93.79% and F2Q = 91.71%). However, in reality, gate error rates will exhibit qubit-to-qubit variation, which offers scope for additional improvement in fidelity over the results reported in Fig. 10b .
V. CONCLUSION
In this paper, we presented an integrated flow for multi-constraint quantum circuit mapping on noisy intermediate-scale quantum computers. The flow explores a number of degrees of freedom, including various topology graphs, qubit mappings, NN compliance, and qubit error rates, to generate a mapping with high fidelity. An optimal integrated mapping remains to be explored in the future, along with studies on larger benchmarks.
