Noisy, intermediate-scale quantum (NISQ) computers are expected to execute quantum circuits of up to a few hundred qubits. The circuits have to satisfy certain constraints concerning the placement and interactions of the involved qubits. Hence, a compiler takes an input circuit not conforming to a NISQ architecture and transforms it to a conforming output circuit. NISQ hardware is faulty and insufficient to implement computational fault-tolerance, such that computation results will be faulty, too. Accordingly, compilers need to optimise the depth and the gate count of the compiled circuits, because these influence the aggregated computation result error. This work discusses the complexity of compilation with a particular focus on the search space structure. The presented analysis decomposes the compilation problem into three combinatorial subproblems for which heuristics can be determined. The search space structure is the result of analysing jointly the gate sequence of the input circuit and its influence on how qubits have to be mapped to a NISQ architecture. These findings support the development of future NISQ compilers.
I. INTRODUCTION
The first generation of general purpose quantum computers, which are called noisy, intermediate-scale quantum (NISQ) computers [9], will operate on a few hundred qubits and will not support computational fault-tolerance. The IBM Q Experience computers, which fall into the NISQ category, have sparked interest in the automated preparation (compilation) of arbitrary quantum circuits. It is expected that more NISQ computers with varying architectures will become available. NISQ compilation is motivated in part by the different architectures, but more importantly by the technical limitations of NISQ hardware (qubit and quantum gate fault rates, gate execution time etc.).
Near-term applications of NISQ may be used to explore many-particle quantum systems or optimisation problems, and the executed circuits need to have a very short depth (e.g. less than 100 [9] ). Although this is a serious limitation, it is hoped that hardware quality will increase such that longer circuits may be executed non fault-tolerantly.
In this work, we discuss the compilation complexity in general, and in particular its search space structure. To this end, only a few prerequisites about quantum information processing are required (e.g. qubit, Hadamard gate, SWAP gate) [8]. This allows the discussion to focus mainly on the structural, rather than functional, aspects of the problem.
Sec. I A is an informal, abstract introduction to the problem stated in Sec. I B. Afterwards, Sec. II is a more formal introduction to the terminology, the recent related work and the elements necessary for illustrating the search space (Sec. III). Based on that, the search space structure (called search diagram) is formulated using the representations from Sec. II A 2 and Sec. III C (a simplified representation of quantum circuits called CNOT-chain). NISQ compilation is decomposed into three combinatorial subproblems, and Sec. IV sketches an exact method and the heuristics for the subproblems. In the following, the terms NISQ computer, quantum computer and computer are used interchangeably.
* alexandru.paler@jku.at
A. NISQ architecture
A NISQ architecture is the arrangement of physical qubits and the interactions supported between them, as well as the set of available quantum gates that can be used natively. Qubit arrangement and interactions form a directed graph (also called coupling graph or coupling map, see Sec. II) like the one from Fig. 1. The term interaction refers, in general, to the supported two qubit gates (e.g. CNOT). Graph edges indicate the computer qubits on which two qubit gates can be applied, and edge directions indicate which qubit is the control and which is the target.
An architecture does not allow arbitrary two qubit interactions, such that, in general, a quantum circuit cannot be executed directly on the NISQ. Two solutions to overcome this limitation are exemplified in the following.
First, in a coupling graph with the vertices Q0, Q1 and the edge Q1 → Q0, the NISQ qubits are Q0 and Q1 and the computer supports natively only the application of a CNOT where Q1 is the control qubit and Q0 the target qubit. However, it is possible to implement a CNOT with Q0 as the control and Q1 as the target (CNOT with a direction opposite to the graph edge), if the computer supports a way to implement the single qubit Hadamard gate, such that: a) both Q0 and Q1 are Hadamard transformed, b) the CNOT is applied according to the graph edge direction, c) both Q0 and Q1 are again Hadamard transformed.
Second, assume that a quantum circuit requires the Q0 → Q3 CNOT, but in the coupling graph there is no edge between the two corresponding vertices. Therefore, the state of Q3 should be moved to the neighbourhood of Q0 (e.g. Q2) using SWAP gates. Afterwards, the CNOT is implemented using the Hadamard method discussed in the previous example, if the available edge direction requires it.
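The Hadamard construction described above can be checked numerically. The following is a minimal sketch in plain Python, assuming the basis ordering |Q0 Q1> and the helper names `matmul`, `kron` and `close`, which are introduced here only for illustration and are not part of the original work.

```python
import math

def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def kron(a, b):
    """Kronecker (tensor) product of two matrices given as nested lists."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def close(a, b, tol=1e-9):
    """Entry-wise comparison of two matrices."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]

# Basis order is |Q0 Q1> (assumption). Native gate: Q1 is control, Q0 is target.
NATIVE_CNOT = [[1, 0, 0, 0],
               [0, 0, 0, 1],
               [0, 0, 1, 0],
               [0, 1, 0, 0]]

# Wanted gate: Q0 is control, Q1 is target (opposite to the graph edge).
WANTED_CNOT = [[1, 0, 0, 0],
               [0, 1, 0, 0],
               [0, 0, 0, 1],
               [0, 0, 1, 0]]

HH = kron(H, H)  # Hadamard on both qubits
emulated = matmul(HH, matmul(NATIVE_CNOT, HH))

print(close(emulated, WANTED_CNOT))
```

The sketch only verifies the gate identity; it does not model noise, so the four extra Hadamard gates appear here as free, whereas on NISQ hardware each of them adds to the output error.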
B. Problem statement
NISQ compilation is a procedure that takes 1) a quantum circuit C operating on q qubits, and 2) a NISQ computer with at least q physical qubits, and generates 3) a functionally equivalent circuit C' which can be executed on the NISQ.
To execute an arbitrary quantum circuit on a NISQ, single qubit gates (e.g. Hadamard) and SWAP gates are inserted into the original circuit without changing its functionality. Quantum gates are not fault-tolerant, such that each additional gate increases the error in the circuit output.
The following problem has to be solved: For a given NISQ and quantum circuit C, compile a circuit C' which is functionally equivalent to C and includes as few additional gates as possible to enable its execution on the NISQ.
The discussion herein is circuit and hardware agnostic, and quantum algorithm and quantum hardware details (cf. [6, 7] ) are not considered. Compilation is defined in a sense that emphasises the mapping of circuit qubits to hardware (physical) qubits. Simultaneously, both the manner in which the circuits were obtained (synthesised and optimised) and which algorithm they represent are not important. Thus, compilation is not extracting information from the circuits or the coupling graph. The circuit and the coupling graph are given and used, but are not, for example, topologically analysed (e.g. [4] ).
The problem statement does not mention whether C is expressed using the universal gate set supported by the NISQ. If this is not the case, C first has to be translated to a functionally equivalent circuit that uses gates compatible with the NISQ gate set. This is a complex task with regard to the optimal number of resulting gates (e.g. [11]), and does not fall within the scope of this work.
It is assumed, without loss of generality, that C is an OpenQASM expression [2] and that the NISQ supports the execution of OpenQASM circuits. For example, the IBM Q Experience machines support natively the CNOT and single qubit gates representing one, two or all three rotations from the z-y-z Euler decomposition of an arbitrary SU(2) unitary. Accordingly, the single qubit Hadamard gate is achieved (emulated) through a y-rotation followed by a z-rotation.
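The emulation of the Hadamard gate by a y-rotation followed by a z-rotation can be illustrated with a short numerical check. This is a sketch under stated assumptions: the rotation conventions R_y(θ) = exp(−iθY/2) and R_z(φ) = exp(−iφZ/2), and the specific angles θ = −π/2 and φ = π, are one possible choice made here and are not taken from the text. The equality holds up to a global phase.

```python
import cmath
import math

def ry(theta):
    """Rotation about the y axis, exp(-i * theta * Y / 2)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]

def rz(phi):
    """Rotation about the z axis, exp(-i * phi * Z / 2)."""
    return [[cmath.exp(-1j * phi / 2), 0], [0, cmath.exp(1j * phi / 2)]]

def matmul2(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

inv_sqrt2 = 1 / math.sqrt(2)
HADAMARD = [[inv_sqrt2, inv_sqrt2], [inv_sqrt2, -inv_sqrt2]]

# The y-rotation acts first, so it is the right-hand factor of the product.
u = matmul2(rz(math.pi), ry(-math.pi / 2))

# Factor out the global phase and compare with the Hadamard matrix.
phase = u[0][0] / HADAMARD[0][0]
matches = all(abs(u[i][j] - phase * HADAMARD[i][j]) < 1e-9
              for i in range(2) for j in range(2))
print(matches, abs(abs(phase) - 1) < 1e-9)
```

A global phase has no observable effect on a circuit, which is why the comparison divides it out before checking the entries.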
II. BACKGROUND
This section provides a formal description of the prerequisites used later to describe the search space structure of the compilation problem: a) the coupling graph (representing the architectural constraints within the NISQ computer), b) remote CNOTs (representing CNOT gates which, in the originally given circuit C, do not comply with the coupling graph), c) configurations (permutations indicating circuit to hardware mappings, resulting after remote CNOTs are translated to the NISQ considering the constraints encoded into the coupling graph).
A. Coupling graph
Compilation of C' from C for a NISQ's architecture requires a directed coupling graph G = (V, E), where |V| = q and |E| ≤ q(q − 1). Graph vertices stand for the computer's physical qubits, the edges for the CNOTs supported between pairs of physical qubits, and the edge directions for the CNOT directions (which qubit is the control and which is the target). If the computer supports both CNOT directions between a pair of qubits, there are two directed edges between the corresponding pair of graph vertices. The maximum number of graph edges is therefore twice the number of edges of the complete graph K_q.
1. Remote CNOTs
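A directed coupling graph can be represented as a set of (control, target) pairs. The sketch below assumes a hypothetical five-qubit edge set (it is not the graph from Fig. 1) and a hypothetical helper `classify_cnot` that distinguishes the three cases discussed in Sec. I A.

```python
def classify_cnot(edges, control, target):
    """Classify a requested CNOT against a directed coupling graph.

    edges: set of (control, target) pairs supported natively.
    Returns "native", "reversed" (needs the Hadamard construction)
    or "remote" (needs SWAP gates first).
    """
    if (control, target) in edges:
        return "native"
    if (target, control) in edges:
        return "reversed"
    return "remote"

q = 5
# Hypothetical coupling graph, chosen only for illustration.
edges = {(1, 0), (2, 0), (2, 1), (3, 2), (3, 4), (2, 4)}

# A simple directed graph on q vertices has at most q(q - 1) edges.
assert len(edges) <= q * (q - 1)

print(classify_cnot(edges, 1, 0))  # native
print(classify_cnot(edges, 0, 1))  # reversed
print(classify_cnot(edges, 0, 3))  # remote
```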
There is a distinction between quantum wire and qubit. A qubit is a state, and a wire is the analog of the hardware holding the state. In a quantum circuit diagram, qubits are mapped to quantum wires. In practice, circuit qubits will be mapped to physical qubits (hardware). Therefore, the circuit wire is the analog of a hardware qubit, enabling the following definition.
Definition 1 A remote CNOT is a CNOT executed between two qubits mapped to non-adjacent wires.
Remote CNOTs (e.g. Fig. 2) may exist in C, but C' is structurally correct only if it does not contain any remote CNOTs. A correctly compiled C' implies that its CNOTs are not remote and that they conform to the constraints encoded into the coupling graph G. Thus, compilation includes a first subproblem.
Problem 1 Place the circuit qubits on the NISQ physical qubits, such that C' contains a minimum number of Hadamard and SWAP gates.
From this mapping perspective, NISQ compilation is similar to earlier works about quantum circuit synthesis/optimisation for linear nearest neighbour (LNN) architectures [10]. Various approaches have been proposed for this task, ranging from global reordering of quantum wires [14] to application of circuit rewrite rules [10], or manual adaptation of the circuits to the LNN constraints [1, 5].
A quantum circuit diagram has a LNN architecture, but a NISQ may have a completely different one (e.g. grid). Thus, most of the previous works are not guaranteed to deliver optimal results or, even more, to scale with the number of qubits. Scalability is the reason why large circuits, such as Shor's algorithm, were manually optimised in a systematic manner [5] instead of using software tools. Nevertheless, LNN synthesis and optimisation is based on a cost function which mirrors the LNN architecture. It is possible to adapt that cost function to other architectures, and some heuristics were presented for example in [13] .
2. Mappings and configurations
Mapping circuit qubits to hardware qubits generates a permutation of size q. For example, assume that Q_i are the qubits (wires) of a circuit C, H_i are the physical qubits of a computer, and that the circuit and the computer have five qubits. The permutation p_1 = (0, 1, 2, 3, 4) is the trivial mapping where Q_i is mapped to p_1[Q_i] = H_i. The permutation p_2 = (2, 1, 0, 4, 3) represents a mapping where Q_0 is mapped to H_2, Q_1 is mapped to H_1 and so on.
Definition 2 A configuration is the permutation that represents the mapping between circuit and NISQ qubits.
The terms permutation and configuration will be used interchangeably. The previous p_1 and p_2 are also configurations. For q qubits, the size of the configuration is q. The set of all permutations forms a symmetric group with q − 1 transposition generators. A transposition swaps two elements and keeps all others fixed, and it is possible to express a given permutation using a non-unique sequence of transpositions. The size of this group is q!, and if all group elements were arranged in a cyclic graph (e.g. Fig. 3), the difference between any two adjacent permutations (configurations) would be a non-unique sequence of transpositions. One does not need to place all permutations in a cyclic graph, and can use only the generators forming a K_{q−1} complete graph. A permutation would result after any path running through the K_{q−1} graph. For illustrative and algorithmic purposes, the exhaustive representation is preferred in the following, although it becomes extremely large for large values of q.
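The relation between configurations and transpositions can be made concrete with a few lines of Python. This is an illustrative sketch: configurations are stored as tuples, and `apply_transposition` and `apply_sequence` are hypothetical helper names. It shows two different transposition sequences that both turn the trivial configuration into (2, 1, 0, 4, 3), and that five qubits give 5! = 120 configurations.

```python
from itertools import permutations
from math import factorial

def apply_transposition(config, i, j):
    """Swap the entries at positions i and j, keeping all others fixed."""
    c = list(config)
    c[i], c[j] = c[j], c[i]
    return tuple(c)

def apply_sequence(config, transpositions):
    """Apply a sequence of transpositions from left to right."""
    for i, j in transpositions:
        config = apply_transposition(config, i, j)
    return config

p1 = (0, 1, 2, 3, 4)  # trivial configuration
p2 = (2, 1, 0, 4, 3)

# Two different sequences reach the same configuration (non-uniqueness).
arbitrary = [(0, 2), (3, 4)]
adjacent_only = [(0, 1), (1, 2), (0, 1), (3, 4)]

print(apply_sequence(p1, arbitrary) == p2)
print(apply_sequence(p1, adjacent_only) == p2)

# The exhaustive representation contains q! configurations.
all_configs = list(permutations(range(5)))
print(len(all_configs) == factorial(5))
```

The second sequence uses only transpositions of neighbouring positions, which corresponds to the q − 1 generators mentioned above.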
III. SEARCH SPACE STRUCTURE
The search space structure of a naive compilation method provides insights on the potential heuristics that could be developed. For this, without loss of generality, a simplified perspective of compilation is proposed: CNOTs from the initial circuit C are compiled sequentially, one at a time. In the following, methods for compiling remote CNOTs are discussed. First, the methods are illustrated using the quantum circuit formalism. Afterwards, their effect on configurations is analysed in conjunction with coupling graphs.
A. Single remote CNOT at a time - swap strategies
At least two strategies are possible for applying a single remote CNOT (illustrated in Fig. 4). The first strategy (MIM, abbreviation for move-interact-move) is to move one of the qubit states onto a wire next to the other qubit's wire, interact the qubits, and then swap back to the original wire. The second strategy (MI, abbreviation for move-interact) is similar to the first one, but without swapping back the moved qubit state.
Applying MIM once introduces 2d SWAP gates in the circuit, while the MI strategy introduces only d SWAP gates, where d is the distance between the remote wires. A straightforward distance function could be, for example, the Manhattan distance, which can be used for LNN as well as grid NISQ architectures.
FIG. 4. [...] The MI strategy results in the (1, 2, 3, 0, 4) configuration.
For a given permutation p, after applying MIM, the resulting permutation is also p. On the contrary, after an MI swap, the resulting permutation is a p', obtained through the sequence of transpositions representing the SWAP gates. Although MI introduces fewer SWAPs, it increases the complexity of the compilation problem: each remote CNOT will result in a new permutation, such that the circuit qubit mapping (configuration) evolves after each CNOT.
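The different effect of the two strategies on a configuration can be sketched for a linear (LNN) arrangement of wires. The code is an illustration under assumptions made here: the configuration lists which qubit state sits on each wire, only the first qubit is moved, and the function names `mi_swaps` and `mim_swaps` are hypothetical. The start configuration (0, 1, 2, 3, 4) and the CNOT between wires 0 and 4 are chosen so that the MI result is the (1, 2, 3, 0, 4) configuration mentioned in the text.

```python
def mi_swaps(config, src, dst):
    """Move-interact: move the state on wire src next to wire dst.

    Uses SWAPs between neighbouring wires only. Returns the new
    configuration and the list of SWAPs (pairs of wires).
    """
    config = list(config)
    swaps = []
    step = 1 if dst > src else -1
    pos = src
    while abs(dst - pos) > 1:
        config[pos], config[pos + step] = config[pos + step], config[pos]
        swaps.append((pos, pos + step))
        pos += step
    return tuple(config), swaps

def mim_swaps(config, src, dst):
    """Move-interact-move: as MI, then undo the SWAPs in reverse order."""
    moved, swaps = mi_swaps(config, src, dst)
    restored = list(moved)
    for i, j in reversed(swaps):
        restored[i], restored[j] = restored[j], restored[i]
    return tuple(restored), swaps + swaps[::-1]

start = (0, 1, 2, 3, 4)
mi_config, mi_list = mi_swaps(start, 0, 4)
mim_config, mim_list = mim_swaps(start, 0, 4)

print(mi_config, len(mi_list))
print(mim_config, len(mim_list))
```

MIM returns to the start configuration at twice the SWAP cost, while MI leaves the circuit in a new configuration that every later CNOT has to take into account.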
This work focuses exclusively on MI, without affecting the generality of the proposed methodology. Most of the previous works on global wire reordering use MIM.
In the presence of evolving configurations (due to the MI strategy), state of the art compilation methods are solving the following problem: find an optimal circuit consisting entirely of SWAP gates that transforms a current permutation p_in to a permutation p_out such that a given set of remote CNOTs can be implemented on the given coupling graph. In other words, an optimal sequence of transpositions is sought, such that p_out conforms to a set of constraints imposed by the CNOTs to implement. During the search of a SWAP circuit, or after a SWAP circuit was found, it is checked that p_out conforms to the coupling graph.
FIG. 5. Example of a SWAP circuit that generates a wire permutation
This approach implies that p_out and the SWAP circuit generating it are computed for more than a single CNOT (e.g. Fig. 5). Both the set of remote CNOTs and the SWAP circuits are computed using heuristics (e.g. randomised algorithm in the IBM QISKit, A*-search [15] or temporal planners [12]).
B. Single remote CNOT at a time -find the edge
A different but equivalent problem is formulated: find a coupling graph edge to execute the remote CNOT from circuit C. Fig. 6 is an illustration of how a single CNOT at a time is compiled instead of searching for SWAP circuits that generate valid p_out permutations.
FIG. 6. Finding the edge where to execute a remote CNOT. a) The CNOT Q0 → Q4 needs to be implemented, but the qubits are not adjacent. b) Depending on a given cost function, it is determined that moving the qubits to the endpoints of the red edge (between Q3 and Q2) is the most cost effective method to achieve Q0 → Q4. The orange edges are used to implement the MI swap strategy. c) A new configuration is generated, and the necessary CNOT direction will require the inclusion of Hadamard gates into the circuit, because the coupling graph does not support the CNOT directly.
Problem 2 Choose the coupling graph edge where to execute a remote CNOT.
C. A circuit is a chain of CNOTs
Quantum circuits are often manipulated as directed acyclic graphs (DAGs) having wire inputs, wire outputs and quantum gates abstracted as vertices, and directed edges for the wires where the gates are applied to. Edge directions reflect the gate ordering inside the circuit. However, for the compilation problem it is possible to replace DAGs with a non-unique chain of CNOTs.
A chain is obtained as follows: a) the DAG is topologically sorted, b) only the CNOTs are kept from the sort, and other gates are discarded (e.g. the T gates from Fig. 3 ), c) one or two vertices are added to the chain for each CNOT (one for the control and one for the target, the control vertex is added only if it represents a wire distinct from the previous CNOT target), d) a directed edge is added between the newly added vertices (connect control to target vertex), e) if two vertices were added, the target vertex of the previous CNOT is connected to the control vertex of the current CNOT, f) weights are added to the edges. An example is depicted in Fig. 7 .
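Steps a) to f) can be sketched as a small function. The input format is an assumption made for this illustration: a list of gates that is already topologically sorted, where a CNOT is the tuple ("cx", control, target) and any other gate is a tuple starting with a different name. The function name `build_cnot_chain` is hypothetical, and binary weights are used (1 for CNOT-edges, 0 for non-CNOT edges).

```python
def build_cnot_chain(sorted_gates):
    """Build a CNOT-chain from a topologically sorted gate list.

    Returns (vertices, edges): vertices[k] is the wire of chain vertex k,
    edges is a list of (from_vertex, to_vertex, weight).
    """
    vertices = []
    edges = []
    prev_target = None  # chain vertex of the previous CNOT's target

    for gate in sorted_gates:
        if gate[0] != "cx":
            continue  # step b: only CNOTs are kept
        _, control, target = gate

        if prev_target is not None and vertices[prev_target] == control:
            # step c: control wire equals previous target, reuse its vertex
            control_vertex = prev_target
        else:
            vertices.append(control)
            control_vertex = len(vertices) - 1
            if prev_target is not None:
                # step e: non-CNOT edge from previous target to new control
                edges.append((prev_target, control_vertex, 0))

        vertices.append(target)
        target_vertex = len(vertices) - 1
        # step d: CNOT-edge from control to target
        edges.append((control_vertex, target_vertex, 1))
        prev_target = target_vertex

    return vertices, edges

gates = [("cx", 0, 1), ("t", 1), ("cx", 1, 2), ("cx", 3, 4)]
chain_vertices, chain_edges = build_cnot_chain(gates)
print(chain_vertices)
print(chain_edges)
```

In this example the second CNOT reuses the target vertex of the first one, while the third CNOT starts on a new wire and is therefore attached through a weight-0 edge.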
Two types of edges exist in a chain: CNOT-edges and non-CNOT edges (between concurrent CNOTs). Edge weights can be in the simplest case binary: 0 for non-CNOT edges, and 1 for CNOT-edges. Another option is to choose weights that reflect the LNN/architecture CNOT distance regarding C. Binary weights are sufficient for the exposition in this work.
Pairs of adjacent vertices in the chain represent the configurations before and after a remote CNOT was compiled. Thus, the CNOT-chain abstracts a sequence of MI swaps for implementing in C' all the remote CNOTs from C. During a chain traversal, each CNOT-edge implies [...]
A chain is similar to what a topological sort of the CNOTs in the circuit's DAG would look like. Moreover, the chain illustrates a walk on the wires of C: for each CNOT perform a walk from the control to the target wire, and jump afterwards to the next control wire (e.g. see vertex annotations in Fig. 7). A chain contains multiple vertices for the same wire, if the wire is used by more than one CNOT.
D. Search diagram
The permutation/configuration cyclic graph (Sec. II A 2 and Fig. 3) and the CNOT-chain (Sec. III C and Fig. 7) are combined into a search diagram (e.g. Fig. 8).
IV. COMPILATION METHODS
The goal is to improve the understanding of NISQ compilation, and the practical utility of an exact algorithm is limited due to its high complexity. However, novel heuristics can be developed starting from the naive backtracking algorithm sketched herein. Its design and implementation is derived from the traversal of search diagrams.
A. Constituent subproblems
Compilation solves the following problems simultaneously: 1) circuit-to-NISQ qubit mapping (first mentioned in Sec. II A 1), 2) CNOT-to-edge mapping (first mentioned in Sec. III B), and 3) the order in which the CNOTs are compiled to the NISQ.
Problem 3 Determine the order in which the remote CNOTs from C are mapped to the NISQ. The chosen order should still be a valid topological sort of the original circuit DAG.
FIG. 8. The concentric cycles are copies of the different configurations. Cycles are interconnected by copies of the CNOT-chain representing the DAG of the circuit to be compiled. Search starts in the center of the diagram and stops at one of the endpoints of the radial CNOT-chains. The chains are not drawn entirely, and contain in this example only two CNOT-edges. Depending on the start configuration (e.g. p_0), different configurations may be generated along each chain (e.g. p'_0 and p''_0, cf. Fig. 10).
The first problem is equivalent to computing the initial configuration (Sec. II A 2). This is because along a CNOT-chain all the configurations are a direct result of the initial configuration (the one on the innermost cycle). The second problem is to execute each remote CNOT using as few additional gates as possible. A potentially new configuration exists after mapping each remote CNOT to a graph edge, and this fact was used to construct the search diagrams. The third problem exists because CNOT gate parallelism from the original C is not captured in the chain representation. There are multiple equivalent CNOT-chains for the same circuit DAG.
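The existence of several equivalent CNOT-chains can be illustrated by enumerating the CNOT orders that remain valid topological sorts. The sketch below makes two simplifying assumptions that are not taken from the text: only CNOTs are considered, and two CNOTs depend on each other exactly when they share a wire. The function name `valid_orders` is hypothetical, and the brute-force enumeration is meant for very small examples only.

```python
from itertools import permutations

def valid_orders(cnots):
    """Enumerate all CNOT orders that respect the wire dependencies.

    cnots: list of (control, target) pairs in their original circuit order.
    Returns a list of tuples of indices into cnots.
    """
    n = len(cnots)
    # CNOT i must stay before CNOT j if i < j and they share a wire.
    dependent = [(i, j) for i in range(n) for j in range(i + 1, n)
                 if set(cnots[i]) & set(cnots[j])]
    orders = []
    for perm in permutations(range(n)):
        position = {gate: k for k, gate in enumerate(perm)}
        if all(position[i] < position[j] for i, j in dependent):
            orders.append(perm)
    return orders

# The third CNOT shares a wire with each of the first two.
partially_parallel = valid_orders([(0, 1), (2, 3), (1, 2)])
# Three CNOTs on disjoint wires can be compiled in any order.
fully_parallel = valid_orders([(0, 1), (2, 3), (4, 5)])

print(partially_parallel)
print(len(fully_parallel))
```

When all n CNOTs are parallel, every one of the n! orders is valid, which is the worst case used in the complexity estimate of Sec. IV B.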
B. Sketch of exact algorithm
A naive exact algorithm can be formulated using a backtracking framework. The first step of the algorithm is to determine an initial configuration: how circuit qubits are mapped to the NISQ (Fig. 9a) . Afterwards, the first edge of the CNOT-chain starting from this configuration vertex is traversed by choosing a coupling graph edge where to execute the CNOT. A new configuration is reached by using the MI swap strategy (Fig. 10) , and the traversal can move around the cycle (Fig. 9b) to land in a new configuration: this is as if the next edge traversal is prepared. The procedure (traverse cycle followed by traverse chain) is repeated until the end of one of the chains is reached.
A solution is reached at each chain end point (e.g. marked by . . . Stop in Fig. 8), and includes: a) the compiled circuit C', b) the total number of inserted Hadamard and SWAP gates (a quantitative measure of the circuit's gate count), and c) the depth of C'. Each solution is stored, and the best one (minimum gate count and depth) is selected after the entire backtracking procedure ends.
The backtracking step undoes the last CNOT-edge mapping and considers the next configuration on the previous cycle: this is equivalent to selecting a different edge where to map the remote CNOT.
The first part of the algorithm ends when all the cycles and configurations were considered. Because the CNOT-chains can also be permuted, the second part of the algorithm is to repeat the above procedure for all valid CNOT-chain permutations (Fig. 11).
FIG. 11. Assuming that the CNOTs can be commuted in the original circuit, the order of the vertices in each CNOT-chain can be permuted. The white CNOT is compiled first and the green CNOT second (a single vertex of this CNOT is included in the figure). The order of the concentric configuration cycles is swapped after commuting gates in the compiled circuit C'.
Search diagrams allow a worst case complexity approximation of the naive exact method. For a circuit C with q qubits and n CNOTs, there are q! potential configurations, and at most 2n (when all CNOTs are parallel in C) concentric configuration cycles for a maximum total of 2n · q! visualised configurations. Not all configurations are valid, because the coupling graph has at most |E| ≤ q(q − 1) edges, and there are |V|^n possibilities to place the n CNOTs on the coupling graph edges. A more realistic approximation of the number of valid configurations is 2n · q! · |V|^n. After considering that in the worst case there are n! valid CNOT-chains, the dimension of the search space explodes to 2n · q! · |V|^n · n!. This analysis hints at the complexity of the exact solution, and its lack of feasibility.
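To get a feeling for how fast the worst case estimate 2n · q! · |V|^n · n! grows, it can be evaluated for small inputs. The function name `search_space_bound` is hypothetical, and |V| = q is assumed by default, as in the definition of the coupling graph.

```python
from math import factorial

def search_space_bound(q, n, num_vertices=None):
    """Worst case search space size 2n * q! * |V|^n * n!.

    q: number of qubits, n: number of CNOTs,
    num_vertices: |V| of the coupling graph (defaults to q).
    """
    v = q if num_vertices is None else num_vertices
    return 2 * n * factorial(q) * v ** n * factorial(n)

small = search_space_bound(q=5, n=3)
larger = search_space_bound(q=5, n=10)
print(small)
print(larger > 10 ** 15)
```

Even for five qubits and three CNOTs the estimate is already in the hundreds of thousands, and ten CNOTs push it beyond 10^15.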
C. Heuristics
The very high complexity of the exact method is a motivation for heuristics. It is useful to attempt to identify heuristic types and functionalities. As mentioned in Sec. I, compilation is the process of transforming a circuit C into another circuit C' that conforms to a set of constraints encoded into a coupling graph. Therefore, it is possible to preprocess C and postprocess C'.
Preprocessing adapts C for compilation, and it is viable to try to reduce the number of single qubit gates and CNOT gates by using, for example, template based optimisations [10]. Postprocessing can be template based too, and can also include recompilation of subcircuits of C' (the IBM QISKit uses this approach for single qubit gates).
The NISQ compilation problem is similar to the dynamic traveling salesman problem. The search diagram traversal is visiting configuration nodes, and each configuration influences the costs considered during the mapping of a remote CNOT to a coupling graph edge. The weights of corresponding edges in different CNOT-chains would be different (if not chosen to have binary values as mentioned in Sec. III C). Thus, the search diagram is comparable to the search space of a dynamic optimisation problem [3], and for this reason it would be possible to replace the entire exact algorithm with an ant colony based method or an evolutionary algorithm. From this perspective, the method from [12] using temporal planners is a heuristic alternative to the presented exact solution.
Heuristics can be included also for the previously discussed mapping problems. Selecting the start configuration (or any other configuration along the concentric cycles) could be performed using existing LNN optimisation methods, but cost models adapted to MI swaps should be formulated and analysed first. Another possibility is to collect all configurations generated along a CNOT-chain and try them out as start configurations. However, given the dimension of each configuration cycle, the collected configurations may be as good/bad as the initial one. Ranking coupling graph nodes is another heuristic for building the initial configuration [4] . The circuit mapping strategy presented in [7] would also fall in this category.
Traversal of edges along CNOT-chains could be sped up by reducing the number of backtracking steps (minimum is zero), and by selecting only from a few best possible edges for the mapping. The procedure for selecting the best coupling graph edge is the following: 1) Shortest paths between all pairs of coupling graph vertices are computed using the Floyd-Warshall algorithm; 2) It is possible to add weights to the coupling graph edges (e.g. to prefer certain areas of the graph), or to treat the coupling graph as undirected; 3) Once a remote CNOT needs to be mapped to an edge, the sum of the distances between the coupling graph vertices where the qubits are located and the endpoints of each graph edge is computed (e.g. Fig. 6). The edge with the minimum distance sum is chosen, and, if multiple edges have the same distance sum, the last one in the list is chosen. Thus, the weighting function used for the coupling graph edges influences the edge selection.
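The three-step edge selection procedure can be sketched as follows. Several details are assumptions made for this illustration and are not specified in the text: the coupling graph is treated as undirected with unit edge weights, both assignments of the two qubits to the two edge endpoints are tried and the cheaper one is used, and the names `floyd_warshall` and `select_edge` are hypothetical. The example graph is a five-vertex line, not the graph from Fig. 6.

```python
def floyd_warshall(num_vertices, edges):
    """All-pairs shortest path lengths, treating the edges as undirected."""
    inf = float("inf")
    dist = [[0 if i == j else inf for j in range(num_vertices)]
            for i in range(num_vertices)]
    for u, v in edges:
        dist[u][v] = dist[v][u] = 1  # unit weights (assumption)
    for k in range(num_vertices):
        for i in range(num_vertices):
            for j in range(num_vertices):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

def select_edge(dist, edges, control_pos, target_pos):
    """Pick the edge with the minimum distance sum to both qubits.

    Ties are resolved in favour of the last edge in the list.
    """
    best_edge, best_cost = None, float("inf")
    for u, v in edges:
        cost = min(dist[control_pos][u] + dist[target_pos][v],
                   dist[control_pos][v] + dist[target_pos][u])
        if cost <= best_cost:  # "<=" keeps the last edge among equal costs
            best_edge, best_cost = (u, v), cost
    return best_edge, best_cost

line_edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
dist = floyd_warshall(5, line_edges)
chosen = select_edge(dist, line_edges, control_pos=0, target_pos=4)
print(chosen)
```

On the line graph every edge has the same distance sum, so the tie-breaking rule decides and the last edge in the list is returned. Adding non-uniform weights in `floyd_warshall` would change which edge wins, which reflects the remark that the weighting function influences the selection.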
Edge mapping could be performed for multiple remote CNOTs in parallel, too. This possibility shows that the algorithm from [15] is a heuristic fitting in the framework of this work.
Permuting the CNOT-chains could influence the optimality of the compiled circuit, although it is not known for the moment if heuristics targeting this problem would not have the same effect as preprocessing.
V. CONCLUSION
NISQ compilation is gradually receiving increased attention, due to its practical industrial relevance. However, the complexity of compilation is too high for exact algorithms to be feasible. For these reasons, before heuristic methods are proposed, it is useful to have a clearer understanding of the underlying compilation procedure. This work considered compilation as a sequence of single remote CNOT mappings to coupling graph edges, and this enabled the introduction of the concepts of CNOT-chain and search diagram. The utility of the latter was explored by illustrating and discussing potential heuristics targeting three combinatorial subproblems.
The presented methodology is very general, and future work will address specific quantitative aspects of different heuristics implemented in a modular compilation software. For example, circuit depth, gate count, output error approximation and compilation execution time will be addressed by the compilation tool.
ACKNOWLEDGEMENT
This work was funded by the Linz Institute of Technology project CHARON.
