Fault-tolerant (FT) computation by using quantum error correction (QEC) is essential for realizing large-scale quantum algorithms. Devices are expected to have enough qubits to demonstrate aspects of fault tolerance in the near future. However, these near-term quantum processors will only contain a small number of noisy qubits and allow limited qubit connectivity. Fault-tolerant schemes that not only have low qubit overhead but also comply with geometrical interaction constraints are therefore necessary. In this work, we combine flag fault tolerance with quantum circuit mapping to enable an efficient flag-bridge approach to implementing FT QEC on near-term devices. We further show an example of performing Steane code error correction on two current superconducting processors and numerically analyze their performance under circuit-level noise. The simulation results show that the QEC circuits that measure more stabilisers in parallel have lower logical error rates. We also observe that the Steane code can outperform the distance-3 surface code using flag-bridge error correction. In addition, we foresee potential applications of the flag-bridge approach such as FT computation using lattice surgery and code deformation techniques.
I. INTRODUCTION
Near-term quantum processors will consist of fifty to a few hundred noisy qubits and allow a limited number of faulty gates. They are also known as Noisy Intermediate-Scale Quantum (NISQ) [1] processors. For instance, Google, IBM, and Intel have respectively announced 72-qubit [2], 50-qubit [3], and 49-qubit [4] superconducting processors, which have coherence times of ∼100 microseconds and two-qubit gate error rates near 0.1% [5]. Many efforts have focused on designing special quantum applications [6, 7] and developing compilation techniques [8, 9] so that one can solve practical problems and even demonstrate quantum supremacy on NISQ processors using only noisy bare qubits.
However, fault tolerance will be necessary to reliably implement large-scale quantum algorithms. This can be achieved through the use of active quantum error correction (QEC). The idea of QEC is to encode one logical qubit into many physical qubits and repeatedly perform syndrome extraction to detect and correct errors. Both the encoding and error detection procedure should be fault-tolerant (FT). Furthermore, operations on these logical qubits need to be performed fault-tolerantly. Although the high qubit overhead of QEC makes it difficult to realize scalable FT computation in the near future, we can begin to show how fault tolerance works in practice. The first step is to demonstrate fault-tolerant quantum error correction, that is, FT quantum memory.
General fault-tolerant quantum error correction protocols such as those from Shor [10], Steane [11], and Knill [12] can be applied to various stabiliser codes. However, these error correction schemes all require many ancilla qubits, which are scarce resources in near-term quantum processors. In order to perform FT QEC with low qubit overhead, a new error correction protocol has been proposed [13-16]. It replaces a non-FT syndrome extraction circuit by a circuit which can detect correlated (or hook) errors by adding only one or a few extra ancilla qubits, called flag qubits.
This flag QEC scheme provides an efficient way to demonstrate fault tolerance in small experiments. However, many orthodox flag circuits couple one qubit to many others, requiring high-degree qubit connectivity. It is difficult or even impossible to directly map available flag circuits onto near-term quantum processors which have geometrical interaction constraints, such as the nearest-neighbour connectivity in superconducting processors [17-19]. One may need to apply extra operations such as SWAP gates to move qubits to be adjacent, increasing the circuit size in terms of depth and total gate number, or even circuit width. More importantly, the resulting circuit may no longer be fault-tolerant, or may produce higher error rates when used.
In this work, we extend the set of available flag circuits to a variety of equivalent circuits that can perform the same stabiliser measurement fault-tolerantly. In these circuits, the flag qubits are also used as bridges to cope with the connectivity constraints, and are called flag-bridge qubits. Using these circuits, one can fault-tolerantly map a QEC code to a given processor with low overhead by choosing appropriate flag-bridge circuits. We also develop a simulation framework to automate the procedure of fault tolerance checking and decoder design (including a look-up-table decoder and a neural-network decoder) for given flag-bridge circuits of some low-distance QEC codes. This automation is desirable for demonstrating fault-tolerant quantum error correction in small experiments. Moreover, we present mapping examples of the Steane code on two different qubit processor topologies and analyze their fault tolerance numerically. In addition, we show that the proposed flag-bridge approach can be applied to FT computation implemented by lattice surgery and code deformation techniques.
The rest of this paper is organized as follows. We first review the basics of flag-based quantum error correction in Section II. Then we introduce the proposed flag-bridge approach in Section III. Afterwards, the mapping of the Steane code onto two qubit processor topologies and the corresponding numerical results are shown in Section IV. Moreover, we discuss potential applications of flag-bridge circuits in Section V. Finally, Section VI concludes the paper.

arXiv:1909.07628v1 [quant-ph] 17 Sep 2019
II. FLAG-BASED QUANTUM ERROR CORRECTION
In this section, we briefly introduce the flag-based error syndrome extraction for stabiliser codes. For more details, we refer the readers to [13-16, 20]. Figure 1 shows the circuits for measuring a weight-4 Z-stabiliser (or check); similar circuits can be derived for measuring other Pauli operators. In all the circuits presented in this paper, a cnot gate between a data qubit and an ancilla qubit is called an s-cnot (in black) and a cnot gate between two ancilla qubits is called an f-cnot (in blue). Generally, the syndrome for this Z-check can be extracted using the circuit with only one ancilla qubit (Figure 1a). However, this circuit is not fault-tolerant because a single fault could cause two or more data errors. These correlated errors may lead to failures of some QEC codes. The surface code is an exception, which can correct these hook errors if the two-qubit gates are performed in a specific order [21]. In order to perform fault-tolerant quantum error correction, one can use the flag circuits in Figures 1b and 1c, which add only one extra ancilla qubit. When there is no fault, each of these flag circuits behaves the same as the non-FT one. When there is a fault that can lead to hook errors, the measurement of the flag qubit will be nontrivial, so that the hook errors are detected. For instance, if the same fault in Figure 1a happens in the circuit of Figure 1b, then the measurement of qubit f will be '1' (raising a flag).
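The hook-error mechanism described above can be illustrated by tracking a single Z fault through a list of cnot gates. The following is a minimal sketch, not the authors' simulator. It assumes that s-cnots use the data qubit as control and s as target, that the f-cnots in the style of Figure 1c use f as control and s as target, that f is measured in the X basis, and that the gates act on data qubits in the order a, b, c, d. The function name `propagate_z_fault` is hypothetical.

```python
def propagate_z_fault(cnots, fault_after, fault_qubit):
    """Propagate a single Z fault forward through a list of CNOTs.

    cnots: list of (control, target) pairs in time order.
    fault_after: index of the gate after which Z appears on fault_qubit.
    Returns the set of qubits that carry a Z at the end of the circuit.
    """
    z_support = {fault_qubit}
    for control, target in cnots[fault_after + 1:]:
        # Under CNOT, a Z on the target copies onto the control
        # (Z_t -> Z_c Z_t); a Z on the control is unchanged.
        if target in z_support:
            z_support ^= {control}
    return z_support


# Bare-ancilla circuit in the style of Figure 1a (gate order assumed).
bare = [("a", "s"), ("b", "s"), ("c", "s"), ("d", "s")]
hook = propagate_z_fault(bare, fault_after=1, fault_qubit="s")
data_error = hook - {"s"}  # Z on c and d, a weight-2 data error

# Two-ancilla circuit in the style of Figure 1c (f-cnot orientation assumed).
flagged = [("f", "s")] + bare + [("f", "s")]
flag_support = propagate_z_fault(flagged, fault_after=2, fault_qubit="s")
flag_raised = "f" in flag_support  # Z on f flips an X-basis measurement
```

Under these assumptions the fault leaves Z on data qubits c and d, which equals Z on a and b up to the measured stabiliser, and in the two-ancilla circuit the same fault also reaches f through the final f-cnot, so the flag is raised.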
Flag-based quantum error correction can be applied to many codes such as the [[5, 1, 3]] code, Hamming codes, surface codes, color codes, etc. For example, fault-tolerant QEC for the smallest color code, the Steane code in Figure 2, can be realized as follows: first measure each stabiliser generator one by one using flag circuits similar to those in Figure 1; if a flag is raised or a syndrome appears, then stop this round¹ and sequentially measure all the stabilisers using the non-FT syndrome extraction circuit. Note that if the connectivity is fixed, one cannot necessarily switch to a different syndrome measurement circuit on the fly. One can use only two ancilla qubits to perform FT QEC for the Steane code at the cost of using more time steps. However, many quantum systems have very short coherence times [17, 22, 23]. Parallelizing stabiliser measurement will be beneficial to achieve lower logical error rates. Chao and Reichardt [14, 16] have proposed several circuits to perform two or three parity checks in parallel for the Steane code. The circuits they propose for measuring two and three Z-checks at the same time using only one flag qubit are shown in Figure 3. As shown, more ancilla qubits are required to achieve this parallelism compared to the sequential stabiliser measurement circuits. This implies that there is a trade-off between the number of qubits required and the number of stabilisers that can be measured simultaneously.

¹ A full round of error syndrome extraction is defined as measuring all the stabiliser generators of the code once.

FIG. 1: Circuits for measuring a weight-4 Z operator, where s is the syndrome qubit and f is the flag qubit. (a) The circuit using only one syndrome ancilla may not be fault-tolerant. For example, one fault ($Z_s$) on the second cnot gate could lead to correlated weight-2 errors on data qubits ($Z_a$, $Z_b$), which may not be correctable. (b) and (c) The flag-based circuits can detect these hook errors [13-15].
Flag-based syndrome extraction is promising for demonstrating quantum error correction and fault tolerance in small quantum experiments because of its low qubit overhead. However, current or near-term quantum processors have many hardware limitations. One of the main constraints is the degree of qubit connectivity, that is, one qubit can only interact with a limited number of other qubits. It is challenging to map existing flag circuits onto connectivity-constrained quantum processors while maintaining fault tolerance at low cost. For instance, the ancilla qubit s of the flag circuit in Figure 1b needs to interact with five qubits, which cannot be supported in a grid topology where each qubit has at most four neighbours, such as the one in [18]. Besides, general circuit mapping techniques [24-28] that move qubits to be adjacent by applying SWAP gates will lead to high overhead in the circuit size. More importantly, they may result in higher logical error rates or even destroy the fault tolerance of the QEC circuits because of the error propagation through two-qubit gates. In this work, we propose a flag-bridge approach to solve this mapping problem, which will be explained in the next section.

FIG. 3: Flag circuits from [14, 16] for (a) measuring two weight-4 Z-checks in parallel using three ancillas and (b) measuring three weight-4 Z-checks in parallel using four ancillas.
III. FLAG-BRIDGE QUANTUM ERROR CORRECTION
In this section, we illustrate the proposed flag-bridge approach which allows fault-tolerant quantum error correction with low qubit overhead on connectivity-limited quantum processors.
A. Flag-bridge syndrome extraction circuits
We first provide a microscopic explanation of how a flag-based circuit can perform a specific stabiliser measurement using the stabiliser formalism [29]. Then we generalise this flag scheme such that one can extend available flag circuits to more equivalent ones that differ in the total number of gates, circuit depth, and connectivity requirement. We will use the circuit in Figure 1c as an example. A flag syndrome extraction circuit can be understood as a circuit that replaces the bare ancilla qubit by an 'encoded' ancilla up to gate commutation. As shown in Figure 1c, the first blue cnot gate entangles ancilla qubit s and qubit f (the encoding circuit), encoding a logical ancilla in a [[2, 1, 1]] error detection code whose stabiliser is $X_s \otimes X_f$ and whose logical operators are
This logical qubit is fixed in the Z basis. Then one can perform stabiliser measurement using this logical ancilla. Assume the four data qubits (a, b, c, d) are initially stabi-
, the four subsequent s-cnot gates between data qubits and ancilla qubits will keep the stabilisers of all the qubits invariant, which are,
but it will gradually transform the logical operators into
More generally, since $X_f$ and $X_s$ have the same effect on the encoded ancilla state, one can perform each s-cnot gate between the particular data qubit and any ancilla qubit. Specifically, in the encoded ancilla area, $k_s$ and $k_f$ s-cnot gates can be applied on ancillas s and f respectively, where $k_s$ and $k_f$ are integers and $k_s + k_f = 4$. For example, the circuit shown in Figure 4a also performs a weight-4 Z-stabiliser measurement equivalent to this circuit (Figure 1c), where $k_s = 3$ and $k_f = 1$. Afterwards, the last f-cnot (the decoding circuit) disentangles these two ancillas, leading to the final stabiliser to be
and the logical operators of these ancillas to be
This means the readout $y$ of measurement $M_z$ on ancilla s indicates the measurement result of the stabiliser $Z_{a,b,c,d}$. Therefore, this circuit indeed measures a weight-4 Z-check. Besides, the measurement result of ancilla f gives the syndrome of the [[2, 1, 1]] code, that is, it can detect one single Z error that occurs on any ancilla and then raises a flag. Once a flag circuit based on the above approach is generated, one can transform it into other equivalent ones that perform the same stabiliser measurement by applying gate commutation, e.g., the circuit in Figure 1b. Note that the circuits generated by commuting gates may not be fault-tolerant.
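The operator flow described in this subsection can be traced explicitly. The block below is a reconstruction from the prose, not a transcription of the original displayed equations. It assumes that s is prepared in $|0\rangle$ and f in $|+\rangle$, that both f-cnots use f as control and s as target, that every s-cnot uses a data qubit as control, and that the data qubits start in an eigenstate of $Z_a Z_b Z_c Z_d$ with eigenvalue $(-1)^y$. The symbols $\bar{X}$ and $\bar{Z}$ for the logical operators of the encoded ancilla are notation introduced here.

```latex
\begin{align*}
\text{after encoding:}\quad
  & \text{stabiliser } X_s X_f, \qquad
    \bar{Z} = Z_s Z_f = +1, \quad \bar{X} = X_s \sim X_f, \\
\text{after the four s-cnots:}\quad
  & \text{stabilisers } X_s X_f,\; (-1)^y Z_a Z_b Z_c Z_d, \qquad
    \bar{Z} \mapsto Z_a Z_b Z_c Z_d\, Z_s Z_f, \\
\text{after decoding:}\quad
  & \text{stabilisers } X_f,\; (-1)^y Z_a Z_b Z_c Z_d, \qquad
    \bar{Z} \mapsto Z_a Z_b Z_c Z_d\, Z_s .
\end{align*}
```

Since $\bar{Z}$ keeps eigenvalue $+1$ throughout, the last line gives $Z_s = (-1)^y$, so measuring s in the Z basis returns $y$. A single Z error on s or f inside the encoded region anticommutes with $X_s X_f$ and therefore flips the final X-basis measurement of f.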
Moreover, one can use a larger 'encoded' ancilla to measure a weight-n Z-check (similar circuits can be applied to other Pauli operators). This logical ancilla is encoded by $m$ physical qubits denoted by a set $Q = \{1, 2, \dots, m\}$, where one is the syndrome qubit ($Q_s = \{1\}$) and the other $m - 1$ are flag qubits ($Q_f = \{2, \dots, m\}$). The underlying error detection code [[m, 1, 1]] of this logical ancilla has stabilisers
and logical operators
Similar to the two-ancilla flag circuits, this weight-n check can be distributed over all $m$ ancillas: $k_i$ s-cnot gates are applied on ancilla $i$, where $\sum_{i=1}^{m} k_i = n$.
For example, the circuit in Figure 4b measures one weight-4 Z-stabiliser using one syndrome qubit (s) and two flag qubits ($f_1$, $f_2$); each qubit only needs to interact with at most three others.
In addition, one can also measure $p$ Z-checks in parallel by encoding $p$ logical ancillas into $m$ physical ancillas. The underlying [[m, p, 1]] code is stabilized by
Its p logical operators are
where $Q_s$ is the set of $p$ syndrome qubits and $Q_f$ is the set of $m - p$ flag qubits. After the encoding of ancilla qubits, one can simply assign all the s-cnot gates for performing one check to a particular syndrome qubit. Furthermore, one can reduce the total number of s-cnot gates by applying gate commutation when two or more checks are performed on the same data qubit(s). Figure 3a and Figure 3b show the flag circuits to measure two and three checks of the Steane code in parallel by using ancillas encoded in a [[3, 2, 1]] code and in a [[4, 3, 1]] code, respectively. These circuits use fewer s-cnot gates than otherwise required, by commuting some cnot gates out of the encoded area (generally 4 s-cnot gates are needed for each weight-4 check).
Moreover, one can achieve this gate reduction by distributing some s-cnot gates to flag qubits, which can even help to reduce the circuit depth. Figure 5a shows an example circuit that measures two checks of the Steane code in parallel but uses fewer timesteps than Figure 3a. Note that the s-cnot distribution for parallel syndrome measurement needs to be designed carefully since one flag qubit is used for flagging multiple checks. This distribution also depends on the decoding procedure of the [[m, p, 1]] code. Figure 5b shows an example circuit that measures three checks using fewer timesteps than Figure 3b. Besides, the circuits in Figure 5 require a lower degree of qubit connectivity than the ones in Figure 3.
By employing the ideas of encoding ancillas, distributing s-cnot gates, and commuting gates, we can generate more equivalent syndrome extraction circuits that have different connectivity requirements. Note that not all the equivalent circuits generated using this approach are fault-tolerant. The fault tolerance can be checked based on the error correction protocol, which will be explained in the next section. For these FT circuits, ancillas are not only used as syndrome and flag qubits to detect errors, but also as bridges to allow the interaction between data qubits and the encoded ancilla block. Such a syndrome extraction circuit is called a flag-bridge circuit.
FIG. 4: Flag-bridge circuits for measuring one weight-4 Z-check using (a) two ancillas and (b) three ancillas.
B. Fault-tolerant protocol for flag-bridge error correction
FT QEC condition
For distance-3 codes, a QEC circuit is fault-tolerant if it can either immediately correct all errors from a single fault or only leave a weight-1 error to the next cycle. A formal condition of FT flag-bridge quantum error correction for distance-3 codes, similar to the flag error correction in [15], can be defined as follows: Consider a stabiliser code $S = \langle g_1, g_2, \dots, g_r \rangle$ and its QEC circuit $C$, which is composed of the flag-bridge circuits for measuring the stabiliser generators, that is, $C = \{c(g_1), c(g_2), \dots, c(g_r)\}$, where $c(g_i)$ is the flag circuit for measuring stabiliser $g_i$. Note that the total number of flag-bridge circuits is smaller than $r$ if several stabilisers are measured simultaneously in one flag-bridge circuit. For all generators $g$, all pairs of elements $E, E' \in \mathcal{E}(g)$ satisfy $sf(E) \neq sf(E')$ or $E \sim E'$, where $\mathcal{E}(g)$ is the set of all errors caused by one fault and $sf(E)$ is the syndrome and flag string caused by $E$. We define $E \sim E'$ to mean that there is an element $g$ in $S$ such that $E \propto gE'$, that is, these errors are stabiliser-equivalent.
Based on this criterion, we check the fault tolerance of each generated QEC circuit $C$ through a brute-force simulation under circuit-level noise, analogous to [13]. It is implemented by injecting each individual fault from a circuit-based error model on every single-qubit or two-qubit gate in a given QEC circuit and then collecting the final syndromes and flags. If there are two or more sets of errors which lead to the same syndrome-flag string but do not yield a stabiliser when multiplied, then this QEC circuit is not fault-tolerant.
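The final comparison step of this check can be sketched as follows for the Steane code. This is a minimal illustration under stated assumptions, not the authors' framework: Pauli errors are stored as hypothetical `(x_mask, z_mask)` integer pairs with phases ignored, the stabiliser generators are taken from the parity checks of the [7, 4, 3] Hamming code in an arbitrary qubit ordering, and the `records` of syndrome-flag strings and errors are assumed to come from a separate fault-injection simulation that is not shown.

```python
from itertools import product

# Parity checks of the [7,4,3] Hamming code; each row gives the support
# of one X-type and one Z-type Steane stabiliser generator.
H = [0b0001111, 0b0110011, 0b1010101]


def stabiliser_group():
    """All 64 Steane stabiliser elements as (x_mask, z_mask) pairs."""
    group = set()
    for xs in product([0, 1], repeat=3):
        for zs in product([0, 1], repeat=3):
            x = z = 0
            for bit, row in zip(xs, H):
                x ^= bit * row
            for bit, row in zip(zs, H):
                z ^= bit * row
            group.add((x, z))
    return group


GROUP = stabiliser_group()


def equivalent(e1, e2):
    """E ~ E' iff the product E E' is a stabiliser (phases ignored)."""
    return (e1[0] ^ e2[0], e1[1] ^ e2[1]) in GROUP


def is_fault_tolerant(records):
    """records: iterable of (sf_string, (x_mask, z_mask)) from single faults.

    Returns False as soon as two inequivalent errors share an sf string.
    Comparing against one representative per string suffices because
    stabiliser equivalence is transitive.
    """
    representative = {}
    for sf, err in records:
        if sf in representative and not equivalent(representative[sf], err):
            return False
        representative.setdefault(sf, err)
    return True
```

For example, two Z errors with masks `0b0000011` and `0b0001100` multiply to the generator `0b0001111` and may therefore share a syndrome-flag string, whereas single-qubit Z errors on two different qubits may not.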
FT QEC procedure
A full cycle of fault-tolerant error correction for distance-3 codes using flag-bridge circuits can be performed as follows: [...] of $c(g_i)$, then this round will be terminated and another full round for all circuits in $C$ will be performed. All the syndromes $s^2 = \{s^2_i\}$ and flags $f^2 = \{f^2_i\}$ of the second round will be collected.
If
In this FT QEC procedure, we use flag-bridge circuits for both rounds of syndrome extraction because of the connectivity constraint, which is different from the procedures proposed in [14-16], where non-FT syndrome extraction circuits that use only one ancilla are executed for the second round.
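The control flow of such a cycle, as far as it can be read from the surviving text and from Appendix A, can be sketched as follows. This is an illustrative outline only. The callables `run` and `decode` are hypothetical placeholders for circuit execution and decoding, and the choice to pass the index of the flagged circuit to the decoder is an assumption.

```python
def ft_qec_cycle(circuits, run, decode):
    """One FT QEC cycle for a distance-3 code (control flow only).

    circuits: flag-bridge circuits c(g_i) in execution order.
    run(c):   hypothetical callable returning (syndrome_bits, flag_bits).
    decode(first_flags, s2, f2): hypothetical decoder callable.
    Returns the decoder output, or None if the first round is trivial.
    """
    for index, circuit in enumerate(circuits):
        syndrome, flags = run(circuit)
        if any(syndrome) or any(flags):
            # Terminate the first round; remember flags only if one was raised.
            first_flags = (index, tuple(flags)) if any(flags) else None
            break
    else:
        return None  # no syndrome and no flag: nothing to correct

    # Second full round, again with flag-bridge circuits.
    second = [run(c) for c in circuits]
    s2 = tuple(bit for s, _ in second for bit in s)
    f2 = tuple(bit for _, f in second for bit in f)
    return decode(first_flags, s2, f2)
```

A look-up-table decoder would ignore `f2`, while a neural-network decoder could take it as an additional input, in line with Appendix A.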
Error decoders
Normally, topological codes like surface codes have special structure in their measured syndromes, so that one can use heuristic algorithms to find high-probability errors. These types of decoders, such as the minimum weight perfect matching decoder [30] and the belief propagation decoder [31], can be applied to the same QEC code with different distances. However, the flag-bridge error correction circuits of a QEC code for a specific quantum platform are ad hoc. Different circuits may be chosen based on the qubit topology, leading to different error-syndrome patterns and in turn requiring different decoding strategies. It is difficult to design heuristic decoding algorithms that can be applied to various syndrome extraction circuits. Since flag-bridge circuits are likely to be used for low-distance codes in small experiments, a simple decoding solution is to create a look-up table (LUT) for each QEC circuit. A LUT decoder can find the most likely Pauli errors from a single fault that lead to the observed syndromes and flags. LUT decoders can be easily derived from the brute-force checking procedure [13].
Another type of decoder is the neural-network (NN) decoder [32-35]. NN decoders can provide high-speed decoding, are adaptable to different error models, and can be more easily implemented on hardware. Moreover, a NN decoder can be developed by training the network using only input-output pairs without any knowledge of the QEC code, making it favorable for flag-bridge circuits. For example, the inputs of a NN decoder are the observed syndromes and its outputs can be the actual physical errors that have occurred. The implementation details of the LUT decoder and the NN decoder can be found in Appendix A.
In this work, we design a simulation framework to automate the procedure of fault tolerance checking, LUT generation, and NN decoder training for given flag-bridge syndrome extraction circuits of the Steane code. This automation is desirable for demonstrating fault-tolerant quantum error correction in near-term processors which may have different geometrical interaction constraints.
IV. STEANE CODE ERROR CORRECTION ON TWO DEVICE TOPOLOGIES
In this section, we show how to map the Steane code error correction onto two different processors with limited connectivity using the proposed flag-bridge circuits, namely, the Surface-17 transmon processor (Surface-17) [18] and the IBM Q Tokyo processor (IBM-20) [17] (Figure 6). Furthermore, we numerically analyze each flag-bridge quantum error correction procedure under circuit-level noise. This error model inserts depolarizing errors after each operation in a flag-bridge circuit as follows: 1) each single-qubit gate is followed by X, Y, or Z, each with probability $p/3$; 2) each two-qubit gate is followed by an element of $\{I, X, Y, Z\}^{\otimes 2} \setminus \{II\}$, each with probability $p/15$; 3) each preparation or measurement in the Z basis is flipped with probability $p$. The elementary Clifford operations used in this simulation are preparation and measurement in the Z basis, H and cnot gates. Other operations need to be further decomposed into these elementary operations. For example, each control-phase gate is replaced by two H gates and one cnot gate.
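The three rules of this error model can be sampled as in the following sketch. It is a minimal illustration rather than the simulator used for the numerics. The function names are hypothetical, idling errors are not modelled, and Pauli errors are represented as plain strings.

```python
import random

SINGLE = ["X", "Y", "Z"]
# The 15 non-identity two-qubit Paulis, i.e. {I,X,Y,Z}^2 without II.
PAIR = [a + b for a in "IXYZ" for b in "IXYZ" if a + b != "II"]


def sample_gate_error(num_qubits, p, rng):
    """Return the Pauli string applied after a gate ('I'*n means no error).

    With total probability p an error occurs, and it is drawn uniformly,
    so each single-qubit Pauli has probability p/3 and each two-qubit
    Pauli has probability p/15.
    """
    if rng.random() >= p:
        return "I" * num_qubits
    return rng.choice(SINGLE if num_qubits == 1 else PAIR)


def sample_flip(p, rng):
    """Return True if a Z-basis preparation or measurement is flipped."""
    return rng.random() < p


rng = random.Random(2019)  # fixed seed only for reproducibility of the sketch
example = [sample_gate_error(2, 0.01, rng) for _ in range(5)]
```

Drawing one uniform number for the error event and a second for the error type keeps the total error probability per gate equal to $p$, which matches the per-element probabilities stated above.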
A. Mapping
Many current and NISQ processors have geometrical connectivity constraints, that is, each qubit can only interact with a few neighbours. It is challenging or even impossible to directly perform existing flag-based quantum error correction without adding more operations and/or without losing fault tolerance. For example, the flag circuit which measures one weight-4 Z-stabiliser of the Steane code in Figure 1b cannot be directly executed on the Surface-17 topology (Figure 6a) but can be supported by the IBM-20 topology (Figure 6b). This is because qubit s needs to interact with 5 qubits but in Surface-17 each qubit has at most 4 neighbours. The flag circuit in Figure 1c can be performed on both processor topologies. However, a full round of error syndrome extraction requires all the stabiliser generators of the Steane code to be measured. The full syndrome extraction using only these two flag circuits (Figure 1b and Figure 1c) can be directly performed on the IBM-20 topology (e.g., a mapping in Figure 8a) but not on the Surface-17 topology.

[...] Figure 5b to measure three stabilisers simultaneously.
As mentioned above, all the flag-bridge circuits shown in this paper are used to measure Z-stabilisers; similar circuits with the same ancillas can be derived for measuring X-stabilisers. Figures 7 and 8 show examples of mapping the Steane code error correction using the flag-bridge circuits onto the Surface-17 topology and the IBM-20 topology, respectively.
In these mapping figures, the qubits in each red, blue, or green block are the ancillas in each flag-bridge circuit and they are used to measure the corresponding Z(X)-stabiliser in the same color plaquette in Figure 2. The flag-bridge qubits in the yellow block are used to measure the Z(X)-stabilisers in both red and green plaquettes. The flag-bridge qubits in the gray block measure the Z(X)-stabilisers in all three plaquettes. The X and Z stabilisers are measured separately; more specifically, one first measures all the stabilisers of one type and then measures the other type. Furthermore, each of the flag-bridge circuits for the Steane code error correction needs to be executed sequentially. On the Surface-17 topology, one can measure all the stabilisers of the Steane code one by one when using the mapping in Figure 7a. At most two stabilisers can be measured in parallel in this topology, as shown in Figure 7b. In contrast, three Z(X)-stabilisers can be measured at the same time on the IBM-20 topology (Figure 8c).
The circuit characterization of one full round of syndrome extraction for the Steane code when using different mappings is shown in Table I. This characterization includes the total number of ancilla qubits, the total number of operations and timesteps, and the number of f-cnot and s-cnot gates. For comparison, we also show these parameters for the rotated distance-3 surface code (SC d=3). As shown in Table I, the circuits which can measure more stabilisers simultaneously require fewer operations and fewer timesteps. Moreover, though the distance-3 surface code uses more ancilla qubits, it always needs fewer operations and fewer timesteps than the Steane code.
B. Numerics
We further compare different mapping circuits in terms of their fault tolerance, which is analyzed by numerical simulation under circuit-level noise. For each point in the numerics, $10^6$ iterations of a full QEC cycle have been run and confidence intervals at 99.9% are plotted. Moreover, NN decoders are used for this comparison since they perform better than LUT decoders (see Figures 9a and 14). As shown in Figure 9, for the Steane code, the circuits that can measure more stabilisers in parallel have lower logical error rates, likely because they consist of fewer operations and require fewer timesteps. Moreover, when there are no idling errors ($p_I = 0$ in Figure 9a) or a small probability of idling errors ($p_I = 0.01p$ in Figure 9b), the Steane code can achieve similar performance to, or even outperform, the distance-3 surface code by parallelizing stabiliser measurements. This is because the circuit for the surface code error correction consists of more s-cnot gates than the QEC circuits that can measure several stabilisers in parallel for the Steane code. When idling errors are significant, we observe that the circuit with fewer timesteps results in lower logical error rates (as shown in Figures 9c and 9d for $p_I = 0.1p$ and $p_I = p$ respectively).
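To give a sense of the statistical resolution that $10^6$ iterations provide, the sketch below computes a 99.9% confidence interval for an estimated logical error rate. The text does not state which interval construction was used, so the Wilson score interval here is an assumption, and the failure count in the example is hypothetical rather than a result from the simulations.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(failures, trials, confidence=0.999):
    """Wilson score interval for a binomial proportion.

    failures: number of cycles that ended in a logical error.
    trials:   total number of simulated QEC cycles.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided quantile
    phat = failures / trials
    denom = 1 + z * z / trials
    centre = (phat + z * z / (2 * trials)) / denom
    half = z * sqrt(phat * (1 - phat) / trials
                    + z * z / (4 * trials ** 2)) / denom
    return centre - half, centre + half


# Hypothetical count: 1000 logical failures in 10**6 cycles.
low, high = wilson_interval(1000, 10 ** 6)
```

For a logical error rate near $10^{-3}$ the interval has a width of roughly $2 \times 10^{-4}$, so differences between circuits that are smaller than this would not be resolved at this sample size.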
V. OTHER APPLICATIONS OF THE FLAG-BRIDGE CIRCUITS
In this section, we foresee some possible applications of the flag-bridge circuits including both fault-tolerant quantum error correction and fault-tolerant quantum computation.
A. Flag-bridge QEC for the five-qubit code
Analogous to the flag circuits, the flag-bridge circuits can also be applied to other distance-3 error correction codes such as the [[8, 3, 3]], [[10, 4, 3]], [[11, 5, 3]], and [[5, 1, 3]] codes, Hamming codes [[$2^r - 1$, $2^r - 1 - 2r$, 3]], etc. In this section, we consider the [[5, 1, 3]] code as an example. This code has four stabiliser generators, which are cyclic permutations of XZZXI. Figure 10 shows the flag-bridge circuits that can measure an XZZX stabiliser fault-tolerantly. Each stabiliser of the 5-qubit code can be measured using these circuits up to data qubit permutation. Similar circuits using three ancillas to measure one stabiliser are also proposed in [13]. All these circuits have different connectivity requirements. By selecting and combining some of them, one can map the 5-qubit code error correction onto different qubit topologies. Figure 11 shows the mapping of the 5-qubit code to the Surface-17 processor topology using the two-ancilla flag-bridge circuits and to the IBM Q Melbourne (IBM-16) processor topology using the three-ancilla flag-bridge circuits.
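The four generators of the [[5, 1, 3]] code follow directly from the cyclic structure mentioned above. The short sketch below builds them and checks that they commute pairwise. The helper names are hypothetical and Pauli operators are represented as plain strings with phases ignored.

```python
def cyclic_shifts(pauli, count):
    """Return the first `count` right-cyclic shifts of a Pauli string."""
    return [pauli[-k:] + pauli[:-k] if k else pauli for k in range(count)]


def commute(p, q):
    """Two Pauli strings commute iff they differ on an even number of
    positions where both are non-identity."""
    anti = sum(1 for a, b in zip(p, q) if a != "I" and b != "I" and a != b)
    return anti % 2 == 0


generators = cyclic_shifts("XZZXI", 4)
all_commute = all(commute(p, q) for p in generators for q in generators)
```

Each generator has weight four and acts as XZZX on four of the five qubits, which is why a single family of XZZX measurement circuits suffices up to a permutation of the data qubits.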
B. Flag-bridge circuits for FT computation
The geometrical interaction constraint in near-term quantum processors has also limited the fault-tolerant implementation of logical operations. For instance, a fault-tolerant cnot gate in planar surface codes and color codes can in principle be implemented transversally in a 3D structure, that is, by performing pair-wise cnot gates between data qubits of the two lattices. However, this transversal cnot is not realizable in near-term quantum technologies because of the local qubit connectivity limitation in a 2D architecture. Measurement-based protocols such as lattice surgery [36, 37] and code deformation [38, 39] have been proposed to comply with the 2D local interaction constraint. Figures 12 and 13 show the qubit layouts for performing lattice-surgery-based operations on the distance-3 surface code and the distance-3 color code (the Steane code), respectively. The details of implementing logical operations by lattice surgery can be found in [36, 37]. As shown in Figure 12, the merge operations can be directly performed on a 2D grid topology. As mentioned previously, the stabiliser measurement of surface codes can be realized by only using the one-ancilla circuit similar to Figure 1a. However, one ancilla qubit (the circled one in Figure 12b) is used by two stabilisers from different lattices during the split operation. One may have to measure these two stabilisers sequentially, which leads to more timesteps and in turn may result in higher logical error rates. To preserve parallelism of the stabiliser measurement, we propose to use the qubit layout in Figure 12c. By using this layout, one can measure all the stabilisers in parallel when splitting lattices since they no longer share ancillas. One can also perform the merge operation by replacing the original syndrome extraction circuit using one ancilla with the proposed flag-bridge circuits using two ancillas (Figure 1c), where ancillas are connected by dashed lines in Figure 12c. Similar mapping can be applied to other code-deformation-based operations on surface codes.

[FIG. 9: Logical error rate versus physical error probability; recovered panel titles: NND, $p_I = 0.01p$ and NND, $p_I = 0.1p$.]
Furthermore, lattice-surgery-based operations for the Steane code in Figure 13a cannot be directly realized in a 2D grid topology. Similar to the mapping in Figure 7b , one can map these operations fault-tolerantly using the three-ancilla flag-bridge circuits as shown in Figure 13b . Compared to the distance-3 surface code, the Steane code can achieve Clifford gates transversally. Moreover, it requires fewer qubits for both FT error correction and FT computation, which may be preferable for demonstrating fault tolerance in small experiments.
VI. DISCUSSION AND CONCLUSION
We have shown that the flag circuits can be phrased as circuits using encoded ancillas in an [[m, p, 1]] code. Based on this formulation, we proposed a flag-bridge approach to perform fault-tolerant quantum error correction for distance-3 codes on connectivity-constrained near-term quantum processors with low overhead. Furthermore, we mapped the Steane code error correction onto two current qubit topologies using the flag-bridge circuits. The numerical simulation results have shown that the QEC circuits that can measure more stabilisers in parallel achieve lower logical error rates, providing insights for fabricating processors with more connectivity. Moreover, we also showed that flag-bridge circuits can be applied to the 5-qubit code and to lattice-surgery-based operations for the surface codes and the Steane code. In addition, we have observed that the Steane code implementation that uses fewer qubits even outperforms the distance-3 surface code when idling errors occur with low probability. The Steane code also allows transversal Clifford gates, which may make it a better candidate than the distance-3 surface code for demonstrating fault tolerance in small experiments. However, the numerics in this work were carried out with Pauli errors; it will be interesting to test these circuits using more realistic error models. Furthermore, the mapping procedure in this work was hand-optimized. Future work will focus on automating the fault-tolerant mapping of flag-bridge quantum error correction onto given processors. Besides, we also need to investigate the extensibility and scalability to higher distance codes and fault-tolerant computation.

FIG. 10: Fault-tolerant circuits for performing an XZZX-check: (a), (b), (c) using two ancillas but requiring different connectivity; (d) using three ancillas. Similar circuits can be generated by re-distributing the s-cnot gates for each weight-4 check to different ancillas, as mentioned in Section III.

FIG. 11: [...] Figure 10 and (b) the IBM-16 topology using the three-ancilla circuit in Figure 10d.

[...] The layout after mapping using the two-ancilla flag-bridge circuits.
manuscript. We also thank Yang Wang and Xiaotong Ni for useful discussions on the implementation of error decoders. LLL acknowledges funding from the China Scholarship Council. CGA acknowledges support from the Intel Corporation.
Appendix A: Implementation of LUT and NN decoders
Based on the FT QEC procedure for distance-3 codes in Section III, decoding is only needed when two rounds of syndrome extraction (SE) are performed, that is, when the first round yields non-trivial syndromes or flags. If the first round yields only non-trivial syndromes (no flags), the decoders use only the measurement results of the second round. If any flag is raised in the first round, the decoders use these flags together with the measurement results of the second round. From the second-round measurements, the simple LUT decoder only considers the results of the syndrome qubits, which is enough to correct all errors caused by one fault. In contrast, the NN decoder also takes the second-round flags into account. This means the NN decoder could potentially correct some errors caused by more than one fault, outperforming the LUT decoder.
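The selection rule described above can be sketched as follows. This is a minimal illustration rather than the implementation used for the simulations: the function name `select_decoder_input`, the dictionary keys, and the switch `use_second_round_flags` (off for an LUT-style decoder, on for an NN-style decoder) are all hypothetical.

```python
from typing import Dict, Optional, Sequence, Tuple

Bits = Tuple[int, ...]


def select_decoder_input(
    syndromes_1: Sequence[int],
    flags_1: Sequence[int],
    syndromes_2: Optional[Sequence[int]] = None,
    flags_2: Optional[Sequence[int]] = None,
    use_second_round_flags: bool = False,
) -> Optional[Dict[str, Bits]]:
    """Return the measurement data a decoder should use, or None if no decoding is needed."""
    # A trivial first round means no second round is run and nothing is decoded.
    if not any(syndromes_1) and not any(flags_1):
        return None
    if syndromes_2 is None:
        raise ValueError("a second round of syndrome extraction is required")

    # Second-round syndromes are always used once decoding is triggered.
    data: Dict[str, Bits] = {"syndromes_2": tuple(syndromes_2)}

    # First-round flags are used only if at least one of them was raised.
    if any(flags_1):
        data["flags_1"] = tuple(flags_1)

    # Assumption: only an NN-style decoder also looks at second-round flags.
    if use_second_round_flags:
        data["flags_2"] = tuple(flags_2 or ())
    return data
```

For example, a first round with a non-trivial syndrome but no flag leads the LUT-style call to return only the second-round syndromes, whereas a raised first-round flag adds the `flags_1` entry.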
The LUT decoder: As mentioned previously, we use a brute-force search to check the fault tolerance of flag-bridge circuits. After this search, all the errors from one single fault and the corresponding syndrome-flag (SF) strings are collected. For FT flag-bridge circuits, these error-SF pairs can be directly used to design a LUT decoder. Two look-up tables need to be created. One is used for the case where only syndromes are observed in the first round of SE and has a size of $2^{m_s}$, where $m_s$ is the total number of syndrome qubits in the QEC circuit $C = \{c(g_1), c(g_2), \cdots, c(g_r)\}$. Note that if the same ancilla qubits are re-used in different $c(g_i)$, they are still considered as different syndrome qubits, and similarly for flag qubits. The other table decodes the case where flags are raised in the first round of SE and has a size of $\sum_i 2^{m_{f_i}} 2^{m_s}$, where $m_{f_i}$ is the total number of flag qubits in $c(g_i)$. The LUT decoder is designed to correct all single faults, but not to find the most likely two-fault event corresponding to the measured syndromes. The performance of different flag-bridge circuits for the Steane code using LUT decoders is shown in Figure 14. As can be seen, the QEC circuits that achieve more parallelism in stabiliser measurement have lower logical error rates.
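As an illustration of how the collected error-SF pairs could populate a look-up table, the sketch below stores one correction per SF string and computes the two table sizes. It rests on several assumptions that are not taken from the text: errors are written as Pauli strings such as "XII", a lower-weight error is preferred when two single-fault errors share an SF string, an unknown SF string maps to the identity, and the flag-table size is read as a sum over the check circuits. All function names are hypothetical.

```python
from typing import Dict, Iterable, Sequence, Tuple

Bits = Tuple[int, ...]


def pauli_weight(pauli: str) -> int:
    """Number of non-identity factors in a Pauli string such as 'XIZ'."""
    return sum(ch != "I" for ch in pauli)


def build_lut(pairs: Iterable[Tuple[str, Bits]]) -> Dict[Bits, str]:
    """Build a table SF string -> correction from (error, SF) pairs."""
    lut: Dict[Bits, str] = {}
    for error, sf in pairs:
        best = lut.get(sf)
        # Assumed tie-break: keep the lowest-weight error seen for this SF string.
        if best is None or pauli_weight(error) < pauli_weight(best):
            lut[sf] = error
    return lut


def decode(lut: Dict[Bits, str], sf: Bits, n_data: int) -> str:
    """Look up a correction; unknown SF strings fall back to the identity (assumption)."""
    return lut.get(sf, "I" * n_data)


def syndrome_table_size(m_s: int) -> int:
    """Entries in the table used when only syndromes are observed: 2^{m_s}."""
    return 2 ** m_s


def flag_table_size(m_f: Sequence[int], m_s: int) -> int:
    """Entries in the flag table, read as sum_i 2^{m_{f_i}} * 2^{m_s} (assumption)."""
    return sum(2 ** m for m in m_f) * 2 ** m_s
```

With hypothetical values of six syndrome qubits and one flag qubit per check circuit, `syndrome_table_size(6)` gives 64 entries and `flag_table_size([1] * 6, 6)` gives 768 entries.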
The NN decoder: Decoding can be seen as a classification problem, that is, given the observed syndromes, the decoder identifies the error, or the logical coset of the error, that has occurred. It has been shown that neural networks are versatile tools for decoding topological quantum error correction codes [32-35]. The inputs $x_i$ of a neural network decoder are the syndromes (and the flags for flag QEC). In this paper, two rounds of syndromes and flags are collected when using flag-bridge error correction for distance-3 codes. Therefore, the size of the input layer is $2 \times m$, where $m$ is the total number of syndrome and flag-bridge qubits. In this work, the outputs $y_i$ are the suggested physical errors that can result in the given syndromes and flags. For a CSS code with $n$ data qubits, the size of the output layer is set to $2 \times n$, which describes whether an X and/or a Z error has occurred on each data qubit. The neural network finds an approximate function $f : x \to y$ that describes the input-output relation from the set of training data $\{(x_i, y_i)\}$. Note that for large-distance codes it is more efficient to use logical errors as outputs, in which case a simple decoder (e.g., a LUT decoder) is required to generate the logical error information.
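The input and output encodings described above can be made concrete with a short sketch. The bit ordering is an assumption made here for illustration: the input concatenates the first-round and second-round measurement bits, and the output lists the $n$ X-error bits followed by the $n$ Z-error bits, with a Y error setting both. The function names are hypothetical.

```python
from typing import List, Sequence


def encode_input(round_1: Sequence[int], round_2: Sequence[int]) -> List[int]:
    """Concatenate two rounds of m syndrome/flag bits into a 2*m input vector."""
    if len(round_1) != len(round_2):
        raise ValueError("both rounds must contain the same number of bits")
    return list(round_1) + list(round_2)


def encode_error(pauli: str) -> List[int]:
    """Encode an n-qubit Pauli string as 2*n bits: X block first, then Z block."""
    x_bits = [1 if ch in "XY" else 0 for ch in pauli]
    z_bits = [1 if ch in "ZY" else 0 for ch in pauli]
    return x_bits + z_bits


def decode_output(bits: Sequence[int]) -> str:
    """Invert encode_error: turn 2*n bits back into a Pauli string."""
    n = len(bits) // 2
    table = {(0, 0): "I", (1, 0): "X", (0, 1): "Z", (1, 1): "Y"}
    return "".join(table[(bits[q], bits[n + q])] for q in range(n))
```

For a code with seven data qubits, `encode_error("XIZYIII")` yields a vector of length 14, and `decode_output` recovers the original Pauli string from it.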
In this work, a simple NN decoder is developed using the TensorFlow library [40] to analyze the fault tolerance of different flag-bridge circuits. We use the 'sigmoid' activation function for the output layer, and $10^5$ syndrome-error pairs are sampled at a physical error rate (PER) of around 0.01 for each training. More details of the designed NN decoder are given in Table II. Since the focus of this work is to evaluate flag-bridge quantum error correction, we leave the performance and speed optimization of NN decoders for future work.
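To illustrate why a sigmoid output layer suits this multi-label task, the sketch below evaluates one dense output layer in plain Python and thresholds each output independently, so that several error bits can be set at once. It is not the TensorFlow model of Table II: the weights, the hidden-layer values, and the threshold of 0.5 are hypothetical, and the function names are chosen here for illustration.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    """Logistic function mapping a real number to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def output_layer(
    hidden: Sequence[float],
    weights: Sequence[Sequence[float]],
    biases: Sequence[float],
) -> List[float]:
    """Dense layer with sigmoid activation: one independent probability per output bit."""
    return [
        sigmoid(sum(w * h for w, h in zip(row, hidden)) + b)
        for row, b in zip(weights, biases)
    ]


def to_error_bits(probs: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Threshold each probability separately (assumed threshold 0.5)."""
    return [1 if p >= threshold else 0 for p in probs]
```

Because each output is thresholded on its own, the predicted vector can flag an X error and a Z error on the same or on different data qubits, which a softmax over mutually exclusive classes would not allow without enumerating all error combinations.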
