Performance and Error Analysis of Knill's Postselection Scheme in a
  Two-Dimensional Architecture by Lai, Ching-Yi et al.
PERFORMANCE AND ERROR ANALYSIS OF KNILL’S
POSTSELECTION SCHEME IN A TWO-DIMENSIONAL ARCHITECTURE
Ching-Yi Lai
Department of Electrical Engineering, University of Southern California
Los Angeles, CA 90089 U.S.A.
Gerardo Paz
Department of Physics and Astronomy, University of Southern California
Los Angeles, CA 90089 U.S.A.
Martin Suchara
Computer Science Division, University of California at Berkeley
Berkeley, CA 94720 U.S.A.
Todd A. Brun
Department of Electrical Engineering, University of Southern California
Los Angeles, CA 90089 U.S.A.
Knill demonstrated a fault-tolerant quantum computation scheme based on concatenated
error-detecting codes and postselection with a simulated error threshold of 3% over the
depolarizing channel. We show how to use Knill’s postselection scheme in a practical
two-dimensional quantum architecture that we designed with the goal of optimizing the
error correction properties while satisfying important architectural constraints. In our
2D architecture, one logical qubit is embedded in a tile consisting of 5×5 physical qubits.
The movement of these qubits is modeled as noisy SWAP gates, and the only physical
operations allowed are local one- and two-qubit gates. We evaluate the practical
properties of our design, such as its error threshold, and compare it to the concatenated
Bacon-Shor code and the concatenated Steane code. Assuming that all gates have the
same error rates, we obtain a threshold of $3.06 \times 10^{-4}$ in a local adversarial stochastic
noise model, which is the highest known error threshold for concatenated codes in 2D.
We also present a Monte Carlo simulation of the 2D architecture with depolarizing noise
and we calculate a pseudo-threshold of about 0.1%. With memory error rates one-tenth
of the worst gate error rates, the threshold for the adversarial noise model and the
pseudo-threshold over depolarizing noise are $4.06 \times 10^{-4}$ and 0.2%, respectively. In a
hypothetical technology where memory error rates are negligible, these thresholds can
be further increased by shrinking the tiles into a 4×4 layout.
Keywords: fault-tolerant quantum computation, quantum error correction
arXiv:1305.5657v2 [quant-ph] 31 May 2013
1 Introduction
Quantum error correction [1–10] is necessary to build reliable quantum computers using
unreliable components. Quantum computation can be performed with arbitrary accuracy as
long as the error rates of physical gates are below a threshold [2]. The error thresholds for
several schemes have been estimated, and they range from $O(10^{-5})$ to as high as 3% [7,8,11–16].
Many of these analyses of error thresholds make simplifying assumptions, such as allowing
interactions between any two qubits, that are not possible in real physical architectures. Svore,
DiVincenzo, and Terhal designed a two-dimensional qubit layout for quantum computation
[17], using the concatenated Steane code [18]. They assumed that two-qubit gates can be
applied only to adjacent qubits, and that qubit movement is done by SWAP gates. Under
these assumptions, they showed that the error threshold of the Steane code is $1.85 \times 10^{-5}$,
roughly a factor of two lower than without the locality constraints. A similar analysis of the
concatenated Bacon-Shor code [19,20] was carried out by Spedalieri and Roychowdhury [21]. The
error threshold reported in [21] is also $O(10^{-5})$.
Knill demonstrated a fault-tolerant quantum computation scheme based on concatenated
error-detecting codes (C4 and C6) and postselection with a simulated error threshold of 3%
over the depolarizing channel. Stephens and Evans analyzed a fault-tolerant quantum
computation scheme based on the concatenated error-detecting code C4 with locality constraints
in one dimension, and they reported a threshold of $O(10^{-5})$ [22]. In this paper we
demonstrate that a two-dimensional layout of the error-detecting code has a significantly better
threshold. To this end, we design the optimal qubit movements required to perform quantum
computation in a two-dimensional architecture for the concatenated error-detecting code C4
with postselection [14,23], which has the highest known error threshold without locality con-
straints. We embed one logical qubit in a 5 × 5 qubit tile layout. Our tile has a recursive
structure, meaning that each qubit is embedded in a 5×5 tile consisting of lower-level qubits.
As in [17, 21], we assume that two-qubit gates can only be performed locally on adjacent
qubits, and that additional SWAP gates are needed to move the qubits that are far apart.
Each tile contains not only the physical qubits required to maintain the state of a single log-
ical qubit, but also dummy qubits to aid qubit movement by SWAP gates and ancilla qubit
preparation for error detection. In this paper we demonstrate only the tile operations of the
error detection block; tile operations for the other gates are available online at
http://mizar.usc.edu/~tbrun/Data/KnillTileOps/
We use both analytical and simulation methods to estimate the error threshold. The
analytical method counts malignant pairs in the extended rectangle of the CNOT gate in a
local adversarial stochastic noise model [23]. In a local adversarial stochastic noise model,
arbitrary Pauli errors can be chosen to attack a given set of gates and we may consider the
error threshold obtained from this model to be the lower bound on the threshold for a more
realistic error model. We calculate the thresholds for different ratios of memory error rate
to the worst gate error rate. Assuming that all gates have the same error rates, we obtain
a threshold of $3.06 \times 10^{-4}$ in a local adversarial stochastic noise model, which is the highest
known error threshold for concatenated codes in 2D.
Our second method estimates the threshold by a Monte Carlo simulation of the 2D
architecture with depolarizing noise. We calculate a pseudo-threshold of about 0.1%. As expected,
the pseudo-thresholds are generally higher than the thresholds obtained in adversarial noise
models. By setting the memory error rate to be one-tenth of the worst gate error rate, the
error threshold with the adversarial noise model is $4.06 \times 10^{-4}$, while the pseudo-threshold
with depolarizing noise is about 0.2%.
This paper is organized as follows. In the next section, we describe basic properties of
the Knill postselection scheme, and the circuits used to obtain a universal gate set, using
an ancilla factory model. In the ancilla factory model, ancillary quantum states are distilled
so that the phase and the π/8 gates can be executed. In Section 3, we describe the 5 × 5
two-dimensional qubit tile, and also give the recursive relations of each gate operation in
terms of the lower-level gates. We establish the error threshold by the method of counting
the number of malignant pairs in Section 4. Simulated pseudo-thresholds are also calculated.
We conclude in Section 5, including an estimate of the error thresholds of a 4 × 4 tiled qubit
layout, obtained by shrinking the original 5 × 5 tile. This tile outperforms the 5 × 5 tile when
memory errors are negligible.
2 Basics of the Knill C4/C6 Scheme with Postselection
In his original scheme, Knill concatenated two alternating error-detecting codes, C4 and C6.
We follow the simpler version, using only the C4 code as in [23], which has a high error
threshold. In addition, we concatenate M levels of the quantum error-detecting code C4 with
a quantum error-correcting code $C_{ec}$ at the top level. We use the notation $C_4^m$ to denote the
level-m encoding of the C4 code and the notation U(m) to denote the gate operation U of
$C_4^m$. We use the notation $|\bar{v}\rangle$ to denote the state $|v\rangle$ at a higher level of encoding.
The quantum error-detecting code C4 belongs to the class of stabilizer codes [5,24] and can
be defined by the stabilizer group with two generators XXXX and ZZZZ, where
$$X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \quad \text{and} \quad Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$
are Pauli matrices. The matrix representation of a single-qubit operator
is shown in the computational basis {|0〉, |1〉}. This code encodes two logical qubits in four
physical qubits and can simultaneously detect any single-qubit bit-flip error X and any
single-qubit phase-flip error Z. However, in Knill’s scheme, we use only one of the logical qubits and
treat the other as a spectator qubit. The logical operators are $X^L = XXII$, $Z^L = ZIZI$,
$X^S = IXIX$, and $Z^S = IIZZ$, where the superscripts L and S are labels for the logical and
the spectator qubits, respectively.
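The commutation structure just described can be checked mechanically. The following is a minimal sketch, not part of the original analysis: Pauli operators are written as four-character strings, and the helper names `commute`, `check_c4`, and `detected` are hypothetical names introduced here for illustration. Overall phases are ignored throughout.

```python
# Pauli strings on four qubits, written left to right as in the text.
STABILIZERS = ["XXXX", "ZZZZ"]
LOGICALS = {"XL": "XXII", "ZL": "ZIZI", "XS": "IXIX", "ZS": "IIZZ"}


def commute(p: str, q: str) -> bool:
    """Return True if Pauli strings p and q commute (phases ignored)."""
    # Two Pauli strings commute iff they differ on an even number of
    # positions where both are non-identity.
    clashes = sum(1 for a, b in zip(p, q) if a != "I" and b != "I" and a != b)
    return clashes % 2 == 0


def check_c4() -> bool:
    """Check the stabilizer/logical-operator relations stated for C4."""
    ops = LOGICALS
    # Every logical operator must commute with both stabilizer generators.
    ok = all(commute(s, l) for s in STABILIZERS for l in ops.values())
    ok = ok and commute(*STABILIZERS)
    # X and Z of the same encoded qubit anticommute ...
    ok = ok and not commute(ops["XL"], ops["ZL"])
    ok = ok and not commute(ops["XS"], ops["ZS"])
    # ... while operators of different encoded qubits commute.
    ok = ok and commute(ops["XL"], ops["ZS"]) and commute(ops["XS"], ops["ZL"])
    ok = ok and commute(ops["XL"], ops["XS"]) and commute(ops["ZL"], ops["ZS"])
    return ok


def detected(error: str) -> bool:
    """An error is detected iff it anticommutes with a stabilizer generator."""
    return any(not commute(error, s) for s in STABILIZERS)


if __name__ == "__main__":
    print(check_c4())
    print(all(detected(e) for e in ["XIII", "IZII", "IIYI"]))
```

Under these conventions every single-qubit Pauli error is detected, whereas a weight-two operator such as XXII (the logical X itself) passes the check undetected, which is why pairs of faults dominate the logical error rate in the later analysis.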
The top-level quantum error-correcting code Cec can be the Steane code [18] or the Bacon-
Shor code [19, 20]. We use the tiled qubit architecture of these two codes studied in [21, 25]
on top of the tiled qubit architecture of the $C_4^M$ code developed in the next section. We can
use the Steane or Shor error correction method, or we can use Knill’s syndrome extraction in
Fig. 1 at the top level of concatenation. This choice does not affect the error threshold of the
scheme. In Knill’s syndrome extraction, if an error is detected at any error detection step at
any level of concatenation, the preparation of the logical EPR pair
$|\bar{\Phi}^+\rangle = (|\bar{0}\bar{0}\rangle + |\bar{1}\bar{1}\rangle)/\sqrt{2}$ should be
restarted.
Now we describe the basic fault-tolerant logical circuits of the C4 code. We use an ancilla
factory model to prepare the high-quality ancillas required to execute the phase gate
$S = \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix}$ and the π/8 gate
$T = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\pi/4} \end{pmatrix}$. These gates are then performed by a logical
teleportation circuit.
[Figure: circuit diagram. The input |Q〉 is teleported through the EPR pair |Φ+〉 prepared on the ancillas |A〉 (via P|+〉) and |B〉 (via P|0〉); the output |Q〉 appears on |B〉 after X and Z corrections.]
Fig. 1. Knill syndrome extraction.
The logical states $|\bar{0}\rangle = |\bar{0}\rangle_L |\bar{+}\rangle_S$ and $|\bar{+}\rangle = |\bar{+}\rangle_L |\bar{0}\rangle_S$ can be fault-tolerantly prepared by
choosing appropriate spectator qubits as in Fig. 2, where $P_{|\bar{0}\rangle}$ and $P_{|\bar{+}\rangle}$ denote the preparation
circuits of the logical qubit $|\bar{0}\rangle$ and $|\bar{+}\rangle$, respectively.
To perform fault-tolerant error detection (ED) of C4, the two circuits in Fig. 3 are used
depending on the state of the spectator qubit: we choose ED0 or ED+ when the spectator
qubit is $|\bar{+}\rangle_S$ or $|\bar{0}\rangle_S$, respectively. This is because the state of the spectator qubit alternates
between $|\bar{+}\rangle_S$ and $|\bar{0}\rangle_S$ after each error detection block. As discussed in [14,23], the ED0 gate
is better suited for detecting Z errors, while the ED+ gate is better suited for detecting X
errors.
If the parity of the X or Z measurement outcomes in ED0 and ED+ is nonzero, meaning
that errors are detected, the ancilla qubits are discarded and the circuit restarts. If
no errors are detected, the measurement outcomes of the first two code blocks
determine the logical Pauli operators to be applied to the second ancilla block to complete
the quantum teleportation. These operations are represented by the decision block in Fig. 3.
Each single-qubit gate other than measurements is followed by an ED routine, and the
two-qubit CNOT gate is followed by an ED on each of the two qubits. As a general rule we
shall assume the presence of the input and output error detection routines before and after
every logical gate, and this should be understood for every circuit shown. Measurements have
quantum ED routines at the input, but classical ED routines at the output, while ancilla
preparations typically have only quantum ED routines at the output. The combination of a
gate and its following ED(s) is called a rectangle (1-Rec).
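[Figure: preparation circuits. P|0〉 takes the physical inputs |+〉, |+〉, |0〉, |0〉 to the state |0〉L |+〉S; P|+〉 takes the physical inputs |+〉, |0〉, |+〉, |0〉 to the state |+〉L |0〉S. In each circuit the CNOT gates are controlled on the |+〉 qubits.]
Fig. 2. State preparation.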
The logical controlled-NOT (CNOT) gates between different code blocks of C4 can be
done transversally by applying bitwise CNOT gates. Swapping qubits 2 and 3 implements
the SWAP gate between the logical qubit and the spectator qubit, and we call this an inner SWAP
gate. The logical Hadamard gate $H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$ is implemented by transversally applying
the Hadamard gates, followed by an inner SWAP gate. The inner SWAP gate does not need
to be applied; instead, we switch the labels of the qubits and keep track of them. We assume
this can be done efficiently.
[Figure: the ED0 circuit (top) and the ED+ circuit (bottom). Each teleports the data block |Ψ〉L through two ancilla blocks prepared by P|+〉 and P|0〉, followed by a decision block; the output is |Ψ〉L |0〉S for ED0 and |Ψ〉L |+〉S for ED+.]
Fig. 3. Circuits for fault-tolerant quantum error detection.
Top: ED0. Bottom: ED+.
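To see why the inner SWAP is needed for the logical Hadamard gate, one can track the logical operators of C4 through the two steps. The sketch below is illustrative only: the function names are hypothetical, Pauli operators are four-character strings, and overall phases are ignored. It conjugates a Pauli string by H on every qubit (exchanging X and Z), then exchanges qubits 2 and 3.

```python
def transversal_h(pauli: str) -> str:
    """Conjugate a Pauli string by H on every qubit (X <-> Z, Y kept up to sign)."""
    return pauli.translate(str.maketrans("XZ", "ZX"))


def inner_swap(pauli: str) -> str:
    """Exchange the operators on qubits 2 and 3 (1-indexed, as in the text)."""
    q = list(pauli)
    q[1], q[2] = q[2], q[1]
    return "".join(q)


def multiply(p: str, q: str) -> str:
    """Multiply two Pauli strings qubit by qubit, ignoring overall phases."""
    table = {frozenset("XZ"): "Y", frozenset("XY"): "Z", frozenset("YZ"): "X"}
    out = []
    for a, b in zip(p, q):
        if a == b:
            out.append("I")
        elif a == "I" or b == "I":
            out.append(b if a == "I" else a)
        else:
            out.append(table[frozenset((a, b))])
    return "".join(out)


def equivalent(p: str, q: str) -> bool:
    """True if p and q agree up to a stabilizer-group element (and phase)."""
    group = ["IIII", "XXXX", "ZZZZ", "YYYY"]  # YYYY = XXXX * ZZZZ up to phase
    return any(multiply(p, s) == q for s in group)


def logical_h(pauli: str) -> str:
    """Transversal Hadamard followed by the inner SWAP."""
    return inner_swap(transversal_h(pauli))


if __name__ == "__main__":
    print(logical_h("XXII"), logical_h("ZIZI"))
    print(equivalent(transversal_h("XXII"), "IIZZ"))
```

In this bookkeeping, transversal Hadamards alone send XXII to ZZII, which equals the spectator operator IIZZ up to the stabilizer ZZZZ; only after exchanging qubits 2 and 3 does XXII map to ZIZI, the logical Z of the logical qubit.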
To enable universal quantum computation, it remains to prepare the level-M ancilla state
$|\overline{+i}\rangle = \frac{1}{\sqrt{2}}\left(|\bar{0}\rangle + i|\bar{1}\rangle\right)$, which is the +1 eigenstate of Y = iXZ at level M, and the level-M
magic state $T|\bar{+}\rangle$. The phase gate S and the π/8 gate T can be implemented with the help
of the ancilla states $|\overline{+i}\rangle$ and $T|\bar{+}\rangle$ as shown in Fig. 4 and Fig. 5, respectively.
The logical state $|\overline{+i}\rangle$ can be non-fault-tolerantly prepared by the circuit in Fig. 6. To
prepare the physical state |+i〉 = SH |0〉 at level 0, we sequentially apply the faulty gates H
and S on a physical qubit |0〉. After several iterations of distillation, we obtain a |+i〉 with
high fidelity. The decoding gate D is shown in Fig. 7. The output state $|\overline{+i}\rangle$ can be distilled
to one with higher fidelity by the circuit in Fig. 8, where the twirl operation is shown in Fig.
9. The state $|\overline{+i}\rangle$ at level M can be prepared by recursively applying the circuit in Fig. 6 or
by using a level-M to level-0 decoder D in the teleportation at level M. A level-M to level-0
decoder can be implemented by recursively applying the decoding gate D at each level.
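[Figure: circuit acting on the data state |Ψ〉 and the ancilla |+i〉; the outputs are S |Ψ〉 and |+i〉.]
Fig. 4. The circuit for implementing the logical S gate.
[Figure: circuit acting on the data block |Ψ〉 and the ancilla T |+〉, with an S gate and an EC step; the output is T |ψ〉.]
Fig. 5. The circuit for implementing the logical T gate.
[Figure: teleportation circuit taking the input |+i〉 through ancilla blocks prepared by P|+〉 and P|0〉, with a decoding gate D; the output is the logical |+i〉.]
Fig. 6. The circuit for preparing the logical state $|\overline{+i}\rangle$.
[Figure: two decoding circuits on qubits |q1〉, ..., |q4〉, one for the input |Ψ〉L |0〉S and one for the input |Ψ〉L |+〉S; in both, the output |Ψ〉 appears on |q1〉.]
Fig. 7. The decoding circuit for C4.
[Figure: circuit taking two twirled copies of |+i〉 as input and producing one |+i〉 as output.]
Fig. 8. The distillation circuit for the state |+i〉.
[Figure: circuit acting on an ancilla |+〉 and the state |+i〉, with a Y operation on |+i〉.]
Fig. 9. The twirl operation for the state |+i〉.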
The realization of a fault-tolerant T gate is shown in Fig. 5. This gate sequence was
originally constructed in [26] using one-bit teleportation. The gate sequence teleports the
state |ψ〉 from the data block to the ancilla and applies the T gate to the state. The ancilla
state $T|\bar{+}\rangle$ is prepared using the state injection method described before, as in Fig. 6, followed
by several rounds of distillation. The distillation and twirl procedures for $T|\bar{+}\rangle$ are more
involved and are described in [27].
3 The Two-Dimensional Qubit Layout of the C4 Code
We now describe the two-dimensional qubit layout for the C4 code and estimate the number
of each physical gate operation required for each logical operation. For examples of
resource estimates for the Knill scheme based on the concatenated C4 code, we refer
interested readers to our technical report [28]. We assume that two-qubit interactions are
available only between nearest neighbors. That is, we apply horizontal or vertical CNOT
gates (hCNOT/vCNOT) only to two neighboring qubits on the same horizontal or vertical
line. Similarly, we assume movements of the qubits are accomplished by SWAP gates in two
directions: horizontal and vertical SWAP gates (hSWAP/vSWAP).
Following the tile structures presented in [21,25], we design a two-dimensional 5×5 lattice
architecture of physical qubits to represent a logical qubit of the C4 code. A tile is initialized
as one of the following two structures:
Structure I:

    O    O    O    O    O
    O    d1   O    O    d3
    O    O    O    O    O
    O    O    O    O    O
    O    d2   O    O    d4

Structure II:

    O    O    O    O    O
    O    O    O    O    O
    O    O    d1   d3   O
    O    O    d2   d4   O
    O    O    O    O    O

The four data qubits of the C4 code are denoted by d1, d2, d3, and d4. The O’s are
dummy qubits used for ancilla preparation or for swapping with data or ancilla qubits in
communication, and their states are irrelevant to computation. Each qubit in the tile is
encoded in a lower-level tile structure.
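Because every qubit of a tile is itself a lower-level tile, the footprint of one logical qubit grows geometrically with the concatenation level. The following is a small worked sketch under the plain-recursion reading of the text (every one of the 25 positions, dummies included, expands into a full lower-level tile); the function name `tile_footprint` and its parameters are hypothetical.

```python
def tile_footprint(level: int, side: int = 5, data_per_tile: int = 4) -> dict:
    """Physical-qubit footprint of one level-`level` tile under plain recursion."""
    if level < 0:
        raise ValueError("level must be non-negative")
    return {
        "side_length": side ** level,               # physical qubits along one edge
        "physical_qubits": (side * side) ** level,  # all positions, dummies included
        "data_qubits": data_per_tile ** level,      # positions carrying encoded data
    }


if __name__ == "__main__":
    for m in range(4):
        print(m, tile_footprint(m))
```

For example, under this assumption a level-2 tile spans 25 × 25 = 625 physical positions, of which 16 carry encoded data at the lowest level.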
The following operations are performed on the C4 code:
1. Error detection (ED).
2. Horizontal and vertical CNOT gates (hCNOT/vCNOT).
3. Horizontal and vertical SWAP gates (hSWAP/vSWAP).
4. Measurement in the X basis or the Z basis (MX and MZ).
5. The Pauli operators X, Y , Z, and the Hadamard gate (H).
6. Preparation of the ancilla qubits |+〉 or |0〉 (P|+〉 and P|0〉).
7. The phase gate S and the pi/8 gate T .
For simplicity, all lower-level gates are assumed to take one time step, which is the longest
execution time among all gates. In reality, we may think that a qubit idles for the rest of the
time step after it completes a fast gate, and the error rate of this operation is the physical gate
error rate plus the memory error rate for the idle time. To achieve favorable error-correction
properties and low overhead, our design attempts to minimize the number of SWAPs, idle
qubits, and the total number of time steps.
Since error detection is performed constantly, extra space is needed for preparation of the
logical EPR pairs used in Knill’s syndrome extraction. In structure I, the data qubits lie
on the “corners” and the logical EPR pairs for error detection are prepared inside the data
qubits. By contrast, the data qubits are located at the “center” of structure II and the ancilla
qubits surround the data qubits, as will be shown in the following. Error detection is designed
in each of the two tiles precisely so that the data qubits are transferred between the “center”
and the “corners.” Therefore, the tile alternates between structures I and II after each error
detection. This alternation avoids extra SWAP gates.
For a SWAP operation to be fault-tolerant, we only swap a data or ancilla qubit with a
dummy qubit. Because of this, only one tile requires an ED circuit. The topmost row and
leftmost column of each tile are reserved for transportation of the lower-level qubits. This
allows realization of the horizontal or vertical CNOT gate without affecting the EDs.
Fig. 3 demonstrates the ED+ block for structure I. Due to space constraints, we have
made the other tile operations available online. In the following, “a → b” means applying a
CNOT gate with a being the control qubit and b being the target qubit.
Time step 1:

    O    O            O            O            O
    O    d1           O            O            d3
    O    P|+〉(a1)    P|+〉(a5)    P|0〉(a7)    P|+〉(a3)
    O    P|0〉(a2)    P|+〉(a6)    P|0〉(a8)    P|0〉(a4)
    O    d2           O            O            d4

Time step 2:

    O    O    O         O    O
    O    d1   O         O    d3
    O    a1   a5   →    a7   a3
         ↓                   ↓
    O    a2   a6   →    a8   a4
    O    d2   O         O    d4

Time step 3:

    O    O         O    O         O
    O    d1        O    O         d3
    O    a1   →    a5   a7   ←    a3
    O    a2   →    a6   a8   ←    a4
    O    d2        O    O         d4

Time step 4:

    O    O    O    O    O
    O    d1   O    O    d3
         ↓              ↓
    O    a1   a5   a7   a3
    O    a2   a6   a8   a4
         ↑              ↑
    O    d2   O    O    d4

Time step 5:

    O    O         O    O    O
    O    MX(d1)    O    O    MX(d3)
    O    MZ(a1)    a5   a7   MZ(a3)
    O    MZ(a2)    a6   a8   MZ(a4)
    O    MX(d2)    O    O    MX(d4)

At the end of time step 5:

    O    O    O    O    O
    O    O    O    O    O
    O    O    d1   d3   O
    O    O    d2   d4   O
    O    O    O    O    O

The ancilla qubits a1, · · · , a8 are prepared at time step 1, and logical EPR pairs are made
at time steps 2 and 3. Quantum teleportations are completed in the subsequent time steps.
We choose the index such that a quantum teleportation occurs on the qubits di, ai, ai+4 for
i = 1, 2, 3, 4. Observe that the data qubits d1, d2, d3, d4 are transferred to the center after
teleportation and no SWAPs are needed here. However, the error detection ED+ for structure
II needs two SWAPs and it takes one more step. Its first time step is initialized as follows:

    O    O            O            O            O
    O    O            P|+〉(a1)    P|0〉(a3)    O
    O    P|+〉(a5)    d1           d3           P|+〉(a7)
    O    P|0〉(a6)    d2           d4           P|0〉(a8)
    O    O            P|+〉(a2)    P|0〉(a4)    O

In addition, applying the logical Pauli operators X or Z to complete the teleportation may
take one or two more steps, but this is not shown. In many cases it suffices to track these
Pauli operators without correcting them. The ED0 for structure I at time step 1 is as follows
and the rest of the steps are similar to those of the above ED+:

    O    O     O            O            O
    O    d1    P|+〉(a1)    P|0〉(a3)    d3
    O    O     P|+〉(a5)    P|+〉(a7)    O
    O    O     P|0〉(a6)    P|0〉(a8)    O
    O    d2    P|+〉(a2)    P|0〉(a4)    d4

Note that the operations and required time of ED0 are the same as those of ED+.
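The locality constraint on the ED+ block for structure I can be checked directly from the time-step diagrams. The sketch below is an illustration, not the authors' tooling: it assumes the vertical arrows in time steps 2 and 4 denote the CNOTs a1 → a2, a3 → a4 and di → ai, which is the reading consistent with preparing P|+〉 on a1, ..., a4 and with teleportation on di, ai, ai+4. The names `POS`, `SCHEDULE`, `orientation`, and `count_cnots` are hypothetical.

```python
from collections import Counter

# (row, column) positions in the 5x5 tile of structure I, zero-indexed.
POS = {
    "d1": (1, 1), "d3": (1, 4),
    "a1": (2, 1), "a5": (2, 2), "a7": (2, 3), "a3": (2, 4),
    "a2": (3, 1), "a6": (3, 2), "a8": (3, 3), "a4": (3, 4),
    "d2": (4, 1), "d4": (4, 4),
}

# CNOTs (control, target) read off time steps 2-4 of the ED+ block.
SCHEDULE = {
    2: [("a5", "a7"), ("a6", "a8"), ("a1", "a2"), ("a3", "a4")],
    3: [("a1", "a5"), ("a3", "a7"), ("a2", "a6"), ("a4", "a8")],
    4: [("d1", "a1"), ("d3", "a3"), ("d2", "a2"), ("d4", "a4")],
}


def orientation(control: str, target: str) -> str:
    """Classify a CNOT as 'h' or 'v'; raise if the qubits are not neighbors."""
    (r1, c1), (r2, c2) = POS[control], POS[target]
    if r1 == r2 and abs(c1 - c2) == 1:
        return "h"
    if c1 == c2 and abs(r1 - r2) == 1:
        return "v"
    raise ValueError(f"{control} and {target} are not nearest neighbors")


def count_cnots(schedule: dict) -> Counter:
    """Count horizontal and vertical CNOTs, checking locality and parallelism."""
    counts = Counter()
    for step, gates in schedule.items():
        used = [q for gate in gates for q in gate]
        # No qubit may take part in two gates during the same time step.
        if len(used) != len(set(used)):
            raise ValueError(f"qubit reused in time step {step}")
        counts.update(orientation(c, t) for c, t in gates)
    return counts


if __name__ == "__main__":
    print(count_cnots(SCHEDULE))
```

Under this reading every CNOT acts on nearest neighbors, no qubit is used twice within a time step, and the block contains six horizontal and six vertical CNOTs, which appears to agree with the leading entries of 6 in the vCNOT(0) and hCNOT(0) rows of Table 1.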
Remark: after a logical Hadamard gate, the labels of data qubits 2 and 3 are switched. This
can be fixed by applying appropriate SWAPs and it takes two more time steps in structures I
or II. However, we don’t adjust it until a CNOT gate acts on two tiles with different labels.
Based on the tiled operations, we build the recursive relations resulting from the
concatenated code structure in order to quantify the total number of gates and total time required
for each logical gate. The recursive relations of a 1-Rec for each logical gate in terms of
lower-level gates are listed in Table 1. For example, the vertical SWAP operation vSWAP
requires 20 vertical SWAP operations at the next lower concatenation level, followed by the
error detection operation ED.
ED(1) vCNOT(1) hCNOT(1) vSWAP(1) hSWAP(1) P|+〉(1) P|0〉(1)
ED(1) 2 2 1 1 1 1
vCNOT(0) 6 4 2
hCNOT(0) 6 4 2
vSWAP(0) * 40 8 20 4
hSWAP(0) * 8 40 20 4
P|+〉(0) 4 2 2
P|0〉(0) 4 2 2
MZ(0) 4
MX(0) 4
Z(0)
X(0)
H(0)
MZ(1) MX(1) Z(1) X(1) H(1) S(1) T(1)
ED(1) 1 1 1 1 2
vCNOT(0)
hCNOT(0) 8 12
vSWAP(0) 20 40
hSWAP(0) 4 12
P|+〉(0)
P|0〉(0)
MZ(0) 4
MX(0) 4 4
Z(0) 2
X(0) 2
H(0) 4 8 8
(* The number of SWAPs in the ED is zero in structure I but 4 in structure II.)
Table 1. The numbers of the quantum operations contained in each higher-level quantum operation
of the C4 code and its following error detection (1-Rec). Each entry represents the number of
the elementary gate (U(0)) corresponding to that row contained in the higher-level gate (U(1))
corresponding to that column.
To allow universal quantum computation, we implement the S and T gates by the ancilla
factory method, which uses the decoding circuits in Fig. 7. The overhead of the ancilla
factories and their decoding circuits are not included in Table 1.
Remark: It is possible to combine a gate operation with the following error detection and
save several time steps.
4 Error Analysis of the 2-Dimensional Knill Postselection Scheme
4.1 Error Threshold
Here we estimate the error threshold of Knill’s postselection scheme in the 2-dimensional
tile for the local stochastic, adversarial noise model. Following the procedure presented in
[17,21,23], we count the number of malignant pairs of locations in the 1-exRec of the CNOT
gate.
The 1-exRec of the CNOT gate includes the bitwise CNOTs on two logical qubits, together
with two following and two preceding error detection blocks. As shown in Fig. 10, we assume
the preceding EDs are ED+s and the following EDs are ED0s. Note that the logical Pauli
operators to complete teleportation are assumed to be error-free, since they can be tracked
in the Pauli frame and hence be deferred until a non-Clifford gate occurs. We also assume
that classical computations are perfect, and that any quantum operations depending on the
classical results can be applied without delay.
[Figure: two logical qubits, each passing through an ED+ block, the CNOT gate, and an ED0 block.]
Fig. 10. The 1-exRec of the CNOT gate.
There are seven types of locations in the 1-exRec of the CNOT gate: (1) P|+〉; (2) P|0〉;
(3) MX; (4) MZ; (5) hSWAP/vSWAP; (6) hCNOT/vCNOT; (7) idle qubits. A set of locations
is called malignant if errors happening in these locations could make the calculation of the
rectangle incorrect. Since an error at any single location can be detected, errors at two
locations dominate the source of logical errors. To determine whether a pair of locations is
malignant, we check whether there is any logical error in the output of two perfect ED+s
following the 1-exRec of CNOT, as shown in Fig. 11. The simulation procedure in the
stabilizer formalism proposed in [29] can be used to track the logical operators through the
circuit in Fig. 11.
[Figure: the circuit of Fig. 10 with an ideal ED+ block appended to each of the two logical qubits.]
Fig. 11. The 1-exRec of the CNOT gate followed by two perfect ED+.
Remark: in general the error rate of a SWAP gate is higher than that of a CNOT gate, since
a SWAP is implemented by a series of gate operations, such as three CNOT gates. However, in the
two-dimensional tile we only swap a data or ancilla qubit with a dummy qubit, and the cost
of such a SWAP gate is less than the cost of a CNOT gate. As for the S and T gates, Aliferis,
Gottesman, and Preskill showed that the distillation method for ancilla preparations has a
higher threshold than the code itself [23].
To maximize the error threshold, we optimized the tile operations of the extended rectangle
of the CNOT gate. The animations showing this are available online. There are 196 locations
in the extended rectangle of the CNOT gate: 32 idle qubits and 164 gates, of which 38 gates
are SWAPs. We assume that the error detection blocks begin before the time step that the
data qubits come in, and thus there are no idle qubits at time steps 1, 2, and 3 in the preceding
ED. We find that the numbers of malignant pairs of locations of each kind are given by
$$\alpha = \begin{pmatrix}
4 & 8 & 8 & 0 & 0 & 32 & 16 \\
  & 0 & 0 & 14 & 96 & 80 & 32 \\
  &   & 16 & 0 & 96 & 104 & 32 \\
  &   &   & 16 & 96 & 112 & 32 \\
  &   &   &   & 442 & 672 & 268 \\
  &   &   &   &   & 322 & 288 \\
  &   &   &   &   &   & 106
\end{pmatrix},$$
where $\alpha_{i,j}$ represents the number of malignant pairs at locations of types i and j.

    scheme          Steane code              Bacon-Shor code          Knill’s postselection scheme
    nonlocal        $2.73 \times 10^{-5}$    $1.94 \times 10^{-4}$    $1.04 \times 10^{-3}$
    2D (γ = 1.0)    $1.1 \times 10^{-5}$     $1.3 \times 10^{-5}$     $3.06 \times 10^{-4}$
    2D (γ = 0.1)    $1.85 \times 10^{-5}$    $2.02 \times 10^{-5}$    $4.06 \times 10^{-4}$

Table 2. Comparison of the error thresholds of three concatenated codes.
Let $\epsilon_j^{(m)}$ be the error rate of type j at level m. For error correction to be effective, we
require
$$\epsilon_6^{(m+1)} = \sum_{i \le j} \alpha_{i,j}\, \epsilon_i^{(m)} \epsilon_j^{(m)} + O\!\left(\left(\epsilon_{\max}^{(m)}\right)^{3}\right) \le \epsilon_6^{(m)}, \qquad (1)$$
where $\alpha_{i,j}$ is the number of malignant pairs of types i and j and $\epsilon_{\max}^{(m)}$ is the maximum of the
seven types of error rate.
We assume all errors of weight 3 or larger are malignant and the effect of errors of weight
higher than three can be ignored. (This might still be an overestimate of higher-order terms.)
Let γ be the ratio of the memory error rate of the idle qubits to the gate error rate. Let
$$B = \sum_{\substack{i,j=1 \\ i \le j}}^{6} \alpha_{i,j} + \sum_{i=1}^{6} \gamma\, \alpha_{i,7} + \gamma^{2} \alpha_{7,7}$$
be the effective number of malignant pairs and
$$A = \binom{164}{3} + \binom{164}{2}\binom{32}{1}\gamma + \binom{164}{1}\binom{32}{2}\gamma^{2} + \binom{32}{3}\gamma^{3}$$
be the effective number of errors of weight 3, where the $\binom{a}{b}$ are binomial coefficients. Then
Eq. (1) reduces to
$$A\left(\epsilon_6^{(m)}\right)^{2} + B\, \epsilon_6^{(m)} < 1.$$
If we assume the error rates are the same for all types of locations (γ = 1), we have
B = 2,892 and $A = \binom{196}{3}$, and Eq. (1) gives an error threshold of
$$\epsilon(\gamma = 1) < 3.06 \times 10^{-4}.$$
We compare our results with those of the Steane code and the Bacon-Shor code for γ = 1.0
and 0.1 in Table 2. The rigorous error thresholds obtained in [12, 23, 30] are also listed
as a reference. Knill’s postselection scheme has the highest error threshold of $O(10^{-4})$, as
expected. Remark: we obtain 714 malignant pairs and calculate a threshold of $1.05 \times 10^{-3}$ if
we assume no SWAP or memory errors.
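The threshold condition of Section 4.1 is a quadratic inequality in the error rate, so the bound can be evaluated numerically from the malignant-pair counts. The following is a minimal sketch under the definitions of A and B given above, with the pair sum taken over i ≤ j as in Eq. (1); the names `ALPHA`, `effective_pairs`, `effective_triples`, and `threshold` are hypothetical.

```python
from math import comb, sqrt

# Upper-triangular counts alpha[i][j] (i <= j) of malignant pairs, copied from
# the matrix in the text; types 1-7 map to indices 0-6, index 6 being "idle".
ALPHA = [
    [4, 8, 8, 0, 0, 32, 16],
    [0, 0, 0, 14, 96, 80, 32],
    [0, 0, 16, 0, 96, 104, 32],
    [0, 0, 0, 16, 96, 112, 32],
    [0, 0, 0, 0, 442, 672, 268],
    [0, 0, 0, 0, 0, 322, 288],
    [0, 0, 0, 0, 0, 0, 106],
]
N_GATES, N_IDLE = 164, 32  # non-idle and idle locations in the 1-exRec


def effective_pairs(gamma: float) -> float:
    """B: malignant pairs, with each idle location weighted by gamma."""
    gate_gate = sum(ALPHA[i][j] for i in range(6) for j in range(i, 6))
    gate_idle = sum(ALPHA[i][6] for i in range(6))
    return gate_gate + gamma * gate_idle + gamma ** 2 * ALPHA[6][6]


def effective_triples(gamma: float) -> float:
    """A: all weight-3 fault sets, with each idle location weighted by gamma."""
    return sum(
        comb(N_GATES, 3 - k) * comb(N_IDLE, k) * gamma ** k for k in range(4)
    )


def threshold(gamma: float) -> float:
    """Positive root of A*eps**2 + B*eps = 1."""
    a, b = effective_triples(gamma), effective_pairs(gamma)
    return 2.0 / (b + sqrt(b * b + 4.0 * a))


if __name__ == "__main__":
    print(effective_pairs(1.0), threshold(1.0))
```

With γ = 1 the sketch gives B = 2892 and a root of approximately 3.06 × 10^{-4}, in line with the value quoted above. The root is written as 2/(B + sqrt(B² + 4A)) rather than the textbook quadratic formula to avoid subtracting two nearly equal numbers.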
4.2 Pseudo-Threshold
Knill reported a simulated pseudo-threshold of 3% by his postselection scheme over unbiased
and independent depolarizing noise [14]. We now present a Monte Carlo simulation of the
circuit in Fig. 11 over depolarizing noise to obtain pseudo-thresholds for the Knill scheme in
two dimensions, as in [16]. In our model, we add depolarizing errors as quantum operations
after gates or before measurements in the circuit. Let p be the depolarizing rate. Any
single-qubit location (other than measurements) undergoes X, Y, or Z with probability p/3 each. Any
binary measurement outcome is flipped with probability p. CNOT gates are modified by one
of the 15 non-identity two-qubit Pauli operators (IX, IZ, · · · , YY) with probability p/15 each.
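The noise model just described amounts to three small sampling rules. The sketch below illustrates them with Python's standard `random` module; it is an assumption-laden illustration rather than the simulator used for the results, and the function names are hypothetical.

```python
import random

SINGLE = ["X", "Y", "Z"]
# The 15 non-identity two-qubit Paulis: every pair except "II".
TWO = [a + b for a in "IXYZ" for b in "IXYZ" if a + b != "II"]


def single_qubit_error(p: float, rng: random.Random) -> str:
    """Return X, Y, or Z with probability p/3 each, otherwise I."""
    return rng.choice(SINGLE) if rng.random() < p else "I"


def two_qubit_error(p: float, rng: random.Random) -> str:
    """Return one of the 15 non-identity Paulis with probability p/15 each."""
    return rng.choice(TWO) if rng.random() < p else "II"


def flip_measurement(outcome: int, p: float, rng: random.Random) -> int:
    """Flip a binary measurement outcome with probability p."""
    return outcome ^ 1 if rng.random() < p else outcome


if __name__ == "__main__":
    rng = random.Random(2013)  # fixed seed so a run is reproducible
    print([single_qubit_error(0.3, rng) for _ in range(10)])
    print([two_qubit_error(0.3, rng) for _ in range(5)])
```

Passing an explicit `random.Random` instance keeps the sampler reproducible and makes it easy to use independent random streams for different locations in a circuit.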
We obtain the pseudo-thresholds by calculating the logical error rate of the circuit in
Fig. 11. The logical error rate e(p) for a given depolarizing rate p (the worst gate error rate)
is defined as the number of samples with logical errors at the output of the circuit in
Fig. 11, divided by the number of samples in which no errors are detected. If an
error-detecting code works, e(p) < p for p small enough, and e(p) is an increasing
function of p. The pseudo-threshold $\tilde{\epsilon}$ is the value of p such that $e(\tilde{\epsilon}^{-}) < \tilde{\epsilon}$ and $e(\tilde{\epsilon}^{+}) > \tilde{\epsilon}$.
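The ratio that defines e(p) can be illustrated on a deliberately simplified toy model: a single C4 block whose four qubits each suffer independent depolarizing noise, followed by perfect syndrome measurement. This is not the circuit of Fig. 11, so its numbers are not comparable with the pseudo-thresholds reported here; the sketch only shows how postselected samples enter the numerator and denominator. Because the toy model is small, the ratio is computed exactly by enumerating all Pauli errors instead of sampling. The function names are hypothetical.

```python
from itertools import product


def anticommutes(error: str, op: str) -> bool:
    """True if two Pauli strings anticommute (phases ignored)."""
    clashes = sum(1 for a, b in zip(error, op) if a != "I" and b != "I" and a != b)
    return clashes % 2 == 1


def logical_error_rate(p: float) -> float:
    """Exact e(p) for one C4 block with i.i.d. depolarizing noise on 4 qubits."""
    accepted = 0.0
    accepted_with_logical_error = 0.0
    for error in product("IXYZ", repeat=4):
        weight = sum(1 for c in error if c != "I")
        prob = (p / 3) ** weight * (1 - p) ** (4 - weight)
        e = "".join(error)
        # Postselection: keep only errors with trivial syndrome.
        if anticommutes(e, "XXXX") or anticommutes(e, "ZZZZ"):
            continue
        accepted += prob
        # A logical error acts nontrivially on the logical (not spectator) qubit.
        if anticommutes(e, "XXII") or anticommutes(e, "ZIZI"):
            accepted_with_logical_error += prob
    return accepted_with_logical_error / accepted


if __name__ == "__main__":
    for p in (0.001, 0.01, 0.1):
        print(p, logical_error_rate(p))
```

In this toy model the leading contribution to e(p) comes from undetected weight-two errors, so e(p) scales as p² for small p and lies below p, which is the qualitative behavior the definition of the pseudo-threshold relies on.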
If we assume all locations have the same depolarizing rate p (γ = 1.0), we find a pseudo-
threshold of about 0.1%. If γ = 0.1, we obtain a pseudo-threshold of about 0.2%. These values
are higher than the error thresholds estimated in the adversarial noise model, as expected,
since the adversarial noise model is the worst case. As a comparison, we calculated a pseudo-
threshold of about 0.8% for the Knill scheme without locality constraints.
Remark: we can reduce the depolarizing rates on the measurements and ancilla prepara-
tions as Knill did in [14] by choosing these error rates to be 4/15 of the worst gate error rate.
We obtain a pseudo-threshold of about 0.35% by choosing γ = 0.1 for the Knill scheme in two
dimensions. For the Knill scheme without locality constraints, we obtain a pseudo-threshold
of about 2.5%.
5 Discussion
We designed a two-dimensional 5× 5 qubit tile for quantum computation using the concate-
nated C4 code with postselection. Although we didn’t prove the optimality of our design, we
believe that a substantial improvement within our architectural framework is unlikely.
In this paper we demonstrated the tile operations of the ED+ block for structure I. Different
combinations of error detection (ED+ or ED0) and tile structures (I or II) require small
modifications to the logical gates involving two tiles, such as vSWAP, hSWAP, vCNOT, and
hCNOT. These modifications can be done by slightly changing the locations of di, ai, ai+4 for
i = 1, 2, 3, 4 in our demonstration. For example, we have to modify the ancilla preparation
of an ED+ that follows a vSWAP, and it takes one more time step and four more lower-level
SWAPs than a vSWAP followed by an ED0.
It is desirable to reduce the size of the tile, and hence the number of SWAPs, for physical
architectures with very low memory error rates, such as superconducting qubits. To that end,
we have also designed a 4 × 4 tile, and its performance is compared with the 5 × 5 tile in
Table 3 for different ratios of memory error to gate error rate. The 4 × 4 tile has a higher
threshold with no memory error (γ = 0). Surprisingly, the error threshold of the 4 × 4 tile
decreases by a factor of about two for γ = 0.1. This is probably because there are many more
idle qubits in the 4 × 4 tile, and the operations in the two code blocks of the 4 × 4 tile are
not parallel: one block is delayed by one time step as shown in the tile operations online.
However, the error thresholds of the 5 × 5 tile for γ = 0.1 and γ = 0 are about the same.
The effects of some errors may cancel each other due to the symmetry in the 1-exRec of the
CNOT gate in the 5× 5 tile.
             1-exRec of the CNOT gate              error threshold
    tile     SWAPs   idle qubits   time steps      γ = 1                    γ = 0.1                  γ = 0.0
    4 × 4    38      74            16              $1.47 \times 10^{-4}$    $2.22 \times 10^{-4}$    $4.89 \times 10^{-4}$
    5 × 5    48      32            14              $3.06 \times 10^{-4}$    $4.06 \times 10^{-4}$    $4.14 \times 10^{-4}$

Table 3. Comparison of the 4 × 4 and 5 × 5 tiles.
Under the realistic assumption that one- and two-qubit quantum gates are local, our
threshold analyses establish that Knill’s postselection scheme achieves a higher error
threshold than the concatenated Bacon-Shor and Steane codes. This makes our proposed
two-dimensional architecture a practical choice for quantum error correction.
In addition to the postselection scheme based on error detection, Knill also proposed
a Fibonacci scheme to further reduce the overhead of the postselection scheme [14], for
which he calculated a pseudo-threshold of about 1%. The Fibonacci scheme uses the fact that
the concatenated error-detecting code C4 can correct located errors. Aliferis and Preskill
showed that the error threshold of the Fibonacci scheme is slightly lower than that of the
postselection scheme in the adversarial noise model [31]. Nonrecursive versions of the CNOT
gates or the measurements in the Fibonacci scheme would take many time steps without error
detection or correction in our two-dimensional architecture, which might lead to a much worse
error threshold. Nevertheless, we still consider finding the threshold of the Fibonacci scheme,
combined with the “soft decision” decoder in [32], an interesting question for future work.
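The statement that an error-detecting code can correct located errors can be illustrated on a single level of C4. The sketch below assumes that C4 is the [[4,2,2]] code with stabilizer generators XXXX and ZZZZ; the function names are illustrative only, and the example does not model concatenation or the Fibonacci scheme itself.

```python
# Assumption: C4 is the [[4,2,2]] code stabilized by XXXX and ZZZZ.
STABILIZERS = ("XXXX", "ZZZZ")


def anticommutes(p, q):
    """Two single-qubit Paulis anticommute iff both are non-identity and differ."""
    return p != "I" and q != "I" and p != q


def syndrome(error):
    """One bit per stabilizer generator: 1 if `error` anticommutes with it."""
    return tuple(
        sum(anticommutes(e, s) for e, s in zip(error, stab)) % 2
        for stab in STABILIZERS
    )


def single_qubit_error(pauli, position, n=4):
    """Pauli string of length n with `pauli` at `position` and identity elsewhere."""
    return "".join(pauli if i == position else "I" for i in range(n))


def correct_located(syn, position):
    """Given a syndrome and a known error location, return the Pauli that occurred."""
    for pauli in "IXYZ":
        if syndrome(single_qubit_error(pauli, position)) == syn:
            return pauli
    raise ValueError("syndrome not consistent with a single located error")


if __name__ == "__main__":
    # Unlocated case: X on qubit 0 and X on qubit 1 give the same syndrome,
    # so the error is detected but cannot be identified.
    print(syndrome("XIII"), syndrome("IXII"))
    # Located case: at a known position, I, X, Y, Z give four distinct syndromes.
    for pauli in "IXYZ":
        print(pauli, syndrome(single_qubit_error(pauli, 2)))
```

Without location information the two syndrome bits only flag that some error occurred. Once the position is known, the same two bits distinguish I, X, Y, and Z on that qubit, so the error can be undone.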
Acknowledgements
This work was supported by the Intelligence Advanced Research Projects Activity (IARPA)
via Department of Interior National Business Center contract numbers D11PC20165 and
D11PC20167. The U.S. Government is authorized to reproduce and distribute reprints for
Governmental purposes notwithstanding any copyright annotation thereon. The views and
conclusions contained herein are those of the authors and should not be interpreted as
necessarily representing the official policies or endorsements, either expressed or implied, of IARPA,
DoI/NBC, or the U.S. Government.
References
1. P. W. Shor, “Fault-tolerant quantum computation,” in Proceedings of the 37th Annual Symposium
on Foundations of Computer Science. Los Alamitos: IEEE Press, 1996, pp. 56–65.
2. D. Aharonov and M. Ben-Or, “Fault-tolerant quantum computation with constant error,” in
Proceedings of the twenty-ninth annual ACM symposium on Theory of computing, ser. STOC ’97.
New York, NY, USA: ACM, 1997, pp. 176–188.
3. D. P. DiVincenzo and P. W. Shor, “Fault-tolerant error correction with efficient quantum codes,”
Phys. Rev. Lett., vol. 77, no. 15, pp. 3260–3263, 1996.
4. E. Knill, R. Laflamme, and W. Zurek, “Threshold accuracy for quantum computation,” 1996.
5. D. Gottesman, “Stabilizer codes and quantum error correction,” Ph.D. dissertation, California
Institute of Technology, Pasadena, CA, 1997.
6. ——, “Theory of fault-tolerant quantum computation,” Phys. Rev. A, vol. 57, pp. 127–137, Jan
1998.
7. A. M. Steane, “Active stabilization, quantum computation, and quantum state synthesis,” Phys.
Rev. Lett., vol. 78, no. 11, pp. 2252–2255, 1997.
8. ——, “Space, time, parallelism and noise requirements for reliable quantum computing,” Fortschr.
Phys., vol. 46, pp. 443–457, 1998.
9. J. Preskill, “Reliable quantum computers,” in Proc. R. Soc. London A, vol. 454, 1998, pp. 385–410.
10. A. M. Steane, “Efficient fault-tolerant quantum computing,” Nature, vol. 399, pp. 124–126, 1999.
11. ——, “Overhead and noise threshold of fault-tolerant quantum error correction,” Phys. Rev. A,
vol. 68, p. 042322, Oct 2003.
12. P. Aliferis, D. Gottesman, and J. Preskill, “Quantum accuracy threshold for concatenated
distance-3 codes,” Quantum Inf. Comput., vol. 6, no. 2, pp. 97–165, Mar. 2006.
13. B. Reichardt, “Fault-tolerance threshold for a distance-three quantum code,” in Automata,
Languages and Programming, 33rd International Colloquium, ICALP 2006, Venice, Italy, July 10-14,
2006, Proceedings, Part I, ser. Lecture Notes in Computer Science, M. Bugliesi, B. Preneel,
V. Sassone, and I. Wegener, Eds., vol. 4051. Springer, 2006, pp. 50–61.
14. E. Knill, “Quantum computing with realistically noisy devices,” Nature, vol. 434, pp. 39–44, 2005.
15. P. Aliferis and A. W. Cross, “Subsystem fault tolerance with the Bacon-Shor code,” Phys. Rev.
Lett., vol. 98, p. 220502, May 2007.
16. A. W. Cross, D. P. DiVincenzo, and B. M. Terhal, “A comparative code study for quantum fault
tolerance,” Quantum Inf. Comput., vol. 9, no. 7, pp. 541–572, Jul. 2009.
17. K. M. Svore, D. P. DiVincenzo, and B. M. Terhal, “Noise threshold for a fault-tolerant
two-dimensional lattice architecture,” Quantum Inf. Comput., vol. 7, no. 4, pp. 297–318, May
2007.
18. A. M. Steane, “Error correcting codes in quantum theory,” Phys. Rev. Lett., vol. 77, no. 5, pp.
793–797, 1996.
19. D. Bacon, “Operator quantum error-correcting subsystems for self-correcting quantum memories,”
Phys. Rev. A, vol. 73, no. 1, p. 012340, Jan 2006.
20. P. W. Shor, “Scheme for reducing decoherence in quantum computer memory,” Phys. Rev. A,
vol. 52, no. 4, p. R2493, Oct 1995.
21. F. M. Spedalieri and V. P. Roychowdhury, “Latency in local, two-dimensional, fault-tolerant
quantum computing,” Quantum Inf. Comput., vol. 9, no. 7, pp. 666–682, Jul. 2009.
22. A. M. Stephens and Z. W. E. Evans, “Accuracy threshold for concatenated error detection in
one dimension,” Phys. Rev. A, vol. 80, p. 022313, Aug 2009.
23. P. Aliferis, D. Gottesman, and J. Preskill, “Quantum accuracy threshold for concatenated distance-
3 codes,” ArXiv quant-ph/0504218v3, 2005.
24. A. R. Calderbank, E. M. Rains, P. W. Shor, and N. J. A. Sloane, “Quantum error correction and
orthogonal geometry,” Phys. Rev. Lett., vol. 78, no. 3, pp. 405–408, 1997.
25. K. M. Svore, B. M. Terhal, and D. P. DiVincenzo, “Local fault-tolerant quantum computation,”
Phys. Rev. A, vol. 72, p. 022317, Aug 2005.
26. X. Zhou, D. W. Leung, and I. L. Chuang, “Methodology for quantum logic gate construction,”
Phys. Rev. A, vol. 62, p. 052316, Oct 2000.
27. S. Bravyi and A. Kitaev, “Universal quantum computation with ideal Clifford gates and noisy
ancillas,” Phys. Rev. A, vol. 71, p. 022316, Feb 2005.
28. M. Suchara, A. Faruque, C. Lai, G. Paz, F. Chong, and J. Kubiatowicz, “Estimating the resources
for quantum computation with the QuRE Toolbox,” QCS technical report, 2013.
29. S. Aaronson and D. Gottesman, “Improved simulation of stabilizer circuits,” Phys. Rev. A,
vol. 70, p. 052328, Nov 2004.
30. P. Aliferis and A. W. Cross, “Subsystem fault tolerance with the Bacon-Shor code,” Phys. Rev.
Lett., vol. 98, no. 22, p. 220502, Jun. 2007.
31. P. Aliferis and J. Preskill, “Fibonacci scheme for fault-tolerant quantum computation,”
Phys. Rev. A, vol. 79, 2009.
32. Z. W. Evans and A. M. Stephens, “Optimal correction of concatenated fault-tolerant quantum
codes,” Quant. Inf. Proc., vol. 11, no. 6, pp. 1511–1521, Dec. 2012.
