Large Scale Modular Quantum Computer Architecture with Atomic Memory and
  Photonic Interconnects by Monroe, C. et al.
C. Monroe1, R. Raussendorf2, A. Ruthven2, K. R. Brown3, P. Maunz4∗, L.-M. Duan5, and J. Kim4
1 Joint Quantum Institute, University of Maryland Department of Physics and
National Institute of Standards and Technology, College Park, MD 20742, USA
2 Department of Physics and Astronomy, University of British Columbia, Vancouver, BC V6T1Z1, Canada
3 Schools of Chemistry and Biochemistry; Computational Science and Engineering; and Physics,
Georgia Institute of Technology, Atlanta, GA 30332, USA
4 Department of Electrical and Computer Engineering, Duke University, Durham, NC 27708, USA
5 Department of Physics and MCTP, University of Michigan, Ann Arbor, MI 48109,
USA and Center for Quantum Information, Tsinghua University, Beijing 100084, China∗
(Dated: July 19, 2013)
The practical construction of scalable quantum computer hardware capable of executing non-trivial quantum
algorithms will require the juxtaposition of different types of quantum systems. We analyze a modular ion
trap quantum computer architecture with a hierarchy of interactions that can scale to very large numbers of
qubits. Local entangling quantum gates between qubit memories within a single register are accomplished
using natural interactions between the qubits, and entanglement between separate registers is completed via a
probabilistic photonic interface between qubits in different registers, even over large distances. We show that
this architecture can be made fault-tolerant, and demonstrate its viability for fault-tolerant execution of modest
size quantum circuits.
I. INTRODUCTION
A quantum computer is composed of at least two quantum
systems that serve critical functions: a reliable quantum mem-
ory for hosting and manipulating coherent quantum superpo-
sitions, and a quantum bus for the conveyance of quantum
information between memories. Quantum memories are typ-
ically formed out of matter such as individual atoms, spins
localized at quantum dots or impurities in solids, or supercon-
ducting junctions [1]. On the other hand, the quantum bus
typically involves propagating quantum degrees of freedom
such as electromagnetic fields (photons) or lattice vibrations
(phonons). A suitable and controllable interaction between
the memory and the bus is necessary to efficiently execute a
prescribed quantum algorithm. The current challenge in any
quantum computer architecture is to scale the system to very
large sizes, where errors are typically caused by speed limi-
tations and decoherence of the quantum bus or its interaction
with the memory. The most advanced quantum bit (qubit) net-
works have thus been established only in very small systems,
such as individual atomic ions bussed by the local Coulomb
interaction [2] or superconducting Josephson junctions cou-
pled capacitively or through microwave striplines [3, 4]. In
this paper, we propose and analyze a hierarchy of quantum
information processing units in a modular quantum computer
architecture that may allow the scaling of high performance
quantum memories to useful sizes [5]. This architecture
compares to the “multicore” classical information processor, and
is suitable for the implementation of complex quantum
circuits utilizing the flexible connectivity provided by a
reconfigurable photonic interconnect network. Unlike previous
related proposals [6–11], we show this reconfigurable
architecture can be made fault-tolerant over a wide range of system
parameters, using a variety of fault-tolerant schemes. All of
the rudiments of this architecture have been demonstrated in
small-scale trapped ion systems, and we speculate on the
technological hurdles ahead in order to realize such a system.
∗Present address: Sandia National Laboratories, Albuquerque, N.M. 87123, USA
We specialize to the use of atomic ion qubit memories,
due to the outstanding qubit properties demonstrated to date.
Qubits stored in ions enjoy a level of coherence that is un-
matched in any other physical system, underlying the reason
such states are also used as high performance atomic clocks.
Moreover, atomic ions can be initialized and detected with
nearly perfect accuracy using conventional optical pumping
and state-dependent fluorescence techniques [12]. There have
been many successful demonstrations of controlled entangle-
ment of several-ion quantum registers in the past decade in-
volving the use of qubit state-dependent forces supplied by
laser beams [2, 13]. These experiments exploit the collective
motion of a small number of trapped ion qubits, but as the size
of the ion chain grows, such operations are more susceptible
to external noise, decoherence, or speed limitations.
One promising approach to scaling trapped ion qubits is
the quantum charge-coupled device (QCCD), where phys-
ical shuttling of ions between trapping zones in a multi-
plexed trap is used to transfer qubits between (short) chains
of ions [12, 14]. This approach involves advanced ion trap
structures, perhaps with many times more discrete electrodes
as trapped ion qubits, and therefore motivates the use of
micrometer-scale surface traps [15–17] and novel fabrication
techniques [18–20]. The shuttling approach requires careful
control of the time-varying trapping potential to manipulate
the position of the atomic ion, and cannot easily be extended
over large distances for quantum communications applica-
tions. The QCCD approach is expected to enable a quantum
information processing platform where basic quantum error
correction and quantum algorithms can be realized. Further
scaling is likely limited by the complexity of the trap design,
diffraction of optical beams, and the hardware controllers to
operate the system.
arXiv:1208.0391v2 [quant-ph] 2 Jul 2013
Figure 1: (Color online) Hierarchical modular quantum computer
architecture hosting $N = N_{\mathrm{ELU}} N_q$ qubits. (a) The elementary logic
unit (ELU) consists of a register of $N_q$ trapped atomic ion qubits,
whereby entangling quantum logic gates are mediated through the
local Coulomb interaction between qubits. (b) One or more atomic
qubits within each of the $N_{\mathrm{ELU}}$ registers are coupled to photonic
quantum channels, and through a reconfigurable optical crossconnect
switch (OXC, center), fiber beamsplitters and a position-sensitive
imager (right), qubits in different registers can be entangled.
Here we describe and analyze a modular universal scal-
able ion trap quantum computer (MUSIQC) architecture that
may enable construction of quantum processors with up to
$10^6$ qubits utilizing component technologies that have already
been demonstrated. This architecture features two elements:
stable trapped ion multi-qubit registers that can further be
connected with ion shuttling, and scalable photonic intercon-
nects that can link these registers in a flexible configuration
over large distances, as shown in Fig. 1. We articulate
architectural advantages of this approach that allow significant
speedup and resource reduction in quantum circuit execution
over other hardware architectures, enabled by the ability to op-
erate quantum gates between qubits throughout the entire pro-
cessor regardless of their relative location. Finally, we prove
how such a quantum network can support fault-tolerant error
correction even in the face of probabilistic interconnects, and
discuss the technological developments necessary for its real-
ization. While we focus our discussions on quantum registers
composed of trapped atomic ions, the networking aspect of
this architecture is applicable to other qubit platforms that fea-
ture strong optical transitions, such as quantum dots, neutral
atoms, or nitrogen-vacancy (NV) color centers in diamond [1].
II. QUANTUM COMPUTING IN A MODULAR
ARCHITECTURE
A. The Modular Elementary Logic Unit (ELU)
Figure 2: (Color online) Elementary Logic Unit (ELU) composed
of a single crystal of $N_q$ trapped atomic ion qubits coupled through
their collective motion. (a) Classical laser fields impart
qubit-state-dependent forces on one or more ions, effecting entangling quantum
gates between the memory qubits. A second ion species is introduced
as “refrigerator” ions. (b) One or more of the ions (rightmost in the
figure) are coupled to a photonic interface, where a classical laser
pulse maps the state of these communication qubits onto the states
of single photons (e.g., polarization or frequency), which then
propagate along an optical fiber to be interfaced with other ELUs.
[Panel labels: (a) qubit ions, “refrigerator” ions, control laser fields;
(b) communication qubit, excitation laser field, collection optics,
single-mode fiber.]
The base unit of MUSIQC is a collection of $N_q$ qubit memories
with local interactions, called the Elementary Logic Unit
(ELU). Quantum logic operations within the ELU are ideally
fast and deterministic, with error rates sufficiently small that
fault-tolerant error correction within an ELU is possible [21].
We represent the ELU with a crystal of $N_q \gg 1$ trapped
atomic ions as shown in Fig. 2a, with each qubit comprised of
internal energy levels of each ion, labeled as $|\uparrow\rangle$ and $|\downarrow\rangle$,
separated by frequency $\omega_0$. We assume the qubit levels are coupled
through an atomic dipole operator $\hat{\mu} = \mu(|\uparrow\rangle\langle\downarrow| + |\downarrow\rangle\langle\uparrow|)$.
The ions interact through their external collective modes of
quantum harmonic motion. Such phonons can be used to
mediate entangling gates through application of qubit-state-
dependent optical or microwave dipole forces [22–24]. There
are many known protocols for phonon-based gates between
ions, and here we summarize the main points relevant to the
size of the ELU and the larger architecture.
An externally applied near-resonant running wave field
with amplitude $E(\hat{x}) = E_0 e^{ik\hat{x}}$ and wavenumber $k$
couples to the atomic dipole through the interaction Hamiltonian
$\hat{H} = -\hat{\mu}E(\hat{x})$, and by suitably tuning the field near sidebands
induced by the harmonic motion of the ions [12] a qubit state
dependent force results. In this way, qubits can be mapped
onto phonon states [12, 22] and then onto other qubits for
entangling operations with characteristic speed $R_{\mathrm{gate}} = \eta\Omega$,
where $\eta = \sqrt{\hbar k^2/(2 m_0 N_q \omega)}$ is the Lamb-Dicke parameter,
$m_0$ is the mass of each ion, $\omega$ the frequency of harmonic
oscillation of the collective phonon mode, and $\Omega = \mu E_0/(2\hbar)$ is
the Rabi frequency of the atomic dipole independent of
motion. For optical Raman transitions between qubit states (e.g.,
atomic hyperfine ground states) [12], two fields are each
detuned by $\Delta$ from an excited state of linewidth $\gamma \ll \Delta$, and
when their difference frequency is near resonant with the qubit
frequency splitting $\omega_0$, we use instead $\Omega = (\mu E_0)^2/(2\hbar^2\Delta)$.
The typical gate speed within an ELU therefore slows down
with the number of qubits $N_q$ as $R_{\mathrm{gate}} \sim N_q^{-1/2}$. As the size
of the ELU grows, so will the coupling between the modes
of collective motion that could lead to crosstalk. However,
through the use of pulse-shaping techniques [25], the crosstalk
errors need not be debilitating, although the effective speed of
a gate will slow down with size Nq . Changes of the ions’ mo-
tional states during the gate, arising from sources like heating
of the motional modes [26–28] or fluctuating fields, will de-
grade the quality of the gates, leading to practical limits on
the size of the ELU on which high performance gates can be
realized. It is likely that long chains will require periodic
“refrigerator” ions to remove motional excitations between
gates. Since cooling is a dissipative process, these cooling
ions should be chosen to be a different isotope or species of
ion and quench motional heating through sympathetic cooling
[29]. We estimate that ELUs ranging from Nq = 10 to 100
should be possible [2, 13]. More than one ELU chain can
be integrated into a single chip by employing ion shuttling
through more complex ion trap structures [14]. Such extended
ELUs (EELUs) consisting of NE ELU chains can contain a
total of NqNE = 20 to 1,000 physical qubits. For simplicity,
we focus the remainder of the article on systems with one
ELU per chip (NE = 1).
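The scaling of the gate speed with ELU size can be checked numerically. The following is a minimal sketch that evaluates the Lamb-Dicke parameter and the rate $R_{\mathrm{gate}} = \eta\Omega$ defined above. The ion mass, wavelength, mode frequency and Rabi rate are assumed illustrative values and are not taken from the text; for Raman transitions the relevant wavenumber would be that of the two-field difference wavevector, which this sketch ignores.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J*s
AMU = 1.66053906660e-27  # atomic mass unit, kg


def lamb_dicke(wavelength_m, mass_amu, mode_freq_hz, n_ions):
    """eta = sqrt(hbar k^2 / (2 m0 Nq omega)) for one collective mode."""
    k = 2 * math.pi / wavelength_m
    omega = 2 * math.pi * mode_freq_hz
    m0 = mass_amu * AMU
    return math.sqrt(HBAR * k**2 / (2 * m0 * n_ions * omega))


def gate_rate(rabi_rate, eta):
    """Characteristic entangling speed R_gate = eta * Omega."""
    return eta * rabi_rate


# Assumed example parameters (hypothetical, for illustration only).
WAVELENGTH_M = 369.5e-9  # optical wavelength
MASS_AMU = 171.0         # ion mass
MODE_FREQ_HZ = 1.0e6     # collective mode frequency omega / 2 pi
RABI_RATE = 1.0          # Omega in arbitrary units

eta_1 = lamb_dicke(WAVELENGTH_M, MASS_AMU, MODE_FREQ_HZ, 1)
for n_q in (1, 10, 100):
    eta = lamb_dicke(WAVELENGTH_M, MASS_AMU, MODE_FREQ_HZ, n_q)
    rel = gate_rate(RABI_RATE, eta) / gate_rate(RABI_RATE, eta_1)
    print(f"Nq = {n_q:3d}: eta = {eta:.4f}, relative gate speed = {rel:.3f}")
```

Under these assumptions, going from 1 to 100 ions reduces the gate speed by a factor of 10, which is the $N_q^{-1/2}$ slowdown stated above.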
B. Probabilistic Linking of ELUs
Two qubits from a pair of ELUs (or EELUs) can be en-
tangled by each emitting photons that interfere with each
other. Entanglement generated between these “communica-
tion qubits” can be utilized as a resource to perform a two-
qubit gate between any pair of qubits, one from each ELU,
using local qubit gates, measurements, and classical commu-
nication between the ELUs. In this scheme, the communica-
tion qubit is driven to an excited state with fast laser pulses
whose duration $\tau_e \ll 1/\gamma$, so that no more than one photon
is emitted from each qubit per excitation cycle following the
atomic radiative selection rules (Fig. 2b). The photon can be
post-selected so that one of its degrees of freedom (polariza-
tion, frequency, etc.) is entangled with the state of the com-
munication qubit [30–33]. When the photons from two com-
munication qubits are mode-matched and interfere on a 50/50
beamsplitter, detectors on the output modes of the beamsplit-
ter can herald the creation of entanglement between the mem-
ory qubits [34–38].
We consider two types of photonic connections, characterized
by the number of total photons used in the entanglement
protocol between two communication qubits [39]. For
type I connections (shown in Fig. 3a), each communication
qubit with an index $i$ (or $j$) is weakly excited with probability
$p_e \ll 1$ and the state of the ion+photon qubit pair is approximately
written (ignoring the higher-order excitation probabilities) as
$\sqrt{1-p_e}\,|\downarrow\rangle_i|0\rangle_i + e^{ikx_i}\sqrt{p_e}\,|\uparrow\rangle_i|1\rangle_i$, where $|n\rangle_i$
denotes the state of $n$ photons radiating from the communication
qubit into an optical mode $i$, $x_i$ is the path length from
the emitter $i$ to a beamsplitter, and $k$ the optical wavenumber
[34]. When two communication qubits $i$ and $j$ are excited
in this way and the photons interfere at the beam splitter,
the detection of a single photon in either detector placed at
the two output ports of the beamsplitter heralds the creation
of the state $[e^{ikx_j}|\downarrow\rangle_i|\uparrow\rangle_j \pm e^{ikx_i}|\uparrow\rangle_i|\downarrow\rangle_j]/\sqrt{2}$ with success
probability $p = p_e F \eta_D$, where $F$ is the fractional solid angle
of emission collected, $\eta_D$ is the detection efficiency including
any losses between the emitter and the detector, and the sign
in this state is determined by which one of the two detectors
fires. Following the heralding of a single photon, the (small)
probability of errors from double excitation and detector dark
counts are given respectively by $p_e^2$ and $R_{\mathrm{dark}}/\gamma$, where $R_{\mathrm{dark}}$ is
the rate of detector dark counts. For type I connections to be
useful, the relative optical path length $x_i - x_j$ must be stable
to much better than the optical wavelength $\sim 2\pi/k$.
Figure 3: (Color online) (a) Type I interference from photons emitted
from two communication qubits. Each qubit is weakly excited so
that single photon emission has a very small probability yet is
correlated with the final qubit state. The output photonic channels are
mode-matched with a 50/50 beamsplitter and subsequent detection
of a photon from either output port heralds the entanglement of the
communication qubits. The probability of two photons present in the
system is much smaller than that of detecting a single photon. (b)
Type II interference involves the emission of one photon from each
communication qubit, where the internal state of the photon (e.g. its
color) is correlated with the qubit state. After two photon interference
at the beamsplitter, coincidence detection of photons at the two
detectors heralds the entanglement of the communication qubits.
[Panel labels: (a) $p_e \ll 1$; (b) $p_e \sim 1$; emitters labeled qubit i and qubit j.]
For type II connections (shown in Fig. 3b), each communication
qubit is excited with near unit probability $p_e \sim 1$
and the single photon carries its qubit through two
distinguishable internal photonic states (e.g., polarization or
optical frequency). For example, the state of the system
containing both communication and photonic qubits is written as
$[e^{ik_\downarrow x_i}|\downarrow\rangle_i|\nu_\downarrow\rangle_i + e^{ik_\uparrow x_i}|\uparrow\rangle_i|\nu_\uparrow\rangle_i]/\sqrt{2}$, where $|\nu_\downarrow\rangle_i$ and $|\nu_\uparrow\rangle_i$
denote the frequency qubit states of a single photon emitted
by the $i$-th communication qubit with wavenumbers $k_\downarrow$ and
$k_\uparrow$ associated with optical frequencies $\nu_\downarrow$ and $\nu_\uparrow$, respectively.
Here, $|\nu_\uparrow - \nu_\downarrow| = \omega_0 \gg \gamma$ so that these two frequency
qubit states are distinguishable. The coincidence detection
of photons from two such communication qubits $i$ and $j$ after
interfering at a 50/50 beam splitter heralds the successful
entanglement of the communication qubits, creating the state
$[e^{i(k_\downarrow x_i + k_\uparrow x_j)}|\downarrow\rangle_i|\uparrow\rangle_j - e^{i(k_\uparrow x_i + k_\downarrow x_j)}|\uparrow\rangle_i|\downarrow\rangle_j]/\sqrt{2}$
with success probability $p = (p_e F \eta_D)^2/2$ [35, 36].
The success probability of the 2-photon type II connection
may be lower than that of the type I connection when the light
collection efficiency is low, but type II connections are much
less sensitive to optical path length fluctuations. The stabil-
ity requirement of the relative path length xi − xj is only
at the level of the wavelength associated with the difference
frequency $2\pi c/\omega_0$ of the photonic frequency qubit, which is
typically at the centimeter scale for hyperfine-encoded com-
munication qubits.
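To make the two stability scales concrete, the short sketch below evaluates the difference-frequency wavelength $2\pi c/\omega_0 = c/f_0$ for an assumed hyperfine splitting and compares it with an assumed optical wavelength. Both numbers are illustrative values for a 171Yb+ ion and are not taken from the text.

```python
C = 299_792_458.0  # speed of light in m/s (exact SI value)


def difference_wavelength_m(qubit_splitting_hz):
    """Wavelength 2*pi*c/omega_0 = c/f_0 for a splitting f_0 = omega_0/(2*pi)."""
    return C / qubit_splitting_hz


# Assumed example values (hypothetical choice of ion, for illustration only).
F_HYPERFINE_HZ = 12.6428e9        # ground-state hyperfine splitting
OPTICAL_WAVELENGTH_M = 369.5e-9   # optical wavelength that sets the type I scale

lam_diff = difference_wavelength_m(F_HYPERFINE_HZ)
print(f"type II stability scale: {lam_diff * 100:.2f} cm")
print(f"type I stability scale : {OPTICAL_WAVELENGTH_M * 1e9:.1f} nm")
print(f"ratio of the two scales: {lam_diff / OPTICAL_WAVELENGTH_M:.1e}")
```

With these assumed values the type II scale comes out at a few centimeters, several orders of magnitude longer than the optical wavelength that constrains type I connections.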
In both cases, the mean connection time is given by
$\tau_E = 1/(Rp)$, where $R$ is the repetition rate of the
initialization/excitation process and $p$ is the success probability
of generating the entanglement. For atomic transitions,
$R \sim 0.1(\gamma/2\pi)$, and for typical free-space light collection
($F \sim 10^{-2}$) and taking $\eta_D \sim 0.2$, we find for a type I
connection $\tau_E \sim 5$ msec ($p_e = 0.05$) and for a type II connection
$\tau_E \sim 250$ msec, where we have assumed $\gamma/2\pi = 20$ MHz.
Type II connections eventually outperform type I connections with
more efficient light collection, which can be accomplished by
integrating optical elements with the ion trap structure without
any fundamental loss in fidelity [40]. Eventually, $\tau_E$ lower
than 1 msec should be possible in both types of connections.
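The two connection-time estimates follow directly from the success probabilities given above. The sketch below reproduces them with the parameter values quoted in the text; the function names are ours and purely illustrative.

```python
def success_prob_type1(p_e, F, eta_D):
    """Type I heralding probability p = p_e * F * eta_D."""
    return p_e * F * eta_D


def success_prob_type2(p_e, F, eta_D):
    """Type II heralding probability p = (p_e * F * eta_D)^2 / 2."""
    return (p_e * F * eta_D) ** 2 / 2


def mean_connection_time(rep_rate_hz, p):
    """Mean time tau_E = 1 / (R p) to herald one entangled pair."""
    return 1.0 / (rep_rate_hz * p)


GAMMA_OVER_2PI_HZ = 20e6          # gamma / 2 pi from the text
R = 0.1 * GAMMA_OVER_2PI_HZ       # repetition rate R ~ 0.1 (gamma / 2 pi)
F, ETA_D = 1e-2, 0.2              # collection fraction and detection efficiency

tau_1 = mean_connection_time(R, success_prob_type1(0.05, F, ETA_D))
tau_2 = mean_connection_time(R, success_prob_type2(1.0, F, ETA_D))
print(f"type I : {tau_1 * 1e3:.1f} ms")
print(f"type II: {tau_2 * 1e3:.1f} ms")
```

Comparing the two expressions with $p_e = 1$ for type II, the type II probability exceeds the type I probability once $F\eta_D > 2p_e^{(\mathrm{I})}$, which for the type I value $p_e = 0.05$ corresponds to $F\eta_D > 0.1$. This is a direct consequence of the formulas above and not an additional result from the text.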
The process to generate ion-ion entanglement using pho-
ton interference requires resonant excitation of the commu-
nication qubits, and steps must be taken to isolate the com-
munication qubit from other memory qubits so that scattered
light from the excitation laser and the emitted photons do not
disturb the spectator memory qubits. It may be necessary to
physically separate or shuttle the communication qubit away
from the others, invoking some of the techniques from the
QCCD approach. This crosstalk can also be eliminated by uti-
lizing a different atomic species for the communication qubit
[41], so that the excitation and emitted light is sufficiently far
from the memory qubit optical resonance to avoid causing de-
coherence. The communication qubits do not require excel-
lent quantum memory characteristics, because once the entan-
glement is established between the communication qubits in
different ELUs, they can immediately be swapped with neigh-
boring memory qubits in each ELU.
C. Reconfigurable Connection Network in MUSIQC
The MUSIQC architecture allows a large number NELU
of ELUs (or EELUs) to be connected with each other using
the photonic channels, as shown in Fig. 1. The connec-
tion is made through an optical crossconnect (OXC) switch
with NELU input and output ports. The photon emitted from
the communication qubit in each ELU is collected into a
single-mode fiber and directed to a corresponding input port
of the OXC switch. Up to NELU/2 Bell state detectors, each
comprised of two fibers interfering on a beam splitter and
two detectors, are connected to the output ports of the OXC
switch. The OXC switch is capable of providing an optical
path between any input fiber and any output fiber that is not
already connected to another input fiber. An ideal OXC switch
achieves full non-blocking connectivity with uniform optical
path lengths. This optical network provides a fully
reconfigurable interconnect network for the photonic qubits, allowing
entanglement generation between any pair of ELUs in the
processor with up to NELU/2 such operations running in
parallel. OXC switches that support 200 to 1,100 ports utilizing
micro-electromechanical systems (MEMS) technology have
been constructed and are readily available [42, 43]. In prac-
tice, the photon detection can be accomplished in parallel with
a conventional charge-coupled-device (CCD) imager or an ar-
ray of photon-counting detectors, with pairs of regions on the
CCD or the array elements associated with particular pairs of
output ports from the fiber beamsplitters, as shown in Fig. 1.
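One way to picture the reconfigurable network is as a scheduler that assigns requested ELU pairs to Bell state detectors. The following is a minimal sketch under two assumptions that are ours, not the text's: each ELU contributes a single communication port, and pairs are packed greedily into rounds. It only illustrates the constraint that at most NELU/2 disjoint pairs can be served in parallel.

```python
def schedule_rounds(n_elu, requests):
    """Greedily pack requested ELU pairs into rounds.

    Within a round every ELU appears at most once and at most
    n_elu // 2 pairs (one per Bell state detector) are served.
    """
    max_parallel = n_elu // 2
    rounds = []
    for a, b in requests:
        if a == b or not (0 <= a < n_elu and 0 <= b < n_elu):
            raise ValueError(f"invalid ELU pair: {(a, b)}")
        for rnd in rounds:
            busy = {elu for pair in rnd for elu in pair}
            if len(rnd) < max_parallel and a not in busy and b not in busy:
                rnd.append((a, b))
                break
        else:
            # No existing round can host this pair, so open a new one.
            rounds.append([(a, b)])
    return rounds


# Hypothetical request list for a 4-ELU processor.
plan = schedule_rounds(4, [(0, 1), (2, 3), (0, 2), (1, 3)])
for i, rnd in enumerate(plan):
    print(f"round {i}: {rnd}")
```

In this example the four requests fit into two rounds of two parallel connections each. A real controller would also have to account for the probabilistic success of each attempt, which this sketch leaves out.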
III. PERFORMANCE ADVANTAGE OF MUSIQC
ARCHITECTURE
A. Computation Model in MUSIQC
In the circuit model of quantum computation, execution
of two-qubit gates creates the entanglement necessary to ex-
ploit the power of quantum physics in computation [21]. In
the alternate model of measurement-based cluster-state quan-
tum computation, all of the entanglement is generated at the
beginning of the computation, followed by conditional mea-
surements of the qubits [44]. The MUSIQC architecture pre-
sented here follows the circuit model of computation within
each ELU, but the probabilistic connection between ELUs is
carried out by generation of entangled Bell pairs similar to
the cluster-state computation model. In this sense, MUSIQC
realizes a hybrid model of quantum computation, driven by
the generation rate and burn rate of entanglement between the
ELUs. In the event the generation rate of entangled Bell pairs
between ELUs is lower than the burn rate, each ELU would
require the capacity to store enough initial entanglement so
that the end of the computation can be reached at the given
generation and burn rates of entanglement. The hybrid na-
ture of MUSIQC provides a unique hardware platform with
three distinct advantages: naturally parallel operation of each
ELU, constant timescale to perform operations between dis-
tant qubits, and moderate ELU size adequate for practical im-
plementation. One can further reduce the entanglement gener-
ation time by time-division multiplexing (TDM) the commu-
nication ports at the expense of added qubits. Moreover, the
temporal mismatch between the remote entanglement gener-
ation and local gates is reduced as the requirement of error
correction increases the logical gate time.
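The storage requirement described above can be illustrated with a deliberately simple model. The sketch assumes constant generation and burn rates over the whole computation, which is our simplification and not a model from the text, and all numbers are hypothetical.

```python
import math


def initial_pairs_needed(gen_rate_hz, burn_rate_hz, run_time_s):
    """Bell pairs that must be stored up front so the stock never runs out.

    Assumes constant rates: the deficit grows linearly at
    (burn - generation) pairs per second for the whole run.
    """
    deficit = (burn_rate_hz - gen_rate_hz) * run_time_s
    return max(0, math.ceil(deficit))


# Hypothetical rates in pairs per second and run time in seconds.
print(initial_pairs_needed(gen_rate_hz=200, burn_rate_hz=500, run_time_s=2.0))
print(initial_pairs_needed(gen_rate_hz=500, burn_rate_hz=200, run_time_s=2.0))
```

When generation keeps up with consumption, no initial stock is needed in this model; otherwise the required storage grows linearly with the run time.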
For a complex quantum algorithm associated with a problem
size of n bits, logical operations between spatially distant
qubit pairs are necessary. In a hardware architecture where
only local gate operations are allowed (e.g., nearest neigh-
bor gates), performing gate operations between two (logical)
qubits separated by long distances could lead to communication
times polynomial in the distance between qubits, $O(n^k)$.
When a large number of parallel operations is available, one
can employ a nested entanglement swapping protocol to ef-
ficiently distribute entanglement with communication times
scaling only logarithmically as a function of communication
distance [45]. The procedure requires extra qubits used to
construct quantum buses for long-distance entanglement
distribution, and an architecture adopting such buses was referred to as
the Quantum Logic Array (QLA) [46]. We construct a simple
model that provides a direct comparison between the QLA
and MUSIQC architectures in terms of the resources required
to execute useful quantum algorithms. Despite the slow en-
tanglement generation times, we find that the performance of
MUSIQC architecture is comparable to QLA (and its varia-
tions [47]), with substantial advantage in required resources
and feasibility for implementation.
In our simplified model, we consider (1) hardware capable
of implementing a Steane [[7,1,3]] quantum error correction
code to multiple levels of concatenation, (2) where all gate
operations are performed following fault-tolerant procedures.
This simplified model is designed to estimate the execution
time of the circuits in select architectures, and not intended
to provide the complete fault-tolerant analysis of the quantum
circuit. For this analysis, we therefore require that the physical
error levels are sufficiently low ($\sim 10^{-7}$) to produce the
correct answer with order-unity probability using only up to three
levels of concatenation of Steane code. The hardware is based
on trapped ion quantum computing with the assumptions for
the timescales for quantum operation primitives summarized
in Table I. The details of the fault-tolerant implementation of the
universal gate set utilized in this analysis are summarized in
Appendix A.
B. Construction of Efficient Arithmetic Circuits
The example quantum circuit we analyze is an adder cir-
cuit that computes the sum of two n-bit numbers. Simple
adder circuits form the basis of more complex arithmetic cir-
cuits, such as the modular exponentiation circuit that domi-
nates the execution time of Shor’s factoring algorithm [48].
Quantum adder circuits can be constructed using X , CNOT
and Toffoli gates. When only local interactions are avail-
able without dedicated buses for entanglement distribution, a
quantum ripple-carry adder (QRCA) is the adequate adder of
choice [49], for which the execution time goes as O(n). For
QLA and MUSIQC architectures, one can implement a
quantum carry-lookahead adder (QCLA) that is capable of
reducing the runtime to O(log n) [50, 51], at the expense of extra
qubits and parallel operations. The QCLA dramatically
outperforms the QRCA for n above ∼ 100 in terms of execution
time. Practical implementation of large-scale QCLAs is
hindered by the requirement of executing Toffoli gates among
qubits that are separated by long distances within the quantum
computer. MUSIQC architecture flattens the communication
cost between qubits in different ELUs, providing a suitable
platform for implementing QCLAs. Alternatively, the QLA
architecture can also efficiently execute QCLAs using a dedicated
communication bus that reduces the connection time between
two qubits (defined as the time it takes to generate entangled
qubit pairs that can be used to teleport one of the qubits or the
gate itself) to increase only as a logarithmic function of the
separation between them [46].
C. MUSIQC Implementation
In order to implement the QCLA circuit in MUSIQC archi-
tecture, each ELU should be large enough to accommodate
the generation of the |φ+〉L state shown in Fig. 9a. This re-
quires a minimum of 3 logical qubits and a 7-qubit cat state,
and sufficient ancilla qubits to support the state preparation.
We balance the qubit resource requirements with computation
time by requiring four ancilla qubits per logical qubit, so that
the 4-qubit cat states necessary for the stabilizer measurement
can be created in parallel. Implementation of each Toffoli gate
is realized by allocating a fresh ELU and preparing the |φ+〉L
state, then teleporting the three qubits from other ELUs into
this state. Once the gate is performed, the original logical
qubits from the other ELUs are freed up and become available
for another Toffoli gate. We find that 6n logical qubits placed
on 6n/4 = 1.5n ELUs are sufficient to compute the sum of two
n-bit integers using the QCLA circuit at the first concatenation
level of Steane code encoding.
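The resource count above is easy to tabulate. The sketch below computes the number of logical qubits and ELUs for an n-bit QCLA from the figures in the text (6n logical qubits, four per ELU). The default of 100 physical qubits per ELU is the value used in the example ELU of this paper; the function name and the rounding up of the ELU count for n not divisible by 2 are our own choices.

```python
import math


def musiqc_adder_resources(n_bits, logical_per_elu=4, physical_per_elu=100):
    """Logical qubits, ELUs and physical qubits for an n-bit QCLA
    at the first level of Steane-code concatenation."""
    logical = 6 * n_bits
    elus = math.ceil(logical / logical_per_elu)   # 1.5 n, rounded up
    return {
        "logical_qubits": logical,
        "elus": elus,
        "physical_qubits": elus * physical_per_elu,
    }


for n in (32, 128, 1024):
    print(n, musiqc_adder_resources(n))
```

For example, a 1024-bit adder needs 6144 logical qubits on 1536 ELUs under these assumptions.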
Teleportation of qubits into the ELU containing the pre-
pared |φ+〉L state requires generation of entangled states via
photon interference. In order to minimize the entanglement
generation time, one should provide at least three optical ports
to connect to these ELUs in parallel. In order to successfully
teleport the gate, we need to create seven entangled pairs to
each ELU holding the input qubits. The entanglement gener-
ation time can be reduced by running multiple optical ports
to other ELUs in parallel (we call this the port multiplexity
mp). In a typical entanglement generation procedure, the ion
is prepared in an initial state, and then excited using a short
pulse laser (∼5 ps). The ion emits a photon over a spontaneous
emission lifetime (∼10 ns), and the photon detection process
will determine whether the entanglement generation from a
pair of such ions is successful. If the entanglement generation
is successful, the pair is ready for use in the computation.
If not, the ions will be re-initialized (∼1 µs) and the process
is repeated. Since the initialization time of the ion is ∼100
times longer than the time a photon is propagating in the op-
tical port, one can utilize multiple ions per optical port and
“pipeline” the photon emission process. In this time-division-
multiplex (TDM) scheme, another ion is brought into the op-
tical port to make another entanglement generation attempt
while the initialization process is proceeding for the
unsuccessful ion. This process can be repeated $m_T$ times using as
many extra ions, before the first ion can be brought back (we
call $m_T$ the TDM multiplexity). Using the port and TDM
multiplexity, we can reduce the entanglement generation time by
a factor of $m_p m_T$.
Table I: Assumptions on the timescales of quantum operation primitives used in the model.
Quantum primitive                 Operation time (µs)
Single-qubit gate                 1
Two-qubit gate                    10
Toffoli gate                      10
Qubit measurement                 30
Remote entanglement generation    3000
Figure 4: (Color online) Example of the MUSIQC and QLA hardware
considered. (a) Each ELU in MUSIQC is made up of 100 physical
qubits and 6 communication ports (only one shown in the figure),
where 60 qubits are used to increase the bandwidth of the remote
entanglement generation. These ELUs are connected through an OXC
switch as shown in Fig. 1. (b) For QLA, each logic unit is made up of
49 physical qubits hosting four logical qubits and necessary ancilla
qubits. (c) A logic block is six such logic units embedded in
communication units. Communication units are square arrangements of
7 × 7 qubits, and eight such units fully surround the logic unit.
[Panel labels: (a) 4 logical qubits (28), communicator qubits (60),
ancilla qubits (12); 12 parallel operations. (b) 7x7 square format:
4 logical qubits (28), ancilla qubits (20), 1 spare qubit; 12 parallel
operations. (c) 6 logic units embedded in communication units: 24 logical
qubits, 882 communication qubits (7x7x18); 441 parallel operations.]
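The effect of the two multiplexities can be sketched as follows. The example uses three input qubits per teleported Toffoli gate, as described above, together with illustrative values $m_p = 2$ and $m_T = 10$ that match the 6 ports and 60 communication qubits of Fig. 4a. The single-port generation time of 5 ms is an assumed input, and the function names are ours.

```python
def effective_generation_time(tau_e_s, m_p, m_t):
    """Entanglement generation time reduced by the factor m_p * m_T."""
    return tau_e_s / (m_p * m_t)


def communication_qubits(n_inputs, m_p, m_t):
    """Communication ions needed: one per input, per port, per TDM slot."""
    return n_inputs * m_p * m_t


M_P, M_T = 2, 10        # port and TDM multiplexity (illustrative values)
N_INPUTS = 3            # logical qubits teleported into each Toffoli ELU
TAU_E_S = 5e-3          # assumed single-port generation time in seconds

print("optical ports       :", N_INPUTS * M_P)
print("communication qubits:", communication_qubits(N_INPUTS, M_P, M_T))
print("effective time (us) :", effective_generation_time(TAU_E_S, M_P, M_T) * 1e6)
```

The speedup is bought with extra ions: the number of communication qubits grows with the same product $m_p m_T$ that reduces the generation time.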
In our example, we assume multiplexities mp = 2 and
mT = 10 that require 100 qubits (= 4×7+3×4+3×2×10)
and 12 parallel operations per ELU as shown in Fig. 4a. This
choice adequately speeds up the communication time between
ELUs to balance out other operation times in the hardware.
Multiple ELUs are connected by an optical switch to
complete the MUSIQC hardware (Fig. 1b). With these resources,
an efficient implementation of the QCLA circuit can be realized by
executing all necessary logic gates in parallel. Under these
circumstances, the depth of the n-bit in-place adder circuit is
given by [50]
$$\lfloor\log_2 n\rfloor + \lfloor\log_2(n-1)\rfloor + \left\lfloor\log_2\frac{n}{3}\right\rfloor + \left\lfloor\log_2\frac{n-1}{3}\right\rfloor + 14, \qquad (1)$$
for sufficiently large $n$ ($n > 6$), where $\lfloor x\rfloor$ denotes the largest
integer not greater than $x$. Out of these, two time steps
contain X gates, four contain CNOT gates, and the rest contain
Toffoli gates which dominate the execution time of the
circuit. We assume an error correction step is performed on all
qubits after each time step, by measuring all stabilizers of the
Steane code and making necessary corrections based on the
measurement outcome.
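Equation (1) and the breakdown into gate types can be evaluated directly. The sketch below uses integer arithmetic for the floored logarithms, relying on the identity that the floor of log2(x) equals the floor of log2 of the integer part of x for any x of at least 1. The function names are ours.

```python
def floor_log2(m):
    """Largest integer not greater than log2(m), for an integer m >= 1."""
    if m < 1:
        raise ValueError("m must be a positive integer")
    return m.bit_length() - 1


def qcla_depth(n):
    """Depth of the n-bit in-place QCLA from Eq. (1), valid for n > 6."""
    if n <= 6:
        raise ValueError("Eq. (1) is stated for n > 6")
    return (floor_log2(n) + floor_log2(n - 1)
            + floor_log2(n // 3) + floor_log2((n - 1) // 3) + 14)


def toffoli_steps(n):
    """Time steps containing Toffoli gates: total depth minus the
    two X-gate steps and the four CNOT steps."""
    return qcla_depth(n) - 2 - 4


for n in (8, 128, 1024):
    print(f"n = {n:5d}: depth = {qcla_depth(n)}, Toffoli steps = {toffoli_steps(n)}")
```

For n = 1024 this gives a depth of 49 time steps, 43 of which contain Toffoli gates, which illustrates why the Toffoli gate time dominates the circuit execution.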
Once the basic operational primitives outlined in the pre-
vious section are modeled at the first level of code concate-
nation, we can construct all of these primitives at the second
level of concatenation using the primitives at the first level.
We can recursively construct the primitives at higher levels of
code concatenation. Since the cost of remote CNOT gates
between ELUs is independent of the distance between them,
recursive estimation of circuit execution at higher levels of
code concatenation is straightforward on MUSIQC hardware.
D. QLA Implementation
We consider a concrete layout of a QLA device optimized
for an n-bit adder with one level of Steane [[7,1,3]] encoding,
which can be used to construct circuits at higher levels of code
concatenation. In order to implement the fault-tolerant Toffoli
gate described in Fig. 9, one should assemble four logical
qubits into a single tight unit, as we did for the ELUs in the
MUSIQC architecture. In the QLA implementation, a “Logic
Unit (LU)” consists of a square of 49 (= 7 × 7) qubits, where
a block of 12 (= 3 × 4) qubits forms a logical qubit with 7
physical qubits and 5 ancilla qubits (Fig. 4b). Just as in the
MUSIQC example, 6n logical qubits placed on 1.5n LUs are
necessary for adding two n-bit numbers. Therefore, we organize
six LUs into a logical block (LB), capable of adding
two 4-bit numbers. Each LU in the LB is surrounded by eight
communication units of 7 × 7 qubits each, dedicated to
distributing entanglement using the quantum repeater protocol (Fig.
4c). We assume that the communication of the qubits within
each LU is “free”, and do not consider the time it takes for
such communication. This simplifying assumption is justified
because the communication time between LUs, which uses the
qubits in the communication units, dominates the computation
time; neglecting the intra-LU communication therefore does not
change the qualitative conclusion of this estimate.
Similar to the MUSIQC hardware example, a Toffoli gate
execution involves the preparation of the |φ+〉L state in
an “empty” LU, then teleporting three qubits onto this LU to
complete the gate operation. The execution time of the Toffoli
gate therefore comprises the time it takes to prepare
the |φ+〉L state, the time it takes to distribute entanglement
between appropriate pairs of LUs, and the time it takes to use
the distributed entanglement to teleport the gate operation.
Among these, the distribution time for the entanglement is a
function of the distance between the two LUs involved, while
the other two are independent of the distance.
The QCLA circuit involves various stages of Toffoli gates
characterized by the “distance” between qubits, which goes as 2^t,
where 1 ≤ t ≤ ⌊log2 n⌋ [50]. In a 2D layout as considered
in Fig. 4c, the linear distance between these two qubits
goes as √(2^t), in units of the number of communication units
that the entanglement must be generated over. A slightly
more careful analysis shows that the linear distance is
approximately given by d(t) ≈ 3 · 2^(t/2) + 1 when t is even, and
d(t) ≈ 2^((t+1)/2) + 1 when t is odd. Since each communication
unit has 7 qubits along each side, the actual teleportation
distance is L(t) = 7d(t) in units of the length of ion chain.
The nested entanglement swapping protocol can create
entanglement between the two end ions in ⌊log2 L(t)⌋ time steps,
where each time step consists of one CNOT gate, two single-qubit
gates, and one qubit measurement process. Using the
expression for d(t), we approximate log2 L(t) ≈ t/2 + 4 for
both even and odd t, without much loss of accuracy. Unlike
in the case of MUSIQC, the entanglement generation time is
now dependent on the distance between the qubits (although
only logarithmically), and the resulting number of time steps
needed for entanglement distribution within the QCLA is
(approximately) given by
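\[ \frac{\lfloor \log_2 n \rfloor \left( \lfloor \log_2 n \rfloor + 17 \right)}{4} + \frac{\lfloor \log_2 (n-1) \rfloor \left( \lfloor \log_2 (n-1) \rfloor + 17 \right)}{4} + \frac{\left\lfloor \log_2 \frac{n}{3} \right\rfloor \left( \left\lfloor \log_2 \frac{n}{3} \right\rfloor + 17 \right)}{4} + \frac{\left\lfloor \log_2 \frac{n-1}{3} \right\rfloor \left( \left\lfloor \log_2 \frac{n-1}{3} \right\rfloor + 17 \right)}{4}. \]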
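The expression above can be checked numerically. Each of its four terms has the form T(T + 17)/4, which is the sum of the per-stage estimate t/2 + 4 over t = 1, ..., T. The following sketch, with illustrative function names that are assumptions of this example, evaluates d(t), L(t), and the summed estimate.

```python
import math


def floor_log2(x: int) -> int:
    """Largest integer k with 2**k <= x, for a positive integer x."""
    return x.bit_length() - 1


def comm_distance(t: int) -> float:
    """Approximate linear distance d(t), in communication units."""
    if t % 2 == 0:
        return 3 * 2 ** (t / 2) + 1
    return 2 ** ((t + 1) / 2) + 1


def teleport_distance(t: int) -> float:
    """Teleportation distance L(t) = 7 d(t), with 7 qubits per unit side."""
    return 7 * comm_distance(t)


def stage_steps(t: int) -> float:
    """Approximation log2 L(t) ~ t/2 + 4 used in the text."""
    return t / 2 + 4


def summed_steps(T: int) -> float:
    """Closed form of sum_{t=1}^{T} (t/2 + 4) = T (T + 17) / 4."""
    return T * (T + 17) / 4


def distribution_steps(n: int) -> float:
    """Four-term estimate of entanglement distribution steps for n-bit QCLA."""
    limits = (floor_log2(n), floor_log2(n - 1),
              floor_log2(n // 3), floor_log2((n - 1) // 3))
    return sum(summed_steps(T) for T in limits)


if __name__ == "__main__":
    for t in range(1, 8):
        print(t, round(math.log2(teleport_distance(t)), 2), stage_steps(t))
    print("n = 128:", distribution_steps(128), "time steps")
```

The printed comparison shows how closely t/2 + 4 tracks log2 L(t) for small t; the deviation stays below one time step per stage for the stages checked here.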
It should be noted that in order to achieve this logarithmic
time, one has to have the ability to perform two-qubit gates
between every pair of qubits in the entire communication units
in parallel. The addition of two n-bit numbers requires n/4
LBs. Since each LB has 18 communication units, there are
a total of 7 × 7 × 18 = 882 communication qubits in an
LB. The number of parallel operations necessary is therefore
441 simultaneous CNOT operations per LB, or 441n/4 ≈
110n parallel operations for n-bit QCLA. The number of X,
CNOT and Toffoli gates that have to be performed remains
identical to the MUSIQC case since we are executing the
identical circuit. We assume that error correction is performed
after every logic gate, but that the entanglement distribution
process has high enough fidelity that no further distillation
process is necessary.
Similar to the MUSIQC case, one can generate basic operational
primitives at higher levels of code concatenation in the
QLA model. Unlike the first encoding level, one may not
have to explicitly provide communication channels for the
second level of code concatenation if the quality of the
distributed entanglement is sufficiently high that neither
entanglement purification [52] nor error correction of the
entangled pairs [53] is needed. This type of “inter-level
optimization” can be justified because the remote interaction
between two logical qubits at the second level of code
concatenation occurs very rarely, and the communication units
at the first level can be used to accommodate this communication
at the higher level without significant time overhead. If dedicated
communication qubits were provided in addition, these qubits
might sit idle most of the time, leading to inefficient use of the
qubit resources. The number of physical qubits therefore scales
much more favorably at higher levels of code concatenation
than at the first level in the QLA architecture. The
distance-dependent gate operation time at higher levels of code
concatenation is somewhat difficult to predict accurately, but the
logarithmic scaling of communication time allows effective
estimation of the gate operation time with only small errors.
E. Results and Comparison
Figure 5a and Table II summarize the resource requirements
and performance of the QCLA circuit on the MUSIQC and QLA
architectures, as well as the QRCA circuit on nearest-neighbor
(NN) quantum hardware, where multi-qubit gates can only
operate on qubits sitting right next to one another. Although
the QLA architecture considered in this example is also
NN hardware, the presence of the dedicated communication units
(quantum bus) allows remote gate operation with an execution
time that depends only logarithmically on the distance
between qubits, enabling fast execution of the QCLA. The cost
in resources, however, is significant: realization of an efficient
communication channel requires ∼ 3 times as many physical
qubits as are used for storing and manipulating the qubits in the
first level of encoding, and requires a large number of parallel
operations and the necessary control hardware to run them.
The execution time can be fast compared to the MUSIQC
architecture, which is hampered by the probabilistic nature of
the photonic network in establishing the entanglement. We
have dedicated substantial resources in MUSIQC to speed up
the entanglement generation time as described in the previous
section. Although the MUSIQC architecture will take ∼ 15−30%
more time to execute the adder circuit, the resources it requires
to perform the same task are only about 13% of those required in
the QLA architecture. In both cases, we note the importance
of moving qubits between different parts of a large quantum
computer. The speed advantage in adder circuits translates
directly to faster execution of the Shor algorithm, so we adopted
QCLA for further analysis.
Once the execution time and resource requirements are
[Figure 5, plots omitted. Panel (a): log10(adder execution time) [SQGT] versus log10(problem size n) [bits] for QCLA-MUSIQC, QCLA-QLA, and QRCA-NN. Panel (b): log10(execution time) [days] and log10(number of physical qubits) in MUSIQC versus log10(problem size n) [bits], with reference marks at 1 minute, 1 day, 1 month, and 1 year.]
Figure 5: (Color online) (a) Execution time comparison of quantum
ripple-carry adder (QRCA) on a nearest-neighbor architecture (green
triangles), and quantum carry-lookahead adder (QCLA) on QLA (red
squares) and MUSIQC (blue diamonds) architectures, as a function
of the problem size n. All three circuits considered are implemented
fault-tolerantly, using one level of Steane [[7,1,3]] code. The execu-
tion time is measured in units of single qubit gate time (SQGT), as-
sumed to be 1µsec in our model. (b) Execution time (blue diamonds,
left axis) and number of required physical qubits (red squares, right
axis) of running fault-tolerant modular exponentiation circuit, repre-
sentative of executing the Shor algorithm.
Table II: Summary of the resource estimation and execution times of
various adders in MUSIQC and QLA architecture.

Performance metric       QCLA on MUSIQC   QCLA on QLA   QRCA on NN
Physical qubits          150n             1,176n        20(n+1)
# Parallel operations    18n              110n          8n+43
Logical Toffoli (µs)     3,250            2,327 (a)     2,159
128-bit addition         0.16 s           0.13 s        0.56 s
1,024-bit addition       0.22 s           0.18 s        4.5 s
16,384-bit addition      0.29 s           0.25 s        72 s

(a) Does not include entanglement distribution time.
Table III: Estimated execution time and physical qubits necessary
to complete Shor algorithm of a given size. The numbers on top
(bottom) correspond to MUSIQC (QLA) architecture.

Performance metric           n = 32       n = 512      n = 4,096
Code level                   1            2            3
# Physical qubits, MUSIQC    4.7 × 10^4   9.2 × 10^7   4.1 × 10^10
# Physical qubits, QLA       3.7 × 10^5   7.2 × 10^8   3.2 × 10^11
Execution time, MUSIQC       2.5 min      2.1 days     650 days
Execution time, QLA          2.2 min      1.5 days     520 days
identified for the adder circuit, one can adopt the analyses
provided in Ref. [51] to estimate the performance metrics of
running the Shor algorithm. The execution time and total number
of physical qubits necessary to run the Shor algorithm depend
strongly on the level of code concatenation required to
successfully obtain the correct answer. We first estimate the
number of logical qubits (Q) and the total number of logic
gate operations (K) required to complete the Shor algorithm
of a given size, to obtain the product KQ. In order to obtain
correct results with a probability of order unity, the
individual error rate corresponding to one logic gate operation
must be on the order of 1/KQ [46]. From this consideration,
we determine the level of code concatenation to be used.
Table III compares the number of physical qubits and the
execution time of running the Shor algorithm on the MUSIQC
and QLA architectures for factoring 32-, 512-, and
4,096-bit numbers [54].
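The selection of the concatenation level from the product KQ can be sketched as follows. This example assumes the standard scaling of a concatenated code, in which the logical error rate at level L is p_th (p/p_th)^(2^L); that scaling, the function names, and all numerical inputs are assumptions of this sketch and are not taken from the analysis of Ref. [51].

```python
def logical_error_rate(p_phys: float, p_th: float, level: int) -> float:
    """Assumed concatenated-code scaling: p_th * (p_phys / p_th) ** (2 ** level)."""
    return p_th * (p_phys / p_th) ** (2 ** level)


def required_level(kq: float, p_phys: float, p_th: float,
                   max_level: int = 10) -> int:
    """Smallest concatenation level whose logical error rate is at most 1/KQ."""
    if not 0.0 < p_phys < p_th:
        raise ValueError("physical error rate must lie below the threshold")
    target = 1.0 / kq
    for level in range(max_level + 1):
        if logical_error_rate(p_phys, p_th, level) <= target:
            return level
    raise ValueError("target not reached within max_level levels")


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to illustrate the discrete jumps.
    for kq in (1e6, 1e9, 1e12):
        print(f"KQ = {kq:.0e}: level {required_level(kq, 1e-5, 1e-3)}")
```

Because the level is an integer, the required resources change in discrete steps as KQ grows with the problem size.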
Figure 5b shows the execution time (in days) and the total
number of necessary physical qubits for completing the modular
exponentiation circuit on MUSIQC hardware, which is
a good representation of running the Shor algorithm. The
discrete jumps in the resource estimate correspond to the
addition of another level of code concatenation, necessary for
keeping the error rates low enough to obtain a correct result as
the problem size increases. Using 2 levels of concatenated Steane
code, we expect to be able to factor a 128-bit integer in less
than 10 hours, with fewer than 6 × 10^6 physical qubits in the
MUSIQC system. The execution time on the QLA architecture
is comparable to that on the MUSIQC architecture (within 20%),
but the number of required physical qubits is higher by about a
factor of 10. Furthermore, the total size of the single ELU
necessary to implement the QLA architecture grows very quickly
(over 4.5 × 10^7 physical qubits for a machine that can factor
a 128-bit number), while the ELU size in the MUSIQC
architecture is fixed at moderate numbers (≈ 58,000 ELUs with
100 qubits per ELU). Therefore, although still daunting, the
MUSIQC architecture substantially lowers the practical
technological barrier in integration levels necessary for a
large-scale quantum computer.
IV. FAULT TOLERANCE OF PROBABILISTIC PHOTONIC
GATES
Naively, it would appear that the average entanglement creation
time τE must be much smaller than the decoherence
time scale τD to achieve fault tolerance, but we find that scal-
able fault-tolerant quantum computation is possible for any
ratio τE/τD, even in the presence of additional gate errors.
While large values of τE/τD would lead to impractical levels
of overhead in qubits and time (similar to the case of con-
ventional quantum fault tolerance near threshold error levels
[55]), this result is still remarkable and indicates that fault tol-
erance is always possible in the MUSIQC architecture. In this
section, we provide a complete description of the strategies
used to secure fault tolerance in MUSIQC architecture.
A. Analysis of fault-tolerance for fast entangling gates
First, we consider the case τE/τD ≪ 1, where fault
tolerant coding is more practical. When each ELU is large
enough to accommodate logical qubits encoded with a con-
ventional error correcting code, one can implement full fault-
tolerant procedure within an ELU as in the example presented
in the previous section. When the ELUs are too small to fit
the logical qubits, fault-tolerance can be achieved by mapping
to three-dimensional (3D) cluster states, a known approach for
supporting fault-tolerant universal quantum computation [56].
This type of encoding is well-matched to the MUSIQC archi-
tecture, because the small degree of their interaction graph
leads to small ELUs.
Scheduling. For τE ≪ τD, the 3D cluster state with qubits
on the faces and edges of a three-dimensional lattice can be
created using the procedure displayed in Fig. 6a. The proce-
dure consists of three basic steps: (1) Creation of Bell states
between different ELUs via the photonic link, (2) CNOT-gates
within each ELU, and (3) local measurement of three out of
four qubits in each ELU. As can be easily shown using stan-
dard stabilizer arguments, the resulting state is a 3D cluster
state, up to local Hadamard gates on the edge qubits.
The operations can be scheduled such that (a) qubits are
never idle, and (b) no qubit is acted upon by multiple gates
(even commuting ones) at the same time. The latter is re-
quired in some proposals for realizing quantum gates with ion
qubits. To this end, the schedule [56] for 3D cluster state gen-
eration is adapted to the MUSIQC architecture, and the three-
step sequence shown in Fig. 6a is expanded into the five-step
sequence shown in Fig. 6b. Through Steps 1 - 3 the Bell pairs
across the ELUs are created. Through Steps 2 - 4 the CNOTs
within each ELU are performed, and through Steps 3 - 5 three
qubits in each ELU are measured. The sequence of operations
is such that each of the three ancilla qubits in every ELU lives
for only three time steps: initialization (to half of a Bell pair),
CNOT, measurement. No qubit is ever idle in this protocol.
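The five-step schedule can be written out explicitly. The sketch below assumes that the three ancilla qubits of an ELU are staggered by one time step each, which is consistent with the step ranges quoted above but is an assumption of this example; the names are illustrative.

```python
from collections import defaultdict

# Life cycle of one ancilla qubit: three consecutive time steps.
STAGES = ("bell", "cnot", "measure")


def ancilla_schedule(num_ancillas: int = 3) -> dict:
    """Map each ancilla index to the time step of each of its three stages."""
    schedule = {}
    for k in range(num_ancillas):
        start = k + 1  # ancilla k is initialized in Step k + 1
        schedule[k] = {stage: start + i for i, stage in enumerate(STAGES)}
    return schedule


def steps_by_stage(schedule: dict) -> dict:
    """Collect, for each stage, the sorted list of steps in which it occurs."""
    by_stage = defaultdict(list)
    for timeline in schedule.values():
        for stage, step in timeline.items():
            by_stage[stage].append(step)
    return {stage: sorted(steps) for stage, steps in by_stage.items()}


if __name__ == "__main__":
    sched = ancilla_schedule()
    for k, timeline in sched.items():
        print(f"ancilla {k}: {timeline}")
    print(steps_by_stage(sched))
```

In this staggering, the unmeasured qubit of the ELU takes part in exactly one CNOT in each of Steps 2 to 4, so no qubit is acted upon by two gates at once.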
What remains to complete the computation is the local mea-
surement of the 3D cluster state [56]. All remaining measure-
ments are performed in Step 5 of the above procedure. This
works trivially for cluster qubits intended for topological error
[Figure 6, diagrams omitted. Panel (a) labels: Step 1: Bell state creation; Step 2: CNOT gates; Step 3: local Pauli-Z/X measurements; face and edge qubits. Panel (b) rows: Bell (Steps 1 to 3), CNOT (front) (Steps 2 to 4), local Pauli-Z/X measurements (Steps 3 to 5).]
Figure 6: (a) Three steps of creating a 3D cluster state in the
MUSIQC architecture, for fast entangling gates. Step 1: Creation
of Bell pairs between different ELUs, all in parallel. Step 2: CNOT
gates (head of arrow: target qubit, tail of arrow: control qubit). Step
3: Measuring of 3 out of 4 qubits per ELU. If the ELU represents
a face (edge) qubit in the underlying lattice, the measurements are
in the Z- (X-) basis. The resulting state is a 3D cluster state, up to
Hadamard gates on the edge-qubits. (b) Schedule for the creation of
a 3D cluster state in the MUSIQC architecture. Upper line: Schedule
for Bell pair production between ELUs representing face and edge
qubits. Lower line: Schedule for the CNOT gates within the ELUs
corresponding to the front faces of the lattice cell. Schedules for the
ELUs on other faces and on edges are similar.
correction or the implementation of topologically protected
encoded Clifford gates [57], since these measurements require
no adjustment of the measurement basis. To avoid delay in the
measurement of qubits for the implementation of non-Clifford
gates, it is necessary to break the 3D cluster states into over-
lapping slabs of bounded thickness [56].
Fault-tolerance threshold. We assume the following error
model. (1) Every gate operation, i.e. preparation and measure-
ment of individual qubits, gates within an ELU, and Bell pair
creation between different ELUs, can all be achieved within
a clock cycle of duration T . An erroneous one-qubit (two-
qubit) gate is modeled by the perfect gate followed by a par-
tially depolarizing one-qubit (two qubit) channel. In the one-
[Figure 7, diagrams omitted. Panels (a) to (d); labels: ELU, successful link(s), qubits A to E and A' to D'.]
Figure 7: Hypercell construction II. (a) Lattice cell of a three-
dimensional four-valent cluster state. The dashed lines represent the
edges of the elementary cell and the full lines represent the edges
of the connectivity graph. The three-dimensional cluster state is ob-
tained by repeating this elementary cell in all three spatial directions.
(b) Creating probabilistic links between several 3D cluster states. (c)
Reduction of a 3D cluster state to a 5-qubit graph state, via Pauli
measurements. The shaded regions represent measurements of Z,
the blank regions represent measurements of X . The qubits repre-
sented as black dots remain unmeasured. For details, see [56]. (d)
Linking graph states by Bell measurements in the remaining ELUs.
Four-valent, 3D cluster states of arbitrary size can be created.
qubit channel, X, Y, and Z errors each occur with probability
ε/3. In the two-qubit channel, each of the 15 possible errors
X1, X2, X1X2, ..., Z1Z2 occurs with a probability of ε/15.
All gates have the same error ε. (2) In addition, the effect of
decoherence per time step T is described by local probabilistic
Pauli errors X, Y, Z, each happening with a probability
T/(3τD).
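The error model can be sampled in a few lines. The sketch below draws Pauli errors with the probabilities stated above; the function names and the string representation of Pauli operators are assumptions of this example.

```python
import itertools
import random

PAULIS = "IXYZ"
ONE_QUBIT_ERRORS = ["X", "Y", "Z"]
# All 15 non-identity two-qubit Pauli operators, e.g. "XI", "IX", "XX", ..., "ZZ".
TWO_QUBIT_ERRORS = [a + b for a, b in itertools.product(PAULIS, repeat=2)
                    if a + b != "II"]


def sample_gate_error(num_qubits: int, eps: float, rng: random.Random) -> str:
    """Pauli error following a gate: each non-identity Pauli has probability
    eps/3 (one qubit) or eps/15 (two qubits); otherwise the identity."""
    if num_qubits not in (1, 2):
        raise ValueError("only one- and two-qubit gates are modeled")
    errors = ONE_QUBIT_ERRORS if num_qubits == 1 else TWO_QUBIT_ERRORS
    if rng.random() < eps:
        return rng.choice(errors)
    return "I" * num_qubits


def sample_memory_error(T: float, tau_D: float, rng: random.Random) -> str:
    """Decoherence during one clock cycle: X, Y, Z each with probability
    T / (3 tau_D), i.e. total error probability T / tau_D."""
    if rng.random() < T / tau_D:
        return rng.choice(ONE_QUBIT_ERRORS)
    return "I"


if __name__ == "__main__":
    rng = random.Random(1)
    print([sample_gate_error(2, 0.3, rng) for _ in range(8)])
    print([sample_memory_error(0.1, 1.0, rng) for _ in range(8)])
```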
A criterion for the error threshold of measurement-based
quantum computation with cluster states that has been estab-
lished numerically for a variety of error models is
\[ \langle K_{\partial q} \rangle (\{\text{error parameters}\}) = 0.70. \tag{2} \]
Therein, $K_{\partial q}$ is a cluster state stabilizer operator associated
with the boundary of a single volume q, consisting of six
faces. Let f be a face of the three-dimensional cluster, and
$K_f = \sigma_x^{(f)} \bigotimes_{e \in \partial f} \sigma_z^{(e)}$, as shown in Fig. 7a. Then,
$K_{\partial q} = \prod_{f \in \partial q} K_f = \bigotimes_{f \in \partial q} \sigma_x^{(f)}$.
Furthermore, for the above criterion to apply, all errors (for
preparation of local states, local and entangling unitaries, and
measurement) are propagated forward or backward in time, to
solely affect the 3D cluster state.
The above criterion applies for a phenomenological error-
model with local memory error and measurement error (the
threshold error probability per memory step and measurement
is 2.9% [58]), for a gate-based error model (the threshold error
probability per gate is 0.67% [56]), and further error models
with only low-order correlated error. Specifically, the criterion
(2) has numerically been tested for cluster state creation pro-
cedures with varying relative strength of local vs 2-local gate
error [56], with excellent agreement. In all cases, the error-
correction was performed using Edmonds’ perfect matching
algorithm.
The detailed procedure for calculating the error probability
of the stabilizer measurement process for the 3D cluster state
is provided in Appendix B. In combination with Criterion (2),
we obtain the threshold condition
+
55
32
T
τD
< 2.9× 10−3. (3)
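Condition (3) trades gate error against decoherence per clock cycle. As a minimal sketch (the function and constant names are assumptions of this example), one can test a parameter pair against the condition and compute the largest admissible T/τD for a given gate error.

```python
THRESHOLD = 2.9e-3             # right-hand side of Eq. (3)
DECOHERENCE_WEIGHT = 55 / 32   # prefactor of T / tau_D in Eq. (3)


def effective_error(eps: float, clock_ratio: float) -> float:
    """Left-hand side of Eq. (3), with clock_ratio = T / tau_D."""
    return eps + DECOHERENCE_WEIGHT * clock_ratio


def below_threshold(eps: float, clock_ratio: float) -> bool:
    """True if the pair (eps, T / tau_D) satisfies Eq. (3)."""
    return effective_error(eps, clock_ratio) < THRESHOLD


def max_clock_ratio(eps: float) -> float:
    """Upper bound on T / tau_D implied by Eq. (3); zero if eps is too large."""
    return max(0.0, (THRESHOLD - eps) / DECOHERENCE_WEIGHT)


if __name__ == "__main__":
    for eps in (0.0, 1e-3, 2e-3):
        print(f"eps = {eps:.0e}: T/tau_D must stay below {max_clock_ratio(eps):.2e}")
```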
Overhead. The operational cost of creating a 3D cluster
state and then locally measuring it for the purpose of computa-
tion is 24 gates per elementary cell in the standard setting, and
54 gates per elementary cell in MUSIQC. Here the elementary
cell of a 3D four-valent cluster state is shown in Fig. 7a. The
overhead of the MUSIQC architecture over fault-tolerant clus-
ter state computation is thus constant. The operational over-
head for fault-tolerance in the latter is poly-logarithmic [56],
as described in detail in Ref. [57].
B. Analysis of fault-tolerance for slow entangling gates
The above construction fails for τE/τD ≥ 1, where
decoherence occurs while waiting for Bell-pair entangle-
ment. However, scalable fault-tolerant computing can still be
achieved in the MUSIQC architecture for any ratio τE/τD,
even for ELUs of only 3 qubits. Compared to the case of
τE ≪ τD, the operational cost of fault-tolerance is increased
by a factor that depends strongly on τE/τD but is independent
of the size of the computation. Thus, while quantum computa-
tion becomes more costly when τE ≥ τD, it remains scalable.
This surprising result shows that there is no hard threshold
for the ratio τE/τD, and opens up the possibility for efficient
fault-tolerant constructions with slow entangling gates. Here
we show that scalable quantum computation can be achieved
for arbitrarily slow entangling gates.
The main idea is to construct a “hypercell” out of several
ELUs. A hypercell has the same storage capacity for quan-
tum information as a single ELU, but with the ability to be-
come (close to) deterministically entangled with four other
hypercells. Fault-tolerant universal quantum computation can
then be achieved by mapping to a 4-valent, three-dimensional
cluster state [56]. First, we show that arbitrarily large ratios
τE/τD can be tolerated in the limiting case where the gate
error rate ε = 0 (Construction I). Then, we show how to tolerate
arbitrarily large ratios τE/τD with finite gate errors ε > 0
(Construction II).
Hypercell Construction I is based on the snowflake design
[59, 60], as shown in Fig. 8a. The difference is that in the
present case, each node in the connectivity tree represents an
entire ELU, not a single qubit as in Refs. [59, 60]. At the root
[Figure 8, diagrams omitted. Panels (a) and (b): labels ELU, surface area, success, roots A and B. Panel (c): εmax versus τE/τD.]
Figure 8: (Color online) Hypercell construction I. (a) Snowflake de-
sign of Refs. [59, 60]. (b) Connecting two hypercells. If the surface
area is large, with high probability one or more Bell pairs are created
between the surface areas via the photonic link. By Bell measure-
ments within individual ELUs (indicated by ovals) one such Bell pair
is teleported to the roots A and B. (c) Boundary of the fault-tolerance
region for gate error ε and ratio τE/τD, for various ELU sizes. The
threshold for the gate error ε depends only weakly on τE/τD.
of the tree is an ELU that contains the qubit used in the com-
putation, while multiple layers of bifurcating branches lead to
a large “surface area” with many ports from which to attempt
entanglement generation between two trees. Once a Bell pair
is created, it can be converted to a Bell pair between the root
qubits A and B via teleportation as shown in Fig. 8b.
The links (each representing a Bell pair) within a snowflake
structure are created probabilistically, each with a probability
p of heralded success. The success probability of each hy-
percell is small, but if the surface area between two neighbor-
ing hypercells is large enough, the probability of creating a
Bell pair between them via a probabilistic photonic link ap-
proaches unity. Thus, the cost of entangling an entire grid
of hypercells is linear in the size of the computation, as op-
posed to the exponential dependence that would be expected if
the hypercells could not be entangled deterministically. Cor-
respondingly, the operational cost of creating a hypercell is
large, but the cost of linking this qubit into the grid is inde-
pendent of the size of the computation. The hypercell offers
a qubit which can be near-deterministically entangled with a
constant number of other qubits on demand. A quantum com-
puter made up of such hypercells can create a four-valent, 3D
cluster state with few missing qubits, and is thus fault-tolerant
[56], [61], [72]. Hypercells can readily be implemented in
the modular ion trap quantum computer since the probability
of entanglement generation does not depend on the physical
distance between the ELUs.
We call the part of the hypercell needed to connect to a
neighboring hypercell a “tree”. For ELUs of coordination
number 3, the number m of ports that are available to connect
two hypercells is twice the number of ELUs in the top layer
of the tree. The probability for all m attempts to generate
entanglement between two trees to fail is Pfail = (1 − p)^m ≈
exp(−mp). (In practice, we will allow a constant probability
of failure which is tolerable in 3D cluster states [61].) In
addition, the number of ELUs in the top layer is 2^(# layers), and
the path length l (number of Bell pairs between the roots)
is l = 2 log2 m + 1. Combining the above, we find that
l = 2 log2(c/p) + 1, for c = − ln Pfail. For simplicity we
assume that the time t for attempting entanglement generation
is the same when creating the trees and when connecting the
trees. Then, p = t/τE in both cases. From the beginning of
the creation of the trees to completion of entangling two trees,
a time 2t has passed. The Bell pairs within the trees have
been around, on average, for a time 3t/2, and the Bell pairs
between the two trees for an average time of t/2. If overall
error probabilities remain small, the total probability of error
for creating a Bell pair is proportional to l. The memory error
alone is
\[ \epsilon_{\mathrm{mem}} = \frac{t}{\tau_D} \left[ 3 \log_2 \left( c \frac{\tau_E}{t} \right) + \frac{1}{2} \right]. \tag{4} \]
This function is monotonically increasing with t, and
εmem(t = 0) = 0. The task now is to suppress the memory
error rate εmem below the error threshold εcrit that applies
to fault-tolerant quantum computation with 3D cluster states.
From Eq. (2) we know that εcrit > 0.
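The relations above can be combined into a short numerical sketch: the number of ports m = c/p, the path length l, and the memory error of Eq. (4), together with a simple halving search for an attempt time t that brings the memory error below a chosen εcrit. The function names, the halving search, and the numerical inputs are assumptions of this example.

```python
import math


def ports_needed(p: float, p_fail: float) -> float:
    """Ports m with exp(-m p) = p_fail, i.e. m = c / p with c = -ln(p_fail)."""
    return -math.log(p_fail) / p


def path_length(p: float, p_fail: float) -> float:
    """Number of Bell pairs between the roots, l = 2 log2(c / p) + 1."""
    return 2 * math.log2(ports_needed(p, p_fail)) + 1


def memory_error(t: float, tau_E: float, tau_D: float, p_fail: float) -> float:
    """Memory error of Eq. (4), using p = t / tau_E."""
    m = ports_needed(t / tau_E, p_fail)
    return (t / tau_D) * (3 * math.log2(m) + 0.5)


def attempt_time_below(eps_crit: float, tau_E: float, tau_D: float,
                       p_fail: float, max_halvings: int = 200) -> float:
    """Halve t until the memory error of Eq. (4) drops below eps_crit."""
    c = -math.log(p_fail)
    t = min(tau_E, c * tau_E / 2)  # start with p <= 1 and m >= 2
    for _ in range(max_halvings):
        if memory_error(t, tau_E, tau_D, p_fail) < eps_crit:
            return t
        t /= 2
    raise RuntimeError("no suitable attempt time found")


if __name__ == "__main__":
    # Hypothetical parameters: slow entangling gates, tau_E = 10 tau_D.
    t = attempt_time_below(1e-3, tau_E=10.0, tau_D=1.0, p_fail=0.01)
    print(f"t = {t:.3e}, p = {t / 10.0:.3e}, "
          f"l = {path_length(t / 10.0, 0.01):.1f}")
```

The search always terminates for εcrit > 0 because t log2(1/t) tends to zero as t decreases, at the price of a smaller p and hence a larger hypercell.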
From Eq. (4) we find that, for any ratio τE/τD, we can
make t small enough such that εmem < εcrit. The operational
cost for creating a hypercell with sufficiently many ports is
O(hypercell) ∼ (1/p)^((9/2)(c/p)). This cost is high for small p =
t/τE, but independent of the size of the computation. Thus,
whenever decoherence on waiting qubits is the only source of
error, scalable fault-tolerant QC is possible for arbitrarily slow
entangling gates.
We now discuss how the above Hypercell Construction I
fares in the presence of additional gate error ε. We model every
noisy one-(two-)qubit operation by the perfect operation
followed by an SU(2)- (SU(4)-) invariant partial depolarizing
channel with strength ε, the same as that used in Section IV A. If
ε > 0 then every entanglement swap adds error to the computation.
We must swap entanglement in every ELU on the path
between the roots A and B, and because there are 2 log2 m of
them (m ≥ 2), for ε ≪ 1 the total error is
\[ \epsilon_{\mathrm{total}} = \frac{t}{\tau_D} \left[ 3 \log_2 \left( c \frac{\tau_E}{t} \right) + \frac{1}{2} \right] + 2 \epsilon \log_2 \left( c \frac{\tau_E}{t} \right). \tag{5} \]
Now it is no longer true that for any choice of τE/τD we can
realize εcrit > εtotal. A non-vanishing gate error sets an upper
limit to the tree depth, because the accumulated gate error is
proportional to the tree depth (Fig. 8b). This implies an upper
bound on the size of the top layer of the tree, which further
implies a lower bound on the time t needed to attempt entangling
the two trees (see Eq. (6) below) and thus a lower bound
on the memory error caused by decoherence during the time
interval t. The accumulated memory error alone may be above
or below the error threshold, depending on the ratio τE/τD.
In more detail, suppose that εcrit > εtotal holds. Considering
only gate errors, εcrit > 2ε log2(c τE/t), and hence,
\[ t > c \tau_E 2^{-\epsilon_{\mathrm{crit}} / (2\epsilon)}. \tag{6} \]
Now, recalling that c τE/t = m ≥ 2, with Eq. (5) we find that
εcrit > 3t/τD + 2ε, or
\[ t < \frac{1}{3} (\epsilon_{\mathrm{crit}} - 2\epsilon) \tau_D. \tag{7} \]
The two conditions Eq. (6) and (7) can be simultaneously
obeyed only if
\[ \frac{\tau_E}{\tau_D} < \frac{\epsilon_{\mathrm{crit}} - 2\epsilon}{3c} 2^{\epsilon_{\mathrm{crit}} / (2\epsilon)}. \tag{8} \]
We see that there is now an upper bound to the ratio τE/τD.
Eq. (8) is a necessary but not sufficient condition for fault-
tolerant quantum computation using the hypercells of Fig. 8b.
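The window for t set by Eqs. (6) and (7), and the resulting bound of Eq. (8), can be evaluated as in the following sketch. The function names and the numerical values are assumptions of this example; since Eq. (8) is only a necessary condition, a non-empty window does not by itself guarantee fault tolerance.

```python
import math


def attempt_time_window(eps_crit: float, eps: float, c: float,
                        tau_E: float, tau_D: float) -> tuple:
    """Lower bound on t from Eq. (6) and upper bound on t from Eq. (7)."""
    t_min = c * tau_E * 2.0 ** (-eps_crit / (2 * eps))
    t_max = (eps_crit - 2 * eps) * tau_D / 3
    return t_min, t_max


def max_ratio(eps_crit: float, eps: float, c: float) -> float:
    """Upper bound on tau_E / tau_D from Eq. (8)."""
    if eps_crit <= 2 * eps:
        return 0.0  # Eq. (7) leaves no positive t
    exponent = eps_crit / (2 * eps)
    if exponent > 1000:
        return math.inf  # bound exceeds the floating-point range
    return (eps_crit - 2 * eps) / (3 * c) * 2.0 ** exponent


if __name__ == "__main__":
    # Hypothetical values: eps_crit = 1e-2, gate error eps = 1e-3, c = 1.
    print("max tau_E/tau_D:", max_ratio(1e-2, 1e-3, 1.0))
    print("window for tau_E = 0.08 tau_D:",
          attempt_time_window(1e-2, 1e-3, 1.0, 0.08, 1.0))
```

The bound grows exponentially in εcrit/ε, so even a modest reduction of the gate error widens the tolerable range of τE/τD considerably.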
We have numerically simulated the process of constructing
these hypercells for various values of the decoherence parameters
ε and τE/τD. The boundary of the fault-tolerance region
in the (τE/τD, ε)-plane is shown in Fig. 8c. In the above, for
simplicity, we have considered hypercells in which all constituent
ELUs are entangled in a single timestep t. However,
there are various possible refinements. (1) The computational
overhead can be significantly decreased by creating the hypercell
in stages, starting with the leaves of the trees and
iteratively combining them to create the next layers [59]. (2)
Using numerical simulations it was found that if each of the
4 trees making up a hypercell has coordination number 4 or 5
rather than 3 (i.e., a ternary tree instead of a binary tree), the
overhead can be further reduced. These optimizations were
used to produce Figure 8c.
Hypercell Construction II allows fault-tolerance for finite
gate errors ε > 0. In Construction I, the accumulated error for
creating a Bell pair between the roots A and B is linear in the
path length l between A and B. This limits the path length l,
and thereby the surface area of the hypercell. This limitation
can be overcome by invoking three-dimensional (3D) cluster
states already at the level of creating the hypercell. 3D cluster
states have an intrinsic capability for fault-tolerance [56] re-
lated to quantum error correction with surface codes [62, 63].
For Hypercell Construction II, we employ a 3D cluster state
nested within another 3D cluster state. Therein, the “outer”
cluster state is created near-deterministically from the hyper-
cells. Its purpose is to ensure fault-tolerance of the construc-
tion. The “inner” 3D cluster state is created probabilistically.
Its purpose is to provide a means to connect distant qubits
in such a way that the error of the operation does not grow
with distance. Specifically, if the local error level is below the
threshold for error-correction with 3D cluster states, the error
of (quasi-) deterministically creating a Bell pair between two
root qubits A and B in distinct 3D cluster states is independent
of the path length between A and B.
The construction is as follows. We start from a three-
dimensional grid with ELUs on the edges and on the faces.
Each ELU contains four qubits and can be linked to four
neighboring ELUs. Such a grid of ELUs (of suitable size)
is used to probabilistically create a 4-valent cluster state by
probabilistic generation of Bell pairs between the ELUs, post-
selection and local operations within the ELUs.
After such cluster states have been successfully created, in
each ELU three qubits are freed up, and can now be used for
near-deterministic links between different 3D cluster states,
as shown in Fig. 7b. After 4 probabilistic links to other clus-
ters have succeeded (the size of the cluster states is chosen
such that this is a likely event), the cluster state is transformed
into a star-shaped graph state via X and Z measurements
(Fig. 7c). This graph state contains 5 qubits, shared between
the 4 ELUs at which the successful links start, and an addi-
tional ELU. Due to the topological error-correction capability
of 3D cluster states, the conversion from the 3D cluster state
to the star-shaped graph state is fault-tolerant [56]. By further
measurement in the ELUs, the graph states created in differ-
ent hypercells can now be linked, e.g. to form again a 4-valent
3D cluster state which is a resource for fault-tolerant quantum
computation [56], as shown in Fig. 7d. This final linking step
is prone to error. However, the error level is independent of
the size of the hypercell, which was not the case for Hypercell
Construction I.
The only error sources remaining after error-correction in
the 3D cluster stem from (i) the (two) ports per link, and (ii)
the two root qubits A and B, which are not protected topologically.
The total error $\epsilon_{\rm total}$ of a Bell pair created between A
and B in this case is given by $\epsilon_{\rm total} = c_1 t/\tau_D + c_2 \epsilon$,
where $t$ is the time spent attempting Bell pair generation, and $c_1$ and $c_2$
are algebraic constants which do not depend on the time scales
$\tau_E$ and $\tau_D$, and not on the distance between the root qubits A
and B. Then, if the threshold error rate $\epsilon_{\rm crit}$ for fault-tolerance
of the outer 3D cluster state is larger than $c_2 \epsilon$, we can reach
an overall error $\epsilon_{\rm total}$ below the threshold value $\epsilon_{\rm crit}$ by making
$t$ sufficiently small. Smaller $t$ requires larger inner 3D cluster
states, but does not limit the success probability for linking
Construction II hypercells. Thus, fault-tolerance is possible
for all ratios $\tau_E/\tau_D$, even in the presence of small gate errors.
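The threshold argument above can be illustrated with a minimal numerical sketch. It assumes only the linear error model stated in the text (total error equal to c1 t/τD plus c2 times the gate error). The function names and all numerical values (C1, C2, EPS, EPS_CRIT, TAU_D) are hypothetical placeholders chosen for illustration, not values taken from this work.

```python
def total_error(t, tau_d, eps, c1, c2):
    """Linear error model: eps_total = c1 * t / tau_D + c2 * eps."""
    return c1 * t / tau_d + c2 * eps


def max_attempt_time(tau_d, eps, eps_crit, c1, c2):
    """Upper bound on the attempt time t that keeps eps_total below eps_crit.

    Returns None when c2 * eps already reaches the threshold, in which case
    no choice of t can help.
    """
    budget = eps_crit - c2 * eps
    if budget <= 0:
        return None
    return budget * tau_d / c1


# Hypothetical illustration values (assumptions, not from the paper).
TAU_D = 1.0       # decoherence time, arbitrary units
EPS = 1e-4        # gate error
EPS_CRIT = 5e-3   # assumed threshold of the outer 3D cluster state
C1, C2 = 10.0, 20.0

t_max = max_attempt_time(TAU_D, EPS, EPS_CRIT, C1, C2)
print("attempt time must stay below", t_max)
print("error at half that time:", total_error(0.5 * t_max, TAU_D, EPS, C1, C2))
```

The bound does not involve the distance between the root qubits, which mirrors the statement that a shorter attempt time is paid for by larger inner cluster states rather than by a larger error.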
V. OUTLOOK
The success of silicon-based information processors in the
past five decades hinged upon the scalability of integrated
circuit (IC) technology, characterized by Moore’s law [64]. IC
technology integrated all the components necessary to construct
a functional circuit, using the same conceptual approach
over many orders of magnitude in integration levels. The
hierarchical modular ion trap quantum computer architecture
discussed here promises scalability, not only in the number of
physical systems (trapped ions) that represent the qubits, but
also in the entire control structure to manipulate each qubit at
such integration levels.
The technology necessary to realize each and every com-
ponent of the MUSIQC architecture is either already available
or within reach. The recognition that ion traps can be mapped
onto a two-dimensional surface that can be fabricated using
standard silicon microfabrication technologies [15, 18] has
led to a rapid development in complex surface trap technol-
ogy [19, 20]. Present-day trap development exploits extensive
electromagnetic simulation codes to design optimized trap
structures and control voltages, allowing sufficient control and
stability of ion positioning. Integration of optical components
into such microfabricated traps will enable stronger interac-
tion between the ions and photons for better photon collection
and qubit detection [65] through the use of high numerical
aperture optics or integration of an optical cavity with the ion
trap [40]. Moreover, electro-optic and MEMS-based beam
steering systems allow the addressing of individual atoms in
a chain with tightly focused laser beams [66, 67], and an optical
interconnect network can be constructed using large-scale
all-optical crossconnect switches [42]. While technical chal-
lenges such as the operation of narrowband (typically ultravi-
olet) lasers or the presence of residual heating of ion motion
[12] still remain, they do not appear to be fundamental road-
blocks to scalability. Within the MUSIQC architecture we
have access to a full suite of technologies to realize the ELU
in a scalable manner, where the detailed parameters of the ar-
chitecture such as the number of ions per ELU, the number of
ELUs, or the number of photonic interfaces per ELU can be
adapted to optimize performance of the quantum computer.
VI. ACKNOWLEDGEMENTS
We thank D. Bacon, M. Biercuk, B. Blinov, S. Flammia, D.
L. Moehring, and R. E. Slusher for helpful discussions. This
work was supported by the Intelligence Advanced Research
Projects Activity, the Army Research Office MURI Program
on Hybrid Quantum Optical Circuits, and the NSF Physics
Frontier Center at JQI. LMD acknowledges support by the
NBRPC (973 Program) 2011CBA00300.
Appendix A: Universal Fault-Tolerant QC using Steane Code
We utilize the basic operational primitives of universal
quantum computation using the Steane [[7,1,3]] code [68], as fully
outlined in Ref. [21] and summarized below.
1. The preparation of the logical qubit |0〉L is performed by
measuring the six stabilizers of the code using a four-qubit
cat state |cat〉4 ≡ (|0000〉 + |1111〉)/√2, following
the procedure that minimizes the use of ancilla
qubits as outlined in Ref. [69]. The stabilizer measurement
is performed up to three times to ensure that the
error arising from the measurement process itself can
be corrected. We perform a sequential measurement
of the six stabilizers, re-using the four ancilla qubits for
each logical qubit, which reduces the number of physical
qubits and parallel operations necessary for the state
preparation at the expense of the execution time. Once
all the stabilizers are measured, a three-qubit cat state is
used to measure the logical ZL operator to finalize the qubit
initialization process. This procedure requires eleven
physical qubits to complete the preparation of the logical qubit
|0〉L.
2. In the Steane [[7,1,3]] code considered here, all operations
in the Pauli group {XL, YL, ZL} and the Clifford
group {HL, SL, CNOTL} can be performed transver-
sally (i.e., in a bit-wise fashion). We assume seven par-
allel operations are available, so that these logical op-
erations can be executed in one time step correspond-
ing to the single- or two-qubit operation. The transver-
sal CNOTL considered here is between two qubits that
are close by, so the operation can be performed locally
without further need for qubit communication.
3. In order to construct effective arithmetic circuits, we
need the Toffoli gate (a.k.a. CCNOTL), which is not in
the Clifford group. Since a transversal implementation
of this gate is not possible in the Steane code, a fault-tolerant
implementation requires preparing a special three
(logical) qubit state
$$|\phi_+\rangle_L = \frac{1}{2}\left(|000\rangle_L + |010\rangle_L + |100\rangle_L + |111\rangle_L\right), \tag{A1}$$
and “teleporting” the gate into this state [70]. This state can
[Figure 9 circuit diagrams: (a) preparation of |φ+〉L from a 7-qubit cat state |cat〉7 and three logical qubits |0〉L; (b) teleportation of the inputs |x〉L, |y〉L, |z〉L through |φ+〉L, with outputs |x〉L, |y〉L, |z ⊕ xy〉L.]
Figure 9: Circuit diagram for realizing fault-tolerant Toffoli gate using
Steane code. (a) The initial state |φ+〉L is prepared by measuring
the X1 and CNOT12 of three qubit state |0〉1(|0〉2 + |1〉2)|0〉3/√2.
Note that the Toffoli gate shown here is a bitwise Toffoli between the
7-qubit cat state and the two logical qubit states. (b) Using the state
prepared in (a), Toffoli gate can be realized using only measurement,
Clifford group gates and classical communication, all of which can
be implemented fault-tolerantly in the Steane code.
be prepared by measuring its stabilizer operator using a
7-qubit cat state on three logical qubits |0〉L, as shown
in Fig. 9a. Successful preparation of this state requires a
bitwise Toffoli gate (at the physical level), which we as-
sume can only be performed locally among qubits that
are close to one another. Once this state is prepared, the
three qubits |x〉L, |y〉L and |z〉L participating in the Tof-
foli gate can be teleported to execute the gate, as shown
in Fig. 9b. Therefore, a successful Toffoli gate opera-
tion requires 3 logical qubits (which in turn require ex-
tra ancilla qubits to initialize) and 7 physical qubits as
ancillary qubits, in addition to the three logical qubits
on which the gate operates.
4. When a CNOT gate is necessary between two qubits
that are separated by large distances, we take the approach
where the two qubits of a maximally-entangled
state are each distributed to the vicinity of one of the two qubits,
and then the gate is teleported using the protocol proposed
in Ref. [71]. Efficient distribution of the entangled
states makes this approach much more effective
than transporting the qubits themselves
directly.
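As a small consistency check on the resource state of Eq. (A1) in item 3, the sketch below verifies that this state is what a Toffoli gate produces from |+〉|+〉|0〉. It works with three unencoded qubits only, so it illustrates the state itself and not the fault-tolerant, encoded preparation of Fig. 9a. The dictionary representation of states and the function name `toffoli` are choices made for this illustration.

```python
from math import isclose


def toffoli(state):
    """Apply CCNOT to a 3-qubit state given as {bitstring: amplitude}.

    The third bit is flipped exactly when the first two bits are both 1.
    """
    out = {}
    for bits, amp in state.items():
        a, b, c = (int(ch) for ch in bits)
        new_bits = f"{a}{b}{c ^ (a & b)}"
        out[new_bits] = out.get(new_bits, 0.0) + amp
    return out


# |+>|+>|0> expanded in the computational basis (amplitude 1/2 each).
start = {f"{a}{b}0": 0.5 for a in (0, 1) for b in (0, 1)}

# The state of Eq. (A1), written for unencoded qubits.
phi_plus = {"000": 0.5, "010": 0.5, "100": 0.5, "111": 0.5}

result = toffoli(start)
same_state = (result.keys() == phi_plus.keys()
              and all(isclose(result[k], phi_plus[k]) for k in phi_plus))
print("Toffoli|++0> equals the state of Eq. (A1):", same_state)
```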
Appendix B: Error Probability for 3D Cluster States with Fast
Entangling Gates
Here we calculate the total error probability of the stabilizer
measurement process for the model considered in Section
IV A, assuming independent strengths for the local errors
and 2-local gate errors. We have local errors with strength
$T/\tau_D$, and 2-local gate errors with strength $\epsilon$. The expectation
value of the stabilizer operator $K_{\partial q}$ in Eq. (2) is
$$\langle K_{\partial q}\rangle = \prod_{E \in \text{error sources}} (1 - 2p_E). \tag{B1}$$
Therein, pE is the total probability of those Pauli errors in
the error source E which, after (forward) propagation to the
endpoint of the cluster state creation procedure, anti-commute
with the stabilizer operator K∂q. The r.h.s. of Eq. (B1) is
simply a product due to the statistical independence of the in-
dividual error sources. Since the cluster state creation pro-
cedure is of bounded temporal depth and built of local and
nearest-neighbor gates only, errors can only propagate a fi-
nite distance. Therefore, only a finite number of error sources
contribute in Eq. (B1).
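The product form of Eq. (B1) can be checked directly for a handful of independent error sources. The sketch below compares the product with a brute-force average over all error patterns, where each anticommuting error flips the sign of the stabilizer outcome. The probabilities in `probs` are hypothetical values chosen only for illustration.

```python
from itertools import product
from math import isclose, prod


def expectation_from_product(p):
    """Right-hand side of Eq. (B1): product of (1 - 2 p_E)."""
    return prod(1 - 2 * pe for pe in p)


def expectation_by_enumeration(p):
    """Average of the +/-1 stabilizer outcome over all error patterns.

    Each source independently contributes an anticommuting error with
    probability p_E; an odd number of such errors gives outcome -1.
    """
    total = 0.0
    for pattern in product((0, 1), repeat=len(p)):
        weight = 1.0
        for occurred, pe in zip(pattern, p):
            weight *= pe if occurred else (1 - pe)
        sign = -1 if sum(pattern) % 2 else 1
        total += sign * weight
    return total


probs = [0.01, 0.02, 0.005, 0.03]  # hypothetical p_E values
lhs = expectation_by_enumeration(probs)
rhs = expectation_from_product(probs)
print(lhs, rhs, isclose(lhs, rhs))
```

The two numbers agree because the sign of the outcome factorizes over independent sources, which is the statistical-independence argument given in the text.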
To simplify the bookkeeping, we make the following observations.
(a) A Bell state preparation, 2 CNOT gates (one
on either side), and two local measurements on the qubits
of the former Bell pair (one in the Z-basis and one in the X-basis)
amount to a CNOT gate between the remaining participating
qubits. Therein, the qubit on the edge of the underlying lattice
is the target, the qubit on the face is the control. We call this
a teleported CNOT link. (b) Errors can only propagate once
from a face qubit to an edge qubit or vice versa, but never farther
than that. To see this, consider e.g. a face qubit. There, an X-
or Y-error can get propagated (face = control of CNOTs). In
either case it causes an X-error on a neighboring edge qubit.
But X-errors are not propagated from edge qubits (edge = target
of all CNOTs). (c) The stabilizer K∂q has only support on
face qubits, and is not affected by X-errors.
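The propagation rules used in observation (b) follow from the standard conjugation of Pauli operators by a CNOT gate. A minimal sketch, ignoring overall phases and using a bit-pair (x, z) encoding of the Paulis that is an assumption of this illustration:

```python
def propagate_through_cnot(control, target):
    """Conjugate single-qubit Paulis on (control, target) by a CNOT.

    Paulis are given as 'I', 'X', 'Y', 'Z'; overall phases are ignored.
    An X component on the control copies onto the target, and a Z
    component on the target copies onto the control.
    """
    to_bits = {"I": (0, 0), "X": (1, 0), "Y": (1, 1), "Z": (0, 1)}
    to_pauli = {bits: name for name, bits in to_bits.items()}
    xc, zc = to_bits[control]
    xt, zt = to_bits[target]
    return to_pauli[(xc, zc ^ zt)], to_pauli[(xt ^ xc, zt)]


# Face qubit = control, edge qubit = target, as in the text.
for face, edge in [("X", "I"), ("Y", "I"), ("I", "X"), ("I", "Z")]:
    print((face, edge), "->", propagate_through_cnot(face, edge))
```

An X- or Y-error on the face (control) leaves an X-error on the edge (target), an X-error on the edge stays where it is, and a Z-error on the edge spreads to the face, which is why Z-errors on edge qubits still have to be counted in the Type-2 contributions.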
Based on these observations, we subdivide the error sources
affecting 〈K∂q〉 into three categories, namely Type 1: First
Bell pair created on each face (according to the 5-step sched-
ule); Type 2: The CNOT links, consuming the remaining
Bell pairs; and Type 3: The final measurements of the clus-
ter qubits (1 per ELU).
Type-2 contributions: For every CNOT link we only need
to count Z-errors (and Y ≅ Z) on both the control (= face)
and target (= edge), because on the face qubit the Z-errors are
the ones that matter [with (c)], and on the edge qubit, such
errors may still propagate to a neighboring face qubit [with
(b)] and matter there. With these simplifications, the effective
error of each CNOT link between two neighboring ELUs is
described by the probabilities pZI for a Z-error on the face
qubit, pIZ for a Z-error on the edge qubit, and pZZ for the
combined error; and
pZI = 2+
10
3
T
τD
, pIZ = pZZ =
4
15
+
2
3
T
τD
. (B2)
Herein, we have only kept contributions up to linear order in
, T/τD. The contributions to the error come from (1) the
Bell pair, (2) a first round of memory error on all qubits, (3)
the CNOT gates, (4) a second round of memory error on all
qubits, and (5) the two local measurements per link.
Now we need to discuss the effect of each of the above gates
on 〈K∂q〉, taking into account propagation effects. For exam-
ple, consider the link established between the face qubit of a
front face f with its left neighboring edge qubit. (The Bell
pair for this link is created in Step 1, the required CNOTs are
performed in Step 2, and the local measurements in Step 3.)
The Z-error on f does not propagate further. The Z-error
on e is propagated in later steps to a neighboring face (see
Fig. 6b). Thus, the errors Zf and Ze of this gate affect 〈K∂q〉,
and ZeZf does not. With Eq. (B1), the gate in question reduces
〈K∂q〉 by a factor of $1 - \frac{68}{15}\epsilon - 8T/\tau_D$.
The following links contribute: three for every face in ∂q
from within the cell, and three more per face of ∂q from the
neighboring cells (links ending in an edge belonging to the
cell q can affect 〈K∂q〉 by propagation). (i) Contributions
from within the cell. If a Ze-error of the link propagates to
an even (odd) number of neighboring faces in q, the total error
probability affecting 〈K∂q〉 is pZZ + pZI (pIZ + pZI). But
since pIZ = pZZ, all 18 contributions from within the cell q
are the same, irrespective of propagation. (ii) Contributions
from neighboring cells. Each of the 18 links in question contributes
an effective error probability pIZ + pZZ if an error
on the edge qubit of the link propagates to an odd number of
face qubits in ∂q. By inspection of Fig. 6b, this happens for 6
links. With Eq. (B2), all the Type-2 errors reduce 〈K∂q〉 by a
factor of
$$1 - 160\frac{T}{\tau_D} - 88\epsilon. \tag{B3}$$
Type-1 contributions: Each of the initial Bell pair creations
carries a two-qubit gate error of strength $\epsilon$, and memory error
of strength $T/\tau_D$ on either qubit. Similar to the above case,
we can group the 15 possible Pauli errors into the equivalence
classes I, Zf (ZeZf ≡ I and Ze ≡ Zf for Bell states). The
single remaining error probability, for Zf, is
$$p_{ZI} = \frac{8}{15}\epsilon + \frac{4}{3}\frac{T}{\tau_D}. \tag{B4}$$
For each face of ∂q, there is one Bell pair within the face
that reduces 〈K∂q〉 by a factor of 1 − 2pZI. Bell pairs from
neighboring cells do not contribute an error here. Thus, with
six faces in ∂q and pZI from Eq. (B4), all the
Type-1 errors reduce 〈K∂q〉 by a factor of
$$1 - 16\frac{T}{\tau_D} - \frac{32}{5}\epsilon. \tag{B5}$$
Again, only the contributions to linear order in $\epsilon$, $T/\tau_D$ were
kept.
Type-3 contributions: The only remaining error source is
in the measurement of the one qubit per ELU which is part
of the 3D cluster state. The strength of the effective error on
each face qubit is $p_Z = \frac{2}{3}\epsilon$. Each of the six faces in ∂q
is affected by this error. Thus, all the Type-3 errors reduce
〈K∂q〉 by a factor of
$$1 - 8\epsilon. \tag{B6}$$
Combining the contributions Eq. (B3), (B5), (B6) of error
Types 1 - 3 yields
$$\langle K_{\partial q}\rangle = 1 - \frac{512}{5}\epsilon - 176\frac{T}{\tau_D} \tag{B7}$$
for the expectation value 〈K∂q〉.
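The linear-order bookkeeping behind Eq. (B7) can be re-derived with exact rational arithmetic. The sketch below stores each contribution as a pair (coefficient of the gate error, coefficient of T/τD) and uses the link and face counts stated in the text: 18 links inside the cell, 6 contributing links from neighboring cells, and six faces. The helper names `add` and `scale` are introduced here for illustration only.

```python
from fractions import Fraction as F


def add(*terms):
    """Add linear-order terms given as (eps coefficient, T/tau_D coefficient)."""
    return (sum(t[0] for t in terms), sum(t[1] for t in terms))


def scale(k, term):
    """Multiply both coefficients of a linear-order term by k."""
    return (k * term[0], k * term[1])


# Eq. (B2): effective error probabilities of one teleported CNOT link.
p_zi = (F(2), F(10, 3))
p_iz = p_zz = (F(4, 15), F(2, 3))

# Type 2: each contributing link reduces <K> by 2 * (sum of relevant p's).
inside = scale(18 * 2, add(p_zi, p_iz))   # 18 links within the cell
outside = scale(6 * 2, add(p_iz, p_zz))   # 6 links from neighboring cells
type2 = add(inside, outside)

# Type 1: one Bell pair per face, six faces, factor 1 - 2 p_ZI each (Eq. (B4)).
type1 = scale(6 * 2, (F(8, 15), F(4, 3)))

# Type 3: measurement error p_Z = (2/3) eps on each of the six faces.
type3 = scale(6 * 2, (F(2, 3), F(0)))

total = add(type1, type2, type3)
print("Type 2 (eps, T/tau_D):", type2)
print("Type 1 (eps, T/tau_D):", type1)
print("Type 3 (eps, T/tau_D):", type3)
print("Total  (eps, T/tau_D):", total)
```

The total reproduces the coefficients 512/5 and 176 of Eq. (B7), and the Type-2 and Type-3 entries reproduce Eqs. (B3) and (B6).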
[1] Ladd, T. D. et al. Quantum computers. Nature 464, 45 (2010).
[2] Wineland, D. & Blatt, R. Entangled states of trapped atomic
ions. Nature 453, 1008–1014 (2008).
[3] Neeley, M. et al. Generation of three-qubit entangled states us-
ing superconducting phase qubits. Nature 467, 570–573 (2010).
[4] DiCarlo, L. et al. Preparation and measurement of three-qubit
entanglement in a superconducting circuit. Nature 467, 574–
578 (2010).
[5] Monroe, C. & Kim, J. Scaling the ion trap quantum processor.
Science 339, 1164 (2013).
[6] Thaker, D. D., Metodi, T. S. & Chong, F. T. A realizable
distributed ion-trap quantum computer. In High Performance
Computing - HiPC 2006, 13th International Conference, Ban-
galore, India, December 18-21, 2006, Proceedings, 111–122
(2006).
[7] Jiang, L., Taylor, J. M., Sørensen, A. S. & Lukin, M. D. Dis-
tributed quantum computation based on small quantum regis-
ters. Physical Review A 76, 062323 (2007).
[8] Moehring, D. L. et al. Quantum networking with photons and
trapped atoms. J. Opt. Soc. Am. B 24, 300–315 (2007).
[9] Helmer, F. et al. Cavity grid for scalable quantum computa-
tion with superconducting circuits. Europhys. Lett. 85, 50007
(2009).
[10] Yao, N. Y. et al. Scalable architecture for a room temperature
solid-state quantum information processor. Nature Communi-
cations 3, 800 (2012).
[11] Fujii, K., Yamamoto, T., Koashi, M. & Imoto, N. A distributed
architecture for scalable quantum computation with realistically
noisy devices. arXiv:quant-ph/1202.6588 (2012).
[12] Wineland, D. J. et al. Experimental issues in coherent quantum-
state manipulation of trapped atomic ions. J. Res. Nat. Inst.
Stand. Tech. 103, 259–328 (1998).
[13] Häffner, H., Roos, C. & Blatt, R. Quantum computing with
trapped ions. Phys. Rep. 469, 155 (2008).
[14] Kielpinski, D., Monroe, C. & Wineland, D. Architecture for a
large-scale ion-trap quantum computer. Nature 417, 709–711
(2002).
[15] Chiaverini, J. et al. Surface-electrode architecture for ion-trap
quantum information processing. Quant. Inf. Comp. 5, 419
(2005).
[16] Seidelin, S. et al. Microfabricated surface-electrode ion trap
for scalable quantum information processing. Physical Review
Letters 96, 253003 (2006).
[17] Wang, S. X., Labaziewicz, J., Ge, Y., Shewmon, R. & Chuang,
I. L. Demonstration of a quantum logic gate in a cryogenic
surface-electrode ion trap. Physical Review A 81, 062332
(2010).
[18] Kim, J. et al. System design for large-scale ion trap quantum
information processor. Quant. Inf. Comp. 5, 515 (2005).
[19] Moehring, D. L. et al. Design, fabrication and experimental
demonstration of junction surface ion traps. New J. Phys. 13,
075018 (2011).
[20] Merrill, J. T. et al. Demonstration of integrated microscale op-
tics in surface-electrode ion traps. New J. Phys. 13, 103005
(2011).
[21] Nielsen, M. A. & Chuang, I. L. Quantum Computation and
Quantum Information (Cambridge University Press, 2000).
[22] Cirac, J. I. & Zoller, P. Quantum computation with cold trapped
ions. Physical Review Letters 74, 4091–4094 (1995).
[23] Sørensen, A. & Mølmer, K. Quantum computation with ions
in thermal motion. Physical Review Letters 82, 1971–1974
(1999).
[24] Ospelkaus, C. et al. Trapped-ion quantum logic gates based
on oscillating magnetic fields. Physical Review Letters 101,
090502 (2008).
[25] Zhu, S.-L., Monroe, C. & Duan, L.-M. Arbitrary-speed quan-
tum gates within large ion crystals through minimum control of
laser beams. Europhys. Lett. 73, 485–491 (2006).
[26] Turchette, Q. A. et al. Heating of trapped ions from the quantum
ground state. Physical Review A 61, 063418 (2000).
[27] Deslauriers, L. et al. Scaling and suppression of anomalous
heating in ion traps. Physical Review Letters 97, 103007 (2006).
[28] Labaziewicz, J. et al. Temperature dependence of electric field
noise above gold surfaces. Physical Review Letters 101, 180602
(2008).
[29] Lin, G.-D. & Duan, L.-M. Equilibration and temperature distri-
bution in a driven ion chain. New Journal of Physics 13, 075015
(2011).
[30] Blinov, B., Moehring, D. L., Duan, L.-M. & Monroe, C. Ob-
servation of entanglement between a single trapped atom and a
single photon. Nature 428, 153 (2004).
[31] Togan, E. et al. Quantum entanglement between an optical photon
and a solid-state spin qubit. Nature 466, 730 (2010).
[32] De Greve, K. et al. Quantum-dot spin-photon entanglement via
frequency downconversion to telecom wavelength. Nature 491,
421 (2012).
[33] Gao, W. B., Fallahi, P., Togan, E., Miguel-Sanchez, J. &
Imamoglu, A. Observation of entanglement between a quan-
tum dot spin and a single photon. Nature 491, 426 (2012).
[34] Cabrillo, C., Cirac, J. I., Garcia-Fernandez, P. & Zoller, P. Cre-
ation of entangled states of distant atoms by interference. Phys-
ical Review A 59, 1025 (1999).
[35] Duan, L.-M. & Kimble, J. Efficient engineering of multiatom
entanglement through single-photon detections. Physical Re-
view Letters 90, 253601 (2003).
[36] Simon, C. & Irvine, W. T. M. Robust long-distance entangle-
ment and a loophole-free bell test with ions and photons. Phys-
ical Review Letters 91, 110405 (2003).
[37] Moehring, D. L. et al. Entanglement of single-atom quantum
bits at a distance. Nature 449, 68 (2007).
[38] Duan, L.-M. & Monroe, C. Colloquium: Quantum networks
with trapped ions. Rev. Mod. Phys. 82, 1209–1224 (2010).
[39] Duan, L.-M., Blinov, B. B., Moehring, D. L. & Monroe, C.
Scaling trapped ions for quantum computation with probabilis-
tic ion-photon mapping. Quant. Inf. Comp. 4, 165–173 (2004).
[40] Kim, T., Maunz, P. & Kim, J. Efficient collection of single
photons emitted from a trapped ion into a single-mode fiber for
scalable quantum-information processing. Physical Review A
84, 063423 (2011).
[41] Schmidt, P. O. et al. Spectroscopy using quantum logic. Science
309, 749 (2005).
[42] Kim, J. et al. 1100 x 1100 port mems-based optical crosscon-
nect with 4-db maximum loss. IEEE Photon. Technol. Lett. 15,
1537 –1539 (2003).
[43] Neilson, D. et al. 256 x 256 port optical cross-connect subsys-
tem. J. Lightwave Technol. 22, 1499 – 1509 (2004).
[44] Raussendorf, R. & Briegel, H. J. A one-way quantum computer.
Physical Review Letters 86, 5188–5191 (2001).
[45] Briegel, H.-J., Dür, W., Cirac, J. I. & Zoller, P. Quantum repeaters:
The role of imperfect local operations in quantum com-
munication. Physical Review Letters 81, 5932–5935 (1998).
[46] Metodi, T. S., Thaker, D. D., Cross, A. W., Chong, F. T. &
Chuang, I. L. A quantum logic array microarchitecture: scal-
able quantum data movement and computation. Proc. 38th An-
nual IEEE/ACM Int. Symp. on Microarchitecture (MICRO’05)
12 (2005).
[47] Metodi, T. S., Thaker, D. D., Cross, A. W., Chuang, I. L. &
Chong, F. T. High-level interconnect model for the quantum
logic array architecture. ACM Journal on Emerging Technolo-
gies in Computing Systems (JETC) 4 (2008).
[48] Vedral, V., Barenco, A. & Ekert, A. Quantum networks for
elementary arithmetic operations. Physical Review A 54, 147
(1996).
[49] Cuccaro, S. A., Draper, T. G., Kutin, S. A. & Moulton, D. P.
A new quantum ripple-carry addition circuit. arXiv:quant-
ph/0410184v1 (2004).
[50] Draper, T. G., Kutin, S. A., Rains, E. M. & Svore, K. M. A
logarithmic-depth quantum carry-lookahead adder. Quant. Inf.
Comput. 6, 351–369 (2006).
[51] Van Meter, R. & Itoh, K. M. Fast quantum modular exponenti-
ation. Physical Review A 71, 052320 (2005).
[52] Dür, W., Briegel, H.-J., Cirac, J. I. & Zoller, P. Quantum repeaters
based on entanglement purification. Physical Review A
59, 169–181 (1999).
[53] Jiang, L. et al. Quantum repeater with encoding. Physical Re-
view A 79, 032325 (2009).
[54] Clark, C. R., Metodi, T. S., Gasster, S. D. & Brown, K. R. Re-
source requirements for fault-tolerant quantum simulation: The
ground state of the transverse ising model. Physical Review A
79, 062314 (2009).
[55] Knill, E. Quantum computing with realistically noisy devices.
Nature 434, 39 (2005).
[56] Raussendorf, R., Harrington, J. & Goyal, K. A fault-tolerant
one-way quantum computer. Ann. Phys. 321, 2242 (2006).
[57] Raussendorf, R., Harrington, J. & Goyal, K. Topological fault-
tolerance in cluster state quantum computation. New Journal of
Physics 9, 199 (2007).
[58] Wang, C., Harrington, J. & Preskill, J. Confinement-higgs tran-
sition in a disordered gauge theory and the accuracy threshold
for quantum memory. Ann. Phys. 303, 31 (2003).
[59] Li, Y., Barrett, S. D., Stace, T. M. & Benjamin, S. C. Fault toler-
ant quantum computation with nondeterministic gates. Physical
Review Letters 105, 250502 (2010).
[60] Fujii, K. & Tokunaga, Y. Fault-tolerant topological one-way
quantum computation with probabilistic two-qubit gates. Phys-
ical Review Letters 105, 250503 (2010).
[61] Barrett, S. D. & Stace, T. M. Fault tolerant quantum computa-
tion with very high threshold for loss errors. Physical Review
Letters 105, 200502 (2010).
[62] Kitaev, A. Ann. Phys. 303 (2003).
[63] Dennis, E. et al. J. Math. Phys. 43, 4452 (2002).
[64] Moore, G. E. Cramming more components onto integrated cir-
cuits. Electronics 38 (1965).
[65] Kim, J. & Kim, C. Integrated optical approach to trapped ion
quantum computation. Quant. Inf. Comput. 9, 181 – 202 (2009).
[66] Schmidt-Kaler, F. et al. Realization of the cirac-zoller
controlled-not quantum gate. Nature 422, 408 – 411 (2003).
[67] Knoernschild, C. et al. Independent individual addressing of
multiple neutral atom qubits with a micromirror-based beam
steering system. Applied Physics Letters 97, 134101 (2010).
[68] Steane, A. M. Error correcting codes in quantum theory. Phys-
ical Review Letters 77, 793 (1996).
[69] DiVincenzo, D. P. & Aliferis, P. Effective fault-tolerant quan-
tum computation with slow measurements. Physical Review
Letters 98, 020501 (2007).
[70] Zhou, X., Leung, D. W. & Chuang, I. L. Methodology for quan-
tum logic gate construction. Physical Review A 62, 052316
(2000).
[71] Gottesman, D. & Chuang, I. L. Demonstrating the viability of
universal quantum computation using teleportation and single-
qubit operations. Nature 402, 390 – 393 (1999).
[72] If ELUs of size Nq = 3 are used, resulting in hypercells of
valency 3, then two such hypercells can be combined into one
of valency 4.
