Note on (active-)QRAM-style data access as a quantum circuit by Carstens, Tore Vincent & Theis, Dirk Oliver
arXiv:1810.10759v1 [quant-ph] 25 Oct 2018
Note on (active-)QRAM-style data access as a
quantum circuit
Tore Vincent Carstens and Dirk Oliver Theis∗
Institute of Computer Science, University of Tartu
and Ketita Labs OÜ
Tartu, Estonia
{carstens, dotheis}@{ketita.com, ut.ee}
May 2018
Abstract
We observe how an active (i.e., requiring $2^n$ parallel control operations) QRAM-like effect
\[ \sum_{y=0}^{N-1} |y\rangle\langle y| \otimes U^y_{\mathrm{result},\mathrm{memory}_y} \]
can be realized as a quantum circuit of depth $O(n+\sqrt{m})$ (where $m$ is the size of the result
register) plus the maximum over all $z$ of the circuit depths of controlled-$U^z$ operations.
Keywords: Gate-based quantum computing, quantum circuits, QRAM.
1 Introduction
All data processing, classical or quantum, requires that input data be made available in the
computer. On a quantum computer, one way to make data available is through Quantum
Random Access Memory (QRAM). QRAM is an abstract concept defined through the following
interaction: If $y$ is a nonnegative $n$-bit integer representing the address of a memory
location, $|\mu_y\rangle = \sum_{s\in\{0,1\}^m} \alpha^y_s |s\rangle$ is a pure $m$-qubit quantum state “stored” at that
“memory location”, and $r \in \{0,1\}^m$ is a target $m$-bit string, then QRAM access has the following effect:
\[ |y\rangle|r\rangle|\mu_y\rangle \xrightarrow{\text{QRAM access}} |y\rangle \sum_{s\in\{0,1\}^m} \alpha^y_s\, |s\oplus r\rangle|s\rangle, \]
where $\oplus$ is the bit-wise XOR between bit-strings. This can be concisely written as
\[ \sum_{y=0}^{N-1} |y\rangle\langle y| \otimes \mathrm{CNOT}^{\otimes m}_{\mathrm{mem}_y,\mathrm{result}}. \]
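As an illustration of the QRAM access map on a basis input $|y\rangle|r\rangle|\mu_y\rangle$, the following minimal Python sketch represents the cell contents $|\mu_y\rangle$ as a sparse dictionary of amplitudes. The function name `qram_access`, the dictionary layout, and the example memory contents are assumptions made for illustration only; they are not part of the construction in this note.

```python
def qram_access(y, r, memory):
    """Apply QRAM access to the input |y>|r>|mu_y>.

    memory[y] maps an m-bit integer s to the amplitude alpha^y_s.
    Returns a dict mapping (address, result, mem) basis states to amplitudes.
    """
    # Each branch |s> of the memory cell XORs its value into the result register.
    return {(y, s ^ r, s): amp for s, amp in memory[y].items()}


# Hypothetical 2-qubit cells: |mu_0> = (|01> + |10>)/sqrt(2), |mu_1> = |11>.
memory = {0: {0b01: 2 ** -0.5, 0b10: 2 ** -0.5}, 1: {0b11: 1.0}}
out = qram_access(0, 0b11, memory)
```

With $r = 0$ the result register simply receives a CNOT-copy of the basis components of the addressed cell, which leaves result and memory entangled if $|\mu_y\rangle$ is a superposition.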
QRAM is an implicit assumption that quantum algorithms such as the quantum linear
system solver, HHL [4], and quantum machine learning algorithms derived from it make
on the quantum computer on which they run. A physical realization of QRAM with $O(n^2)$
access time has been proposed [3, 2]; it is, however, unclear whether it is possible to build a
so-called “passive” QRAM [1], i.e., one which doesn’t require $2^n$ parallel (classical) operations
controlling the quantum hardware. Using active QRAM will not give exponential speedup of,
say, HHL over classical algorithms using $2^n$ processors.
∗ Partly supported by the Estonian Research Council, ETAG (Eesti Teadusagentuur), through PUT Exploratory Grant #620.
This paper deals with a realization of (an obvious generalization of) active QRAM as a
quantum circuit. Given, for each $z \in \{0,\dots,2^n-1\}$, a “memory register” $\mathrm{mem}_z$ of $k_z$ qubits
and a unitary operation $U^z$ acting on two registers, a result register of $m$ qubits and the said
“memory register”, we realize the unitary operation defined through
\[ |y\rangle|\mathrm{result}\rangle|\mathrm{mem}_y\rangle \mapsto |y\rangle\, U^y\bigl(|\mathrm{result}\rangle|\mathrm{mem}_y\rangle\bigr), \tag{1} \]
whenever $y \in \{0,\dots,2^n-1\}$ and $|\mathrm{result}\rangle$ is in a computational basis state. This can be
written concisely as an operation on the Hilbert space
\[ \mathcal{H}^{(n)}_{\mathrm{address}} \otimes \mathcal{H}^{(m)}_{\mathrm{result}} \otimes \bigotimes_{z=0}^{N-1} \mathcal{H}^{(k_z)}_{\mathrm{mem}_z} \]
(the superscripts give the number of qubits):
\[ \sum_{y=0}^{N-1} |y\rangle\langle y| \otimes U^y, \tag{2} \]
where it is understood, by abuse of notation, that $U^y$ acts on $\mathcal{H}_{\mathrm{result}} \otimes \mathcal{H}_{\mathrm{mem}_y}$, i.e., it is tensored
with identities on $\mathcal{H}_{\mathrm{mem}_z}$, $z \ne y$.
The point of this note is the observation that at the expense of (a) $2^n$ parallel quantum
operations being performed, and (b) $O(m2^n)$ ancillary qubits, (2) can be realized with a
quantum circuit of depth $O(n+\sqrt{m})$ plus the maximum (over all $z$) circuit depth of a
controlled-$U^z$.
Taking $k_z = m$ for all $z$, and $U^z$ an $m$-fold tensor product of CNOTs, gives QRAM. Given a
function $f\colon \{0,1\}^n \to \{0,1\}^m$, setting $k_z = 0$ for all $z$, and
\[ U^z := \bigotimes_{i=1}^{m} X^{f(z)_i} \]
(where $X$ stands for the Pauli $X$ operator and exponents are taken as usual) realizes the
unitary $U_f$ with $U_f(|y\rangle|r\rangle) = |y\rangle|f(y)\oplus r\rangle$.
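For the special case $k_z = 0$, the action of (2) on basis states can be sketched in a few lines of Python. The names `apply_U_z` and `select` and the lookup table used for $f$ are hypothetical and chosen only for this illustration; bits are stored as Python lists.

```python
def apply_U_z(z, r_bits, f):
    """Apply U^z, the tensor product of X^{f(z)_i}, to the result bits."""
    # X^0 is the identity and X^1 flips the bit, so bit i becomes r_i XOR f(z)_i.
    return [r ^ b for r, b in zip(r_bits, f(z))]


def select(y, r_bits, f):
    """Action of sum_y |y><y| (x) U^y on the basis state |y>|r>."""
    return y, apply_U_z(y, r_bits, f)


# Hypothetical f: {0,1}^2 -> {0,1}^2, given as a lookup table of bit lists.
table = {0: [0, 0], 1: [0, 1], 2: [1, 1], 3: [1, 0]}
f = table.__getitem__
```

Since every $X^{f(z)_i}$ is an involution, applying `apply_U_z` twice with the same `z` restores the original result bits.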
Hence, it can be said that (2) gives access to data which is partly “hard-coded” into the
quantum circuit, and partly “stored” in qubits on the quantum processor. Another example
is given by controlled rotations $e^{-i\pi\mu X}$, where $\mu$ is an $m$-bit fixed-point fraction stored in
$m$ qubits (possibly in superposition).
Our proposed quantum circuit follows the structure of [3], i.e., it is arranged in a binary
tree in such a way that operations on nodes with the same distance from the root can be run
in parallel.
2 Description of the quantum circuit
Some notation
Let $N := 2^n$. We freely switch between interpreting the elements of $\{0,\dots,N-1\}$ as
nonnegative integers and as bit-strings of length $n$; as usual, bit-strings have the more significant bits to the left.
The empty bit-string is denoted by ε.
Overview
Suppose implementations (quantum circuits or “black boxes”) of the controlled unitaries
\[ |0\rangle\langle 0| \otimes \mathrm{Id} + |1\rangle\langle 1| \otimes U^z \tag{3} \]
for $z \in \{0,1\}^n$ are given. Each $U^z$ acts on two quantum registers: a result register $\mathrm{res}_z$ of
size $m$, and a “memory” register $\mathrm{mem}_z$ of size $k_z$; we allow $k_z = 0$. All these registers $\mathrm{res}_z$,
$\mathrm{mem}_z$, $z \in \{0,1\}^n$, are assumed disjoint.
As in [3], the whole process is organized in a binary tree. The nodes of the tree are labeled
by bit-strings $x = x_{\ell-1}\cdots x_0$ of up to $n$ bits; we denote by $|x|$ the length of the bit-string ($\ell$ in the
case of $x = x_{\ell-1}\cdots x_0$). The root of the tree has the label $\varepsilon$, which is the empty bit-string; if
$x$ labels a node and $|x| < n$, then $x0$ and $x1$ are the labels of the two (left and right) children
of $x$; the leaves of the tree are the bit-strings of length $n$.
2.1 Down–Run–Up
As in [3], the quantum circuit operates in two phases: the “Down” phase, which propagates
the address information from the root of the tree to the leaves; and the “Up” phase, which
propagates the result of running $U^z$ back to the root. Between the two, we have a “Run”
phase, which runs the controlled unitaries (3).
Uncomputation is needed in general, i.e., the complete quantum circuit will be:
Down–Run–Up–do-stuff–(Down–Run–Up)†.
2.1.1 The “Down” phase
For each non-root node $x$ in the tree (i.e., each bit-string $x$ with $1 \le |x| \le n$), we use an ancilla
qubit $\mathrm{life}_x$. If the address register $\mathrm{address}$ is in a computational basis state $|z\rangle$, then the
“Down” phase will set $\mathrm{life}_x$ to state $|1\rangle$ iff the node $x$ is on the path from the root to the leaf
with label $z$.
For each non-root, non-leaf node $x$ in the tree (i.e., each bit-string $x$ with $1 \le |x| < n$), we use an
ancilla register $\mathrm{adr}_x$ with $n - |x|$ qubits; for the root, we denote $\mathrm{adr}_\varepsilon := \mathrm{address}$. If the address
register $\mathrm{address}$ is in a computational basis state $|z\rangle$ with $z = z_{n-1}\cdots z_0$, then the “Down” phase
will set $\mathrm{adr}_x$ to the state $|z_{n-|x|-1}\cdots z_0\rangle$, i.e., $\mathrm{adr}_x$ is a copy of the $n - |x|$ least significant bits
of the address register.
We also need to “hand down” the contents of the result register $\mathrm{result}$. For that, for each
non-root, non-leaf tree node $x$, we use an ancilla register $\mathrm{res}_x$ of size $m$. This is in addition
to the registers $\mathrm{res}_z$, for each leaf $z$, on which the $U^z$ operate, and which we also consider as
ancilla registers. Further, we denote $\mathrm{res}_\varepsilon := \mathrm{result}$.
The “Down” phase proceeds as follows. First of all, all ancilla qubits (including $\mathrm{res}_z$, for
$z$ of length $n$) are prepared in state $|0\rangle$, except for $\mathrm{life}_\varepsilon$, which is prepared in state $|1\rangle$.
For each $k = 0, 1, 2, \dots, n-1$ (sequentially) do the following:
1.) For each node with label $x$ of length $k$ in parallel:
      for each $j = 0, \dots, n-k-2$ in parallel:
        Apply the following CNOT gate: controlled on $\mathrm{adr}_x[j]$, flip $\mathrm{adr}_{x0}[j]$.
2.) For each node with label $x$ of length $k$ in parallel:
      for each $j = 0, \dots, n-k-2$ in parallel:
        Apply the following CNOT gate: controlled on $\mathrm{adr}_x[j]$, flip $\mathrm{adr}_{x1}[j]$.
3.) For each node with label $x$ of length $k$ in parallel:
      Apply the following Toffoli gate:
      controlled on $\mathrm{adr}_x[n-k-1]$ and on $\mathrm{life}_x$, flip $\mathrm{life}_{x1}$.
4.) For each node with label $x$ of length $k$ in parallel:
      Sandwiched between two applications of the Pauli-X gate on $\mathrm{adr}_x[n-k-1]$,
      apply the following Toffoli gate:
      controlled on $\mathrm{adr}_x[n-k-1]$ and on $\mathrm{life}_x$, flip $\mathrm{life}_{x0}$.
5.) For each node with label $x$ of length $k$ in parallel:
      for each $i = 1, \dots, m$ sequentially:
        Apply the following two Fredkin gates in parallel:
        controlled on $\mathrm{life}_{x0}$, swap $\mathrm{res}_x[i]$ and $\mathrm{res}_{x0}[i]$; and
        controlled on $\mathrm{life}_{x1}$, swap $\mathrm{res}_x[i]$ and $\mathrm{res}_{x1}[i]$.
This has the effect that the address bits required to determine the liveness of each node
are “handed down” in the tree to the leaves, which makes it possible, on each level $k$ of the tree, to run in
parallel the operations for all nodes on that level.
At the end of the “Down” phase, $\mathrm{life}_z$, $z \in \{0,1\}^n$, indicates whether the execution of $U^z$
is requested.
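Because CNOT, Toffoli, Fredkin and Pauli-X gates all map computational basis states to computational basis states, the “Down” phase can be traced with ordinary bits when the address and result registers are in basis states. The following Python sketch does this under three assumptions that are ours: the root address register $\mathrm{adr}_\varepsilon$ is the address register itself, only those address bits that a child register actually holds are copied, and the child whose appended bit equals the controlling address bit becomes live, so that the live leaf carries the label $z$. The names `labels` and `down_phase` are hypothetical.

```python
from itertools import product


def labels(k):
    """All node labels (bit-strings) of length k; '' is the root."""
    return ["".join(bits) for bits in product("01", repeat=k)]


def down_phase(z, result, n):
    """Trace the Down phase for address z (n-char bit-string, MSB first).

    Returns dicts life, adr, res keyed by node label; adr[x][j] is bit j
    of adr_x, with j = 0 the least significant bit.
    """
    m = len(result)
    life = {x: 0 for k in range(n + 1) for x in labels(k)}
    adr = {x: [0] * (n - len(x)) for k in range(n) for x in labels(k)}
    res = {x: [0] * m for k in range(n + 1) for x in labels(k)}
    life[""] = 1                                 # life_eps starts in |1>
    adr[""] = [int(b) for b in reversed(z)]      # assumption: adr_eps = address
    res[""] = list(result)                       # res_eps = result
    for k in range(n):
        for x in labels(k):
            top = n - k - 1                      # index of the deciding bit
            if k + 1 < n:                        # leaves carry no adr register
                for child in (x + "0", x + "1"):
                    for j in range(top):         # steps 1 and 2: CNOT copies
                        adr[child][j] ^= adr[x][j]
            life[x + "1"] ^= adr[x][top] & life[x]        # step 3: Toffoli
            life[x + "0"] ^= (adr[x][top] ^ 1) & life[x]  # step 4: X-Toffoli-X
            for i in range(m):                   # step 5: Fredkin gates
                for child in (x + "0", x + "1"):
                    if life[child]:
                        res[x][i], res[child][i] = res[child][i], res[x][i]
    return life, adr, res


life, adr, res = down_phase("101", [1, 0], 3)
```

In this run the live nodes are exactly the prefixes of `"101"`, and the contents of the result register end up in `res["101"]`, while all other `res` registers hold zeros.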
2.1.2 Resource analysis
The life-qubits alone already require $2N$ qubits. A short calculation shows that the number
of adr-qubits is
\[ \sum_{k=0}^{n-1} (n-k-1)\,2^k = (1+o(1))N. \]
The number of res-qubits is, of course, $(1-o(1))2mN$. In total, the number of ancilla qubits
is $(1+o(1))(2m+3)N$.
Owing to the parallelism, the circuit depth is O(n + m). The circuit width (maximum
number of parallel operations) is 2mn.
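The qubit counts can be checked numerically. The sketch below evaluates the sum for the adr-qubits as stated in the text and compares it with the closed form $2^n - n - 1$; it counts one life-qubit and one $m$-qubit res register per non-root node. The function name `ancilla_counts` is an assumption made for this illustration.

```python
def ancilla_counts(n, m):
    """Return (life, adr, res) ancilla qubit counts for address size n."""
    non_root_nodes = sum(2 ** k for k in range(1, n + 1))    # equals 2N - 2
    life = non_root_nodes                                     # one qubit per node
    adr = sum((n - k - 1) * 2 ** k for k in range(n))         # sum from the text
    res = m * non_root_nodes                                  # m qubits per node
    return life, adr, res


life_count, adr_count, res_count = ancilla_counts(10, 8)
```

For every $n \ge 1$ the three counts add up to $(2m+3)N - 2m - n - 3$, in line with the stated total of $(1+o(1))(2m+3)N$.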
Remark 1. At the expense of $\sqrt{m}$ additional ancilla qubits, the circuit depth can be reduced
to $O(\sqrt{m} + n)$. Indeed, replace the sequential loop in step 5.) by the following. In every
step $i = 1, \dots, \sqrt{m}$, do this in parallel: (a) make a CNOT-copy of the control ancilla, and (b)
perform $s - 1$ controlled operations controlled on the copies of the ancilla qubit created in the
earlier steps. Finally, uncompute the ancillas.
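One possible reading of this schedule can be sketched as follows: in each step a fresh CNOT-copy of the control is created while every copy made in an earlier step controls one of the pending operations. This reading, the function name `fanout_schedule`, and the choice not to model the final uncomputation of the copies are assumptions of the sketch. Under this reading, $T$ steps complete $T(T-1)/2$ operations, so the number of steps grows like $\sqrt{2m}$.

```python
def fanout_schedule(m):
    """Schedule m controlled operations that share a single control qubit.

    Returns a list of steps; each step is the list of operation indices
    executed in it, one per control copy created in an earlier step.
    """
    steps, done = [], 0
    while done < m:
        copies_available = len(steps)        # one new copy per earlier step
        batch = list(range(done, min(m, done + copies_available)))
        steps.append(batch)
        done += len(batch)
    return steps


schedule = fanout_schedule(6)
```

For `m = 6` this yields four steps, the first of which only creates a copy.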
2.1.3 The “Run” phase
After the completion of the “Down” phase, we execute the controlled unitaries (3).
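For each $z \in \{0,1\}^n$ in parallel: apply $U^z$ to the registers $\mathrm{res}_z$ ($m$ qubits) and $\mathrm{mem}_z$ ($k_z$ qubits),
controlled on $\mathrm{life}_z$.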
Recall that kz can be zero.
2.1.4 The “Up” phase
The “Up” phase moves the result from the leaves up to the root. As a unitary operator, it is
the adjoint operation of the “Down” phase. We repeat it here for clarity.
For each $k = n-1, n-2, \dots, 0$ (sequentially) do the following:
1.) For each node with label $x$ of length $k$ in parallel:
      for each $i = 1, \dots, m$ in parallel:
        Apply the following two Fredkin gates in parallel:
        controlled on $\mathrm{life}_{x0}$, swap $\mathrm{res}_x[i]$ and $\mathrm{res}_{x0}[i]$; and
        controlled on $\mathrm{life}_{x1}$, swap $\mathrm{res}_x[i]$ and $\mathrm{res}_{x1}[i]$.
2.) For each node with label $x$ of length $k$ in parallel:
      Sandwiched between two applications of the Pauli-X gate on $\mathrm{adr}_x[n-k-1]$,
      apply the following Toffoli gate:
      controlled on $\mathrm{adr}_x[n-k-1]$ and on $\mathrm{life}_x$, flip $\mathrm{life}_{x0}$.
3.) For each node with label $x$ of length $k$ in parallel:
      Apply the following Toffoli gate:
      controlled on $\mathrm{adr}_x[n-k-1]$ and on $\mathrm{life}_x$, flip $\mathrm{life}_{x1}$.
4.) For each node with label $x$ of length $k$ in parallel:
      for each $j = 0, \dots, n-k-2$ in parallel:
        Apply the following CNOT gate: controlled on $\mathrm{adr}_x[j]$, flip $\mathrm{adr}_{x1}[j]$.
5.) For each node with label $x$ of length $k$ in parallel:
      for each $j = 0, \dots, n-k-2$ in parallel:
        Apply the following CNOT gate: controlled on $\mathrm{adr}_x[j]$, flip $\mathrm{adr}_{x0}[j]$.
2.2 The effect
It should be apparent from the construction that the circuit does what it is supposed to do:
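Proposition 2. If the address register is in a computational basis state $|\mathrm{address}\rangle = |y\rangle$, for
$y \in \{0,1\}^n$, then the unitary transformation has the following effect:
\[ |\mathrm{address}\rangle\,|\mathrm{result},\mathrm{mem}\rangle\,\underbrace{|0\rangle\cdots|0\rangle}_{\text{ancillas}} \longrightarrow |\mathrm{address}\rangle\,(U^y\otimes\mathbf{1})\bigl(|\mathrm{result},\mathrm{mem}\rangle\bigr)\,|0\rangle\cdots|0\rangle, \tag{4} \]
where the effect of $(U^y\otimes\mathbf{1})$ is $U^y$ on the result and $\mathrm{mem}_y$ registers, and identity on all $\mathrm{mem}_z$ with
$z \ne y$.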
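For the case $k_z = 0$ with $U^z$ a tensor product of Pauli-X gates, the statement of Proposition 2 can be checked on basis states by a classical bit-level simulation of Down–Run–Up. The sketch below builds the “Down” phase as a list of self-inverse gates and obtains the “Up” phase by reversing that list. It assumes that $\mathrm{adr}_\varepsilon$ is the address register, that the child whose appended bit equals the controlling address bit becomes live, and that $f$ is given as a Python function returning a list of $m$ bits; all function names are hypothetical.

```python
from collections import defaultdict
from itertools import product


def labels(k):
    """All node labels (bit-strings) of length k; '' is the root."""
    return ["".join(b) for b in product("01", repeat=k)]


def down_gates(n, m):
    """Gate list of the Down phase; every gate is its own inverse."""
    gates = []
    for k in range(n):
        for x in labels(k):
            top, live = ("adr", x, n - k - 1), ("life", x, 0)
            if k + 1 < n:  # leaves carry no adr register
                for child in (x + "0", x + "1"):
                    gates += [("CNOT", ("adr", x, j), ("adr", child, j))
                              for j in range(n - k - 1)]
            gates.append(("TOF", top, live, ("life", x + "1", 0)))
            gates += [("X", top),
                      ("TOF", top, live, ("life", x + "0", 0)),
                      ("X", top)]
            for i in range(m):
                for child in (x + "0", x + "1"):
                    gates.append(("FRED", ("life", child, 0),
                                  ("res", x, i), ("res", child, i)))
    return gates


def apply_gates(gates, bits):
    """Apply classical reversible gates to a dict of bits, in place."""
    for name, *q in gates:
        if name == "X":
            bits[q[0]] ^= 1
        elif name == "CNOT":
            bits[q[1]] ^= bits[q[0]]
        elif name == "TOF":
            bits[q[2]] ^= bits[q[0]] & bits[q[1]]
        elif bits[q[0]]:  # FRED: swap the two targets if the control is set
            bits[q[1]], bits[q[2]] = bits[q[2]], bits[q[1]]


def down_run_up(y, r, f, n):
    """Run Down, Run, Up on the basis state |y>|r>; return all bits."""
    m = len(r)
    bits = defaultdict(int)                    # every ancilla starts in |0>
    bits[("life", "", 0)] = 1                  # life_eps is prepared in |1>
    for j in range(n):
        bits[("adr", "", j)] = (y >> j) & 1    # assumption: adr_eps = address
    for i in range(m):
        bits[("res", "", i)] = r[i]            # res_eps = result
    gates = down_gates(n, m)
    apply_gates(gates, bits)                   # Down
    for z in labels(n):                        # Run: controlled-U^z per leaf
        if bits[("life", z, 0)]:
            for i, b in enumerate(f(int(z, 2))):
                bits[("res", z, i)] ^= b
    apply_gates(reversed(gates), bits)         # Up: the adjoint of Down
    return bits
```

After the run, the root result register holds $f(y) \oplus r$ and every register attached to a non-root node is back in the all-zero state.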
References
[1] Scott Aaronson. Read the fine print. Nature Physics, 11(4):291, 2015.
[2] Vittorio Giovannetti, Seth Lloyd, and Lorenzo Maccone. Architectures for a quantum
random access memory. Physical Review A, 78(5):052310, 2008.
[3] Vittorio Giovannetti, Seth Lloyd, and Lorenzo Maccone. Quantum random access memory.
Physical Review Letters, 100(16):160501, 2008.
[4] Aram W. Harrow, Avinatan Hassidim, and Seth Lloyd. Quantum algorithm for linear
systems of equations. Physical Review Letters, 103(15):150502, 2009.
