Ultra-Low-Power Superconductor Logic by Herr, Quentin P. et al.
arXiv:1103.4269v1 [cond-mat.supr-con] 22 Mar 2011
Ultra-Low-Power Superconductor Logic
Quentin P. Herr,∗ Anna Y. Herr, Oliver T. Oberg, and Alexander G. Ioannidis
Northrop Grumman Systems Corp., Baltimore, MD 21240
(Dated: March 23, 2011)
We have developed a new superconducting digital technology, Reciprocal Quantum Logic, that
uses AC power carried on a transmission line, which also serves as a clock. Using simple experiments
we have demonstrated zero static power dissipation, thermally limited dynamic power dissipation,
high clock stability, high operating margins and low BER. These features indicate that the tech-
nology is scalable to far more complex circuits at a significant level of integration. On the system
level, Reciprocal Quantum Logic combines the high speed and low-power signal levels of Single-Flux-
Quantum signals with the design methodology of CMOS, including low static power dissipation, low
latency combinational logic, and efficient device count.
Power consumption has increasingly become a limiting
factor in high performance digital circuits and systems.
According to a U.S. Environmental Protection Agency
study [1], the demand of servers and data centers in the
U.S. is approaching 12GW, equivalent to the output of
25 power plants. Here we show a new logic family, Re-
ciprocal Quantum Logic, that combines the low energy
and high clock rates of superconductor devices with the
essential qualities of CMOS, including low static power
dissipation, low latency combinational logic, and efficient
device count. On a system level, this yields a factor of
300 reduction in power compared to projected nano-scale
CMOS, while taking into account the overhead of cryogenic
operating temperature.
Superconducting digital electronics has long been con-
sidered the ultimate low energy alternative to CMOS [2]
based on the fundamental advantages of lossless inter-
connect and fast, low energy signal levels. Passive super-
conducting interconnect allows data transmission with-
out signal amplification. Superconducting interconnects
have a typical bandwidth of 700 GHz, which has allowed
serial data rates up to 60 Gb/s for chip-to-chip communication [3].
Lossless interconnects would enable large
systems with high computational density, as compared to
conventional systems where interconnect dominates the
total power budget.
Unlike transistor circuits, where dissipated power is set
by device size and materials, superconductor circuits are
in the regime where device size and power dissipation are
set by the thermal noise limit. The active device, the
Josephson junction, generates quantum accurate digital
information in the form of Single Flux Quanta (SFQ) of
magnetic flux $\Phi_0 = h/2e = 2.07 \times 10^{-15}$ Wb. Using
equivalent units, $\Phi_0 \approx 2$ mV ps $= 2$ mA pH illustrates that
the SFQ can exist as a transient voltage pulse across the
Josephson junction, with $\int_0^{\tau} V\,dt \equiv \Phi_0$, or as a persistent
current in a superconducting inductive loop. For a typical
minimum critical current of 0.1 mA at liquid helium
temperature, the SFQ pulse energy $E_{SFQ} = \int_0^{\tau} IV\,dt \approx I_c\Phi_0$
is only $1 \times 10^{-19}$ J. This is only about three orders
of magnitude above the fundamental thermal Boltzmann
limit, kBT , and is the practical limit for classical digital
circuits operating with low bit error rate. Beyond this
limit there are only reversible computing and quantum
computing.
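The orders of magnitude quoted above can be checked with a few lines of Python. This is a minimal sketch and not part of the original work: the function names are illustrative, the constants are the exact SI values, and the liquid helium temperature is assumed to be 4.2 K. With Ic = 0.1 mA the product IcΦ0 comes out near 2 × 10^-19 J, the same order of magnitude as the figure quoted above and a few thousand times kBT.

```python
# Illustrative check of the SFQ energy scale; all names are hypothetical.
PLANCK_H = 6.62607015e-34             # J s (exact SI value)
ELEMENTARY_CHARGE = 1.602176634e-19   # C (exact SI value)
BOLTZMANN_K = 1.380649e-23            # J/K (exact SI value)


def flux_quantum():
    """Return the single flux quantum Phi0 = h / 2e in webers."""
    return PLANCK_H / (2.0 * ELEMENTARY_CHARGE)


def sfq_pulse_energy(critical_current):
    """Approximate SFQ pulse energy E ~ Ic * Phi0 in joules."""
    return critical_current * flux_quantum()


def thermal_energy(temperature):
    """Return kB * T in joules."""
    return BOLTZMANN_K * temperature


if __name__ == "__main__":
    ic = 0.1e-3         # A, typical minimum critical current from the text
    temperature = 4.2   # K, liquid helium (assumed value)
    energy = sfq_pulse_energy(ic)
    print(f"Phi0          = {flux_quantum():.4e} Wb")
    print(f"Ic * Phi0     = {energy:.2e} J")
    print(f"ratio to kB*T = {energy / thermal_energy(temperature):.0f}")
```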
Numerous circuits with record-breaking clock rates
have been demonstrated in the Rapid-Single-Flux-
Quantum (RSFQ) [4] logic family, including a static digi-
tal divider operating up to 770GHz [5], digital signal pro-
cessors clocked at 20-40GHz [6], [7], and a serial micro-
processor at 20GHz [8]. Cryocooler-mounted prototypes
have included a digital receiver used for satellite commu-
nications [9] and high-end router components operating
at 47 Gb/s port speed [10]. However, the technology suffers
from high overhead in static power dissipation and
device count, which offset the energy advantages of SFQ
data encoding. RSFQ circuits use DC power delivered on
a common voltage rail via bias resistors, which is anal-
ogous to TTL logic and inferior to CMOS due to static
power dissipation. Ten times more power is dissipated in
the bias resistors than in the active devices even in a fully
active RSFQ circuit. While the power rail voltage is only
about 1mV, it draws significant current, reaching 1A for
a circuit with 10,000 Josephson junctions. This results
in high parasitic heat load in the cryopackage. Addi-
tional overhead is incurred in the timing design; RSFQ
uses an active clock distribution network, which leads
to significant accumulated jitter and timing variations
based on device parameters and data pattern statistics.
Timing design of high speed circuits results in a total
device count that is dominated by the clock distribution
network [11]. RSFQ is pipelined on the gate level, which
enables high clock rates but also incurs high latency. The
new logic family described here circumvents each of these
limitations while preserving the fundamental property of
SFQ data encoding.
We report a new superconducting logic family, Recip-
rocal Quantum Logic (RQL), that eliminates static power
by replacing bias resistors with inductive coupling to an
AC transmission line that effectively powers the devices
in series and eliminates large ground return current. The
AC power also serves as a stable clock reference signal,
preventing accumulated clock jitter. The novel power
supply is paired with a novel data encoding. A logi-
cal “one” is encoded as a reciprocal pair of SFQ pulses
of opposite polarity. During the positive half cycle, the
logic operation involves storage and routing of SFQ data
pulses. While the gates have internal state with respect
to the positive pulse, the trailing negative-polarity SFQ
pulse serves as a reset. This greatly simplifies gate de-
sign and produces combinational logic behavior. Similar
to CMOS, these combinational gates allow multiple levels
of logic per stage for low latency. Overall, RQL combines
the high speed and low-power signal levels of SFQ signals
with the design methodology of CMOS.
Using simple experiments involving a 1600-device shift
register and logic gates, we have demonstrated
that RQL is at once high speed and low energy
with a low bit-error rate. We measure energy dissipa-
tion to be within a factor of 1000 of the thermal limit
at clock rates in the range 2-10 GHz for the shift register,
and negligible BER of less than $10^{-40}$ for the logic
gates while maintaining operating margins of ±30%. AC
power is supplied to the circuit on superconductor mi-
crostrip transmission line, which also serves as a passive
clock distribution network. We show high stability of the
clock at frequencies up to 12GHz. The technology scales
to the one-million device level clocked at 6GHz with only
a 6mW power supply, amounting to only 15mA on a
50Ω line with dynamic timing variation of only ±1% of
the clock period. This indicates that the technology is
scalable to complex circuits at a significant level of inte-
gration. Computational efficiency of the circuits is nearly
three orders of magnitude higher in terms of operations
per Joule compared to high performance CMOS. Tak-
ing into account that superconductor circuits require a
cryocooler, with efficiency of 1,000W/W achievable at
4.2Kelvin, RQL circuits offer a system-level factor of 300
less wall-plug power dissipation. This makes RQL tech-
nology attractive for many applications, including high
end computing.
Results
Power dissipation. RQL circuits have zero static
power dissipation, so for the first time dynamic power
dissipation in a superconducting SFQ circuit could be
measured directly. The clock power is carried on 50Ω
lines that return to room temperature without termina-
tion on chip, allowing direct measurement of the relative
amplitude of the output waveforms for an inactive and
fully active circuit. Because dynamic power dissipation
is so small, the experiment requires a circuit with a large
number of Josephson junctions and a low AC power am-
plitude with relatively high coupling to the clock line.
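[Figure 1: panel a, waveforms for Clock 1, Clock 2, SFQ Input, and SFQ Output with bit patterns 0 1 1 0 and 1 1 0 0; panel b, schematic of one shift register bit (Lc, Lb, L1, L2, Lm, Ic1, Ic2, Iclk; loop currents −Φ0 and Φ0; clock phases 1 to 4); panel c, physical layout with Input, Output, dc offset, Clock 1, and Clock 2 lines, 25 µm scale bar.]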
FIG. 1: A Reciprocal Quantum Logic (RQL) shift
register. a, Novel reciprocal data encoding, where the AC
clock propagates digital ones as pairs of Single Flux Quan-
tum (SFQ) pulses of opposite polarity. b, Schematic of the
RQL shift register bit. Cross-wired transformers effectively
produce a four-phase clock with only two AC power lines,
in quadrature. The SFQ pulses are shown as loop currents
that move through the circuit with a half cycle of separation.
Circuit parameters are: Ic1 = 0.141mA, Ic2 = 0.200mA,
L1 = 3pH, L2 = 2.1 pH, Lb = 10.9 pH, Lm = 5.1 pH,
and Iclk=0.7mA amplitude and 0.2mA effective offset (not
shown). c, Physical layout of the shift register in a fabrication
process with four Nb metal layers, with the middle layers used
for inductive wireup, and the top and bottom layers serving
as ground plane shields. The AC clock lines, and a separate
line to apply dc offset, are microstrips with the filament in a
first metal layer and ground in the top layer. Bias inductors
lie in the third metal layer situated on top of the clock signal
line with strong inductive coupling scaling linearly with the
length of the transformer.
The shift register was chosen as a convenient test vehi-
cle.
Fig. 1 shows the schematic and physical layout of one
bit of the RQL shift register circuit. The four-phase
clock is a fundamental feature that provides directional-
ity. Without this, the positive pulse that moves forward
during the positive half clock cycle would travel back-
ward during the negative half of the cycle, annihilating
the negative pulse. Instead, the positive pulse rides the
leading edge of the clock from one phase to the next and
arrives at the output after one cycle of delay, and the
negative pulse follows with half a cycle of separation.
[Figure 2: plot of Power Dissipation (µW), 0 to 3, versus Clock Rate (GHz), 0 to 12. Data series: Clock 1, Clock 2, Pseudo-random code. Curves: 2nIcΦ0f and 1.35 × Pnum.]
FIG. 2: Power dissipation. Power ratio of the clock output
for the two data patterns, corresponding to all “ones” and all
“zeros,” was measured for frequencies of 2-12 GHz. Power
dissipation is derived from the directly-observed power ratio
and the on-chip clock power of 12.5 µW, calculated as the
geometric mean of applied and returned power. At 6 GHz
and below, measurement error is within the size of the data
points. At 8 GHz and above, the primary source of error is
variation in clock attenuation on the different lines in the
American Cryoprobe BCP-2 chip holder, producing visible
spread between data points for clock 1 and clock 2. Additional
data points correspond to a 6 Gb/s pseudo-random input
pattern that shows half the power dissipation compared to all
“ones”, as expected. Measured power dissipation agrees well
with the result from circuit simulation, Pnum, with a single
multiplicative fitting parameter. However, the power is three
times smaller than the analytical estimate that scales with
circuit size and frequency as 2nIcΦ0f and that would apply
to dc-powered SFQ devices.
A 200-bit shift register with 1600 Josephson junctions
was fabricated in a commercial superconductor fabrication
process with 4.5 kA/cm² Josephson junction critical
current density, 1.5 µm minimum feature size, and
four metal layers [12]. The junction plasma frequency is
250GHz, corresponding to an SFQ pulse width of 3 ps at
critical damping. The impedance of the power transmis-
sion line, implemented as microstrip with 2.3µm mini-
mum line width and SiO2 dielectric thickness of 850nm,
is limited to 32Ω including parasitic capacitive coupling
to the bias inductors of about 7 fF. Impedance matching
to 50Ω was accomplished with tapered microstrip lines
leading to the contact pads.
The data in Fig. 2 correspond to the power dissipated
in a fully active circuit with an all “ones” input pat-
tern relative to the inactive case of all “zeros”. Using
the estimate 2IcΦ0, energy dissipation per digital “one”
is $6.8 \times 10^{-19}$ J with average Ic = 170 µA. Total power
dissipation of the circuit would scale with the number
of Josephson junctions and clock frequency. Measured
power dissipation in the circuit is three times smaller
than this estimate. Additional data points corresponding
to a 6 Gb/s pseudo-random input pattern show 0.6 µW
total power dissipation in the 800 Josephson junctions
on each clock line. This is half the power dissipation
of the all “ones,” as expected, and is only three orders
of magnitude above the von Neumann-Landauer thermal
limit[13], kBT ln 2 per bit.
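The arithmetic behind these figures can be reproduced with a short Python sketch. It is an illustration under stated assumptions rather than the authors' analysis: the function names are hypothetical, the temperature is taken as 4.2 K, and the per-bit energy is interpreted as the measured power divided by the number of junctions and the bit rate. With the exact Φ0 the estimate 2IcΦ0 gives about 7.0 × 10^-19 J per digital “one”; the value 6.8 × 10^-19 J follows from the rounded Φ0 ≈ 2 mV ps. For 800 junctions at 6 GHz the analytical estimate is about 3.4 µW, one third of that is about 1.1 µW, and half of that again is about 0.56 µW, close to the 0.6 µW quoted for the pseudo-random pattern.

```python
import math

PHI0 = 2.067833848e-15   # Wb, single flux quantum h / 2e
K_B = 1.380649e-23       # J/K, Boltzmann constant


def analytic_power(n_junctions, ic_avg, clock_hz):
    """Analytical estimate 2 * n * Ic * Phi0 * f for an all-ones pattern."""
    return 2.0 * n_junctions * ic_avg * PHI0 * clock_hz


def energy_per_junction_per_bit(power_w, n_junctions, bit_rate_hz):
    """Dissipated energy per junction per bit (an assumed interpretation)."""
    return power_w / (n_junctions * bit_rate_hz)


def landauer_limit(temperature):
    """Thermal limit kB * T * ln 2 per bit, in joules."""
    return K_B * temperature * math.log(2.0)


if __name__ == "__main__":
    n, ic, f = 800, 170e-6, 6e9       # values quoted in the text
    estimate = analytic_power(n, ic, f)
    print(f"2 n Ic Phi0 f       = {estimate * 1e6:.2f} uW")
    print(f"one third of that   = {estimate / 3 * 1e6:.2f} uW")
    e_bit = energy_per_junction_per_bit(0.6e-6, n, f)
    ratio = e_bit / landauer_limit(4.2)  # 4.2 K is an assumed temperature
    print(f"energy/junction/bit = {e_bit:.2e} J ({ratio:.0f} x kT ln 2)")
```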
A model for SFQ dissipation based on the energy po-
tential indicates that the work done on a switching junc-
tion is a function of bias current rather than critical cur-
rent [14]. Physical-level simulation [15] of the circuit
shows that in the range of interest, where clock period
is much longer than the switching time of the Josephson
junctions, data pulses pass through each stage early in
the clock cycle under low bias conditions. This results
in a low energy SFQ pulse with simulated dissipation of
about 0.25IcΦ0 at 6 Gb/s. Switching of the Josephson
junctions is shifted to higher bias at higher frequency,
producing a slightly non-linear frequency dependence.
Experimental data agree with this simulation result, fit
with a single prefactor of 1.35.
Clock phase stability. Switching of Josephson junc-
tions not only attenuates the AC clock but also adds
accumulative delay. The magnitude of this effect can be
estimated using a simple linear model where the Joseph-
son junction acts as an inductor if superconducting, and
as a resistor if switching. The clock propagation time in
the case of all digital “ones” is the same as for an isolated
line, $\tau = \sqrt{L_c C_c} = 7.6$ fs/µm, where $L_c = 0.3$ pH/µm and
$C_c = 0.29$ fF/µm are the clock line inductance and capacitance
in the circuit. In the case of all digital “zeros”, the propagation
time is $\tau' = \sqrt{L'_c C_c}$, with inductance $L'_c$ given by
the impedance transformation for inductive coupling
$$L'_c = (1 - k^2) L_c + k^2 L_c \left(1 \,\|\, (L_g/L_b)\right), \qquad (1)$$
where $L_b$ is the bias inductor and $k = L_m/\sqrt{L_c L_b}$ is the
magnetic coupling constant as shown in Fig. 1b. $L_g$ is
the inductance of the RQL gate connected to the bias inductor.
In the shift register, $L_g = (L_{J1} + L_1) \,\|\, (L_{J2} + L_2)$
is the series and parallel combination of the interconnect
and the Josephson inductances $L_J = \Phi_0/2\pi I_c$. In general
terms, data-dependent phase delay of the clock scales
as $k^2$ and can be minimized by reducing coupling to the
clock line and increasing AC clock power.
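Eq. 1 and the gate inductance can be evaluated numerically as in the Python sketch below. The function names are illustrative. The shift register values of Ic1, Ic2, L1, L2, and Lb are taken from the caption of Fig. 1, which gives a gate inductance Lg of about 2.2 pH. The coupling constant k is left as a free parameter because the clock line inductance of a single transformer section is not stated; the value k = 0.3 in the example is purely hypothetical.

```python
import math

PHI0 = 2.067833848e-15   # Wb, single flux quantum


def josephson_inductance(ic):
    """Josephson inductance L_J = Phi0 / (2 pi Ic)."""
    return PHI0 / (2.0 * math.pi * ic)


def parallel(a, b):
    """Parallel combination a || b."""
    return a * b / (a + b)


def gate_inductance(ic1, l1, ic2, l2):
    """Shift register gate inductance Lg = (LJ1 + L1) || (LJ2 + L2)."""
    return parallel(josephson_inductance(ic1) + l1,
                    josephson_inductance(ic2) + l2)


def loaded_clock_inductance(lc, k, lg, lb):
    """Eq. 1: L'c = (1 - k^2) Lc + k^2 Lc (1 || Lg/Lb)."""
    return (1.0 - k**2) * lc + k**2 * lc * parallel(1.0, lg / lb)


def fractional_delay_change(k, lg, lb):
    """(tau - tau') / tau, using tau proportional to sqrt(Lc)."""
    return 1.0 - math.sqrt(loaded_clock_inductance(1.0, k, lg, lb))


if __name__ == "__main__":
    lb = 10.9e-12   # H, bias inductor from Fig. 1
    lg = gate_inductance(0.141e-3, 3e-12, 0.200e-3, 2.1e-12)
    print(f"Lg = {lg * 1e12:.2f} pH")
    k = 0.3         # hypothetical coupling constant, not from the source
    print(f"fractional delay change at k = {k}: "
          f"{fractional_delay_change(k, lg, lb):.4f}")
```

For small k the fractional change in delay is approximately k²Lb/(2(Lb + Lg)), which makes the k² scaling stated above explicit.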
Accumulated variable clock delay for the entire 200-
bit shift register is 1.4±0.2 ps and is independent of fre-
quency. Variable delay was directly observed on the clock
return from the chip at 2-12GHz on a sampling oscillo-
scope to compare the data patterns of all “ones” and all
“zeros”. Accuracy was limited by drift between the two
phase-locked synthesizers that clocked the chip and trig-
gered the oscilloscope. This result is in agreement with
the analytical estimate of Eq. 1, which gives 1.5 ps variable
clock delay for the complete 200-bit shift register circuit.
[Figure 3: block diagrams with pulse-logic behavior for the AnotB gate (inputs A, B, output Q: “If no B pulse, A pulse goes to Q”) and the AndOr gate (inputs A, B, outputs Q1, Q2: “First pulse goes to Q1, second pulse goes to Q2”); gate schematics (Lc, Lb, Lt, Ld, Lm, Lm1, L, L1, L2, Ic, Ic1, Ic2, Iclk, Iflux); physical layout with DC flux bias and Clock lines, 25 µm scale bar.]
FIG. 3: RQL logic gates. The logic gates AnotB and AndOr
are simple and robust due to the reset function implicit
in reciprocal data encoding. a,b, Block diagram, pulse logic
behavior, and schematic of AnotB gate. Operation of the gate
is based on a high-efficiency transformer that is cross-wired
to invert the polarity of the signal from input B. Schematic
values are Ic1 = 0.141 mA, Ic2 = 0.100 mA, L1 = 3 pH,
L2 = 2.1 pH, L = 20 pH, Lm = 17 pH, Lm1 = 2.5 pH, and
Iflux = 0.4 mA. c, In the physical layout, the high-efficiency
transformer is implemented using moats in the upper and
lower ground plane, which appear as seven horizontal traces
cutting across the vertical wires. d,e, Block diagram, pulse
logic description, and schematic of AndOr gate. Here the
high-efficiency transformers aid propagation of either input
pulse to the outputs, but inhibit propagation to the opposite
input. Schematic values are Ic = 0.118 mA, L = 3 pH,
L1 = 20 pH, Lm = 16 pH, Lt = 20 pH, Lb = 12.8 pH,
Lm1 = 2.5 pH, and Iclk = 0.7 mA amplitude and 0.2 mA offset.
Logic gates. Routing and processing of pulse-based
signals is distinct from transistor-based voltage-state
logic, as shown in Fig. 3. Logical A-and-not-B (AnotB)
means that an input pulse A will propagate to output Q
unless a pulse on input B comes first. Logical And & Or
(AndOr) means that the first input pulse, if any, goes to
Q1 (logical OR), and the second input pulse goes to Q2
(logical AND). Inputs to the AnotB gate must satisfy the
timing requirement that B arrive before A generates an
output. There is no similar timing requirement for the
AndOr gate.
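The pulse-logic behavior described above can be captured in a small behavioral model. The Python sketch below is an idealization, not a circuit simulation: each input is represented by the arrival time of its positive SFQ pulse within one clock cycle, or None if no pulse arrives. The parameter `output_delay` is a hypothetical stand-in for the time at which A generates an output, and a B pulse arriving exactly at that time is assumed to block A.

```python
from typing import Optional, Tuple

# Arrival time of a positive SFQ pulse within a clock cycle, or None.
Pulse = Optional[float]


def a_not_b(t_a: Pulse, t_b: Pulse, output_delay: float = 0.0) -> Pulse:
    """AnotB: A propagates to Q unless a B pulse comes first.

    B blocks A only if it arrives before A generates an output, which
    is modeled here as t_a + output_delay (a hypothetical parameter).
    """
    if t_a is None:
        return None
    if t_b is not None and t_b <= t_a + output_delay:
        return None
    return t_a + output_delay


def and_or(t_a: Pulse, t_b: Pulse) -> Tuple[Pulse, Pulse]:
    """AndOr: first pulse goes to Q1 (OR), second pulse goes to Q2 (AND)."""
    arrivals = sorted(t for t in (t_a, t_b) if t is not None)
    q1 = arrivals[0] if len(arrivals) >= 1 else None
    q2 = arrivals[1] if len(arrivals) == 2 else None
    return q1, q2


if __name__ == "__main__":
    print(a_not_b(1.0, None))   # A alone propagates to Q
    print(a_not_b(1.0, 0.5))    # B came first, so no output
    print(a_not_b(1.0, 2.0))    # B too late: timing requirement violated
    print(and_or(2.0, 1.0))     # both outputs fire, in arrival order
```

The third call shows why the timing requirement matters: a B pulse that arrives after A has already generated an output cannot inhibit it, so the gate no longer computes A-and-not-B.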
[Figure 4: panel a, logic diagram with inputs A and B, outputs AND, XOR, OR, and AnotB, active interconnect, and clock Phases 1 to 4; panel b, complete circuit with Clock 1, Clock 2, Input A, Input B, and Outputs; panel c, oscilloscope traces of Voltage (arbitrary units) for A, B, AND, XOR, OR, and AnotB at 6 Gb/s return-to-zero, 1 ns/div.]
FIG. 4: RQL logic test. a, The circuit has two inputs,
three logic gates, and four logic outputs including a synthesized
XOR. Fanout is produced using interconnect consisting
of two-junction shift register stages. b, The complete circuit
includes input gates that convert a return-to-zero (RZ)
waveform to RQL data encoding and on-chip output amplifiers
that convert signals back to RZ voltage levels. The clock
and data inputs are inductively coupled to the circuit and
return on another set of signal lines without contacting chip
ground, which contributes to high signal integrity at GHz
rates. c, Input and output waveforms at 6 Gb/s were captured
on a sampling oscilloscope. A 1023-bit pseudo-random
pattern was split and applied to the inputs with a 39 bit offset.
No signal averaging, smoothing, or subtraction was used
in the measurement.
The logical behavior of the gates is based on the reciprocal
data encoding. Considering only the positive
pulses, the gates are similar to the state machines of
RSFQ logic, as input changes the internal flux state of
the inductive loops. However, the trailing negative pulse
erases the internal state every clock cycle and produces
combinational logic behavior. The reset operation af-
forded by the negative pulse greatly simplifies the logic
design, so that each gate consists of only two active de-
vices with inductive interconnect. The gates have large
parametric operating margins, including simulated toler-
ances on junction critical currents of at least ±50%.
[Figure 5: plot of Bit Error Rate (log scale), 0 to -16, versus Flux Bias on AnotB Gate (mA), 1 to 3. Annotations: No errors detected; 4x reduced power; 10x reduced power.]
FIG. 5: Bit-error rate (BER). The BER of the AnotB
gate at 6 GHz is shown as a function of its flux bias Iflux. A
32-bit input pattern generated with an Anritsu MP1763C was
split and applied to the inputs with a 15 bit relative shift,
and the XOR output was compared to the correct pattern
with an Anritsu MP1764C error detector. Error bars on the
lowest points correspond to counting statistics of 4 errors (left)
and 5 errors (right). Near the center, no errors detected for
a period of 30 hours gives an error floor below $10^{-15}$ for the
entire circuit. The data fit to the error function extrapolate
to a minimum BER of $10^{-480}$ at the optimal bias of 1.82 mA.
Additional curves correspond to the BER scaled for reduced
device size and power.
On the physical level, both RQL gates have a bistable
internal flux state corresponding to ±Φ0/2. The AnotB
gate has an explicit flux bias line that sets up positive
current through both junctions to ground. When either
junction is triggered by an SFQ pulse, the flux state is re-
versed, which reverses current through the junctions and
inhibits triggering of the opposite junction. The AndOr
gate has an implicit flux bias on inductor Lt, as the bias
inductor is connected at the end that favors the OR out-
put. Triggering of OR output redirects the bias current
through Lt to the AND output. The gates require high
efficiency transformers that provide high common-mode
inductance, when currents through the inductors are in
the same direction, and low differential mode inductance.
In the AnotB gate, the transformer inverts the polarity
of the signal between input B and the output. In the
AndOr gate, the transformers aid propagation of either
input pulse to the outputs, but inhibit propagation of
an input pulse to the opposite input.
RQL logic gates cannot drive each other directly, but
need at least one interconnect cell to achieve fanout of
one. The interconnect cell is similar to the two-junction
shift register unit cell, but with some reduction of bias
current to junctions interior to the clock phase bound-
aries. The standard requirement of fanout equal to four
can be achieved using seven interconnect cells arranged in
a binary tree. The four phase clock provides an implicit
pipeline without additional devices for latches or clock
distribution. Large circuit blocks with multiple levels
of logic can be on a single clock phase for low latency,
or smaller blocks can be used in a deeper pipeline for
higher maximum clock rate. A crucial property of the
RQL pipeline is self-correcting timing. A pulse arriv-
ing early at a phase boundary will be delayed due to
low bias current. A late pulse will arrive during high
bias and will be accelerated. Pulses travelling through
the pipeline constantly tend toward timing equilibrium.
However, at too high a clock rate a late pulse will arrive
at the phase boundary on the falling edge of the clock and
will be delayed further, causing failure. The maximum
clock frequency is determined by the number of switch-
ing events per phase during approximately one quarter
of the clock period.
Fig. 4 shows the RQL logic test circuit. The circuit
consists of an AndOr gate, two AnotB gates, and inter-
connect. The circuit has two inputs and four outputs to
produce the logic functions OR, XOR, AND, and An-
otB. Logical XOR is synthesized using an AndOr gate
followed by an AnotB. Total latency through the circuit
is one clock cycle. The timing requirement of each AnotB
gate is satisfied by placement on a clock phase boundary,
with gate output controlled by the trailing clock phase of
the output interconnect. With this arrangement the relative
timing of inputs A and B is not important, as both
inputs are certain to arrive before the output is generated.
The logic circuit was collocated on the same chip as the
shift register. The complete circuit includes input gates
that convert a return-to-zero (RZ) waveform to RQL data
encoding, and output amplifiers that convert back to RZ
voltage levels. The distributed output amplifier, simi-
lar to that described in [16], provides a 2mV signal. At
the 1.5µm lithography node the logic circuit itself was
designed for 20GHz operation, but was limited in test
to 6GHz by the input pattern generator. Measured op-
erating margins on clock power were ±25%, limited on
the low end by the output amplifiers, and on the high
end by overbiasing of the logic circuit. Similar operating
margins were measured at the lower frequency of 2GHz.
Bit-error rate. SFQ circuits operate in the thermal
limit, with device size ultimately determining both en-
ergy dissipation and bit-error rate (BER), which must
be considered in conjunction with each other. Fig. 5
shows the BER of the AnotB gate clocked at 6GHz,
measured by monitoring the logical XOR output while
setting the flux bias on the gate to values near failure.
At the nominal point, the flux bias provides symmetry
between the bi-stable flux states in the gate. At low bias,
the observed failure is of logical “ones” becoming logical
“zeros” uniformly throughout the pattern; at high bias,
logical “zeros” become logical “ones”. Both error modes
indicate switching errors of the junction, labeled Ic1, that
generates the output.
The data are fit to a Gaussian distribution with bit-error rate
$$p = \frac{1}{4}\,\mathrm{erfc}\!\left(\frac{\pm (I - I_t)/20}{\sqrt{2}\,\delta I}\right), \qquad (2)$$
where I − It is the distance of the flux bias current I
from the error threshold It, and δI is the root-mean-
square noise current. The prefactor takes into account
that for either error mode only half of the bits in the
pattern contribute to the error rate. The factor of twenty
represents the transfer function between applied flux bias
and current induced in the loop containing the Josephson
junctions. Numerical fits were obtained with less than
1% asymptotic standard error. On the left, It = 0.66mA
and δI = 1.02µA; on the right, It = 3.04mA and δI =
1.56µA. At the optimal bias point of 1.82mA, the data
extrapolate to negligible BER for the gate under test.
The lowest rate actually measured is below $10^{-15}$ for the
entire circuit, including the output data link.
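Eq. 2 can be evaluated numerically as sketched below in Python. Far from the threshold the complementary error function underflows double precision, so the sketch switches to the standard large-argument approximation erfc(x) ≈ exp(−x²)/(x√π) and works with log10 of the error rate. The function names, the `sign` argument selecting the branch of the ± in Eq. 2, and the crossover at x = 20 are illustrative choices; the example uses hypothetical parameter values, and no claim is made that it reproduces the fitted curves of Fig. 5.

```python
import math


def _argument(bias, threshold, noise_rms, transfer, sign):
    """Argument of erfc in Eq. 2: +/-(I - It) / transfer / (sqrt(2) dI)."""
    return sign * (bias - threshold) / transfer / (math.sqrt(2.0) * noise_rms)


def ber(bias, threshold, noise_rms, transfer=20.0, sign=1.0):
    """Bit-error rate p = (1/4) erfc(x) from Eq. 2."""
    return 0.25 * math.erfc(_argument(bias, threshold, noise_rms,
                                      transfer, sign))


def log10_ber(bias, threshold, noise_rms, transfer=20.0, sign=1.0):
    """log10 of Eq. 2, usable where p itself would underflow."""
    x = _argument(bias, threshold, noise_rms, transfer, sign)
    if x < 20.0:
        return math.log10(0.25 * math.erfc(x))
    # Large-argument approximation: erfc(x) ~ exp(-x^2) / (x sqrt(pi)).
    return (-x * x / math.log(10.0)
            - math.log10(x * math.sqrt(math.pi))
            + math.log10(0.25))


if __name__ == "__main__":
    # Hypothetical values: threshold 1.00 mA, rms noise current 1 uA.
    threshold, noise = 1.00e-3, 1.0e-6
    for bias in (1.00e-3, 1.10e-3, 1.50e-3, 2.00e-3):
        print(f"I = {bias * 1e3:.2f} mA -> log10(p) = "
              f"{log10_ber(bias, threshold, noise):.1f}")
```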
Discussion
The extrapolated minimum BER in our test indicates
that device size and power could be scaled down still fur-
ther. The error mechanism may involve either storage
errors [17], decision errors [18], or even timing errors in
an over-clocked circuit [19]. In all of these cases, noise
current scales as the square root of the Josephson criti-
cal current, while the current scale of the error threshold
goes linearly. Applying this scaling to Eq. 2 indicates
that device size could be reduced by a factor of ten and
still extrapolate to a minimum error rate of $10^{-44}$, which
is negligible even for the most demanding applications including
high-end computing. As a practical matter, BER
below $10^{-44}$ could be maintained over a wide flux bias
margin of ±30% by scaling device size down by a factor
of four. Measured noise current in our test is consis-
tent with previous results for the gray zone of the RSFQ
comparator [20], [21], but with much lower bit error rates
relative to RSFQ circuits [22–24] due to the larger oper-
ating margins in RQL.
The chip power scales linearly with number of junc-
tions and frequency. We measured that with a 12.6µW
power supply, 800 junctions clocked at 6GHz have a
worst-case data-dependent power ratio of 0.91, corre-
sponding to a variation in bias current amplitude of ±2%.
Such a circuit scaled to $10^6$ junctions and with a manageable
maximum bias current variation of ±10% would
require a 6 mW power supply, amounting to only 15 mA
of current on a 50 Ω line. On the same circuit we measured
a 1.4 ps worst-case variable clock propagation delay,
independent of frequency. The given $10^6$ junction
circuit would correspond to twenty times less coupling to
the clock line. Because variable clock delay scales as $k^2$,
this circuit would have a timing variation of only 5 ps, or
only ±1% of the clock period at 6GHz. The AC clock
provides a stable clock reference that suppresses accumu-
lative clock jitter, so ultimately clock frequency is limited
by the switching time of the Josephson junction, which
scales linearly with feature size. At the 0.8µm lithogra-
phy node, we can expect a 70GHz maximum clock fre-
quency [25], or alternately a 6GHz operation with twelve
levels of logic per pipeline stage.
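The scaling argument above can be followed numerically with the Python sketch below. It rests on two assumptions that are a reading of the text rather than statements from it: the 15 mA figure is the peak amplitude of a sinusoidal current delivering 6 mW into 50 Ω, and the variable clock delay accumulates linearly with the number of junctions on the line while scaling as k². Function names are illustrative. Under these assumptions the current amplitude is about 15.5 mA and the scaled delay is about 4.4 ps, in line with the roughly 5 ps quoted.

```python
import math


def clock_current_amplitude(power_w, impedance_ohm):
    """Peak current of a sinusoid with average power P = I^2 Z / 2."""
    return math.sqrt(2.0 * power_w / impedance_ohm)


def scaled_clock_delay(delay_s, junction_scale, coupling_reduction):
    """Scale a measured variable delay to a larger circuit.

    Assumes the delay grows linearly with the number of junctions and
    falls as the square of the reduction in coupling constant k.
    """
    return delay_s * junction_scale / coupling_reduction**2


if __name__ == "__main__":
    amplitude = clock_current_amplitude(6e-3, 50.0)
    print(f"clock current amplitude = {amplitude * 1e3:.1f} mA")

    # 800 measured junctions scaled to 1e6, with 20 times less coupling.
    delay = scaled_clock_delay(1.4e-12, 1e6 / 800, 20.0)
    period = 1.0 / 6e9
    print(f"scaled variable delay   = {delay * 1e12:.1f} ps "
          f"({100 * delay / period:.1f}% of the 6 GHz period)")
```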
AC power distribution on-chip will benefit from the
exceptional microwave properties of superconducting materials
that have found applications ranging from single-photon
qubit resonators with Q of $10^4$ to THz dark matter
detectors [26, 27]. Monolithic integration of RQL
gates with microwave components, including power split-
ters, matching networks, and phase shifters, is a strength
of the technology. Power dissipation in these passive com-
ponents is as low as 1% per wavelength [28, 29], which
would correspond to only 2.3% of the applied power
in our shift register experiment. In the cryopackage, a
15mA amplitude for the clock is consistent with low heat
transport on the wires [30]. Very high clock rates up to
71GHz have already been demonstrated for Josephson
voltage standards using waveguides in the cryopackage
[31, 32].
Computational efficiency of the measured circuits is
approaching 1000 kBT, with further reductions expected
using smaller devices, giving unmatched efficiency in
terms of operations per Joule. This means the technol-
ogy offers a low energy solution for high end computing
even after taking into account the overhead of the cry-
ocooler, on the order of 1000W/W at 4.2K [33]. Because
the 700GHz energy gap in Nb makes superconductors in-
herently radiation-hard [34], the technology may be use-
ful for computationally intensive applications in space.
Since device size and power can be scaled with tempera-
ture to remain in the noise-limited regime, the technology
would be ideal for classical control, readout, and error-
correction feedback for solid state qubits [35] operating
at millikelvin.
This work was supported in part by the Defense Micro-
Electronics Activity under the Advanced Technology
Support Program. The authors thank Marc Manheimer
for discussion, John Fusco for administration, and ac-
knowledge assistance from Isaac Carruthers with the soft-
ware design environment, from Donald Miller and Steve
Shauck with the design.
∗ quentin.herr@ngc.com
[1] Data Center Report to Congress, U.S. Environmental
Protection Agency, www.energystar.gov, (2007).
[2] ITRS 2004 Update: Emerging Research De-
vices, Intern. Technol. Roadmap for Semicond.,
www.itrs.net/links/2004Update/2004 05 ERD.pdf
(2004).
[3] Q.P. Herr, A.D. Smith and M.S. Wire, High speed data
link between digital superconductor chips, Appl. Phys.
Lett., 80 (2002), pp. 3210–3212.
[4] K.K. Likharev and V.K. Semenov, RSFQ logic/memory
family: a new Josephson-Junction digital technology
for sub-terahertz-clock-frequency digital systems, IEEE
Trans. Appl. Supercond., 1 (1991), pp. 3–28.
[5] W. Chen, A.V. Rylyakov, V. Patel, J.E. Lukens and K.K.
Likharev, Rapid Single Flux Quantum T-Flip Flop oper-
ating up to 770GHz, IEEE Trans. Appl. Supercond., 9
(1999), pp. 3212–3215.
[6] I.V. Vernik, D.E. Kirichenko, et al., Cryocooled Wideband
Digital Channelizing RF Receiver Based on Low-Pass ADC,
Ext. Abs of ISEC’07, Washington (2007),
pp. 71–74.
[7] A.Y. Herr, RSFQ baseband digital signal processing,
IEICE Trans. Electron., E91-C, (2008), pp. 293–305.
[8] M. Tanaka, T. Kawamoto, Y. Yamanashi, Y. Kamiya,
A. Akimoto and K. Fujiwara, Design of a pipelined 8-bit-
serial single-flux-quantum microprocessor with multiple
ALUs, Supercond. Sci. Technol., 19, (2006), S344.
[9] O.A. Mukhanov, D. Kirichenko, et al., Superconductor
Digital-RF Receiver Systems, IEICE Trans. Electron.,
E91-C, (2008), pp. 306–317.
[10] Y. Hashimoto, S. Yorozu and Y. Kameda, Development
of Cryopackaging and I/O technologies for high-speed su-
perconductive digital systems, IEICE Trans. Electron.,
E91-C, (2008), pp. 325–328.
[11] I. Kataeva, H. Engseth and A. Kidiyarova-Shevchenko,
New Design of an RSFQ Parallel Multiply-Accumulate
Unit, Supercond. Sci. Technol., 19, (2006), S381.
[12] Hypres Nb design rules, www.hypres.com, (2010).
[13] R. Landauer, Irreversibility and heat generation in the
computing process, IBM J. Res. and Devel., 5, (1961),
pp. 183–191.
[14] A. Barone and G. Paterno, Physics and Applications of
the Josephson Effect, Wiley, (1982), Chap 6.
[15] S.R. Whiteley, Josephson junctions in SPICE3, IEEE
Trans. Magn., 27, (1991), pp. 2902–2905.
[16] Q.P. Herr, A high-efficiency superconductor distributed
amplifier, Supercond. Sci. Technol., 23, (2010), 022004
(4pp).
[17] M. Klein and A. Mukherjee, Thermal noise induced
switching of Josephson logic devices, Appl. Phys. Lett.,
40, (1982), pp. 744–747.
[18] T.V. Filippov and V.K. Kornev, Sensitivity of the
balanced Josephson-junction comparator, IEEE Trans.
Magn., 27, (1991), pp. 2452–2455.
[19] A.V. Rylyakov and K.K. Likharev, Pulse jitter and tim-
ing errors in RSFQ circuits, IEEE Trans. Appl. Super-
cond., 9, (1999), pp. 3539–3544.
[20] V.K. Semenov, T.V. Filippov, Y.A. Polyakov and K.K.
Likharev, SFQ balanced comparators at a finite sampling
rate, IEEE Trans. Appl. Supercond., 7, (1997), pp. 3617–
3621.
[21] T.V. Filippov, Y.A. Polyakov, V.K. Semenov and K.K.
Likharev, Signal resolution of RSFQ comparators, IEEE
Trans. Appl. Supercond., 5, (1995), pp. 2240–2243.
[22] Q.P. Herr and M.J. Feldman, Error rate of a supercon-
ducting circuit, Appl. Phys. Lett., 69, (1996), pp. 694–
695.
[23] P. Bunyk and P. Litskevitch, Case study in RSFQ de-
sign: fast pipelined parallel adder, IEEE Trans. Appl.
Supercond., 9, (1999), pp. 3714–3720.
[24] Q.P. Herr, M.W. Johnson and M.J. Feldman,
Temperature-dependent bit-error rate of a clocked
superconducting digital circuit, IEEE Trans. Appl.
Supercond., 9, (1999), pp. 3594–3597.
[25] L.A. Abelson and G.L. Kerber, Superconductor integrated
circuit fabrication technology, Proceedings of the IEEE,
92, (2004), pp. 1517–1533.
[26] A. Palacios-Laloy, F. Nguyen, F. Mallet, P. Bertet, D.
Vion and D. Esteve, Tunable resonators for quantum cir-
cuits, J. Low Temp. Phys., 151, (2008) pp. 1034–1042.
[27] J. Zmuidzinas and P.L. Richards, Superconducting detectors
and mixers for millimeter and submillimeter astrophysics,
Proceedings of the IEEE, 92, (2004), pp. 1597–
1616.
[28] A. Vayonakis, C. Luo, H.G. Leduc, R. Schoelkopf and
J. Zmuidzinas, The millimeter-wave properties of superconducting
microstrip lines, AIP Conf. Proc., 605, (2002),
pp. 539–542.
[29] M. Bin, M.G. Gaidis, J. Zmuidzinas and T.G. Phillips,
Low-noise 1 THz niobium superconducting tunnel junc-
tion mixer with a normal metal tuning circuit, Appl.
Phys. Lett., 68, (1996), pp. 1714–1716.
[30] D. Gupta, A.M. Kadin, et al., Integration of cryocooled
superconducting Analog-to-Digital converter and SiGe
output amplifier, IEEE Trans. on Appl. Supercond., 13,
(2003), pp. 477–483.
[31] L. Palafox, G. Ramm, R. Behr, W.G.K Ihlenfeld and
H. Moser, Primary AC power standard based on pro-
grammable Josephson junction arrays, IEEE Trans. In-
strum. Meas., 56, (2007), pp. 534–537.
[32] S.P. Benz and C.A. Hamilton, Application of the
Josephson effect to voltage metrology, Proceedings of the
IEEE, 92, (2004), pp. 1617–1629.
[33] R. Radebaugh, Cryocoolers: the state of the art and
recent developments, J. Phys.: Condens. Matter, 21, (2009),
164219 (9pp).
[34] S.E. King, R. Magno and W.G. Maisch, Radiation damage
assessment of Nb tunnel junction devices, IEEE Trans.
Nuc. Sci., 38, (1991), pp. 1359–1364.
[35] V.K. Semenov and D.V. Averin, SFQ control circuits
for Josephson junction qubits, IEEE Trans. Appl. Super-
cond., 13, (2003), pp. 960–965.
