Implementation of phase-mode arithmetic elements for parallel signal processing by Onomi  Takeshi et al.
IEEE TRANSACTIONS ON APPLIED SUPERCONDUCTIVITY, VOL. 13, NO. 2, JUNE 2003 583
Implementation of Phase-Mode Arithmetic
Elements for Parallel Signal Processing
Takeshi Onomi, Yohei Horima, Masayuki Kobori, Itsuhei Shimizu, and Koji Nakajima
Abstract—We report the preliminary designs and the exper-
imental results of high-speed digital processing elements based
on phase-mode logic circuits. The core cell of these elements is a
bit-serial adder cell consisting of the ICF gate which is the basic
gate of phase-mode logic. Our main target is the application of the
logic circuits to Digital Signal Processing. The basic arithmetic
operations of DSP are multiplication and addition. The basic
concept of the phase-mode pipelined parallel multiplier has been
proposed previously. We design a 2 × 2 AND array block and a
2-bit ripple-carry adder for the primitive parallel pipelined multiplier,
and also a 2-bit subtracter with a pipelined structure. These
processing elements have been fabricated using the NEC standard 2.5
kA/cm² Nb/AlOx/Nb process. The low-speed test results of these
elements show correct operations. Numerical simulations show
that a carry save adder (a 2-bit ripple carry adder) can operate
over 10 GHz. We also discuss the prospects of large-scale SFQ
DSP based on Nb junction technology.
Index Terms—Logic circuit, phase mode, signal processing,
single flux quantum, superconductive device.
I. INTRODUCTION
Single flux quantum (SFQ) logic circuits have attracted attention for many digital
applications because of their extremely low power dissipation and
high-speed operation. We have proposed the phase-mode logic,
which utilizes SFQ’s and their interactions for digital computation
[1]. In the phase-mode system, SFQ’s function as
information bits; furthermore, all logic functions are achieved
by using propagating pulses and interactions between SFQ’s.
There is no fundamental difference between phase-mode logic and
rapid single flux quantum (RSFQ) logic [2] at the root of
the two concepts. However, we are studying two significant
concepts as follows: 1) We have proposed an ICF (INHIBIT
controlled by fluxon) gate as the basic device of the phase-mode
logic for circuit simplicity [3]. 2) A bit-serial adder cell can
be easily constructed by using the storage function of an SFQ
in the ICF gate; therefore, this advantage is useful for signal
processing circuits. Some fundamental digital circuits have
been fabricated using the ICF gates based on branched circuits
of Josephson transmission lines (JTL’s) [4], [5]. We have also
introduced an ICF gate by an RSFQ circuit scheme [5], [6].
Recently, we have been studying a parallel processing system in the
phase-mode logic [7]. Toward such a large system, we have
proposed new gates with high yield and large parameter
margins [6].
Manuscript received August 9, 2002. This work was supported in part by
the Special Coordination Funds for promoting Science and Technology of the
Ministry of Education, Culture, Sports, Science and Technology of the Japanese
Government.
The authors are with the Laboratory for Electronic Intelligent Systems,
Research Institute of Electrical Communications, Tohoku University, Sendai,
980-8577 Japan (e-mail: onomi@riec.tohoku.ac.jp; yhorima@nakajima.riec.tohoku.ac.jp;
m-kobori@nakajima.riec.tohoku.ac.jp; a-itsu@nakajima.riec.tohoku.ac.jp;
hello@nakajima.riec.tohoku.ac.jp).
Digital Object Identifier 10.1109/TASC.2003.813952
Fig. 1. ICF gate. (a) Circuit diagram. (b) Moore diagram. (c) Symbol.
In this paper, we report the preliminary designs and the
experimental results of high-speed digital processing elements
based on phase-mode logic. Some basic arithmetic elements
are fabricated by NEC 2.5 kA/cm² Nb/AlOx/Nb Josephson
junction technology and successfully tested. We also discuss
the prospects of large-scale SFQ DSP based on Nb junction
technology.
II. APPLICATION OF ICF GATE FOR ARITHMETIC ELEMENTS
A. Basic Elements of Phase-Mode Logic
We have proposed an ICF gate as the basic device of the
phase-mode logic. This gate has two inputs (X,Y), two outputs
(A,B), and a reset input (Re) as shown in Fig. 1. The function of
this gate is to control the destination (A or B) of an SFQ from
the X input by using an SFQ from the Y input. The gate is
reset after every operation by an SFQ from Re, which serves as a timing signal.
The A output and the B output represent logical INHIBIT and
AND, respectively. Because INHIBIT is a universal operator,
all logical functions can be constructed by combining
ICF gates. However, we also use the AND function provided
by the gate, because a logical representation that combines
only INHIBIT functions becomes lengthy [1]. To obtain large
operating regions of circuit parameters, we have proposed that the two
functions of INHIBIT and AND be executed in separate
gates, as shown in Fig. 2 [6].
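The logical behavior described above can be illustrated with a short behavioral sketch in Python. It models only the Boolean function within one reset period (A = X AND NOT Y, B = X AND Y) and ignores pulse timing and circuit parameters. The function names and the use of a constant 1 as the clock pulse on X are illustrative assumptions, not part of the fabricated design.

```python
def icf_gate(x: int, y: int) -> tuple[int, int]:
    """Behavioral model of one ICF gate operation (between two reset pulses).

    x, y are 1 if an SFQ arrived on that input, else 0.
    Returns (A, B): A is INHIBIT (X AND NOT Y), B is AND (X AND Y).
    """
    a = x & (1 - y)  # the X fluxon exits at A when no Y fluxon is present
    b = x & y        # the X fluxon exits at B when a Y fluxon is present
    return a, b


def inhibit(x: int, y: int) -> int:
    return icf_gate(x, y)[0]


# Other functions built only from INHIBIT. A constant 1 on X stands in
# for a clock pulse (assumption made for this sketch).
def not_(y: int) -> int:
    return inhibit(1, y)


def and_(x: int, y: int) -> int:
    return inhibit(x, not_(y))


def or_(x: int, y: int) -> int:
    return not_(inhibit(not_(x), y))


if __name__ == "__main__":
    for x in (0, 1):
        for y in (0, 1):
            print(x, y, icf_gate(x, y), and_(x, y), or_(x, y))
```

The AND built from INHIBIT alone needs two gates, whereas the B output gives it directly, which is consistent with the remark that an INHIBIT-only representation becomes lengthy.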
Most arithmetic processes in logic circuits are based on
addition; therefore, the key device in the arithmetic units is
an adder circuit. A serial-input adder cell is easily achieved by
feeding the A output of an ICF gate back to the Y input. Fig. 3 shows
the adder cell using an ICF gate [7].
1051-8223/03$17.00 © 2003 IEEE
Fig. 2. (a) INHIBIT gate. Device parameters are Ic1 = Ic3 = 0.12 mA,
Ic2 = 0.195 mA, Ic4 = 0.25 mA, Ic5 = 0.14 mA, Ic6 = 0.125 mA,
Ic7 = 0.275 mA, Ic8 = 0.145 mA, Ic9 = 0.25 mA, Ic10 = 0.23 mA,
Ic11 = 0.2 mA, L1 = 0.8 pH, L2 = 3.8 pH, L3 = 0.6 pH, L4 = 3.2 pH, L5 = 1.7 pH,
and L6 = 2.5 pH, L7 = 0.7 pH, L8 = 2.5 pH, L9 = 1.2 pH,
Ib1 = 0.261 mA, and Ib2 = 0.224 mA, respectively. (b) AND gate. Device
parameters are Ic1 = Ic3 = 0.23 mA, Ic2 = 0.16 mA, Ic4 = 0.18 mA,
Ic5 = 0.19 mA, Ic6 = 0.22 mA, and L1 = L2 = 2.0 pH, respectively.
Fig. 3. Adder cell. (a) Circuit diagram. Device parameters are Ic1 = Ic4 =
0.20 mA, Ic2 = 0.19 mA, Ic3 = 0.38 mA, Ic5 = 0.15 mA, Ic6 = 0.32 mA,
Ic7 = 0.18 mA, Ic8 = 0.11 mA, Ic9 = 0.25 mA, L1 = 1.0 pH, L2 = 0.7 pH,
and L3 = L4 = 0.5 pH, L5 = 2.2 pH, L6 = 1.7 pH, L7 = 0.8 pH, L8 = 3.0 pH,
Ib1 = 0.41 mA, and Ib2 = 0.34 mA, respectively. (b) Moore
diagram. (c) Symbol.
B. Multiplier
We have proposed a multiplier, which executes one of the
most intensive operations in signal processing circuits [7].
Fig. 4 shows a block diagram of our multiplier. The proposed
multiplier has an AND array for generating the partial products of
a multiplication and a Wallace-tree structure comprising trees of
carry save adders (CSA’s) for the addition of partial products.
This structure has regularity in its layout; hence, it is suitable
for a pipelined scheme.
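The arithmetic performed by this structure can be sketched at the bit level in Python. The sketch is a functional illustration only: the function names are hypothetical, the grouping of rows in the reduction loop is one possible arrangement rather than the layout of Fig. 4, and Python's built-in addition stands in for the final carry-propagating adder.

```python
def partial_products(a: int, b: int, n: int = 4) -> list[int]:
    """AND array: row j holds (a_i AND b_j) for all i, weighted by 2**(i + j)."""
    rows = []
    for j in range(n):
        bj = (b >> j) & 1
        row = 0
        for i in range(n):
            ai = (a >> i) & 1
            row |= (ai & bj) << (i + j)
        rows.append(row)
    return rows


def carry_save_add(x: int, y: int, z: int) -> tuple[int, int]:
    """Reduce three operands to a sum word and a carry word (no carry propagation)."""
    sum_word = x ^ y ^ z
    carry_word = ((x & y) | (y & z) | (x & z)) << 1
    return sum_word, carry_word


def multiply(a: int, b: int, n: int = 4) -> int:
    rows = partial_products(a, b, n)
    # Carry-save reduction: every group of three rows becomes two rows.
    while len(rows) > 2:
        reduced = []
        for k in range(0, len(rows) - 2, 3):
            reduced.extend(carry_save_add(rows[k], rows[k + 1], rows[k + 2]))
        reduced.extend(rows[len(rows) - len(rows) % 3:])  # rows left ungrouped
        rows = reduced
    # Final carry-propagating addition (stand-in for a ripple carry adder or CLA).
    return sum(rows)


if __name__ == "__main__":
    print(multiply(13, 11))  # 4-bit x 4-bit example
```

Because each carry-save stage has no carry chain, every stage has the same short delay, which is what makes the structure a natural fit for pipelining.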
We have designed and fabricated some basic blocks to realize
our multiplier by the NEC 2.5 kA/cm² Nb/AlOx/Nb standard
process [8]. Fig. 5(a) and (b) show a block diagram and
CAD layout of a preliminary 2 × 2-bit AND array, which consists
of four AND gates shown in Fig. 2(b). The total number
of Josephson junctions is 212 in this whole circuit (including
In/Out interface circuits). Fig. 5(c) shows the low-speed test result
for the AND(2,1) gate. We have also confirmed the same results
from the other AND gates. The measured lower and upper bias
margins are 17% and 22%, respectively.
Fig. 4. A 4-bit × 4-bit multiplier.
Fig. 5. 2 × 2-bit AND array. (a) Block diagram. (b) CAD layout. (c) Low-speed
test result for the AND(2,1) gate. The three upper traces show input currents
of DC/SFQ converters. The bottom trace shows the output voltage of the SFQ
detector. The voltage of the detector returns to zero by another signal which is
not shown in this figure. The same results for the other AND gates have also
been confirmed.
Fig. 6. (3, 2) Carry save adder. After three serial inputs from A1 are added,
the result of the addition is sent to outputs (S2, S1) by the reset signal. (a) Block
diagram. (This CSA can operate as a 2-bit ripple carry adder by using the additional
terminals A2 and C2.) (b) Low-speed test result.
Partial products generated by an AND array are added in the
next stage of carry save adders. We have designed and fabricated
a (3-input, 2-output) carry save adder unit, which consists of two
adder cells shown in Fig. 3. Fig. 6(a) shows the block diagram
of the CSA. This CSA can operate as a 2-bit ripple carry adder
by using the additional terminals A2 and C2. Fig. 6(b) shows
the low-speed test result of the CSA. This result shows the proper
operation of addition. The measured lower and upper
bias margins are 21% and 29%, respectively.
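The counting behavior of the (3, 2) CSA can be sketched in Python under a simplifying assumption: each adder cell is modeled as a one-bit store that toggles on every input pulse and emits a carry pulse when it returns from 1 to 0, and the stored bits are read out on reset. This behavioral model, and the names `AdderCell` and `carry_save_add_3_2`, are illustrative assumptions; the Moore diagram of Fig. 3(b) is the authoritative description of the cell.

```python
class AdderCell:
    """Assumed behavioral model of a serial-input adder cell (one stored bit)."""

    def __init__(self) -> None:
        self.state = 0

    def pulse(self) -> int:
        """Apply one input SFQ; return 1 if a carry pulse is emitted."""
        carry = self.state  # a stored 1 plus an incoming 1 overflows
        self.state ^= 1
        return carry

    def reset(self) -> int:
        """Read out the stored bit and clear the cell."""
        out, self.state = self.state, 0
        return out


def carry_save_add_3_2(bits: tuple[int, int, int]) -> tuple[int, int]:
    """Add three serial one-bit inputs; return (S2, S1) on reset."""
    cell1, cell2 = AdderCell(), AdderCell()
    for bit in bits:
        # Only a bit equal to 1 is a pulse; a carry from cell 1 drives cell 2.
        if bit and cell1.pulse():
            cell2.pulse()
    s1 = cell1.reset()
    s2 = cell2.reset()
    return s2, s1


if __name__ == "__main__":
    for bits in [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1)]:
        print(bits, carry_save_add_3_2(bits))
```

In this model the pair (S2, S1) is simply the binary count of the input pulses, which is the defining property of a (3, 2) counter.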
We have successfully tested the most primitive 2-bit multiplier,
which consists of the 2 × 2-bit AND array and the 2-bit
ripple carry adder [9]. In order to realize a larger-scale multiplier,
a carry lookahead adder (CLA) is necessary for high-speed operation.
We have already confirmed the operation of a delta operator
cell, which is a core cell of the CLA, and the design of
the CLA is now in progress. Moreover, we recently proposed a Booth
encoder for generating partial products [9]. Using this
algorithm for a large-scale multiplier, we can generate partial
products more efficiently.
C. Subtracter
We have designed and fabricated a primitive 2-bit subtracter,
which consists of two INHIBIT gates shown in Fig. 2 and two
adder cells shown in Fig. 3. Subtraction is an important arithmetic
operation for various signal processing tasks (e.g., the radix-2 butterfly circuit
in an FFT). The calculation principle of subtraction is also based
on addition. In addition, an inverter stage and a 1’s complementer
input are necessary in the stage preceding the addition.
Fig. 7(a) shows the block diagram of the subtracter. The inverter
cell consists of an INHIBIT gate in which the X and Y inputs shown in
Fig. 2(a) serve as the inverter clock and the data input
shown in Fig. 7(a), respectively. The total number of
Josephson junctions is 172 in this circuit, including In/Out interface
circuits. Fig. 7(b) shows the low-speed test result of the
subtracter. The upper photograph shows the proper operation of
the addition for the b1 and b2 inputs in the adder circuit shown in
Fig. 7(a). On the other hand, the lower photograph shows the
outputs from the adder circuit which processes signals from the
inverter. The measured lower and upper bias margins are
13% and 21%, respectively.
Fig. 7. 2-bit subtracter. (a) Block diagram. [The inverter cell consists of an
INHIBIT gate in which X and Y shown in Fig. 2(a) are defined as an inverter clock and
a data input, respectively.] (b) Low-speed test result. The upper photograph
shows the proper operation of the addition in the adder circuit shown in (a). The
lower photograph shows the outputs from the adder circuit which processes
signals from the inverter. The same result for the a1 input has also been confirmed.
Although this result is for a preliminary 2-bit subtracter
based on a ripple carry adder, the subtracter has
a parallel pipelined structure. For a larger-scale subtracter,
a carry lookahead adder is necessary for high-speed operation,
as mentioned in the previous section. We will discuss its
performance in the next section.
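The principle of building subtraction on addition can be sketched as follows. The sketch assumes the usual two's-complement identity, minuend − subtrahend = minuend + NOT(subtrahend) + 1, and reads the "1's complementer input" as the extra 1 injected into the adder; that reading, the function names, and the choice of which operand passes through the inverter are assumptions made for illustration. The inverter is modeled as an INHIBIT with the clock on X and the data on Y, as in the text.

```python
def inverter(data_bit: int, clock: int = 1) -> int:
    """INHIBIT used as an inverter: output = clock AND NOT data."""
    return clock & (1 - data_bit)


def ripple_carry_add(a_bits: list[int], b_bits: list[int], carry_in: int = 0):
    """Add two little-endian bit lists; return (sum_bits, carry_out)."""
    sum_bits = []
    carry = carry_in
    for a, b in zip(a_bits, b_bits):
        sum_bits.append(a ^ b ^ carry)
        carry = (a & b) | (b & carry) | (a & carry)
    return sum_bits, carry


def subtract(minuend: int, subtrahend: int, n: int = 2) -> tuple[int, int]:
    """Return (difference mod 2**n, carry_out); carry_out == 1 means no borrow."""
    m_bits = [(minuend >> i) & 1 for i in range(n)]
    s_inverted = [inverter((subtrahend >> i) & 1) for i in range(n)]
    # The carry-in of 1 completes the two's complement of the subtrahend.
    diff_bits, carry_out = ripple_carry_add(m_bits, s_inverted, carry_in=1)
    difference = sum(bit << i for i, bit in enumerate(diff_bits))
    return difference, carry_out


if __name__ == "__main__":
    for m in range(4):
        for s in range(4):
            print(m, s, subtract(m, s))
```

The same adder hardware therefore serves both operations; only the inverter stage and the injected 1 distinguish a subtracter from an adder.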
III. DISCUSSION
In this section, we discuss the prospects of large-scale
SFQ DSP based on Nb junction technology. Using the NEC 2.5
kA/cm² Nb/AlOx/Nb technology, one is able to fabricate circuits
containing a few thousand Josephson junctions with
high yield. Although the results reported in this paper concern
preliminary arithmetic units with a few hundred junctions,
we discuss the performance and the prospects toward LSI.
TABLE I
ESTIMATIONS OF OPERATION SPEED
The increasing number of connection wires between logic
cells and their signal propagation delays are serious problems
in the realization of SFQ LSI. Table I shows the estimated
operation speeds calculated with the numerical simulator JSIM.
Assuming that the logic cells are connected by one JTL buffer
with two Josephson junctions, the circuits can operate at 22.2
GHz. In these cases, the operation speed is limited by the carry
signal delay of the ripple carry adder. However, the maximum operation
frequencies of the actually fabricated circuits do not reach this
ideal frequency because the designed adder cells are connected
by five JTL buffers. We mainly use a line width of 10 µm for wiring.
(We do not use wires with a line width of less than
10 µm as inductors. It may be difficult to design appropriate
circuits because the line width becomes of the same order as the sizes
of junction areas, contact holes, and shunt resistors including
alignment margins.) The circuits are designed by using upper and
lower wiring layers with sheet inductances of 1.0 pH and 0.5
pH, respectively. A layout pattern using these two layers must
include roundabout wire connections of JTL’s because of geometrical
limitations. If we can use another wiring layer with
relatively low sheet inductance, it is possible to decrease the
propagation delay over middle-range distances (a few hundred µm)
without using a passive microstrip line. Of course,
passive microstrip lines are necessary to propagate SFQ’s between
arithmetic blocks. Thus, high-speed operation
of larger-scale circuits requires not only improving the junction
characteristics but also improving the circuit integration technology
(e.g., a multilayer wiring technology).
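A rough feel for the wiring constraint can be obtained from the sheet inductance figures above. The sketch below assumes that the quoted values are per square, so that a straight wire contributes the sheet inductance multiplied by its length-to-width ratio; it ignores corners, fringing, and coupling, and the 200 µm length is only an illustrative value within the "few hundred µm" range. The function and dictionary names are hypothetical.

```python
# Sheet inductances quoted in the text, assumed here to be per square.
SHEET_INDUCTANCE_PH = {"upper": 1.0, "lower": 0.5}


def wire_inductance_ph(length_um: float, width_um: float = 10.0,
                       layer: str = "upper") -> float:
    """Estimate the inductance of a straight wire as L_sheet * (length / width)."""
    squares = length_um / width_um
    return SHEET_INDUCTANCE_PH[layer] * squares


if __name__ == "__main__":
    for layer in ("upper", "lower"):
        # 200 um is an illustrative mid-range distance, not a measured value.
        print(layer, wire_inductance_ph(200.0, layer=layer), "pH")
```

Under these assumptions a 200 µm wire of 10 µm width comes to 20 pH on the upper layer and 10 pH on the lower layer, well above the cell inductances of a few pH listed in Figs. 2 and 3, which illustrates why such distances are bridged with JTL buffers unless a wiring layer with lower sheet inductance is available.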
IV. CONCLUSION
We have reported the preliminary designs and the experimental
results of high-speed digital processing elements based
on phase-mode logic circuits. We have designed a 2 × 2 AND array
block, a 2-bit ripple-carry adder for the primitive multiplier, and
a 2-bit subtracter with a pipelined structure. These processing
elements have been fabricated using the NEC standard 2.5 kA/cm²
Nb/AlOx/Nb process. The low-speed test results of these elements
have shown correct operation. Numerical simulations
show that the 2-bit ripple carry adder can operate over 10 GHz.
This operation frequency is limited by the propagation delay of
SFQ’s. Using a multilayer wiring technology, we can
expect higher-frequency operation.
ACKNOWLEDGMENT
This work was partially carried out at the Laboratory for Elec-
tronic Intelligent Systems, Research Institute of Electrical Com-
munications, Tohoku University.
REFERENCES
[1] K. Nakajima, H. Mizusawa, H. Sugahara, and Y. Sawada, “Phase-mode
Josephson computer system,” IEEE Trans. Appl. Superconduct., vol. 1,
pp. 29–36, Mar. 1991.
[2] K. K. Likharev and V. K. Semenov, “RSFQ logic/memory family: A new
Josephson-junction technology for sub-terahertz-clock-frequency digital
systems,” IEEE Trans. Appl. Superconduct., vol. 1, pp. 3–28, Mar. 1991.
[3] K. Nakajima, G. Oya, and Y. Sawada, “Fluxoid motion in phase mode
Josephson switching system,” IEEE Trans. Magn., vol. MAG-19,
pp. 1201–1204, May 1983.
[4] T. Onomi, Y. Mizugaki, K. Nakajima, and T. Yamashita, “Design and
fabrication of an adder circuit in the extended phase-mode logic,” IEEE
Trans. Appl. Superconduct., vol. 7, pp. 3172–3175, June 1997.
[5] T. Onomi, Y. Mizugaki, H. Satoh, T. Yamashita, and K. Nakajima,
“Phase-mode circuits for high-performance logic,” IEICE Trans.
Electron., vol. E81-C, pp. 1608–1617, Oct. 1998.
[6] T. Onomi, K. Yanagisawa, and K. Nakajima, “New phase-mode logic
gates with large operating regions of circuit parameters,” IEEE Trans.
Appl. Superconduct., vol. 11, pp. 974–977, Mar. 2001.
[7] T. Onomi, K. Yanagisawa, M. Seki, and K. Nakajima, “Phase-mode
pipelined parallel multiplier,” IEEE Trans. Appl. Superconduct., vol. 11,
pp. 541–547, Mar. 2001.
[8] S. Nagasawa, Y. Hashimoto, H. Numata, and S. Tahara, “A 380 ps,
9.5 mW Josephson 4-Kbit RAM operated at a high bit yield,” IEEE
Trans. Appl. Superconduct., vol. 5, pp. 2447–2452, June 1995.
[9] Y. Horima, I. Shimizu, M. Kobori, T. Onomi, and K. Nakajima,
“Comparison between an AND array and a Booth encoder for large-scale
phase-mode multipliers,” IEICE Trans. Electron., submitted for
publication.
