New BSFQ circuit designs with wide margins by Teh CK et al.
970 IEEE TRANSACTIONS ON APPLIED SUPERCONDUCTIVITY, VOL. 11, NO. 1, MARCH 2001
New BSFQ Circuit Designs with Wide Margins 
Chen Kong Teh and Yoichi Okabe 
Abstract-Recently we have proposed novel Boolean
Single-Flux-Quantum (BSFQ) circuits, which, just like CMOS
circuits, support Boolean primitives directly and do not require
local synchronization for each operation cell. However, previous
BSFQ AND, OR, and XOR cells suffered from narrow margins:
their critical margins hardly exceeded ±10% due to low flux
gain. Furthermore, while suitable for combinational circuits,
previous BSFQ NOT cells had initialization problems in
sequential circuits. In this paper, new versions of these circuits
with simulated margins beyond ±30% are proposed. Moreover,
a Muller C-element, an error canceller, a destructive read-out
(DRO) cell, and a demultiplexer are newly created. The
operation time, parameter margins, and circuit size of these
BSFQ cells are comparable to those of conventional RSFQ
cells.
Index Terms-asynchronous circuits, BSFQ, Boolean primitives,
dual-rail, flux level, level logic.
I. INTRODUCTION 
The distinct differences between the Boolean
Single-Flux-Quantum (BSFQ) logic system and other types
of single-flux-quantum (SFQ) logic systems are the
implementation of operation timing and the type of operation
primitives supported. In the BSFQ logic system, a Boolean
signal is represented by a “set” SFQ pulse at the rising edge
of the signal and a “reset” SFQ pulse at the falling edge of
the signal [1]. These set and reset pulses are transferred over
a dual-rail Josephson transmission line (JTL) toward BSFQ
cells, where they are converted into superconducting flux
levels for performing Boolean operations, and the results are
output in the form of set-reset pulses. Thus, there is no need
for local synchronization of each BSFQ cell, and Boolean
primitives are supported directly, just as in CMOS logic.
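The set/reset encoding described above can be illustrated with a short Python sketch. It is only a behavioral model on a sampled waveform, not a circuit simulation; the function names and the assumption that every line starts at logic 0 are ours and do not come from the paper.

```python
def level_to_pulses(levels):
    """Encode a sampled Boolean waveform as (time_index, rail) pulse events."""
    pulses = []
    previous = 0  # assumption: the line starts at logic 0
    for t, level in enumerate(levels):
        if level and not previous:
            pulses.append((t, "set"))    # rising edge -> set pulse
        elif previous and not level:
            pulses.append((t, "reset"))  # falling edge -> reset pulse
        previous = level
    return pulses


def pulses_to_levels(pulses, length):
    """Rebuild the Boolean waveform from set/reset pulse events."""
    events = dict(pulses)
    state = 0  # same assumed initial level as in the encoder
    levels = []
    for t in range(length):
        rail = events.get(t)
        if rail == "set":
            state = 1
        elif rail == "reset":
            state = 0
        levels.append(state)
    return levels


waveform = [0, 1, 1, 0, 0, 1, 0]
events = level_to_pulses(waveform)
restored = pulses_to_levels(events, len(waveform))
```

Because each pulse marks a level transition, the pulse stream alone carries the full Boolean waveform, which is why no separate clock is needed to interpret it in this model.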
For conventional Rapid Single-Flux-Quantum (RSFQ) 
logic, clock signals are required by each operation cell for 
implementing timing windows on their data signals [2]. 
Compared to other SFQ circuits, RSFQ circuits require few 
Josephson junctions. However, clock skew might become a 
problem for large-scale circuits operating at clock-speed ap- 
proaching a terahertz, since timing uncertainty exists in the 
clock distribution tree due to process variations and thermal 
fluctuation. 
Manuscript received September 18, 2000. This study was performed
through Special Coordination Funds for Promoting Science and Technology
of the Science and Technology Agency of the Japanese Government.
C. K. Teh is with the Department of Electronic Engineering, University
of Tokyo, and also with the Research Center for Advanced Science and
Technology, University of Tokyo, 4-6-1 Komaba, Meguro-ku, Tokyo
153-8904, Japan (telephone: +81-3-5452-5153, e-mail:
teh@okabe.rcast.u-tokyo.ac.jp).
Y. Okabe is with the Research Center for Advanced Science and
Technology, University of Tokyo, 4-6-1 Komaba, Meguro-ku, Tokyo 153-8904,
Japan.
Several types of asynchronous SFQ logic systems have 
been proposed in recent years, such as Data-Driven 
Self-Timing logic [3] and Pulse-Driven Dual-Rail logic [4]. 
These types of asynchronous circuits eliminate the need
for global clock signals at the local operation cells, since
timing information is encoded in the dual-rail data signals.
However, the number of Josephson junctions required to
implement these asynchronous circuits is comparatively large,
since the timing information must be decoded from the data
signals and then directed to the local operation cells of these
circuits. As a result, the circuits are bulky, and their layout
designs are complicated.
There are other types of asynchronous logic systems that
do not require local synchronization at all, such as Delay
Insensitive (DI) logic [5]. These circuits have the potential to
offer high performance, since they operate at an average speed
rather than a worst-case speed. However, DI logic is
event-based logic, which is unusual in modern VLSI
technology. Thus, DI logic might be unable to utilize the fruitful
knowledge base of today’s VLSI technology, which is based
on Boolean primitives. Moreover, DI circuits require large
areas for layout, and the placement is complicated, since DI
circuits have many branches of data signals converging at the
same location.
The BSFQ logic system combines the merits of all these logic
systems: while requiring only a comparable number of
Josephson junctions, BSFQ circuits do not require local
synchronization, and they support Boolean primitives directly.
Moreover, since BSFQ circuits use SFQ pulses to transfer level
information, they can utilize the know-how of the leading RSFQ
technology. In fact, the BSFQ logic system can share circuits
with RSFQ. BSFQ may be considered an extended logic system
for RSFQ.
II. PROBLEMS OF PREVIOUS BSFQ CIRCUITS
A. Narrow Margin Problem
For BSFQ circuits, AND and OR cells share the same
structure (Fig. 1) [6]. Boolean operations are performed
based on the threshold value of the flux level across L2.
However, if we consider a flux quantum trapped inside
J1-L1-J2-L2, we will find that the flux level of L2 is much
smaller than a flux quantum Φ0, since the inductance L2 has
to be much smaller than L1 for the stability of the circuit
operation. Thus, the parameters of the following stage have
small margins due to the low flux gain across L2. In
simulation, the flux gain hardly exceeds 0.1Φ0. As a result, the
critical margins (I1, J3) of previous BSFQ AND, OR, and XOR
cells were below ±10%.
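A rough estimate makes the low flux gain plausible. The following is our own simplified sketch, not taken from the paper: it neglects the Josephson inductances of J1 and J2 and all bias currents, so that one trapped flux quantum drives a circulating current through the series inductance L1 + L2 only.

```latex
I_{\mathrm{circ}} \approx \frac{\Phi_0}{L_1 + L_2}, \qquad
\Phi_{L_2} = L_2\, I_{\mathrm{circ}} \approx \frac{L_2}{L_1 + L_2}\,\Phi_0 .
```

Under this approximation, a flux level of about 0.1Φ0 across L2 corresponds to L1 ≈ 9 L2, which is consistent with the requirement that L2 be much smaller than L1.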
B. Incompleteness of Previous NOT Cell 
A BSFQ NOT cell or an inverter is implemented by just 
1051-8223/01$10.00 © 2001 IEEE
Fig. 1.  Previous BSFQ AND/OR cell. 
crossing the set and reset rails of the dual-rail JTL. However,
unlike other logic systems using dual-rail signals, a BSFQ
inverter must be initialized, since in the initial state, where
there is no input pulse, the output of the cell is a '0' state
instead of a '1' state. Two initialization methods were proposed
previously. One is sending an initial pulse as a set pulse to the
output of the cell by using a merger [6]. However, this makes
the circuit inefficient and the placement of the cell complicated.
The alternative is sending a series of set-reset pulses to
the dual-rail input of the system [7]. These pulses change the
internal flux level of the cells connected at the back of each
inverter. It was shown that this method could initialize all
inverters in arbitrary combinational circuits. However, the
inverters were found not to work well in sequential circuits,
since initial pulses might not reach some inverters in the
circuits. Moreover, the number of initial pulses required to
initialize all the inverters in the system is indefinite for a
black-box system.
III. NEW BSFQ CIRCUITS
In this section, the critical current density, IcRn product, and
McCumber parameter of a shunted Josephson junction are
assumed to be 2.5 kA/cm², 0.37 mV, and unity, respectively,
where Ic is the critical current and Rn is the normal resistance
of a shunted Josephson junction.
A.  Error Canceller 
In BSFQ circuits, the use of dual-rail JTLs to transfer level 
information might cause 2 problems. Firstly, error pulses 
Fig. 2. BSFQ Error Canceller cell. Optimized parameters: I1 = 0.31mA, I2
= 0.13mA, I3 = 0.29mA, I4 = I5 = 0.27mA, J1 = J4 = J5 = J6 = J7 = J9 =
J10 = J11 = J12 = 0.20mA, J2 = 0.12mA, J3 = 0.30mA, J8 = 0.23mA, L1 =
2.9pH, L2 = 2.0pH, L3 = 1.2pH, L4 = L5 = 2.5pH, L6 = 0.99pH, L7 = 7.8pH,
L8 = L11 = 4.9pH, L9 = 2.7pH, L10 = 2.2pH, L12 = 2.7pH, L13 = 2.3pH.
Fig. 3. BSFQ NOT cell. Optimized parameters: I1 = 0.27mA, I2 =
0.48mA→0.13mA, I3 = 0.29mA, I4 = I5 = 0.28mA, J1 = J3 = J5 = J6 = J7
= J8 = J10 = J11 = J12 = J13 = 0.20mA, J2 = 0.17mA, J4 = 0.28mA, J9 =
0.23mA, L1 = 1.1pH, L2 = 1.3pH, L3 = 2.2pH, L4 = 2.5pH, L5 = 2.4pH, L6
= 0.96pH, L7 = 7.7pH, L8 = L11 = 4.8pH, L9 = 2.5pH, L10 = 2.3pH, L12 =
L13 = 2.4pH.
occur occasionally in the circuits due to thermal fluctuations.
The presence of such a pulse between a set pulse and a reset
pulse will cause the following data signals to be out of order.
Secondly, pulses in one rail might overtake pulses in the other
rail at a certain point in the circuits, especially when
high-speed SFQ pulses propagate in a long dual-rail JTL.
Formerly, 2 escape junctions were added to the input of the
AND/OR cells to solve the first problem [7]. However, some
cells, such as the XOR cell, cannot be modified to include
these escape junctions. Thus, we created an error canceller
cell for removing the unwanted pulses in arbitrary BSFQ cells.
To solve the second problem, the 2 rails of the dual-rail
JTL are placed as close together as possible to minimize the
effect of process variations, and error cancellers are used to
break up long JTLs.
Fig. 2 illustrates the schematic of an error canceller cell.
This cell behaves like a JTL if set pulses and reset pulses
arrive alternately. However, if 2 set pulses (2 reset pulses)
arrive consecutively, junction J3 (J7) will switch and throw the
second pulse out of the rail. In simulation, this cell has a
critical margin as wide as ±32% (Table I).
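The filtering behavior of the error canceller can be summarized as a two-state machine. The sketch below is a purely logical model with names of our choosing; it assumes the cell initially expects a set pulse, which the text does not state explicitly for this cell.

```python
def error_canceller(pulses):
    """Pass alternating set/reset pulses; drop a repeated pulse on the same rail."""
    passed = []
    last = "reset"  # assumption: the cell initially expects a set pulse
    for rail in pulses:
        if rail == last:
            # Second consecutive pulse on one rail: thrown out of the rail
            # (the role played by J3 or J7 in the text).
            continue
        passed.append(rail)
        last = rail
    return passed


noisy = ["set", "set", "reset", "reset", "set"]
cleaned = error_canceller(noisy)
```

In this model a well-formed alternating stream passes unchanged, so the cell is transparent (JTL-like) whenever no error pulse is present.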
B. NOT Cell 
This new NOT cell is similar to an error canceller cell,
except that the set rail and reset rail are interchanged and the
bias current I2 is variable. From its initial state, the current is
set to a higher level and then returned to its initial level. When
I2 is set to the higher level, junction J5 switches one and only
one time, and 3 SFQ pulses occur. One of them propagates to
the set rail of the output, and one of them is trapped inside the
loop J5-L7-J9 as an internal flux level of the inverter. The
other SFQ pulse propagates toward the input of the cell and
is then thrown out of the rail through buffer junction J2. After
being initialized, the NOT cell operates in the same way as
the error canceller cell. The simulated critical margin of this
cell is as wide as ±32%.
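A logical sketch of the initialized inverter follows. It models only the pulse-level behavior: the initialization step is represented by a single set pulse at the output, and the rails are swapped afterwards. The function name and the assumption that the first accepted input pulse is a set pulse are ours.

```python
def bsfq_not(input_pulses):
    """Model an initialized BSFQ inverter as an error canceller with swapped rails."""
    output = ["set"]        # initialization: one set pulse reaches the output
    last_input = "reset"    # assumption: the first accepted input is a set pulse
    for rail in input_pulses:
        if rail == last_input:
            continue  # repeated pulse on one rail is discarded, as in the error canceller
        output.append("reset" if rail == "set" else "set")  # rails interchanged
        last_input = rail
    return output


inverted = bsfq_not(["set", "reset"])
```

With no input pulses the model already outputs a set pulse, which corresponds to the '1' output state that the earlier, uninitialized inverter could not provide.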
C. AND/OR/Muller C-element Cell 
Fig. 4 illustrates the new BSFQ AND, OR, and Muller
C-element cells. They are constructed by using two RSFQ D
flip-flops and a dc SQUID. Mutual coupling is used to raise
the flux gain of inductance L8. As a result, the flux gain
reaches 0.25Φ0 when a flux quantum is trapped inside one of
the J4-L4-J8 loops. For the AND cell, junction J9 will switch
when the flux level across the two inductances L8 is raised to
0.5Φ0, and junction J12 will switch when the flux level returns
to its initial condition. The inductances L4 and L8 and the
mutual inductance between them are extracted from the layout
of the cell by using our inductance calculation tool [8]. The
calculated critical margin of the AND cell is ±31%. However,
for the OR cell, the critical margin is ±21%, which is still
narrow for use in large-scale circuits. Hence, instead of using
an OR cell, it is better to use a combination of NOT and AND
cells for this purpose.
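The threshold operation can be pictured with a small level-based model. The 0.25Φ0 contribution per stored flux quantum and the 0.5Φ0 AND threshold come from the text; the OR threshold of 0.25Φ0 and the assumption that the two contributions simply add are ours. The last function shows the NOT/AND substitution for OR suggested above, using De Morgan's law.

```python
PHI0 = 1.0                      # work in units of the flux quantum
FLUX_PER_INPUT = 0.25 * PHI0    # flux gain per stored input (from the text)
AND_THRESHOLD = 0.5 * PHI0      # from the text
OR_THRESHOLD = 0.25 * PHI0      # assumption, not stated in the text


def output_level(a, b, threshold):
    """Return 1 when the summed flux level reaches the threshold, else 0."""
    flux = FLUX_PER_INPUT * (int(a) + int(b))  # assumed additive contributions
    return int(flux >= threshold)


def or_from_not_and(a, b):
    """OR built from NOT and AND cells: a OR b = NOT(NOT a AND NOT b)."""
    return 1 - output_level(1 - a, 1 - b, AND_THRESHOLD)


inputs = [(a, b) for a in (0, 1) for b in (0, 1)]
and_table = {ab: output_level(*ab, AND_THRESHOLD) for ab in inputs}
or_table = {ab: output_level(*ab, OR_THRESHOLD) for ab in inputs}
```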
Muller C-element cells are required for constructing
self-timed circuits. In this cell, a set pulse is released only
after each of its inputs has received a set pulse, and a reset
pulse is released only after each of its inputs has received a
reset pulse. The critical margin of the Muller C-element cell
was calculated to be as wide as ±30%.
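The C-element rule stated above can be written as a small state machine. This is a behavioral sketch only; the class name, the two-input default, and the all-zero initial state are assumptions made for illustration.

```python
class MullerC:
    """Behavioral model of a Muller C-element with set/reset pulse inputs."""

    def __init__(self, n_inputs=2):
        self.inputs = [0] * n_inputs  # assumed initial state: all inputs low
        self.output = 0

    def pulse(self, index, rail):
        """Apply one pulse to input `index`; return the emitted output pulse or None."""
        self.inputs[index] = 1 if rail == "set" else 0
        if all(self.inputs) and self.output == 0:
            self.output = 1
            return "set"      # every input has received a set pulse
        if not any(self.inputs) and self.output == 1:
            self.output = 0
            return "reset"    # every input has received a reset pulse
        return None           # otherwise the output holds its state


cell = MullerC()
trace = [
    cell.pulse(0, "set"),
    cell.pulse(1, "set"),
    cell.pulse(0, "reset"),
    cell.pulse(1, "reset"),
]
```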
D. XOR Cell
A new BSFQ XOR is constructed by using two error
cancellers, two mergers, and a modified RSFQ B flip-flop cell
(Fig. 7). The error canceller and merger line up the input set
Fig. 4. BSFQ AND/OR/Muller C-element cell. Optimized parameters: for
AND circuit, I1 = 0.29mA, I2 = 0.14mA, I3 = 0.3[illegible]mA, I4 = 0.071mA, I5 =
0.077mA, I6 = 0.24mA, J1 = J2 = J3 = J5 = J11 = 0.20mA, J4 = 0.26mA,
J6 = 0.14mA, J7 = 0.25mA, J8 = 0.22mA, J9 = J12 = 0.10mA, J10 =
0.14mA, L1 = L2 = L6 = 2.5pH, L3 = 0.66pH, L4 = 8.6pH (mutual
inductance K = 3.3pH), L5 = 3.0pH, L7 = 1.0pH, L8 = 4.6pH, L9 = 6.9pH, L10 =
2.9pH, L11 = 2.0pH. For OR circuit, substitute I4 = 0.15mA, I5 = 0mA, J7
= 0.24mA, J8 = 0.23mA. For Muller C-element circuit, substitute I4 =
0.089mA, I5 = 0mA, J4 = 0.25mA.
TABLE I
6 OF THE NARROWEST PARAMETER MARGINS FOR EACH BSFQ CELL
Cell              Parameter  Simulated Margin  Parameter  Simulated Margin
Error Canceller   J3         -32% to +36%      I3         -42% to +68%
                  I1         -44% to +50%      L7         -47% to +68%
                  J7         -44% to +55%      J6         -88% to +42%
NOT               J4         -32% to +35%      J1         -38% to +50%
                  I2         -32% to +35%      J2         -51% to +38%
                  I1         -38% to +45%      J8         -41% to +58%
AND               J7         -31% to +43%      I5         -41% to +42%
                  J4         -34% to +32%      I4         -49% to +35%
                  I6         -44% to +37%      J3         -41% to +43%
OR                I4         -21% to +21%      J4         -36% to +28%
                  J12        -90% to +24%      J8         -27% to +42%
                  J7         -35% to +25%      J3         -33% to +43%
Muller C-element  J4         -33% to +30%      J7         -33% to +39%
                  I4         -37% to +30%      J3         -36% to +39%
                  J12        -90% to +30%      I3         -36% to +45%
XOR               I5         -34% to +32%      J16        -40% to +32%
                  J3         -32% to +35%      J14        -34% to +46%
                  I4         -35% to +35%      J19        -48% to +35%
Parameters in this table refer to those shown in the corresponding
figures.
Fig. 5. Implementation of BSFQ demultiplexer. 
Fig. 6. Implementation of BSFQ DRO cell. 
and reset pulses in an alternating manner and ensure that the
first pulse is a set pulse. Then, the modified B flip-flop
releases an SFQ pulse into the set or reset rail alternately each
time its inputs receive an SFQ pulse. The critical margin of the
XOR cell is as wide as ±32%.
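The reason a toggling output implements XOR can be checked with a short behavioral sketch: every transition on either input flips the XOR of the two levels, so emitting set and reset pulses alternately tracks it. The function and variable names are ours, and the inputs are assumed to be well-formed (already cleaned by the error cancellers and merged into one stream).

```python
def xor_trace(events):
    """Toggle the output rail on every merged input pulse and compare with A XOR B.

    `events` is a list of (input_name, rail) tuples in arrival order.
    Returns (output_rail, output_level, expected_level) for each event.
    """
    levels = {"A": 0, "B": 0}   # assumed initial input levels
    next_rail = "set"           # the first output pulse is a set pulse
    trace = []
    for name, rail in events:
        levels[name] = 1 if rail == "set" else 0
        out_rail = next_rail
        next_rail = "reset" if out_rail == "set" else "set"  # alternate rails
        out_level = 1 if out_rail == "set" else 0
        trace.append((out_rail, out_level, levels["A"] ^ levels["B"]))
    return trace


events = [("A", "set"), ("B", "set"), ("A", "reset"), ("B", "reset")]
result = xor_trace(events)
```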
E. Demultiplexer 
The implementation of a BSFQ demultiplexer is easier than
in other dual-rail logic. It is constructed by using two RSFQ T
flip-flops (Fig. 5). A BSFQ demultiplexer splits consecutive
pairs of set-reset pulses into 2 groups, such that adjacent
pairs go to different output rails. This cell has only 8
Josephson junctions and is smaller than those of the
conventional asynchronous circuits.
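The splitting rule can be sketched as follows. This is a logical model only; the function name and the choice that the first pair goes to output 0 are assumptions for illustration.

```python
def demultiplex(pulses):
    """Route consecutive set-reset pairs alternately to two dual-rail outputs."""
    outputs = ([], [])
    channel = 0  # assumption: the first pair is routed to output 0
    for rail in pulses:
        outputs[channel].append(rail)
        if rail == "reset":
            channel = 1 - channel  # a reset pulse closes the pair; switch outputs
    return outputs


stream = ["set", "reset", "set", "reset", "set", "reset"]
out0, out1 = demultiplex(stream)
```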
F. DRO Cell 
The BSFQ DRO (destructive read-out) cell is implemented by
using an RSFQ D2 flip-flop cell (Fig. 6). A write pulse
turns the flux level of the inductance high, and when the cell
then receives a read pulse, it releases a set pulse to the output.
If the flux level of the inductance is low, the arrival of a read
pulse results in the output of a reset pulse.
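The read/write behavior can be captured in a few lines. The sketch is a logical model with names of our choosing; that a read returns the stored flux level to low is our reading of "destructive" and is marked as an assumption in the code.

```python
class DroCell:
    """Behavioral model of the BSFQ DRO cell."""

    def __init__(self):
        self.flux_high = False  # assumed initial state: flux level low

    def write(self):
        self.flux_high = True   # a write pulse turns the flux level high

    def read(self):
        rail = "set" if self.flux_high else "reset"
        self.flux_high = False  # assumption: reading clears the stored level
        return rail


cell = DroCell()
first = cell.read()    # nothing stored yet
cell.write()
second = cell.read()   # stored level is read out
third = cell.read()    # level was cleared by the previous read
```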
IV. BASIC CHARACTERISTICS OF BSFQ CELLS
Table II shows the basic characteristics of BSFQ cells. The
latency of BSFQ cells is small compared to conventional
asynchronous circuits. Note that the latency for one stage of
a standard JTL is about 3τ0, where τ0 is defined as

$$\tau_0 = \frac{\Phi_0}{2\pi I_c R_n},$$

where Ic is the critical current and Rn is the normal resistance
of a shunted
Fig. 7. BSFQ XOR cell. Optimized parameters: I1 = 0.31mA, I2 = 0.13mA, I3 = 0.29mA, I4 = 0.33mA, I5 = 0.35mA, I6 = 0.24mA, J1 = J4 = J5 = J8
= J9 = J10 = J12 = J11 = 0.20mA, J2 = 0.12mA, J3 = 0.31mA, J6 = J7 = 0.14mA, J1[illegible] = 0.23mA, J13 = 0.18mA, J14 = 0.15mA, J15 = J16 = 0.16mA,
J17 = 0.13mA, J18 = 0.19mA, J19 = J20 = 0.14mA, L1 = 2.9pH, L2 = 2.1pH, L3 = L8 = 1.0pH, L4 = L9 = 4.9pH, L5 = 7.7pH, L6 = L7 = 2.5pH, L10 =
2.8pH, L11 = 0.67pH, L12 = 5.2pH, L13 = 9.9pH, L14 = 2.9pH, L15 = 2.0pH.
Josephson junction. The latency for the NOT cell and error 
canceller is about the same as that of one stage of a standard 
JTL, since they have one internal loop. For AND, OR, Muller 
C-element cells having 2 loops, the latency is two times larger, 
and so on. This means that BSFQ cells introduce no extra 
delay, except for the propagation delay of SFQ pulses in the 
equivalent stages of standard JTL. Besides, the number of 
Josephson junctions involved in a BSFQ cell is small. 
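To give a feel for the time scale, the characteristic time τ0 = Φ0/(2π Ic Rn) can be evaluated for the IcRn product of 0.37 mV assumed in Section III. The snippet below is only an arithmetic check; the variable names are ours, and the flux quantum is the standard physical constant rather than a value quoted in the paper.

```python
import math

PHI0 = 2.067833848e-15   # magnetic flux quantum in Wb (standard constant)
IC_RN = 0.37e-3          # IcRn product in V, as assumed in Section III

tau0 = PHI0 / (2 * math.pi * IC_RN)   # characteristic time in seconds
jtl_stage_latency = 3 * tau0          # "about 3 tau0" per standard JTL stage

print(f"tau0 = {tau0 * 1e12:.2f} ps")
print(f"one JTL stage = {jtl_stage_latency * 1e12:.2f} ps")
```

With these numbers τ0 is roughly 0.89 ps, so one JTL stage corresponds to a latency of a little under 3 ps.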
V. EXPERIMENTAL RESULTS
At this moment, only the AND, OR, and Muller C-element cells
have been designed and tested. However, since the other
cells do not use mutual coupling structures and are not much
different from the conventional RSFQ cells, there is no reason
to doubt their workability in the real world.
Fig. 8, Fig. 9, and Fig. 10 illustrate the experimental results of
the new AND, OR, and Muller C-element cells, respectively, in
low frequency testing. The test chip was fabricated by
NEC Corporation using their standard Nb/AlOx/Nb process
[9]. Set-reset input pulses were generated by using a BSFQ
level-to-pulse converter, and dc-voltage outputs were
obtained by using a BSFQ pulse-to-level converter. The tested
global bias margins for the AND, OR, and Muller C-element
cells were ±13%, ±6%, and ±10%, respectively.
TABLE II
BASIC CHARACTERISTICS OF BSFQ CELLS
Cell               Latency   JJ Count
AND                8τ0       10
OR                 6τ0       10
NOT                4τ0       7
XOR                24τ0      30
Muller C-Element   8τ0       10
Error Canceller    4τ0       4
Demultiplexer      small(a)  8
DRO                small(b)  12
(a) Critical margin and latency of this cell are the same as for an RSFQ T
flip-flop.
(b) Critical margin and latency of this cell are the same as for an RSFQ D2
flip-flop.
Fig. 8. Measured waveform for new BSFQ AND cell in low frequency
testing (traces: Input 1, Input 2, Output). The vertical scale is 1V/div,
and the horizontal scale is 25ms/div.
Fig. 10. Measured waveform for new BSFQ Muller C-element cell in low
frequency testing (traces: Input 1, Input 2). The vertical scale is 1V/div,
and the horizontal scale is 25ms/div.
VI. CONCLUSION 
The BSFQ logic system is classified as a flux level logic
system, which directly supports Boolean primitives just as
CMOS logic does, and does not require any local
synchronization for each operation cell. The new BSFQ
fundamental cells offer wide margins, which increase their
suitability for large-scale circuits and for circuits whose
switching speed approaches a terahertz. The hardware
complexity and latency of a BSFQ cell are small compared to
conventional asynchronous circuits. Further work is in progress
to work out a global self-timed scheme for the BSFQ logic
system.
ACKNOWLEDGMENT 
C. K. T. is deeply indebted to Mr. Manabu Kitagawa, who
passed away in May of this year, for numerous fruitful
discussions. C. K. T. would also like to express his thanks to Dr.
Hiroki Kodaka for continuous encouragement.
REFERENCES
[1] H. Kodaka, T. Hosoki, and Y. Okabe, “Single flux quantum level
circuit using new DC/SFQ,” IEEE Trans. Appl. Supercond., vol. 9, pp.
3729-3732, Jun. 1999.
[2] K. K. Likharev and V. K. Semenov, “RSFQ logic/memory family: A
new Josephson junction digital technology for sub-terahertz-
clock-frequency digital systems,” IEEE Trans. Appl. Supercond., vol.
1, pp. 3-28, 1991.
[3] Z. J. Deng, S. R. Whiteley, and T. Van Duzer, “Data-driven self-timing
of RSFQ digital integrated circuits,” Ext. Abst. of 5th Int. Supercond.
Electron. Conf., pp. 189-191, 1995.
[4] M. Maezawa, I. Kurosawa, M. Aoyagi, H. Nakagawa, Y. Kameda, and
T. Nanya, “Pulse-driven dual-rail logic gate family based on rapid
single-flux-quantum (RSFQ) devices for asynchronous circuits,” Proc. of
2nd Int. Symp. on Adv. Res. in Asynchronous Circ. and Syst., pp.
134-142, 1996.
[5] P. Patra, S. Polonsky, and D. S. Fussel, “Delay insensitive logic for
RSFQ superconductor technology,” Proc. of 3rd Int. Symp. on Adv.
Res. in Asynchronous Circ. and Syst., pp. 42-53, 1997.
[6] T. Hosoki, H. Kodaka, M. Kitagawa, and Y. Okabe, “Design and
experimentation of BSFQ Logic Devices,” Supercond. Sci. Technol., vol.
12, no. 11, pp. 773-775, Nov. 1999.
[7] T. Hosoki, S. Sonoda, and Y. Okabe, “Improved BSFQ gates and a
3-bit decoder,” presented in poster session 4E103 of this conference.
[8] C. K. Teh, M. Kitagawa, and Y. Okabe, “Inductance Calculation of 3D
Superconducting Structures with Ground Plane,” Supercond. Sci.
Technol., vol. 12, no. 11, pp. 921-924, Nov. 1999.
[9] H. Numata, S. Nagasawa, M. Koike, and S. Tahara, “Fabrication
technology for a high-density Josephson LSI using an electron
cyclotron resonance etching technique and a bias-sputtering planarization,”
Supercond. Sci. Technol., vol. 9, pp. A42-A45, 1996.
Fig. 9. Measured waveform for new BSFQ OR cell in low frequency
testing. The vertical scale is 1V/div, and the horizontal scale is 25ms/div.
