

Internationales Wissenschaftliches Kolloquium International Scientific Colloquium

PROCEEDINGS

11-15 September 2006

# FACULTY OF ELECTRICAL ENGINEERING AND INFORMATION SCIENCE



INFORMATION TECHNOLOGY AND ELECTRICAL ENGINEERING - DEVICES AND SYSTEMS, MATERIALS AND TECHNOLOGIES FOR THE FUTURE

Home page / Index: http://www.db-thueringen.de/servlets/DocumentServlet?id=12391



#### Imprint

| Editor: | The Rector of Technische Universität Ilmenau |
|---------|----------------------------------------------|
|         | Univ.-Prof. Dr. rer. nat. habil. Peter Scharff |

Editorial office: Referat Marketing und Studentische Angelegenheiten, Andrea Schneider

Fakultät für Elektrotechnik und Informationstechnik, Susanne Jakob, Dipl.-Ing. Helge Drumm

Editorial deadline: 7 July 2006

Technical realization (CD-ROM edition):

Institut für Medientechnik an der TU Ilmenau, Dipl.-Ing. Christian Weigel, Dipl.-Ing. Marco Albrecht, Dipl.-Ing. Helge Drumm

Technical realization (online edition):

Universitätsbibliothek Ilmenau, ilmedia, Postfach 10 05 65, 98684 Ilmenau

Publisher:

Verlag ISLE, Betriebsstätte des ISLE e.V., Werner-von-Siemens-Str. 16, 98693 Ilmenau

© Technische Universität Ilmenau (Thür.) 2006

This publication and all contributions and figures contained in it are protected by copyright. Except in the cases permitted by law, any use without the consent of the editorial office is punishable.

| ISBN (print edition):  | 3-938843-15-2 |
|------------------------|---------------|
| ISBN (CD-ROM edition): | 3-938843-16-0 |


V. Mladenov / V. Todorov / B. Dimov / Th. Ortlepp / F. H. Uhlmann

## Statistical Description and Optimization of the Time-Domain Parameters of Asynchronous RSFQ Digital Circuits

### INTRODUCTION

With its extremely high operation speed and ultra-low power consumption, the Rapid **S**ingle-Flux **Q**uantum (RSFQ) technique [1] is one of the most promising alternatives to conventional semiconductor electronics for the development of powerful digital computational devices [2]. Nevertheless, due to the large feature sizes of the currently available RSFQ fabrication technologies, the realization of a complex RSFQ digital circuit with a multigigahertz clock frequency is unimaginable [3-4], and the only way to overcome this restriction is asynchronous logic design. There, the circuit's components react to changes on their inputs as these changes arrive, and produce changes on their outputs when they conclude the current computation [5]. No clock signal is provided to synchronize this communication; instead, the coordination is performed by some kind of handshaking protocol. This is shown schematically in Fig. 1: the sender may issue a new data bit only if the receiver has confirmed its readiness to accept it. The receiver does so by issuing an acknowledge signal through a separate communication channel, in which the signals flow in the direction opposite to the transmitted data.



Fig. 1 Asynchronous data exchange between two communicating units.
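The handshaking rule described above can be illustrated with a minimal Python sketch. It is only a behavioural toy model, not an RSFQ circuit description: the names `Receiver` and `transfer` are hypothetical, and the receiver is assumed to acknowledge every bit immediately.

```python
class Receiver:
    """Toy receiver: stores each data bit and acknowledges it."""

    def __init__(self):
        self.received = []

    def accept(self, bit):
        self.received.append(bit)
        return True  # acknowledge, sent back on the separate channel


def transfer(bits, receiver):
    """Send bits one by one; a new bit is issued only after an acknowledge."""
    ack = True  # assumption: the receiver starts out ready
    log = []
    for bit in bits:
        if not ack:
            raise RuntimeError("data issued without a preceding acknowledge")
        ack = False                  # data in flight, receiver not yet ready
        log.append(("data", bit))
        ack = receiver.accept(bit)   # acknowledge flows opposite to the data
        log.append(("ack", bit))
    return log
```

The log produced by `transfer` strictly alternates between data and acknowledge events, which is the ordering that the protocol in Fig. 1 enforces.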

Asynchronous circuits can be **d**elay-insensitive (DI), i.e. they operate correctly for any finite delay times of their gates and interconnects. For this, a handshaking feedback should connect each pair of communicating circuit components. This, however, results in an extremely complicated circuit topology and reduced operation speed, making DI synthesis impractical for high-speed large-scale digital designs. Significantly simpler and faster circuits can be realized if the handshaking feedbacks are omitted wherever possible. Being no longer DI, such an asynchronous circuit operates correctly only under certain timing assumptions, whose violation leads to erroneous behaviour. Therefore, the exact and efficient handling of the gate delays is a vital precondition for the realization of complex asynchronous digital circuits. In this paper, we present our novel technique for RSFQ time-delay estimation and optimization, suitable for high-level synthesis of complex asynchronous RSFQ digital circuits.

### STATISTICAL PREDICTION OF THE TIME DELAY OF THE RSFQ GATES

The fabrication process of the RSFQ chips is influenced by many, typically uncorrelated, factors that shift all circuit parameters (and hence also the time delay) away from their expected nominal values. This effect [6] can be neither forecast nor influenced by the designer. In synchronous applications, a certain margin is usually included in the clock interval to compensate for all such time-domain deviations. This strategy, however, is not applicable to non-DI asynchronous circuits.

In order to model the time-delay spread of the asynchronous RSFQ gates, we have developed the Windows/DOS-compatible software package JSIMA, based on the free Josephson junction circuit simulator JSIM [7]. Its block diagram is shown in Fig. 2.



Fig. 2 Block diagram of JSIMA. m – number of randomly generated scaling coefficients,  $n_{max}$  – maximum number of iterations.

Within one iteration loop of the program, a set of coefficients is first generated. They are used to scale the nominal circuit parameters, thus simulating their spread during the fabrication process. In order to represent the stochastic nature of this spread adequately, each coefficient is drawn independently from a Gaussian distribution with statistical parameters specified by the fabrication foundry. The resulting circuit netlist is simulated with JSIM, and the obtained time-domain behaviour is classified as working (good) or not operating (bad). A working circuit is counted, and its time delay is automatically calculated and stored in a file. This cycle is repeated many (typically over 100 000) times. Finally, we build the histogram of the delays and analyze it statistically, thus deriving the mathematical model of the time-delay distribution of the RSFQ gate. Additionally, we estimate the gate's fabrication yield as  $j/n_{max}$ , where *j* is the number of working circuits and  $n_{max}$  is the total number of simulated circuits.
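The iteration loop described above can be sketched in Python as follows. This is not the JSIMA code: `simulate_gate` is a hypothetical stand-in for a JSIM run, and its pass/fail window and delay formula are invented purely so that the loop is runnable. Only the loop structure and the yield estimate $j/n_{max}$ follow the text.

```python
import random
import statistics


def simulate_gate(scale):
    """Hypothetical stand-in for one JSIM simulation.

    Returns the gate delay in picoseconds, or None if the circuit does
    not operate. Both the operating window and the delay model are toy
    assumptions made for illustration only.
    """
    if not 0.7 <= scale <= 1.3:
        return None          # circuit classified as not operating (bad)
    return 10.0 / scale      # circuit works (good); toy delay model


def monte_carlo(n_max, sigma, seed=0):
    """Run n_max randomized simulations and summarize the working ones."""
    rng = random.Random(seed)
    delays = []
    for _ in range(n_max):
        k = rng.gauss(1.0, sigma)    # scaling coefficient with mean 1
        delay = simulate_gate(k)
        if delay is not None:        # count and store working circuits only
            delays.append(delay)
    j = len(delays)
    return {
        "yield": j / n_max,
        "mean_delay": statistics.mean(delays),
        "std_delay": statistics.stdev(delays),
    }
```

A call such as `monte_carlo(100000, 0.1)` mirrors the iteration count quoted in the text; the fixed seed only makes the sketch reproducible.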

We have applied this technique to the statistical description of all gates from the asynchronous RSFQ cell library [8]. Below, we discuss the results for the three most complex gates of the library: the dual-rail AND, the dual-rail XOR, and the dual-rail 1×2 demultiplexer. Their electrical schemes are shown in Fig. 3, while their element values and operation principle can be found in [4], [8-9].



Fig. 3 Electrical schemes of (a) – asynchronous dual-rail RSFQ AND gate; (b) – asynchronous dual-rail RSFQ XOR gate; (c) – asynchronous dual-rail RSFQ 1×2 demultiplexer.

Two components of the technological spread should be distinguished: the inter-chip and the on-chip one. The first represents the parameter deviations among different chips produced with a given fabrication technology, while the second represents the parameter spread within one and the same chip. For any stable fabrication process, the on-chip parameter spread is negligibly small. Therefore, we ignore it in this analysis and consider only the inter-chip spread.

In order to model it, we generate at each iteration loop of JSIMA three independent scaling coefficients,  $k_L$ ,  $k_i$ , and  $k_g$ , drawn from Gaussian distributions with mean values  $\mu_{kL}=\mu_{ki}=\mu_{kg}=1$  and standard deviations  $\sigma_{kL}$ ,  $\sigma_{ki}$ , and  $\sigma_{kg}$ , respectively. Within the analysed circuit, we scale all inductances by  $k_L$ , all junction critical currents and all dc bias currents by  $k_i$ , and all junction parasitic inductances to ground by  $k_g$ .
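As a hedged illustration of this scaling step, the sketch below draws the three coefficients and applies them to a circuit description. The dictionary layout and the function names `draw_coefficients` and `scale_circuit` are assumptions made for this example; a real netlist would have to be parsed and rewritten instead.

```python
import random


def draw_coefficients(rng, sigma_kL, sigma_ki, sigma_kg):
    """Draw k_L, k_i and k_g independently, each with mean value 1."""
    return {
        "kL": rng.gauss(1.0, sigma_kL),
        "ki": rng.gauss(1.0, sigma_ki),
        "kg": rng.gauss(1.0, sigma_kg),
    }


def scale_circuit(nominal, k):
    """Apply one set of inter-chip scaling coefficients to all parameters."""
    return {
        # all inductances share k_L
        "inductances": [k["kL"] * v for v in nominal["inductances"]],
        # critical currents and dc bias currents share k_i
        "critical_currents": [k["ki"] * v for v in nominal["critical_currents"]],
        "bias_currents": [k["ki"] * v for v in nominal["bias_currents"]],
        # parasitic inductances to ground share k_g
        "ground_inductances": [k["kg"] * v for v in nominal["ground_inductances"]],
    }
```

Because every parameter of one kind is multiplied by the same coefficient, the ratios between parameters within a chip stay at their nominal values, which reflects the decision to neglect the on-chip spread.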

The resulting time delays are shown in Fig. 4 and Fig. 5 for the cases  $\sigma_{kL}=\sigma_{ki}=\sigma_{kg}=10\%$ and  $\sigma_{kL}=\sigma_{ki}=\sigma_{kg}=20\%$ , respectively. As for all other gates from the cell library, the resulting statistical distribution of the time delay can be successfully fitted to a Gaussian one. The large standard deviations obtained clearly demonstrate the strong impact of the technological parameter spread on the stability of the time-domain behaviour of the RSFQ circuits. Therefore, such an analysis is a key component of successful asynchronous RSFQ design, and the proposed technique is a powerful tool for carrying it out accurately.



Fig. 4 Time-delay spread due to the technological spread in the case  $\sigma_{kL}=\sigma_{ki}=\sigma_{kg}=10\%$ . (a) – dual-rail AND, obtained standard deviation = 10.37%; (b) – dual-rail XOR, obtained standard deviation = 10.67%; (c) – 1×2 dual-rail demultiplexer, obtained standard deviation = 5.06%.



Fig. 5 Time-delay spread due to the technological spread in the case  $\sigma_{kL}=\sigma_{ki}=\sigma_{kg}=20\%$ . (a) – dual-rail AND, obtained standard deviation = 18.23%; (b) – dual-rail XOR, obtained standard deviation = 23.01%; (c) – 1×2 dual-rail demultiplexer, obtained standard deviation = 9.41%.
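The statistical evaluation of the stored delays can be sketched as below. This is a minimal moment-based fit, assuming the delays of the working circuits are available as a list of numbers; the function names are hypothetical, and the paper does not state which fitting procedure was actually used.

```python
import random
import statistics


def fit_gaussian(delays):
    """Estimate the Gaussian parameters from the sample mean and deviation."""
    return statistics.mean(delays), statistics.stdev(delays)


def relative_spread_percent(delays):
    """Standard deviation of the delay relative to its mean, in percent."""
    mu, sigma = fit_gaussian(delays)
    return 100.0 * sigma / mu


def fraction_within(delays, n_sigma=1.0):
    """Share of samples within n_sigma deviations of the mean.

    For Gaussian data this is about 0.68 at n_sigma = 1, which gives a
    quick plausibility check of the Gaussian fit.
    """
    mu, sigma = fit_gaussian(delays)
    inside = sum(1 for d in delays if abs(d - mu) <= n_sigma * sigma)
    return inside / len(delays)
```

The value returned by `relative_spread_percent` corresponds to the kind of percentage quoted in the captions of Fig. 4 and Fig. 5, under the assumption that those percentages are relative to the mean delay.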

### OPTIMIZATION OF THE TIME-DOMAIN CHARACTERISTICS OF THE ASYNCHRONOUS RSFQ DIGITAL CIRCUITS

A very important and time-consuming step of the small-scale RSFQ design flow is the optimization of the cell library components [3]. Within the classical synchronous RSFQ design, the aim of this step is to adjust the circuit's parameters so as to maximize the gate's fabrication yield. In this way, we also obtain a maximum fabrication yield for the complex synchronous RSFQ digital circuit composed of these optimized gates.

As already emphasized at the beginning of this paper, the correct operation of complex asynchronous circuits depends not only on the correct operation of their building blocks themselves, but also on the timing assumptions that allow handshaking feedbacks to be omitted. Therefore, we optimize the components of our asynchronous RSFQ cell library [8] so as to minimize the standard deviation of their time-delay spreads, while keeping the fabrication yields reasonably large.

This novel optimization strategy is illustrated for the asynchronous dual-rail 1×2 demultiplexer in Fig. 3(c). The critical currents of the junctions *J14-J21* are important for the time-domain behaviour of this gate, because their switching determines the propagation path of the signals. Below, we designate their value by  $I_c$ , while d and  $\sigma_d$  designate the nominal value of the gate's time delay and its standard deviation, respectively. The dependence of the gate's fabrication yield and of the ratio  $\sigma_{d}/d$  on  $I_c$  is shown in Fig. 6. The maximum fabrication yield is obtained at  $I_c$ =162µA, which is far away from the optimum of  $\sigma_{d}/d$ . Therefore, we choose a nominal value  $I_c$ =175µA, which reduces the fabrication yield by a few percent but shrinks the ratio  $\sigma_{d}/d$  by about 10%. Thus, the time-domain stability of the gate is improved, while its yield is slightly diminished.



Fig. 6 Dependence of the fabrication yield and the ratio  $\sigma_{d}/d$  of the asynchronous dual-rail 1×2 demultiplexer in Fig. 3(c) on the critical current  $I_c$  of the junctions J14-J21. Assumed technological spread: (a)  $\sigma_{kL}=$ …
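The trade-off between yield and delay spread can be expressed as a small constrained selection. The sketch below assumes that a sweep over $I_c$ has already produced, for each candidate value, a fabrication yield and a ratio $\sigma_d/d$. The function name, the tuple layout and the explicit yield threshold are assumptions of this example; the paper only requires the yield to stay "reasonably large".

```python
def choose_operating_point(sweep, min_yield):
    """Pick the candidate with the smallest relative delay spread.

    sweep     -- list of (ic_uA, fabrication_yield, relative_spread) tuples
    min_yield -- lowest acceptable fabrication yield (assumed threshold)
    """
    feasible = [point for point in sweep if point[1] >= min_yield]
    if not feasible:
        raise ValueError("no candidate meets the yield requirement")
    # Among acceptable yields, minimize sigma_d / d instead of maximizing yield.
    return min(feasible, key=lambda point: point[2])
```

With a purely invented sweep such as `[(150, 0.90, 0.12), (160, 0.97, 0.10), (170, 0.95, 0.08), (180, 0.80, 0.07)]` and `min_yield=0.93`, the function selects the 170 µA entry: it gives up some yield relative to the 160 µA entry in exchange for a smaller spread. These numbers are placeholders and are not taken from Fig. 6.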

#### CONCLUSIONS

The development of delay-insensitive complex asynchronous digital circuits is very difficult and results in complicated topologies with poor computational performance. This problem can be overcome only if certain assumptions are made about the time-domain behaviour of the circuit's components, thus allowing many handshaking feedbacks to be omitted without reducing the circuit's reliability. For this, the gates composing the circuit under design should have well-known and stable time delays. This requirement conflicts with the RSFQ time-delay instability caused by the fabrication spread, which is stochastic in nature and cannot be exactly determined during the design phase.

In this paper, a novel technique for RSFQ time-delay estimation has been presented. It performs an exact statistical modelling of the RSFQ time-delay deviations due to the fabrication spread. Assuming uncorrelated deviations of the technological parameters, we have obtained a Gaussian distribution of the RSFQ time delay. Its standard deviation has been calculated to be of the order of the standard deviation of the technological parameter spread, and is thus unacceptably large for efficient high-level asynchronous design. Therefore, we have developed a novel strategy for the optimization of the asynchronous RSFQ gates. Contrary to the usual optimization of synchronous RSFQ circuits, its goal is not to maximize the fabrication yield, but to minimize the time-delay spread while keeping the yield reasonably large. This strategy has been applied to the components of the asynchronous RSFQ cell library [8], making it an excellent basis for efficient high-level design of complex asynchronous RSFQ digital circuits.

#### ACKNOWLEDGEMENT

This work is supported by the DAAD PPP-program (contracts DAAD-13/2005 and D/04/08637) between the Ilmenau University of Technology / Germany, and the Technical University of Sofia / Bulgaria.

#### **References:**

[1] K. K. Likharev, V. K. Semenov, "RSFQ Logic/Memory Family: A New Josephson-Junction Technology for Sub-Terahertz-Clock-Frequency Digital Systems," *IEEE Trans. Appl. Supercond.*, vol.1, pp.3-28, 1991

[2] H. Toepfer, "Digitale Flussquantenelektronik fuer neuartige schnelle und verlustleistungsarme Schaltungen," Habilitationsschrift, Fakultaet Elektrotechnik und Informationstechnik, Technische Universitaet Ilmenau, 2002

[3] B. Dimov, V. Mladenov, V. Todorov, Th. Ortlepp, F. H. Uhlmann, "Design Aspects of Complex Asynchronous RSFQ Digital Circuits," *this conference* 

[4] B. Dimov, General Restrictions and Their Possible Solutions for the Development of Ultra High-Speed Integrated RSFQ Digital Circuits, Wissenschaftsverlag Ilmenau, Germany, 2005

[5] J. A. Brzozowski, C.-J. H. Seger, Asynchronous Circuits, Springer-Verlag, 1995

[6] Th. Ortlepp, "Dynamische Analyse stochastischer Einfluesse in der supraleitenden Einzelflussquantenelektronik," Dissertationsschrift, Fakultaet Elektrotechnik und Informationstechnik, Technische Universitaet Ilmenau, 2004

[7] E. S. Fang, T. van Duzer, "A Josephson Integrated Circuit Simulator (JSIM) for Superconductive Electronic Application," *Ext. Abstr.* 2<sup>nd</sup> *ISEC, Tokyo, Japan*, pp.407-410, 1989

[8] http://www4.tu-ilmenau.de/El/ATE/kryo/asyn/index.htm

[9] B. Dimov, V. Mladenov, F. H. Uhlmann, "Asynchronous RSFQ Gates with Flexible Delays," Proc. 48. Internat. Wiss. Kolloquium, 22.-25. Sept. 2003, TU Ilmenau, Germany, pp.387-388, 2003

#### Authors:

Prof. Dr. Valeri MLADENOV Ing. Valery TODOROV

Dept. Theoretical Electrical Engineering, Technical University of Sofia 8 Kliment Ohridski St., Sofia-1000, Bulgaria E-mail: valerim@tu-sofia.bg

Dr.-Ing. Boyko DIMOV Dr.-Ing. Dipl.-Math. Thomas ORTLEPP Prof. Dr.-Ing. habil. F. Hermann UHLMANN

RSFQ Design Group, Institute for Information Technology, Ilmenau University of Technology P.O.Box 100565, D-98684 Ilmenau, Germany E-mail: tet@tu-ilmenau.de