Educational aspects of VLSI training at postgraduate level by Guyot, A. et al.
Educational aspects of VLSI training at postgraduate level
Prof.dr. Alain GUYOT
Institut National Polytechnique de Grenoble
Mihai T. LĂZĂRESCU
"POLITEHNICA" University of Bucharest
Conf.dr.ing. Daniel C. IOAN
"POLITEHNICA" University of Bucharest
Facultatea de Electrotehnica, sala EB206,
Spl. Independentei 313, 77206 Bucuresti,
ROMÂNIA
ABSTRACT
This paper describes how a VLSI circuit project is intended to be used for training at the
Postgraduate School for Computer Aided Electrical Engineering in Bucharest, Romania.
The emphasis is on the main design steps, on the use of the various facilities of the CADENCE Edge™
VLSI design environment, and on stimulating strong team collaboration.
I - INTRODUCTION: The Romanian PSCAEE school
The TEMPUS Joint European Project JEP 2717, entitled "Initiation of Formal Training in Computer
Aided Electrical Engineering in Romanian Universities", is coordinated by the Polytechnic University of
Bucharest and has COREP - Politecnico di Torino as contractor. Among other institutions, the partners
interested in VLSI circuit design are: INPG-CMP, Università di Genova, National Technical University
of Athens, Thames Valley University and ISEN Lille. The project is financed mainly by the CEC, with
a total budget of 586,000 ECU for the three academic years 1991-1994.
The main goals of this project are:
• establishing a Postgraduate Training Center of CAD/CAE in Electrical Engineering (PSCAEE) at the
Polytechnic University of Bucharest;
• organizing and operating a CAD/CAE laboratory and an associated documentation center;
• curriculum development and textbook editing;
• an EC-RO mobility program for professors and students;
• organization of a workshop in CAD/CAE, with international participation.
The postgraduate center of CAEE could train about 20 high-level specialists annually. Two main
directions of study are available: electromagnetic field modelling and (micro)electronic systems design.
The curriculum of both directions includes a restricted number of compulsory courses (providing the
theoretical background and knowledge of the Unix operating system) and a large choice of optional courses.
In addition, individual study and computer laboratory work are included in the students' training. During
the two years of part-time study, the students have to complete both a mid-term project and a graduation
project, working at the Polytechnic University of Bucharest or at one of the partner universities.
Within the mobility program of TEMPUS JEP 2717, five Romanian professors and one graduate
student each spent 3-6 months at INPG-CMP, France. Three specialists from INPG visited the Polytechnic
University of Bucharest. Following this program, 14 volumes of textbooks, totalling more than 3,500
pages, were edited and printed in 30 copies.
The contact with the JEP partner representative, Dr. Bernard Courtois from INPG-CMP, offered
this newly founded Romanian CAEE training center at the Polytechnic University of Bucharest the
opportunity to apply and to be accepted as a partner in the EUROCHIP scheme. The success of this TEMPUS
JEP makes it possible to improve substantially the VLSI systems design training at the Polytechnic University.
Software packages for VLSI systems design such as Cadence-Edge, Hilo and HSpice were acquired by JEP 2717
using the EUROCHIP offer and were installed for the PSCAEE laboratory on workstations (DECStation
5000/25, SUN LX, HP-Apollo 9000/710, and 705) purchased from the JEP budget.
The organization of the Postgraduate Training Center in CAD/CAEE is the first attempt of this kind
in Romania.
As with every beginning, there were many difficulties, consisting mainly in the lack of information in
Romania regarding Unix-RISC workstations (software and hardware) and appropriate books and periodicals,
and in significant delays in CAD tools delivery due to restrictions imposed on advanced technology transfer
towards Eastern European countries.
Among the successes, one of the most important and encouraging is the fruitful collaboration with
the JEP partners' representatives. Joining the EUROCHIP scheme is an excellent opportunity to be
integrated in the European efforts aimed at improving VLSI systems design training.
II - The laboratory work
The students will have free access to the whole laboratory documentation. They will receive the
theme and the work schedule, and will be helped with hints whenever necessary. No continuous
supervision is provided for the lab. For a few hours on several days a week, a qualified person will be
available for questions and help.
A theme suitable for this almost self-teaching working method has to be simple enough to be
understood in depth by each student, and divisible, so that its subparts can be assigned to separate
teams and developed largely in parallel.
Since mathematical operators are inherently involved in data processing irrespective of the data type,
they are found in almost every processor. Moreover, the overall performance of the automaton is
decisively influenced by the quality of the operators used.
Given the above considerations, we propose the design of a fast binary divider as laboratory work
for the VLSI division of PSCAEE.
The internal structure of the proposed divider can easily be split into several quasi-independent
laboratory works. As the final goal is to put all the pieces together into the full custom layout of the
divider, the laboratory work also implies a permanent collaboration among students to coordinate their
implementations, which becomes tighter as the project approaches its end phase. Also, optimizations
in schematics and layout for each part as well as for the whole design have to be carried out, as low
computation delay and minimum silicon area are important criteria for the circuit's quality.
III - Circuit's structure
Two operands are needed as the divider's inputs: the dividend and the divisor, both written in standard
binary notation, that is, using only 0 and 1 as binary digit values. They are assumed to be normalized binary
numbers, according to the IEEE 754-1985 standard. Internal computations inside the divider are done in a
binary borrow-save notation, that is, using -1, 0, and 1 as possible values for each binary digit. The
quotient is then computed in the same redundant notation. This is why an extra block was added to
convert the quotient from the redundant notation to the standard binary one, assumed to
be used in the rest of the circuitry.
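As a minimal illustration of the borrow-save notation, the following Python sketch evaluates a number given as two bit vectors, the positive bits and the negative bits, each digit being their difference. The function name `borrow_save_value` and the fixed-point weighting (digit i has weight 2^-i, index 0 being the most significant digit) are assumptions made for the example only.

```python
from fractions import Fraction

def borrow_save_value(plus_bits, minus_bits):
    """Value of a borrow-save number: digit i is plus_bits[i] - minus_bits[i],
    weighted by 2**-i (index 0 is the most significant digit)."""
    if len(plus_bits) != len(minus_bits):
        raise ValueError("both bit vectors must have the same length")
    total = Fraction(0)
    for i, (p, m) in enumerate(zip(plus_bits, minus_bits)):
        digit = p - m                      # one of -1, 0, 1
        total += Fraction(digit, 2 ** i)
    return total

# The notation is redundant: different digit strings encode the same value.
a = borrow_save_value([1, 0, 0], [0, 0, 1])   # digits 1 0 -1 -> 1 - 1/4
b = borrow_save_value([0, 1, 1], [0, 0, 0])   # digits 0 1 1  -> 1/2 + 1/4
print(a, b)
```

Both calls yield the same value, which is the redundancy that the quotient conversion block later has to remove.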
The divider is composed of three parts (fig. 1), each of them playing a distinct role in the internal dataflow:
• the head cell receives as inputs the three MSBs of the partial remainder, written in redundant
notation. The cell in turn has three roles.
Figure 1: the block schematic of a 4 bit divider (rows of one head cell followed by tail cells, ending in
the final remainder; a quotient conversion block outputs the quotient in standard binary notation, MSB to
LSB; the critical path is marked; Q = A/D, R_0 = A)
First it computes a quotient digit. This quotient digit is then used by the subsequent tail cells in
the same row to choose the operation to execute on the rest of the partial remainder digits: an
addition, a subtraction, or nothing.
The second role of the head cell is to perform on the MSBs of the partial remainder the same computation
that the tail cells perform on the rest of the remainder's bits. The MSB of the normalized divisor D is
always 1, so it is treated as an implicit constant signal in the head cell during computations.
Finally, the head cell takes advantage of the redundant notation of the partial remainder and rewrites
its MSBs in an equivalent form, so that the next remainder has one most significant digit less. This
is to avoid truncation errors when the remainder is shifted one bit to the left for the next stage.
The five signals generated by the head cell (fig. 1) have the following boolean expressions:
$$q_j^+ = r_{-2}^+ + \overline{r_{-2}^-} \left( r_{-1}^+ \overline{r_{-1}^-} + \overline{r_{-1}^-}\, r_0^+ \overline{r_0^-} + r_{-1}^+ r_0^+ \overline{r_0^-} \right);$$
$$q_j^- = r_{-2}^- + \overline{r_{-2}^+} \left( \overline{r_{-1}^+}\, r_{-1}^- + \overline{r_{-1}^+}\, \overline{r_0^+}\, r_0^- + r_{-1}^- \overline{r_0^+}\, r_0^- \right);$$
$$s_{-1}^+ = r_{-2}^+ \left( r_{-1}^+ + \overline{r_{-1}^-} + r_0^+ \overline{r_0^-} \right) + \overline{r_{-2}^-}\, r_{-1}^+ \overline{r_{-1}^-}\, r_0^+ \overline{r_0^-};$$
$$s_{-1}^- = r_{-2}^- \left( r_{-1}^- + \overline{r_{-1}^+} + r_0^- \overline{r_0^+} \right) + \overline{r_{-2}^+}\, r_{-1}^- \overline{r_{-1}^+}\, r_0^- \overline{r_0^+};$$
$$s_0^- = q_j^- \oplus r_0^+ \oplus r_0^-.$$
• the tail cell receives as inputs a digit of the divisor (in standard binary notation), a digit of the
partial remainder (in redundant binary notation), and the quotient digit computed by the head
cell of its row. The tail cell's role is to perform an addition, a subtraction or no operation on its
two input digits (one of the divisor and one of the partial remainder), according to the value of the
received quotient digit.
The two signals generated by the tail cell, $s_{i-1}^+$ and $s_i^-$, satisfy the equality:
$$2\, s_{i-1}^+ - s_i^- = r_i^+ - r_i^- + \left( q_j^+ \overline{d_i} + q_j^- d_i \right)$$
• the quotient conversion block, as its name states, has to convert the computed quotient from
redundant notation to standard binary notation, the latter being assumed to be used by
the rest of the circuit that the divider is part of. This block is composed of small cells arranged
in a regular structure.
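The tail cell behaviour described above can be sketched as a small Python model and checked exhaustively against the equality. This is a minimal sketch under stated assumptions: the divisor bit is taken complemented when the quotient digit is +1 (subtraction by adding the complement) and uncomplemented when it is -1, the pair (q_plus, q_minus) = (1, 1) is assumed never to occur, and the names `tail_cell` and `check_tail_cell` are hypothetical.

```python
from itertools import product

def tail_cell(r_plus, r_minus, d, q_plus, q_minus):
    """Behavioural model of one tail cell (all arguments are bits 0/1).

    Returns (s_plus_out, s_minus): s_plus_out has twice the weight of the
    cell's own position (it belongs to digit i-1), s_minus stays at digit i.
    """
    # Multiplexer: NOT d to subtract the divisor (q = +1), d to add it
    # (q = -1), and 0 when the quotient digit is 0.
    x = (q_plus & (1 - d)) | (q_minus & d)
    # Full-adder-like stage on r_plus, NOT r_minus and x.
    n = 1 - r_minus
    s_plus_out = (r_plus & n) | (r_plus & x) | (n & x)   # majority
    s_minus = r_plus ^ r_minus ^ x                        # parity
    return s_plus_out, s_minus

def check_tail_cell():
    """Check 2*s_plus_out - s_minus == r_plus - r_minus + x for every input."""
    for r_plus, r_minus, d in product((0, 1), repeat=3):
        for q_plus, q_minus in ((0, 0), (1, 0), (0, 1)):   # q = 0, +1, -1
            s_plus_out, s_minus = tail_cell(r_plus, r_minus, d, q_plus, q_minus)
            x = q_plus * (1 - d) + q_minus * d
            assert 2 * s_plus_out - s_minus == r_plus - r_minus + x
    return True

print(check_tail_cell())
```

The model has the shape of a full adder preceded by a multiplexer on one input, which is one of several possible ways to realize the equality.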
IV - One possible laboratory work. Complexity evaluation
The students can be organized in three teams, each team having to design one of the divider's parts:
the head cell, the tail cell, and the quotient conversion block.
At first sight the head cell is the most complicated of the cells, since it has to generate 5 signals,
all of them using all 6 input signals. However, due to the perfect symmetry of the borrow-save
notation, the $q_j^+$ and $q_j^-$ signals have the same structure for their expressions; only
the input signals are different. The same holds for $s_{-1}^+$ and $s_{-1}^-$. This means that each of
these two pairs of signals will have the same schematic at transistor level, but the transistors themselves
will be driven by different input signals. The fifth head signal, $s_0^-$, is generated by a very simple gate anyway.
So, besides the simpler gate for the $s_0^-$ signal, only two structurally different complex gates
have to be designed: one for the $q_j^+$ and $q_j^-$ signals, and the other for $s_{-1}^+$ and
$s_{-1}^-$. On the other hand, the area of the head cell layout is not a very important constraint,
because for an n-bit divider there are just n head cells in the layout.
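The symmetry argument can be illustrated with a behavioural sketch in Python: one gate function is written once and used for both quotient signals, with the two input polarities exchanged. The sketch assumes that inputs of the opposite polarity enter the gate complemented, that the leading digit never uses the (1, 1) encoding, and that the quotient digit is the sign of the three-digit estimate 4 r_{-2} + 2 r_{-1} + r_0. These are assumptions of the example, and the names `sign_gate`, `quotient_digit` and `check_quotient_digit` are hypothetical.

```python
from itertools import product

def sign_gate(a_same, a_opp, b_same, b_opp, c_same, c_opp):
    """Complex gate for one quotient polarity (all arguments are bits).

    *_same are the inputs of the gate's own polarity, *_opp the inputs of
    the opposite polarity, assumed here to enter complemented.
    a, b, c stand for the digits r_-2, r_-1, r_0.
    """
    na, nb, nc = 1 - a_opp, 1 - b_opp, 1 - c_opp
    inner = (b_same & nb) | (nb & c_same & nc) | (b_same & c_same & nc)
    return a_same | (na & inner)

def quotient_digit(a, b, c):
    """a, b, c are (plus, minus) bit pairs; returns (q_plus, q_minus)."""
    q_plus = sign_gate(a[0], a[1], b[0], b[1], c[0], c[1])
    # Same gate structure, polarities swapped.
    q_minus = sign_gate(a[1], a[0], b[1], b[0], c[1], c[0])
    return q_plus, q_minus

def check_quotient_digit():
    """Compare the gate outputs with the sign of the three-digit estimate."""
    pairs = list(product((0, 1), repeat=2))
    for a, b, c in product(pairs, repeat=3):
        if a == (1, 1):
            continue   # assumed never to occur on the leading digit
        estimate = 4 * (a[0] - a[1]) + 2 * (b[0] - b[1]) + (c[0] - c[1])
        q_plus, q_minus = quotient_digit(a, b, c)
        sign = (estimate > 0) - (estimate < 0)
        assert q_plus - q_minus == sign
        assert not (q_plus and q_minus)
    return True

print(check_quotient_digit())
```

Only `sign_gate` would need a transistor-level schematic in such a scheme; the second signal reuses it with its inputs rewired.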
In figure 1 we can see that the critical path of the divider passes through the head cell. More
precisely, the computation time of the $q_j^-$ and $q_j^+$ signals is very important. This is why the
complex gates which generate these two signals have to be carefully optimized in terms of speed.
The tail cell is much smaller than the head cell. Depending on the implementation, its transistor
count is two to four times smaller than that of the head cell. Each tail cell has 5 input signals and
generates just two. This means that the complexity of the gates inside a tail cell is lower than that of
the gates in the head cell. Instead, the main issue with the tail cell is the layout area, because for an n-bit
divider the number of tail cells is n(n - 1). As a result, the tail cells account for
almost all of the divider's area, especially for a large number of bits (n).
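The cell counts above lend themselves to a quick estimate. The Python sketch below computes the number of head and tail cells and the share of head/tail transistors located in tail cells, for a given ratio between head and tail transistor counts. It uses transistor count as a rough stand-in for area and ignores the quotient conversion block; both simplifications, and the function names, are assumptions of the example.

```python
def cell_counts(n):
    """Number of head and tail cells in an n-bit divider array."""
    heads = n
    tails = n * (n - 1)
    return heads, tails

def tail_transistor_share(n, head_to_tail_ratio):
    """Fraction of head/tail transistors that sit in tail cells, if one head
    cell has head_to_tail_ratio times the transistors of one tail cell."""
    heads, tails = cell_counts(n)
    return tails / (tails + heads * head_to_tail_ratio)

for n in (4, 8, 16, 32):
    low = tail_transistor_share(n, 4)    # head cell 4x a tail cell
    high = tail_transistor_share(n, 2)   # head cell 2x a tail cell
    print(n, cell_counts(n), round(low, 3), round(high, 3))
```

Under these assumptions the share simplifies to (n - 1) / (n - 1 + ratio), so it grows towards 1 as n increases.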
The divider's critical path runs through the tail cells too (fig. 1), so their computation time
affects the overall computation speed of the whole divider. Schematic optimizations for both area and
speed are thus mandatory for this cell.
Finally, the quotient conversion block is a rather scholastic one. It consists of several identical cells
arranged in a regular structure. If applied, area optimization criteria will however impose the use of
two different types of cells, in order to take advantage of the fact that quotient conversion is faster than
the output rate of the quotient digits. This may complicate the layout design of the quotient conversion
block somewhat. Quotient conversion is done "on the fly", so the main optimization for this block is to
choose the minimum number of rows in the quotient conversion block based on the ratio between the
computation time of the $q_j$ signals and the traversal time of a row of cells in the quotient conversion block.
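To make the idea of converting "on the fly" concrete, here is a behavioural Python sketch of the textbook on-the-fly conversion scheme for signed digits. It is offered as an illustration only and is not claimed to be the cell structure of the conversion block discussed here; the function name `on_the_fly_convert` is hypothetical.

```python
def on_the_fly_convert(digits):
    """Convert signed quotient digits (-1, 0, 1), MSB first, to an integer.

    Two candidates are kept, Q and QM = Q - 1. Each incoming digit extends
    them by one bit, so no carry ever has to propagate through Q.
    """
    q, qm = 0, -1          # empty prefix: Q = 0, QM = Q - 1
    for digit in digits:
        if digit == 1:
            q, qm = 2 * q + 1, 2 * q        # Q: append 1 to Q, QM: append 0 to Q
        elif digit == 0:
            q, qm = 2 * q, 2 * qm + 1       # Q: append 0 to Q, QM: append 1 to QM
        elif digit == -1:
            q, qm = 2 * qm + 1, 2 * qm      # Q: append 1 to QM, QM: append 0 to QM
        else:
            raise ValueError("digits must be -1, 0 or 1")
    return q

# Digits 1 0 -1 1 stand for 8 + 0 - 2 + 1.
print(on_the_fly_convert([1, 0, -1, 1]))
```

Because each step only selects one of the two candidates and appends a bit, the converted quotient is available as soon as the last digit arrives.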
Given the considerations above, the three parts of the divider's structure can constitute well
balanced design projects for laboratory work. The teams in charge of designing the divider's
parts are stimulated to cooperate in order to deal efficiently with issues like minimizing the divider's area,
optimizing the critical path, choosing the number of rows in the quotient conversion block based on delays,
power distribution, common cell pitch, position of the communication pins for assembly by abutment, etc.
At the same time, each of the divider's parts has its own specific hot spot. The designers of the head
cell will have to deal with complex gates and with minimizing the computation time of the $q_j^-$ and
$q_j^+$ signals. The tail cell layout has to be as small as possible, and the computation time of its
signals has to be minimized too. The less experienced students could be grouped in the team charged with
designing the quotient conversion block, which is likely to be a good starting project for them.
For an arithmetic operator, the worst case computation time is the only one considered when evaluating
performance. The students will have to pay attention to the evaluation and minimization of the worst
case delay for the critical path signals of their cells. This implies determining the worst case sequence
of input vectors and simulating several schematic structures.
V - Some variants
There are several implementations to choose from for the divider's cells. There are also several
choices for each cell's schematic. Let us skim through some of them.
Clearly the most complex one, the head cell can be implemented using either static or dynamic CMOS
logic. Of course, using dynamic logic efficiently would require major changes in the whole divider's
structure, but for an educational project its use may be restricted to the head cell.
The tail cell is structurally a full adder completed with a multiplexer on one of its input signals.
The cell can be implemented in this block structure entirely with CMOS gates or with transmission gates.
Another choice may be to derive the cell's schematic from the signals' truth tables, in which case one of
the most advantageous choices is the one using just transmission gate based multiplexers and inverters.
For the quotient conversion cell it is also possible to use a combination of pass transistors and CMOS
gates in order to minimize the transistor count.
So far we have discussed only the full custom (i.e. transistor level) implementation of the cells.
This gives the best performance, but for CAD tools teaching purposes one might consider a standard
cell based design too. In this latter case, the freedom of choice in schematic design is somewhat
restricted with respect to the full custom design, and there are fewer optimizations to look for. But the
standard cell approach combined with the Automatic Place and Route facility of the CADENCE Edge™
design environment is by far the fastest way to get the layout of a circuit using this CAD tool.
The most time consuming and tedious stage of the project is certainly the layout design, if it is done
by hand. It can be made considerably shorter and a lot more pleasant by using "The Layout
Synthesizer" tool (LAS), combined with the Symbolic Layout Editor and Compactor. These tools are both
provided in the CADENCE Edge™ environment. The LAS is able to automatically generate a symbolic
layout of a cell using the transistor level schematic as its input. The generated symbolic layout may be
further optimized using the Symbolic Layout Editor and Compactor. Of course, to have more complete
control over the cell's layout, one may use only the Symbolic Layout Editor and Compactor from the start,
designing the symbolic layout itself.
VI - Scheduling
Depending on the number of laboratory hours allocated for the project and on the students' level of
knowledge in microelectronics, the theme definition can be biased towards either a more automatic or a
more manual approach.
In any case, the theme should compel the hardworking student to go through and master all the
facilities offered by the design environment for the given approach. Logic simulation to check schematic
validity, electrical simulation for timing optimization, and post-layout simulations on extracted schematics
with parasitics are a must. These should be complemented by the use of the various features of the
Automatic Place and Route tool or, at the other extreme, those of the LAS and the Symbolic Layout Editor
and Compactor tool.
Given the theme's complexity, it would be a good idea to spread the allocated laboratory hours
over two semesters. Most of the students' time should be devoted to circuit and layout
design as well as to studying various optimization alternatives. This way, the computer working time
will be used efficiently to implement the best choices and to learn the design environment. This means
that the project does not need significant computing resources, which is an important issue
for a lab with limited hardware.
VII - Expected results and comments
The proposed work is rather large and complex. In the end it should clearly distinguish the hardworking
students from the others.
The advantages of such a work are important and numerous:
• it offers the feeling of a real project (although a small one) and could constitute a valuable starting
experience for further work in VLSI ASIC design.
• it develops team collaboration and inter-team cooperation at various stages of the design,
mainly in the floor planning phase (layout design).
• as a high performance circuit, it should stimulate the students' commitment to searching for
optimizations, making creative use of their electronics knowledge as well as acquiring new knowledge.
• a static CMOS logic implementation of the divider is already available, whose performance
could be taken as a reference by the students in order to evaluate their own work. This should
make them willing to achieve better results, stimulating their inventiveness.
• division is a very well known operation, which does not need intricate specifications.
Skills in using the design environment should be taken into account in the final grade.
References
[1] ANSI/IEEE, IEEE Standard for Binary Floating-Point Arithmetic, ANSI/IEEE Standard 754-1985,
1985
[2] A. Avizienis, "Signed-digit number representation for fast parallel arithmetic", IRE Transactions on
Electronic Computers, 10, 1961, pp. 389-400
[3] C. Y. Chow and J. E. Robertson, "Logical design of a redundant binary adder", in Proc. 4th
Symposium on Computer Arithmetic, 1978, pp. 109-115
[4] L. Montalvo and A. Guyot, A hybrid radix-4 divider with operand scaling. Unpublished internal
paper, TIMA Grenoble, 1993
[5] I. Moussa, A. Skaf and A. Guyot, Design of a GaAs redundant divider, VLSI'93 Grenoble, 1993
[6] I. Moussa, A. Guyot and P. Rost, Design and comparison of GaAs and CMOS dividers, ESSCIRC'93,
Sevilla, 1993
[7] J. M. Muller, Arithmétique des ordinateurs, Paris: Masson 1989
[8] J. E. Robertson, "The correspondence between methods of digital division and multiplier recoding
procedures", IEEE Transactions on Computers, C-19 No. 8, 1970
[9] N. Weste & K. Eshraghian, Principles of CMOS VLSI Design, Reading, Massachusetts: Addison-
Wesley Publishing Company, 1985
[10] J. William & V. C. Hamacher, "A linear-time divider array", Canadian Electrical Engineering
Journal, Vol. 6 No. 4, 1981
[11] Mead & Conway, Introduction to VLSI systems, Addison-Wesley Publishing Company, 1980
