A demonstration of CMOS VLSI circuit prototyping in support of the site facility using the 1.2 micron standard cell library developed by National Security Agency by Smith, Edwyn D.
FINAL REPORT
A DEMONSTRATION OF CMOS VLSI CIRCUIT PROTOTYPING IN SUPPORT
OF THE SITE FACILITY USING THE 1.2 µm STANDARD CELL LIBRARY
DEVELOPED BY NATIONAL SECURITY AGENCY
by
Edwyn D. Smith
Associate Professor of Electrical Engineering
The University of Toledo
2801 West Bancroft Street
Toledo, Ohio 43606
prepared for
NATIONAL AERONAUTICS AND SPACE ADMINISTRATION
Lewis Research Center
Cleveland, Ohio
Space Electronics Division
Monty Andro, Technical Officer
Grant NAG 3-1036
March 1991
https://ntrs.nasa.gov/search.jsp?R=19910012113 2020-03-19T18:55:25+00:00Z
ACKNOWLEDGMENT
It is a pleasure to acknowledge the support provided by
this grant to The University of Toledo. The chip designs
created while pursuing the objective of the grant served well
as foundations upon which four Master of Science in
Electrical Engineering theses were constructed. From the
University perspective, that is good "mileage" indeed. As the
project was conceived as a demonstration of standard cell
CMOS ASIC implementation of low-cost logic, the fact that
totally inexperienced students could produce working designs
on first silicon would be convincing evidence. Let us hope it
will prove to be so. The P.I. would like to thank the NASA
Technical Officer, Mr. Monty Andro, for assistance, for
numerous helpful suggestions, and much patience in answering
questions from several struggling designers. Finally, I
should like to speak a word of thanks to NASA on behalf of
Professor Arthur R. Thorbjornsen whose untimely death in
September, 1990, left the project and the University poorer.
INTRODUCTION
It was proposed to design two Si CMOS ASIC's, a Data
Generator chip and a Data Checker chip. The effort was made
more specific by some guidelines and constraints, summarized
below. The logical design of these circuits was supplied
by Lewis Research Center. It was agreed that the designs
would be implemented to the extent possible using only cells
from the scalable CMOSN Standard Cell Library developed by
the National Security Agency [1]. It was further agreed that
the designs be scaled to 1.2 µm technology with the
understanding that a small number of prototype chips be
fabricated through MOSIS [2]; the cost of fabrication to be
borne separately by Lewis Research Center. Preliminary and
full-functional testing was to be shared between The
University of Toledo and Lewis Research Center as
appropriate.
SUMMARY OF EFFORT AND PERSONNEL INVOLVED
The official period of performance was 14 April 1989
through 15 June 1990, including one no-cost extension. The
informal period of performance began approximately November,
1988 and continued through the present (March, 1991). Permit
me to explain what is meant by "informal period of
performance." Five international graduate students elected to
contribute to this grant activity and use these same
contributions as major portions of their M.S. theses. These
students were not supported at any time by monies charged to
this grant. I was given to understand by both the grant
Technical Officer and the Office of University Affairs (NASA
LeRC) that there was no prohibition or objection to these
students being so involved. It follows that they constituted
a volunteer labor force which accomplished most of what is
reported herein.
Professor Arthur R. Thorbjornsen participated as co-P.I.
in the grant activity from its inception to his death in
September, 1990. All activity has remained continuously under
the general supervision of Dr. E. D. Smith, P.I. The student
volunteers were/are by name:
Mr. Anurag Gupta
Ms. Shobha R. Mallarapu
Mr. Anil S. Kapatkar
Mr. Sathyanarayana K. Rao
Mr. Kin Lui
SUMMARY OF ACCOMPLISHMENTS TO DATE
The CMOSN standard cells, when read into the University of
California, Berkeley, layout editor MAGIC, cause a minor but
annoying design rule violation to be reported. Dr.
Thorbjornsen modified many of the cells to remove the
violation; the infraction occurs in the second layer of
metal.
Dr. Thorbjornsen designed and submitted to MOSIS a
simple test chip, intended to verify the library-MAGIC-MOSIS
pipeline at 1.2 µm geometry. The prototypes were fully
functional and indicated no difficulties. This work was
accomplished during Spring quarter, 1988-89, and the first
ten weeks of Summer, 1989. These prototypes were fabricated
at University expense.
Mr. Anurag Gupta and Dr. E. D. Smith assumed
responsibility for partitioning the total task into subunits
and assigning the subunits to student volunteers. It quickly
became apparent that the total project, Data Generator plus
Data Checker could not be accommodated on a single chip.
Thus, the first partition was to make the Data Generator and
the Data Checker separate projects. The Data Generator was
given first priority. Anurag Gupta agreed to be responsible
for the logical design of the Data Generator as well as
continuing to help coordinate the overall undertaking. Ms.
Shobha Mallarapu agreed to be the principal layout designer
for the Data Generator. As the layout approached completion,
Anurag and Shobha cooperated to extract the interconnect
wiring capacitances which Anurag used in hand-calculated
estimates of the propagation delays--a first-order procedure
to validate his initial timing analysis. At the time, we did
not have a functional digital simulator capable of accepting
the CMOSN library cells directly.
Well after the Data Generator layout was complete, we
learned there was a major mistake in the logical design. The
error concerned the all-zeros output data word--both as to
when and how often it appears in the sequence of data words
and the fact that the all-zeros data word has to have
risetimes and falltimes essentially as fast as any other data
word. The first Data Generator layout, which I call Chip1, was
put on a 40-pin MOSIS standard frame, the smallest available
at 1.2 µm geometry. Chip1 very nearly filled the 40-pin
frame. After a good bit of discussion, it was decided that a
redesign would need to go to the next larger frame size, a
64-pin. The MOSIS price schedule revealed that a 64-pin chip
at 1.2 µm geometry would cost about ten times as much as a
40-pin chip. The Lewis technical officer requested we use two
40-pin chips rather than one 64-pin. The additional circuitry
required was put on a second 40-pin layout, Chip2. A small
amount of rework was done to Chip1. The major change was the
replacement of small pull-up transistors by large pull-down
transistors to speed up the transition times for the
all-zeros data words. The capacitance of sixteen large
transistor gates connected in parallel was thought to be too
big to drive comfortably with the output of a standard input
buffer. An inelegant but workable solution was to omit the
input buffer and drive the line directly from the output pad
driver on Chip2. As this connection between the two chips is
presumed to be short, dedicated, and isolated, not going
anywhere else, little ringing or other signal corruption is
anticipated. The design and layout of Chip2, a relatively
uncomplicated piece of work, was done mainly by Anil Kapatkar
with some assistance from Anurag and Shobha. Technical
details of the Data Generator, Chip1 and Chip2, are included
as Appendix A; most of the material was excerpted from M.S.
theses by Anurag and Shobha [3]-[4].
Anil Kapatkar assumed responsibility for the logical
design of the Data Checker. It was immediately apparent that
the I/O requirements could not be accommodated by a 40-pin
frame even had the necessary circuitry been able to fit
within the available area which it didn't. Again, by request
of the Lewis technical officer, Anil began considering ways
in which the Data Checker could be designed so as to allow
the complete circuit to be divided between two 40-pin frames.
As the design effort progressed, it became clear that it
would be risky if even possible to attempt a scheme whereby
all the necessary processing steps would be accomplished
within a single period of a 13.2 MHz clock. By suggestion of
the Lewis technical officer, the strategy for logical design
was redirected towards a pipeline approach--a technique
retaining synchronous data flow but permitting the processing
of a single data packet to extend over more than one clock
period.
Anil proposed a design which seemed to meet all
pertinent criteria. Basically, it divided the 16-bit input
data words into a 12-bit portion and a 4-bit portion. A block
of circuitry consisting of input registers for the 12-bit
portion, XOR gate comparators, a 12-bit population counter,
and a small amount of control circuitry was designated to go
on one 40-pin frame called (not too originally) Chip1. It was
presumed that the remainder of the Data Checker could be put
on a second 40-pin frame, Chip2; a rough check done by simply
comparing the total area of the cells required to the core
area available indicated it would be a tight fit. The Lewis
technical officer having given permission to proceed, Anil
performed a hand-calculated timing analysis which indicated
the proposed design to be workable and, hopefully, robust.
Here, too, we lacked a simulator capable of doing critical
timing analyses while working directly with CMOSN standard
cells. Anil proceeded to do the layout of Data Checker,
Chip1, a relatively simple piece of circuitry.
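The comparison step assigned to Data Checker Chip1 can be illustrated with a short behavioral sketch. This is not the authors' circuit. It assumes, for illustration only, that the 12-bit portion is the low twelve bits of the 16-bit word and that the population counter counts the bits that differ between the received and the expected word. The function names are hypothetical.

```python
def split_word(word: int) -> tuple[int, int]:
    """Split a 16-bit word into a 12-bit and a 4-bit portion.

    Assumption: the 12-bit portion is the low twelve bits. The report
    does not say which bits go to which chip.
    """
    return word & 0xFFF, (word >> 12) & 0xF


def popcount12(value: int) -> int:
    """Count the set bits in the low twelve bits of value."""
    return bin(value & 0xFFF).count("1")


def chip1_error_count(received: int, expected: int) -> int:
    """XOR-compare the 12-bit portions and count the mismatched bits."""
    rx12, _ = split_word(received)
    ex12, _ = split_word(expected)
    return popcount12(rx12 ^ ex12)
```

In this sketch a mismatch confined to the upper four bits is invisible to Chip1 and would have to be caught by the remaining circuitry on Chip2.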
Mr. Sathy Rao agreed to undertake the layout of Data
Checker, Chip2. As of the date of this report, this work is
still ongoing. Our present estimate is that the circuitry
dedicated to Chip2 will fit on a 40-pin frame, but only a
completed or nearly completed design will confirm that.
Whenever a finished layout becomes available, we shall be
glad to pass it along to NASA LeRC. Technical details of the
Data Checker, Chip1 and Chip2, to the extent they are
available, are included as Appendix B. The bulk of this
material is excerpted from the M.S. thesis by Anil Kapatkar
[5].
Mr. Kin Lui is in the process of bringing online for us some
logic simulation capability. One tool will be a much improved
version of the simulator RSIM capable of working with fully
expanded CMOSN library cells. Another tool will be the
proprietary Digital Logic Simulation module of PSpice using
some subcircuit definitions created by Ms. Kalpana
Vijayakumar [6]. The subcircuits implement logic macros with
one-to-one correspondence to a subset of the CMOSN standard
cell library. We are even now in the process of joining the
Massachusetts Microelectronics Center; among other benefits,
we hope to gain access to the HILO simulator, a product of
the GenRad Corporation. As I now read it, we should be able
to take a design based on CMOSN library cells, expand it in
CIF, read it into Octtools, and then translate it into the
HILO "language."
The Data Generator and Data Checker chip sets are
convenient examples of small to medium complexity CMOS ASIC's
designed using the CMOSN standard cell library. Hopefully, in
due time, Kin should be able to examine one to all of the
chips using each of the simulators mentioned above. This
should provide us not only with a comparison of the merits of
the several simulators but should also give a firm decision
whether the chips are functionally correct. As any new
information becomes available, we shall be glad to pass it
along.
[1] CMOSN Cell Notebook: Scalable 2.0 and 1.2 Micron
CMOS/Bulk Cell Family, developed by the National
Security Agency, distributed by MOSIS, USC Information
Sciences Institute, 4676 Admiralty Way, Marina Del Rey,
CA 90292-6695.
[2] The MOSIS Service, USC/Information Sciences Institute
4676 Admiralty Way, Marina Del Rey, CA 90292-6695
[3] Anurag Gupta, "A Design Exercise: Transforming a Block
of Existing Circuitry into a CMOS ASIC using Standard
Cell Methodology," M.S. thesis, The University of
Toledo, Toledo, Ohio, June, 1969.
[4] S.R. Mallarapu, "A Demonstration of CMOS VLSI Circuit
Prototyping in Support of the SITE Facility using the
1.2 µm Standard Cell Library," M.S. thesis, The
University of Toledo, Toledo, Ohio, June, 1989.
[5] Anil Kapatkar, "A CMOS ASIC Implementation of an
Existing Block of Circuitry using the Standard Cell
Approach," M.S. thesis, The University of Toledo,
Toledo, Ohio, in progress.
[6] Kalpana Vijayakumar, "Extension of the Proprietary
PSpice Digital Simulation Library to Include the
Scalable CMOS Standard Cell Library Developed by the
National Security Agency," M.S. thesis, The University
of Toledo, Toledo, Ohio, in progress.
Technical Details of the Data Generator, Chip1 Plus Chip2.
1. Logic Diagram of Chip1 ....................... A1
2. Logic Diagram of Chip2 ....................... A2
3. TABLE 1--A Listing of the Mappings,
16-bit to 64-bit, ............................ A3
4. Pin Assignments and Interchip
Wire list .................................... A4
5. A Discussion of the Logic Design
and the Timing Relationships ................. A5
6. A Discussion of the Physical
Layout Strategy .............................. A16
[Figure: Functional block diagram for Data Generator]
A1
A2
The output of the Data Generator originates from an 8-bit
up counter and an 8-bit down counter; each pair of
counter settings yields four 16-bit data words. The
following convention is used: 0 refers to the LSD and 7
refers to the MSD. Pin assignments and bit mappings are
summarized in the table below.
TABLE 1
Output Data Mapping and Pin Assignments

Package   O/P Bus    Up/Down Counter Connections
Pin       Line
Number    Number     Word 1   Word 2   Word 3   Word 4
33        1          7D       1U       2U       0D
31        2          3D       7D       5U       5D
30        3          7U       4D       2D       3D
29        4          2U       7U       6U       2D
27        5          6D       6D       4U       6U
18        6          4D       1D       3U       5D
17        7          ?        ?        ?        ?
13        8          ?        ?        ?        ?
11        9          ?        ?        ?        ?
10        10         1U       2D       1U       1U
9         11         1D       5D       4D       7U
7         12         3U       0U       0U       2D
2         13         6U       1D       2U       5U
40        14         4U       4U       7D       0D
39        15         5D       6D       7U       3D
37        16         3D       0U       5U       6D

[Rows 7-9 are garbled in the source; the surviving entries, in scan
order, are 6U, 3U, 2D, 5U, 0D, 0D, 0D, 1D, 4D, 6U, 3D, 4U.]
A3
Data Generator, Chip1 Plus Chip2, Package Pin Assignments and
System--Chip1--Chip2 Interconnect Wiring
The pin assignments of Data Generator--Chip1 are shown in the
drawing on page A23.
The pin assignments of Data Generator--Chip2 are given below.
I/O Power ........................ Pins 10, 17
I/O Ground ....................... Pins 1, 30
Core Circuitry Power ............. Pins 5, 35
Core Circuitry Ground ............ Pins 20, 27
System Clock In .................. Pin 22
System Enable In ................. Pin 23
System Reset In .................. Pin 19
Control Out ...................... Pin 2
Enabled Clock Out ................ Pin 28
Pass-Through Clock Out ........... Pin 21
All other pins are unused.
System Reset is supplied to both Chip1 and Chip2. System
Clock and System Enable are supplied only to Chip2. Enabled
Clock Out and Control Out are passed from Chip2 to Chip1. The
wire list is tabulated below.
System           Chip1      Chip2      (Function)
Clock                       Pin 22
Enable                      Pin 23
Reset            Pin 20     Pin 19
                 Pin 21     Pin 2      Control
                 Pin 23     Pin 28     Enabled Clock
The pin assignments for the output data bus are given in
TABLE 1, page A3, and in the drawing on page A23.
A4
Chapter III
The Data Generator
System Specification
This thesis documents the conversion of the Data Generator circuitry,
made available to us in TTL technology, into a pair of CMOS ASIC chips
using the 1.2 µm standard cell library made available by the National
Security Agency (NSA). The redesigned functional block diagram of Data
Generator Chip 2 is shown on page A1.
Given a clock of 13.2 MHz frequency, we are to generate a 16-bit word
by using an 8-bit UP and an 8-bit DOWN counter, at one-fourth the given
frequency (3.3 MHz), then do a 16-bit to 64-bit mapping and finally
generate four, 16-bit words sequentially that are delivered to the 16-bit
output bus. Thus, we are generating four words (16-bit wide) from a
single word (16-bit wide) by accomplishing the 16-bit to 64-bit mapping
that was already specified to us. Table 1 shows the bit by bit mapping of
both the 8-bit UP and DOWN counters. We are also to skip a 3.3 MHz clock
pulse (254th or 255th) thereby generating the same word twice for that
pulse. The output is pulled to an all-zeros state under control of circuitry
on Chip 2. Thus, the counters would not increment or decrement for that
particular clock pulse. So, with all of this in mind, let us divide the logic
diagram of the Data Generator into four major sections:
A5
1) Logic for generating an internal clock of 3.3 MHz from the given 13.2
MHz clock.
2) Logic for skipping the 254th/255th clock pulse, i.e., generate the same
word twice for that pulse.
3) Logic to generate four words sequentially using sixty-four tristate
buffers in four blocks of sixteen each.
4) Control circuitry to disable the tristate buffers, and pull the outputs
low.
Generating the 3.3 MHz clock
Two edge-triggered D Flip-Flops (DFFs) are used as frequency dividers,
each dividing the frequency by two, hence generating a 3.3 MHz clock from
the given 13.2 MHz clock. The DFF has a single data input D, and it
operates such that the logic level present at the D input is transferred to
the Q output only on the positive or the negative going edge of the input
clock signal. In our case, the D input is transferred on the falling edge of
the clock signal. The D input has no effect on Q at any other time.
The given clock of 13.2 MHz is applied to the clock (CLK) input of the
first DFF (refer to page A1), with Q bar fed back to the D input. Thus, the
output Q generates a clock whose frequency is one-half the frequency of
the given 13.2 MHz clock.
A6
Now, the Q output of the first DFF (i.e. 6.6 MHz clock) is applied to one
of the inputs of an Exclusive OR (XOR), the other input of which is
connected to the Q output of the second DFF. The 6.6 MHz clock also serves
as the S0 bit for the 2:4 line decoder, which will be discussed later in the
chapter. The given clock of 13.2 MHz is also applied to the CLK input of the
second DFF and the output of the XOR is fed to the D input of the same DFF.
Thus, with this configuration, we get a 3.3 MHz clock at the Q output of the
second DFF. In short, the two flip-flops and the exclusive OR are simply
interconnected to form a 2-bit synchronous up counter -- this is a standard
design for synchronous up-down counters. To see the actual waveforms,
please refer to the timing diagrams illustrated in chapter four. So, at this
point we have the 3.3 MHz clocks S1 and S1 bar. The same clock also
serves as the S1 bit for the 2:4 line decoder, discussed later in the chapter.
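The divider described above can be checked with a small behavioral sketch. It models only the logic, not the timing: each loop iteration stands for one falling edge of the 13.2 MHz clock, the first flip-flop takes its own Q bar as D, and the second takes the XOR of the two Q outputs. The function name and the assumption that both flip-flops start at 0 (after reset) are illustrative.

```python
def divide_by_four(n_falling_edges: int) -> list[tuple[int, int]]:
    """Return the (S1, S0) state after each falling edge of the input clock.

    Assumption: both flip-flops start at 0, as they would after a reset.
    """
    q0 = 0  # Q of the first DFF  -> S0 (6.6 MHz)
    q1 = 0  # Q of the second DFF -> S1 (3.3 MHz)
    states = []
    for _ in range(n_falling_edges):
        d0 = 1 - q0      # Q bar fed back to D of the first DFF
        d1 = q0 ^ q1     # XOR output fed to D of the second DFF
        q0, q1 = d0, d1  # both DFFs are clocked by the same edge
        states.append((q1, q0))
    return states


# One full cycle of the 2-bit up counter: 1, 2, 3, 0.
cycle = divide_by_four(4)
```

S0 toggles on every input edge and so repeats every two input clocks (6.6 MHz); S1 repeats every four input clocks (3.3 MHz).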
Skipping the 254th/255th clock pulse
The control circuitry to inhibit the 254th/255th clock pulse requires
an 8-bit ripple counter. The operational mode of the counter is
determined by the SELECT input and as it is used as an UP counter in our
circuitry, the SELECT input is connected to a logic low (GND). The DATA
input of the ripple counter is connected to the 3.3 MHz clock S1 generated
at the Q output of the second DFF. All outputs for the counter occur on the
negative edge of the clock.
A7
The outputs (Q7-Q1) are connected to the four-input and three-input
NAND gates. This is done because on the 254th and 255th
pulses, all outputs (Q7-Q1) are high and as we want the outputs of the
two NAND gates to be low for these two pulses, all inputs to the NAND
gates must be high. The two outputs from the NAND gates are ORed
and the result, which is the output of the OR gate, is connected to one
of the inputs of the 2-input AND gate.
Due to the propagation delay across the 8-bit ripple counter, the
zero bit, which is supposed to disable the AND gate and hence stop
the counters from incrementing/decrementing, is delayed by approximately
55 nsec. This amount of time is sufficient for the AND gate to
pass the leading edge of the 254th clock pulse; the pulse then
ends 55 nsec later when the zero bit disables the AND gate. The falling
edge of the pulse in turn increments/decrements the counters and
this constitutes the 254th count even though we "wanted" to skip it.
But, having a logic low at the output of the OR gate for two successive
pulses keeps the AND gate disabled for both the pulses and so even
though the 254th pulse is counted by the counters, the next pulse, i.e.,
the 255th, is skipped because of the logic low at one input of the
2-input AND gate. This allows us to inhibit the 255th clock pulse from
incrementing/decrementing the UP/DOWN counters respectively.
The logic low at the output of the OR gate disables the AND gate, the
output of which is applied to the clock inputs of the UP and DOWN
counters, thereby preventing the counters from
incrementing/decrementing for that clock pulse.
A8
Thus, the previous word is repeated, but the outputs are pulled low by
the control circuitry on Chip 2. The timing diagrams in chapter four
illustrate the waveforms of all these clock pulses in question. The RESET of
the 8-bit ripple counter is given a logic high during normal operation.
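The decoding of the skip condition can be verified exhaustively with a short sketch. It is a logic-only model that ignores the 55 nsec ripple delay discussed above. The split of Q7-Q4 to the four-input NAND and Q3-Q1 to the three-input NAND is an assumption; the text states only that Q7-Q1 feed the two gates.

```python
def nand(*bits: int) -> int:
    """N-input NAND: low only when every input is high."""
    return 0 if all(bits) else 1


def clock_gate_enable(count: int) -> int:
    """OR-gate output for a given ripple-counter state (0 disables the AND gate)."""
    q = [(count >> i) & 1 for i in range(8)]  # q[0] is Q0, q[7] is Q7
    nand4 = nand(q[7], q[6], q[5], q[4])      # assumed grouping
    nand3 = nand(q[3], q[2], q[1])            # assumed grouping
    return nand4 | nand3


# Counter states for which the clock to the UP/DOWN counters is inhibited.
inhibited = [c for c in range(256) if clock_gate_enable(c) == 0]
```

Because Q0 is left out of the decode, the gate is held low for two consecutive counter states, which matches the two-pulse inhibit window described in the text.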
Generating four words (16-bit wide) sequentially
Looking at the functional block diagram of the Data Generator (page
A1), we see that the 16-bit word generated by the UP/DOWN counters is
mapped bit-by-bit to the sixty four tristate buffers. The sixty-four tristate
buffers are divided into four groups of sixteen each. Thus, it is convenient
to speak of four tristate buffers, each sixteen bits wide. The ENABLE input
of each buffer is connected to the output of one of four 2-input AND gates.
One input of each of the four AND gates is connected to one of the four
outputs of the 2:4 line decoder. The inputs to the 2:4 line decoder are the
S0 and S1 signals, generated at the outputs of the first and second DFFs
respectively. The 2:4 line decoder enables the four tristate buffers one at
a time in a sequence which repeats indefinitely. The four tristate buffers
being enabled one-by-one (sequentially), generate four data words
sequentially at the output for each state of the UP/DOWN counters. The
timing diagrams in chapter four show how the four tristate buffers are
enabled sequentially. The DATA inputs of the tristate buffers are derived
by the 16-bit to 64-bit mapping from the outputs generated by the 8-bit
UP counter and the 8-bit DOWN counter.
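A behavioral sketch of the word sequencing may help. The mapping used here is a placeholder, not the mapping of Table 1; it simply rotates the sixteen counter bits by four positions per word. The sketch also assumes that bus line index 0 is the least significant bit of the output word and that decoder output k enables buffer k, so it does not reproduce the actual readout order. All names are hypothetical.

```python
# Sixteen source bits: U0..U7 from the UP counter, D0..D7 from the DOWN counter.
BASE = [("U", i) for i in range(8)] + [("D", i) for i in range(8)]

# Placeholder 16-bit to 64-bit mapping (NOT Table 1): one 16-entry list per word.
PLACEHOLDER_MAPPING = [BASE[4 * k:] + BASE[:4 * k] for k in range(4)]


def map_words(up: int, down: int, mapping) -> list[int]:
    """Build the four 16-bit words presented to the four tristate buffers."""
    bits = {("U", i): (up >> i) & 1 for i in range(8)}
    bits.update({("D", i): (down >> i) & 1 for i in range(8)})
    words = []
    for table in mapping:
        word = 0
        for line, source in enumerate(table):
            word |= bits[source] << line  # assumed: line 0 is the LSB
        words.append(word)
    return words


def decode_2to4(s1: int, s0: int) -> list[int]:
    """One-hot outputs of the 2:4 line decoder."""
    selected = (s1 << 1) | s0
    return [1 if i == selected else 0 for i in range(4)]


def bus_word(words: list[int], s1: int, s0: int) -> int:
    """Word driven onto the output bus by the single enabled buffer."""
    enables = decode_2to4(s1, s0)
    return next(w for w, en in zip(words, enables) if en)


words = map_words(0xFF, 0x00, PLACEHOLDER_MAPPING)
```

Stepping (S1, S0) through its four states while the counters hold one value delivers the four mapped words in turn, one per period of the 13.2 MHz clock.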
A9
Control Circuitry
Sixteen gated pull-downs are connected in parallel with the outputs
from the tristate buffers; they pull the data outputs low when activated by
the control signal from Chip 2. These devices are simple N-channel MOS
transistors with their sources tied to GND, their drains connected to the
output bus in parallel with the outputs from the tristate buffers; the gates
are connected together and controlled by the control signal from Chip 2.
Please refer to pages A1 and A2. Hence, to disable the tristates, the control is
given a logic high which forces a low output from the AND gates; this in
turn, imposes a high impedance state on the tristate buffers. Additionally,
the logic high at the gates of the pull-downs turns the transistors on, thus
pulling all the outputs low. A logic low at the control input is used for the
normal operation of the Data Generator.
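The interaction between the decoder, the control signal, and the pull-downs can be summarized for a single bus line as follows. This is a sketch under the assumption that the second input of each enable AND gate is the inverted control signal; the text says only that a high control forces the AND outputs low. A return value of None stands for a floating (high impedance) line.

```python
from typing import Optional


def bus_line_value(data_bit: int, decoder_out: int, control: int) -> Optional[int]:
    """Resolve one output bus line.

    data_bit    -- DATA input of the tristate buffer on this line
    decoder_out -- decoder output feeding this buffer's enable AND gate
    control     -- control signal from Chip 2 (high = force all-zeros word)
    """
    enable = bool(decoder_out) and not control  # assumed inverted-control input
    if enable:
        return data_bit  # buffer drives the line
    if control:
        return 0         # N-channel pull-down is on
    return None          # buffer disabled, pull-down off: line floats


normal = bus_line_value(1, 1, 0)
forced_low = bus_line_value(1, 1, 1)
floating = bus_line_value(1, 0, 0)
```

In the real circuit exactly one of the four buffers on each line is enabled during normal operation, so the floating case arises only for the individual buffers that are not selected.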
This completes the description of the logic operation of the Data
Generator circuitry. The propagation delays and the timing diagrams for
the circuitry are discussed in the next chapter.
A10
Timing Diagrams
Figure 16 illustrates the timing diagrams for the Data Generator, Chip
1, circuit with approximate rise and fall times and the propagation delay of
one clock with respect to another. It was only after looking at these
diagrams that we realized the need for a clocked latch at the input of
the third buffer (refer to page A1).
Actually, what was happening was that the UP counter and the DOWN
counter were being incremented/decremented precisely at the time the
third (#3) tristate buffer was being enabled.
It was thus questionable whether the 16-bit data word output from
the number three buffer (actually the last because the readout order is
4-1-2-3) would remain valid long enough to satisfy system requirements.
A11
Timing information of standard cells used in the Data Generator circuit.
Equations used (for worst case at 125 C)

CELL NAME              Pd(0-1)             Pd(1-0)             Tris               Tfal
2-Input NAND           1.59 + (2.60)CL     1.11 + (1.98)CL     2.82 + (6.71)CL    1.79 + (4.44)CL
2-Input AND            0.72 + (2.53)CL     0.75 + (2.16)CL     1.18 + (5.10)CL    1.11 + (4.14)CL
8-bit UP/DOWN          11.08 + (1.88)CL    11.60 + (1.50)CL    4.50 + (3.90)CL    3.68 + (2.39)CL
2:4 Line DECODER       3.79 + (1.83)CL     5.36 + (1.53)CL     1.61 + (4.01)CL    2.14 + (2.37)CL
D-FLIP FLOP            4.82 + (1.61)CL     5.36 + (1.34)CL     2.36 + (3.93)CL    2.32 + (2.34)CL
Tri-State Buffer       2.16 + (2.13)CL     2.60 + (1.91)CL     1.57 + (4.49)CL    1.38 + (3.54)CL
8-bit RIPPLE COUNTER   5.46 + (1.82)CL     6.56 + (1.53)CL     2.78 + (3.75)CL    2.68 + (2.34)CL
2-Input NOR            1.58 + (1.62)CL     1.60 + (1.24)CL     3.09 + (3.82)CL    2.46 + (2.45)CL
2-Input OR             0.77 + (1.33)CL     0.76 + (1.24)CL     1.29 + (2.90)CL    1.09 + (2.55)CL

[The columns of computed Pd(0-1), Pd(1-0), Tris, and Tfal values, in ns,
are not legible in the source.]
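The linear delay equations in the timing table lend themselves to a small calculator of the kind used for a hand timing analysis. The coefficients below are copied from the Pd equations in the table; the pairing of each equation with a cell name follows the order in which they appear and should be treated as an assumption, as should the unit of CL (taken here to be pF, which the table does not state). Delays are in ns. The load values in the example path are invented for illustration.

```python
# (intercept, slope) for Pd = intercept + slope * CL, worst case at 125 C.
# Assumption: equations are matched to cells in table order; CL is in pF.
CELLS = {
    "2-input NAND": {"pd01": (1.59, 2.60), "pd10": (1.11, 1.98)},
    "2-input AND":  {"pd01": (0.72, 2.53), "pd10": (0.75, 2.16)},
    "D flip-flop":  {"pd01": (4.82, 1.61), "pd10": (5.36, 1.34)},
}


def delay_ns(cell: str, edge: str, cl: float) -> float:
    """Propagation delay of one cell for the given output edge and load."""
    intercept, slope = CELLS[cell][edge]
    return intercept + slope * cl


def worst_path_delay_ns(stages: list[tuple[str, float]]) -> float:
    """Sum the slower of the two edge delays over a chain of (cell, CL) stages."""
    return sum(
        max(delay_ns(cell, "pd01", cl), delay_ns(cell, "pd10", cl))
        for cell, cl in stages
    )


# Hypothetical path: flip-flop output driving a NAND gate (loads are made up).
example = worst_path_delay_ns([("D flip-flop", 0.2), ("2-input NAND", 0.5)])
```

Taking the slower edge at every stage is a deliberately pessimistic simplification; a more careful analysis would track which edge actually propagates through each inverting stage.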
A12
A13
In fact, in a worst-case scenario, it was doubtful whether the word would even be
valid at all. Introducing a clocked latch ensured a valid word at the
output of the third tristate buffer. As the latch stored the previous
word, the fact that the counters and the buffer were enabled at the
same time had no effect on the valid word (word stored by the latch)
being generated at the output. Thus, there is no doubt about the four
words being generated sequentially at the output. It is also true now
that every output word has exactly the same timing as every other one.
The timing diagrams also proved helpful
in connection with the 254th pulse that we wanted the counters to
skip. In fact, due to the propagation delay across the 8-bit ripple
counter, the zero bit, which is supposed to disable the AND gate and
hence stop the counters from incrementing/decrementing, is delayed
by approximately 55 nsec. This amount of time is sufficient for the
AND gate to allow the leading edge of the 254th pulse to pass; the
pulse is terminated 55 nsec later when the zero bit disables the AND
gate. The falling edge of the pulse in turn increments/decrements the
counters and this constitutes the 254th count even though we
originally intended to skip it.
This led us to design the skip circuitry such that the zero bit
disables the AND gate for two successive clock pulses (254th and
255th); it then happens that even though the 254th pulse is counted
we do end up skipping the 255th pulse, which is what we wanted to
do.
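The numbers behind this argument are easy to work out. The sketch below assumes a 50 percent duty cycle for the 3.3 MHz clock, which is what a flip-flop divider would be expected to produce, and models the AND gate output as the clock pulse truncated at the moment the delayed inhibit arrives. The names are illustrative.

```python
SYSTEM_CLOCK_HZ = 13.2e6
INTERNAL_CLOCK_HZ = SYSTEM_CLOCK_HZ / 4  # 3.3 MHz
RIPPLE_DELAY_NS = 55.0                   # delay quoted in the text


def period_ns(frequency_hz: float) -> float:
    """Clock period in nanoseconds."""
    return 1e9 / frequency_hz


def gated_pulse_width_ns(high_time_ns: float, inhibit_delay_ns: float) -> float:
    """Width of the pulse at the AND output when the inhibit arrives late."""
    return min(high_time_ns, inhibit_delay_ns)


internal_period_ns = period_ns(INTERNAL_CLOCK_HZ)
high_time_ns = internal_period_ns / 2    # assumed 50 percent duty cycle
runt_width_ns = gated_pulse_width_ns(high_time_ns, RIPPLE_DELAY_NS)
```

Under these assumptions the 254th pulse reaches the counters as a pulse of about 55 ns rather than about 152 ns. It is shortened, not removed, so its falling edge still clocks the counters.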
A14
Thus, we see that the timing information and the timing diagrams play
a very vital role in the logical design of the circuit. A brief description of
the physical layout and design of the Data Generator chips is provided
following.
A15
CHAPTER IV
PHYSICAL DESIGN OF THE DATA GENERATOR
Layout design is the process of translating a schematic diagram
representation into a corresponding representation consisting of geometric features on
several independent mask layers. All present day (MOS) IC chips contain some
unavoidable overhead: bonding pads, output pad drivers, input and output electrostatic
discharge protection circuitry, (occasionally) input receivers, and level shifters.
Dictated by packaging requirements, there is a semi-standard set of chip sizes and
rectangular shapes. A chip conforming to one of these semi-standard sizes and shapes
and containing the necessary associated overhead features is called a standard frame.
In the standard cell approach, the layout design of the circuitry within each
cell is pre-accomplished; layout drawing consists of placement of the cells and detailed
design of the interconnect wiring, the so called "place and route" problem. The first
task in creating the layout is to create a floorplan for the chip. The chip is broken into
blocks, and by looking at their relative sizes and the wiring between them, decisions are
made where the blocks should be placed. To draw the plan, we need an estimate of the
size and shape of the cells that will be used to make up the blocks plus the amount of
space that will be left between them for wiring channels. When the floorplan is
complete, the actual to-scale place and route is undertaken.
Design Methodology
Fig.7 shows a possible methodology for VLSI circuits [7]. It is a
combination of top-down and bottom-up approaches. It is top-down because it considers
A16
the overall functions to be performed and how these functions interact with each other
and with the system. At the same time, low-level details (concerning capabilities, etc.)
are considered to enable the designer to make informed, intelligent trade-offs. This
thesis deals with the physical design starting from chip block diagram level. The part
prior to it can be found in the M.S. thesis done by Mr. Anurag Gupta [9].
Floorplanning
It was decided to put five rows of cells on the chip. Part of the analysis that
led to the choice depends on how the data and control signals flow and how it is possible to
place the standard cell blocks and still remain within the constraints of actual space
available. These considerations dominate the process of deciding where and how far apart
to locate the various building blocks -- a process better known as floorplanning.
As the floorplan and the functionality of the individual blocks become better
defined, the actual cells to be used to implement the blocks must be determined. These
must be chosen from the cells available in the CMOSN standard cell library. The
floorplan of the Data Generator is shown in Fig.8.
The size of the chip is 3917x3917 lambda units. The estimated area
available for the placement of cells after placing the pad drivers, pad protectors, power
and ground pads is 2550x2550 square lambda units. As a standard cell approach is used
in this design, the height of the cells is standard at 250 lambda units. Hence for five
rows of cells, 1250 lambda units has already been used. The rest of the space can be
used for wiring.
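The area budget can be written out explicitly. The row and core figures below are those given in this section; the channel heights (250 lambda between rows and 80 lambda below the last row) are the values quoted later in the chapter. The conversion to microns assumes lambda = 0.6 µm for the 1.2 µm scaling of the library, which the text does not state.

```python
CHIP_EDGE_LAMBDA = 3917
CORE_EDGE_LAMBDA = 2550
ROW_HEIGHT_LAMBDA = 250
ROW_COUNT = 5
CHANNEL_HEIGHT_LAMBDA = 250   # between adjacent rows (quoted later in the chapter)
BOTTOM_CHANNEL_LAMBDA = 80    # below the last row (quoted later in the chapter)
LAMBDA_UM = 0.6               # assumption for 1.2 um scalable rules

rows_height = ROW_COUNT * ROW_HEIGHT_LAMBDA
wiring_height = CORE_EDGE_LAMBDA - rows_height

# Four channels separate five rows; one short channel sits under the last row.
channel_total = (ROW_COUNT - 1) * CHANNEL_HEIGHT_LAMBDA + BOTTOM_CHANNEL_LAMBDA
spare_height = wiring_height - channel_total

chip_edge_um = CHIP_EDGE_LAMBDA * LAMBDA_UM
```

With these figures the five rows take 1250 lambda, leaving 1300 lambda of core height for wiring, of which the stated channels use 1080 lambda.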
The placement of cells has to be decided depending on the interconnections to
be made between the cells. As each large buffer consists of 16 one-bit tristate buffers,
all the sixteen tristate buffers have to be placed next to each other. In addition to this
consideration, the outputs of the four large buffers must be connected to the same bus.
A17
Buffer 1 + Gated pull-ups | Buffer 2
Down-Counter
Up-Counter
Buffer 4 | Buffer 3
Decoder + D Flipflops | Rest of cktry to skip count | Ripple Counter
Fig.8 Floorplan for the Data Generator
A18
Corresponding bits of each large buffer plus an active pull-down are connected together and to a
rail of the bus. Keeping this in mind, buffers 1 and 2 are placed side by side and similarly buffers
3 and 4. The gated pull-downs are placed with buffer 1. Because the input to the third buffer has
to be passed through data latches from the up-down counters, the data latches are placed directly
above buffer 3 and next to both counters. The four AND gates used to enable the four large buffers
are placed next to buffer 4 because space is available there and that location brings them close to
other buffers. The rest of the Data Generator circuitry is placed together in a row because there
are many more connections among that group of cells than to the cells placed in other rows.
The cells are also arranged such that all the inputs are to the left and the outputs to the
right. But as there are sixteen outputs, they had to be distributed around the chip. There are only
three inputs -- Control from Chip 2, Reset and the Clock. The cell rows are centered leaving equal
space on either side of each row of cells for VDD and VSS buses. A space of 250 lambda was left
between the rows of cells for wiring. At the bottom of the last row, only 80 lambda units were left
as there are only a few connections to be made through that channel.
Although the wiring was considered in the floorplan, as the actual layout process
progressed, the 16-bit to 64-bit interconnect wiring layout was changed to accommodate the
required positions of individual wires as the design evolved and the necessary positions became
clear. The wiring section of the layout is discussed in detail later in this chapter.
The Workstation
The process of developing a VLSI chip demands a great deal of interaction between the
designer and the evolving design. Workstations support computer-aided engineering tools that
organize and expedite the interaction. From schematic creation to simulation and physical layout,
workstation tools help manage all interrelated design tasks associated with VLSI development and
allow easy updates to schematics and layouts. The workstation used to create this design was a Sun
3/160 running a 4.0 operating system.
One of the CAD tools available on the Sun workstations is called MAGIC. It is an interactive
layout editor and design rule checker. It does not have the capability of placing the cells
automatically but has the capability of routing automatically using netlists. In order to use netlist
routing, one has to specify the nodes that have to be connected to each other. Then it automatically
connects the nodes taking care of any crossovers that occur in the midst of its routing. However,
netlist routing was not used in this design in order to optimize the area of the chip. When the
netlist router is used, it is hard to predict how it will route: where it will place wires and how
long those wires will be. On the other hand, routing manually allows us to visualize the routing and
run the wires as desired. In order to avoid crossovers and contacts, all horizontal wires are run
in metal1 and all vertical wires in metal2. The technology used is 1.2um scalable CMOS and the
design rules are lambda based as has already been discussed in Chapter 1.
40 pin Standard Frame
A 40-pin tiny chip can accommodate the Data Generator. The frame of the chip has been
redesigned to meet the requirements of the Data Generator -- the number of inputs and outputs,
etc. Each input pin of the chip requires a pad protector and each output pin requires a pad driver.
As we need only three inputs and sixteen outputs, the remaining 21 pins could be used for power
and ground. It is good practice to provide separate power and ground pads and buses for the I/O
circuitry and the core circuitry. The big benefit of doing this is that it prevents power supply and
ground noise produced by the pad drivers from being distributed directly to the core circuit. A
good bit of bypassing can be done at the package pins. The power and ground pads for the pad
drivers and pad protectors are called power and ground. The VDD and VSS pads for the circuit are
called power circuit and ground circuit. This approach involves isolating the pad power and
ground circuits.
[Fig.9 Chip Power Distribution for the Data Generator. Labels shown: VSS, VDD, GND, PWR, cell rows.]
[Fig.10 I/O Pad Layout Structure. Bonding pad area: NWELL, PWELL, METAL1, VIA, METAL2, ..., PASSIVATION.]
As can be seen in Fig.10, the bonding area in the pad is 170x170 square
lambda units whereas the area of each pin in the frame is 175x175 square lambda
units. This prevents the bonding area of the pad from being centered on the pin when placing
the pad on the pin. Therefore, to be uniform throughout the chip, all pads are overlapped
on the pins with 2 units on the top, 3 units at the bottom, 3 units on the left and 2 units
on the right. To place the pads on the pins, the following procedure is adopted:
1). The MAGIC file of the 40-pin chip is opened.
magic 40pc22x22
2). One of the pads - input, output, power or ground pad is called on to the screen.
:getcell x2ipd
will bring the input pad onto the screen.
3). The pin where the pad has to be placed is chosen and its glass layer is selected by
pressing 's' two or three times. The command :what will give the name of the layer
that has been selected. Once the glass layer is selected, the lower left coordinates of the
box around it are obtained by pressing 'b' which is a macro for box. It gives the lower left
coordinates as ll=(x1,y1).
Similarly the glass layer of the bonding area of the pad is selected and its lower left
coordinates are obtained (x2,y2).
The pad now has to be moved (x1-x2-3, y1-y2-3) units in both directions so as to
be placed on the pin with an overlap of 3 units. After selecting the whole pad by pressing
'r', the following commands would place the pad exactly on the pin as desired:
:move east (x1-x2-3) or move west (x1-x2-2)
:move north (y1-y2-3) or move south (y1-y2-2)
Similarly all the pads are placed on the frame.
To avoid overlapping of the pads, the pins next to the corner pins are left
unconnected. The pin diagram of the Data Generator is shown in Fig.11.
Input protection and banding
To protect against damage caused by electrostatic discharge and to reduce the
possibility of latchup, special structures have been included in the I/O pads. The input
protection circuit [3] is shown in Fig.12. The input resistor is composed of ten squares
of polysilicon having a resistance of approximately 500 Ω. After the input resistor, the
signal line is connected in metal to a P+/N-diode and an N+/P- diode. The diode areas
are large to enhance their capability to handle current. The P+/N- diode is surrounded
by an N+ ring tied to VDD and the N+/P- diode is surrounded by a P+ ring tied to VSS.
To reduce the possibility of latch-up, a P+ active area guard ring surrounds
the N-channel transistor and an N+ active area guard ring surrounds the P-channel
transistors in the I/O pads.
[Fig.11 Pin diagram of Data Generator. Legend: O - output pin, I - input pin, NC - not connected, GND - pad ground, PWR - pad power, VDD - circuit power, VSS - circuit ground.]
buses from the cell power buses in order to reduce power and ground noise in the
functional cells. The more power supply pads there are, the lower the power and
ground noise in the pad area. Also, because this noise results from the switching of
output buffers, large strips of adjacent output pads are avoided as much as possible. The
chip power distribution for the Data Generator is shown in Fig.9.
The basic I/O pad layout structure is shown in Fig.10.
[Fig.12 Input Protection Circuitry. Labels shown: VDD, approximately 500 Ω polysilicon resistor, P+/N- diode, N+/P- diode, VSS, to active circuitry.]
Placement and Routing
The placement of the cells has been done in a bottom-up fashion. This is
true mainly because the last row of cells had very few connections at the bottom and the
connections at the top were mostly interconnections among cells in that row. So, leaving
a space of 80 lambda units from the bottom I/O pads and 250 lambda on the left, the
decoder is placed using the command
:getcell dec3.
Similarly, the other cells in that row are placed with a spacing of 10 lambda between cells.
Before placing the next row of cells, the interconnections among these cells are completed. In
addition to the wiring area already occupied, a spacing of 100 lambda is left for the connections to
the next row of cells. The number 200 lambda is obtained by estimating that 16 wires are
needed for sixteen outputs and another three for the inputs. So at the rate of 4 lambda for the
spacing between two wires, each wire will require a width of 8 lambda units. The spacing of wires
including both the thickness and the gap between them is often called the wiring pitch; i.e., so
many lambda per wire. The wiring pitch in poly or metal1 or metal2 is not always the same.
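As a rough illustration of the wiring-pitch idea, the sketch below multiplies the wire count by the pitch quoted above. It gives only the raw product for sixteen outputs plus three inputs at 8 lambda per wire; the channel allowance stated in the text is larger, and the sketch does not attempt to account for the difference. The function name is hypothetical.

```python
WIRE_PITCH = 8        # lambda per wire (width plus gap), as quoted in the text
OUTPUT_WIRES = 16     # one wire per output
INPUT_WIRES = 3       # one wire per input


def channel_height(n_wires, pitch):
    """Minimum channel height if every wire gets its own horizontal track."""
    return n_wires * pitch


minimum = channel_height(OUTPUT_WIRES + INPUT_WIRES, WIRE_PITCH)
print(minimum)
```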
The fourth row of cells from the top contains four AND gates which are placed with a
distance of 20 lambda between each pair of them and thirty-two tristate buffers which constitute
buffer 3 and buffer 4. Because all thirty-two buffers are the same, an array command is used
instead of calling the cells thirty-two times. The command :getcell tribl will bring a tristate
buffer onto the screen. A box is placed around tribl with its right edge extended by 5 units. Now
the command :array 16 1 will create an array of 16 tristate buffers with a spacing of 5 units
between each pair of them. The same procedure is followed to create buffer 4, but at a distance of
30 lambda from buffer 3 to distinguish easily between the two buffers. All tristate buffers of an
array are selected or deselected together.
To select an up or down counter, the select bits s1 and s2 have to be supplied with a 1 and a
0. To select the up-counter, s1s2 = 10. With s1 connected to VDD and also to the input of the inverter,
s2 is connected to 0 which is the output of the inverter. Similarly s2 of the down-counter is connected
to VDD and s1 to the output of the inverter. The connections from the 8-bit up-down counters are
made only after placing buffers 1 and 2 along with the gated pull-downs and the data latches.
Before proceeding any further with the connections, the VDD and VSS connections of the cells are
completed. First, the VDD and VSS between cells are connected. Then VSS is brought out on the
right hand side and connected to two power circuit pads at the corners.
To make the 16-bit to 64-bit mapping convenient, all the terminals of each counter are
labelled as F (feedthrough), D (data), or Q (output), etc. The feedthroughs were especially helpful
when a wire in one row had to be taken two rows above or below it.
Corresponding bits of all four buffers plus a gated pull-down plus a wire connecting to an
output pad driver are connected together; this six-fold connection is repeated sixteen times
constituting the on-chip output bus. To achieve this, the outputs of the gated pull-downs are
connected to the corresponding outputs of the tristate buffer 1. They are also connected to the
corresponding outputs of buffer2. In the same way, the outputs of buffers 3 and 4 are connected.
All these outputs are brought out on the right hand side of the chip and connected together to get the
final 16-bit output.
The completed designs are presented in Figs. 13 and 14. Fig. 13 shows the wiring with the
pads and the cells unexpanded. Fig. 14 shows the fully expanded version of the layout design of the
Data Generator.
[Fig.13 Layout of the Data Generator with unexpanded cells and pads]
[Fig.14 Layout of the Data Generator with expanded cells and pads]
Technical Details of the Data Checker, Chip1 Plus Chip2.
1. Logic Diagram of Chip1
2. Logic Diagram of Chip2
3. Pin Assignments of Chip1
4. Tentative Pin Assignments of Chip2
5. Tentative Interchip Wire List
6. A Discussion of the Partitioning and Logical Design of the Data Checker
7. A Discussion of Timing Considerations Pertinent to the Data Checker Chip Pair
[Fig.3 Block diagram of chip1. Blocks shown: two 16-bit parallel loaders receiving 12 bits each from the RDG and the LDG, XORs, 12-bit population counter whose count of error bits goes to chip2; clock, ENABLE and reset inputs.]
[Fig.5 Block diagram of chip2. Blocks shown: 4 XORs, 4 error bits from chip1, 8-bit counters, 32-bit parallel loader holding the 32-bit error count, 16 2x1 multiplexers giving a 16-bit error count (either 16 MSB or 16 LSB); Clock, Reset, READ and SEL inputs.]
Pin Assignments of Data Checker--Chip1
I/O Power ........................... Pins 5, 35
I/O Ground .......................... Pins 16, 26
Core Circuitry Power ................ Pins 15, 25
Core Circuitry Ground ............... Pin 34
Data I/P--A0 ........................ Pin 12
Data I/P--A1 ........................ Pin 13
Data I/P--A2 ........................ Pin 14
Data I/P--A3 ........................ Pin 17
Data I/P--A4 ........................ Pin 18
Data I/P--A5 ........................ Pin 19
Data I/P--A6 ........................ Pin 20
Data I/P--A7 ........................ Pin 21
Data I/P--A8 ........................ Pin 22
Data I/P--A9 ........................ Pin 23
Data I/P--A10 ....................... Pin 24
Data I/P--A11 ....................... Pin 27
Data I/P--B0 ........................ Pin 7
Data I/P--B1 ........................ Pin 6
Data I/P--B2 ........................ Pin 4
Data I/P--B3 ........................ Pin 3
Data I/P--B4 ........................ Pin 2
Data I/P--B5 ........................ Pin 1
Data I/P--B6 ........................ Pin 40
Data I/P--B7 ........................ Pin 39
Data I/P--B8 ........................ Pin 38
Data I/P--B9 ........................ Pin 37
Data I/P--B10 ....................... Pin 36
Data I/P--B11 ....................... Pin 33
Error Sum Out--SO ................... Pin 31
Error Sum Out--S1 ................... Pin 32
Error Sum Out--S2 ................... Pin 29
Error Sum Out--S3 ................... Pin 30
System Clock In ..................... Pin 8
Clock Out ........................... Pin 28
System Reset In ..................... Pin 10
Reset Out ........................... Pin 11
System Enable In .................... Pin 9
NOTE: The data inputs are labeled A0--A11 and B0--B11 only as
a matter of convenience; corresponding bits of two data words
should be connected to the same numbered A and B inputs, but
it is not necessary that the LSB's be connected to A0 and B0.
Tentative Pin Assignments of Data Checker--Chip2
I/O Power ........................... Pin 40
I/O Ground .......................... Pins 5, 26
Core Circuitry Power ................ Pins 15, 25
Core Circuitry Ground ............... Pin 35
Data I/P--A12 ....................... Pin 31
Data I/P--A13 ....................... Pin 32
Data I/P--A14 ....................... Pin 33
Data I/P--A15 ....................... Pin 34
Data I/P--B12 ....................... Pin 27
Data I/P--B13 ....................... Pin 28
Data I/P--B14 ....................... Pin 29
Data I/P--B15 ....................... Pin 30
Error Sum In--S0 .................... Pin 39
Error Sum In--S1 .................... Pin 38
Error Sum In--S2 .................... Pin 37
Error Sum In--S3 .................... Pin 36
Error Count Data Out--o0 (LSB) ...... Pin 24
Error Count Data Out--o1 ............ Pin 23
Error Count Data Out--o2 ............ Pin 22
Error Count Data Out--o3 ............ Pin 21
Error Count Data Out--o4 ............ Pin 20
Error Count Data Out--o5 ............ Pin 19
Error Count Data Out--o6 ............ Pin 18
Error Count Data Out--o7 ............ Pin 17
Error Count Data Out--o8 ............ Pin 16
Error Count Data Out--o9 ............ Pin 14
Error Count Data Out--o10 ........... Pin 13
Error Count Data Out--o11 ........... Pin 12
Error Count Data Out--o12 ........... Pin 11
Error Count Data Out--o13 ........... Pin 10
Error Count Data Out--o14 ........... Pin 9
Error Count Data Out--o15 (MSB) ..... Pin 8
Chipl Clock In ...................... Pin 7
Chipl Reset In ...................... Pin 6
System Clock In ..................... Pin 3
System Enable In .................... Pin 2
System Select (I/P) ................. Pin 1
System Read (I/P) ................... Pin 4
NOTE: The Read command is active low. Select, when low,
selects the two most significant bytes; when high, the two
least significant bytes.
The tentative interchip wire list for the Data Checker chip
pair is given below.
System        Chip1       Chip2       (Function)
Clock         Pin 8       Pin 3
Enable        Pin 9       Pin 2
Reset         Pin 10
Read                      Pin 4
Select                    Pin 1
              Pin 28      Pin 7       Clock Out/In
              Pin 11      Pin 6       Reset Out/In
              Pin 31      Pin 39      Error Sum--S0
              Pin 32      Pin 38      Error Sum--S1
              Pin 29      Pin 37      Error Sum--S2
              Pin 30      Pin 36      Error Sum--S3
NOTE: The language Error Sum refers to the errors detected,
if any, produced by any single comparison of the least
significant 12 bits of the data words done on Chip1; it does
not refer to the running total of errors counted as
maintained in the 32-bit output register.
CHAPTER II
LOGIC DESIGN OF DATA CHECKER
System specifications
The Data Checker works in conjunction with another chip called the Data
Generator, as shown in Fig. 2. The Data Generator was previously designed at the
University of Toledo [5]. It generates a 16-bit near-random data stream at the rate
of 13.2 MHz. This data stream is employed as a test signal in a simulated satellite
relay link. The data is returned by the simulated satellite after processing. The
Data Checker receives the 16-bit data from the satellite along with a duplicate of
the original 16-bit data which is provided by the Local Data Generator, compares
them, and keeps a running count of the total number of bits in error
accumulated during the evaluation interval. The accumulated error count is stored
in a 32-bit register. When interrogated, the register contents are multiplexed to a
16-bit bus as a double precision word. The Data Checker circuit also operates at
13.2 MHz.
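The readout of the 32-bit register over a 16-bit bus can be pictured with the following minimal sketch. It is a behavioral model only, not the circuit itself, and the function name is hypothetical. The Select polarity follows the note on the Chip2 pin assignments: low selects the two most significant bytes, high the two least significant bytes.

```python
def read_error_count(count32, select_high):
    """Return the 16-bit half of the 32-bit error count chosen by Select."""
    if select_high:
        return count32 & 0xFFFF          # two least significant bytes
    return (count32 >> 16) & 0xFFFF      # two most significant bytes


count = 0x12345678
low_half = read_error_count(count, True)
high_half = read_error_count(count, False)
print(hex(high_half), hex(low_half))
```

Reading both halves and recombining them as `(high_half << 16) | low_half` recovers the full double precision word.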
[Fig.2 Working of the Data Generator and Data Checker System. Shown: 16-bit data from the satellite and 16-bit reference data from the Local Data Generator entering the Data Checker, which produces a 32-bit error count.]
[Opening sentence illegible in the source; it introduces the Local Data Generator (LDG) and the Data
Generator at the remote site] (called the Remote Data Generator or RDG). The LDG generates the first
word and is then disabled until the corresponding word is present at the input of the checker
[illegible]. The arrival of a valid word from the RDG is indicated by an ENABLE signal which goes high
whenever a valid word is present. The ENABLE signal is provided by an independent controller. When the
ENABLE signal goes high, all the blocks are enabled [illegible] and the first word will be latched in.
At the same time the LDG is also enabled and it starts generating the second word. By the time the
second word from the RDG is at the input of the checker, the LDG has outputted the second word
and it is available to the checker for comparison. Thus, while the checker is
checking the data from the Nth word, the LDG is generating the (N+1)th word. When data from the
RDG ceases to be valid, the ENABLE signal automatically goes low, disabling
both the LDG and the checker. Again, when valid data are received, both circuits
resume functioning from the state in which they were halted.
Partitioning of the design
Because the Data Checker requires 32 input pins, 16 output pins, and some
additional pins for signals like RESET, ENABLE, Clock, Power, and Ground, a 64-pin
chip would be required. The cost of fabrication for a 64-pin prototype is
approximately $58,000 while the cost of a 40-pin chip is about $5800. Due to this
enormous difference in cost it was decided to use 40-pin chips. Since the design
could not be accommodated on a single 40-pin chip, it was partitioned into two
parts, and each part was implemented on a 40-pin chip. After a careful analysis
the Data Checker was partitioned as follows.
1) Input the first 12 (least significant) bits of the 16-bit words from the RDG
and the LDG to the first chip, find the number of bits in error and then output this
number as an input to the second chip.
2) Use as inputs the 4 most significant bits from the RDG and the LDG.
Determine the number of error bits occurring here. Add this to the number of error
bits from chip1 to get the total number of bits in error produced by a single 16-bit
word comparison. Additional circuitry is implemented in the second chip to
keep a running count of errors and to read out the 32-bit output register as a
sequence of two 16-bit words.
Both chips have identical control circuitry with the second chip having an extra
command signal for reading the 16-bit multiplexed output plus a select signal to
output first either the most significant 16 bits or the least significant 16 bits.
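The partition can be illustrated with a small behavioral sketch, which shows that counting errors separately in the low 12 bits and the high 4 bits and then adding gives the same result as comparing the full 16-bit words. The function names are illustrative and do not correspond to cells in the library.

```python
def popcount(x):
    """Number of one bits in a non-negative integer."""
    return bin(x).count("1")


def chip1_error_sum(rdg_word, ldg_word):
    """Errors in the 12 least significant bits (0 to 12)."""
    return popcount((rdg_word ^ ldg_word) & 0x0FFF)


def chip2_error_sum(rdg_word, ldg_word):
    """Errors in the 4 most significant bits (0 to 4)."""
    return popcount(((rdg_word ^ ldg_word) >> 12) & 0xF)


def word_errors(rdg_word, ldg_word):
    """Total errors for one 16-bit comparison, as summed on the second chip."""
    return chip1_error_sum(rdg_word, ldg_word) + chip2_error_sum(rdg_word, ldg_word)


print(word_errors(0xFFFF, 0x0000), word_errors(0xA5A5, 0xA5A5))
```

The first sum fits in 4 bits and the total never exceeds 16, so it fits in 5 bits, which matches the signal widths described in the text.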
The complete design of both chip1 and chip2 can be divided into four major
parts.
1) The logic to detect the number of bits in error resulting from a single 16-bit
word comparison.
2) The logic to keep a running count of the total number of errors that have
occurred since the last reset.
3) The output circuitry.
4) The control circuitry.
The first and fourth parts are common to chip1 and chip2 whereas parts 2 and 3
are implemented only within chip2.
Logic to Detect the Number of Bits in Error in a Single 16-bit
Word
Chip 1
The complete logic diagram of the circuit on Chip 1 is shown in Fig. 3.
The least significant 12 bits from the RDG and the LDG are compared.
Initially, the 12 bits from the LDG are at the input of the 16-bit parallel
loader waiting for the 12 bits from the RDG to arrive.
A 16-bit shift register is configured as a 16-bit parallel loader. The
configuration for an 8-bit shift register is shown in Fig. 4. The 16-bit shift
register is configured likewise. When the select signals S1 and S0 are both
low the inputs D0-D15 are parallel loaded on the falling edge of the clock.
When the select signals S1 and S0 both go high then a hold is placed on the
output; i.e., the output data remains the same irrespective of the inputs
and the clock. When the word from the RDG arrives, the ENABLE signal
goes high. Both parallel loaders are enabled and on the next falling edge of
the clock the words are loaded.
The comparison operation is done using twelve XORs. An XOR
produces a zero output if both inputs are the same and an output of one if
they are different. Thus, if the corresponding bits are not the same, i.e., in
error, then the output of the corresponding XOR is one.
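The bitwise comparison can be sketched as follows. This is a behavioral illustration with hypothetical names and example words, not a description of the XOR cells themselves; it returns one flag per bit position, exactly as the twelve XOR outputs would.

```python
def xor_flags(word_a, word_b, width=12):
    """One flag per bit position (LSB first): 1 where the two words differ."""
    return [((word_a >> i) ^ (word_b >> i)) & 1 for i in range(width)]


# Two example 12-bit words that differ in bit 1 and bit 8.
flags = xor_flags(0b101100001111, 0b101000001101)
print(flags, sum(flags))
```

Summing the flags gives the number of bits in error for the comparison.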
[Fig.4 Block diagrams and function tables of the cells used in the design. 8-bit parallel loader: operations RESET, LOAD D0-D7, SHIFT RIGHT, SHIFT LEFT and HOLD, selected by S1 and S0. 8-bit population counter: outputs S3-S0 give in binary the number of 1's on inputs D0-D7, from 0 to 8.]
By counting the number of ones at the outputs of the XORs, the total number of
bits in error can be obtained. The error count can be put in binary form by using a
population counter. The 12-bit population counter used here [reference illegible] is
similar to the 8-bit population counter in Fig. 4. Since there
are 12 bits a 12-bit population counter is used. A population counter is a rather
[illegible] digital circuit which produces a binary output equal to the number of
"true" or one-valued signals applied to its inputs. The counter employed here has
12 data inputs (D0-D11). The number appears at the outputs [illegible]. The
ones can appear in any order and still produce the correct result. However, an
indeterminate input will produce an unpredictable state at the output. This 4-bit
error count is provided as inputs to the 8-bit parallel loader on the second
chip.
Chip 2
The complete logic diagram of the circuit on Chip2 is given in Fig. 5. On chip2,
the most significant 4 bits from the RDG and the LDG are compared. The operation
is similar to the one on chip1. When the ENABLE signal goes high, the 4-bit words are
parallel loaded into 4-bit parallel loaders. These are derived from 4-bit shift
registers and work exactly the same as the 16-bit parallel loaders explained
earlier. The comparison is done using 4 XORs. If corresponding bits are not the
same, the output of the corresponding XOR will be one. All the ones at the
output of the XORs are counted by using an 8-bit population counter. Actually, a 4-bit
population counter could suffice. But as only an 8-bit population counter is
available in the standard cell library, it is used. The remaining four inputs are
permanently tied to logical zero (ground).
The 4-bit error count from chip1 and the 4-bit error count [illegible] are
provided as inputs to the 8-bit parallel loader. The loader is included to avoid
certain timing problems (discussed in Chapter 3). The outputs of the loader are provided
as inputs to the 4-bit adder which adds the 4-bit error count from chip1 (0-12) to
the error count from chip2 (0-4) to give the 5-bit error count for the
entire 16-bit word.
Logic to Keep a Running Count of the Total Number of Errors in an
interval
As mentioned previously, the total of errors [illegible] during a single
comparison of two 16-bit words is obtained using a 4-bit adder. A 4-bit adder
adds two numbers and produces a 5-bit output, a 4-bit sum and a carry
bit.
To maintain a running total of errors counted since the beginning of the
test interval, the error count from the most recent comparison is added to the
previous total. This is done by using an 8-bit adder. The output of the adder is
passed to an 8-bit register, the output of which is then fed back as one input to the
adder; the most recently determined error count is the other input. Because the
immediate error count cannot exceed sixteen, only a 5-bit word is needed; the
three most significant bits of this input are tied off to ground (logical zero).
An 8-bit adder is realized using two 4-bit adders as shown in Fig. 6. The least
significant 4 bits are added first using the 4-bit adder (upper block). Then the carry
from that addition is applied as a carry input to the second 4-bit adder (lower
block), which adds the 4 most significant bits. Thus we get an 8-bit sum and a
carry bit [illegible].
[Fig.6 Block diagram and function tables for adder circuits used in the design. Shown: 4-bit adder; 8-bit adder built from two 4-bit adder blocks.]
[Fig.7 Block diagram and function table of an 8-bit counter. Operations listed: RESET, PARALLEL LOAD OF D0-D7, INCREMENT (COUNT UP), DECREMENT (COUNT DOWN), HOLD (STOP COUNT).]
The outputs of the adder are parallel loaded into an 8-bit parallel loader. Again,
an 8-bit shift register is used as a parallel loader by
properly configuring the select signals S1 and S0. The outputs
from the loader are then fed back to the 8-bit adder as one set of inputs.
The outputs from the 8-bit parallel loader give the least significant 8 bits of
the running total error count. The higher bits of the error count are obtained as
follows.
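The two-block adder arrangement can be modeled with a short behavioral sketch. It assumes a plain ripple of the carry from the lower-order block into the higher-order block, with the first carry input held at zero; the function names are illustrative and are not library cell names.

```python
def add4(a, b, carry_in=0):
    """4-bit adder: returns (4-bit sum, carry out)."""
    total = (a & 0xF) + (b & 0xF) + (carry_in & 1)
    return total & 0xF, total >> 4


def add8(a, b):
    """8-bit adder built from two 4-bit adder blocks."""
    low_sum, carry = add4(a & 0xF, b & 0xF, 0)                   # least significant 4 bits first
    high_sum, carry_out = add4((a >> 4) & 0xF, (b >> 4) & 0xF, carry)
    return (high_sum << 4) | low_sum, carry_out


print(add8(200, 100), add8(15, 1))
```

A carry out of 1 means the true sum exceeded 255, which is the event used to advance the next counter stage.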
When the sum exceeds 8 bits a carry is generated by the 8-bit adder. The carry
is fed to the data input of a D flip-flop. The Qbar output of the flip-flop is
connected as a clock to 8-bit counter1 as in Fig. 7. The counter increments
whenever it receives a falling edge at its clock input. Initially, when the D
flip-flop is reset, Qbar is at logic one, so the clock input to the counter is
at logic high. When a carry is generated by the 8-bit adder, it is latched by
the D flip-flop on the next clock cycle, so the Q output goes high while the
Qbar output goes low. Thus, the counter receives a falling edge and its count is
incremented. The counter output thus gives the 8 next most significant bits of
the error count. Every time the output of the 8-bit adder exceeds 255, a carry
is generated which increments the count of 8-bit counter1. Thus the outputs from
8-bit counter1 plus those from 8-bit loader1 give the 16 least significant bits
of the total error count. The 16 most significant bits are obtained as follows.
Fig. 8a
2x1 multiplexer with inputs D1 and D2, select input SEL, and output OUT1.

FUNCTION TABLE
SEL    OUT1
 0      D2
 1      D1

Fig. 8b
Illustration of how the output is multiplexed. Multiplexer 1 takes BIT 1 on D1
and BIT 17 on D2 and drives output 1; the pattern continues through multiplexer
16, which takes BIT 16 on D1 and BIT 32 on D2 and drives output 16.
When SEL is HIGH the 16 least significant bits are output.
When SEL is LOW the 16 most significant bits are output.
The Q7 output of counter1 is applied as a clock to 8-bit counter2. During the
total count sequence 0-255 the Q7 bit goes from low to high and from high to low
only once. Thus, whenever counter1 resets after the terminal count, Q7 goes from
high to low, which constitutes a falling edge for counter2, and it increments.
The 8-bit output of this counter gives the 8 next most significant bits. The 8
most significant bits are similarly obtained by applying Q7 from counter2 as a
clock to counter3.
The 32-bit output obtained in parts from the 8-bit adder and the three 8-bit
counters is parallel loaded into a 32-bit parallel loader. The 32-bit parallel
loader is achieved by using two 16-bit parallel loaders. Its function is similar
to the parallel loaders used at the input.
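The carry and falling-edge chain described above can be sketched as a small
behavioural model. This is only an illustration under stated assumptions: the
class and method names are hypothetical, the one-clock latency of the D
flip-flop is ignored, and the per-word error count is assumed small enough that
the adder never produces a carry on two consecutive clocks.

```python
class FallingEdgeCounter8:
    """8-bit counter that increments on a 1 -> 0 transition of its clock."""

    def __init__(self, idle_clock):
        self.value = 0
        self._last_clock = idle_clock

    @property
    def q7(self):
        # Most significant bit of the 8-bit count.
        return (self.value >> 7) & 1

    def drive(self, clock):
        # Count only when the clock input falls from 1 to 0.
        if self._last_clock == 1 and clock == 0:
            self.value = (self.value + 1) & 0xFF
        self._last_clock = clock


class ErrorAccumulator:
    """Hypothetical model: 8-bit running sum plus three cascaded counters."""

    def __init__(self):
        self.low8 = 0                                       # 8-bit loader contents
        self.counter1 = FallingEdgeCounter8(idle_clock=1)   # clocked by Qbar
        self.counter2 = FallingEdgeCounter8(idle_clock=0)   # clocked by counter1 Q7
        self.counter3 = FallingEdgeCounter8(idle_clock=0)   # clocked by counter2 Q7

    def clock(self, errors_in_word):
        total = self.low8 + errors_in_word
        carry = total >> 8              # 1 when the 8-bit sum exceeds 255
        self.low8 = total & 0xFF
        self.counter1.drive(1 - carry)  # Qbar goes low when a carry is latched
        self.counter2.drive(self.counter1.q7)
        self.counter3.drive(self.counter2.q7)

    @property
    def count32(self):
        return (self.counter3.value << 24 | self.counter2.value << 16
                | self.counter1.value << 8 | self.low8)
```

Feeding the model a stream of per-word error counts and comparing `count32`
with an ordinary integer sum is a quick way to confirm that the Q7 falling edge
occurs exactly when a counter wraps from 255 to 0.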
Output Circuitry
The 32-bit output is to be output in two groups of 16 bits each. This can be
done using 16 two-to-one multiplexers. A 2x1 multiplexer is shown in Fig. 8a.
There are two inputs and either of them can be selected as the output using the
select signal SEL. If SEL=0, D2 appears at output 1. If SEL=1, D1 appears at
output 1. As we want 32 outputs to be multiplexed into two groups of 16 bits
each, we need sixteen 2x1 multiplexers. Also, we want either the 16 least
significant bits to be output first and then the 16 most significant bits, or
vice versa. So the inputs to the multiplexers are given as shown in Fig. 8b.
When SEL=0, the 16 most significant bits are selected. When SEL=1, the 16 least
significant bits are selected.
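The selection rule above can be written out bit by bit. The sketch below is
illustrative only: the function names are hypothetical, and bit i of the Python
integer stands for BIT i+1 of the 32-bit count.

```python
def mux2(sel, d1, d2):
    """2x1 multiplexer: D1 when SEL=1, D2 when SEL=0."""
    return d1 if sel else d2


def output_word(count32, sel):
    """Route one 16-bit half of the 32-bit count through sixteen 2x1 muxes."""
    word = 0
    for i in range(16):
        d1 = (count32 >> i) & 1          # low half drives the D1 inputs
        d2 = (count32 >> (i + 16)) & 1   # high half drives the D2 inputs
        word |= mux2(sel, d1, d2) << i
    return word
```

For example, with a count of 0x12345678, SEL=1 yields 0x5678 and SEL=0 yields
0x1234.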
Control Circuitry
The control circuitry consists of two parts:
1) Circuitry to Disable the Data Checker and Halt its Operation
The clock signal is ANDed with the ENABLE signal and the output of the AND
gate is applied as a clock to the rest of the Data Checker circuitry on chip1
and chip2. As long as a valid word is at the input of the checker, the ENABLE
signal is high and the clock is applied to the checker circuitry. When invalid
data arrives, the ENABLE goes low, preventing the clock signal from being
applied to the circuitry. However, simply ANDing the clock signal with the
ENABLE signal results in an unwanted falling edge if the clock is high when the
ENABLE goes low. This falling edge will be applied to the circuit when actually
we want to discard it. This would result in a wrong error count. To avoid this,
the output of the AND gate is applied to the checker circuitry through a delay
circuit. Also, the ENABLE signal is inverted and applied to the 16-bit parallel
loaders at the input, to 8-bit parallel loader1, and to the 8-bit counters
(counter1, counter2, counter3), and it is also applied at the input of the XOR
gate which is used to control the read operation (explained later).
So, when the ENABLE goes low, a high at the output of the inverter is applied to
the select inputs S1 and S0 of the 16-bit and the 8-bit parallel loaders. Also,
a high is applied to the S1 input of the counters, making their select inputs S1
and S0 high. Thus, the 16- and the 8-bit loaders along with the counters are
forced into a halted state. So even if the clock signal is applied their outputs
will not change. So whenever the ENABLE goes low the loaders and the counters
are halted. As the output of the AND gate passes through the delay circuit, the
loaders and counters are already in a halted state by the time the falling edge
is received. Thus there is no effect from the unwanted falling edge. Again, when
a valid word arrives at the input, the ENABLE goes high. So the loaders and
counters again resume their normal operation [illegible] the clock is applied.
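The spurious edge and its remedy can be illustrated with sampled waveforms. The
sketch below is not taken from the report: the waveforms, the two-sample clock
delay, and the one-sample latency of the halt path (inverter plus select inputs)
are all assumptions chosen to make the effect visible.

```python
def falling_edges(wave):
    """Sample indices at which the waveform goes from 1 to 0."""
    return [t for t in range(1, len(wave)) if wave[t - 1] == 1 and wave[t] == 0]


def delayed(wave, n):
    """Shift a waveform right by n samples, holding its first value."""
    return [wave[0]] * n + wave[:len(wave) - n]


def counted_edges(clock, enable, clock_delay, halt_latency=1):
    """Falling edges that reach the cells while they are not yet halted."""
    gated = [c & e for c, e in zip(clock, enable)]            # clock AND ENABLE
    seen = delayed(gated, clock_delay)                        # after the delay circuit
    halt = delayed([1 - e for e in enable], halt_latency)     # inverted ENABLE
    return [t for t in falling_edges(seen) if halt[t] == 0]


clock = [1, 1, 0, 0] * 3          # assumed clock, four samples per period
enable = [1] * 5 + [0] * 7        # ENABLE drops while the clock is high
without_delay = counted_edges(clock, enable, clock_delay=0)
with_delay = counted_edges(clock, enable, clock_delay=2)
```

With no delay, two edges are counted, the second being the unwanted one created
when ENABLE falls. With the delay, that edge arrives after the cells are
halted, so only the genuine edge is counted.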
2) Circuitry to Read the Output
The read operation may [illegible] take more than one clock cycle. So there
should be a stable valid output until the read operation is complete. At the
same time, the running count of errors should not be lost. This is achieved as
follows. The output is obtained from a 32-bit parallel loader (realized using
two 16-bit parallel loaders). The select signals S1 and S0 are obtained from the
output of an XOR gate whose inputs are the inverted ENABLE signal and the READ
signal.
Under normal operation the READ signal is high and the ENABLE signal is also
high. So the output of the XOR gate is low. Thus both S1 and S0 are set low and
the block functions as a parallel loader. Whenever output is to be read, the
READ signal is made low while ENABLE remains high; this forces the output of the
XOR gate, along with S1 and S0 of the 32-bit parallel loader, to go high. The
loader is thus stopped and maintains its data constant even though the clock is
still being applied. Now, using the SEL signal, either the 16 least significant
bits or the 16 most significant bits are selected and dumped to the output bus.
After the first selection, and a high or a low word has been read, SEL is
complemented and the second read is done. The read operation is then complete.
READ returns to high [illegible] and the 32-bit loader resumes its normal
operation. While the 32-bit loader is halted for a read, the 8-bit parallel
loader plus 8-bit counters 1, 2, and 3 continue to keep the running count
correctly without [illegible]. The count [illegible] normally as soon as the big
loader is again able to receive data.
Thus the logic design is complete. The next chapter describes the propagation
delays and timing diagrams for the circuitry.
CHAPTER III
DELAYS AND TIMING INFORMATION
Propagation delay plays a vital role in any ASIC design. Under ideal conditions,
without propagation delay through the gates, the logic of the design may be
functionally correct, but under real world conditions it may be far from it.
Each signal is delayed by a certain amount of time as it passes through a logic
level. Each signal has a finite rise and fall time associated with it. These can
cause [illegible] errors and may impose a limit on the speed of operation. So
accurate [illegible] of timing parameters on chip is [illegible] under practical
conditions.
Conceptually, an ASIC consists of a number of functional blocks (parts or
gates), each of which takes one or more inputs and produces one or more
outputs. Each output drives the input of one or more other blocks or, on
occasion, an external output. The sum of the capacitances of the driven
[illegible] outputs is called the absolute fanout of the driving output. The
propagation delay through [illegible] the functional blocks comprising an ASIC
is related to [illegible] absolute fanout.
The propagation delay depends on circuit design, transistor sizing, number of
levels of logic, temperature, supply voltages, variations in process dependent
parameters, and capacitive loading of the output of the gate. High temperature
leads to reduced carrier mobilities, higher resistance, and longer delays. Low
supply voltages and slow rise/fall times of driving inputs also lead to longer
propagation delays.
The CMOSN cell notebook provides equations that predict the delay from input
to output and the rise and fall times of the output signal. Since the design of
the Data Checker is done using standard cells, the cell equations are used for
calculating the delays. The following timing and cell equation information is
taken from the CMOSN cell notebook.
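Each cell equation in the tables has the linear form intercept + (slope)CL. As
a minimal sketch of how such an equation is evaluated, the code below uses the
Pd(0-1) and Pd(1-0) coefficients printed in Table 1 for the 16-bit parallel
loader at 125 degrees Celsius. The load value CL = 0.35 is an assumption: it is
not stated in this section and was back-solved from the tabulated Pd(0-1) of
12.565 ns. The units of CL are likewise not given here. The function and
variable names are hypothetical.

```python
def linear_delay_ns(intercept_ns, slope, load_cl):
    """Evaluate a cell equation of the form intercept + (slope)CL."""
    return intercept_ns + slope * load_cl


# Coefficients as printed in Table 1 (16-bit parallel loader, 125 deg C).
LOADER16_125C = {
    "Pd(0-1)": (11.9, 1.9),
    "Pd(1-0)": (14.9, 1.47),
}

ASSUMED_CL = 0.35  # assumption: back-solved from the tabulated 12.565 ns

delays = {
    name: linear_delay_ns(intercept, slope, ASSUMED_CL)
    for name, (intercept, slope) in LOADER16_125C.items()
}
```

With this load the two equations give about 12.565 ns and 15.4145 ns.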
Calculation of Delays for Chipl and Chip2
By using the delay equations, the propagation delays for all the various logic
blocks are calculated for the worst case process parameters at 125 degrees
Celsius and typical case process parameters at 25 degrees Celsius. These delays
for chip1 are listed in Table 1 and Table 2 and for chip2 in Table 3, Table 4,
Table 5, and Table 6. The delays through the individual blocks are then added to
get the total delays from inputs to outputs.
a) For chip1-
The delay from the input to the output of chip1 is given by the sum of the
delays through the input pad, 16-bit parallel loader, XOR gate, 12-bit
population counter, and output pad. It comes to 56.9601 ns at 125 degrees
Celsius (worst case) and 17.0488 ns at 25 degrees Celsius (typical case).
b) For chip2-
The propagation delays in chip2 are calculated in four stages.
1) Delay from the input to 8-bit parallel loader 1.
This delay is given by the sum of the delays through the input pad, the 4-bit
TABLE 1
Timing Information for Standard Cells Used in Data Checker Chip1 @ 125 Degrees Celsius

CELL NAME                  EQUATIONS USED (@ 125 DEG. C)     Pd(0-1)      Pd(1-0)      Tris         Tfal
                                                             ns           ns           ns           ns
16-bit parallel loader     Pd(0-1) = 11.9 + (1.9)CL          12.565       15.4145      [illegible]  [illegible]
                           Pd(1-0) = 14.9 + (1.47)CL
                           Tris = 3.42 + ([illegible])CL
                           Tfal = 4.18 + (2.13)CL
XOR                        Pd(0-1) = 2.10 + (3.58)CL         4.1048       2.6232       [illegible]  4.962
                           Pd(1-0) = 1.38 + (2.22)CL
                           Tris = 3.35 + (8.53)CL
                           Tfal = 2.33 + (4.70)CL
12-bit population counter  Pd(0-1) = 32.79 + (1.85)CL        35.01        33.318       7.292        [illegible]
                           Pd(1-0) = 31.23 + (1.74)CL
                           Tris = 3.02 + (3.56)CL
                           Tfal = 3.27 + (12.19)CL
Output pad                 Pd(0-1) = 2.29 + (0.08)CL         6.45         5.16         10.92        [illegible]
                           Pd(1-0) = 2.04 + (0.06)CL
                           Tris = 1.04 + (0.19)CL
                           Tfal = 0.08 + (0.13)CL

* = The delay equation for Pd(1-0) & Tfal is the one between RESET and Q0 but not the one between CLOCK and Q0.
TABLE 2
Timing Information for Standard Cells Used in Data Checker Chip1 @ 25 Degrees Celsius

CELL NAME                  EQUATIONS USED (@ 25 DEG. C)      Pd(0-1)      Pd(1-0)      Tris         Tfal
                                                             ns           ns           ns           ns
16-bit parallel loader     Pd(0-1) = 3.68 + (0.62)CL         3.897        4.978        1.39         1.605
                           Pd(1-0) = 4.78 + (0.56)CL
                           Tris = 0.90 + (1.4)CL
                           Tfal = 1.29 + (0.90)CL
XOR                        Pd(0-1) = 0.70 + (1.06)CL         1.2936       0.9724       2.538        1.9216
                           Pd(1-0) = 0.53 + (0.79)CL
                           Tris = 1.11 + (2.55)CL
                           Tfal = 1.02 + (1.61)CL
12-bit population counter  Pd(0-1) = 9.02 + (0.60)CL         9.74         9.22         [illegible]  [illegible]
                           Pd(1-0) = 8.50 + (0.60)CL
                           Tris = 0.87 + (1.25)CL
                           Tfal = 0.94 + (0.91)CL
Output pad                 Pd(0-1) = 1.01 + (0.02)CL         2.05         1.74         3.55         3.04
                           Pd(1-0) = 0.70 + (0.02)CL
                           Tris = 0.43 + (0.06)CL
                           Tfal = 0.45 + (0.05)CL

* = The delay equation for Pd(1-0) & Tfal is the one between RESET and Q0 but not the one between CLOCK and Q0.
TABLE 3
Timing Information for Standard Cells Used in Data Checker Chip2 @ 125 Degrees Celsius

CELL NAME                  EQUATIONS USED (@ 125 DEG. C)     Pd(0-1)      Pd(1-0)      Tris         Tfal
                                                             ns           ns           ns           ns
4-bit parallel loader      Pd(0-1) = 9.36 + (1.90)CL         10.025       10.5565      4.799        4.1[illegible]
                           Pd(1-0) = 10.59 + (1.59)CL
                           Tris = 3.41 + (3.97)CL
                           Tfal = 3.32 + (2.45)CL
XOR                        Pd(0-1) = 2.10 + (3.58)CL         4.069        2.601        8.041        4.915
                           Pd(1-0) = 1.38 + (2.22)CL
                           Tris = 3.35 + (8.53)CL
                           Tfal = 2.33 + (4.70)CL
8-bit population counter   Pd(0-1) = 14.1 + (1.83)CL         14.46        19.086       [illegible]  1.488
                           Pd(1-0) = 18.83 + (1.28)CL
                           Tris = [illegible] + (4.20)CL
                           Tfal = 0.98 + (2.54)CL
8-bit parallel loader      Pd(0-1) = 10.22 + (1.9)CL         11.075       12.057       [illegible]  4.971
                           Pd(1-0) = 11.40 + (1.46)CL
                           Tris = 0.43 + (0.06)CL
                           Tfal = 0.45 + (0.05)CL
4-bit adder                Pd(0-1) = 7.29 + (3.26)CL         8.757        10.6215      7.91         6.742
                           Pd(1-0) = 9.60 + (2.27)CL
                           Tris = 4.13 + (8.41)CL
                           Tfal = 3.97 + (6.16)CL

* = The delay equation for Pd(1-0) & Tfal is the one between RESET and Q0 but not the one between CLOCK and Q0.
TABLE 4
Timing Information for Standard Cells Used in Data Checker Chip2 @ 125 Degrees Celsius

CELL NAME                  EQUATIONS USED (@ 125 DEG. C)     Pd(0-1)      Pd(1-0)      Tris         Tfal
                                                             ns           ns           ns           ns
8-bit adder                Pd(0-1) = 7.29 + (3.26)CL         15.885       20.0172      [illegible]  10.404
                           Pd(1-0) = 9.60 + (2.27)CL
                           Tris = 4.13 + (8.41)CL
                           Tfal = 3.97 + (6.16)CL
8-bit parallel loader2     Pd(0-1) = 10.22 + (1.9)CL         11.455       12.349       [illegible]  5.407
                           Pd(1-0) = 11.40 + (1.46)CL
                           Tris = [illegible] + (4.0)CL
                           Tfal = 3.99 + ([illegible])CL
8-bit counter              Pd(0-1) = 11.08 + (1.88)CL        11.7994      13.137       5.982        [illegible]
                           Pd(1-0) = 12.50 + ([illegible])CL
                           Tris = 4.50 + (3.90)CL
                           Tfal = 4.10 + ([illegible])CL
32-bit parallel loader     Pd(0-1) = 11.9 + (1.9)CL          12.299       15.208       4.251        4.627
                           Pd(1-0) = 14.9 + (1.47)CL
                           Tris = [illegible]
                           Tfal = 4.18 + (2.13)CL
2x1 multiplexer            Pd(0-1) = 4.63 + (3.52)CL         8.854        5.672        18.45        11.586
                           Pd(1-0) = 3.50 + (1.81)CL
                           Tris = 8.36 + (8.41)CL
                           Tfal = 6.27 + (4.43)CL

* = The delay equation for Pd(1-0) & Tfal is the one between RESET and Q0 but not the one between CLOCK and Q0.
TABLE 5
Timing Information for Standard Cells Used in Data Checker Chip2 @ 25 Degrees Celsius

CELL NAME                  EQUATIONS USED (@ 25 DEG. C)      Pd(0-1)      Pd(1-0)      Tris         Tfal
                                                             ns           ns           ns           ns
4-bit parallel loader      Pd(0-1) = 2.79 + (0.63)CL         3.0105       3.2665       1.3835       1.3105
                           Pd(1-0) = 3.06 + (0.59)CL
                           Tris = 0.89 + (1.41)CL
                           Tfal = 0.95 + (1.03)CL
XOR                        Pd(0-1) = 0.70 + (1.06)CL         1.283        0.9645       2.512        1.905
                           Pd(1-0) = 0.53 + (0.79)CL
                           Tris = 1.11 + (2.55)CL
                           Tfal = 1.02 + (1.61)CL
8-bit population counter   Pd(0-1) = 4.0 + (0.59)CL          4.118        5.16         0.604        0.478
                           Pd(1-0) = 5.06 + (0.50)CL
                           Tris = 0.30 + (1.52)CL
                           Tfal = 0.25 + (1.14)CL
8-bit parallel loader      Pd(0-1) = 3.08 + (0.62)CL         3.359        3.8365       1.53         1.613
                           Pd(1-0) = 3.58 + (0.57)CL
                           Tris = 0.90 + (1.4)CL
                           Tfal = 1.19 + (0.94)CL
4-bit adder                Pd(0-1) = 2.10 + (1.01)CL         2.5545       3.1965       2.25         2.042
                           Pd(1-0) = 2.76 + (0.97)CL
                           Tris = 1.03 + (2.72)CL
                           Tfal = 1.16 + (1.96)CL

* = The delay equation for Pd(1-0) & Tfal is the one between RESET and Q0 but not the one between CLOCK and Q0.
TABLE 6
Timing Information for Standard Cells Used in Data Checker Chip2 @ 25 Degrees Celsius

CELL NAME                  EQUATIONS USED (@ 25 DEG. C)      Pd(0-1)      Pd(1-0)      Tris         Tfal
                                                             ns           ns           ns           ns
8-bit adder                Pd(0-1) = 2.10 + (1.01)CL         4.604        5.908        [illegible]  3.104
                           Pd(1-0) = 2.76 + (0.97)CL
                           Tris = 1.055 + (2.72)CL
                           Tfal = 1.16 + (1.96)CL
8-bit parallel loader2     Pd(0-1) = 3.08 + (0.62)CL         3.493        3.9505       1.81         1.801
                           Pd(1-0) = 3.58 + (0.57)CL
                           Tris = 0.90 + (1.40)CL
                           Tfal = 1.19 + (0.94)CL
8-bit counter              Pd(0-1) = 3.32 + (0.63)CL         3.5594       4.1828       1.716        1.537
                           Pd(1-0) = 3.97 + (0.56)CL
                           Tris = 1.20 + (1.36)CL
                           Tfal = 1.18 + (0.94)CL
32-bit parallel loader     Pd(0-1) = 3.68 + (0.62)CL         3.8102       4.8976       1.194        1.479
                           Pd(1-0) = 4.78 + (0.56)CL
                           Tris = 0.90 + (1.4)CL
                           Tfal = 1.29 + (0.90)CL
2x1 multiplexer            Pd(0-1) = 1.33 + (1.10)CL         2.65         [illegible]  5.438        3.806
                           Pd(1-0) = 4.78 + (0.56)CL
                           Tris = 2.27 + (2.64)CL
                           Tfal = 1.85 + (1.63)CL

* = The delay equation for Pd(1-0) & Tfal is the one between RESET and Q0 but not the one between CLOCK and Q0.
parallel loader, the XOR gate, and the 8-bit population counter. It comes
to 33.7115 ns at 125 degrees Celsius (worst case) and 9.7095 ns at 25
degrees Celsius (typical case).
2) Delay from parallel loader 1 to parallel loader 2.
It is given by the sum of the delays through loader1, the 4-bit
adder, and the 8-bit adder. It comes to 42.6957 ns at 125 degrees Celsius
(worst case) and 12.9022 ns at 25 degrees Celsius (typical case).
3) Delay from loader 2 to the 32-bit parallel loader.
This is given by the sum of the delays through loader2, 8-bit
counter1, counter2, and counter3. Its value is 51.766 ns at 125 degrees
Celsius (worst case) and 18.7023 ns at 25 degrees Celsius (typical case).
4) Delay from the 32-bit parallel loader to the output.
It is the sum of the delays through the 32-bit loader and the 2x1
multiplexer. It comes to 24.0627 ns at 125 degrees Celsius (worst case)
and 7.5476 ns at 25 degrees Celsius (typical case).
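Two of the stage totals above can be reproduced directly from values read out of
Tables 5 and 6 (25 degrees Celsius). The sketch below is a cross-check, not the
author's procedure: the choice of which transition (Pd(0-1) or Pd(1-0)) to take
for each cell is an inference, made because these entries add up to the quoted
figures, and no separate input pad term is included for stage 1. All names are
hypothetical.

```python
# Per-cell delays in ns, as read from Tables 5 and 6 (25 deg C).
TABLE_25C_NS = {
    "4-bit parallel loader Pd(1-0)": 3.2665,
    "XOR Pd(0-1)": 1.283,
    "8-bit population counter Pd(1-0)": 5.16,
    "32-bit parallel loader Pd(1-0)": 4.8976,
    "2x1 multiplexer Pd(0-1)": 2.65,
}

# Assumed composition of each stage (inferred, see lead-in).
STAGES = {
    "stage 1": [
        "4-bit parallel loader Pd(1-0)",
        "XOR Pd(0-1)",
        "8-bit population counter Pd(1-0)",
    ],
    "stage 4": [
        "32-bit parallel loader Pd(1-0)",
        "2x1 multiplexer Pd(0-1)",
    ],
}

# Totals quoted in the text for 25 deg C.
QUOTED_NS = {"stage 1": 9.7095, "stage 4": 7.5476}


def stage_delay_ns(stage):
    """Sum the per-cell delays along one stage, rounded to 4 decimals."""
    return round(sum(TABLE_25C_NS[cell] for cell in STAGES[stage]), 4)


matches = {stage: stage_delay_ns(stage) == QUOTED_NS[stage] for stage in STAGES}
```

Both stages match the quoted totals under these assumptions.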
The clock period is 75.75 ns (frequency of 13.2 MHz). From the
delay calculations above it is clear that the delay between successive
stages is less than 75.75 ns. Since the stages are separated by parallel
loaders which act as latches, inputs to the loaders are stable and valid at
the time the clock is applied to them. Also, data is present at the input to
these loaders for the minimum data setup time before the clock is
applied and for the minimum data hold time after the clock is applied.
So, whenever the clock is applied to the loaders, the data at their inputs
is stable and valid and this data is passed on to the next stage.
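The comparison against the clock period can be made explicit as a slack
calculation. The sketch below uses only the worst case (125 degrees Celsius)
stage totals and the 75.75 ns period quoted in the text. Setup and hold times
are not quantified in this section, so they are left out, which is a
simplification. The names are hypothetical.

```python
CLOCK_PERIOD_NS = 75.75  # quoted clock period (13.2 MHz)

# Worst case (125 deg C) stage delays for chip2, as quoted in the text.
WORST_CASE_NS = {
    "chip2 stage 1": 33.7115,
    "chip2 stage 2": 42.6957,
    "chip2 stage 3": 51.766,
    "chip2 stage 4": 24.0627,
}


def slack_ns(delay_ns, period_ns=CLOCK_PERIOD_NS):
    """Time left in the clock period after the stage delay."""
    return period_ns - delay_ns


slacks = {stage: slack_ns(delay) for stage, delay in WORST_CASE_NS.items()}
critical_stage = min(slacks, key=slacks.get)
```

Every stage has positive slack. The tightest is stage 3, with roughly 24 ns to
spare before setup time is taken into account.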
