CMOS array design automation techniques by Lombardi, T. et al.
Produced by the NASA Center for Aerospace Information (CASI) 
https://ntrs.nasa.gov/search.jsp?R=19750022360 2020-03-22T21:06:00+00:00Z
Final Report
CMOS Array Design Automation Techniques
by
P. Ramondetta, A. Feller, R. Noto and T. Lombardi
Prepared under Contract NAS12-2233 Mods 6 and 11 for:
GEORGE C. MARSHALL SPACE FLIGHT CENTER
NATIONAL AERONAUTICS AND SPACE ADMINISTRATION
MARSHALL SPACE FLIGHT CENTER
ALABAMA 35812
May 1975
(NASA-CR-120(83)  CMOS ARRAY DESIGN AUTOMATION TECHNIQUES
(Radio Corp. of America)  68 p HC $4.29  CSCL 09C
N75-30433  Unclas  G3/33  32410
Prepared by:
ADVANCED TECHNOLOGY LABORATORIES
GOVERNMENT AND COMMERCIAL SYSTEMS
RCA
CAMDEN, NEW JERSEY 08102
Final Report
CMOS ARRAY DESIGN AUTOMATION TECHNIQUES
by
P. Ramondetta, A. Feller, R. Noto and T. Lombardi
Contract NAS12-2233 Mods 6 and 11
May 1975
Prepared for
GEORGE C. MARSHALL SPACE FLIGHT CENTER
NATIONAL AERONAUTICS AND SPACE ADMINISTRATION
MARSHALL SPACE FLIGHT CENTER
ALABAMA 35812
Prepared by
ADVANCED TECHNOLOGY LABORATORIES
GOVERNMENT AND COMMERCIAL SYSTEMS
RCA
CAMDEN, NEW JERSEY 08102
FOREWORD 
This report covers the work accomplished on NASA Contract NAS12-2233,
Modifications 6 and 11, under the direction of John Gould, Technical Program Director, to develop
and/or enhance a design automation capability for generating low cost, quick turnaround custom
LSI arrays using the standard cell approach with the silicon-gate bulk CMOS and SOS technologies.
Section 1 briefly details the work leading up to that described in this report.
Sections 2 and 3 describe the work relating to the CMOS/SOS technologies, which were
the principal focus of this program. The silicon-gate, beam-lead bulk CMOS portion
of this program is described in the Appendix.
ACKNOWLEDGEMENT
The authors wish to acknowledge the following other members of RCA Advanced
Technology Laboratories in Camden, N.J., who teamed with them to conceive and
successfully implement this program: Mr. A. Smith -- co-author of the basic array
topology and principal cell designer -- who directed the design of all LSI arrays;
Mr. R. Pryor, who performed cell design and analysis; Mr. R. Lisowski, who assisted
in the testing and evaluation of the arrays; and especially Mr. F. Bertino, who, together
with Mr. J. DeLuca, was responsible for translating, through the use of the standard
cell design automation computer program, the input logic design into a form for
plotting the final mask artwork, in addition to implementing all the tasks involved in
developing a working cell library tape from the original design.
The authors also gratefully acknowledge the contributions of the following members
of RCA Solid State Technology Center in Somerville, N.J.: Messrs. H. Borkan,
D. Woo and S. Policastro, who developed the technology and processed the CMOS/SOS
LSI arrays; Messrs. S. Cohen and G. Caswell, who provided diagnostic and general
assistance; Messrs. T. Mayhew, A. Woodhull, and K. Long, who tested and packaged the
LSI arrays; and R. Geshner and B. Greening, who provided the working masks.
The authors also would like to acknowledge the contribution of G. Lines and his
activity who were very responsive, cooperative and efficient in providing 80X precision
artwork.
TABLE OF CONTENTS
Section
1	 INTRODUCTION
2	 WORK REQUIREMENTS & SUMMARY OF ACCOMPLISHMENTS
3	 TECHNICAL DESCRIPTION
A. CMOS/SOS PR2D Automatic Placement and Routing Program
B. Silicon-Gate CMOS/SOS Technology
C. Analysis and Basic Cell Design
D. CMOS/SOS Standard Cell Design Layout
E. CMOS/SOS Standard Cell Array Topology
F. CMOS/SOS Standard Cell Library
G. CMOS/SOS LSI Chip Measurement and Analysis
H. Conclusions and Results
APPENDIX
LIST OF ILLUSTRATIONS
Figure
1	 Silicon-gate CMOS on sapphire process, double epitaxy
2	 Evaluation circuit for CMOS/SOS standard cell
3	 Evaluation circuit for crosstalk due to metal/polysilicon overlap
4	 Pair delay vs transistor channel width
5	 Speed vs device width
6	 Pair delay vs transistor channel width. (Fanout equivalent to 3.6 two-input NORs.)
7	 Two-input NOR layout
8	 Two-input NOR test chain and simulated circuit
9	 Validation of simulation techniques
10	 Two-input AND cell composite
11	 Layout rules 1-7
12	 Layout rules 8-11
13	 Composite and levels of two-input AND
14	 Typical metalization level for SOS standard cell array
15	 Structure of gate protection device
16	 CMOS/SOS standard cell test chip
17	 Typical drain characteristics
18	 Two-input NOR stage delay test circuits
19	 Measured two-input NOR chain delay
20	 Measured two-input NOR stage delay
21	 8-Bit counter delay measurements
22	 Measured delay for floating point multiplexer
23	 Calculated delays for up/down counter
24	 Critical path of 8-bit multiplexed input adder
25	 Typical input-output waveforms of 8-bit adder
26	 Calculated delays for 9-bit 4 x 2 multiplexer
27	 SUMC-CVT CMOS/SOS adder control chip measurement path
28	 Typical input-output waveforms of adder control
LIST OF TABLES
Table
1	 Comparison of Average Measured Delay and Computer-Predicted Delay
2	 Technical Analysis Parameters
3	 Standard Cell Height Components
4	 SOS Test Chip (011) Device Parameters
5	 Documented CMOS/SOS Standard Cell Family
6	 Measured Test Transistor Characteristics
7	 SUMC-CVT CMOS/SOS LSI Array Performance Measurements
Section 1
INTRODUCTION
In the 1966-1968 time frame, RCA, supported by contract, developed and
successfully implemented a standard cell design automation (DA) approach for generating
low cost, quick-turnaround custom LSI arrays.¹ The computer programs, as well
as the circuit design and layout, were customized for the dynamic, two-phase, high
threshold PMOS ratio logic circuitry. This DA approach has since proven itself in
the production of hundreds of custom LSI arrays by different contractors on a number
of Government programs.
In 1969, with contract support from NASA-ERC, RCA began a program to extend
the standard cell DA capability to the CMOS technology. Because CMOS circuitry is
a static technology, this program required a completely different approach than that taken
with PMOS for the circuit design and layout as well as the basic algorithms of the
computer programs. This CMOS standard cell DA development was successfully
demonstrated in measurements on a CMOS standard cell test chip specifically designed
to evaluate the program results.²
To demonstrate the effectiveness of the CMOS DA capability in reducing the cost
and design times of systems using these LSI arrays, as well as in providing enhanced
performance, RCA designed and built a 16-bit computer -- the SUMC-DV -- with
contract support from NASA-MSFC, Huntsville, Alabama. The SUMC-DV uses 10
different LSI array types in a total LSI usage of 55 parts. The entire system was
successfully designed and built in one year, including the design, fabrication and
testing of the 10 LSI array types using the CMOS standard cell DA capability.³
1. "Banning Programmer's Manual for Artwork Program", Final Report, Contract
DA-18-119-AMC-03460(X), August 1967.
2. CMOS Array Design Techniques, Quarterly Report No. 1, Contract NAS12-2233,
March 1970.
3. SUMC-DV Hardware Manual, Nov. 1972, Prepared on Contract NAS12-2233,
"CMOS Array Design Techniques", NASA-MSFC, Huntsville, Ala.
Following the successful implementation of the SUMC-DV, several application
areas for design automation LSI were identified. One of these was a requirement for
a high performance, low power, sophisticated fault-tolerant multiprocessor system
using 32-bit CPUs. Because of the performance and throughput requirements in this
computer development, silicon-gate, beam-lead bulk CMOS technology was selected
for its high speed, high packing density, and low power dissipation potential. Support
for the development of the DA capability for this technology was received in one of the
earlier modifications (mod 6) to Contract NAS12-2233 (described in the Appendix to
this report).
As a result, a family of silicon-gate, beam-lead CMOS bulk silicon standard cells
was developed. The automatic placement and routing program was modified to be
compatible with the polysilicon-gate, beam-lead bulk silicon process. A functional
test chip was designed and tested with only moderate success in achieving the high
performance required and anticipated. Because of this limited success with bulk
CMOS and because of substantial advances in establishing a mature and stable CMOS/SOS
pilot line*, the technology for the multiprocessor was switched to self-aligned,
silicon-gate CMOS/SOS to take advantage of its higher performance, high density and
lower power capabilities as well as to capitalize on the CMOS/SOS process
development in RCA.
A further modification (mod 11) to Contract NAS12-2233 enabled RCA to develop
the standard cell LSI design automation capability for the silicon-gate CMOS/SOS
technology (described in Sections 2 and 3 of this report). Work was performed and
results obtained in the following areas:
• Improvements and modifications to the automatic placement and
routing program for the CMOS/SOS technologies
• Analysis and developments relating to basic cell design and layout
• Description of CMOS/SOS standard cell design and layout
• Description of the CMOS/SOS standard cell chip layout
• Functional description of the cells in the NASA CMOS/SOS standard
cell family
• Description of cell data sheets
• Description of the standard cell test chip
• Documentation, evaluation and interpretation of the performance and
characteristic data measurements made on the test chip
• Documentation, evaluation and interpretation of the static and dynamic
performance measurements which were made on an Independent
Research and Development program on five different SUMC-CVT
CMOS/SOS chip types designed on Contract NAS8-29072 with the
standard cell approach.
* RCA Solid State Division had established separate facilities for special purpose,
large volume CMOS/SOS applications.
Section 2
WORK REQUIREMENTS AND SUMMARY OF ACCOMPLISHMENTS
A low cost, quick turnaround technique for generating custom CMOS/SOS LSI arrays
using the standard cell approach was developed, implemented, tested and validated.
This was, in essence, the objective of this program. To achieve this result, a series
of intermediate objectives and goals had to be, and were, accomplished. These accomp-
lishments and results include the following:
(1) The Automatic Placement and Routing Computer program was modified and
enhanced to ensure compatibility with, as well as to optimize the performance of, the
self-aligned silicon-gate CMOS/SOS technology.
(2) A basic cell design topology and guidelines were defined based on an extensive
analysis that included circuit, layout, process, array topology and required performance
considerations -- particularly high circuit speed. A standard cell height of 7 mils and
a minimum pad spacing of 1 mil were established. In addition to meeting the principal
design consideration of speed, the cell area of CMOS/SOS was dramatically reduced
compared to that for CMOS bulk standard cell design. For example, a two-input NOR
requires 79.8 square mils in the metal-gate standard cell family and only 21 square mils
in the CMOS/SOS standard cell family with virtually the same size devices -- a reduction
of almost four to one.
(3) A family of 11 self-aligned, silicon-gate CMOS/SOS standard cell circuits was developed
on the program. Additional cells were added to the family as a result of work done on NASA
contract NAS8-29072 and RCA Independent Research and Development programs. For each of
the 11 logic cell types developed on this program a user-oriented data sheet has been generated.
The data sheet contains the circuit design, logic configuration, input and output capacitance,
and dynamic performance design information. The performance of each of these cells was
validated initially by computer simulation techniques and later by direct performance
measurements on LSI arrays designed for the NASA CMOS/SOS SUMC computer. Similar data were
generated for other cells designed on the other programs. These cells, as well as the 11 designed
for this program, were incorporated into a separate, convenient, expandable standard cell
notebook.
(4) The silicon-gate CMOS/SOS test chip was designed not only to provide experimental
validation that the standard cells functioned properly but also, more critically, to
determine that their dynamic performance correlated with the predicted delays based on
computer simulation. This performance validation was verified. For example, the average
stage delay for a two-input NOR circuit in a serial logic chain containing eight levels of
logic was less than 1.6 ns for the devices with 0.25-mil channel lengths and approximately
2.4 ns for devices with a 0.3-mil channel length. In addition, the test chip provided
device and characterization data which was used to update the values of the device model
parameters used in the computer simulation program. By such means the accuracy of the
speed predictions based on computer simulation techniques is increased.
(5) Since some of the cells developed had not been designed when the test chip was
laid out and fabricated, they do not appear on the test chip. These cells were
experimentally validated by the measured data taken on five CMOS/SOS custom LSI arrays
designed for the SUMC-CVT computer system. These custom LSI arrays varied in
complexity from a 150-gate 4 x 2 multiplexer array to a 450-gate, multiplexed, 8-bit
adder array.
(6) The correlation between the average measured delay and the delay predicted by
the computer simulation program was excellent -- well within design tolerances. For
example, the difference between the average measured delay and computer-predicted
delay for specially selected logic paths on each of the five chips can be seen from Table 1.
TABLE 1. COMPARISON OF AVERAGE MEASURED DELAY AND COMPUTER-PREDICTED DELAY

Custom Standard Cell                 Computer      Average       Measured
CMOS/SOS Array Name                  Predicted     Measured      Average Stage
                                     Delay (ns)    Delay (ns)    Delay (ns)
Floating Point Multiplexer               27            23            4.3
Up/Down Counter                          46            52            5.8
8-Bit Adder with Carry Prediction        75            73            6.0
                                        103           105
9-Bit 4 x 2 Multiplexer                  50            60            8.0
Adder-Multiplexer Control                66            67
                                         71            76            5.2
As seen in the table, there is generally good correlation between the predicted and
measured results. Differences fall within design tolerance. The major significance of
the correlation between predicted and measured results is that the circuit speeds achieve
the dynamic performance objectives for which they were designed -- the NASA SUMC-
CVT computer system program.
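As a rough illustration of the comparison in Table 1, the following sketch computes the signed difference between measured and predicted delay for each listed path. The delay values are transcribed from Table 1; the "path 1" and "path 2" labels for arrays with two rows are an assumption introduced here for readability, and the report does not state a numeric tolerance, so none is applied.

```python
# Predicted vs. average measured path delays (ns), transcribed from Table 1.
# The "(path 1)" / "(path 2)" suffixes are hypothetical labels, not from the report.
TABLE_1 = [
    ("Floating Point Multiplexer", 27, 23),
    ("Up/Down Counter", 46, 52),
    ("8-Bit Adder with Carry Prediction (path 1)", 75, 73),
    ("8-Bit Adder with Carry Prediction (path 2)", 103, 105),
    ("9-Bit 4 x 2 Multiplexer", 50, 60),
    ("Adder-Multiplexer Control (path 1)", 66, 67),
    ("Adder-Multiplexer Control (path 2)", 71, 76),
]


def percent_error(predicted_ns, measured_ns):
    """Signed difference of measured from predicted, as a percent of predicted."""
    return 100.0 * (measured_ns - predicted_ns) / predicted_ns


def summarize(rows):
    """Return (name, percent error rounded to 0.1) for every row."""
    return [(name, round(percent_error(p, m), 1)) for name, p, m in rows]


if __name__ == "__main__":
    for name, err in summarize(TABLE_1):
        print(f"{name:45s} {err:+6.1f} %")
```

A positive value means the measured path was slower than the simulation predicted.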
Section 3
TECHNICAL DESCRIPTION
A. CMOS/SOS PR2D AUTOMATIC PLACEMENT AND ROUTING PROGRAM
The changes, modifications and improvements introduced into the PR2D automatic
placement and routing program to obtain compatibility with and optimization of the
CMOS/SOS technology are as follows:
(1) Power Distribution, Chip Criterion. The power distribution routine is similar
to the double-sided metal ground in the bulk CMOS version of the PR2D program. Each
cell row has VDD and ground routed through the cell row and connected on the side to
the peripheral power. The VDD is routed at the top of the cell row (midway between
back-to-back cell rows) and routed on side row number 1 to the peripheral power. The
ground is routed at the bottom of the cell row (both top and bottom of back-to-back
cell rows), joined at the right side of the cell rows, and routed to the peripheral power
on side row number 2. For odd cell rows, the ground is routed at the top of the cell row and
via side row number 1 to the peripheral power. The VDD is routed at the bottom of the
cell row and via side row number 2 to the peripheral power.
(2) Power Distribution, Peripheral. The SOS option for the peripheral power is an
interrelated set of two nonintersecting open double "O" patterns. The left side of the
chip (row number 1) has the power closer to the chip interior. The right side of the
chip (row number 2) has the VDD closer to the chip interior. This configuration allows
VDD and ground to be all-metal lines from the power bonding pads to the cell within the
chip. This peripheral power distribution is required in order to properly distribute the
ground and VDD to the cell rows.
(3) Power Bonding Pad Locations. In this modification the PR2D program places
the ground power bonding pad in the lowest position on row number 1 in the lower
left-hand corner of the chip, and the VDD power bonding pad in the highest position on row
number 2 in the upper right-hand corner of the chip. These locations are required
because of the peripheral power distribution.
(4) Zero Channel Routing. Because of the SOS standard cell design, zero channel
routing is permitted. The SOS version of the PR2D program has been modified to allow
zero channel routing. This feature will decrease the number of horizontal channels
required by one per cell row.
(5) Zero Channel Tunnel Ends. To provide compatibility with the standard cell
design, zero level tunnel ends are required when the routing to the pin of a cell is
metal. This tunnel end will connect the metal routing to the tunnel level for connectivity
to the cell interior.
(6) Chip Borders. This SOS option of the PR2D program will generate the proper
border requirements on all mask levels for the SOS technology.
(7) Low Profile Metal. The low profile metal option has been modified for the SOS
technology. The modification will not allow lower profile of metal in the same vertical
channel which contains another node. This feature, while still allowing low profile
metal, will not increase the node crossover. The low profile metal option has also
been debugged further. In addition, a new feature -- removal of unneeded low profile
metal -- has been added. This will reduce the number of artwork instructions generated.
(8) New Features. One of the new features added in the SOS modification of the
PR2D program is the modified class 2 routing from the odd cell to the top bonding pads.
Normally this would be a class 3 route. When this option is exercised, the program will
generate class 2 routing instead of class 3. This feature improves routing by not routing
these nodes on the side surfaces. However, it is required that all such routes be to the
center bonding pads of the top pads; that is, no top bonding pad which does not go to the
odd cell row can be imbedded within those bonding pads which do go to the odd cell row.
(9) General. Additional debugging of the basic program has been done. This should
improve even further the quality of the chip design.
B. SILICON-GATE CMOS/SOS TECHNOLOGY
Although the CMOS/SOS standard cell library was designed and laid out for the
double-epitaxial, silicon-gate SOS process, SOS standard cell arrays may be fabricated
with either the double-epitaxial or the single-epitaxial technology. The mask set for
any of the processes is extracted from a common set of masks.
The silicon-gate, single-epitaxial processes actually require that the first two
photomasks, used in the double-epitaxial process, be superimposed to form a new photomask.
The new mask is the logical "or" of the first two. This is routinely done in the photomask
shop whenever the single-epitaxial process is elected. All other photomasks are identical
for the two processes. The similar nature of the single- and double-epitaxial fabrication
techniques along with their pilot line status virtually guarantees the continuance of identical
design/layout rules for the two technologies. This, in turn, guarantees that the photomasks
for the double-epitaxial silicon-gate standard cell family will continue to be compatible with
the single-epitaxial processes. Characteristics common to both processes are described
in the following paragraphs.
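The superposition of the first two photomasks can be pictured with a toy example. The sketch below treats each mask as a small grid of 0/1 cells and forms the cell-by-cell logical OR. This is only an illustration under that assumption: real photomasks are geometric artwork rather than bitmaps, and the function name and grids are hypothetical.

```python
def merge_masks(mask_a, mask_b):
    """Return the cell-by-cell logical OR of two equally sized 0/1 grids."""
    same_rows = len(mask_a) == len(mask_b)
    if not same_rows or any(len(ra) != len(rb) for ra, rb in zip(mask_a, mask_b)):
        raise ValueError("masks must have identical dimensions")
    return [[a | b for a, b in zip(ra, rb)] for ra, rb in zip(mask_a, mask_b)]


# Hypothetical 3 x 6 grids: 1 marks an island region on that mask level.
first_mask = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
second_mask = [
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
]

if __name__ == "__main__":
    for row in merge_masks(first_mask, second_mask):
        print("".join("#" if cell else "." for cell in row))
```

Every region present on either input grid appears on the merged grid, which is the property the single-epitaxial mask needs.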
1. Insulating Substrate
The combination of isolated thin film islands and an insulating sapphire substrate
is the principal reason that this technology delivers a maximum level of performance,
and yet provides a high gate density on an array at an extremely low power level.
2. Guard Band Elimination
Because this technology provides complete isolation between devices when
required (the sapphire substrate eliminates the possibility of field inversion), anti-field
inversion and anti-parasitic device techniques, such as P+ and N+ diffused guard bands,
field shield and counter diffusion, are not required. The elimination of diffused guard
bands allows a significant reduction in the area of circuit layouts with no performance
penalties. Layouts show a 3-to-1 to 4-to-1 area reduction for standard cells using
this technology compared to the cell area using the standard CMOS metal-gate, bulk
silicon technology.
3. Minimum Capacitance Technology
The self-aligned, silicon-gate CMOS/SOS technology virtually eliminates all
junction and interconnection capacitances. The only junction capacitance which exists
in this process is associated with the sidewalls of the diffused junction in the thin film
islands. And except for the input capacitance, which depends on transistor width and is
near minimum because of the polygate self-alignment technique, the only other significant
capacitance is the coupling or crossover capacitance between the metal and polysilicon
interconnect routing.
The principal result of this low inherent capacitance is to enhance on-chip
circuit performance.
Alternately, speed can be traded for density when minimum size devices are
used. These devices, constrained principally by the topological design rules, can be
used for circuit design with little loss in performance.
4. Maximum Speed/Area Ratio
The propagation delay of this technology in an LSI environment does not vary
extensively with device size. Therefore speed and performance are not seriously
sacrificed by reducing the device size and the corresponding area, as they commonly
are with other MOS technologies. The virtual elimination of the junction and substrate
capacitance is the principal reason that the CMOS/SOS technology has such a high
speed/area ratio.
5. Elimination of Substrate Effect
Because the sapphire substrate is essentially an insulator, the increase in
threshold voltage as the source-to-substrate reverse bias is increased, which most
MOS technologies experience, is eliminated in this technology. Therefore, even when
devices are stacked, the size of the devices need not be increased because of the
source-to-substrate effect.
6. Simplicity of Process and Design Procedure
Although the silicon-gate CMOS/SOS process has many unusual features, it
requires only six* process photomask steps plus a passivation mask. This is identical
in number to the standard CMOS metal-gate technology. Thus, among the newer high
performance, high density technologies, the silicon-gate CMOS/SOS has an excellent
potential for maturing into a low cost, high volume technology. RCA has delivered a
large volume of a CMOS/SOS LSI array to an industrial customer over the past year,
and has announced several CMOS/SOS LSI products and will be offering at least one
each month in 1975.
In terms of design procedures and layout rules, elimination of the guard bands
makes this technology easier to use.
A cross-section of the silicon-gate, double-epitaxial topology appears in Fig. 1.
The numbers in parentheses indicate photomask numbers. The levels are:
Level 1: P-Epitaxial Island Definition
Level 2: N-Epitaxial Island Definition
Level 3: Polysilicon Gate and Interconnect Definition
Level 4: N-Type Diffusion Definition
Level 5: Contact Hole Definition
Level 6: Metal Interconnect Definition
Level 7: Passivation Mask Opening Definition
* The silicon-gate, double-epitaxial SOS process requires six process photomasks; the
silicon-gate, deep-depletion SOS process requires five process photomasks.
[Figure 1: process cross-sections, including SOS epitaxy, gate definition, and metal steps.]
Note: Numbers in parentheses indicate photomask numbers.
Fig. 1. Silicon-gate CMOS on sapphire process, double epitaxy.
C. ANALYSIS AND BASIC CELL DESIGN
1. Technical Analysis
This section discusses the analysis involved in defining the basic device geometry
and the standard cell height for the selected double epitaxial, P+ poly, CMOS/SOS process.
Initial considerations center on capturing a nominal set of processing parameters
and design rules for the process. The parameters used in the analysis are listed in
Table 2. The saturation currents, for 10.0-V operation, resulting from these
parameters are 2.0 mA/mil and 1.2 mA/mil for the N- and P-type transistors, respectively.
Although the N- to P-transistor current ratio is 1.67 (2.0/1.2), the actual transistor
geometry design ratio was fixed at 1.8:1, which, based on past experience, is considered
a conservative estimate.
TABLE 2. TECHNICAL ANALYSIS PARAMETERS
VTN  = 2.3 V
NA   = 1 x 10^15/cm^3
μn   = 42? cm^2/volt-second
Leff = 0.20 mil (with 0.30-mil mask)
εox  = 3.9
εs   = 11.7
Tox  = 1100 Å (gate)
Tox' = 6000 Å (between poly and metal)
VTP  = -2.1 V
ND   = 3 x 10^15/cm^3
μp   = 251 cm^2/volt-second
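Two quantities follow directly from the Table 2 values and the quoted saturation currents. The sketch below is a back-of-envelope calculation, not the report's simulation model: it uses the ideal parallel-plate formula (no fringing), assumes the poly-to-metal dielectric has the same relative permittivity of 3.9 as the gate oxide because Table 2 lists only one value, and all function and constant names are introduced here for illustration.

```python
EPS0_F_PER_CM = 8.854e-14   # permittivity of free space, F/cm
CM_PER_MIL = 2.54e-3        # 1 mil = 0.001 inch = 2.54e-3 cm
CM_PER_ANGSTROM = 1e-8


def oxide_cap_pf_per_mil2(eps_rel, thickness_angstrom):
    """Ideal parallel-plate capacitance per unit area, in pF per square mil."""
    farads_per_cm2 = EPS0_F_PER_CM * eps_rel / (thickness_angstrom * CM_PER_ANGSTROM)
    return farads_per_cm2 * CM_PER_MIL ** 2 * 1e12


def current_ratio(i_n_ma_per_mil, i_p_ma_per_mil):
    """Ratio of N- to P-transistor saturation current per mil of width."""
    return i_n_ma_per_mil / i_p_ma_per_mil


if __name__ == "__main__":
    gate = oxide_cap_pf_per_mil2(3.9, 1100)       # gate oxide, Tox
    crossover = oxide_cap_pf_per_mil2(3.9, 6000)  # poly-to-metal oxide, Tox'
    print(f"gate oxide:        {gate:.3f} pF per square mil")
    print(f"poly-metal oxide:  {crossover:.4f} pF per square mil")
    print(f"N/P current ratio: {current_ratio(2.0, 1.2):.2f}")
```

The current ratio reproduces the 1.67 figure quoted in the text; the capacitance figures only indicate the order of magnitude implied by the oxide thicknesses.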
To determine the on-chip circuit speeds of a generalized combinatorial LSI
array using this process, a chain of two-input NOR circuits was analyzed. Each
two-input NOR cell was loaded with a NOR circuit and a large inverter. The N- and
P-transistors comprising the inverter were chosen to be large enough that the total load
on each NOR circuit would be the approximate equivalent of 3.6 two-input NOR circuits.
A fanout of 3.6 for the NOR circuits is a reasonably conservative estimate of the
average fanout of a generalized logic array. One result of selecting this value for the
fanout is that the circuit speeds will be lower than would be expected had some special
purpose functional logic array with lower fanout been selected. However, a fanout in
excess of three is considered typical in general purpose digital systems where extensive
combinatorial and sequential logic are used in a nonregular way.
Care was taken to consider those features of the current DA system that would
influence effective on-chip circuit speeds. Accordingly, the intercell resistance associated
with the polysilicon interconnects was included in the evaluation circuit. Similarly, the
crossover capacitance introduced by the metal/polysilicon crossovers in the intercell
connection area was included. Figure 2 shows the generalized evaluation circuit with
the intercell resistance and crossover capacitance represented as lumped parameters.
[Figure 2 schematic: chain of two-input NOR stages with lumped intercell parameters.]
R   = intercell resistance
Cin = fanout equivalent to 3.6 two-input NORs
Cc  = intercell poly/metal crossover capacitance
Fig. 2. Evaluation circuit for CMOS/SOS standard cell.
Although the length of the polysilicon interconnect resistance between two cells
can range from 0 mil to perhaps 100 mils, past experience with more than 50 custom LSI
arrays indicates that 20 mils is a fairly representative length. (Present plans call for
polysilicon runs to be 0.80 mil wide.) With resistivity of 70 ohms/square, the 20 mils
of polysilicon interconnect converts to a 1.8-kilohm resistance. For completeness, the
effects of 46-mil (4 kΩ) and 6-mil (0.5 kΩ) polysilicon interconnects were also investigated.
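The conversion from interconnect length to resistance is a count of squares multiplied by the sheet resistance. A minimal sketch, using the 0.80-mil width and 70 ohms/square quoted above (the function name is introduced here for illustration):

```python
def poly_resistance_ohms(length_mils, width_mils=0.80, sheet_ohms_per_square=70.0):
    """Resistance of a uniform polysilicon run: (length / width) squares x sheet resistance."""
    squares = length_mils / width_mils
    return squares * sheet_ohms_per_square


if __name__ == "__main__":
    for length in (6, 20, 46):
        ohms = poly_resistance_ohms(length)
        print(f"{length:3d} mils -> {ohms / 1000:.2f} kilohms")
```

The three lengths give about 0.5, 1.8, and 4.0 kilohms, matching the rounded values used in the analysis.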
The parasitic capacitance introduced by each metal/polysilicon crossover in the
intercell connection area is 0.01 pF of either coupling or additional loading capacitance.
Examination of past LSI array designs indicates that a given connection between two
cells may have no metal/polysilicon crossovers or hundreds of them. Although the
distribution is quite wide, an average of 30 crossovers per cell output was assumed.
Further statistical analyses of current custom LSI arrays showed that 40 is a more
accurate estimate of the average number of interconnections. The design will be
modified to include these new statistics. The effects of the crossovers were considered
in two ways: first, as an additional lumped capacitance load (as shown in Fig. 2), and
second, as a means of injecting crosstalk or noise into a lightly loaded circuit (as shown
in Fig. 3).
[Figure 3 schematic: lightly loaded circuit, WP = 0.9 mil, WN = 0.5 mil, with 0.04 pF coupling capacitances.]
Fig. 3. Evaluation circuit for crosstalk due to metal/polysilicon overlap.
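The lumped crossover load is simply the crossover count times 0.01 pF. The sketch below computes it for the assumed average of 30 and the revised estimate of 40, and also forms the product of that load with the interconnect resistance. That RC product is offered only as a rough scale for the parasitics, under the assumption that the whole crossover load sits behind the full interconnect resistance; it is not a simulated delay, and the function names are hypothetical.

```python
CAP_PER_CROSSOVER_PF = 0.01  # per metal/polysilicon crossover, from the text


def crossover_load_pf(n_crossovers, cap_each_pf=CAP_PER_CROSSOVER_PF):
    """Total lumped capacitance contributed by the crossovers on one net."""
    return n_crossovers * cap_each_pf


def lumped_rc_ns(resistance_kohm, capacitance_pf):
    """Product of a lumped R and C; kilohms x picofarads gives nanoseconds."""
    return resistance_kohm * capacitance_pf


if __name__ == "__main__":
    for count in (30, 40):
        load = crossover_load_pf(count)
        print(f"{count} crossovers -> {load:.2f} pF, "
              f"RC with 1.8 kilohms = {lumped_rc_ns(1.8, load):.2f} ns")
```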
With the parasitic resistive and capacitive models defined, several computer
simulation analyses were made. Considering interconnection resistance alone, a
separate run was made for four versions of the circuits shown in Fig. 2. For each
case, the transistor geometries (specifically channel widths) were adjusted to be 1/3 X,
1/2 X, 1X, and 2X those shown in Fig. 2. As the transistor sizes were modified, the
parasitic interconnect resistance was kept constant.* This was repeated for resistance
values of 0.5 kΩ, 1.8 kΩ, and 4.0 kΩ. The results, shown in Fig. 4, clearly indicate
that for the case of no interconnect capacitance, increased transistor geometries
deteriorate circuit speed. This is shown analytically in Fig. 5, which gives the load
charging time constants for various device widths. RT and CT are the intrinsic output
resistance and input capacitance of the silicon-gate CMOS/SOS devices. If the effect of
the coupling capacitance between crossing signal lines on the array is assumed to be
negligible, then CT is essentially the only effective on-chip capacitance. The resistor
RC represents the interconnection resistance associated with polysilicon
interconnections between the cells. Because the polysilicon crosses over the sapphire
substrate, the capacitance to substrate is essentially zero. Therefore, a net on the array
can be represented as shown in the equivalent circuit in Fig. 5. If we assume a
reference device size and normalize it, the load time constant, as shown in the first row
of the table, is (RT + RC) CT. Doubling the device width, as shown in row 2, yields a
time constant (RT + 2RC) CT, because RT is halved and CT is doubled with respect to
case 1. As shown in the third column, the time constant has increased with respect to
case 1 by the quantity RC CT. Similarly, case 3 shows that by reducing the device width
by one-half, the time constant becomes (RT + RC/2) CT, a decrease of RC CT/2 with
respect to case 1. The delay associated with the load time constant has therefore
decreased by RC CT/2, corresponding to an increase in speed as compared to case 1.
Thus, the presence of the interconnection resistance RC produces the result that the
smaller devices provide the higher speed and performance, if a zero signal coupling
capacitance is assumed. For the case where RC = 0, the time constant and therefore the
switching speed are constant, independent of the device width. On the basis of this
result alone, transistor sizes should be reduced to an absolute minimum to enhance
circuit speeds.
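
The width-scaling argument above can be checked numerically. The following is a
minimal sketch, assuming only the scaling stated in the text (RT inversely proportional
to device width, CT proportional to device width, RC fixed). The numerical values of RT
and CT are hypothetical placeholders chosen for illustration; only the 1.8-kΩ value of
RC is taken from the report.

```python
def load_time_constant(width_scale, r_t, r_c, c_t):
    """Load time constant for a device whose channel width is scaled by width_scale.

    r_t scales as 1/width and c_t scales as width, while the interconnect
    resistance r_c is independent of device size.
    """
    return (r_t / width_scale + r_c) * (c_t * width_scale)


R_T = 2.0e3     # ohms, hypothetical intrinsic output resistance at 1X
C_T = 0.1e-12   # farads, hypothetical input capacitance at 1X
R_C = 1.8e3     # ohms, nominal polysilicon interconnect resistance (from the text)

tau_half = load_time_constant(0.5, R_T, R_C, C_T)
tau_1x = load_time_constant(1.0, R_T, R_C, C_T)
tau_2x = load_time_constant(2.0, R_T, R_C, C_T)

increase_2x = tau_2x - tau_1x      # expected to equal RC*CT
decrease_half = tau_1x - tau_half  # expected to equal RC*CT/2

# With RC = 0 the time constant does not depend on the width at all.
tau_no_rc = [load_time_constant(k, R_T, 0.0, C_T) for k in (0.5, 1.0, 2.0)]

print(f"2X   increase: {increase_2x:.3e} s (RC*CT   = {R_C * C_T:.3e} s)")
print(f"1/2X decrease: {decrease_half:.3e} s (RC*CT/2 = {R_C * C_T / 2:.3e} s)")
```

The sketch deliberately omits the crossover coupling capacitance, as does the argument
it illustrates.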
With an assumed 30 crossovers/interconnect (30 x 0.01 = 0.3 pF), the evaluation
circuit of Fig. 2 was analyzed for the four transistor geometries (1/3 X, 1/2 X, 1X, and
2X) and two assumed parasitic resistance values (1.8 kΩ and 0.5 kΩ).
When both the parasitic resistance and the crossover capacitance are included
in the analysis, the curves of Fig. 4 continue to show maximum performance for
reduced device size. The results show that optimum performance will be obtained for
standard transistor geometries between 1/2 X and 1X those in Fig. 2. Figure 6
illustrates this relationship. Past measurements on LSI arrays indicate nominal
polysilicon interconnect resistances of 1.8 kΩ. Therefore, device sizes somewhat in
excess of 1/2 X would be appropriate to the initial design phase.
* The length of the polysilicon runs and the number of crossovers per interconnect are
essentially functions of LSI array interconnection complexity and may therefore be
considered essentially independent of transistor size.
Fig. 4. Pair delay vs transistor channel width. (Horizontal axis: normalized device
width; the legible curve labels include CC = 0.)
Equivalent circuit: RT in series with RC, charging CT.

                              Time Constant        Change in Time Constant
Case 1  Normalized width      RT CT + RC CT        --
Case 2  Double device width   RT CT + 2 RC CT      RC CT (increase)
Case 3  Half device width     RT CT + RC CT/2      RC CT/2 (decrease)

Fig. 5. Speed vs device width.
(Fig. 6 plot. Legible curve label: R = 0.5 kΩ, CC = 0.30 pF.)
Figure 6 also suggests that devices designed within the 1/2 X to 1X region will
differ in nominal stage delay by 1/2 ns, at most. In other words, designs falling
within the 1/2 X to 1X region will be optimized to within the accuracy limits of this
analysis.
However, the impact of signal crosstalk must still be considered. Small devices
with their inherently low input capacitance may, conceivably, be sensitive to the fixed
coupling capacitors associated with the metal/polysilicon crossovers in the interconnect
area. These crossovers may number from 0 to more than 100 in a typical array. As
a means of estimating the effects of the crossover coupling, the circuit shown in Fig.
3 was analyzed. Two minimum size inverters, isolated by a 2-kΩ resistor (representing
the resistive effects of more than 20 mils of polysilicon interconnection), were
subjected to the simulated effects of 30 crossover lines switching simultaneously in the
same direction. The switching waveforms were simulated by the oversize inverter I2.
Noise spikes were observed at node A. For 30 and 60 crossover lines switching
simultaneously, the noise spikes observed at node A were 3.0 V and 5.0 V, respectively.
The short duration of the spikes at A (less than 6 ns) and the response time of inverter
A-B combined to produce a maximum voltage swing of 0.5 V at node B. These favorable
results, in addition to the conservative nature of the evaluation circuit (specifically,
minimum size inverters experiencing the crosstalk of 60 simultaneously switching
crossovers, etc.), led to the design of a standard cell family at the 1/2 X scale.
Fig. 6. Pair delay vs transistor channel width. (Fanout equivalent to
3.6 two-input NORs.)
The analysis of the required standard cell heights started with a survey of
the existing bulk CMOS standard cell family.* Figure 7 illustrates one such cell.
Table 3 summarizes the typical cell height usage for the bulk family. As can be seen,
out of 14.0 mils of cell height, the bulk family uses 3.8 mils with the required guard
bands and spacings. An additional 1.0 mil is consumed by the cell I/O pads. The
remaining 9.0 mils of cell height is used for device construction and intracell
connections and can be apportioned nominally into 6.5 mils and 2.5 mils, respectively.
To determine the SOS cell height: If the area allotted to intracell connection
is kept at 2.5 mils (for the new cell family), and if all SOS transistors are designed
within the 1/2 X to 1X ratio (as determined earlier), an additional 4.0 mils of cell
height will be needed for device design. Then, allowing 0.4 mil for guard bands
(Table 3) and only 0.1 mil for the new SOS I/O pad height, we conclude that an SOS
cell family, comparable in effective device and intraconnection area to the bulk
family, may be designed around the 7.0-mil height.
TABLE 3. STANDARD CELL HEIGHT COMPONENTS

Cell Height               Bulk CMOS Cell             Proposed SOS Cell
Consumed By               Height Statistics (mils)   Height Statistics (mils)
Guard bands               3.8                        0.4
I/O Pads                  1.0                        0.1
Devices                   6.5                        4.0
Device Interconnection    2.5                        2.5
Total                     14.0                       7.0
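
The proposed SOS height budget can be tallied directly from the Table 3 entries. The
sketch below is illustrative only; the dictionary keys are names introduced here, and
the values are the proposed SOS column of the table.

```python
# Proposed SOS cell height components, in mils (Table 3, SOS column).
sos_height_mils = {
    "guard_bands": 0.4,
    "io_pads": 0.1,
    "devices": 4.0,
    "device_interconnection": 2.5,
}

# Total cell height is the sum of the four components.
sos_total_mils = round(sum(sos_height_mils.values()), 3)

# Fraction of the cell height that is available for devices.
device_fraction = sos_height_mils["devices"] / sos_total_mils

print(f"SOS cell height: {sos_total_mils} mils")
print(f"Device share of height: {device_fraction:.1%}")
```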
2. Technical Analysis Techniques Verification
The validation and updating of both simulation techniques and device model
parameters are of extreme importance in generating performance optimization analyses
and delay-transition time data. In either case, it serves as the vital link between
theoretical analysis and measured performance — in short, a calibration mechanism.
Figure 8(a) shows one of the circuits implemented on the ATL-011 test chip.
It is one of the principal circuits used to validate the accuracy of the device model and
simulation technique. The circuit model for the logic chain is shown in Fig. 8(b). The
1310 and 1520 buffer cells are designed with the 0.30-mil polygate width. Notice in
Fig. 8(b) that an additional 1310 cell has been added to the front of the test circuit.
* CMOS array design techniques, Quarterly Report No. 2, Contract No. NAS12-2233.
Fig. 7. Two-input NOR layout. (Legible labels: N-MOS transistors, P-MOS transistors,
cell height, 50% reduction in node capacitance, minimum I/O pad size; vertical scale
0 to 14 mils.)
(a) Two-input NOR test chain. (Legible labels: a chain of 3120 two-input NOR stages
followed by a 1310 and a 1520 buffer; CL = 5.5 pF; 3120 L = 0.25 mil; 1310 and
1520 L = 0.3 mil.)
(b) Circuit model for two-input NOR test chain.
Fig. 8. Two-input NOR test chain and simulated circuit.
The program-generated pulse is applied to node 3. The output of that circuit, node 5,
whose waveform more closely approximates actual on-chip waveforms, becomes the
input waveform for purposes of computing propagation delay. The model for the
transistor is a sophisticated series of nonlinear equations that is incorporated into the
circuit simulation program. The values of the SOS processing parameters used in the
simulation were taken from representative samples of ATL-011 test chips and are
shown in Table 4. These parameters were used, together with the model, to generate
the propagation delay data for comparison with measured values.
TABLE 4. SOS TEST CHIP (011) DEVICE PARAMETERS

Symbol                     Description                                    Value Used in Simulation
VTP                        Threshold Voltage P-Transistor                 -1.5 V
VTN                        Threshold Voltage N-Transistor                 +1.5 V
Leff                       Effective Channel Length                       0.2 mils
µp                         Effective Surface Hole Mobility (VGS = 0)      400 cm²/V-s
µn                         Effective Surface Electron Mobility (VGS = 0)  500 cm²/V-s
ND                         Donor Density                                  1.5 x 10^15/cm³
NA                         Acceptor Density                               1.5 x 10^15/cm³
TOX                        Gate Oxide Thickness                           1200 Å
Slope N (In Saturation)    Slope of Drain Characteristic - N              0.02
Slope P (In Saturation)    Slope of Drain Characteristic - P              0.01
CGS                        Capacitance (Gate-to-Source)                   0.2 pF/mil²
CGD                        Capacitance (Gate-to-Drain)                    0.2 pF/mil²
RP                         Resistance of Polysilicon Strip                60 ohms/sq
(a) Waveforms of measured delay.

TWO-INPUT NOR TEST CHAIN
                         TC 101            Simulation         TC 105
                         Measured Delay    Simulated Delay    Measured Delay
                         (ns)              (ns)               (ns)
VIN to OUT               17.5              18.9               15
Average stage delay      (17.5 - 7.7)/6    6.4/4              --
                         = 1.6             = 1.6

(b) Measured and simulated results.
Fig. 9. Validation of simulation techniques.

In the first and third columns of Fig. 9(b), the measured propagation delays at
10 V of the two-input NOR test chain for test chips 101 and 105, respectively, are listed.
The 15-ns delay from Vin to Vout for TC 105 is a repeat of the data presented by the
actual waveforms in Fig. 9(a). Similarly, the 17.5 ns is the corresponding measured
propagation delay for TC 101. Column 2, the simulated time delay, shows the results
of the computer simulation for the delay through the two-input NOR chain using the model
and parameters described. As shown, the predicted time delay through the complete
test chain is 18.9 ns vs the measured delay of 17.5 ns for TC 101 and 15 ns for TC 105.
Thus, the simulation in this case is somewhat conservative or pessimistic in the results
it generates. It is clear that because of these high speeds even a 1-ns deviation is
significant and measurably detectable. Since it is good design practice to operate with
conservative estimates, the parameters that produce the 18.9-ns delay were those
that were used to produce the cell data delay and characterization curves.
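
The degree of pessimism in the simulation can be expressed as a percentage of each
measured delay. The sketch below uses only the three chain delays quoted in the text
(18.9 ns simulated; 17.5 ns and 15 ns measured); the function and variable names are
introduced here for illustration.

```python
SIMULATED_CHAIN_DELAY_NS = 18.9
MEASURED_CHAIN_DELAY_NS = {"TC 101": 17.5, "TC 105": 15.0}


def percent_overestimate(simulated_ns, measured_ns):
    """How far the simulated delay exceeds the measured delay, in percent."""
    return 100.0 * (simulated_ns - measured_ns) / measured_ns


overestimate = {
    chip: percent_overestimate(SIMULATED_CHAIN_DELAY_NS, measured)
    for chip, measured in MEASURED_CHAIN_DELAY_NS.items()
}

for chip, pct in overestimate.items():
    print(f"{chip}: simulation is {pct:.1f}% slower than measured")
```

A positive value means the simulation predicts a longer delay than was measured, that
is, the conservative direction described above.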
D. CMOS/SOS STANDARD CELL DESIGN LAYOUT
Elements of the basic standard cell and device design have been discussed in
Section 1. From performance considerations, we have defined basic device (n-type)
sizes to be within the 1/2 X to 1X (0.5 to 1.0 mil) range. By analyzing typical cell
height usage in earlier standard cell systems, and making the appropriate modifications
for the silicon-gate SOS technology, we have defined a new standard cell height for the
SOS family. However, not all cell geometry considerations have been discussed. In
this section, design considerations relating to the other cell geometries and topologies
are discussed.
1. Cell Power and Ground Distribution
Contrasting with the single ground bus of the bulk technology standard cell
systems, both power (+V) and ground potentials are now distributed via 0.4-mil metal
bus lines. The absence of a conducting substrate in SOS necessitates this scheme. The
power line is shared by two rows of cells. The cell rows are connected in a back-to-back
fashion with the power line between them. Ground connections for the cells are
accomplished by passing a ground bus across the lower edge (through the origin) of each
cell. Figure 10 is a Calcomp composite plot of a two-input AND cell. The power and
ground bus lines are broken for clarity prior to passing through the cell.
2. Cell I/O Pads
Also unique to the SOS standard cell system is a nonmetalized cell I/O pad.
In essence, all logic connections to the cells are accomplished via the p-doped
polysilicon tunnels. The tunnels are 0.70 mil wide and spaced at multiples of 1.0 mil.
They pass under the ground bus and extend to 0.70 mil below the cell origin.
Polysilicon was chosen for this connection because:
(1) The majority of a cell's pads are input pads, and therefore inherently
polysilicon anyway.
(2) It provides a low resistance cell connection while still permitting the 0.40-mil
ground bus to pass over all of the I/O leads without interference.
Figure 10 illustrates the two-input AND cell with its three I/O pads. They appear under
the extended ground bus. The two rightmost pads are input connections.
Fig. 10. Two-input AND cell composite.
An additional feature of this all-polysilicon-pad approach is that all metal
connections to a cell must be supplied with a metal-polysilicon contact arrangement.
Such an arrangement is automatically provided by the software, in the form of a
"tunnel-end" cell, whenever the metal-polysilicon interconnect is required. This
feature permits the routing software to make efficient use of the first (closest to the
cell pads) wiring channel by running a metal interconnect over all intermediate pads.
(In the bulk standard cell system, the use of this first channel was blocked because the
metal-tunnel interconnect was an inherent part of all cell I/O pads.)
3. Special Layout-Design Considerations for Standard Cells
After design and before use, each standard cell must be reviewed not only for
layout rule violations internal to the cell boundaries, but also for all possible rule
violations that may occur when any of the cells are placed next to each other. Checking
every pairing directly would be exhaustive and impractical. A simple way to meet the
requirement is to define a standard interface for the top, bottom, and both sides of
each cell that would "spell out" the proximity relationships between all edges of the
cell and all six photomask levels. With such an interfacing scheme, all possible
intercell rule violations are automatically avoided. A set of rules to govern cell
layout-design at the cell edges is suggested as follows:
Rule 1. Standard cell height is 7.0 mils and is measured from
the center of the ground bus to the center of the VDD bus.
Rule 2. The ground bus and VDD bus are each 0.4 mil wide and
are centered on the bottom and top lines of the cell,
respectively (the bottom of the cell being defined as
that portion closest to the cell I/O polysilicon pads).
The ground and VDD bus lines are not part of the
standard cell; both bus lines will be automatically
placed when the intracell wiring is done.
Rule 3. Cell width is an integer multiple of 1.0 mil measured
between the origin and antiorigin of the cell.
Rule 4. All polysilicon or metal must be at least 0.2 mil from
the left edge (defined as a vertical line drawn from the
origin) of the cell and 0.1 mil from the right edge
(defined as a vertical line drawn from the antiorigin)
of the cell.
Rule 5. All N and P epitaxial silicon must be at least 0.25 mil
from the left edge of the cell and 0.15 mil from the
right edge of the cell.
Rule 6. The N+ diffusion mask must be at least 0.05 mil from
the left edge of the cell and can overlap the right edge
of the cell by no more than 0.05 mil.
Rule 7. The cell I/O pads are treated as tunnel (polysilicon) ends
and are inserted into the cell design as stubs of polysilicon
extending 0.7 mil below the origin. The I/O pads are placed
on 1.0-mil centers and must be at least 0.7 mil wide from
the y = -0.7 point to the y = -0.4 point. At and above the
y = -0.4 point, the polysilicon stub can be made any width
consistent with the SOS design rules.
Rule 8. The N- epitaxial silicon (level 2) may run up to the center
of the VDD bus if it is to be electrically connected to
the VDD; otherwise, it must remain 0.30 mil below
the center line.
Rule 9. The P- epitaxial silicon (level 1) may run to within
0.40 mil of the VDD bus center line. (This is either
y = 4.60 mils or y = 6.60 mils, depending on cell
family.)
Rule 10. Polysilicon (level 3) may run to within 0.20 mil
of the VDD center line. (This is either y = 4.80
mils or y = 6.80 mils, depending on cell family.)
Rule 11. The N+ diffusion mask may run to within 0.20 mil
of the VDD bus center line. (This is either y =
4.80 mils or y = 6.80 mils, depending on cell
family.)
Figures 11 and 12 illustrate layout rules 1-7 and 8-11, respectively.
Figure 13 is a topological plot of the two-input AND cell with levels 1 through 6
plotted separately. (Level 7 is used only to open the protective oxide at the chip
I/O bonding pads.)
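
The y limits quoted in Rules 8 through 11 follow from subtracting each clearance from
the height of the VDD bus center line above the cell origin. The sketch below
reproduces them. It assumes that the second cell family implied by the 4.60-mil and
4.80-mil figures has a 5.0-mil cell height, which is an inference from those numbers
rather than a statement in the rules; the layer names are labels introduced here.

```python
# Required clearance below the VDD bus center line, in mils (Rules 8-11).
CLEARANCE_BELOW_VDD_MILS = {
    "n_epi_not_tied_to_vdd": 0.30,  # Rule 8, level 2
    "p_epi": 0.40,                  # Rule 9, level 1
    "polysilicon": 0.20,            # Rule 10, level 3
    "n_plus_diffusion_mask": 0.20,  # Rule 11
}


def max_y_mils(cell_height_mils, layer):
    """Highest y a layer may reach, measured from the cell origin.

    By Rule 1 the VDD bus center line sits one cell height above the
    ground bus center line, which passes through the origin.
    """
    return round(cell_height_mils - CLEARANCE_BELOW_VDD_MILS[layer], 2)


# 7.0 mils is the stated SOS cell height; 5.0 mils is an inferred second family.
for height in (5.0, 7.0):
    for layer in CLEARANCE_BELOW_VDD_MILS:
        print(f"height {height} mils, {layer}: y <= {max_y_mils(height, layer)} mils")
```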
E. CMOS/SOS STANDARD CELL ARRAY TOPOLOGY
SOS standard cell array layouts follow the basic scheme illustrated by the metal
mask of Fig. 14. The array topology falls into four distinct areas: I/O bonding pads,
power/ground buses, cell interconnections, and logic cells.
1. I/O Bonding Pad Area
The I/O bonding pad area runs along the periphery of each chip. Although
primarily intended for pad placement, efficient area utilization is guaranteed by also
using this area for gate oxide protection devices, alignment keys, test transistors
and special off-chip buffer circuitry. The protective devices are automatically placed
whenever a chip I/O bonding pad is called for. When going off chip, the designer has
the choice of using either a buffered or an unbuffered output pad. In short, logic
designers and system partitioners need choose only the nature of the on/off-chip
transition -- a buffered output, an unbuffered output or an input -- the rest is automatic.
Gate oxide protection against electrostatic discharge is provided with a stack
(series connection) of eight diodes. One such stack is connected between each I/O pad
and one of the buses. Alternate, heavily doped, n+ and p+ diffusions form four
forward-biased and four back-biased diodes. With 6-volt breakdown for each of the four
zeners and 0.7 volt for each of the four forward diodes, the stack has approximately a
27-volt breakdown in either direction. Figure 15 includes a schematic, cross-section and
topological representation of this device. The 10 mils² needed for its implementation
fits well within the interpad space and, for this reason, is included as an implicit part
of all I/O pad designs.
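
The quoted stack breakdown is simply the sum of the series diode drops. A minimal
sketch, using only the per-diode figures given above (the constant names are
introduced here):

```python
ZENER_BREAKDOWN_V = 6.0   # each back-biased diode
FORWARD_DROP_V = 0.7      # each forward-biased diode
ZENERS_IN_SERIES = 4
FORWARD_DIODES_IN_SERIES = 4

# In either polarity, four junctions conduct in breakdown and four conduct forward.
stack_breakdown_v = (ZENERS_IN_SERIES * ZENER_BREAKDOWN_V
                     + FORWARD_DIODES_IN_SERIES * FORWARD_DROP_V)

print(f"Stack breakdown: {stack_breakdown_v:.1f} V")
```

The result, 26.8 V, rounds to the approximately 27 volts stated in the text.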
Fig. 11. Layout rules 1-7. (Cell width is N x 1.0 mils; in the illustrated case N = 5,
which could represent a pad-limited four-input NAND.)
NOTES
(1) Three I/O pads are shown in three possible pad configurations. No significance
is attached to the configuration with respect to pad location.
(2) All dimensions are in mils.
Fig. 12. Layout rules 8-11. (Clearances below the VDD bus: P- epi, level 1, 0.40 mil;
N- epi, level 2, 0.30 mil; polysilicon, level 3, 0.20 mil; N+ diffusion, level 4,
0.20 mil. Cell height 7 mils.)
Diode leakage averages less than 1 nA per input or output pad at 10-volt bias.
Since there have been no known gate failures on protected gates, special handling
is not believed necessary. Nevertheless, it is still recommended that care be exercised
in handling the arrays.
The optional off-chip buffer circuitry located along the chip's periphery offers
the designer the capability of "going off-chip" with drivers appropriately scaled for
larger capacitances. They can drive 30-pF external loads with rise or fall times of
18 ns (nominal). Any standard cell may be used to drive the off-chip buffers. However,
using on-chip buffers as drivers generally improves dynamic performance. Peripheral
placement of the off-chip buffers is utilized because:
(1) Large drive capability is rarely needed for driving the small
on-chip loads.
(2) Such placement avoids the large RC delays associated with the
use of resistive tunnels between drivers and the off-chip load.
Fig. 13. Composite and levels of two-input AND (sheets 1 and 2). (Panels: composite,
P-epi island, N-epi island, polysilicon, N-diffusion definition, metalization, contact
openings.)
Fig. 14. Typical metalization level for SOS standard cell arrays. (Labeled areas: logic
cells, ground pad and ground bus, power bus.)
Fig. 15. Structure of gate protection device. (Panels: topological view, cross section,
schematic representation.)
2. Power and Ground Distribution 
The power and ground distribution buses are clearly identifiable as the two
wide lines running along the periphery of each chip (Fig. 14). By passing these
buses through each I/O pad cell, the ground and +V potentials are available for both
output buffer circuitry and oxide protection devices. However, this scheme results
in the necessity to pass all on-chip and off-chip signals under the two bus metalizations
with either an epitaxial or polysilicon tunnel. (The added connection resistance associated
with this is negligible -- about 200 ohms.) The modified interdigitated bus layout places
both +V and ground buses in convenient locations for subsequent logic cell row connections.
F. CMOS/SOS STANDARD CELL LIBRARY 
1. General 
The CMOS/SOS standard cell library is an open-ended collection of logic circuits
designed to be fabricated with either the double-epitaxial pilot line process or the
single-epitaxial SOS process. All circuits have gate lengths of 0.25 mil for optimized
performance.
All standard cells have been defined, designed, topologically configured in
accordance with the standard set of SOS design and process constraints, analyzed, and
then permanently stored for future use on magnetic tape. The present cell library was
designed to meet current and anticipated LSI implementation needs of the NASA
SUMC-CVT computer system. However, because it is an open-ended system, the user can
define and design new cells to meet unique system requirements. A list of the present
CMOS/SOS standard cell family is contained in Table 5.
To enable the addition of new cells and facilitate maximum use as a design
tool, the data sheets listing the properties and performance of the cell family and the
necessary supporting instructions are described in a separate document -- the CMOS/
SOS Standard Cell Notebook. The notebook contents and its use are briefly described
in the following paragraphs.
2. Standard Cell Notebook
The CMOS/SOS Standard Cell Notebook contains the following information:
(1) Data sheet for each of the 21 cells that constitute the
present family.
(2) Functional description of each of the cells in the library.
Each data sheet contains the following information:
Cell Name
Cell Number
Cell Width
Schematic Diagram
Logic Symbol
Truth Table
Pertinent Cell Node Capacitance
Performance Data (Delay and Transition Times vs Load Capacitance)
The propagation delays and transition times, as given on the data sheets,
were originally generated using the RCA CMOS/SOS circuit simulation program. The
device, circuit, and process parameters used in the simulation were based heavily on
the parameters determined from measurements on SOS standard cell test chips.
The dynamic data format for each cell depends upon the function of the cell.
Generally performance information is presented as a straight-line graph with load
capacitance and performance scales plotted on the X-axis and Y-axis, respectively.
The points located on each graph define the nominal performance of each cell as a
function of loading. Therefore, the propagation delay curves define estimated delays
that are expected to occur for each cell.
TABLE 5. DOCUMENTED CMOS/SOS STANDARD CELL FAMILY
Cell Number Cell Function
1120 Two Input NOR
1130 Three Input NOR
1140 Four Input NOR
1220 Two Input NAND
1230 Three Input NAND
1240 Four Input NAND
1310 Inverter
1340 2 x 1 Multiplexer (Single Clock)
1370 Transmission Gate
1510 Non-Inverting Buffer
1520 Buffer Inverter
1620 Two Input AND
1630 Three Input AND
1640 Four Input AND
1720 Two Input OR
1730 Three Input OR
1740 Four Input OR
2310 Exclusive 'OR'
2820 D Type M/S FF
8060 Off Chip Inverting Buffer Pad
8070 Off Chip Inverting Buffer Pad
Deviations between a given cell's performance and that anticipated by its data
sheet may be attributable to the normal variations in the SOS process. (For example,
normal variations occur in mask alignments, diffusion depths, gate oxide thickness
and doping levels.) In addition, there are a host of second-order effects that are inde-
pendent of processing. These include a dependence on the rise (or fall) time of the
input signal and the particular input (on a multiple-input gate) to which a signal is
applied.
Estimated delays based on these data sheets will generally be within 10% of the
average measured delays, and therefore should not be considered worst case numbers.
All dynamic propagation information is based on an assumed +10.0-V supply
voltage, an ambient temperature of 25°C, and a 10-ns transition time for the driving
signal. Delays are measured between the 50% points of the input and output signals.
The processing parameters assumed in the performance analysis are those of the SOS
double-epitaxial process.
The primary purpose of the standard cell data sheets and the associated
supplementary discussion in the notebook is to provide the logic and system designer
with sufficient information about each of the standard cells so that he can optimize
his selection of the available standard circuits. This should enable the designer to
avoid race conditions, optimize critical path delays, avoid excessive loading conditions,
avoid improper cell usage, and estimate circuit speed. This forms the basis for design
comparison and evaluation before the arrays are processed.
G. CMOS/SOS LSI CHIP MEASUREMENT AND ANALYSIS
1. General
This section covers the chip measurements and corresponding analyses of
device characteristics and dynamic performance tests made on several CMOS/SOS
standard cell array types. The CMOS/SOS test chip (Fig. 16) served as the principal
vehicle for examining and evaluating the individual 7.0-mil SOS standard cell circuits.
The test chip provided I/O pad connections to several test transistors and a large number
of logic chains as well as many special purpose evaluation circuits. From these test
circuits, it was possible to obtain:
(1) Accurate information about the effect of transistor geometries
on device drive capabilities
(2) Empirical data concerning the absolute and relative performance
characteristics of each of the standard cell types
(3) Additional insight into optimizing standard cell array design for
improved circuit and system performance.
Fig. 16. CMOS/SOS standard cell test chip.
The electrical performance characteristics of five other SOS double-epitaxial
array types, that were designed and fabricated for the SUMC-CVT project, were also
examined. These arrays are designated as follows:
ATL-026A Floating Point Multiplexer
ATL-027 Up/Down Counter
ATL-030 8-Bit Adder with Carry Prediction
ATL-031 9-Bit d x 2 Multiplexer
ATL-032 Adder-Multiplexer Control
From these measurements, the empirical data needed to confirm earlier estimates
of the SOS standard cell performance in an LSI environment were extracted. The logic
paths chosen for the delay measurements were those identified as either critical paths
to the SUMC-CVT operation or extended paths representative of on-chip performance.
2. Test Chip Measurements
The specific objective of the test chip was to provide a direct means for
measuring and evaluating the 7.0-mil CMOS/SOS standard cell circuits. Updated
device models and enhanced simulation techniques are two other goals of this approach.
A brief summary of the tests included on the chip and the data collected from them is
presented in the following paragraphs.
a. Test Transistors and Transistor Characteristics
Three pairs of n- and p-type transistors with channel lengths of 0.25 mil,
0.30 mil, and 0.35 mil serve as the means of collecting the test transistor characteristics.
Figure 17 shows typical drain characteristics for a 0.25-mil-channel-length transistor
pair. Average drain currents, taken from several wafer lots, for each of the 0.25-mil
and 0.30-mil channel length pairs, are listed in Table 6. The average 0.25-mil p-device
current, IDP, of 3.2 mA, represents a current of 1.6 mA/mil. Similarly, the average
0.25-mil n-device current, IDN, of 4.2 mA represents a current of 2.1 mA/mil. Both
the n- and p-device currents approach, very closely, the bulk silicon values. These
currents, measured as indicated on Table 6, serve as a useful figure-of-merit for
estimating circuit performance.
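
The per-mil figures quoted above imply the channel width of the test transistors. The
sketch below back-calculates that width from the two stated pairs of numbers. The
report does not state the width directly, so the 2.0-mil result is an inference, and
the variable names are introduced here.

```python
# Average drain currents (mA) and the corresponding per-mil figures (mA/mil)
# quoted in the text for the 0.25-mil-channel-length devices.
measured_current_ma = {"p": 3.2, "n": 4.2}
current_per_mil_ma = {"p": 1.6, "n": 2.1}

# Width implied by each pair: total current divided by current per mil of width.
implied_width_mils = {
    device: measured_current_ma[device] / current_per_mil_ma[device]
    for device in measured_current_ma
}

# Ratio of n-device to p-device drive at equal width.
n_to_p_drive_ratio = current_per_mil_ma["n"] / current_per_mil_ma["p"]

for device, width in implied_width_mils.items():
    print(f"{device}-device implied channel width: {width:.2f} mils")
print(f"n/p drive ratio: {n_to_p_drive_ratio:.2f}")
```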
TABLE 6. MEASURED TEST TRANSISTOR CHARACTERISTICS

                               Channel Length
Transistor Characteristic      (mils)            n-Devices    p-Devices
Drain Current (Average)        0.25              4.2 mA       3.2 mA
Vg = VD = 10 V                 0.30              3.7 mA       2.2 mA
Threshold Voltage (Average)    0.25              1.5 V        -1.5 V
VTH = Vg @ ID = 10 µA          0.30              1.5 V        -1.5 V
Fig. 17. Typical drain characteristics. (Panels (a) and (b), one per device type.)
Table 6 also shows measured threshold voltages for both the 0.25-mil
and the 0.30-mil channel length test transistors. The threshold voltages for the test
chip can be considered typical, perhaps a little higher than "average".
b. Polysilicon Interconnects
Two 100-mil-long polysilicon strips, one covered with phosphorus (N)
doped glass and the other coated with boron (P) doped glass, are provided as a means
of measuring the absolute and relative resistivities of the polysilicon interconnects.
Each strip is 250 squares from end to end and therefore permits an accurate resistance
determination. Test chip measurements indicate average values of 52 ohms/sq and 200
ohms/sq for the P-doped and N-doped polysilicon, respectively. A consequence of this
result is that whenever polysilicon is used as an interconnect medium, only P-doping will
be permitted.
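
The strip geometry and sheet resistances above translate into end-to-end resistances
as follows. This is a minimal sketch using the standard relation that resistance equals
sheet resistance times the number of squares; the end-to-end values it prints are
derived here and are not quoted in the report.

```python
STRIP_LENGTH_MILS = 100.0
STRIP_SQUARES = 250

# Measured average sheet resistances, in ohms per square.
SHEET_RESISTANCE_OHMS_PER_SQ = {"P-doped": 52.0, "N-doped": 200.0}

# A strip of 250 squares over 100 mils implies this line width.
strip_width_mils = STRIP_LENGTH_MILS / STRIP_SQUARES

# End-to-end resistance of each test strip.
strip_resistance_ohms = {
    doping: rs * STRIP_SQUARES
    for doping, rs in SHEET_RESISTANCE_OHMS_PER_SQ.items()
}

print(f"Strip width: {strip_width_mils} mil")
for doping, ohms in strip_resistance_ohms.items():
    print(f"{doping} strip: {ohms / 1e3:.1f} kilohms end to end")
```

The roughly fourfold difference between the two strips is the basis for permitting only
P-doping on polysilicon interconnects.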
c. Device Fanout
Three separate chains of cascaded two-input NOR cells, with fanouts
of 1.0, 2.0, and 3.6, provide the means of determining, empirically, the effects of
performance vs loading. However, this experiment was carried out for devices with
0.30-mil channel lengths, not the 0.25-mil channel lengths used in all of the SUMC-CVT
chips. Nevertheless, useful comparative data can still be obtained from this
experiment. Figure 18(a), (b) and (c) illustrate the three strings of cascaded NOR
gates with their associated dummy loads. Figure 19 is a photograph of the input and
output waveforms associated with the fanout = 1 chain. The 24-ns delay is
representative of the eight cascaded stages when 0.30-mil-channel-length devices are used.
For this case, the on-chip stage delay is 24 ns ÷ 8 stages = 3 ns/stage. With the aid
of simulation, the effects of the two output buffering stages may be eliminated. When
this is done, the internal two-input NOR stage delays are 2.4 ns, 5.4 ns and 7.9 ns,
for fanouts of 1.0, 2.0, and 3.6, respectively. Fig. 20 is a plot of this data.
d. Channel Length
Two test chains of two-input NOR cells differing only in the designed
channel lengths (0.25 mil and 0.30 mil) permit a simple side-by-side comparison of
performance vs gate length. Figures 18(a) and 18(b) illustrate the two test chains.
The photograph in Fig. 19 shows the superimposed input and output waveforms for the
two test chains. An approximate 40% difference in the 0.25-mil and 0.30-mil channel
length chain delays can be observed. For the shorter channel length circuits, the
15-ns overall chain delay works out to be less than 2 ns/stage. Figure 20 illustrates
the projected stage delay for a family of 0.25-mil circuits as a function of fanout.
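The channel-length comparison reduces to simple arithmetic on the two chain delays read from Fig. 19 (15 ns and 24 ns over eight stages). The sketch below works through it; variable names are illustrative, and taking the 0.30-mil delay as the reference for the percentage is an assumption.

```python
STAGES = 8               # cascaded stages in each test chain
chain_delay_025_ns = 15  # 0.25-mil channel length chain
chain_delay_030_ns = 24  # 0.30-mil channel length chain

# Fractional reduction in delay, relative to the 0.30-mil chain.
reduction = (chain_delay_030_ns - chain_delay_025_ns) / chain_delay_030_ns

# Raw per-stage delays (output buffers still included).
stage_delay_025_ns = chain_delay_025_ns / STAGES
stage_delay_030_ns = chain_delay_030_ns / STAGES

print(f"delay reduction: {reduction:.1%}")
print(f"0.25-mil stage delay: {stage_delay_025_ns:.3f} ns")
print(f"0.30-mil stage delay: {stage_delay_030_ns:.3f} ns")
```

The computed reduction of 37.5% matches the "approximate 40%" quoted, and the 0.25-mil figure falls under 2 ns/stage as stated.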
e. 8-Bit Counter Operation
A three-stage, or ÷8, counter circuit utilizing three 1820 M/S flip-flops
is illustrated in Fig. 21(c). The output of the third 1820 stage is buffered through a
1310 and a 1520 cell before coming off chip. All circuitry for this test employs
0.30-mil gate lengths. Test results, presented in Fig. 21(a) and (b), indicate that at 10
volts the 1820 cell can be toggled at approximately 75 MHz. The input clock pulse
width to the first stage used in this test was 4 ns. A further reduction in the pulse
width was limited by the capabilities of the available test equipment.
[Figure 18 panel labels: fanout = 1.0, L = 0.25 mil; fanout = 1.0, L = 0.30 mil; fanout = 2.0; fanout = 3.6]
Fig. 18. Two-input NOR stage delay test circuits.
[Figure 19 waveform labels: 15 ns for Lch = 0.25 mil (3120 cells); 24 ns for Lch = 0.30 mil (1120 cells)]
Fig. 19. Measured two-input NOR chain delay.
[Figure 20 axes: stage delay vs fanout]
Fig. 20. Measured two-input NOR stage delay vs fanout.
Measurements were also made between the negative edge of the input
clock pulse and a change in state of the output. This measurement represents a delay
through three 1820 slave stages, four 1310 stages, and one 1520 buffering circuit (with
a 20-pF load). This delay measurement averaged about 42 ns or roughly 5 ns/stage.
Figures 21(d) and 21(e) are the scope tracings for these measurements.
Based on the performance improvements recorded for the 0.25-mil two-
input NOR circuits, it is estimated that 100-MHz operation would result if the counter had
been implemented with the smaller 0.25-mil gate lengths.
3. SUMC-CVT Double-Epitaxial, SOS Standard Cell Array Measurements
All SUMC-CVT arrays are designed with 0.25-mil gate lengths for optimized
performance. As part of the eventual screening and sorting procedure that will be used
to separate the fabricated arrays into performance categories based on measured
propagation delay characteristics, all arrays are dynamically tested. In many cases the
measurements are taken at the wafer probe level since many chips are designed for
hybrid mounting rather than standard dual-in-line packages (DIPs) or flat-packs. The
data arrived at in this manner can serve as an excellent source of material for
evaluating the SOS cell family in an actual LSI environment. In addition, this information
can be used to validate and further enhance our techniques for standard cell array
performance prediction using the standard cell notebook.
[Figure 21 labels: CLOCK IN (pin 44); ÷8 OUT (pin 46); panels (a) through (e); 1310 buffer stages shown in panel (c)]
Fig. 21. 8-Bit counter delay measurements.
Consequently, for each of the five SUMC-CVT array types examined on this
program, the logic path over which the data was taken, the total measured delay (aver-
aged over several units), and the calculated delay (based on the standard cell data
sheets) were compared. And finally, an "average delay per stage" was calculated for
each chip type. Generally, the logic paths investigated were those identified as
"critical paths". In other cases longer logic paths were chosen to obtain measurements
less influenced by off-chip loading.
a. Floating Point Multiplexer, ATL-026A (155 x 134 mils, ~163 gates)
This chip is a 2 x 1 shifting multiplexer. In the SUMC-CVT system, it
operates on every fourth bit, either shifting ±4 bit positions or passing the bits straight
through. Provisions are included for mixing two extraneous inputs. The primary
input path is 8 bits wide, while the shifted output is 9 bits wide. The chip is totally
combinatorial.
The logic path chosen for measurement consists of six levels and is shown
in Fig. 22(c). The recorded chain delay ranged from 17 ns to 34 ns. Figures 22(a)
and 22(b) are photographs of the input and output waveforms for the 17-ns measurement.
Averaging this time over the four standard cells yields 4.3 ns/cell. If, however, this
delay were averaged over the actual number of logic levels used to implement the chain,
the average delay per logic level would be closer to 2.9 ns.
Calculating the total chain delay with the standard cell data sheets, how-
ever, gives a predicted delay of 27 ns; and indeed when a larger number of ATL-026A
arrays were examined, the average delay for the path did work out to be 23 ns. This is
within 20% of the value obtained from the data sheets.
b. Up/Down Counter Chip, ATL-027 (199 x 199 mils, ~300 gates)
This array consists essentially of a 12-bit up/down counter divided into
separate 8-bit and 4-bit sections. Each has separate controls, but only the 8-bit
section has carry and borrow outputs for expansion. Each stage has a two-input
multiplexer for pre-setting the count value. Counting is done in a ripple carry/borrow full
adder/subtracter. The counter outputs are tri-state, buffered elements.
Figure 23 illustrates the logic path used in chip operation. It contains
six cascaded standard cells or eight levels of logic. The "clock" is used to transfer
data from the 2820 storage elements to the output which, as shown, is loaded with
10 pF. For these measurements, the external chip controls are set to toggle the
flip-flop with every clock pulse. Delays are measured from the 50% point of the
negative-going clock to the 50% point of the output. Delay results for several packaged units
ranged from 42 ns to 59 ns. The average delay was 52 ns. Calculations based on the
standard cell data sheets predict 46 ns for the same path (a correlation to within
11%). On a per stage basis, the measured average is (52 ns ÷ 9 stages) = 5.8 ns.
[Figure 22: (a), (b) input and output waveforms for the 17-ns measurement; (c) measured logic path through four standard cells, annotated with on-chip loads in pF and calculated cell delays of 5.14 ns, 4.9 ns, 8.54 ns and 8.81 ns]
Fig. 22. Measured delay for floating point multiplexer.
[Figure 23: logic path from the input clock (pin 61) through cells 1620, 1120, 1520, 2820, 1520 and 9020 to the output (pin 54), which drives a 10.0-pF off-chip load; total calculated delay 46 ns. Key: ( ) = estimated on-chip load in pF; [ ] = device delay from standard cell notebook; boxed "X" = "X" logic levels within device]
Fig. 23. Calculated delays for up/down counter.
c. 8-Bit Adder, ATL-030 (229 x 229 mils, ~450 gates)
This array is an 8-bit binary or decimal adder. It is fully expandable and has
carry anticipate logic for fast arithmetic operations, a multiplexed B input, a data
complement and logical capability, and several special condition outputs. The logic path
examined is a portion of the SUMC-CVT 32-bit adder's critical path. It is composed
of 11 cascaded standard cells. Depending upon the method used, this works out to be
11 to 13 levels of logic. Figure 24 illustrates this path. Measurements are made from
the 50% point of the input signal to the 50% point of the output signal. Data were taken at
both the packaged device (64-pin DIP) and wafer probe levels. For a 15-pF output load,
packaged device delays averaged 73 ns. The average measured device delay is 73 ns ÷
12 stages = 6 ns. For an output load of 100 pF, the total delay averages 105 ns. Figure
25 shows typical photos of the input and output waveforms of the 64-pin DIP packaged
unit measurements. In terms of the SUMC-CVT program, the 15-pF load measurements
are more directly applicable than the 100-pF load case since 15 pF is more typical of
the anticipated on-hybrid loading.
From a system point of view, the adder path is extremely critical. With this
in mind, calculations based on the 7.0-mil SOS standard cell data sheets were performed.
From the Calcomp checkplot of the adder array, the additional loading contributed to
the output of each cell in the adder path by the wiring crossovers was precisely accounted
for. Figure 24 shows the crossovers on each output in the critical path. It also includes
the calculated delays on a cell basis. From these calculations, it is possible to discount
the large 9060 delay associated with going off-chip. If this is done, the estimated
on-chip delay is 5.4 ns/logic level. Totaling all calculated delays for the path, we arrive
at a total estimated delay of 75 ns, which closely correlates to the measured average
of 73 ns.
A further investigation was carried out. It centered on the n- and p-type
transistor saturation currents assumed for the standard cell data sheet delay curve
calculations and those of the actual devices on the fabricated arrays. (The actual device
currents were measured on the array's output inverters and were within 10% of those
assumed for the delay curve calculations.)
[Figure 24: critical path from input pad to output pad, annotated with metal/poly crossover counts and per-cell computed delays; the 9060 output cell is marked 7 ns at 15 pF and 30 ns at 100 pF. Average measured delay = 73 ns at 15-pF load; total computed delay = 75 ns at 15-pF load. Key: boxed "Y" = "Y" logic levels within device; X = metal/poly crossovers; ( ) = standard cell data sheet computed delay]
Fig. 24. Critical path of 8-bit multiplexed input adder.
[Figure 25: photographs]
Fig. 25. Typical input-output waveforms of 8-bit adder.
[Figure 26: portion of the array logic with the measured path from input to output highlighted]
Fig. 26. Calculated delays for 9-bit 4 x 2 multiplexer.
d. 9-Bit 4 x 2 Multiplexer, ATL-031 (172 x 175 mils, ~150 gates)
This array is a flexible multiplexer that may be used as either a 9-bit 4 x 2
multiplexer or an 18-bit 2 x 1 multiplexer. Its mode of operation is determined by
four independent control lines. The chip is 100% combinatorial. The longest logic
path on the chip is five cells long, or in terms of logic levels, six levels deep. Figure
26 shows a small portion of the array's logic. The path over which measurements are
taken has been highlighted. Dynamic measurements were made between the 50% levels
of the input and output signals. Measurements made at the wafer probe level produced
a range of delays for the six logic levels from 50 to 80 ns. This included the loading
effects of the test equipment cables, which were measured at more than 100 pF. The
average total delay was approximately 60 ns. From the standard cell data sheets,
the total path delay is calculated to be 50 ns, most of which is due to the output stage.
By separate measurements, the cabling alone introduces 12 ns of delay. Taking this
into account, we have an average measured delay of (60 - 12) ÷ 6 = 8 ns/stage. (This
figure drops considerably when the delay of the output buffer is neglected. Under these
conditions, we have (60 - 12 - 30) ÷ 5 = 3.6 ns/stage.)
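The two per-stage figures quoted for the ATL-031 path follow from subtracting known off-chip contributions before averaging. A minimal sketch of that bookkeeping is given below; the helper name and its parameters are hypothetical, and the 12-ns cable delay and 30-ns output buffer delay are the values used in the text's own arithmetic.

```python
def average_stage_delay_ns(total_ns, stages, discounted_ns=0.0):
    """Average delay per stage after removing delay that is not on-chip logic."""
    return (total_ns - discounted_ns) / stages


MEASURED_TOTAL_NS = 60   # average measured path delay at wafer probe
CABLE_NS = 12            # delay introduced by the test cabling alone
OUTPUT_BUFFER_NS = 30    # output buffer delay discounted in the text

# All six logic levels, cable delay removed.
with_buffer = average_stage_delay_ns(MEASURED_TOTAL_NS, 6, CABLE_NS)

# Five internal levels, cable and output buffer delay removed.
internal_only = average_stage_delay_ns(
    MEASURED_TOTAL_NS, 5, CABLE_NS + OUTPUT_BUFFER_NS
)

print(f"with output buffer: {with_buffer:.1f} ns/stage")
print(f"internal stages only: {internal_only:.1f} ns/stage")
```

The same helper applies to any path for which a total delay, a stage count, and an off-chip contribution are known.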
e. Adder-Multiplexer Control, ATL-032 (154 x 139 mils, ~166 gates)
This array houses the special random control logic which combines the ROM
outputs with data dependent conditions. The array's outputs serve to route the adder
inputs and specify the adder operations appropriate to the instruction being executed.
Figure 27 shows the logic path used to examine the array's performance. It was chosen
primarily because of its length -- 10 standard cells long. In terms of the way the cells
are implemented, this path may be considered to be 13 levels of cascaded logic.
Measurements are made from the 50% point of the input clock signal to the 50% point
of the output signal. The "clock" is used to transfer data from the 2820 storage element
to the output. The output is connected back to another chip input which provides for
flip-flop toggle action on every "clock" pulse. Measured delay data, on packaged units,
averages 67 ns for the path when the off-chip loading is 18 pF. It averages 76 ns for
the path when an external load of 40 pF is used. For the 18-pF load, the average measured
device delay is 67 ÷ 13 = 5.2 ns.
Calculations for the same path were made using the standard cell data sheets
and a Calcomp checkplot of the array's topology. The latter provides a precise count
of metal/polysilicon crossovers at each cell's output. The crossovers are considered
because their loading effect is not negligible. Figure 27 shows the calculated delays
for each individual cell in the measured path. From these calculations it is possible to
discount the large delay associated with going off-chip through the 9060 element. When
this is done, the average measured on-chip stage delay is (67 - 9) ÷ 12 = 4.8 ns. Totaling
all the calculated delays for the path, we come up with 65.5 ns for the case of an 18-pF
external load and 70.5 ns for the case of a 40-pF external load. For both calculations
the predicted values are within 10% of the average measured values. Figure 28 shows
photographs of the input and output waveforms for loads of 18 pF and 40 pF.
Table 7 is a compilation of the measured and calculated on-chip/off-chip delays
for five SUMC-CVT SOS arrays. The delays are averaged over all logic levels in the
path, and in some cases averaged over only those cells internal to the array. The
latter calculation is done by discounting the large off-chip buffering delays. Very good
correlation is noted between the averaged measured and calculated delays.
[Figure 27: measured path from the input clock to the output through the 9060 output element (9 ns at 18 pF, 14 ns at 40 pF), with per-cell computed delays and crossover counts. CL1 = 18 pF (test fixture and probes); CL2 = 40 pF (22 pF + CL1). Key: marked cells are equivalent to two logic levels; ( ) = computed delay in ns; X = metal/poly crossovers]
Fig. 27. SUMC-CVT CMOS/SOS adder control chip measurement path.
[Figure 28 panels: (a) C off-chip = 18 pF; (b) C off-chip = 40 pF]
Fig. 28. Typical input-output waveforms of adder control.
TABLE 7. SUMC-CVT CMOS/SOS LSI ARRAY PERFORMANCE MEASUREMENTS*

                               Total Delay (ns)        Measured      Estimated
                               Measured                Average Stage On-Chip Stage Off-Chip
Array Name                     (Average)   Calculated  Delay (ns)    Delay (ns)    Load (pF)
Floating Point                 23          27          4.3           ---           12
  Multiplexer (ATL-026A)
Up/Down Counter                52          46          5.8           ---           10
  (ATL-027)
8-Bit Adder                    73          75          6.0           5.4           15
  (ATL-030)                    105         103         ---           ---           100
9-Bit 4 x 2 Multiplexer        60          50          8.0           3.6           100
  (ATL-031)
Adder-Multiplexer              67          66          5.2           4.8           18
  Controls (ATL-032)           76          71          ---           ---           40

* All delay measurements at 10 V.
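To make the degree of correlation in Table 7 concrete, the sketch below computes the percentage difference between calculated and measured total delay for each row. The data are transcribed from the table; expressing the difference relative to the measured value is an assumption, and the names are illustrative.

```python
# (array, off-chip load in pF, measured total delay in ns, calculated total delay in ns)
TABLE_7 = [
    ("ATL-026A", 12, 23, 27),
    ("ATL-027", 10, 52, 46),
    ("ATL-030", 15, 73, 75),
    ("ATL-030", 100, 105, 103),
    ("ATL-031", 100, 60, 50),
    ("ATL-032", 18, 67, 66),
    ("ATL-032", 40, 76, 71),
]


def percent_difference(measured_ns, calculated_ns):
    """Signed difference of calculated from measured, as a percentage of measured."""
    return 100.0 * (calculated_ns - measured_ns) / measured_ns


differences = {
    (name, load): percent_difference(measured, calculated)
    for name, load, measured, calculated in TABLE_7
}
worst_case = max(abs(d) for d in differences.values())

for (name, load), diff in differences.items():
    print(f"{name} at {load} pF: {diff:+.1f}%")
print(f"largest absolute difference: {worst_case:.1f}%")
```

On this basis every path falls within 20% of its calculated delay, and the adder and adder-control paths fall within 10%.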
Extensive leakage data has not been collected to date since many of the arrays
examined were not 100% functional. Arrays in this category may well have artificially
large "leakage currents" caused by internal floating gates, etc. Measurements to
date have verified this fact on non-100%-functional units since many leakage currents
varied over a couple of orders of magnitude.
11, CONCLUSIONS AND RESULTS
A low cost, quick turnaround technique for generating custom CMOS/SOS LSI arrays
using the standard cell approach was developed, implemented, tested and validated.
This was, in essence, the objective of this program. To achieve this result, a series
of intermediate objectives and goals had to be, and were, accomplished. These
accomplishments and results include the following:
(1) The Automatic Placement and Routing Computer program was modified and
enhanced to ensure compatibility with, as well as to optimize the performance of, the
self-aligned silicon-gate CMOS/SOS technology.
(2) A basic cell design topology and guidelines were defined based on an extensive
analysis that included circuit, layout, process, array topology and required performance
considerations -- particularly high circuit speed. A standard cell height of 7 mils and
a minimum pad spacing of 1 mil were established. In addition to meeting the principal
design consideration of speed, the cell area of CMOS/SOS was dramatically reduced
compared to that for CMOS bulk standard cell design. For example, a two-input NOR
requires 79.8 square mils in the metal-gate standard cell family and only 21 square mils
in the CMOS/SOS standard cell family with virtually the same size devices -- a reduction
of almost four to one.
(3) A family of 11 self-aligned, silicon-gate CMOS/SOS standard cell circuits was
developed. For each cell type this included the circuit design, topological layout, per-
formance validation through circuit simulation, electrical characterization and documen-
tation in the form of user-oriented data sheets. In addition, the performance of virtually
all of the cells was experimentally verified either as a result of measurements taken on
the CMOS/SOS test chip (NAS12-2233) or on five LSI arrays implemented for the SUMC-
CVT computer system (Contract NAS8-29072).
(4) The silicon-gate CMOS/SOS test chip was designed not only to provide experimental
validation that the standard cells functioned properly but also, more critically, to
determine that their dynamic performance correlated with the predicted delays based on
computer simulation. This performance validation was verified. For example, the average
stage delay for a two-input NOR circuit in a serial logic chain containing eight levels of
logic was less than 1.6 ns for the devices with 0.25-mil channel lengths and approximately
2.4 ns for devices with a 0.3-mil channel length. In addition, the test chip provided
device and characterization data which was used to update the values of the device model
parameters used in the computer simulation program. By such means the accuracy of the
speed predictions based on computer simulation techniques is increased.
(5) Since some of the cells developed had not been designed when the test chip was
laid out and fabricated, they do not appear on the test chip. These cells were
experimentally validated by the measured data taken on five CMOS/SOS custom LSI arrays
designed for the SUMC-CVT computer system. These custom LSI arrays varied in
complexity from a 150-gate 4 x 2 multiplexer array to a 450-gate, multiplexed, 8-bit
adder array.
(6) The correlation between the average measured delay and the delay predicted by
the computer simulation program was excellent -- well within design tolerances. For
example, the difference between the average measured delay and computer-predicted
delay for specially selected logic paths on each of the five chips can be seen in the
following table:
                               Computer     Average      Measured
Custom Standard Cell           Predicted    Measured     Average Stage
CMOS/SOS Array Types           Delay (ns)   Delay (ns)   Delay (ns)
Floating Point Multiplexer     27           23           4.3
Up/Down Counter                46           52           5.8
8-Bit Adder with Carry         75           73           6.0
  Prediction                   103          105
9-Bit 4 x 2 Multiplexer        50           60           8.0
Adder-Multiplexer Control      66           67           5.2
                               71           76
As seen in the table, there is generally good correlation between the predicted and
measured results. Differences fall within design tolerance. The major significance of
the correlation between predicted and measured results is that the circuit speeds achieve
the dynamic performance objectives for which they were designed -- the NASA SUMC-CVT
computer system program.
APPENDIX
SILICON GATE BEAM LEAD TECHNOLOGY
Two chip types, a four-bit adder and a twelve-bit adder were designed as test
vehicles to evaluate the SG-BL technology. In addition, a twenty-bit adder hybrid
was also designed using five of the four-bit adder chips. Each of the chip designs
used standard cells developed under this program. The PR2D placement and routing
program, which was successfully used to automatically interconnect MG CMOS
standard cells for LSI, was modified to conform to the topological restraints imposed
by the SG-BL standard cell technology. Both of the SG-BL test chips made extensive
use of the modified PR2D program to place the standard cells and to interconnect
them in the required logic pattern. Minimum manual modification was required,
and this was constrained primarily to the special test circuitry that was included on
each chip. The output of the PR2D program was used to generate drive tapes from
which the 12-level Gerber artwork was created.
One of the test vehicles used to evaluate the SG-BL process was ATL-049. The
ATL-049 consists of a four-bit adder and special test circuitry.
The four-bit adder portion of the chip (shown in Fig. A-1) is a duplicate of the
circuitry contained on a MG CMOS chip (ATL-004A). By duplicating this circuitry
on the ATL-049, a one-to-one speed comparison between the two technologies could
be made. Extensive data taken from the ATL-004A chip was available to make this
comparison.
In addition to the four-bit adder, special test circuitry was included on the ATL-049
to further evaluate the SG-BL process and standard cell development. The test cir-
cuitry on the ATL-049 included: 1) an inverter with uncommitted sources, 2) two
six-stage logic chains (one with .25 mil gate lengths and one with .2 mil gate lengths)
consisting of five two-input NOR gates and an inverter, 3) a six-stage logic chain
consisting of five EOA gates and an inverter and 4) a six-stage logic chain with
intermediate logic outputs bonded out so that a pair delay measurement could be made.
Figure A-2 details the test circuitry contained on the ATL-049.
The test vehicle philosophy that was adopted for this program centered on developing
confidence in the standard cell designs and the silicon-gate beam-leaded process by
initially concentrating on the ATL-049 four-bit adder. In compliance with this
philosophy, the ATL-049 logic was implemented using the modified PR2D program.
Working process masks were created so that the test samples of the ATL-049 could be
fabricated and evaluated.
[Figure A-1: logic diagram]
Fig. A-1. Arithmetic and logic unit on SG-BL ATL-049 test chip.
[Figure A-2: logic diagram]
Fig. A-2. Test logic on SG-BL ATL-049 test chip.
Several batches of ATL-049 were processed and evaluated. Initial testing revealed
high chip leakage, low operating voltages and excessive non-functional operation. In
response to this data, modifications were made to both the circuit design rules and to
the processing procedure. Incorporation of these modifications resulted in a batch
of ATL-049 chips from which characteristic circuit data could be taken on the special
test circuitry. The data taken from two of the special test chains and on the inverter
with uncommitted sources is presented in Figs. A-3 through A-6.
Figures A-3 and A-4 present the data taken on PMOS and NMOS test transistors.
Each transistor had a gate width of 2.0 mils. The drain-to-source breakdown voltage
of the NMOS devices averaged 17.3 volts with no unit breaking down below 16.5 volts.
Average breakdown of the PMOS device exceeded 18 volts. Both test devices had
mask gate lengths of 0.25 mil.
Figure A-4 shows the characteristic transistor curves for a PMOS and NMOS test
transistor. The average drain current measured for twelve NMOS devices and five
PMOS devices was 3.9 mA (1.95 mA/mil gate W) and 1.95 mA (0.98 mA/mil gate W),
respectively. These values reflect a biasing condition of VGS = VDS = 10 volts.
These currents fall within the range of those attainable with the aluminum gate bulk
CMOS technology and thus appeared substandard to what should be expected from the
silicon gate beam leaded technology.
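The per-mil figures in parentheses are the measured drain currents normalized to the 2.0-mil gate width. A minimal sketch of that normalization follows; the names are illustrative and not taken from the report.

```python
GATE_WIDTH_MILS = 2.0  # gate width of both test transistors

# Average drain currents at VGS = VDS = 10 V, in mA.
nmos_current_ma = 3.9
pmos_current_ma = 1.95

nmos_ma_per_mil = nmos_current_ma / GATE_WIDTH_MILS
pmos_ma_per_mil = pmos_current_ma / GATE_WIDTH_MILS
nmos_to_pmos_ratio = nmos_current_ma / pmos_current_ma

print(f"NMOS: {nmos_ma_per_mil:.3f} mA/mil")
print(f"PMOS: {pmos_ma_per_mil:.3f} mA/mil")
print(f"NMOS/PMOS current ratio: {nmos_to_pmos_ratio:.2f}")
```

The PMOS value of 0.975 mA/mil is what the text rounds to 0.98 mA/mil.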
Number      Average ID (mA)       Average Breakdown Voltage (V)
of Chips    NMOS       PMOS       NMOS       PMOS
12          3.9        1.95       --         --
5           --         --         17.3       >18

Note: Measurements made at VGS = VDS and gate width = 2.0 mils.
[Characteristic curves: NMOS and PMOS]
Fig. A-3. PMOS and NMOS drain currents and breakdown voltages.
The logic chain shown in Fig. A-5 consists of five two-input NOR gates plus an
inverter. All gates have a mask gate length of 0.25 mil and each of the NOR gates
drives one load. Propagation delay measurements were made on this chain for
four sets of input conditions. With an input waveform having 15-ns 10 percent to
90 percent edge times, delay times were measured with supply voltages of 10, 12,
and 15 V. Also, using an input waveform having 90-ns, 10 percent to 90 percent
edge times, the propagation delay was measured with a supply voltage of 10 V. Delay
results for these four tests are presented in Fig. A-5.
The results shown in Fig. A-5 indicate that the average stage delay, using a 10-volt
supply voltage and 15-ns, 10 percent to 90 percent input pulses, was about 10-12
ns per stage. Increasing the supply voltage to 12 V increased the speed by about 15
percent. The delay measurements at a supply voltage of 15 V averaged 9-10 ns per
stage. This was slower than expected and was attributed to increased leakage at a
15-volt supply voltage.
To isolate the sources of delay in this six-stage chain, a computer simulation of
the entire logic path was made. Input parameters to the simulation program (such as
effective gate length, doping concentrations, mobilities, etc.) were based on information
taken from measurements on the test chip, as well as on additional data supplied by
SSTC, Somerville. Simulated transistor currents were calibrated to be exactly the same
as those measured on the actual chip. Included in the analysis was the resistance
associated with the polysilicon interconnect and gates as well as the input protective
resistor. The results of the simulation run indicated that of the 11-ns delay associated
with each stage, 2-3 ns was attributable to the interconnection resistance-capacitance
between stages. An 85 percent increase in transistor drain current would be required
to reduce the average stage delay to 8 ns (including RC delay).

            Average Delay** (ns); two values per condition, one for each input/output transition
Number      VDD - VSS = 10 V    VDD - VSS = 10 V    VDD - VSS = 12 V    VDD - VSS = 15 V
of Chips    edge time = 15 ns   edge time = 90 ns   edge time = 15 ns   edge time = 15 ns
10          70     84           --     --           --     --           --     --
4           --     --           88     76           --     --           --     --
6           --     --           --     --           69     57           56     58

Data conditions: input edge times are 10% - 90%.
NOTES:  * Five 1120s (2-input NOR) plus 9050 (inverter).
       ** Gate length = 0.25 mil (mask dimension).
Fig. A-5. Average propagation delays of a six-stage logic chain*.
The logic chain shown in Fig. A-6 was used to measure the pair delay of the 1120
two-input NOR gate. Each of the first four stages of this logic chain drives two loads.
Pair delay was measured between the second and fourth stage through inverting output
buffers. Since each output buffer introduced identical delays, they cancelled one another
and provided an accurate measure of the pair delay for the 1120 NOR gate. Data from
Fig. A-6 indicate the pair delay to be about 25 ns or 12.5 ns per stage. This value of
the stage delay agreed favorably with the 11 ns predicted by computer simulation when
the additional loading introduced by the output buffer was taken into account.
Two additional sets of data taken on this logic chain are presented in Fig. A-6. One
set gives the delay of two 1120 NOR gates plus an inverter. Comparing this data with
the pair delay data indicates that the output inverter introduces 8 to 10 ns of delay. The
final set of data in Fig. A-6 presents the delay of the six-stage chain comprising five
1120 NOR gates plus an output inverter. This test chain is similar to the chain evaluated
in Fig. A-5. Minor variations in the measured delays between these two chains were
justifiable considering the additional interstage loading associated with the logic chain
of Fig. A-6.

                                  Propagation Delay* (ns)
Chip      Pair Delay* (ns)        Six-stage chain       Two NOR gates plus inverter
No.
E8-10     23     24               70     67             31     32
E8-11     26     26               76     74             34     37
E8-14     25     25               74     72             33     35
E8-16     25     24               --     --             32     33
E8-18     26     27               78     75             33     36

*Data Conditions: VDD - VSS = 10 V; 10% - 90% edge time = 15 ns.
Fig. A-6. Pair delay and propagation delay of a five two-input NOR plus inverter.
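The pair-delay reasoning can be checked directly against the Fig. A-6 data: halving the pair delay gives the NOR stage delay, and subtracting the pair delay from the two-NOR-plus-inverter delay isolates the inverter. The sketch below does this on averages over all listed readings; pooling both readings per chip into one average is an assumption, and the names are illustrative.

```python
# Readings from Fig. A-6, in ns, two per chip (VDD - VSS = 10 V, 15-ns edges).
pair_delays_ns = [23, 24, 26, 26, 25, 25, 25, 24, 26, 27]
two_nor_plus_inverter_ns = [31, 32, 34, 37, 33, 35, 32, 33, 33, 36]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


pair_delay_avg_ns = mean(pair_delays_ns)

# A pair is two NOR stages, so one stage is half the pair delay.
nor_stage_delay_ns = pair_delay_avg_ns / 2

# The extra delay over the bare pair is attributed to the output inverter.
inverter_delay_ns = mean(two_nor_plus_inverter_ns) - pair_delay_avg_ns

print(f"average pair delay: {pair_delay_avg_ns:.1f} ns")
print(f"NOR stage delay: {nor_stage_delay_ns:.2f} ns")
print(f"inverter contribution: {inverter_delay_ns:.1f} ns")
```

These averages agree with the figures quoted in the text: a pair delay of about 25 ns, about 12.5 ns per stage, and an inverter contribution in the 8 to 10 ns range.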
Subsequent work on this program included: (1) defining the logic on the twelve-bit
adder chip (ATL-064), placing and routing the logic and creating the 80x artwork
masks; (2) evaluating one 20-bit adder hybrid; and (3) evaluating several additional
processing batches of ATL-049 adder chips.
Evaluation of the hybrid and the additional batches of ATL-049 adder chips indicated
that the silicon-gate beam leaded process had not stabilized to the point where repeatable
functional units could be fabricated. Several completely functional adder units were
produced which operated at reduced voltages (5-7 volts); however, it was felt that the
circuit performance objectives of this program could not be met using the SG-BL
approach.
