Electron lithography STAR design guidelines.  Part 3:  The mosaic transistor array applied to custom microprocessors.  Part 4:  Stored logic arrays, SLAs implemented with clocked CMOS by Trotter, J. D.
Produced by the NASA Center for Aerospace Information (CASI) 
https://ntrs.nasa.gov/search.jsp?R=19850023529 2020-03-20T17:29:53+00:00Z
ELECTRON LITHOGRAPHY STAR
DESIGN GUIDELINES
PART III of IV: The Mosaic Transistor Array
Applied to Custom Microprocessors
PART IV of IV: Stored Logic Arrays - SLAs
Implemented with Clocked CMOS
ELECTRICAL ENGINEERING
N85-31842
Unclas
G3/60 16022
FINAL REPORT - PARTS III & IV
Submitted to
National Aeronautics and Space Administration
George C. Marshall Space Flight Center
Principal Investigator
Date of general release
Mississippi State, MS 39762
MSSU-EIRS-EE-83-4
REPORT DOCUMENTATION PAGE
SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered): Unclassified
1.  REPORT NUMBER: MSSU-EIRS-EE-83-4
4.  TITLE (and Subtitle): Electron Lithography STAR Design Guidelines
    Part 3: The Mosaic Transistor Array Applied to Custom Microprocessors
    Part 4: Stored Logic Arrays - SLAs
5.  TYPE OF REPORT & PERIOD COVERED: Final Report, Parts 3 and 4 of 4
7.  AUTHOR(s): J. Donald Trotter, Principal Investigator
8.  CONTRACT OR GRANT NUMBER(s): NAS8-33450
9.  PERFORMING ORGANIZATION NAME AND ADDRESS:
    Mississippi State University
    Drawer EE
    Mississippi State, MS 39762
11. CONTROLLING OFFICE NAME AND ADDRESS:
    NASA/MSFC
    MSFC, AL 35812
    ATTN: Harley R. Hope AP29-C
12. REPORT DATE: August 31, 1982
13. NUMBER OF PAGES: 73
14. MONITORING AGENCY NAME & ADDRESS (if different from Controlling Office):
    ONRRR, Georgia Institute of Technology
    206 O'Keefe Building
    Atlanta, GA 3933
15. SECURITY CLASS. (of this report): Unclassified
19. KEY WORDS:
    Part 3: STAR, Semicustom microprocessor, Data path
    Part 4: Stored Logic Array (SLA), Programmable Logic Array (PLA),
    Clocked CMOS, STAR, Mosaic Transistor Array (MTA)
20. ABSTRACT:
    Part 3: The Mosaic Transistor Array is an extension of the STAR system
    developed by NASA which has dedicated field cells designed to be
    specifically used in semicustom microprocessor applications. The basic
    logic functions for a data path are designed with compatible interface
    to the STAR grid system.
    Part 4: Stored logic arrays are folded PLAs with the AND and OR planes
    merged into one physical space. The structure is shown to be compatible
DD FORM 1473, EDITION OF 1 NOV 65 IS OBSOLETE
S/N 0102-LF-014-6601
FINAL REPORT
CONTRACT NAS8-33450
ELECTRON LITHOGRAPHY STAR DESIGN GUIDELINES
Part 3: THE MOSAIC TRANSISTOR ARRAY APPLIED TO CUSTOM
MICROPROCESSORS
Part 4: STORED LOGIC ARRAYS - SLAs IMPLEMENTED WITH
CLOCKED CMOS
Principal Investigator
J. Donald Trotter
Mississippi State University
Department of Electrical Engineering
Mississippi State, Mississippi 39762
August 31, 1982
for
NATIONAL AERONAUTICS AND SPACE ADMINISTRATION
Marshall Space Flight Center, AL 35812
ELECTRON LITHOGRAPHY STAR DESIGN GUIDELINES
Part 3: The Mosaic Transistor Array Applied
to Custom Microprocessors
Principal Investigator
J. Donald Trotter
THE MOSAIC TRANSISTOR ARRAY
APPLIED TO
CUSTOM MICROPROCESSORS
Outline
I. Introduction
II. The Sandia CMOS Process with Double Layer Metal
III. The Design Approach
IV. The Data Path Circuitry
V. Additional Comments
LIST OF FIGURES
Figure 1     Process Cross-Sections
Figure 2a    CMOS Inverter
Figure 2b    Three Types of Switches
Figure 3     A CMOS Latch
Figure 4     Block Diagram of Latch
Figure 5     Latch with Multiple Inputs
Figure 6     Chip Plan for Data Path
Figure 7     Generalized Functional Block
Figure 8     Block Diagram for a 1
Figure 9     Control Buffers
Figure 10    Two Port Register
Figure 11    A 4x4 Barrel Shifter
Figure 12    Bus Circuitry and Timing
Figure 13    Literal Buffer
Figure 14    Input/Output Latches
Figure 15    Tri-State I/O
LIST OF TABLES
Table Ia     NASA Design Rules for Sandia Process
Table Ib     (Continued)
Table Ic     (Continued)
Table Id     (Continued)
SUMMARY
The Mosaic Transistor Array is an extension of the
STAR system developed by NASA which has dedicated field
cells designed to be specifically used in semicustom
microprocessor applications. The Sandia radiation hard bulk
CMOS process is utilized in order to satisfy the
requirements of space flights. A design philosophy is
developed which utilizes the strengths and recognizes the
weaknesses of the Sandia process. A style of circuitry is
developed which incorporates the low power and high drive
capability of CMOS. In addition the density achieved is
better than that for classic CMOS, although not as good as
for NMOS.
The basic logic functions for a data path are
designed with compatible interface to the STAR grid system.
In this manner either random logic or PLA type structures
can be utilized for the control logic.
THE MOSAIC TRANSISTOR ARRAY
APPLIED TO
CUSTOM MICROPROCESSORS
I. Introduction
NASA at Marshall Space Flight Center has developed
the Standard Transistor Array (STAR) as a means of providing
quick turn-around to the design-fabrication cycle for custom
integrated circuits. It is in essence a semicustom approach
utilizing two levels of metal interconnect for customizing a
chip to an application. It provides a means for fabrication
of the diffused understructure at one location while the
customizing double-layer metal can be applied at another.
This allows organizations with proficiency in the thin film
hybrid field to place the turn-around of custom I.C.
development under their own control.
The associated STAR software allows for any mix of
automatic layout and hand layout desired. Since the
interconnections are regimented into a vertical-horizontal
format, a reasonable display of the chip design can be
mapped onto the column-row oriented line printer output. The
STAR software system therefore does not demand an
interactive graphics capability as a prerequisite for
efficient operation.
Due to the double-layer metal aspect, STAR provides
higher density than former automatic layout schemes which
utilize only one layer of metal. The chip understructure is
packed with transistors and the interconnections are handled
overhead in the metal layers. The STAR scheme calls for all
metal to be routed over a rigid grid structure. The
interconnect points of the understructure transistors to the
metalization layer, i.e., the grid points, are also fixed.
The nature of the layout and fabrication of the
understructure can be varied in all other aspects. Thus,
the STAR approach can be suited to variations of technology.
The intent is to separate the chip logic (in terms of
interconnections of grid points) from the technology (in
terms of solid state devices which feed through up to the
grid). Logic, once defined, can easily be transferred from
one technology to another or scaled up or down as desired.
The understructure consists for the most part of two
complementary transistors that are replicated over the whole
chip. The task of changing technologies is reduced to
redesigning two transistors. The accommodation within a
technology to a specific vendor's desires on design rules
can also be handled easily by redesigning the two
transistors.
The STAR format dedicates certain of the horizontal
and vertical lanes to the task of converting transistors
into standard logic cell designs in order to ease the burden
on automatic layout computation. This leaves clear lanes,
plus unused cell lane segments, for the global
interconnection of cells into the final chip design. STAR
provides an excellent solution for custom integrated circuit
needs in the area of random logic. There is a considerable
need for such capability in the interfacing of standard
off-the-shelf microprocessors and their associated chips out
to the real world. The STAR approach is not area efficient,
however, in actual development of new microprocessors or
similar chips.
The Mosaic Transistor Array (MTA), the subject of
this research effort, is an approach aimed at reducing the
turn-around time in the development of microprocessor or
similar chips. Such chips tend to have segments devoted to
RAM, ROM, PLA, and register activities. The STAR format,
which is optimized for random logic, provides more
interconnect lanes than are needed for these very regular
structures. The MTA provides a small variety of
understructures to accommodate each such activity. These
understructures or field cells still maintain the same grid
for compatibility with the random logic field cell, but the
density of transistors may increase by a factor of four.
The MTA loses the semicustom capability of the STAR.
Prudent inclusion of extra random logic and/or PLA field
cells into a design provides elbow room for mistakes or
changes in the control section. In concept one only has to
revise the metal level masks. Wafers having the previous
understructure processing would still be good for the new
chip logic design.
A requirement for "total dose" radiation hardness
constrains the design to be compatible with an available
"hard" process. The Sandia silicon gate bulk CMOS process
has been selected. This process presents interesting
consequences for VLSI type designs. Channeling over the
p-well is minimized by requiring the poly gate to extend
over thin oxide into the P+ guard ring. This constraint
compromises the normal CMOS circuit density -- much less
compared with NMOS microprocessors.
The goal of this research project is to develop at
least a first generation set of cells suitable for
implementing the basic computer functions. In order to
achieve flexibility for various applications, a customized
control structure is visualized, possibly through
microprogramming. In addition an organized approach to the
assembly of the system building blocks from both circuit and
topology points of view is required. A generalized data
path (bit-slice) structure with a finite-state machine
controller offers such an approach. Furthermore, the data
flow through the bit-slice can be optimized for performance.
II. The Sandia CMOS Process with Double-Layer Metal
The Sandia process has been chosen since it
represents a bulk silicon, radiation-hard process suitable
for low-power space applications. Double layer metal add-on
processing is assumed for this design -- just as in the
case for the STAR program. The Sandia process is a p-well
process with a separate P+ guard-ring diffusion.
Consequently, the all N+ poly interconnection/gate layer is
permitted to cross the field and p-well boundary. The
fabrication steps required for n-type and p-type devices in
the Sandia process are explained with the aid of a series of
cross-sectional drawings in Figure 1.
The fabrication process begins with the growth of a
thermal oxide on an n-type silicon wafer. Figure 1-a shows
the cross-section of the wafer after the thermal oxide has
been grown. The next set of steps develops the p-well which
is necessary for the isolation of the NMOS devices. In
order to create the p-well, a hole must be made in the
thermal oxide; this is done by a masking process. The
p-well mask allows the thermal oxide to be removed in
regions where a p-well is desired. The p-well is implanted
and driven-in. A thin oxide is grown over the well in
conjunction with the drive-in. The end result is indicated
in Figure 1-b.
The P+ guard ring is the next addition. This guard
ring serves to prevent the formation of an n-type channel
over the lightly doped p-well -- even with the trapped
positive charges in the oxide due to high doses of
radiation. It also serves to inhibit SCR latch-up. The
placement of the guard ring is controlled by a mask. Figure
1-c displays the wafer after a boron diffusion has produced
the P+ guard ring. The guard ring mask (photoresist),
indicated by the region enclosing the wavy lines, is shown
in the figure.
A thick oxide, referred to as the field oxide, is
then grown over the entire wafer. A mask is used to remove
the thick oxide in regions where a thin oxide is desired.
Figure 1-d illustrates the results.
The philosophy of the Sandia process is to have a
thin oxide cover the entire p-well, including the guard ring
and the PMOS devices. The other regions are covered by a
thick oxide. It should be noted that the exposure to
radiation creates hole-electron pairs in the oxide. The
electrons move relatively freely with moderate electric
fields and are drained off by the positive potential
interconnections. The holes are trapped at the oxide
interface, resulting in an effective accumulation of
positive charge which attracts electronic charge in the
silicon below. The charge trapped is related to the oxide
thickness to the second power, as a minimum. Consequently,
over lightly-doped p-regions, only thin gate oxide is used
in order to prevent channeling between "unrelated" N+ nodes.
The thin oxide is provided by growing an oxide over
the entire wafer. This oxide growth does not appreciably
change the thickness of the thick oxide already present. A
uniform polysilicon (poly) deposition follows the thin oxide
growth. The poly is doped N+ and will be used as the gates
for all the transistors, both NMOS and PMOS. A mask
referred to as the poly mask is used to remove the unwanted
poly. The poly that remains is covered by growing a layer
of thin oxide. Figure 1-e shows the cross-section of the
wafer at this point in the fabrication process.
The wafer is now ready to be implanted with an N+
material which will provide the sources and drains of all
the NMOS devices. The N+ implant mask (photoresist) is
placed on the chip and the N+ material implanted. The
results from this N+ implant step are indicated in Figure
1-f. The mask is still in place as indicated by wavy lines.
The figure shows that the placement of the
source and drain regions in the p-well for the NMOS devices
is controlled by the implant mask and the poly. This
provides a self-alignment of the NMOS transistors which is
vital for proper operation of the device. One notes that an
N+ diffused region can be placed over the n-field in order
to achieve ohmic contact to the n-substrate. The N+ implant
mask must fall within the thin-oxide region.
Similar to the NMOS devices, the PMOS devices are
also self-aligned. The P+ implant mask is used with a boron
implant to dope the P+ regions. Figure 1-g illustrates the
results. From the figure it is seen that the placement of
the source and drain regions of the PMOS devices is
controlled by the thick oxide and the poly. The P+ implant
mask does not allow the P+ material to be implanted inside
the p-well except in regions for ohmic contact to the
p-well.
The removal of the P+ implant mask is followed by a
glass coating. The cross-section of the wafer after this
step appears as in Figure 1-h. It is seen in the figure
that once the NMOS and PMOS devices have been constructed it
remains only to provide the metal interconnections between
devices.
As mentioned previously, double-layer metal
processing is assumed. Before the first layer of metal is
deposited on the chip, holes are cut into the glass that is
present on the chip. This is done through another masking
step; the mask is therefore referred to as the contact mask.
Figure 1-i illustrates the chip after this masking step.
The holes produced in the glass are called contacts, and the
chip is ready for the first layer of metal.
The first layer of metal, sometimes a silicide, is
deposited on the chip. A mask is used to remove the metal
in locations where it is unwanted. This technique was used
when the poly was laid on the chip and will be used again
for the second layer of metal. After the first metal
processing steps, the chip appears as in Figure 1-j. The
metal is indicated in the figure by the cross-hatched area.
In the figure the drain of the NMOS device has been
connected to the drain of the PMOS device. Note that
connections between devices can be made by use of the metal.
The first metallization is followed by another glass
coating similar to the glass coating step which followed the
P+ implant. Before connections are made using the second
metallization, holes are cut in the glass. Again a mask
(called a via mask) is used. The resulting holes in the
glass are known as vias. The cross-section of the chip
after the vias have been etched is indicated in Figure 1-k.
The second layer of metal, appropriately referred to
as top metal, is deposited on the chip. Once again a mask
is used during the etching process to remove undesirable top
metal. This second metallization step provides the
remaining device connections that could not be completed
with first metal. A final glass overcoat is placed on the
chip as shown in Figure 1-l. The top metal has been
cross-hatched opposite to that of first metal. According to
the figure the connection of the gates of the NMOS and PMOS
devices has been made by the top metal. This second
metallization completes the process except for a pad mask
step that allows terminal connections of the packaged chip
to the integrated circuit.
The Sandia process can be quickly summarized by
reviewing the different masks required during fabrication.
Ten masks excluding the final pad mask comprise the complete
set of required masks. These masks are in order: p-well, P+
guard ring, thin oxide, poly gate, N+ implant, P+ implant,
contact, first metal, via, and second (top) metal.
The design rules used for the MTA project are the
same as those developed for the radiation-hard STAR design.
These rules are repeated in the following Table for the
reader's convenience.
[Figure 1 - Process Cross-Sections, panels (a) through (l)]
                                             NASA      SANDIA
RULE     MASK DESCRIPTION                    (MILS)    (MICRONS)

LAYER 1: P-WELL
1.1.1    MINIMUM WIDTH                       .2        5
1.1.2    MINIMUM SPACE                       1.0       25

LAYER 2: P+ GUARD-BAND
2.2.1    MINIMUM WIDTH                       .2        5
2.2.2    MINIMUM SPACE                       .6        15
2.2.3    MINIMUM WIDTH AROUND WELL           .3        7
2.1.1    INSIDE EDGE OF GUARD BAND
         OVERLAP TO OUTSIDE OF P-WELL        0         0

LAYER 3: THICK OXIDE (ALIGNED TO P+ GUARD-BAND)
3.3.1    MINIMUM THICK OXIDE WIDTH           .3        7
3.3.2    MINIMUM THIN OXIDE SPACE            .3        8
3.3.3    MINIMUM SPACE (P+ TO P+)            .5        12
3.2.1    MINIMUM THIN OXIDE OVERLAP
         OF GUARD BAND EDGE                  .2        5
3.2.2    MINIMUM THICK OXIDE TO
         GUARD BAND WHEN USED TO
         DEFINE N+                           .325      8
3.2.3    MINIMUM THICK OXIDE TO
         GUARD BAND WHEN USED TO
         DEFINE P+ DRAIN                     .65       17
3.2.4    SAME AS 3.2.3 EXCEPT
         P+ SOURCE                           .5        13
3.1.1    MINIMUM THICK OXIDE TO
         P-WELL WHEN USED TO
         DEFINE P+ DRAIN                     .85       21
3.1.2    SAME AS 3.1.1 EXCEPT
         P+ SOURCE                           .675      17
3.3.3    MINIMUM THIN OXIDE TO
         SCRIBE LINE                         2.0       50

Table Ia - NASA Design Rules
                                             NASA      SANDIA
RULE     MASK DESCRIPTION                    (MILS)    (MICRONS)

LAYER 4: POLY SILICON (ALIGNED TO THICK OXIDE)
4.4.1    MINIMUM LINE                        .25       6
4.4.2    MINIMUM SPACE                       .25       6
4.4.2    MINIMUM PMOS GATE LENGTH            .25       5
4.4.3    MINIMUM NMOS GATE LENGTH            .2        8
4.3.1    MINIMUM POLY TO THICK
         OXIDE                               .075      1
4.2.1    MINIMUM POLY TO GUARD BAND
         WHEN POLY DEFINES N+                .35       9
4.2.2    SAME AS 4.2.1 EXCEPT POLY
         DEFINES P+ DRAIN                    .7        18
4.2.3    SAME AS 4.2.2 EXCEPT POLY
         DEFINES P+ SOURCE                   .6        14
4.3.2    MINIMUM POLY GATE OVERLAP
         OF THICK OXIDE                      .2        5
4.2.4    MINIMUM POLY OVERLAP OF
         GUARD RING                          .25       6
4.1.1    MINIMUM POLY TO P-WELL WHEN
         POLY DEFINES P+ DRAIN               .9        22
4.1.2    SAME AS 4.1.1 EXCEPT POLY
         DEFINES P+ SOURCE                   .7        18

LAYER 5: N+ IMPLANT (ALIGNED TO THICK OXIDE)
5.5.1    MINIMUM WIDTH                       .2        5
5.5.2    MINIMUM SPACE                       .2        6
5.2.1    MINIMUM N+ TO P+ GUARD              .4        10

LAYER 6: P+ IMPLANT (ALIGNED TO THICK OXIDE)
6.6.1    MINIMUM WIDTH                       .2        5
6.6.2    MINIMUM SPACE                       .5        12
6.2.1    MINIMUM P+ DRAIN TO
         P+ GUARD                            .75       19
6.2.2    MINIMUM P+ SOURCE TO
         P+ GUARD @ VDD                      .6        15
6.2.3    MINIMUM P+ DRAIN TO
         P+ GUARD @ VDD                      .6        15
6.3.1    MINIMUM P+ TO THICK OXIDE
         WHICH DEFINES P+                    .2        5
6.1.1    MINIMUM P+ TO P-WELL                .9        23
6.1.2    MINIMUM P+ TO P-WELL
         @ VDD                               .75       19
6.5.1    MINIMUM P+ OVERLAP OF
         N+ FOR SHORTING                     .05       1

Table Ib - (Continued). NASA Design Rules For Sandia Process
                                             NASA      SANDIA
RULE     MASK DESCRIPTION                    (MILS)    (MICRONS)

LAYER 7: CONTACT (ALIGNED TO THICK OXIDE)
7.7.1    MINIMUM CONTACT                     .3 x .3   6 x 6
7.3.1    MINIMUM CONTACT INSIDE
         THICK OXIDE DIFFUSION               .2        11
7.4.1    MINIMUM CONTACT INSIDE
         POLY TO EDGE                        .15       4
7.4.2    MINIMUM CONTACT OUTSIDE
         POLY TO EDGE                        .2
7.5.1    MINIMUM CONTACT INSIDE
         N+ DIFFUSION                        .2        5
7.6.1    MINIMUM CONTACT INSIDE
         P+ DIFFUSION                        .2        5
7.3.1    MINIMUM CONTACT FROM
         OXIDE STEP                          .2        5
7.7.2    MINIMUM CONTACT FOR SHORTING
         OVERLAP DIFFUSIONS                  .3        6
         MINIMUM P+ GUARD CONTACT            .3        7

LAYER 8: METAL (ALIGNED TO CONTACT)
8.8.1    MINIMUM METAL WIDTH                 .5        8
8.8.2    MINIMUM METAL SPACE                 .3        7
8.7.1    MINIMUM METAL CONTACT
         OVERLAP                             .1        1
8.3.1    MINIMUM METAL TO SCRIBE
         LINE                                2.0       50

LAYER 9: VIA (ALIGNED TO FIRST METAL)
9.9.1    MINIMUM VIA                         .3 x .3   6 x 6
9.8.1    MINIMUM VIA INSIDE METAL            .1        1

Table Ic - (Continued). NASA Design Rules For Sandia Process
                                             NASA      SANDIA
RULE     MASK DESCRIPTION                    (MILS)    (MICRONS)

LAYER 10: SECOND METAL (ALIGNED TO VIA)
10.10.1  MINIMUM METAL WIDTH                 .5        8
10.10.2  MINIMUM METAL SPACE                 .3        7
10.9.1   MINIMUM METAL OVERLAP
         OF VIA                              .1        1
10.3.1   MINIMUM BONDING PAD TO
         THIN OXIDE                          1.5       40
10.10.3  MINIMUM BONDING PAD TO
         SECOND METAL                        1.5       40
10.8.1   MINIMUM BONDING PAD TO
         FIRST METAL                         1.5       40
10.10.4  MINIMUM UNBUFFERED
         PAD SPACING                         8.5       220
10.10.5  MINIMUM BUFFERED PAD
         TO PAD SPACING                      12.5      320
10.10.6  MINIMUM PAD SIZE                    4 x 4
10.10.7  MINIMUM VSS & VSS BUS
         WIDTH                               .6        16
10.3.1   MINIMUM METAL TO
         SCRIBE LINE                         2.0       50
10.11.1  PAD MASK INSIDE METAL PAD           .175      4

Table Id - (Continued). NASA Design Rules for Sandia Process
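
Table I lists every rule in both mils and microns. As a rough
cross-check, the following is a minimal sketch that assumes only
the standard conversion 1 mil = 25.4 microns. The helper name
mils_to_microns and the choice of sample rules are ours and are not
part of the NASA or Sandia rule set. The tabulated micron values
appear to be rounded, so small differences from the exact
conversion are to be expected.

```python
MICRONS_PER_MIL = 25.4  # 1 mil = 0.001 inch = 25.4 micrometres


def mils_to_microns(mils: float) -> float:
    """Convert a dimension in mils to microns (hypothetical helper)."""
    return mils * MICRONS_PER_MIL


# A few (mils, microns) pairs copied from Table Ia for comparison.
sample_rules = {
    "1.1.1 p-well minimum width": (0.2, 5),
    "1.1.2 p-well minimum space": (1.0, 25),
    "3.3.3 thin oxide to scribe line": (2.0, 50),
}

for name, (mils, listed_microns) in sample_rules.items():
    exact = mils_to_microns(mils)
    print(f"{name}: {mils} mil = {exact:.2f} um (table lists {listed_microns})")
```
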
III. The Design Approach
A generalized bit-slice structure with a
finite-state machine controller is visualized to offer the
flexibility objectives of this work. Mead and Conway
suggest such an approach in their design text. However,
their circuit philosophy is based on silicon gate NMOS with
depletion devices -- a technology not suitable for the
radiation environment of space flights. Their
implementation of logic by means of register-to-register
transfer paths through combinational logic with each register
being gated with a particular clock phase is not new;
however, their extensive use of pass transistor logic to
implement the combinational logic is unusual. Ratioed logic
is principally restricted to the latches and registers; the
ratioless pass transistor logic achieves the extended AND
function at essentially zero stand-by power with the only
performance compromise being fan-in. In this case the pass
transistors are bi-lateral switches supplying current as a
source-follower and sinking current as a common source device.
The source follower configuration has two short-comings:
1) Its final output value (pull-up) is limited by
the body-effect to approximately 3.5 volts from a
5 volt supply -- even for a high performance NMOS
process.
2) The time for the 0 to 3.5 volt transition is
approximately five times longer than for the 3.5
to 0 volt transition (operating as a common source
device).
Nevertheless,	 the simplicity of	 pass transistor logic is
very attractive, particularly for VLSI designs. 	 Precharging
"high" can minimize transition delays in many cases.
Attempting to apply these concepts to CMOS
technologies presents some interesting dilemmas. Replacing
the ratioed inverters in the latches and registers by CMOS
inverters is a straight-forward decision; however, the
"structured" combinational logic poses a problem. Pass
transistor switches, historically, in CMOS are implemented
with pairs of devices, one NMOS and one PMOS for each
switch, in order to avoid using a device in its compromised
source-follower configuration. A bulk CMOS process with the
NMOS devices built in a p-well, such as the Sandia process,
provides NMOS devices with poor device characteristics
compared with standard NMOS devices. The high p-well doping
in the Sandia process (higher than normal for radiation
hardness) produces a greater body-effect and reduced channel
mobility compared with the standard. Consequently, the
Sandia NMOS device is particularly unsuitable as a
source-follower. As a result a direct implementation of the
pass-transistor logic utilized by Mead and Conway is not
practical in this design effort.
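
The source-follower limit discussed above can be illustrated
numerically. The following is a minimal sketch, assuming the
textbook body-effect expression Vt = Vt0 + gamma*(sqrt(2*phiF + Vsb)
- sqrt(2*phiF)). The parameter values (Vt0 = 1.0 V, gamma = 0.4,
2*phiF = 0.6 V) are illustrative assumptions chosen so that the
result lands near the 3.5 volt figure quoted in the text; they are
not Sandia or Mead and Conway process data.

```python
import math


def threshold(v_sb, vt0=1.0, gamma=0.4, two_phi_f=0.6):
    """Threshold voltage (volts) at source-to-body bias v_sb.

    Textbook body-effect model; all default values are assumed,
    illustrative numbers, not measured process parameters.
    """
    return vt0 + gamma * (math.sqrt(two_phi_f + v_sb) - math.sqrt(two_phi_f))


def source_follower_high(vdd=5.0, iterations=50, **device):
    """Highest level an NMOS pass device can pull its source to.

    Gate and drain are held at vdd and the body at 0 V, so the
    output settles where v_out = vdd - Vt(v_out). The fixed-point
    iteration converges quickly for the gamma values used here.
    """
    v_out = 0.0
    for _ in range(iterations):
        v_out = vdd - threshold(v_out, **device)
    return v_out


baseline = source_follower_high()
heavier_well = source_follower_high(gamma=0.8)  # assumed stronger body effect
print(f"baseline pull-up limit:     {baseline:.2f} V")
print(f"stronger body-effect limit: {heavier_well:.2f} V")
```

Raising gamma, as a more heavily doped p-well would, lowers the
pull-up limit further, which is the reason given in the text for
avoiding the NMOS source-follower in this process.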
On the other hand classical CMOS is not practical
for VLSI either. The CMOS device count for combinational
logic is approximately twice that for NMOS. Further,
implementing pass-transistor logic with classical CMOS
requires true and complement signals to drive the NMOS and
PMOS pair -- thus requiring roughly twice the number of
control signals and drivers as for NMOS.
One alternative approach remains: implement
pass-transistor logic with PMOS only, utilizing the fact
that the PMOS devices in the Sandia process have a
relatively low body-effect. In addition the P+ guard ring
diffusion can be used as a diffusion tunnel under poly
interconnections to provide a much needed cross-under
capability.
Consider a CMOS latch with a PMOS pass-transistor
serving as a multiplexing switch. The PMOS pass-gate will
provide output swing from about 1.3 volts to the power
supply. Adjusting the ratio of the first inverter in the
latch can provide the necessary level shifting. However, an
additional NMOS device is required with the PMOS feedback
transistor to supply the full voltage swing after
"latch-up", thus providing the full gate drive signals,
facilitating the full output drive from the inverters.
In this design philosophy the NMOS transistors,
which require the extra area for channel stops, are utilized
only where necessary. The NMOS device is an effective
switch in the common-source configuration; however, a PMOS
device is required for pull-up either in a classical CMOS
fashion or precharged in a clocked scheme.
[Figure 2a - CMOS Inverter]
[Figure 2b - Three Types of Switches]
[Figure 3 - A CMOS Latch]
[Figure 4 - Block Diagram of a Latch]
[Figure 5 - Latch with Multiple Inputs]
IV. The Data Path Circuitry
Following the philosophy of Mead and Conway the data
path is organized as parallel bit-slices from one port to
another port as illustrated in Figure 6. Two different
buses run between the two I/O ports and pass through the
dual port register array, the barrel shifter, and the ALU.
The ALU is built around a Manchester-type carry circuit.
The carry-out signal is precharged high prior to logic
evaluation (phase 2) and remains high unless the kill signal
(K) is true or P*Cin' is true, where P is the pass signal and
Cin' is the complement of the carry-in signal.
Cout' = K + P*Cin'
The kill and pass signals are generated by generalized
function blocks, Figure 7. Each block contains four control
signals and is driven by true and complement signals from
two data sources, e.g.,
Gout = G0*A'*B' + G1*A'*B + G2*A*B' + G3*A*B
where A and B are data signals and G0 - G3 are control signals
determined by the interpretation of the instruction.
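
The function block equation can be exercised directly. The
following is a minimal sketch of the Gout expression; the function
name and the two control words shown (an exclusive-OR pass signal
and a NOR kill signal, as would suit addition) are our own
illustrative assumptions, since the report does not list the
control words used for particular instructions.

```python
def function_block(g, a, b):
    """Evaluate Gout = G0*A'*B' + G1*A'*B + G2*A*B' + G3*A*B.

    g is the tuple of control bits (G0, G1, G2, G3); a and b are
    the data bits. All values are 0 or 1.
    """
    g0, g1, g2, g3 = g
    not_a, not_b = 1 - a, 1 - b
    return (g0 & not_a & not_b) | (g1 & not_a & b) | (g2 & a & not_b) | (g3 & a & b)


# Assumed control words for an add instruction (not from the report).
PASS_FOR_ADD = (0, 1, 1, 0)  # P = A xor B
KILL_FOR_ADD = (1, 0, 0, 0)  # K = A' * B'

for a in (0, 1):
    for b in (0, 1):
        p = function_block(PASS_FOR_ADD, a, b)
        k = function_block(KILL_FOR_ADD, a, b)
        print(f"A={a} B={b}  P={p} K={k}")
```

Because the four control bits select one output value for each of
the four input combinations, a single block can realize any of the
sixteen Boolean functions of A and B.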
The ALU is illustrated in Figure 8. The carry chain
circuitry is implemented in NMOS since it is precharged high
during phase 1 and conditionally discharged low during phase
2. The NMOS device is faster in this mode than the PMOS
device. In addition the K and P function blocks can be
precharged low during phase 1 and conditionally charged high
during the evaluation of the logic, phase 2 -- thus
achieving a ripple-through effect and saving one signal
inversion and two logic gates.
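
The carry relation Cout' = K + P*Cin' can be checked with a small
behavioral model. The following is a minimal sketch, not a circuit
simulation: it assumes the kill and pass signals for addition are
K = A'*B' and P = A xor B, and that the result bit is P xor Cin.
Those choices and all function names are our assumptions; the
report states only the carry equation itself.

```python
def to_bits(value, width):
    """Return value as a list of bits, least significant first."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """Inverse of to_bits."""
    return sum(bit << i for i, bit in enumerate(bits))


def manchester_add(a_bits, b_bits, carry_in=0):
    """Ripple the carry through bit-slices using kill/pass logic."""
    carry = carry_in
    sum_bits = []
    for a, b in zip(a_bits, b_bits):
        kill = (1 - a) & (1 - b)      # assumed K for addition
        pass_ = a ^ b                 # assumed P for addition
        sum_bits.append(pass_ ^ carry)  # assumed result function
        # Carry line is precharged high and discharged when
        # Cout' = K + P*Cin' is true.
        carry_out_n = kill | (pass_ & (1 - carry))
        carry = 1 - carry_out_n
    return sum_bits, carry


bits, carry_out = manchester_add(to_bits(11, 4), to_bits(6, 4))
print(from_bits(bits), carry_out)
```

When neither kill nor pass is true (A and B both high), the
precharged line simply stays high, so no separate generate term is
needed in the model.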
The ALU is preconditioned for logic evaluation
(precharged) during phase 1 and the logic is evaluated
during phase 2. The data is transferred over the buses
during phase 1 with the buses precharged low during phase 2.
The two input latches, A and B, have multiplexing gates
which select signals from three sources: 1) For latch A the
possible signals are Bus A, the shifter output, and the
shift-control signal. 2) For latch B the possible signals
are Bus B, the shifter output, and the shift-control signal.
These signals are selected during phase 1 and latched during
phase 2. The two output latches sample only the output of
the R block during phase 2. Either output latch can drive
either bus during phase 1.
The carry-chain output drives the carry input for
the next bit-slice. After several bit-slices are added
together, the discharge delay through the several
pass-transistors can become excessively long. By utilizing
the Cin signal generated to drive the R block as the drive
signal for the carry propagation gate on a periodic basis,
the pass-gate logic is buffered to minimize the carry
propagation delay.
The data buses are precharged low during phase 2 and
conditionally charged high through the enabled PMOS
pass-gate as illustrated in Figure 12. The two port
register cell with its PMOS drivers to either bus is shown
in Figure 10.
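
The bus discipline described above can be summarized in a small
behavioral model. The following is a minimal sketch under our own
simplifying assumptions: each bus line is a single bit, every
driver is an (enable, data) pair, and a PMOS pass-gate can only
pull the precharged-low line high. The function name is
hypothetical and no timing or charge sharing is modeled.

```python
def bus_cycle(drivers):
    """Return the bus value at the end of phase 1.

    drivers is a list of (enable, data) pairs, one per PMOS
    pass-gate attached to the bus line.
    """
    bus = 0  # phase 2: line precharged low
    for enable, data in drivers:  # phase 1: conditional charge high
        if enable and data:
            bus = 1
    return bus


# One register cell enabled with a stored 1, one disabled.
print(bus_cycle([(1, 1), (0, 0)]))
```

In this model a line with no enabled driver simply keeps its
precharged low value, so a stored 0 needs no active pull-down.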
Between the register and the ALU is a barrel shifter
which concatenates the two buses, with bus B being in the
lower significant bit positions. With the shift constant
equal to zero the shifter output corresponds to the A bus.
With a non-zero shift constant, the output is shifted down
the corresponding number of bits. The most significant bits
from the A bus are omitted and the most significant bits
of the B bus are included. Figure 11 illustrates the
pass-gate logic used in a 4 x 4 bit barrel shifter. For the
Sandia process the shifter is implemented in PMOS with the
shift output precharged low in phase 2. If data is read out
of a register to an input latch, the transfer occurs in
phase 1 through three PMOS pass-gates: a dual-port read
switch, a shifter switch, and the multiplexer at the latch
input.
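The shifter's function, as described above, amounts to sliding a window over the concatenated buses. A minimal sketch follows, assuming unsigned integer bus values; the function name `barrel_shift` and the `width` parameter are hypothetical and only illustrate the data selection, not the pass-gate circuit.

```python
def barrel_shift(a, b, shift, width=4):
    """Select `width` bits from the concatenation of bus A (upper half)
    and bus B (lower half), moved down by `shift` bit positions."""
    if not 0 <= shift <= width:
        raise ValueError("shift must be between 0 and width")
    mask = (1 << width) - 1
    concatenated = ((a & mask) << width) | (b & mask)
    # With shift = 0 the window covers exactly the A bus. Each extra
    # shift position drops one most significant bit of A and takes in
    # one more of the most significant bits of B.
    return (concatenated >> (width - shift)) & mask
```

For example, with A = 1010 and B = 1100, a shift of 1 yields 0101: the low three bits of A followed by the top bit of B.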
The shifter also accommodates data transfers between
bus A and a literal port to the controller. The controller
can supply a literal in phase 1. Data transfer in the
reverse direction is possible in the opposite phase. Figure
13 illustrates the bilateral buffer.
A tri-state I/O buffer is given in Figures 14 and
15. The output is latched from either bus and enabled with
an asynchronous signal, with the latch driving either bus
during phase 1. An appropriate number of inverter stages is
included in the buffer to achieve the necessary capacitance
driving capability.
FIGURE 7 - GENERALIZED FUNCTIONAL BLOCK
FIGURE 9 - CONTROL BUFFER
FIGURE 12 - BUS CIRCUIT
FIGURE 13 - BILATERAL BUFFER
V. Additional Comments
This design of the functional elements for a data
path logic provides the foundation for implementing a
microprocessor with the MTA concept. In this regard the
structured SLA described in another part of this report,
conventional STAR cells, or both can be used to implement
the control logic.
The data path or bit-slice approach is quite
flexible and is compatible with the concept of
microprogramming with a PLA type structure. Electrically
reprogrammable PLAs could even provide field changes in the
basic instruction. The popularity of the bit-slice approach
is exemplified by recent announcements from Hewlett-Packard
and Bell Laboratories. It should also be noted that the
Bellmac-32 from Bell Laboratories is a 32-bit processor
implemented with CMOS which uses their polycell (standard
cell) structures in conjunction with a data path function.
The detail drawings of the functional cells for the
data path are provided in the appendix.
REFERENCES
1. Carver Mead and Lynn Conway, Introduction to VLSI Systems,
Addison-Wesley Publishing Company, 1980.
Cell drawings (MSU Microelectronics Graphic Design Lab):
1001   BASIC LATCH (no silicide)
1002   ODD LATCH (gate added)
1003   EVEN LATCH (gate added)
1004   [caption not legible] CELL
1006   DUAL CELL
1007   FUNCTIONAL BLOCKS P & K
10081  CONTROL BUFFER [from Y=0 to Y=85]
10082  CONTROL BUFFER [from Y=85 to Y=150]
10084  CONTROL BUFFER [Y=235 to Y=340]
1009   I/O TRI-STATE BUFFER [ENABLE CIRCUIT]
10101  CARRY CHAIN & R BLOCK [path #1]
       CARRY CHAIN & R BLOCK [path #2]
1011   OUTPUT LATCH to ARITHMETIC LOGIC UNIT
1012   INPUT LATCH to ARITHMETIC LOGIC UNIT
1013   BASIC CELL for the SHIFTER
1014   CELLS on the LEFT EDGE of SHIFTER
1015   CELLS on the RIGHT EDGE of SHIFTER
1016   CELLS on the DIAGONAL of SHIFTER
       CELL # (16, 16) of SHIFTER
       CELL # (0, 0)
1018   [caption partly legible] BUFFER
2002   MULTIPLEXER (adds silicide to latch)
ELECTRON LITHOGRAPHY STAR DESIGN GUIDELINES
Part 4: Stored Logic Arrays - SLAs Implemented
with Clocked CMOS
Principal Investigator
J. Donald Trotter
STORED LOGIC ARRAYS - SLAs
IMPLEMENTED WITH
CLOCKED CMOS
OUTLINE
I. Introduction
II. Clocked CMOS
III. Programmable Logic Arrays
IV. STAR Layouts with Dual Input Latches
V. Appendix
LIST OF FIGURES
Figure 1    Clocked CMOS Concept
Figure 2    Clocked CMOS Observations
Figure 3    AND/OR with Ripple CMOS
Figure 4    Basic Types of CMOS Gates
Figure 5    Charge Splitting Concerns
Figure 6    Clocked CMOS Gate
Figure 7    PLA Using NMOS
Figure 8    AND/OR Combinations
Figure 9    Dynamic CMOS PLA
Figure 10   Alternate Input Circuits
Figure 11   Latches
Figure 12   Dynamic CMOS PLA
Figure 13   PLA/SLA with Dual Input F/F
Figure 14   Chip Floorplan
Figure 15   Legend
Figure 16   Chip Floorplan Detail
Figure 17   Output Buffer Circuit Detail
Figure 18   Latch Detail
Figure 19   Input Circuit Detail
Figure 20   Input Circuit Detail
Figure 21   NCOM and PCOM
Figure 22   FA and FA'
Figure 23   AND/OR Plane Detail
STORED LOGIC ARRAYS - SLAs
IMPLEMENTED WITH
CLOCKED CMOS
SUMMARY
Stored logic arrays are folded PLAs with the AND
and OR planes merged into one physical space. The
structure is shown to be compatible with the STAR layout of
transistors. As a result the STAR foundation wafers, or a
similar structure, can be used to implement the structured
logic associated with PLAs. This structure offers the
potential for implementing the control structure for the
Mosaic Transistor Array. In addition the structure offers
an organized approach for the distribution of logic gates
in state machines and as such may represent an approach to
untangling the placement and routing difficulties in random
logic.
In order to efficiently compact the logic onto the
regular grid of the STAR, it is necessary to develop a new
style of CMOS circuitry which does not utilize the classic
CMOS pair of transistors for each fan-in. A clocked
variety of CMOS is developed which achieves higher layout
density and performance. Concepts of simultaneously
precharging a series of gates are developed. At an
appropriate clocking signal the logic is evaluated with the
signals "rippling" through the logic. This concept has led
the author to the term "ripple" logic. It is shown that
the logic is not general; consequently, the compromises are
developed relative to high performance, regular logic
arrays. Since dynamic techniques are utilized, careful
considerations of charge splitting are observed. As such,
clocked CMOS concepts are presented in a form suitable for
the uninitiated readers.
I. Introduction
Gate arrays have become increasingly popular in the
last few years as a means of implementing complex logic
with reduced engineering cost. The NASA STAR is a
double-layer metal CMOS version of a gate array. The STAR
has at least two features which offer consideration of an
alternate design approach: 1) The STAR is composed of rows
of NMOS transistors and rows of PMOS transistors with the
neighboring transistors always coupled via the source/drain
diffusion. 2) The gates of the transistors are left
unconnected through the standard layers and are connected
with the programming mask levels. Array logic such as
programmable logic arrays (PLAs) or storage logic arrays
(SLAs) offers an organized means of placement and routing
for logic with some sacrifice in performance.
Classic CMOS circuitry requires signals to drive
pairs of transistors instead of one, as normally used in
NMOS. This requires greater area for interconnections and
transistors. However, clocked CMOS, with the concept of
precharging circuit nodes and then conditionally
discharging them, requires only one transistor per signal
input. Furthermore, it offers the potential for speed
enhancement. A requirement for this circuit style is that
the gates for NMOS and PMOS pair transistors remain
uncommitted, a feature found in the STAR design.
An SLA is a particular variety of PLA which is
folded with the OR and AND planes merged together. The
latches (or flip-flops) can be placed around the logic
array. The actual size of the AND and OR gates for
particular signals can vary with the actual applications,
thus providing folding opportunities not normally found in
PLAs. The STAR design offers the potential of implementing
the SLA on its standard structure. The SLA design can be
part of a STAR chip design and merged in with conventional
gate implementations.
With these concepts in mind the SLA design is
pursued.
II. Clocked CMOS
A. Basic Concept
There are several clocked CMOS circuit styles. All
are based on the concept of precharged gates. Defining an
N-Gate as a clocked CMOS gate which utilizes NMOS
transistors for logic evaluation, one simply uses a PMOS
transistor to precharge the output unconditionally to the
positive supply potential during some precharge clock
signal and conditionally discharges the output back to
ground through a series-connected NMOS transistor which is
switched on with an evaluation clock signal. If the
series/parallel combination of NMOS transistors which form
the logic function offers a path to ground, the output is
discharged during the evaluation period. If the logic
devices do not offer a conduction path or the evaluation
device is not turned on, the output remains in its
precharged state subject to the leakage discharge of the
node. This leakage time constant can be expected to exceed
1 msec with typical values being in the order of one
second. The information associated with the charged state
of the output is dynamic in nature and requires sampling
with appropriate timing constraints. One can realize the
logic function of the gate by noting that the condition
associated with discharge produces a "0", for positive
logic. If two devices are in series, both must be on for
discharge, resulting in the NAND function being realized.
If devices are in parallel, either can discharge the
output; hence, the NOR function is realized. The more
general parallel/series arrangement can be analyzed in a
similar manner.
The P-Gate is defined in a dual manner to the
N-Gate. The logic-forming transistors are implemented with
PMOS devices and the precharging load device is an NMOS
transistor. The output is unconditionally precharged to
the "0" state and conditionally charged to the positive
supply through the PMOS logic during the evaluation period.
With series PMOS logic transistors the logic formed is the
NOR function, assuming positive logic. With parallel logic
transistors the logic formed is the NAND function. In
general, the series-parallel combination of PMOS
transistors must be the dual of the parallel-series
combination of NMOS transistors for a given logic function,
just as for classical CMOS.
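The N-Gate and P-Gate definitions above can be checked with a small behavioral model. This is only a logic-level sketch under the positive-logic convention of the text; the nested-tuple network encoding and the names `conducts`, `n_gate` and `p_gate` are hypothetical, and leakage and timing are ignored.

```python
def conducts(network, inputs, on_level):
    """True if the series/parallel switch network offers a conduction path.

    `network` is an input name, ("series", n1, n2, ...) or
    ("parallel", n1, n2, ...). A device is on when its input equals
    `on_level` (1 for NMOS, 0 for PMOS).
    """
    if isinstance(network, str):
        return inputs[network] == on_level
    kind, *branches = network
    results = [conducts(branch, inputs, on_level) for branch in branches]
    return all(results) if kind == "series" else any(results)


def n_gate(network, inputs, evaluate=True):
    """N-Gate: precharged high, conditionally discharged through NMOS."""
    output = 1  # precharge phase
    if evaluate and conducts(network, inputs, on_level=1):
        output = 0  # a path to ground exists during evaluation
    return output


def p_gate(network, inputs, evaluate=True):
    """P-Gate: precharged low, conditionally charged through PMOS."""
    output = 0  # precharge phase
    if evaluate and conducts(network, inputs, on_level=0):
        output = 1  # a path to the positive supply exists
    return output
```

With `("series", "a", "b")` the N-Gate gives NAND and the P-Gate gives NOR; with `("parallel", "a", "b")` the roles are exchanged, as stated in the text. With `evaluate=False` each gate simply holds its precharged level.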
	
Consider the following concepts illustrated in
Figure 1: The information in the state flip-flops is
stored in more-or-less conventional CMOS latches with
active output drive in both directions. Local feedback in
the latch provides indefinite retention of the information.
The clocked P-Gates and N-Gates are precharged to their
appropriate levels during the precharge or stand-by
period, a period which can last an indefinite
time since the state information is stored in static
latches. N-Gates drive P-Gates and vice-versa. Similar
gates never drive each other, e.g., N-Gates never drive
N-Gates. One observes that the clocked gate inputs, if
driven from an opposite type, are precharged to the
condition which turns the logic devices off. Consequently,
no complete conduction path can exist between the power
supply and ground in this precharged state, even if the
evaluation device is removed. The gates driven directly
from the state flip-flops, however, do not necessarily have
their logic devices turned off; consequently, evaluation
devices are required for these stages in order to achieve
near-zero stand-by power. In principle: a string of
clocked gates of alternating types can be precharged during
the stand-by period; the precharge clocks can be removed;
the evaluation signals can be applied to the state
flip-flop driven stages; logic signals can then propagate
through the string in a ripple-through manner limited only
by the transition times of the individual stages.
FIGURE 1 - CLOCKED CMOS CONCEPT
(Figure notes: N drives P and P drives N. In stand-by,
information is stored in static flip-flops and the clocked
gates are precharged. Evaluation takes place in one clock
period in a ripple-through manner.)
B. Observations
Timing signals are required for the precharge
(stand-by) phase, the logic evaluation phase, and the latch
enable phase. No additional timing clocks are required,
unlike the case of clocked, non-ratioed, single-polarity
circuitry with multiple logic phases. The ideal timing
could be achieved with a single system clock. For example,
the system clock could represent the stand-by phase when
the clock is low; when the clock goes high, the evaluation
phase begins and continues until the clock returns to the
low state, which enables the flip-flops (edge triggered).
Dynamic information is used during the evaluation phase;
consequently, there is an upper limit on the pulse width
for the system clock of about 1 msec.
The propagation delay for the ripple-through
clocked CMOS should be shorter than for classical CMOS or
clocked single-polarity circuitry. The discharge delays
are achieved through common-source configured FETs, the
fastest type, with only single transistor loading per
fan-out. Each transition delay begins when the input
signal reaches the device threshold instead of the circuit
threshold of ratioed or classical CMOS circuitry. There is
no added delay associated with additional evaluation clocks
as in the case of single-polarity clocked non-ratioed
circuitry.
Classical CMOS gates or static gates can be used in
this class of clocked gates; however, the precharged levels
must be compatible as before. This may be beneficial since the
classic gate supplies active drive from both directions,
although compromising input loading.
FIGURE 2 - CLOCKED CMOS OBSERVATIONS
(Figure notes: Static circuits can be introduced; however,
the precharged states must be compatible (low stand-by
power). For example, the P-Gate could be static: its inputs
are precharged high by the N-Gate outputs, its output will
be precharged low, and its output can only drive N-Gates.
Adding extra clocks provides more flexibility at the
sacrifice of asynchronous behavior.)
It should be noted that the output of any stage can
drive any "downstream" stage of the opposite type. As an
example, let the first stage from the flip-flops be stage 1
and of N-type. The P-Gates then fall on the even-numbered
stages and any of them can be driven from stage 1.
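The stage rule above reduces to a parity check. A minimal sketch, assuming stages are numbered from 1 and alternate strictly between the two gate types; the helper names `gate_type` and `can_drive` are hypothetical.

```python
def gate_type(stage, first_type="N"):
    """Gate type of a stage when types alternate starting at stage 1."""
    other_type = "P" if first_type == "N" else "N"
    return first_type if stage % 2 == 1 else other_type


def can_drive(source, target, first_type="N"):
    """A stage may drive any downstream stage of the opposite type."""
    downstream = target > source
    opposite = gate_type(source, first_type) != gate_type(target, first_type)
    return downstream and opposite
```

For instance, `[t for t in range(1, 8) if can_drive(1, t)]` lists the even stages 2, 4 and 6, which are the P-Gates reachable from an N-type stage 1.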
Extra evaluation clocks can be added for more
flexibility although asynchronous behavior is sacrificed.
The added flexibility may be desired in order to achieve a
more general form of logic function. Driving an N-Gate
with another N-Gate is an example. This subject is dealt
with in more detail in the section implementing the PLA
logic function.
Figure 3 illustrates a ripple realization of the
AND-OR function needed in PLAs. The first stages are
N-Gates, with series NMOS to provide the NAND function. The
second stages are P-Gates, with paralleled PMOS also to
provide the NAND. The NAND-NAND is one implementation of the
AND-OR. The output of the second stage is precharged low
and conditionally charged high during the evaluation phase.
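The NAND-NAND realization can be verified exhaustively at the logic level. The sketch below assumes three two-input product terms (AB, CD, EF) purely for illustration; the function names are hypothetical and the model ignores timing.

```python
from itertools import product


def n_gate_series(*inputs):
    """First stage: precharged high, discharged only if every series
    NMOS device is on (all inputs high), i.e. the NAND function."""
    return 0 if all(inputs) else 1


def p_gate_parallel(*inputs):
    """Second stage: precharged low, charged high if any parallel PMOS
    device is on (any input low), i.e. also the NAND function."""
    return 1 if any(x == 0 for x in inputs) else 0


def and_or(a, b, c, d, e, f):
    """Two-stage ripple evaluation of AB + CD + EF."""
    return p_gate_parallel(
        n_gate_series(a, b), n_gate_series(c, d), n_gate_series(e, f)
    )


# Exhaustive check against the AND-OR expression.
matches = all(
    and_or(a, b, c, d, e, f) == ((a & b) | (c & d) | (e & f))
    for a, b, c, d, e, f in product((0, 1), repeat=6)
)
```

Note that when all first-stage outputs sit at their precharged high level, `p_gate_parallel(1, 1, 1)` is 0: the second stage has no conducting device and stays at its own precharged level, which is the stand-by condition described earlier.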
One notes that an entire NAND-OR function can be
implemented in the first N-Gate stage by paralleling
(providing the OR function) additional discharge paths to
ground through series devices (providing the AND function).
A consequence of this two-level gate is that the implicants
(the AND terms) cannot be shared with other OR gates.
Another consideration is the comparative delay associated
with series-connected gates having extended fan-in (F/I).
Large PLAs require large F/I and, preferably, the
implicants can be shared. As a consequence, parallel
configured gates are explored next.
C. Basic Types of Parallel-Connected CMOS Gates
Figure 4 illustrates five different versions of
clocked CMOS gates which utilize parallel-connected NMOS
logic devices. The classic CMOS gate is also included for
illustrative purposes. There are two families of clocked
CMOS: the ripple variety without the evaluation device and
the gated variety which responds to an evaluation signal.
Dual versions exist in the P-Gate type, providing a total
of ten versions of clocked CMOS gates using
parallel-connected logic devices.
The first of the ripple varieties utilizes
common-source logic devices and is labeled NRi, which
indicates N-Gate, Ripple, and inverting output. Assuming
the N-Gate type, the logic function is NOR; the inputs are
precharged low; and the outputs are precharged high. This
gate implementation provides the fastest response of all
the versions.
FIGURE 3 - AND-OR WITH RIPPLE CMOS
(Figure notes: The N-Gates of stage 1 drive the P-Gate of
stage 2, whose output is AB + CD + EF. The evaluation device
in the P-Gate can be removed if all of the inputs are
precharged high by N-Gates. The logic gates "downstream" can
only discharge in one direction. There cannot be a race
condition. Logic decisions are propagated asynchronously
during the evaluation time. The N-Gate at stage 1 can drive
P-Gates at stages 2, 4, 6, etc. Observe: the series N-Gate
gives the NAND function; the parallel P-Gate gives the NAND
function. A parallel-series combination at one stage also
provides AND-OR functions.)
FIGURE 4 - BASIC TYPES OF CMOS GATES (with parallel N logic devices)
(Figure notes: Gate types shown are classical or static;
NRi, ripple inverting; NRn, ripple non-inverting; NGa, gated
(a) type, "active load"; NGb, gated (b) type, "bootstrap";
NGc, gated (c) type, requires large capacitance load.)
The second of the ripple varieties utilizes
source-follower configured logic devices, which results in
a non-inverted output, and is labeled NRn. For the N-Gate
the inputs and outputs are precharged low; the logic
function is OR. Since the output voltage swing is not all
the way to the supply potential (it is limited to a
threshold voltage below the rail), this version is generally
not useful and is only used in special circumstances. Since
the body effect on the threshold voltage lowers the output
swing even further, this version is especially dubious for
logic devices fabricated in "wells", such as NMOS in a
P-well process. This variety is approximately an order of
magnitude slower than the inverting version, even to
reach its limited output swing. As an example, for a
P-well process and with a 5 volt supply, the output swing
is approximately from ground to 2.5 - 3 volts. Requiring
the output to swing only over a limited range, near ground
for the N-Gate, improves its usefulness.
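The limited swing of the source-follower version can be estimated with the standard first-order body-effect expression, V_T = V_T0 + gamma (sqrt(2 phi_F + V_SB) - sqrt(2 phi_F)). The sketch below is illustrative only: the device parameters (`vt0`, `gamma`, `two_phi`) are assumed values chosen to represent an NMOS device in a well, not data from the report, and the function names are hypothetical.

```python
import math


def threshold(v_sb, vt0=1.0, gamma=1.2, two_phi=0.6):
    """Threshold voltage with body effect for source-to-body bias v_sb."""
    return vt0 + gamma * (math.sqrt(two_phi + v_sb) - math.sqrt(two_phi))


def follower_high_level(vdd=5.0, **device):
    """Highest output level of a source follower with its gate at vdd.

    The output rises until v_out + V_T(v_out) = vdd; the balance point
    is found by bisection because V_T itself depends on v_out.
    """
    low, high = 0.0, vdd
    for _ in range(60):
        mid = (low + high) / 2
        if mid + threshold(mid, **device) < vdd:
            low = mid
        else:
            high = mid
    return (low + high) / 2
```

With these assumed parameters and a 5 volt supply the computed high level falls inside the 2.5 - 3 volt range quoted above; without body effect (`gamma=0`) it would be one fixed threshold below the rail.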
The first and third of the gated versions utilize
common-source devices and provide the NOR logic function.
Both provide full rail-to-rail signal swings; the outputs
are precharged high; and the inputs need not be precharged.
The difference between the two relates to the location of
the evaluation device.
The label for the NGa version refers to the gated
variety and the fact that its inputs must be driven from an
"active" sourcing output to prevent charge splitting. For
example, as illustrated in Figure 5, the outputs driving
this gate must be able to supply current from the positive
supply for a "1" input when the evaluation device is turned
on. During precharge the output of the NGa gate is
precharged high. One input is presumed to be precharged
high, resulting in the (common) sources of the logic
devices being precharged high to within one threshold below
the supply voltage. The channel is only weakly induced in
this case. When the evaluation device is turned on, the
source node is pulled to ground with coupling through the
full gate capacitance to the input nodes. Note that if
there is no active source for charging this coupling
capacitance, then the input signal level is pulled toward
ground, reducing the effective drive of the input. The
amount of this signal loss depends on the effective node
capacitance of the input and the coupling capacitance of
the gate. A precharged output for a simple inverter with a
large fan-out requirement would lose significant signal. A
classic inverter or a P-Gate can supply the necessary
current to provide the full input drive voltage. One also
notes that the evaluation device can be shared with other
NGa gates.
The third version of the gated variety, the NGc
version, has its evaluation device between the output node
and the common drains of the logic devices. The "c" refers
to the output requirement of driving a large capacitance
load when the fan-in is large to minimize charge splitting.
When the precharge phase is present, the output is
precharged high and the common drains of the logic devices
are presumed to be discharged low. If the inputs are
subsequently discharged to ground and then the evaluation
device is turned on, only the output capacitance is
available to supply charge in order to charge the common
drain node. The resulting charge splitting may reduce the
output signal to unacceptable levels. For example, if the
output is driving a simple inverter (minimum fan-out) and
the fan-in is large, then the effective capacitance on the
common drain node may well dominate the load capacitance,
leading to effectively no "1" output.
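The charge-splitting concern can be put in numbers with simple charge conservation between two capacitors. This is a first-order sketch: it assumes the output capacitance starts at the supply voltage, the common drain node starts at ground, and the evaluation device passes the full level. The function name and the capacitance values are hypothetical (arbitrary units).

```python
def shared_voltage(vdd, c_load, c_drain):
    """Output level after the precharged load capacitance shares its
    charge with the initially discharged common drain node."""
    return vdd * c_load / (c_load + c_drain)


# Minimum fan-out with large fan-in: the drain node dominates.
weak_one = shared_voltage(5.0, c_load=1.0, c_drain=8.0)
# A large capacitive load keeps most of the precharged level.
strong_one = shared_voltage(5.0, c_load=20.0, c_drain=8.0)
```

In the first case the "1" level collapses to a small fraction of the supply, while in the second most of the level survives, which is why this version calls for a large capacitance load.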
Another version of the gated varieties of clocked
gates is labeled NGb for "bootstrap" from the nature of the
circuit action. The sources are tied to the evaluation
clock which is at ground during precharge. The drains of
the logic and load devices are connected to the output.
Since the precharge clock is high during precharge, the
output is discharged low during precharge through the load.
The input signals are fed through the series devices during
precharge to the gates of the logic devices. At the end of
the precharge phase the series sampling devices are turned
off, leaving the logic device gates charged to some state.
If all of the inputs are low, then no channel exists
between the evaluation clock and the output; consequently,
very little coupling exists between the evaluation clock
and the logic gates. When the evaluation clock goes high,
the output remains low, at ground. On the other hand, if
any one of the gates is charged when the evaluation clock
goes high, there is a strong coupling between the
evaluation clock and the input through the gate capacitance
which bootstraps the gate up with the rising evaluation
clock. The channel of the high input remains fully
conductive, and sources current from the evaluation clock
to the output to charge it high, even to the supply
voltage. If any one of the inputs is sampled high prior to
the evaluation clock, the output is subsequently driven
high, generating the OR function. The input sampling
devices provide isolation between the charged gates of the
logic devices and the outputs of the previous stages;
however, they may constitute an unacceptable addition to
the circuitry. If they are deleted and the inputs are
driven from PMOS devices, then the bootstrap action drives
the drain junction of the PMOS into forward bias with its
associated injection of holes, a sure method to lead to
SCR action and latch-up. Consequently, this version of
gated CMOS is likely to have limited utility in bulk
processes. The methods to provide isolation to prevent
latch-up, e.g., substrate bias or the inclusion of the
series sampling devices, are likely to result in an
impractical gate implementation. For fundamentally
isolated structures such as SOS or laser-annealed poly over
oxide, the bootstrap gate offers future consideration
because of its non-inverting logic functions with
reasonable speed.
FIGURE 5 - CHARGE SPLITTING CONCERNS
FIGURE 6 - CLOCKED CMOS GATE
(Figure notes: During precharge the output is discharged
low and the input signals are transferred to the logic
device gates; the transfer devices are turned off after
precharge. The evaluation phase then goes high. If all of
the inputs were low, there is little capacitive coupling to
the storage node. If an input is high, the output is driven
high with bootstrap action.)
With these possible candidates for logic gates the
question remains: Which of the many forms represents the
best implementation for PLAs using the CMOS technology?
III. Programmable Logic Arrays
A. The PLA General Function
The typical block diagram of a PLA implemented in
NMOS is presented in Figure 7. Latches with sampling at
phase 1 provide the complementary signals and temporary
storage during the evaluation period. The AND plane, as
illustrated, consists of ratioed NMOS in NOR gate form. The
OR plane also consists of ratioed NOR gates. One notes that
complementary signals are used as inputs and outputs. The
output of the OR plane feeds the output inverting buffers
(latches) which are enabled with phase 2. Since ratioed
logic is used, the evaluation period overlaps phase 1 and
phase 2.
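As a rough behavioural illustration of the NOR-NOR structure
described above, the following Python sketch evaluates a small
PLA. The function names (nor, evaluate_pla), the list-based
personality format and the exclusive-OR example are assumptions
made for illustration only; they do not come from this report,
and the sketch ignores all timing and electrical behaviour.

```python
def nor(bits):
    """Idealized NOR gate: output is 1 only when every input is 0."""
    return 0 if any(bits) else 1


def evaluate_pla(inputs, and_plane, or_plane):
    """Evaluate a NOR-NOR PLA (hypothetical personality format).

    inputs    -- list of 0/1 input values
    and_plane -- one row per product term; each row is a list of
                 (input_index, wanted_value) pairs
    or_plane  -- one row per output; each row lists product-term indices
    """
    # The input latches supply both the true and complement signals.
    true_sig = list(inputs)
    comp_sig = [1 - b for b in inputs]

    # AND plane built from NOR gates: a term is high only when every
    # connected line is low, so each literal is wired to the polarity
    # that is LOW when the literal is satisfied.
    terms = []
    for row in and_plane:
        lines = [comp_sig[i] if want == 1 else true_sig[i] for i, want in row]
        terms.append(nor(lines))

    # The OR plane is also a NOR; the inverting output buffer
    # restores the polarity.
    return [1 - nor([terms[t] for t in row]) for row in or_plane]


# Assumed example personality: exclusive-OR of two inputs.
XOR_AND_PLANE = [[(0, 1), (1, 0)], [(0, 0), (1, 1)]]
XOR_OR_PLANE = [[0, 1]]
```

For instance, evaluate_pla([1, 0], XOR_AND_PLANE, XOR_OR_PLANE)
returns [1] in this model.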
B. Clocked CMOS PLA Timing Concept
A master clock is hypothesized which controls four
events, although other clocks may be required and derived
from the master.
1) Static CMOS latches are used with data clocked
in at the trailing edge of the master clock.
2) The PLA is precharged in the stand-by state
(clock low) to facilitate fast logic evaluation.
3) At the rising edge of the master clock the
"state" and control inputs are gated into the PLA.
4) The signals "ripple" through the AND-OR planes
while the clock is high.
Another important assumption is that the fan-in
requirements are large, which implies parallel gates.
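The four events above can be sketched as a simple mapping from
master clock transitions to PLA activity. This is only an
idealized illustration: the sampled-clock representation, the
function name master_clock_events and the event labels are
assumptions, and no derived clocks are modelled.

```python
# Event labels are hypothetical names for the four events in the text.
PRECHARGE = "precharge PLA (stand-by, clock low)"
GATE_INPUTS = "gate state and control inputs into PLA (rising edge)"
EVALUATE = "signals ripple through AND-OR planes (clock high)"
LATCH = "clock data into static latches (trailing edge)"


def master_clock_events(samples):
    """Map consecutive samples of the master clock (0 or 1) to events."""
    events = []
    for prev, now in zip(samples, samples[1:]):
        if prev == 0 and now == 1:
            events.append(GATE_INPUTS)   # rising edge
        elif prev == 1 and now == 1:
            events.append(EVALUATE)      # clock held high
        elif prev == 1 and now == 0:
            events.append(LATCH)         # trailing edge
        else:
            events.append(PRECHARGE)     # clock held low
    return events
```

Calling master_clock_events([0, 0, 1, 1, 0, 0]) walks through one
full cycle: precharge, input gating, evaluation, latching, and a
return to precharge.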
C. Ripple AND-OR Planes with Parallel Devices
Figure 8 illustrates the sixteen possible basic
functional combinations which provide the equivalent AND-OR
function. The task is to select from the possible twenty
clocked, parallel CMOS gates which can be used in these
sixteen basic functional combinations. The OR plane is
assumed to be of the ripple variety; what are the
possibilities? Without regard to the polarity of the
output, which can be handled in the latches, four basic
combinations are possible:
1) NOR    2) OR    3) NOT-NAND    4) NOT-AND
From the ripple versions for these functions one observes
that all require that their inputs be precharged low and
conditionally charged high during evaluation. How can the
AND plane be implemented from either the gated or ripple
varieties? This time the polarity of the inputs is of no
concern. One finds the following possibilities:
1) NAND-NOT    2) AND    3) NOR    4) OR-NOT
One observes that all "AND" outputs must be precharged high
and then be conditionally discharged low, in contradiction
to the OR plane input requirements. The assumed objectives
are not possible.
It should be no surprise that the same conclusion
is reached if an OR-AND function is sought. All possible
outputs for the OR plane are required to be precharged low
-- in contradiction to the possible inputs for the AND
plane. Hence, the conclusion is that no possible
configuration of parallel devices can be realized in a
non-ratioed ripple-through form. The alternatives are to
consider:
1) Series forms    2) Ratioed forms    3) Gating the OR plane
The last appears to best satisfy the requirements of high
performance for large arrays at low power.
FIGURE 7 - PLA USING NMOS
FIGURE 8 - AND/OR COMBINATIONS
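The incompatibility argument for the AND-OR case can be checked
mechanically. The sketch below simply tabulates the precharge
levels stated in the text and searches for a matching pair; the
dictionary names and the function compatible_pairs are
assumptions introduced for illustration.

```python
# Required precharge level on the INPUTS of each ripple OR-plane form
# (per the text, all must be precharged low).
OR_PLANE_INPUT_PRECHARGE = {
    "NOR": "low",
    "OR": "low",
    "NOT-NAND": "low",
    "NOT-AND": "low",
}

# Precharge level of the OUTPUTS of each parallel AND-plane form
# (per the text, all are precharged high).
AND_PLANE_OUTPUT_PRECHARGE = {
    "NAND-NOT": "high",
    "AND": "high",
    "NOR": "high",
    "OR-NOT": "high",
}


def compatible_pairs(and_outputs, or_inputs):
    """Return (AND form, OR form) pairs whose precharge levels agree."""
    return [
        (and_form, or_form)
        for and_form, and_level in and_outputs.items()
        for or_form, or_level in or_inputs.items()
        if and_level == or_level
    ]
```

With the levels as tabulated, compatible_pairs returns an empty
list, which restates the conclusion that no non-ratioed
ripple-through pairing of parallel gates exists.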
D. Gated AND-OR with Parallel Devices
Of the sixteen possible combinations twelve can be
eliminated if the bootstrap gates are disallowed due to
latch-up and the source-follower configurations are
disallowed due to speed arguments. The four remaining
follow:
1) NAND-NOT-NOT-NAND    2) NOR-NOR
3) NAND-NOT-NOR         4) NOR-NOT-NAND
The first has too many levels of logic to be practical.
The NOR-NOR version will result in unacceptable charge
splitting. Consider that the OR plane is lightly loaded
since it drives simply latches. This implies that the OR
plane NOR must not be of the NGc version. The remaining
version NGa requires an active source from the previous
stage, not achievable with clocked N-Gates. As a result
only the last two combinations need further consideration.
One notes that the candidates are really the AOI and OAI
functions, the duals of each other. This duality allows
one to develop a design in either version and translate
that design into the other by using the duality principle.
One also observes that each contains an NMOS plane and a
PMOS plane.
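The duality between the two surviving candidates can be verified
by exhaustive truth-table comparison. The following sketch
treats the gates as ideal Boolean functions; the function names
and the grouping of inputs into terms are assumptions made for
illustration.

```python
from itertools import product


def nand(bits):
    return 0 if all(bits) else 1


def nor(bits):
    return 0 if any(bits) else 1


def inv(bit):
    return 1 - bit


def nand_not_nor(groups):
    """NAND-NOT forms each AND term; the NOR combines them (AOI)."""
    return nor([inv(nand(g)) for g in groups])


def nor_not_nand(groups):
    """NOR-NOT forms each OR term; the NAND combines them (OAI)."""
    return nand([inv(nor(g)) for g in groups])


def duality_holds(group_sizes=(2, 2)):
    """Check OAI(x) == NOT AOI(NOT x) for every input combination."""
    for bits in product((0, 1), repeat=sum(group_sizes)):
        groups, comp_groups, start = [], [], 0
        for size in group_sizes:
            chunk = bits[start:start + size]
            groups.append(chunk)
            comp_groups.append(tuple(1 - b for b in chunk))
            start += size
        if nor_not_nand(groups) != inv(nand_not_nor(comp_groups)):
            return False
    return True
```

Complementing every input and the output converts one form into
the other, which is the translation step the duality principle
relies on.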
Consider the NAND-NOT-NOR version for further
study. The second level of gating can occur at either the
inverter or the N-Gate. Both gates which make up the logic
planes are driving minimum capacitance and have large
fan-in, thus requiring the "a" gate versions. The
resulting choices are then reduced to the following:
(1) (PGa or PRi)-(classic inverter)-NGa
(2) (PGa or PRi)-PGc-NRi
The gated inverter in the second version is assumed to be
limited to the "c" version because no active drive is
available from the previous stage. The PGc gate requires a
large output load capacitance. If this is not the case in
the minimum fan-out of one, then unacceptable charge
splitting may result. Possibly in some designs the
resulting charge splitting is less using the PGa inverter
since the output of the first stage is likely to have a
high output capacitance. These compromises in charge
splitting relative to particular designs and programming
lead to the conclusion that the classic inverter is a far
safer approach in spite of its possibly slower performance.
For this reason the classic inverter approach has been
selected for this design effort. Figure 9 illustrates this
approach schematically.
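The sensitivity of charge splitting to output load capacitance
can be illustrated with the usual charge-conservation estimate.
This first-order model, the function name shared_voltage, and
the numeric values in the usage note are assumptions for
illustration; the report itself gives no formula or values.

```python
def shared_voltage(v_supply, c_output, c_internal, v_internal=0.0):
    """Voltage after a precharged output node shares charge.

    Assumes the output node (c_output) starts at v_supply, the
    internal node (c_internal) starts at v_internal, and the two
    are then connected with no other charge path.
    """
    total = c_output + c_internal
    if total <= 0:
        raise ValueError("total capacitance must be positive")
    # Charge conservation: Q = C_out * V_supply + C_int * V_int.
    return (c_output * v_supply + c_internal * v_internal) / total
```

Under this model, with an assumed 5 V supply, an output load
nine times the internal capacitance retains 4.5 V, whereas
equal capacitances leave only 2.5 V, which suggests why a small
output load could be unacceptable for the "c" version.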
E. Input Partitioning
ANDing together two signals before entering the AND
plane is sometimes used as a means of compacting the
design. This partitioning is considered as a programming
option. The inputs to the PLA are assumed to run globally
around the chip and be subject to noise. Adding a buffer
inverter improves the signal level for sampling which
occurs during the precharge state. The information is
stored as charge on capacitance (gate) during the
evaluation state. Double inverters provide the necessary
active drive for the true and complement signals. The
input partitioning can be achieved with NOR gates
implemented as NGa versions which supply the evaluation
gating for the AND plane of the PLA. The AND plane can then
be implemented with ripple gates PRi for higher
performance. This approach plus alternates are illustrated
in Figure 10. In some designs where the input capacitance
to the AND plane is assured to be sufficiently large, NGc
gates can be used with the associated elimination of one
input inverter.
FIGURE 9 - DYNAMIC CMOS PLA
F. Latches
Figure 11 illustrates various versions of latches.
The switch notation with an N, P, or C indicates a
transmission gate implemented with an NMOS transistor, a
PMOS transistor, or both as in classic CMOS. The classic
CMOS transmission gate utilizes the common-source operation
for both devices to provide rail-to-rail coupling whereas
the single polarity types suffer a threshold drop in signal
when coupling as source followers.
The classic NMOS version of a latch is indicated
for reference. If the CMOS version is implemented with
CMOS transmission gates, active drive in both directions is
required as an input. If the input sampling device
represents one of several multiplexing switches, then two
sampling clocks for each input are required. By adjusting
the gain of the first inverter and utilizing only the
transistor not fabricated in the "well", one of the
complement sampling clocks can be eliminated.
The common CMOS sample-and-hold circuit, sometimes
referred to as the "H" latch, is also illustrated. This
version also suffers in not providing symmetric drive from
the two outputs. A major advantage of this type, however,
is that the input impedance is only capacitance. The input
signal can be charge on this capacitance at the time of
enabling the latch, unlike the first versions.
A simplified version of the "H" latch is illustrated
which utilizes the fact that the input is precharged high.
At phase 2 the input is sampled and enables the latch. The
feedback path is illustrated as a high impedance path in
order to minimize the clocks required. The drive from the
input sampling inverter must overpower the feedback in
this case. After the signal propagates through the two
inverters, the full logic levels are generated with near
zero stand-by power achieved. If only temporary storage of
information is required, then the feedback is not required
at all.
A dual input form of a latch is also shown. This
version requires sufficient ratios of the pull-down devices
compared with the internal PMOS devices such that the
internal latch can be upset. As before, after the input
signal has propagated through the internal latch, the logic
levels are fully established and near zero stand-by power
is achieved. Output drive buffers are included to supply
adequate source current. There is an advantage of this
configuration: since the set and reset transistors are
turned off during the precharge (stand-by) state, only a
low-to-high signal from the OR plane is required for
setting or resetting, without any clock timing signal
required for enabling. The disadvantage is that two outputs
from the OR plane are required for each flip-flop -- even
though these involve less complex functions.
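The set/reset behaviour of the dual input latch can be sketched
behaviourally as follows. The class name DualInputLatch, the
step method, and the decision to reject simultaneous set and
reset are assumptions for illustration; device ratios and
output buffers are not modelled.

```python
class DualInputLatch:
    """Behavioural model of a latch with separate set and reset inputs."""

    def __init__(self, q=0):
        self.q = q

    def step(self, set_in, reset_in):
        """Apply one pair of OR-plane outputs and return the state.

        During precharge both inputs are low, so the state is held
        without any enabling clock. A low-to-high signal on either
        input upsets the internal latch.
        """
        if set_in and reset_in:
            # Assumed to be excluded by the programmed logic.
            raise ValueError("set and reset asserted together")
        if set_in:
            self.q = 1
        elif reset_in:
            self.q = 0
        return self.q
```

Stepping with (0, 0) models the stand-by state, (1, 0) sets the
latch and (0, 1) resets it, which reflects why two OR plane
outputs are needed per flip-flop.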
G. PLA with Single-Ended Latches
Figure 12 illustrates a clocked CMOS PLA with a
single-ended latch. It uses the PRi-classic inverter-NGa
version. It also uses an extra clock with which to gate
the latch in order to prevent the glitch in the latch
output which would occur if the latch is enabled at the
same time as the OR plane. Since the outputs are isolated
from the PLA inputs during this time by the input sampling
devices, not shown, this glitch does not constitute a logic
error to the PLA. The system requirements may not allow
it, however. If the glitch is not harmful then phase 1'
can be merged with phase 2, saving one clock.
H. PLA with Dual Input Latches
Figure 13 illustrates the PLA/SLA with dual input
flip-flops. In order to achieve good ratios in the latch
the set transistors are selected as NMOS with their inputs
precharged low. This requires that their inputs are driven
by PGa gates. This implies the dual version of the PLA,
i.e., the OR plane is implemented in PMOS, and the AND
plane is implemented in NMOS. One additional timing event
is required:	 the beginning of the stand-by or precharge
state.	 A one shot type action is required with the action
triggered by the falling edge of the master clock.
IV. STAR Layouts With Dual Input Latches
Figure 14 illustrates a possible chip floor plan
for a SLA using the attached cell designs which are based
on a STAR type foundation. The design shows the output
buffers around the chip with an interconnection bus routing
signals around the chip just outside of the buffers.
Internally in the chip is the folded array with the AND
plane made up of parallel NMOS devices connected along
columns (vertical). The inputs feed in along rows from the
outside through the output buffers and the latches to the
input circuitry. Along each row two signals with their
complements are routed into the AND plane. The implicants
are generated from columns of NMOS devices with their
outputs feeding into two dedicated rows which provide the
inverters and the "load" devices for the implicants. The
outputs of the inverters are fed back down the column to
serve as inputs to the horizontal rows of parallel
connected PMOS serving as the OR plane. The outputs of the
OR plane run horizontally back to the set and reset
transistors of the latches. Miscellaneous row oriented
devices are grouped into the NCOM and PCOM cells between
the input circuits and the array itself. It should be
noted that the actual boundaries of the various gates used
in the AND and OR planes can be moved to fit the individual
requirements for the logic in question. This is possible
since the arrays are back-to-back (merged).
The detailed cell designs are included in the
following figures.
FIGURE 12 - DYNAMIC CMOS PLA (with alternate timing for no output glitch)
FIGURE 13 - PLA/SLA WITH DUAL INPUT F/F
FIGURE 14 - CHIP FLOORPLAN
FIGURE 15 - LEGEND (first metal; second metal; transistor; diffusion contact; first metal to second metal contact)
FIGURE 16 - CHIP FLOORPLAN DETAIL
FIGURE 21 - NCOM AND PCOM
FIGURE 22 - FA AND FA'
FIGURE 23 - AND/OR PLANE DETAIL
REFERENCES
1. S. S. Patil and T. A. Welch, "A Programmable Logic
Approach for VLSI", IEEE Transactions on Computers,
Vol. C-28, No. 9, September 1979, pp. 594-601.
2. R. A. Wood, "A High Density Programmable Logic Array
Chip", IEEE Transactions on Computers, Vol. C-28,
No. 9, September 1979, pp. 602-608.
