PLA Design in NAND Structure by Lin, Chong Ming
PLA Design in NAND Structure 
Chong Ming Lin 
Semiconductor Engineering Group 
Digital Equipment Corporation 
75 Reed Road 
Hudson, MA 01749 (617) 568-4888 
ABSTRACT--A NAND (serial gating) structure PLA in a MOS polysilicon-gate
process has been developed for high-density, moderately fast VLSI
applications. Dynamic clocking is used for minimum power dissipation and
elimination of the ratio problem associated with the static NAND gate.
Ion implantation for memory cell programming and the elimination of contacts in
the memory area drastically reduce the cell size and improve reliability.
A simple but effective self-timed clocking scheme is employed for better
operating margins against process variations; the overhead chip area for the
clock generation is small. The advantages of allowing metal signal
and power lines to cross the PLA memory area are discussed. Some measured data
from a 3.5 µm NMOS Si-gate process with regard to gate height and transistor
sizes are also described.
INTRODUCTION 
In MOS circuit design, the NAND circuit, due to its inherent electrical
characteristics, has been restricted to applications with only a limited number
of inputs. In the past, ion-implant-programmed NAND structures had been used
only for very slow ROM applications [1] [2]. However, with the fast progress
in process technology, a properly designed NAND structure PLA is becoming more
attractive for some existing applications. A new product with a new
structure in a new process is always impressive for its performance, but its
cost effectiveness is not guaranteed at the production level. This question was
better stated by G. Moore in his lecture at the 1st Caltech VLSI Conference,
January 1980:
"... the semiconductor industry is not now process-technology limited for
non-memory product. How to best make use of the processing technology is
really what the problem is."
CALTECH CONFERENCE ON VLSI, January 1981 
The experiment discussed in this report was done primarily as an answer to a
request, in November 1979, for a high-density, moderately fast PLA design.
In order to make the best use of a then newly developed 3.5 µm process and to
fully utilize the given timing spec, a NAND structure PLA was
proposed for better performance which was also cost effective for an existing
application and beyond.
In order to achieve overall low chip size and a high operating margin, a
dynamic, self-timed clocking scheme was proposed for most of the circuitry.
Knowledge from measured data is used to construct circuits with better
performance than the original approaches used on a circuit test chip [3].
Since the NAND and the NOR structures are the two basic building blocks and
complementary to each other in MOS circuit design, the author feels that the
study is also useful, as an expected by-product, for understanding the general
NOR type PLA design.
In the following sections, process background, NAND circuit model, cell
layout, reliability and design consideration will be described. Further
improvement will also be discussed.
PROCESS BACKGROUND
Although the process performance is crucial in the evaluation of a new
circuit structure, this information was not available in the previous papers [1]
[2]. In order to test some of the assumptions and limitations of the NAND
structure PLA, a test chip was designed and manufactured in 1979 with a 3.5 µm
NMOS process, P400. This process uses ion implantation for source/drain, plasma
dry etching, and silicon-doped aluminum, plus 4 types of devices, as shown in
Table I, which provide more flexibility in circuit design and a better
power x speed x density product than many of the processes of the previous
generation.
The performance of the process was measured through a ring oscillator built
on the test chip. As shown in Fig. 1(c), the power x speed product curves
indicate that in a typical case, the performance of that ring oscillator is
around 0.35 pJ at 5 V VCC, -3 V VBB, and room temperature. Those curves also
indicate that the process parameters are being optimized against the skewed
process corners. With this kind of performance, the NAND PLA does have better
potential for some applications which were not feasible with processes of the
previous generations.
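The power x speed figure quoted above can be related to raw ring-oscillator measurements. The following is a minimal sketch, assuming the usual relation that each stage of an N-stage ring switches twice per oscillation period. The stage count of 19 is taken from the Fig. 1(a) caption; the period and total power are hypothetical values chosen only to make the arithmetic easy to follow, not measured data from the test chip.

```python
def stage_delay(period_s, stages):
    """Average delay per stage: one period contains two transitions per stage."""
    return period_s / (2 * stages)


def power_delay_product(total_power_w, period_s, stages):
    """Power x delay product per stage, in joules."""
    power_per_stage = total_power_w / stages
    return power_per_stage * stage_delay(period_s, stages)


if __name__ == "__main__":
    STAGES = 19          # stage count from the Fig. 1(a) caption
    period = 76e-9       # hypothetical oscillation period in seconds
    power = 3.325e-3     # hypothetical total oscillator power in watts
    td = stage_delay(period, STAGES)
    pdp = power_delay_product(power, period, STAGES)
    print(f"stage delay = {td * 1e9:.2f} ns")
    print(f"power x delay = {pdp * 1e12:.2f} pJ")
```

With real measurements, the period would be read from a waveform such as Fig. 1(b) and the power from the supply current at the stated VCC.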
INNOVATIVE CIRCUIT DESIGNS SESSION 
TABLE I
Device threshold voltages and ion-implantation definition of the P400 process.

Device         Boron   Arsenic   Vth (typical)
Enhancement    X       --         0.7 V
Intrinsic      --      --         0.0 V
Depletion-1    X       --        -1.2 V
Depletion-2    --      X         -2.5 V

Fig. 1(a) - Photomicrograph of a ring oscillator with 19 stages and fan-in and
fan-out of 1.
Fig. 1(b) - The waveform of the ring oscillator, measured at 23 C, 5 V VCC,
and -3 V VBB.
Fig. 1(c) - Performance of the ring oscillator at 23 C, 5 V VCC.
Fig. 2 - (caption illegible in the source scan)
TABLE II - Electrical properties of the NAND circuit compared with the NOR
circuit (table contents illegible in the source scan).
Fig. 3 - NAND circuit: (a) with static pull-up; (b) with dynamic pull-up.
Fig. 4 - Ion-implantation programming of the NAND memory bits.
Fig. 5 - PLA/ROM cell layouts (legible labels: poly word line, Al bit line,
GND).
Fig. 6 - Overlapping cases of the enhancement and depletion implants;
(c) checkerboard test pattern.
MODEL OF NAND CIRCUIT
As shown in Table II, the major reason that the NAND circuit has not been
used as widely as the NOR structure is its inferior electrical
properties in comparison with the NOR circuit [4]. The NAND circuit's d.c.
characteristics, such as its logic threshold and pull-down device channel width,
shown in Fig. 3(a), can be improved by using a dynamic pull-up device, shown in
Fig. 3(b). However, the delay time of a NAND circuit remains slow and is
proportional to its number of transistors in series. Even so, a NAND circuit
can use ion implantation as a programming method for its memory bits, as shown
in Fig. 4, and this programming approach leads to the smallest PLA/ROM cell
size that can be achieved by current MOS technology.
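The dependence of the delay on the number of series transistors can be illustrated with a first-order Elmore estimate. This is a sketch under stated assumptions, not the model used in the paper: every series device is replaced by the same on-resistance `r_on`, the precharged output node carries a lumped load `c_load`, and each internal node between two devices carries `c_node`. All names and values are hypothetical.

```python
def nand_discharge_delay(n_series, r_on, c_load, c_node):
    """Elmore time constant for discharging the output of an n-high NAND chain.

    The output load sees all n resistances in series; the internal node that
    sits k devices above ground sees k resistances (k = 1 .. n-1).
    """
    if n_series < 1:
        raise ValueError("a chain needs at least one transistor")
    load_term = n_series * r_on * c_load
    internal_term = r_on * c_node * n_series * (n_series - 1) / 2
    return load_term + internal_term


if __name__ == "__main__":
    R_ON, C_LOAD, C_NODE = 10e3, 100e-15, 10e-15   # hypothetical values
    for n in (1, 2, 4, 8, 16):
        tau = nand_discharge_delay(n, R_ON, C_LOAD, C_NODE)
        print(f"{n:2d} devices in series: tau = {tau * 1e9:.2f} ns")
```

When the output load dominates, the first term dominates and the delay grows linearly with the number of series devices, which matches the proportionality stated above; the internal-node term adds a faster-growing contribution for long chains.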
CELL LAYOUT 
For a conventional NOR structure PLA/ROM cell, as shown in Fig. 5(a), the
size of a cell is defined and limited by its components: word line (poly), bit
line (aluminum), contact between drain (N+) and bit line, memory transistor gate
area, drain, and source (N+). Furthermore, due to process requirements, a
minimum area and spacing for each element must be used in the implementation of
the memory cell. Thus, within the limitations of the present process technology,
selection and/or elimination of certain parts of the memory cell leads to
different structure variations and cell sizes in PLA/ROM cell design. Table III
shows the size ratio and the basic properties of the four major types of cells,
which are well known and use straightforward layout techniques. From Fig.
5, it is clear that the NAND structure PLA/ROM cell can be made as small as a
quarter of the size of a 'contact programming' cell, and the elimination of the
contact also enhances the circuit reliability.
The size of the NAND structure PLA/ROM cell can be made smaller if the
process can provide smaller poly and N+ lines without causing electrical
problems [3]. Beyond that, the alignment and resolution of the ion implant
process set the final limit on the minimum cell size of this approach.
When the NAND PLA/ROM cells are placed closer to each other, the enhancement
and depletion implants can overlap each other. Of the four possible
overlapping cases, one is fatal; the other three, although
not fatal, can all cause electrical problems, as shown in Fig. 6.
One way to test the process and equipment limitations is to use a
checkerboard test pattern, shown in Fig. 6(c). This test pattern, if properly
decoded, can indicate the safety margins left in a process for
implementing the NAND structure PLA/ROM. Conversely, a NAND structure
PLA/ROM is also a good test tool for process control monitoring, especially for
the definition control of ion implantation.
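The reason a checkerboard is a demanding pattern can be shown with a small sketch. It assumes each cell is programmed as either enhancement ('E') or depletion ('D') and counts which implant types meet at every boundary between neighbouring cells. The decoding procedure itself is not described in the text, so this only illustrates the pattern; the function names are hypothetical.

```python
def checkerboard(rows, cols):
    """Alternate enhancement ('E') and depletion ('D') cells in both directions."""
    return [["E" if (r + c) % 2 == 0 else "D" for c in range(cols)]
            for r in range(rows)]


def count_boundaries(pattern):
    """Count horizontally and vertically adjacent cell pairs by implant types."""
    counts = {"E|E": 0, "D|D": 0, "E|D": 0}
    rows, cols = len(pattern), len(pattern[0])
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((0, 1), (1, 0)):      # right and lower neighbour
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    a, b = pattern[r][c], pattern[rr][cc]
                    counts["E|D" if a != b else f"{a}|{a}"] += 1
    return counts


if __name__ == "__main__":
    board = checkerboard(4, 4)
    for row in board:
        print(" ".join(row))
    print(count_boundaries(board))
```

In this pattern every cell boundary separates an enhancement cell from a depletion cell, so any implant misalignment or spreading shows up at every boundary of the array.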
CIRCUIT DESIGN CONSIDERATIONS 
As shown in Table II, the NAND structure PLA with static pull-up would have
difficulties with ratio, pull-down device size, and slow discharge
against the constantly conducting pull-up device. In dynamic operation, the
ratio problem is eliminated, the pull-down device size is minimized, and the
discharge time is reduced. However, the generation and implementation of the
control clocks complicate the design and require extra silicon area. As a
result, in dynamic operation, the design effort and total implementation area
for a PLA are more than those of putting two ROM arrays together. Furthermore,
in NOR structure dynamic (or semi-dynamic) PLA design, an interface circuit is
needed to allow precharging of the two arrays at the same time during the early
part of the operation cycle. Consequently, a clock scheme of four phases
generated from the system is the most common approach, and the safest way is to
use four non-overlapping clock phases to execute the operation, at the expense
of slow throughput time and zero process tracking capability.
For the proposed NAND PLA, the elimination of the interface circuit between
the two arrays simplifies the layout work between the arrays and saves the area
of the interface circuit. Basically, a dynamic NAND circuit's access
time is limited by its precharge and discharge times. In the proposed NAND PLA,
the precharge time is significantly reduced by precharging from both ends,
because the RC constant of the serial channel is halved. The precharging devices
of the AND array are driven by a bootstrapped voltage level, which gives the AND
array the full VCC level and helps to speed up the discharge in the OR array.
The precharge operation can be further optimized by generating a longer pulse
width with normal VCC level for the OR array, because the OR array will be
enabled only after the AND array has settled. A high-beta-ratio input inverter
in the output register, with amplified positive feedback through an intrinsic
device, as shown in Fig. 8(g), accepts a weak logic "1" output from the OR array
at VCC-VTn with no difficulty in the initial sensing and final stored level.
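The benefit of precharging from both ends can be estimated with a first-order RC model. The sketch below is an illustration under assumptions, not the author's analysis: the serial channel is modelled as a uniform ladder of `n` nodes, each with capacitance `c`, joined by identical resistances `r`, and the Elmore (first-moment) delay of every node is obtained by solving the tridiagonal conductance equations. All names and values are hypothetical.

```python
def elmore_delays(n, r, c, both_ends):
    """Elmore delay at each of n ladder nodes driven from one end or both ends.

    Solves G * m = C, where G is the node conductance matrix with the
    precharge source(s) treated as ideal, using the Thomas algorithm.
    """
    g = 1.0 / r
    diag = [2.0 * g] * n
    if not both_ends:
        diag[-1] = g                 # far end is open when driven from one end
    off = -g                         # coupling between neighbouring nodes
    rhs = [c] * n
    for i in range(1, n):            # forward elimination
        w = off / diag[i - 1]
        diag[i] -= w * off
        rhs[i] -= w * rhs[i - 1]
    m = [0.0] * n
    m[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):   # back substitution
        m[i] = (rhs[i] - off * m[i + 1]) / diag[i]
    return m


if __name__ == "__main__":
    R, C = 10e3, 10e-15              # hypothetical per-segment values
    for n in (1, 3, 15):
        one = max(elmore_delays(n, R, C, both_ends=False))
        two = max(elmore_delays(n, R, C, both_ends=True))
        print(f"n={n:2d}: one end {one * 1e9:.3f} ns, both ends {two * 1e9:.3f} ns")
```

For a single node the worst-case time constant is exactly halved, which agrees with the lumped view stated above. Under this distributed assumption the slowest node moves to the middle of the chain, and the improvement for long chains is larger than a factor of two.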
CLOCKING SCHEME AND GENERATION 
The proposed NAND PLA uses five clock phases. Their waveforms are shown in
Fig. 7(a), where Cin, Cpr, Cena, and Ceno control the PLA
operation flow, while Cla latches the processed data
against precharge and against leakage during low-frequency operation.
Depending upon each design's system spec and circuit structure, Cla may be
omitted without affecting the performance.
TABLE III - Size ratio and basic properties of the four major types of PLA/ROM
cells, NOR and NAND structures (table contents illegible in the source scan).
Fig. 7(a) - Clock scheme and waveforms of the NAND PLA.
Fig. 8 - Circuits used in the NAND PLA: (a) input buffer; (b) cell of the AND
memory; (c) cell of the OR memory; (d) Cena generator and dummy input buffer;
(e) Ceno generator; (g) output register.
All these clocks are derived from the rising edge of the input system clock,
Tclk, and they are generated through dummy circuits which provide enough
tracking capability against process variations and power supply changes. To
achieve the best tracking, the dummy circuits use the same circuits or cells as
the arrays; a proper number of depletion devices is suggested for duplicating
the capacitive loading, and the device width effect [3] is another source
of extra operating margin.
CIRCUIT OPERATION: 
1. Array precharge--
Cpr is a self-timed clock pulse triggered by the rising edge of the system
clock, Tclk. As shown in Fig. 9, both Cpr and Cin are generated through a dummy
circuit which uses a row of the OR array to track the loading in the OR array
and a column of the AND array to track the completion of the precharge action.
During this period, both Cena and Ceno are '0'. This allows both arrays to be
fully charged to their highest possible level against couplings and the
so-called "charge sharing" problem [3].
2. Data input--
New input data should be ready at the beginning of a new cycle. The
input data are strobed into the input buffer through T1, see Fig. 8(a), and
temporarily stored at node CD after the Cin pulse is gone. During the
precharge time, T4 is turned 'off' by Cin. This guarantees that the completion
of the precharge is independent of the input data: whether T3 is 'on'
or 'off', there is no d.c. path through T2 to ground, due to the exclusive
waveforms of Cpr and Cin. The input buffer consists of only 4 transistors.
Transistor T2 is an intrinsic device, which gives a higher output voltage than
an enhancement device when Cpr is high but, with proper choice of channel
length, will not conduct much current when Cpr is low. This structure allows
an optimal output level and minimum device sizes for the pull-down devices T3
and T4, as well as for T2 itself, partly because the intrinsic device has the
highest mobility among the four types of devices available in this process.
Also, due to the compactness of the structure and the lack of an appreciable
d.c. path in this input buffer, the interface problem is eased and power
dissipation is greatly reduced.
3. Enabling of the arrays--
Once the arrays are fully charged, Cpr and then Cin go down to '0'. When
Cin is '1', T4 in the dummy input buffer, INdmbf, turns 'on', and the output of
the INdmbf, see Fig. 8(d), starts to change from a precharged '1' to '0'
because its input is VCC. With enough depletion-type capacitors on the output
and a Schmitt trigger to sense the change, Cena is ensured to turn 'on' only
after all the inverted input data have been transferred to their
outputs, that is, the word lines of the AND array.
Cena enables the AND array to decode its inputs through its pre-programmed
memory bits. Cena also enables a dummy circuit in the AND array to discharge
from its precharged level to '0' and thus generate the Ceno pulse, as shown in
Fig. 8(e), by the same principle as the Cena generation. With the start of
the Ceno clock pulse, the NAND PLA is ready to send out its decoded results to
the output register and outside buses.
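The decoding performed by the two enabled arrays can be modelled at the logic level. The sketch below assumes that each row is a series chain in which an enhancement cell ('E') is gated by its word line and a depletion cell ('D') always conducts, so a precharged row output falls to '0' only when every 'E' cell is turned on. The word-line ordering (true, then complement, per input) and all names are assumptions made for illustration; the actual polarities in the paper's circuits are not recoverable from the text.

```python
def nand_chain(program, gate_levels):
    """Evaluate one precharged series chain: 0 if it conducts, else 1."""
    conducts = all(cell == "D" or level == 1
                   for cell, level in zip(program, gate_levels))
    return 0 if conducts else 1


def nand_pla(and_plane, or_plane, inputs):
    """Two NAND planes in cascade, which realises a sum of products."""
    word_lines = []
    for x in inputs:                  # assumed order: x0, not x0, x1, not x1, ...
        word_lines += [x, 1 - x]
    terms = [nand_chain(row, word_lines) for row in and_plane]
    return [nand_chain(row, terms) for row in or_plane]


# Example programming: out = x0 AND (NOT x1)  OR  (NOT x0) AND x1, i.e. XOR.
XOR_AND_PLANE = [
    ["E", "D", "D", "E"],   # product term x0 * not(x1)
    ["D", "E", "E", "D"],   # product term not(x0) * x1
]
XOR_OR_PLANE = [["E", "E"]]  # the output combines both product terms

if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, nand_pla(XOR_AND_PLANE, XOR_OR_PLANE, [a, b]))
```

Because the second plane is a NAND of the first plane's NAND outputs, the cascade behaves as AND followed by OR, which is why no inverting interface circuit appears between the planes in this model.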
4. Output Register--
As shown in Fig. 8(g), the output buffer contains 1 transistors with 
static pull-ups used in the register to provide easy data storage through 
amplified positive feedback. The advantage of precharging the output data bus 
lines is incorporated into the circuit design to save power and size in the 
output section. Because of the precharge from T17, T15 can be made small as D1 
(light depletion) device for sustaining purpose. This also speeds up the 
discharge on the bus line if bout is a '0'. 
Even though the output section contains only a minimum number of
transistors, the layout work to interface the OR array cell pitch to the output
buffer is not an easy task. Techniques such as combining two buffers together,
bringing out outputs from both ends of the OR array, or constructing the buffers
at a distance and then connecting the two parts through spread-out N+, poly, or
metal lines are left to the designer's choice for the best match between the
NAND PLA and the system requirement.
5. Latch of the output register--
For a dynamic PLA without any sustaining pull-up device used in the 
arrays, maintaining Cena = Ceno = '1' to keep ORRDY at '0' is needed against 
noise and coupling. Furthermore, isolating the OR array precharged outputs from 
their output buffers is also crucial for operation at low frequency where 
leakage eventually will change a precharged '1' to a '0'. In this proposed NAND 
PLA, a simple but effective design is used, as shown in Fig. 7(b). The dummy 
circuit in the OR array generates a delayed "data is ready" signal, ORRDY, which 
tracks after the completion of the OR array transition through narrower device 
width transistors and/or a Schmitt trigger. The Cla clock is also controlled by 
a system latch signal, SYSLA, which happens only after the transition is over, 
see Fig. 7(b). 
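The five steps above form a chain in which each clock edge is produced by the completion of the previous dummy circuit. The sketch below models that chain and contrasts it with a fixed multi-phase scheme that must budget every phase for the slowest process corner. The stage descriptions paraphrase the text; every delay value and the corner scale factors are hypothetical and do not come from the paper.

```python
# Hypothetical dummy-circuit delays in ns; none of these values are measured.
NOMINAL_STAGES = [
    ("Cpr/Cin high: precharge arrays, strobe inputs", 0.0),
    ("Cpr/Cin low: dummy row/column report precharge done", 20.0),
    ("Cena high: dummy input buffer has settled", 5.0),
    ("Ceno high: dummy AND chain has discharged", 15.0),
    ("ORRDY: dummy OR chain has discharged, Cla may latch", 15.0),
]


def self_timed_schedule(stages, corner_scale=1.0):
    """Return [(event, time_ns)]; each event fires when the previous one ends."""
    schedule, t = [], 0.0
    for event, delay_ns in stages:
        t += delay_ns * corner_scale   # dummy circuits slow down with the array
        schedule.append((event, t))
    return schedule


def fixed_phase_cycle(stages, worst_case_scale):
    """Cycle time when every phase is budgeted for the slowest corner."""
    return sum(delay for _, delay in stages) * worst_case_scale


if __name__ == "__main__":
    for scale in (0.5, 1.0, 1.5):      # fast, typical, slow corner (assumed)
        done = self_timed_schedule(NOMINAL_STAGES, scale)[-1][1]
        print(f"corner x{scale}: data ready after {done:.1f} ns")
    print(f"fixed phases: {fixed_phase_cycle(NOMINAL_STAGES, 1.5):.1f} ns always")
```

In this model the self-timed chain finishes early on fast silicon and stretches on slow silicon, while externally supplied phases take the worst-case time on every chip; this is the tracking property the dummy circuits are meant to provide.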
TEST RESULTS 
Since most of the circuits proposed for the NAND PLA in this paper are
dynamic, power dissipation is kept to a minimum. Thus, the
major concerns left for this approach are the speed-related device width effect,
gate height effect, circuit operating range, and effect of precharge methods
[3]. These measured data are considered useful as a reference point in
related applications with this kind of circuit structure.
FURTHER IMPROVEMENT 
SPEED: If the negative '0' level can be generated from an external VBB
power supply, a Depletion-1/Depletion-2 pair can be used for
transistor programming. A D1 device, even with the lowest
carrier mobility among the four types of devices, conducts
current more strongly at the same gate voltage than the enhancement
type; thus the D1/D2 pair would be faster in transition time than
the regular E/D2 pair.
DENSITY: Since this structure does not need metal lines in the array,
this PLA's memory area is free for metal lines of other
functions on the chip, as shown in Fig. 10. If this PLA
section is properly located on the chip, further area savings
can be achieved through sharing the memory area with wide power
lines or a limited number of data/control lines. The problem of
poly lines crossing the N+ lines can be handled by using a
depletion implant at the crossing. A properly clocked poly
line, buried contacts, and a few more contact points to the
power lines will further improve the topology and electrical
conditions in this special application.
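The SPEED argument can be checked with a first-order estimate of the on-conductance of a series device. The sketch assumes the simple linear-region relation g = k (Vgs - Vth) for Vgs above threshold and zero otherwise. The threshold voltages are the typical values of Table I; the relative gain factor of 0.8 for the D1 device, standing in for its lower mobility, is a hypothetical number, as are the gate levels.

```python
VTH_ENHANCEMENT = 0.7    # V, typical value from Table I
VTH_DEPLETION_1 = -1.2   # V, typical value from Table I


def on_conductance(k_rel, vgs, vth):
    """Relative small-Vds channel conductance, k_rel * (Vgs - Vth), or 0 if off."""
    overdrive = vgs - vth
    return k_rel * overdrive if overdrive > 0 else 0.0


if __name__ == "__main__":
    K_E, K_D1 = 1.0, 0.8   # hypothetical relative gain factors (D1 mobility lower)
    for vgs in (5.0, 0.0, -3.0):
        g_e = on_conductance(K_E, vgs, VTH_ENHANCEMENT)
        g_d1 = on_conductance(K_D1, vgs, VTH_DEPLETION_1)
        print(f"Vgs = {vgs:+.1f} V: E = {g_e:.2f}, D1 = {g_d1:.2f} (relative)")
```

Under these assumptions the D1 device conducts more strongly than the enhancement device at a high gate level despite its lower gain factor, still conducts at 0 V, and only turns off when the gate is pulled to a negative level such as VBB. That is why the D1/D2 pairing depends on a negative '0' level.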
Fig. 7(b) - The circuit structure of the NAND PLA.
Fig. 9 - Input buffer and the generation of the self-timed clocks Cin and Cpr.
Fig. 10 - PLA memory area used for metal lines; a depletion ion implant allows
poly and N+ lines to cross.
ACKNOWLEDGEMENTS 
The author would like to thank those whose contribution and help made this work 
possible. 
L. Nguyen --His request for development of an 
advanced VLSI Chip. 
D. Morgan 
Tuan H.T. (BURROUGHS) --Their technical evaluation and 
encouragement. 
J. Zeh --His vision and commitment on the 
project. 
J. Schneider --His decision, funding and continuous 
support. 
T. Northrup, K. Slater, 
and F. Zereski --Their continuous encouragement and 
support. 
Layout 
--H. Nguyen, J. McHood, H. Riley and 
A. Sella 
Process 
--L. Y. Wu 
P400 Program 
--R. Spencer 
Test 
--R. Saul and D. Cote 
Review and preparation of the manuscripts, figures and typing--H. Forsyth, J. 
Blake, A. Flohr, J. Giles, R. Ryan, and H. Burton. 
REFERENCES 
1. H. Kawagoe and Nobuhiro Tsuji, "Minimum size ROM structure compatible with
silicon-gate E/D MOS LSI," IEEE J. Solid-State Circuits, Vol. SC-11, No. 3,
pp. 360-364, June 1976.
2. Y. Kitano, S. Kohda, H. Kiduchi and S. Sakai, "A 4Mb full wafer ROM," in
ISSCC Dig. Tech. Papers, Feb. 1980, pp. 150-151.
3. Chong M. Lin, "A 4 µm NMOS NAND Structure PLA," IEEE J. Solid-State Circuits,
April 1981.
4. Carver Mead, Lynn Conway and Charles L. Seitz, Introduction to VLSI
Systems, Addison-Wesley Publishing Co., 1980, pp. 15-16, and Ch. 7.
