A writable programmable logic array by Hwang, Yuan Iee
Rochester Institute of Technology
RIT Scholar Works
Theses Thesis/Dissertation Collections
1-1-1988
A writable programmable logic array
Yuan Iee Hwang
This Thesis is brought to you for free and open access by the Thesis/Dissertation Collections at RIT Scholar Works. It has been accepted for inclusion
in Theses by an authorized administrator of RIT Scholar Works. For more information, please contact ritscholarworks@rit.edu.
Recommended Citation
Hwang, Yuan Iee, "A writable programmable logic array" (1988). Thesis. Rochester Institute of Technology. Accessed from
Rochester Institute of Technology
School of Computer Science and Technology
A Writable Programmable Logic Array
by
Hwang, Yuan lee
A thesis, submitted to
The Faculty of the School of Computer Science and Technology,
in partial fulfillment of the requirements for the degree of
Master of Science in Computer Science
Approved by: James Heliotis
Roy S. Czernikowski
George A. Brown
June 27, 1988
Dr. James Heliotis
Dr. Roy Czernikowski
Prof. George Brown (Chairman)
TABLE OF CONTENTS
0. Abstract
1. Introduction and Background
1.1 Overview of Programmable Hardware
1.1.1 ROMs & PROMs
1.1.2 EPROMs & EEPROMs
1.2 Electrically Alterable Programmable Logic Array
1.3 Alterable Programmable Logic Array
1.4 Writable Programmable Logic Array
1.4.1 Improving Writing Speed
1.4.2 Random Selection of Free Location
1.4.3 Flexible Erasure and Updating Abilities
1.5 The Overview of Each Chapter
2. Architecture Description
2.1 An Abstract Block Diagram
2.2 The Architecture of a WPLA
2.2.1 Storage Pattern for the PLA Personality
2.2.2 Multiplexing Input Scheme
2.2.3 Data Formatting
2.2.4 The Data Path
2.2.5 Pseudorandom Addressing Scheme
2.2.6 The ERASE and READ Scheme
2.2.7 Design for Testability
2.3 The Extended Block Diagram of a WPLA
2.4 The Configuration of a Writable Programmable Logic Array
2.5 Operational Modes
2.5.1 Basic Operations
2.5.2 Macro Operations
3. The Data Portion of a WPLA
3.1 The Basic Memory Cell
3.2 The Input Buffer
3.3 The Output Buffer
3.4 The Master Drive
3.5 The Glue Logic Circuit
4. The Control Portion of a WPLA
4.1 The Row Control Circuit
4.2 The Mode Selection and Main Phase Clock Circuits
4.3 The Multiplexing Control Circuit
4.4 The Glue Logic Circuit
5. Implementation and Verification
5.1 Implementation
5.2 Functional Verification
5.3 Timing Evaluation
6. Performance Evaluation and Comparison
6.1 Comparison between WPLA and APLA
6.2 Comparison between WPLA and CAM
7. The Application of Writable Programmable Logic Array
7.1 Reconfigurable Combinational Circuit
7.2 Programmable Finite State Machine
7.3 Peripheral Device Controller
7.4 Fast Turn-around Time Custom Design
7.5 An Emulation for Different Computer Systems
7.6 Testability and Reliability
8. Conclusion
TABLE OF FIGURES
Fig. 1.1 The conventional PLA using a two phase clock
Fig. 1.2 An APLA structure
Fig. 1.3 An abstract block diagram of an APLA
Fig. 2.1 An abstract block diagram of a WPLA
Fig. 2.2 A WPLA with six rows
Fig. 2.3 Multiplexing input scheme
Fig. 2.4 Data format for the AND plane during WRITE or SEARCH operation
Fig. 2.5 Data path for NORMAL operation
Fig. 2.6 Data path for WRITE operation
Fig. 2.7 Data path for SEARCH operation
Fig. 2.8 A row control circuit
Fig. 2.9 Pseudorandom addressing scheme
Fig. 2.10 ERASE scheme
Fig. 2.11 READ scheme
Fig. 2.12 Shift register chain
Fig. 2.13 The extended block diagram of WPLA
Fig. 3.1 A memory cell AND plane
Fig. 3.2 A memory cell OR plane
Fig. 3.3(a) The layout of the memory cell in AND plane
Fig. 3.3(b) The layout of the memory cell in OR plane
Table 1 The size comparison with different memory types
Fig. 3.4 Input buffer for AND plane
Fig. 3.5 Input buffer for OR plane
Fig. 3.6 Output buffer
Fig. 3.7 Master driver
Fig. 3.8 Data format for the WRITE operation
Fig. 3.9 Data format for the SEARCH operation
Fig. 3.10 Glue logic circuit
Fig. 4.1 A row control circuit
Fig. 4.2(a) Mode selection circuit
Fig. 4.2(b) Phase clock circuit
Table 2 Mode selection
Fig. 4.3 Multiplexing control circuit
Fig. 4.4 Timing diagram for multiplexing control circuit
Fig. 4.5 Control signal circuit (part 1)
Fig. 4.6 Control signal circuit (part 2)
Fig. 4.7 Timing diagram for WRITE/SEARCH modes
Fig. 4.8 Timing diagram for READ, ERASE, and SCAN modes
Fig. 4.9 Timing diagram for NORMAL and TEST modes
Fig. 4.10 Glue logic circuit
Fig. 5.1 Pad assignment of WPLA chip (22*64*22)
Table 3 Functional comparison
Table 4 Chip size comparison
Fig. 7.1 A WPLA peripheral controller
Fig. 7.2 The emulator for different computer
Appendix A
Fig. A.1 2 input NOR gate (4:1)
Fig. A.2 2 input NOR gate (8:1)
Fig. A.3 2 input NAND gate (4:1)
Fig. A.4 2 buffer pair, 4 buffer pair, and 8 buffer pair
Fig. A.5 Inverting superbuffer
Fig. A.6 Noninverting superbuffer
Fig. A.7 3 input NOR gate (4:1)
Fig. A.8 3 input NOR gate (4:1)
Fig. A.9 MOSIS spice file
Fig. A.10 AND_cell_1
Fig. A.11 AND_cell_2
Fig. A.12 AND_cell
Fig. A.13 AND_row
Fig. A.14 AND_row_4
Fig. A.15 OR_cell
Fig. A.16 OR_row_4
Fig. A.17 AND_buffer_cell
Fig. A.18 AND_buffer
Fig. A.19 OR_buffer_cell
Fig. A.20 OR_buffer
Fig. A.21 Row priority control cell
Fig. A.22 Output_buffer
Fig. A.23 Master driver cell
Fig. A.24 Master driver
Fig. A.25 Control signal 1
Fig. A.26 Control_signal_2
Fig. A.27 Dual phase clock
Fig. A.28 Mode selector
Fig. A.29 Multiplexing latcher
Fig. A.30 Controller
Fig. A.31 Writable Programmable Logic Array
Appendix B
Fig. B.1 Cell library - Inverting buffer, Non inverting buffer, Inverter(4:1), Inverter(8:1)
Fig. B.2 Cell library - 2 input nor(4:1), 2 input nor(8:1), 2 input nand(4:1), 2 input nand(8:1)
Fig. B.3 Cell library - 3 input nor(4:1), 3 input nor(8:1), 3 input nand(8:1)
Fig. B.4 Cell library - Pad_Vdd, Pad_GND
Fig. B.5 Cell library - Superbuffered input pad(4:1)
Fig. B.6 Cell library - Superbuffered input pad 2(4:1)
Fig. B.7 Cell library - Output_pad(8:1)
Fig. B.8 Search cell
Fig. B.9 And plane input buffer cell (part 1)
Fig. B.10 And plane input buffer cell (part 2)
Fig. B.11 Or plane input buffer cell (part 1)
Fig. B.12 Or plane input buffer cell (part 2)
Fig. B.13 Master driver cell
Fig. B.14 Multiplex controller
Fig. B.15 Mode selector
Fig. B.16 Row control cell
0. ABSTRACT
This thesis contains the analysis, design, and implementation of a writable
programmable logic array (WPLA) integrated circuit. The WPLA can be reprogrammed
any number of times as needed. A content addressable scheme is proposed to conduct
READ, WRITE, and SEARCH operations in the WPLA. The WPLA is programmed
by writing binary data into storage cells associated with each node in the AND/OR
planes of the array; the binary data then form the personalities of the PLA. The
layout of the WPLA will be implemented using Mentor Graphics' CHIPGRAPH
layout editor with 2 µm NMOS technology and MOSIS design rules. The event-driven
logic level simulator QUICKSIM and a MOS circuit level simulator, MSIMON,
are used to verify the functional and timing behavior of the WPLA.
1. INTRODUCTION AND BACKGROUND
This section provides a background of PLAs and their related technology. With the
advent of VLSI technology, circuit complexity has been increasing exponentially, but
with structured design concepts and design automation tools, design cycle times can
be made shorter. Designers can use standard cells and/or a silicon compiler to
create structured modules in order to reduce development time. Moreover, through
minor modification and with little overhead, PLAs [1]-[5] can be constructed into a
testable architecture to improve fault coverage and reduce test time.
1.1 Overview of Programmable Hardware
The history of PLA development parallels the history of programmable ROMs. Due
to programming limitations, early PLAs were available only in mask-programmed
versions. Just as with a ROM, a logic designer would indicate on the vendor's PLA
AND/OR logic map where the desired connections were to be made. The vendor
would then generate a custom mask for the PLA to implant the customer's logic.
The conventional PLA, as shown in Fig. 1.1, consists of input lines, bit lines (i.e.,
output lines of the input buffers which provide the true and the complement terms of
the input variables), product lines, and output lines. The portion of the PLA
consisting of the intersections of bit lines and product lines is called the AND plane.
Similarly, the portion of the PLA consisting of the intersections of the product lines
and the output lines is called the OR plane.
Fig. 1.1 The conventional PLA using a two phase clock
A particular combinational logic function is realized with a PLA by assigning
pull-down devices at desired intersections in the AND plane and the OR plane. The
pull-down devices in the PLA represent the programmed code, called the PLA personality,
which specifies Boolean expressions as sums of products of literals. The PLA's
input lines are driven by either inverting or non-inverting superbuffers, which are
controlled by a pass transistor clocked on phase 1. Each product line carries the
NOR combination of all input signals that lead to the gates of the transistors attached
to it. In a like manner, each output in the OR plane is the NOR combination of all
the product lines connected to it and leads to an inverting output buffer through a
pass transistor clocked on phase 2.
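The NOR-NOR evaluation described above can be illustrated with a small behavioral sketch. It is not taken from the thesis: the function names, the list-of-lists personality layout, and the choice of which bit line carries the pull-down for a given literal are assumptions made for illustration, and the two phase clocking is ignored.

```python
from typing import List, Sequence


def nor(signals: Sequence[int]) -> int:
    """Return 1 only when no connected signal is 1 (an NMOS NOR line)."""
    return 0 if any(signals) else 1


def pla_eval(inputs: Sequence[int],
             and_plane: List[List[int]],
             or_plane: List[List[int]]) -> List[int]:
    """Evaluate a NOR-NOR PLA; a 1 in a plane marks a pull-down device."""
    # Bit lines carry the true and complement form of every input.
    bit_lines: List[int] = []
    for x in inputs:
        bit_lines += [x, 1 - x]
    # AND plane: each product line is the NOR of the bit lines that gate
    # a pull-down device on that line.
    products = [nor([b for b, dev in zip(bit_lines, row) if dev])
                for row in and_plane]
    # OR plane: each output line is the NOR of its connected product lines,
    # and the inverting output buffer turns that NOR into an OR.
    return [1 - nor([p for p, dev in zip(products, col) if dev])
            for col in or_plane]


# Hypothetical personality for f = a.b' + a'.b; bit line order is a, a', b, b'.
# The product a.b' equals NOR(a', b), so its devices sit on the a' and b lines.
AND_PLANE = [[0, 1, 1, 0],   # a.b'
             [1, 0, 0, 1]]   # a'.b
OR_PLANE = [[1, 1]]          # one output that sums both product lines

for a in (0, 1):
    for b in (0, 1):
        print(a, b, pla_eval([a, b], AND_PLANE, OR_PLANE))
```

In this encoding a product term is formed by pulling the product line down whenever one of its literals is false, which is why the devices are placed on the opposite-polarity bit lines.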
1.1.1 ROMs & PROMs
The first devices were one-time Programmable Logic Arrays which could be
programmed in much the same way as a Programmable Read Only Memory (PROM) is
programmed: built-in fuses were blown by a special programming machine to
implant programs or data into the PLA. NOR-NOR logic is used to create the AND
plane and the OR plane architecture of an NMOS type Programmable Logic Array.
It uses the sum of products form to implement Boolean functions. The advantages
of a PLA are its design simplicity and the regularity of its structure, both of which
reduce the complexity of logic function sections in integrated circuit design. On the
other hand, its main disadvantage is that PLAs are non-erasable after they have been
programmed by the custom-made mask.
1.1.2 EPROMs & EEPROMs
The Erasable Programmable Read Only Memory (EPROM) was the next
development in VLSI technology, making it possible to reprogram the personalities of
the chip. However, an EPROM has to be programmed in two steps. First, the
EPROM must be exposed to ultraviolet light for about twenty minutes to place
the device in its erased or initialized state. It can then be reprogrammed in much
the same way as a PROM. As VLSI technology moved on, the Electrically
Erasable Programmable Read Only Memory (EEPROM) was invented. The
EEPROM employs a much simpler way of implementing logic functions in integrated
circuits. This chip is similar to an EPROM except that it does not have the clear
window; ultraviolet radiation is not required for erasure. Instead, a special voltage
signal applied for specific times can erase an EEPROM, and this voltage can often be
applied from within the host system.
1.2 Electrically Alterable Programmable Logic Array
An electrically alterable programmable logic array (EAPLA) has been designed by
Wood[19] and Fongtf]. The use of EEPROM technology in a PLA results from the
idea of incorporating electrically alterable nonvolatile memory devices into the
PLA design. Although the EAPLA chip is not as dense as a PLA chip, the EAPLA
reduces the design effort considerably in the application field. The chip can be
programmed or erased by applying a high voltage to the control pin. The drawback of
this scheme is the need to disconnect the chip from the socket or the board in order to
reprogram the personality matrix. Removing a soldered chip from a circuit board
almost always damages it.
1.3 Alterable Programmable Logic Array
Through a different approach, a RAM-like Alterable Programmable Logic Array
(APLA) was designed by Marchand[16]. In Marchand's scheme, the APLA performs
the same function as a standard programmable logic array. The APLA is
programmed by writing the personality into a storage cell at each node of the
AND/OR plane through an associated peripheral register. Unlike the EAPLA, in
which a high voltage must be applied to change its personality, the Alterable PLA
can be reprogrammed any number of times by rewriting the data into the storage cells
at normal logic levels. Fig. 1.2 shows the typical arrangement of the APLA, which is
composed of two parts: (a) the basic PLA function with AND and OR planes, and (b)
the control logic needed to operate the "alterable" functions and to maintain the
information in the dynamic storage cells.
Fig. 1.2 An APLA structure
Fig. 1.3 An abstract block diagram of an APLA
Even though the writing scheme used to program the device is easier than that of its
predecessors, the APLA still suffers from the long time required to alter the
personality. The APLA in Fig. 1.3, with 22 inputs, 22 outputs, and 64 product
terms[16], requires 2.2 ms for the entire personality to be written into the AND/OR
planes. Even writing a single row or column takes 34.4 µs. Furthermore, the write
time grows linearly with the size of the circuit: the larger the configuration
becomes, the longer the writing time will be. Another
disadvantage is that the user must be aware of the locations of free or unprogrammed
storage cells and must write the appropriate data into the correct peripheral registers
before transferring the data to the cells themselves. Also, prior to updating the
personality, the address of the row or the column holding that personality needs to be
known. It is inconvenient for a user to keep a table showing all occupied rows or
columns and their associated personalities. Even a table look-up, used instead of an
exhaustive search, is still time consuming.
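The two timing figures quoted above fit a simple linear model. The sketch below assumes that the 64 product-term rows are written one after another at 34.4 µs each; that reading is an assumption made here, not a statement from the thesis, and the function name is hypothetical.

```python
ROW_WRITE_US = 34.4   # time to write one row or column, as quoted in the text
PRODUCT_TERMS = 64    # rows in the 22 x 64 x 22 APLA


def apla_write_time_us(lines: int, per_line_us: float = ROW_WRITE_US) -> float:
    """Total write time under the assumed one-line-after-another model."""
    return lines * per_line_us


total_us = apla_write_time_us(PRODUCT_TERMS)
print(f"{total_us / 1000:.1f} ms for {PRODUCT_TERMS} rows")
print(f"{apla_write_time_us(2 * PRODUCT_TERMS) / 1000:.1f} ms for twice as many rows")
```

Under this model, doubling the number of rows doubles the write time, which is the linear growth the paragraph describes.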
1.4 Writable Programmable Logic Array
In order to avoid the problems of an APLA, such as the long writing time and the need to
know the address of a free location, we will design a Writable Programmable
Logic Array (WPLA) which employs a pseudorandom addressing scheme for fast
WRITE and easy ERASE operations. To achieve fast access, the WPLA
will use a content addressable scheme for SEARCH and READ operations. The end
user does not need to know the physical address of a free location. Instead, the WPLA
will search the memory and allocate a free space for the user. In addition to the
functional specification, the testability of the WPLA is considered in our design. Test
vectors will be used to check the PLA circuit. As a result, improved testability will
provide benefits both in the design phase and in field applications.
1.4.1 Improving Writing Speed
One of the core ideas of the WPLA is that the number of input variables or output
variables of a PLA depends on the system specification, so that segmenting the
personalities is required. To update a memory cell, the system bus supplies data to
the input variable dynamic latches by multiplexing the device input lines. The
duration of the WRITE operation depends upon the number of multiplexed variables in the
personalities instead of the number of peripheral registers as in the case of an APLA.
The specific objective is to improve the WRITE speed by achieving a rate at least ten
times faster than Marchand's scheme.
1.4.2 Random Selection of Free Locations
Each row is associated with a status tag which indicates whether the memory cells in
that row are free or occupied. A row with a free tag will gate the writing clock to
allow the personality to be written into its memory cells in the AND/OR planes.
However, if more than one status tag has a free flag, the WRITE mechanism should
be able to distinguish priorities and prevent multiple WRITEs. The purpose is to
remove the need to specify a free location, which makes the chip easier to use.
1.4.3 Flexible Erasure and Updating Abilities
A search by content operation, like that in a content addressable memory (CAM),
is conducted simultaneously on the AND/OR planes. A target tag is attached to each
row and is set if the corresponding row is targeted by the searching key or by the same
personality. Through the target tags, erasing and updating personalities can be done
quickly; further, the target tags help make the WPLA fault tolerant through the use of
redundant rows. The objective is to implement the SEARCH operation and
make it easy to operate. In addition, it is possible to examine all the personalities or
just those associated with the targeted row.
1.5 The Overview of Each Chapter
This thesis consists of eight chapters. In chapter one, the motivation and objectives
are presented by introducing ROM, EPROM, EEPROM, and APLA devices. In
chapter two, the architecture of a WPLA is described, including the pseudorandom
addressing scheme, the multiplexing input mechanism, and the design for testability.
The functions of the proposed WPLA are also summarized by operation
mode. In chapters three and four, the data portion and the control portion of the
WPLA are discussed through their associated schematics and detailed timing
diagrams. In chapter five, the implementation is described, and the functional and
timing verification is evaluated in detail. In chapter six, a performance
comparison among WPLA, PLA, APLA, and CAM is presented. In chapter seven,
six application examples are presented. Finally, in chapter eight, the conclusion
and the trade-offs from the design, implementation, and application points of
view are summarized. The performance impact of future VLSI technology
improvements is also evaluated.
2. ARCHITECTURE DESCRIPTION
A Writable Programmable Logic Array basically uses volatile memory cells in the
AND/OR planes as programmable nodes instead of using fixed pull-down devices as in
the conventional PLA. The content or personality in the memory cell can be
programmed, updated, and examined through versatile and powerful functions, such
as WRITE, SEARCH, READ, and SCAN operations. The architecture of a WPLA
and its associated operations will be described in the following subsections.
2.1 An Abstract Block Diagram
The proposed configuration of a WPLA is given in Fig. 2.1. The AND plane and the
OR plane perform the NOR-NOR operation like a conventional PLA. Using a
non-overlapping two phase clock scheme, the primary inputs provide literals to the AND
plane through input buffers. The outputs of the OR plane, representing the stored
Boolean functions, are located in the output buffers.
Fig. 2.1 An abstract block diagram of a WPLA
The programmable nodes in both planes are constructed on the basis of a pseudo-static
RAM cell. The PLA personality is programmed into each memory cell through the
input buffer and the master driver. The system data bus feeds a part of the primary
input lines into the master buffer for formatting the desired personality or the search
argument. The row control circuit selects a vacant location and gates the write clock
to store the prepared personality into the addressed location. Alternatively, it can
select a targeted row, gate the read clock, and unload the personalities into the input
buffers. The personalities are then scanned out from the input buffers for
examination. Moreover, a priority mechanism in the row control circuit prevents
multiple WRITEs (READs) during the WRITE (READ) operation. The control
circuit interfaces with the external mode control signals (CBT0, CBT1, CBT2) and the
phase clocks (PHE1, PHE2) in Fig. 2.13, and generates internal control signals to
activate the associated operations.
Detailed operation will be described in section 2.5 after the NORMAL, WRITE, and SEARCH
schemes are described in the following section.
2.2 The Architecture of a WPLA
A row of a WPLA indicates the memory cells on the AND plane and the OR plane
concatenated through a single product line, P_LN, as shown in Fig. 2.2. Each row
has its own row control circuit.
Fig. 2.2 A WPLA with six rows
2.2.1 Storage Pattern for the PLA Personality
On the AND plane, a pair of memory cells is used to store the true term and the
complement of a personality bit, or both cells store "logic 0" to represent "don't care".
In the OR plane only a single memory cell is used to store a personality bit, with
either "logic 1" or "logic 0". The basic storage pattern is similar to the one used in a
conventional PLA, which uses two nodes for each input variable and one node for each
output line. The pattern of personalities on all the programmed nodes in the AND and
OR planes represents a Boolean function in the sum of products form and is referred to
as the personality matrix.
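As a rough illustration of this storage pattern, the sketch below encodes one row of a personality matrix. The cube notation ("1", "0", "-") and the function names are hypothetical conveniences introduced here; the cell values follow the (D, DB) pairs of Fig. 2.4, and any inversion introduced by the buffer circuitry is ignored.

```python
from typing import Dict, List, Tuple

# (true cell, complement cell) for each AND-plane personality bit.
AND_CODE: Dict[str, Tuple[int, int]] = {
    "1": (1, 0),   # logic 1
    "0": (0, 1),   # logic 0
    "-": (0, 0),   # don't care: both cells hold logic 0
}


def encode_row(and_term: str, or_bits: str) -> Tuple[List[Tuple[int, int]], List[int]]:
    """Encode one product term as AND-plane cell pairs and OR-plane cells."""
    and_cells = [AND_CODE[ch] for ch in and_term]
    or_cells = [int(ch) for ch in or_bits]
    return and_cells, or_cells


def cells_per_row(n_inputs: int, n_outputs: int) -> int:
    """Two cells per input variable plus one cell per output line."""
    return 2 * n_inputs + n_outputs


# Hypothetical personality matrix for f = a.b' + a'.b with a single output.
matrix = [encode_row("10", "1"), encode_row("01", "1")]
print(matrix)
print(cells_per_row(22, 22))
```

For a device with 22 inputs and 22 outputs, this counting gives 66 memory cells per row.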
2.2.2 Multiplexing Input Scheme
The input buffer attached to the AND and OR planes shown in Fig. 2.3 is used to latch
the formatted data provided by a master driver and then activates the AND and OR
planes. Since the number of inputs and outputs of a WPLA is usually larger than
the number of system data bus lines, a single row of the personality matrix or a
search argument cannot be loaded into the input buffers without partitioning.
Fig. 2.3 Multiplexing input scheme
In Fig. 2.3, a multiplexer is used to input the partitions of the personalities (search
argument) and is implemented by a set of pass transistors and the multiplexing phase
clocks Wi. The ith partition of the personalities (the search argument) is latched into
the correct input buffer by clocking its associated phase clock Wi. If j indicates the
minimum number of clock phases for this multiplexing operation, then
$j = \lceil (n + q) / m \rceil$
where
n: the number of the primary lines in the AND plane
q: the number of the primary lines in the OR plane
m: the number of the system data bus lines
After the jth clock cycle, a single row of the personality matrix or a complete search
argument is finally latched into the input buffers prior to the coming WRITE or
SEARCH operation.
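The phase count can be checked numerically. The sketch below reads the bracket in the formula as a ceiling, since j is a minimum number of whole clock phases, and uses the 22-input, 22-output, 16 bit bus configuration from section 2.4 as the example; the function names are hypothetical.

```python
from typing import List


def multiplex_phases(n: int, q: int, m: int) -> int:
    """Minimum number of multiplexing clock phases: ceil((n + q) / m)."""
    return (n + q + m - 1) // m


def partition_sizes(n: int, q: int, m: int) -> List[int]:
    """Input-buffer stages latched in each phase (the last may be short)."""
    total = n + q
    return [min(m, total - start) for start in range(0, total, m)]


print(multiplex_phases(22, 22, 16))
print(partition_sizes(22, 22, 16))
```

With these values the sketch gives three phases of 16, 16, and 12 stages, which agrees with the partitioning stated in section 2.4.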
2.2.3 Data Formatting
The purpose of data formatting in the master buffer is to encode the
personality (search argument) during the WRITE or the SEARCH operations. Since
the data format to be prepared for the OR plane is either "logic 1" or "logic 0", the
required data format, i.e. the code for these two operations, is dominated by the data
requirement in the AND plane and is illustrated in Fig. 2.4.
Fig. 2.4 Data format for the AND plane during WRITE or SEARCH operation
(a) WRITE operation
MASK BIT   DATA BIT   D   DB   STATE REPRESENTED
0          0          0   1    logic 0
0          1          1   0    logic 1
1          1          0   0    "Don't care"
(b) SEARCH operation
MASK BIT   DATA BIT   D   DB   STATE REPRESENTED
0          0          0   1    logic 0
0          1          1   0    logic 1
1          1          0   0    "Don't care"
1          0          1   1    "Mask a bit"
It was difficult to design a data formatter circuit without using register storage, which
increases the area of the overall device. A dynamic storage circuit, as described in
section 3.4, was used to minimize area. The advantages of this arrangement are:
(1) no modification in the circuit of the input buffer, (2) the
centralization of the formatting capability, and (3) the compact layout size between
the input buffer and the memory.
The formatting function is basically constructed from dual NOR gates, a mask bit, and a
configuration bit as shown in Fig. 2.4. The mask bit contributes to formatting the
data to be written, and the configuration bit contributes to formatting a search
argument. The formatted data for WRITE and SEARCH operations are needed not
only as a pair of complementary states, but also with both lines at "logic 0" or both
at "logic 1".
In the WRITE operation (Fig. 2.4(a)), if the mask bit is "logic 0", the same state as the
data bit will appear on D, and its complement will appear on DB. However, if
the mask bit is "logic 1", then "logic 0" will appear on both D and DB to represent a
don't care state.
In the SEARCH operation (Fig. 2.4(b)), if the configuration bit is "logic 0", D and DB
will be in complementary states, and D has the same state as the data bit. If the
configuration bit is "logic 1", an additional inverter is added between the two NOR gates to
generate the complement state of the data bit on both D and DB. The mask or
configuration bit is latched before the data bit is fed into the master buffer. Then, the
data bit will determine the personality of the search argument. If the configuration
bit is "logic 1", then D and DB take either the don't care state (both "logic 0")
or the masking bit state (both "logic 1").
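The two formatting rules can be summarized as truth functions. This is a behavioral sketch of the mapping in Fig. 2.4 only; it does not model the dual NOR gate circuit or the dynamic storage, and the function names are hypothetical.

```python
from typing import Tuple


def format_write(mask: int, data: int) -> Tuple[int, int]:
    """Return (D, DB) for the WRITE operation, following Fig. 2.4(a)."""
    if mask:
        return (0, 0)            # don't care, whatever the data bit is
    return (data, 1 - data)      # D follows the data bit, DB is its complement


def format_search(config: int, data: int) -> Tuple[int, int]:
    """Return (D, DB) for the SEARCH operation, following Fig. 2.4(b)."""
    if config:
        both = 1 - data          # complement of the data bit on both lines
        return (both, both)      # (0, 0) is don't care, (1, 1) masks the bit
    return (data, 1 - data)


for mask in (0, 1):
    for data in (0, 1):
        print(mask, data, format_write(mask, data), format_search(mask, data))
```

The only difference between the two modes is the case where the mask or configuration bit is set and the data bit is 0: WRITE still produces the don't care pair, while SEARCH produces the "mask a bit" pair.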
2.2.4 The Data Path
The data path for NORMAL operation in Fig. 2.5 is similar to that of the conventional
PLA, as mentioned before. The primary input and the output are separately latched by φ1
and φ2. During the WRITE operation, a data path for a single bit in the AND plane
and in the OR plane is illustrated in Fig. 2.6. Wi is a multiplexing clock phase which
is used to latch the formatted data. In the AND plane, D and DB are stored into cell
1 and cell 2. In the OR plane, only the formatted data D goes through the inverting
buffer and is stored into cell 3.
Fig. 2.5 Data path for NORMAL operation
Fig. 2.6 Data path for WRITE operation
For the SEARCH operation (an abstract diagram is shown in Fig. 2.7), the formatted
data, as a search argument, is processed by the master driver and latched into the
input buffer by the multiplexing clock phase Wi. Next, the search argument on
BIT_LN and ~BIT_LN is applied to the comparison circuit of the memory cells.
P_LN is kept at "logic 1" only if the personality in node ~Q of the storage cells and the
search argument are in opposite states; a high P_LN as the result of this comparison
means a "match" with the particular argument. The comparison circuit in the OR
plane is different from that in the AND plane. Only a single storage cell is reserved
for a bit in the OR plane, while the comparison of the "logic 0" state in the storage
cell needs more complicated circuitry.
Fig. 2.7 Data path for SEARCH operation
2.2.5 Pseudorandom Addressing Scheme
A single row control circuit in Fig. 2.8 is composed of a status tag, a target tag, and a
priority circuit with some control gates. The status tag is set to logic 1 initially
through a master reset signal (RS) in order to indicate its associated row as "vacant".
Whenever the row is gating the WRITE clock, the trailing edge of that clock will
reset the status tag of this particular row to logic 0. A "logic 0" in the status tag
represents "occupied" and inhibits any activation by the following WRITE clocks.
Fig. 2.8 A row control circuit
The pseudorandom addressing scheme in Fig. 2.9, implemented by a priority chain, is
used to select a vacant row in which to write the personalities. The physical order of the
priority circuit chain in Fig. 2.9(b) is arranged to let the lowest vacant row be written
first. Since the initialization (Fig. 2.9(a)) sets all status tags to "vacant" (logic 1), the
initial consecutive WRITE operations (Fig. 2.9(c)) will store the personalities into row
1, row 2, and so on.
Next, erase operations (Fig. 2.9(d)) will release some rows as vacant. Thus, future
personalities will not be written in the previous sequential order.
Instead, the current lowest vacancy, addressed by the priority chain, will be written
next (Fig. 2.9(e)). The more the erase operations are mingled with the WRITE
operations, the more random the locations of the personalities become. However, the
uppermost vacant row always has the lowest priority, and therefore accepts the fewest
WRITE operations. We label this kind of architecture a pseudorandom addressing
scheme.
Fig. 2.9 Pseudorandom addressing scheme
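The behavior of the status tags and the priority chain can be sketched in a few lines. This is a behavioral model under assumptions, not the circuit of Fig. 2.8: rows are indexed from 0 rather than 1, the class and method names are hypothetical, and a row is released directly by index here, whereas the chip releases rows through a content search.

```python
from typing import List, Optional


class PriorityChain:
    """Behavioral model of the status tags and the WRITE priority chain."""

    def __init__(self, rows: int) -> None:
        self.status: List[int] = [1] * rows          # master reset: all vacant
        self.cells: List[Optional[str]] = [None] * rows

    def write(self, personality: str) -> Optional[int]:
        """Store into the lowest vacant row; return its index, or None if full."""
        for row, vacant in enumerate(self.status):
            if vacant:
                self.cells[row] = personality
                self.status[row] = 0                 # now occupied
                return row
        return None

    def release(self, row: int) -> None:
        """Mark a row vacant again; its old contents are left in place."""
        self.status[row] = 1


chain = PriorityChain(rows=6)
first = [chain.write(p) for p in ("A", "B", "C", "D")]
chain.release(1)
chain.release(2)
later = [chain.write(p) for p in ("E", "F", "G")]
print(first, later)
```

The first four writes fill rows in sequential order; after two rows are released, the following writes reuse those rows before moving on to the next untouched row, so the placement is no longer sequential.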
2.2.6 The ERASE and READ Scheme
The target tags in the row control circuitry, together with another priority
circuit chain, are used to control the ERASE and READ operations. Initially all
target tags are reset to "logic 0" to indicate the rows as "untargeted".
Before the ERASE and READ operations, some of the rows have already been written,
and their associated status tags have been set to "occupied" (Fig. 2.10(a)).
During the SEARCH operation, a strobe pulse (STB) sets the target tag(s)
of the matching row(s) to logic 1 through the P_LN(s),
representing a "match" (Fig. 2.10(b)). To erase the occupied rows, we simply shift the
target tags to the status tags. The erased row(s) need not be cleared and will be
ignored by the NORMAL and SEARCH operations until they are written again.
Fig. 2.10 ERASE scheme
For the READ operation shown in Fig. 2.11, the target tags set by the previous
SEARCH operation (Fig. 2.11(a) and Fig. 2.11(b)) can accept a READ clock and gate the
data from one of the targeted rows to the input buffers (Fig. 2.11(c)). The priority chain
also arranges for the lowest targeted row to be read first. The trailing edge of the READ
clock will reset the target tag to "untargeted" (logic 0) after its personalities are gated
into the input buffers.
Fig. 2.11 READ scheme
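The interplay of SEARCH, ERASE, and READ through the two tags can be sketched as follows. Several points are assumptions made for illustration: the search is modeled as an exact comparison rather than the masked comparison of the real cells, the transfer of target tags to status tags is assumed to leave untargeted rows unchanged, rows are indexed from 0, and all function names are hypothetical.

```python
from typing import List, Optional, Tuple


def search(cells: List[str], status: List[int], key: str) -> List[int]:
    """Set the target tag (1) of every occupied row whose contents equal key."""
    return [1 if s == 0 and c == key else 0 for c, s in zip(cells, status)]


def erase(status: List[int], target: List[int]) -> List[int]:
    """Transfer target tags to status tags: targeted rows become vacant (1)."""
    return [1 if t else s for s, t in zip(status, target)]


def read_next(cells: List[str], target: List[int]) -> Optional[Tuple[int, str]]:
    """Unload the lowest targeted row and reset its target tag."""
    for row, tagged in enumerate(target):
        if tagged:
            target[row] = 0
            return row, cells[row]
    return None


cells = ["A", "B", "A", "C"]
status = [0, 0, 0, 1]            # row 3 is vacant; its old "C" is ignored

target = search(cells, status, "A")
reads = [read_next(cells, target), read_next(cells, target), read_next(cells, target)]
status = erase(status, search(cells, status, "B"))
print(reads, status)
```

Each READ consumes one target tag, lowest row first, so repeated READs walk through all matching rows; the ERASE only flips status tags and leaves the stored bits in place.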
2.2.7Design for Testability
During the SCAN or TEST operation, the input buffers in the AND and OR planes are
configured into a serial shift register chain as shown in Fig.2. 12.a. The
personalities, which have been unloaded into the input buffer during the previous
READ operation, are scanned out by clocking SSN1 and SSN2. SSOUT, the output
pin of this shift register chain, can be used to examine the personalities in the series.
Similarly, all status tags and target tags in the row control circuit can also be chained
together into a long shift register chain shown in Fig.2.12.b. In the TEST mode, the
states of the status tag and the target tag for each control circuit can be observed from
the output pin, SOUT, of this shift register chain. Moreover, the stand-alone test for
the above two register chains can also be conducted simultaneously in the TEST
mode. If the test vectors are serially scanned from the input pin SSIN and SIN of
both shift register chains, then their test responses can be examined from the output
pins SSOUT and SOUT.
Because the WPLA can be used to do stand-alone testing of the input buffers or of the
tag registers, these features make it easier for the WPLA to be tested.
2.3 The Extended Block Diagram of a WPLA
The extended block diagram of the proposed WPLA, given in Fig.2.13, shows the
data and control connections among the modules and illustrates the overall
architecture and components discussed in the last section. The detailed
implementation of each module will be described in chapters three and four.
2.4 The Configuration of a Writable Programmable Logic Array
We will implement a WPLA with the same configuration as Marchand's chip (Fig.1.2),
which has 22 inputs, 22 outputs, and 64 product terms. The bus-wide master drivers
in the WPLA are constructed in a 16 bit structure, which provides for interface with a
16 bit system data bus. Therefore, the input buffers for the AND/OR planes total 44
stages, and are divided into 3 partitions of 16 stages, 16 stages, and 12 stages, all of
which are used for latching data by multiplexing from the master drivers. The
conceptual approach and objectives involved are described in the subsections that
follow. A summary of the expected goals of the WPLA is:
(1) Fast writing speed.
(2) Random selection of free locations for the WRITE operation.
(3) A flexible content addressable search to erase and update the WPLA.
Fig. 2.13 THE EXTENDED BLOCK DIAGRAM OF WPLA (P: priority circuit, S: status tag, T: target tag)
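The partitioning stated in section 2.4 follows from simple arithmetic, sketched here. The helper name `partition_sizes` is hypothetical; the figures 22, 22, and 16 are the ones given in the text.

```python
def partition_sizes(total_stages, bus_width):
    """Split a row of input-buffer stages into bus-wide partitions."""
    full, remainder = divmod(total_stages, bus_width)
    return [bus_width] * full + ([remainder] if remainder else [])


# 22 AND-plane stages plus 22 OR-plane stages, loaded over a 16 bit bus.
sizes = partition_sizes(22 + 22, 16)
```

Under these figures a complete row is loaded in three bus transfers, the last of which uses only 12 of the 16 bus lines.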
2.5 Operational Modes
A WPLA should provide good functionality and fast operation for users. In this
design there are eight basic operations. Some sequences of basic operations are
meaningless or illegal to execute in a WPLA chip. A legal sequence of basic
operations, called a macro operation, combines more than one basic operation. We
shall first describe the basic ideas behind those operations, then describe two macro
operations to illustrate how the WPLA works.
2.5.1 Basic Operations
Normal
Feed the input from the primary input lines, activate the AND plane, and latch
the outputs from the OR plane through two non-overlapping clock phases.
(Performs the same function as a conventional PLA.)
Master Reset
Activate the reset pulse (RS) to set all the status tags to vacant (logic 1) and clear
all the target tags to untargeted (logic 0).
Write
Load the personalities into the input buffers by multiplexing. Select a vacant
row through the pseudorandom addressing scheme. Write the personalities into the
addressed row.
Search
Load the search argument or the search key into the input buffers by
multiplexing. Strobe the targeted row(s) through the primary input line and set
the associated target tag(s).
Erase
Activate the erase pulse (ERA) to shift the target tags of all occupied rows to the
status registers.
Read
Following the priority chain, select a targeted row. Unload the personalities of
this targeted row into the input buffers.
Scan
To examine the contents of the WPLA, scan out the personalities of the row that was
unloaded into the input buffers, which are reconfigured into a shift register chain.
Test
Test the two shift register chains which are configured from the input buffers and
the tag registers, respectively. Apply test vectors to the control circuit and the
input buffers and examine the test responses from their output ports by
clocking two non-overlapping phase clocks.
2.5.2 Macro Operations
Search then Erase
Search for, then erase, the occupied row(s) which contain the specified search argument
or the specified search key.
Search then Read then Scan
Serially examine the personalities of the targeted rows that match the search argument
or the search key. If a row matches a key, then the
contents of that row will be read.
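The interplay of the status tags, the target tags, and the operations above can be sketched as a behavioral model. The class `WplaTags` and its method names are hypothetical. Two points are assumptions rather than statements from the text: a search key uses "X" for a masked position, and ERASE clears a target tag as it shifts it into the status tag.

```python
class WplaTags:
    """Behavioral sketch of the per-row status and target tags (not the circuit)."""

    def __init__(self, rows):
        self.rows = rows
        self.master_reset()

    def master_reset(self):
        self.vacant = [True] * self.rows   # status tag: logic 1 = vacant
        self.target = [False] * self.rows  # target tag: logic 0 = untargeted
        self.data = [None] * self.rows

    def write(self, personality):
        """Write into the lowest vacant row; return its index, or None if full."""
        for n in range(self.rows):
            if self.vacant[n]:
                self.data[n] = personality
                self.vacant[n] = False
                return n
        return None

    def search(self, key):
        """Set the target tag of every occupied row that matches the key."""
        for n in range(self.rows):
            self.target[n] = (not self.vacant[n]) and all(
                k == "X" or k == d for k, d in zip(key, self.data[n])
            )

    def erase(self):
        """Mark every targeted row vacant; stored data is left in place and ignored."""
        for n in range(self.rows):
            if self.target[n]:
                self.vacant[n] = True
                self.target[n] = False  # assumption: the tag is cleared as it is shifted

    def read(self):
        """Return (row, personality) of the lowest targeted row and untarget it."""
        for n in range(self.rows):
            if self.target[n]:
                self.target[n] = False
                return n, self.data[n]
        return None
```

The macro operation "Search then Erase" is then `search(key)` followed by `erase()`, after which the freed rows are the first candidates for the next `write`.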
3. THE DATA PORTION OF A WPLA
The circuitry of a WPLA can be divided into the data portion and the control portion.
In this chapter, the circuitry of the data portion will be described. The associated
control and timing portion will be discussed in chapter 4.
The data portion of the WPLA consists of (a) the basic memory cell, (b) the input
buffer, (c) the output buffer, (d) the master driver, and (e) the glue logic circuit.
3.1 The Basic Memory Cell
The basic WPLA memory cell includes a pseudo-static memory cell and a comparison
circuit. We chose the pseudo-static memory cell, instead of the dynamic one, to
achieve fast operation, to avoid complicated refresh control timing, and to provide
stable driving capability. Both the AND plane and the OR plane of the WPLA use
the same storage cell, but each employs a different comparison scheme.
The memory cell in the AND plane is shown in Fig.3.1 and is composed of nine
transistors. The two cascaded inverters are formed by Q3-Q6, whereas Q2 provides
the feedback path to construct a pseudo-static memory cell. The data on
BIT_LN (~BIT_LN), coming from the input buffer, is then gated by Q1 as the
personality during the WRITE operation. The control signals WRT and
~(WRT+PH2B) on the gates of Q1 and Q2 assure that there is no data conflict or any
ambiguous state on the input of the memory cell. During the READ operation, Q9 is
turned on in order to gate the stored data to line ~RD_LN (RD_LN).
The comparison circuit is implemented by Q7 and Q8 (shown in Fig.3.1). If the
data stored in the memory cell at node ~Q is the opposite of the current state on
line ~BIT_LN (BIT_LN), P_LN is kept high, which indicates a match. In the
NORMAL mode, the state of P_LN is utilized to activate the OR plane.
Similarly, in the SEARCH mode, the state of P_LN is gated to the target tag
through a strobe pulse (STB).
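At the logic level, the match behavior of a row can be sketched as follows. This abstracts away the cell-level encoding on BIT_LN and ~BIT_LN: each AND-plane personality is assumed to be written as a string over "0", "1", and "X" (don't care), and the names `product_line` and `pla_outputs` are hypothetical.

```python
def product_line(personality, inputs):
    """Return True when the row's P_LN stays high, i.e. every cared-for literal agrees."""
    if len(personality) != len(inputs):
        raise ValueError("personality and inputs must have equal length")
    return all(p == "X" or p == i for p, i in zip(personality, inputs))


def pla_outputs(rows, inputs):
    """NORMAL mode: OR together the OR-plane bits of every matching row."""
    width = len(rows[0][1]) if rows else 0
    out = [0] * width
    for and_part, or_part in rows:
        if product_line(and_part, inputs):
            for j, bit in enumerate(or_part):
                if bit == "1":
                    out[j] = 1
    return out
```

The same `product_line` test describes the SEARCH mode, where the result is strobed into the target tag instead of driving the OR plane.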
The memory cell in the OR plane, shown in Fig.3.2, consists of 13 transistors. Q1-Q6
construct a pseudo-static memory cell. Q9 is used to gate the stored data to ~RD_LN
for examination purposes. The comparison function in the NORMAL mode is
performed through Q7 and Q8. In a "match" state, OUT_LN is high, and the
state of P_LN and the stored data on the node ~Q are opposite.
In order to provide the content addressable function on this plane, two additional
comparison circuits, accompanied by BIT_LN and CMP_LN, are needed.
CMP_LN is connected to all memory cells in the same row of the OR plane. It is
charged (or discharged) to the same state as P_LN when either the path of Q10
and Q11 or the path of Q12 and Q13 is turned on. If these two paths are both off, the
state of CMP_LN is floating or temporarily kept at the previous state until entering
the SEARCH mode.
Fig. 3.2 A MEMORY CELL OF OR PLANE
During the WRITE operation, the states of BIT_LN and ~BIT_LN will be stored into
the nodes ~Q and Q. In the SEARCH mode, CMP_LN is connected to GND through a
control gate Q1 (Fig.3.10). If the current state of ~BIT_LN (BIT_LN) is the opposite of
the state of the memory cell on the node ~Q (on the node Q), P_LN will be kept
high. Through a strobe pulse (STB), P_LN is gated to the target tag. On the other
hand, if no match is found, P_LN is pulled to GND through CMP_LN and clears
the target tag associated with this particular row.
Fig.3.3 shows the layout of these two memory cells. The size comparison among
PLA[6], APLA[16], CAM[14], and SRAM[15], which is normalized by the size of the
APLA, is shown in Table 1. The higher overhead of the WPLA memory cells compared
with the APLA is due to the usage of a pseudo-static memory cell and the augmentation of
the WPLA by content addressable functions.
3.2 The Input Buffer
There are two different kinds of input buffers, one for the AND plane and one for the
OR plane. The input buffer for the AND plane is shown in Fig.3.4. It provides the
data to a pair of memory cells or receives the data from them. In the NORMAL
operation, the gates of Q2 and Q3 are turned on by Φ1. The primary input on
PMI_LN goes through the superbuffers S1 and S2 to supply the complement and the
true terms of the input variable, called literals, to ~BIT_LN and BIT_LN.
In the WRITE or the SEARCH operation, D and DB, the data to be written or to be
searched respectively, come from the master driver. Since Q3 is off, the superbuffers
S1 and S2 latch the data individually from D and DB through Q1 and Q7 by clocking
Wi. BIT_LN and ~BIT_LN, the outputs of the superbuffers S1 and S2, lead to all
memory cells in these two particular columns and provide the personality (the search
argument) to be written (searched).
In the READ mode, ST is active (Fig.3.4) and turns on Q6 (Q10). The personality bit
is immediately propagated to G2 (G4) and Q6 (Q10). By activating ST, the personality
is latched onto the input of the inverter G1 (G5). If a SCAN mode is activated next,
Q6 (Q10) will be turned off, and the two non-overlapping clock phases SSN1 and SSN2
are activated. The previously latched personalities will then be scanned out to the
next buffer stage in the shift register chain.
The input buffer for the OR plane is shown in Fig.3.5. It is similar to the right half
portion of the input buffer in Fig.3.4, except for an additional transistor Q2, a control
signal FRS, and a BIT_LN coming from the superbuffer S2. In the NORMAL mode,
the input buffer of this plane is idle and is a redundant circuit.
In the WRITE or the SEARCH operation, FRS turns on Q2. The data D goes through
Q1 and S1 and provides the personality (the search argument) to be written (searched)
on ~BIT_LN. Moreover, ~BIT_LN through S2 provides its complement state on
BIT_LN, which is utilized by the comparison circuit of the memory cell in the OR
plane as mentioned in section 3.1.
In the READ mode, Q2 is turned off by FRS. After the personality has been read
through G1 and Q5, it is latched onto the input of S2. The SCAN operation is
performed by clocking SSN1 and SSN2. As a result, the data will be shifted to the
next buffer stage in the chain or will be observed at the SSOUT pin.
3.3 The Output Buffer
The output buffer is shown in Fig.3.6. Q1 is a pull-up load for OUT_LN. The state
of OUT_LN is clocked by PH22 to an inverting superbuffer, S1, during the
NORMAL operation and finally drives the OUT pad to show the state of the boolean
function. Φ2 is originally generated by the timing control circuitry. Through
the line driver S3-5, Φ2 is propagated to the output buffer and is denoted as PH22.
3.4 The Master Driver
The master driver in Fig.3.7 is connected to the system data bus through PMI_LN.
G3-G6 and Q1-Q3 construct a data formatter. G1 and G2 control the pass transistors
Q2 and Q3 to perform this function. The non-inverting superbuffers S1 and S2
enhance the driving capability of the formatted data D and DB. D and DB, like bus
lines, are distributed to the input buffers of different partitions. By clocking PH12,
the data bit is latched onto the node B as a mask bit and C as a configuration bit. As
soon as PH12 is changed to low, the NOR gates G4 and G6 will generate the desired
formatted data, as shown in Fig.2.4, through the masking bits and PMI_LN.
In the WRITE operation, since SRB is high, Q2 is on and Q3 is off. WTB is
low and enables the NOR gate G3. The outputs of G4 and G6 will both be low if the
mask bit latched on the node A is "logic 0". On the other hand, if the latched state of
the mask bit is high, the output of G3 will enable G4 and G6. Since G4 and G6 are
cascaded through Q2, opposite states will appear on the outputs of G4 and G6.
In the SEARCH operation SRB is low. The transistor Q2 will be off and Q3 will be on
if the configuration bit on the node A is low. An additional inverter G5 is appended
between G4 and G6. Meanwhile, WTB is high in this mode and the output of G3 is
kept in a low state. Therefore, the outputs of G4 and G6 will both be in a state that is
the opposite of the state of PMI_LN. On the other hand, if the configuration bit
is high, Q2 is on and Q3 is off. Again, the same state as PMI_LN will appear at the
output of G6 and the opposite state of PMI_LN will appear at the output of G4.
Fig. 3.8 DATA FORMAT FOR WRITE OPERATION
Fig. 3.9 DATA FORMAT FOR SEARCH OPERATION
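The formatting rules stated above for the outputs of G4 and G6 can be tabulated with a short sketch. The function names are hypothetical. One point is an assumption: for a WRITE with the mask bit high, the text only says that the two outputs are opposite, and the polarity used here (G6 follows PMI_LN) is borrowed from the SEARCH case with the configuration bit high.

```python
def format_write(pmi, mask):
    """Return (G4, G6) in the WRITE operation."""
    if mask == 0:
        return (0, 0)        # both outputs low when the mask bit is logic 0
    return (1 - pmi, pmi)    # opposite states; polarity is an assumption


def format_search(pmi, config):
    """Return (G4, G6) in the SEARCH operation."""
    if config == 0:
        return (1 - pmi, 1 - pmi)  # both outputs opposite to PMI_LN
    return (1 - pmi, pmi)          # G6 follows PMI_LN, G4 is its complement


table = {
    (op.__name__, pmi, bit): op(pmi, bit)
    for op in (format_write, format_search)
    for pmi in (0, 1)
    for bit in (0, 1)
}
```

The dictionary `table` enumerates all eight input combinations, which is a convenient form for checking the model against the timing diagrams of Fig.3.8 and Fig.3.9.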
The detailed timing diagrams for the data formatting for the WRITE and SEARCH
operations are shown in Fig.3.8 and Fig.3.9. The dependency relation between
D (DB) of the master driver and ~BIT_LN (BIT_LN) of the input buffer is also shown
in these two figures. The related circuit diagrams are shown in Fig.3.4, Fig.3.5, and
Fig.3.7.
3.5 The Glue Logic Circuit
The pull-up load of P_LN and CMP_LN in Fig.3.10 is connected with the memory
bank of each row. In the SEARCH mode, CMP_LN is kept at GND by clocking
PH12 to gate the control signal SEREN1. This activates the comparison function in
the memory cells of the OR plane as mentioned in section 3.1.
4. THE CONTROL PORTION OF A PLA
The control portion of a programmable logic array provides the associated control and
timing signals to the data portion. It is composed of (a) the row control circuit, (b)
the mode selection and main phase clock circuit, (c) the multiplexing control circuit,
and (d) the glue logic circuit.
4.1 The Row Control Circuit
As mentioned in chapter two, each row in the WPLA has its associated control circuit
as shown in Fig.4.1. The status register consists of two inverters, G8 and G9, and
three pass transistors, Q11-Q13. Similarly, the target register consists of G2, G3, and
Q3-Q5. Normally, the control signals SNB and SN2 turn on Q12 and Q13 (Q3 and Q5)
unless the WPLA is in the TEST mode. The function of the NOR gate G7 (G1) is to
turn on Q11 (Q4) to maintain the feedback path of the tag register, and to disconnect
it whenever any operation control signal is applied to the input of G7 (G1) for updating
the state of the tag registers.
Initially, a master signal RS (Fig.4.2.b) sets the node Y, the output of the status
register, to "logic 1", and resets the node V, the output of the target register, to "logic
0". Since Q14, controlled by the node Y, is turned on, P_LN is pulled down to
GND at this moment, and the row ignores the successive operations.
During the WRITE operation, if this specific row is vacant ("logic 1" on the node Y)
and receives the WRITE permission ("logic 1" from PVBW[n-1]), the following
WRITE clock (WCK) will be gated to the node Z to generate a negative WRITE pulse.
Through superbuffers S2 and S3, WRT will clock the associated memory cells on
this row and will reset the node Y to "logic 0", representing "occupied", through Q16.
As soon as the trailing edge of WCK occurs, Q15 is turned on by the node Z. The
"logic 0" on the node Y passes to the gate G12 and inhibits the following WCK from
activating this row again.
In a like manner, during the SEARCH operation the target row will let its associated
P_LN be high. By applying a strobe pulse (STB), the state of P_LN will be gated
through Q17 to the target register. The state of the target register will be set to
"logic 1" (target flag) on the node V.
During the READ operation, if the row is targeted ("logic 1" on the node V) and
receives the READ permission ("logic 1" from PVBR[n-1]), the following READ
clock (RCK) will be gated to the node W of G4 to form a negative READ pulse.
Through an inverting buffer S1, RD unloads the personalities of the memory cells to
the ~(RD_LN) or (RD_LN) of this particular row and simultaneously resets the node
V through Q2. Like WCK, the trailing edge of RCK turns on Q16 through the
node W and inhibits any following RCK from activating this row again.
On the other hand, during the ERASE operation, if the target tag on the node V is
high, it turns on Q6. The target tag is then shifted to the status register, and the
status register is set to the vacant ("logic 1") state.
The WRITE (WCK) and READ (RCK) permission signals of the current row control
stage are generated by the priority circuit of the previous stage. The priority control
signal PVBW[n-1] (PVBR[n-1]) provides the WRITE permission state (the READ
permission state) for the current stage, and the priority control signal PVBW[n]
(PVBR[n]) provides the WRITE permission state (the READ permission state) for the
next stage.
If a WPLA has k product lines in total, there are k rows in the chip. Each
physical row is assigned a positive integer n (1 ≤ n ≤ k), and the rows are chained together except
for the two dummy priority signals PVBW[0] (PVBR[0]) and PVBW[k] (PVBR[k]).
PVBW[0] (PVBR[0]) is constantly connected to Vdd, and PVBW[k] (PVBR[k]) is
assigned as the handshaking signal WTCMP (RDCMP) with the system controller.
The circuit serving the WRITE permission is implemented by gates G10 and G11.
Row 1 will pass the WRITE permission to row 2 if the node Y in row 1 is "logic 0".
Otherwise, row 1 will write first before it releases the WRITE permission. The
permission signal will follow the chain order and will propagate to the lowest vacant
row. The lowest vacant row blocks the permission signal until it finishes the WRITE
operation. After finishing this operation, the associated status register of this latest
written row is set to "occupied" ("logic 0"), and this lets gate G11 pass the WRITE
permission to the next lowest vacant row until WTCMP is high and is received by the
system controller.
Similarly, the priority circuit for the READ permission is implemented by G5 and
G6. The lowest targeted row on the priority chain, through its target tag with the
high state on the node V, will block the READ permission to the following rows, and
this lets RCK gate into this row. After finishing the operation, the associated target
register of this latest read row is set to "logic 0" on the node V, and this lets the NOR
gate G6 pass the READ permission to the next lowest targeted row until RDCMP is
high and is received by the system controller.
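The priority chain can be sketched as a simple recurrence, shown here as a behavioral model rather than the gate-level circuit. The function names are hypothetical. The list `flags` holds the status tags (True = vacant) for the WRITE chain; passing the target tags (True = targeted) instead models the READ chain in the same way.

```python
def permission_chain(flags):
    """perm[n] models PVBW[n] (or PVBR[n]); perm[0] is the tie to Vdd."""
    perm = [True]
    for flag in flags:
        # A flagged row blocks the permission; an unflagged row passes it on.
        perm.append(perm[-1] and not flag)
    return perm


def selected_row(flags):
    """Return the 1-based number of the row that accepts the next clock, or None."""
    perm = permission_chain(flags)
    for n, flag in enumerate(flags, start=1):
        if flag and perm[n - 1]:
            return n
    return None


def completed(flags):
    """Model of the handshake signal WTCMP (RDCMP), i.e. the last permission output."""
    return permission_chain(flags)[-1]
```

In this model exactly one row is selected per clock, the lowest flagged one, and the handshake output goes high only when no flagged row remains.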
The function of the TEST mode is to check the two tag registers by shifting the test vectors
in from SIN and examining them at SOUT as described in chapter two. SNB is changed to
"logic 0" during this mode. The feedback paths of both tag registers are disconnected
by Q5 and Q12. Through the pass transistors Q8 and Q9, however, a dynamic shift register
chain is configured from the target register and the status register of each row control
stage.
Node A of the current stage is connected to node B of its adjacent higher row except for
the lowest and the highest rows. Node A of the highest row is assigned as SIN, and
node B of the lowest row is assigned as SOUT.
4.2 The Mode Selection and Main Phase Clock Circuits
The mode selection and main phase clock circuit is shown in Fig.4.2. To reduce the
pin count, the operation mode selection in Fig.4.2.a is implemented by a 3-to-8
decoder. The mode selection bits CBT0-2 and their decoded modes are
shown in Table 2.
The bits CBT0-2, through the pad drivers, activate only one line to "logic 0" at a time
except in the TEST mode. In the TEST mode, CBT0-2 activate SN and SSN to
concurrently test the two dynamic register chains configured from the row control circuit
and the input buffers. G1-6 are inverters that provide the complement state of each
connected line in order to control each one's associated operation mode.
The two non-overlapping phase clocks, PHE1 and PHE2 (shown in Fig.4.2.b), are
supplied by the system controller, and their driving capability is enhanced by the two
pad drivers, PS1 and PS2. The inverting buffers, S1 and S2, provide the
complement forms of PHE1 and PHE2. The outputs of these superbuffers, PH1B and
PH2B, are used in the internal control.
4.3 The Multiplexing Control Circuit
The multiplexing control circuit includes a counter, whose modulus is the number of
partitions of the input buffers (as mentioned in section 2.2.2), and a decoder. It generates a set of
multiplexing clock phases (MD0-2) for latching data into the different partitions of the
input buffers.
In Fig.4.3, a modulo-3 counter and decoder are implemented. The modulo-3 counter
is constructed from G6-10 and Q4-11. The decoder consists of G12-13 and Q12-20.
The gates G1-5 and Q1-3 are used to gate the clock PH1B and the clear signal. If the chip is
in neither the WRITE operation nor the SEARCH operation, the output of G1 will
be high to turn on Q1. Then, Q4 and Q5 will reset the counter outputs N2 and N4 to
low. For the WRITE and SEARCH operations, the timing diagram
for the counter and decoder is shown in Fig.4.4. MD0, MD1, and MD2, the
outputs of the decoder, are alternately changed to "logic 0". Through these three
outputs of the decoder, the multiplexing phase clocks W1-W3 and the WRITE pulse
WCK (the STROBE pulse STB) can be generated as in Fig.4.5 and 4.6. The timing
diagram is shown in Fig.4.7.
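The counter and decoder behavior can be sketched as a generator that yields one active-low decoder state per clock. The name `multiplex_phases` is hypothetical, and the order in which MD0, MD1, and MD2 go low is an assumption; the text states only that they go low alternately.

```python
from itertools import islice


def multiplex_phases(partitions=3):
    """Yield (MD0, MD1, ...) tuples in which exactly one output is logic 0."""
    count = 0  # counter state after the clear signal
    while True:
        yield tuple(0 if i == count else 1 for i in range(partitions))
        count = (count + 1) % partitions


first_four = list(islice(multiplex_phases(), 4))
```

With three partitions the sequence repeats every three clocks, so one full row of personalities (or one search key) is latched per cycle of the counter.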
CBT2  CBT1  CBT0  MODE
0     0     0     TEST
0     0     1     MSRS
0     1     0     SCAN
0     1     1     READ
1     0     0     ERASE
1     0     1     SEARCH
1     1     0     WRITE
1     1     1     NORMAL
Table 2 Mode Select
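Table 2 can be expressed as a lookup, which is a convenient form for a system controller model. The names `MODES` and `decode_mode` are hypothetical; the bit patterns and mode names are those of the table.

```python
# Mode names indexed by the 3-bit value CBT2 CBT1 CBT0 (from Table 2).
MODES = {
    0b000: "TEST",
    0b001: "MSRS",
    0b010: "SCAN",
    0b011: "READ",
    0b100: "ERASE",
    0b101: "SEARCH",
    0b110: "WRITE",
    0b111: "NORMAL",
}


def decode_mode(cbt2, cbt1, cbt0):
    """Return the operation mode selected by the three mode selection bits."""
    return MODES[(cbt2 << 2) | (cbt1 << 1) | cbt0]
```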
64
^LU
X
a.
o
o
u
Ui
CO
<
65
- *
o
a
or
z^r
<r
(9
c
O
*1
Ih
^
>
o
*-i_r*
A
UP
?i
\7 \7 \7
J
a
z
to v>
66
SEREN i
JWTEN(SEREN)
PH 1 B
PH2
u I
n 1
J
! n
u I
n 1
r
n
u
n
:Li
n
N 5 1 \ i
N S 1
N 7
~
N 8 I
N 8
'!
ri ii rL H n
N1 0 Ll u u u u
N1 1
N1 2 ij
N 2 ii
n a
N 4
iM D 0
M D 1
i
M D 2 i
Fig. 4.4 TIMING DIRQRAU FOR MULTIPLEXING
CONTROL CIRCUIT
67
4.4 The Glue Logic Circuit
The associated control signals, other than those for the WRITE and the SEARCH
operations, are described next. The NOR gate G1 and a superbuffer
S3 (shown in Fig.4.2.b) use MSRSB and PH1B to generate the master reset pulse RS
in order to initialize all status and target registers in the row control circuit.
In the READ mode or the ERASE mode, the relevant control signals are ST and FRS
(shown in Fig.4.5) and RCK and ERA (shown in Fig.4.6). All of them are generated
by the mode enable signals RDEN, RDENB, EREN and SSN, and by the two
non-overlapping phase clocks, PH1 and PH2. Similarly, SSN1 and SSN2 are
generated by the scan enable signal, SSNB, and the clock phases, PH1B and PH2B.
The timing diagram associated with the READ, ERASE, and SCAN modes is shown
in Fig.4.8. The other timing control signals shown in Fig.4.5 and Fig.4.6 are Φ1, Φ2,
SN1 and SN2. Φ1 and Φ2 are used individually to latch the input and output
data during the NORMAL operation. SN1 and SN2 provide the two non-overlapping
scan clock phases for the row control circuits during the TEST operation. The timing
diagram for these two operations is shown in Fig.4.9.
To speed up the operation, line drivers are appended to some of the control and data
signals as shown in Fig.4.10. Each input and output pin also has its own pad driver
to enhance the driving capability of this design.
Fig. 4.7 TIMING DIAGRAM FOR WRITE/SEARCH MODE
Fig. 4.8 TIMING DIAGRAM FOR READ, ERASE, AND SCAN MODE
Fig. 4.9 TIMING DIAGRAM FOR NORMAL AND TEST MODE
5. IMPLEMENTATION AND VERIFICATION
A WPLA with 22 inputs, 22 outputs, and 64 product terms, the same configuration
as Marchand's chipUS], has been implemented here. The bus-wide master drivers in
the WPLA are constructed with a 16 bit structure which provides for an interface
with a 16 bit system data bus. The input buffers for the AND and OR planes are
divided into three partitions of 16 stages, 16 stages, and 12 stages. These 44 stages
are available for latching data by multiplexing from the master drivers. The
functional and timing behavior has been evaluated using the Mentor Graphics logic
simulator (QUICKSIM) and circuit simulator (MSIMON), and will be described in the
following sections.
5.1 Implementation
The layout of the WPLA is implemented using the Mentor Graphics layout editor
CHIPGRAPH with 2 µm NMOS technology and MOSIS NMOS design rules. ECAD
Corporation's DRACULA IC layout verification system was used to ensure that the
MOSIS design rules were correctly followed. The overall chip size, including
bonding pads, is 10.3mm*10.5mm. A drawing of the pad assignments for the total
57 pins is shown in Fig.5.1. The transistor count in this chip is a total of 50,028,
including 10,430 depletion transistors and 39,589 enhancement transistors.
Fig. 5.1 PAD ASSIGNMENT OF WPLA (22*64*22)
The event-driven logic level simulator, QUICKSIM, and the timing analyzer,
MSIMON, have been successfully used to verify the functional and timing behavior of
the implemented WPLA.
5.2 Functional Verification
The eight basic operations and four of the macro operations were checked using seven
subtests:
(1) MASTERRESET_then_WRITE
(2) WRITE_then_NORMAL
(3) NORMAL
(4) SEARCH
(5) SEARCH_then_ERASE
(6) SEARCH_then_READ_then_SCAN
(7) TEST
The test set-up, test vectors, and test responses for each subtest are in the prepared
manual. Subtest (4) is used specifically in QUICKSIM to test the targeted P_LNs
and their associated target tags. The PROBE function (QUICKSIM) allows the states
of the P_LNs and the target tags to be observed directly without an additional TEST
operation. In the test of the finished product, users can conduct a macro operation,
SEARCH-TEST (instead of a basic operation), in order to observe the same results.
5.3 Timing Evaluation
MSIMON is a powerful CAD tool used to analyze the worst case timing behavior of a
circuit. This is especially useful for circuits designed using multiple non-overlapping
clocks. In this test, the worst case duration for the phase clocks, PHE1 and PHE2,
is determined for the WPLA.
Since the WRITE/SEARCH operations take more time to activate the additional
counter and the decoder, as mentioned in chapter four, these two operational modes, as
well as the NORMAL operation, dominate the cycle time of the phase clocks.
According to these observations, the NORMAL operation and the WRITE/SEARCH
operations place different duration requirements on the phase clocks.
In the NORMAL operation, the duration of PHE1 should be longer than
that of PHE2. This is reversed in the WRITE/SEARCH operations. Therefore,
there are two alternative clock strategies, and both of them have their pros and
cons.
The first strategy is the dual clock scheme, which uses two sets of non-overlapping
phase clocks, one set for the NORMAL operation, and the other set for the remaining
operations. The advantage of this strategy is that the operation speed is faster in the
NORMAL operation mode. However, the system controller has to monitor the
operation mode to provide different clock phases to the WPLA.
The second strategy uses a single set of phase clocks for all operations. Unlike the dual clock
scheme, the system controller need not supervise the operation modes in order to
deliver the different phase clock sets. On the other hand, the worst case operation
frequency will be slower than that in the dual clock scheme.
6. Performance Evaluation and Comparison
A WPLA belongs to the application memory chip class. It not only includes the
properties of the PLA and the SRAM, but also combines some functions of the APLA
and the general purpose CAM. Comparisons among the WPLA and different
memory types, based on their functionalities, speeds and chip sizes, are shown in
Tables 6-8.
The WPLA can perform the same function as the PLA and is further reconfigurable by
programming different personalities through its versatile operation modes. The
memory cell of the WPLA is similar to that of an SRAM, but it includes additional comparison
circuits. Due to the additional internal control circuits and the augmentation of the
memory cell, there are penalties in slower speed and larger chip
size. The trade-offs among WPLA, PLA, and SRAM are listed in Table 3 and Table 4.
The performance evaluation, in the following sections, will focus on the comparisons
between WPLA and APLA[16] and between WPLA and CAM[13].
6.1 Comparison between WPLA andAPLA
In term of functionalities, both WPLA and APLA can implement the sum of product
boolean equations and the state machines. The personalities in both chips can be
programmable and updatable. The differences betweenWPLA andAPLA are:
(a) In the WRITE mode the WPLA uses segmented write by multiplexing through a
bus-wide length instead of using the serial scan approach in APLA. In
comparison toAPLA, the write speed ofWPLA(inTable 7) is 18 times fasterwhen
using a single set phase clock, and 22 times faster when using a two set phase
clock.
(b) The pseudorandom addressing scheme in WPLA alleviates the necessity of using
a table or exhaustive search to find a free location for a WRITE operation.
Without knowing where a free location is, the self addressing scheme provides a
free location through a priority chain. In the APLA, the addressing scheme,
provided by the scan-in and decoding circuit, is time consuming and the vacant
rows or columns need to be known by the users.
(c) No background pattern is required to put into the vacant rows. The WPLA
provides ease of initialization and an ERASE operation. In the APLA, The
82
vacant rows and columns must be programmed for the background pattern during
the initialization and ERASE operation in order to avoid malfunctions in the
NORMAL mode.
(d) A powerful content addressable search in the WPLA makes erasing and
updating personalities easy and amenable to fault tolerant applications. In
contrast, the APLA has no way to perform a content addressable search.
(e) The control and timing interface to the system controller is simple in the WPLA.
In contrast, the system controller of the APLA needs to generate
complicated timing in order to control the operations and the refresh.
(f) The disadvantages of the WPLA are a larger chip size and slower speed in the
NORMAL operation. The pseudo-static memory cell of the WPLA is
compared with the dynamic memory cell of the APLA in Table 2. The area factors of
the memory cells in the AND plane and the OR plane are 4.99 and 6.78, respectively.
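
The priority chain mentioned in item (b) can be pictured with a short behavioral sketch. It is an illustration only: the function name `select_free_row`, the list-of-flags representation, and the assumption that row 0 has the highest priority are all hypothetical and do not describe the actual row control circuit.

```python
from typing import List, Optional


def select_free_row(vacant: List[bool]) -> Optional[int]:
    """Return the row that a simple priority chain would grant.

    vacant[i] is True when row i holds no personality.  Rows are scanned
    from the highest priority (index 0, an assumption) downward; occupied
    rows pass the grant on, and the first vacant row keeps it.
    None means that every row is occupied.
    """
    for row, is_vacant in enumerate(vacant):
        if is_vacant:
            return row
    return None


# Rows 0 and 1 are occupied; rows 2 and 4 are vacant.
flags = [False, False, True, False, True]
print(select_free_row(flags))
```

No address is supplied by the caller in this sketch; the vacancy flags alone decide where the WRITE lands, which is the property that removes the need for a lookup table or an exhaustive search.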
6.2 Comparison between WPLA and CAM
The CAM that will be compared with the WPLA is a general purpose VLSI CAM[13]
with READ, WRITE, ASSOCIATE and other content-oriented operations. Both the
WPLA and CAM have a content addressable capability and provide the ability to
examine the targeted rows. However, there are some variations between these two
architectures, and the major differences are listed below.
(a) Functionally, a NORMAL operation in the WPLA drives the AND
plane, which acts like a search plane, by applying a search argument or key through the
primary inputs. The AND plane then activates the OR plane, which acts as a functional
plane that constructs the Boolean functions. A CAM has no functional plane.
An associative search is conducted on its single plane, the targeted
row(s) or record(s) are retrieved immediately, and a READ
operation follows. Without architectural modification, a CAM cannot implement
sum-of-products Boolean functions or state machines. However, CAMs
provide more versatile and faster data retrieval capabilities, especially in
associated-linked and ordered retrieval, which may make them more suitable for
LISP or database machines.
(b) The WRITE mode in the CAM operates just like a standard RAM WRITE. The
address mask register is loaded with the word address. Then the data mask
register is loaded with the word data. Finally the match control register is
loaded with the write signal. In the WPLA, the row to be written is
pseudorandomly selected by the priority chain in the row control circuits. As a
result, the CAM can overwrite a chosen word through the above addressing
scheme, whereas the WPLA always writes into the current lowest vacant row,
regardless of which rows were released during the previous ERASE operation, unless the
lowest vacancy is the same as one of the released rows.
(c) A bit in the AND plane of the WPLA is constructed from a pair of memory cells so
that mask information (a don't care state) can be pre-programmed during the WRITE
operation. The CAM uses only a single memory cell for each bit and has no
capability to pre-program the mask information. The mask capability of the CAM is
implemented only in the SEARCH operation and depends on a search argument
and a mask register. The WPLA also includes this mask scheme, implemented
through a pre-latched configuration bit and a data bit in the master driver.
Moreover, the formatted search argument generated by the master driver is
capable of searching for the "don't care" state or masking a bit, as well as searching
for "logic 1" and "logic 0", in the AND plane. These four types of formatted data,
"logic 1", "logic 0", "don't care search" (both "logic 1"), and "mask a bit" (both
"logic 0"), not only provide a more powerful search capability for the WPLA to
update personalities, but are also applicable to minimizing the two level Boolean
functions stored; they even support PLA partitioning and folding.
(d) In a READ operation immediately following the SEARCH/ASSOCIATE
operation, the WPLA, like the CAM, determines which targeted row or
column is read first through a priority circuit. The personality in the WPLA is
examined through a serial scan path, whereas the data in the CAM is examined in
parallel through the bit lines, so the WPLA takes more time to examine the target
row(s). Since the macro operation SEARCH-READ-SCAN in the WPLA is
used only for off-line test purposes, this does not affect the NORMAL operation of
the WPLA, because there is no need to examine the personalities of the target
row(s) in the WPLA at that moment.
(e) The chip size of the WPLA is somewhat larger than the CAM due to the larger
number of I/O pins and the overhead in the control circuit.
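
The four kinds of formatted search data in item (c) can be illustrated with a small matching model. This is a sketch under stated assumptions, not the thesis's cell design: the text only says that "don't care search" sets both lines to logic 1 and "mask a bit" sets both to logic 0. The pair encodings chosen for "logic 1" and "logic 0", the matching rule, and the names `bit_matches` and `matching_rows` are hypothetical.

```python
from typing import List, Sequence, Tuple

Pair = Tuple[int, int]

# Assumed two-line encoding of one AND-plane bit (hypothetical).
LOGIC_1: Pair = (1, 0)
LOGIC_0: Pair = (0, 1)
DONT_CARE: Pair = (1, 1)  # stored don't care, or a search for it (both 1)
MASK: Pair = (0, 0)       # search side only: ignore this bit (both 0)


def bit_matches(search: Pair, stored: Pair) -> bool:
    """Assumed rule: a search line at 1 requires a 1 in the paired cell."""
    return all(s <= c for s, c in zip(search, stored))


def matching_rows(search: Sequence[Pair],
                  plane: Sequence[Sequence[Pair]]) -> List[int]:
    """Return the indices of rows in which every bit position matches."""
    return [i for i, row in enumerate(plane)
            if all(bit_matches(s, c) for s, c in zip(search, row))]


plane = [
    [LOGIC_1, LOGIC_0, DONT_CARE],    # row 0: 1 0 -
    [LOGIC_1, DONT_CARE, DONT_CARE],  # row 1: 1 - -
    [LOGIC_0, LOGIC_1, LOGIC_1],      # row 2: 0 1 1
]

# Mask the middle bit and look for a stored don't care in the last bit.
print(matching_rows([LOGIC_1, MASK, DONT_CARE], plane))
# Require a stored don't care in both of the last two bits.
print(matching_rows([LOGIC_1, DONT_CARE, DONT_CARE], plane))
```

In this toy model a masked position matches any stored value, while a "don't care search" matches only rows where the don't care state was pre-programmed, which is what allows a specific personality to be located for updating.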
7. The Applications of Writable Programmable Logic Array
A WPLA has an advantage over a combinational logic circuit not only in its
regularity of structure, but also in its ability to program personalities. In addition,
the pseudorandom WRITE, content addressable SEARCH, and ERASE
operations of the WPLA can be used to advantage in many applications.
7.1 Reconfigurable Combinational Circuit
A WPLA can implement combinational functions just as a PLA can.
However, the function in a WPLA can be modified by reprogramming
the personalities, without any change in either the design or the layout
of the structure. Moreover, the SEARCH_ERASE_WRITE macro operation can
dynamically reconfigure the implemented functions within a single WPLA chip if
macro operations can operate on-line or concurrently with the other
executing modules in the system. This macro operation makes the WPLA attractive
for dynamically reconfigurable systems and saves hardware overhead at the system
design level. Thus, the larger silicon area of the WPLA, in comparison to a PLA, may be
overlooked when considering its contribution at the system level.
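
As a rough illustration of reconfiguring a combinational function, the sketch below models a WPLA behaviorally. The class `ToyWPLA`, its method names, and the string representation of personalities ("1", "0", and "-" for don't care) are assumptions made for this example; they are not taken from the design itself.

```python
from typing import List, Optional, Tuple

Row = Optional[Tuple[str, str]]  # (AND pattern, OR pattern); None = vacant


class ToyWPLA:
    """Behavioral toy model of a WPLA (hypothetical, not the real circuit)."""

    def __init__(self, rows: int, outputs: int) -> None:
        self.rows: List[Row] = [None] * rows
        self.outputs = outputs

    def write(self, and_part: str, or_part: str) -> int:
        """WRITE: store a personality in the lowest vacant row."""
        row = self.rows.index(None)  # raises ValueError if no row is vacant
        self.rows[row] = (and_part, or_part)
        return row

    def search_erase(self, and_part: str) -> int:
        """SEARCH then ERASE: free every row whose AND pattern is equal."""
        hits = [i for i, r in enumerate(self.rows) if r and r[0] == and_part]
        for i in hits:
            self.rows[i] = None
        return len(hits)

    def evaluate(self, inputs: str) -> str:
        """NORMAL: OR together the output patterns of all activated rows."""
        out = [0] * self.outputs
        for r in self.rows:
            if r and all(p in ("-", x) for p, x in zip(r[0], inputs)):
                out = [o | int(b) for o, b in zip(out, r[1])]
        return "".join(str(o) for o in out)


pla = ToyWPLA(rows=4, outputs=1)
pla.write("10", "1")  # XOR expressed as two product terms
pla.write("01", "1")
print([pla.evaluate(x) for x in ("00", "01", "10", "11")])

pla.search_erase("10")  # reconfigure the same array to AND
pla.search_erase("01")
pla.write("11", "1")
print([pla.evaluate(x) for x in ("00", "01", "10", "11")])
```

The array object is never rebuilt between the two functions; only the stored personalities change, which mirrors the claim that neither the design nor the layout needs to be altered.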
7.2 Programmable Finite State Machine
A WPLA can be used to implement a finite state machine with two
non-overlapping phase clocks (PH1, PH2). The outputs of a sequential machine under a
given input sequence are reconfigurable in a WPLA, in contrast to the fixed code in a
PLA. A WPLA's ability to be reconfigured extends the product life cycle and provides
a modification capability for field applications.
Like a reconfigurable combinational circuit, a sequential machine can also
dynamically change its personalities to implement complicated functions within
a single WPLA chip with its associated reconfigurable firmware. This contribution
means the WPLA is also applicable to a dynamically reconfigurable state machine
design instead of a statically reconfigurable one.
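
The effect of changing personalities on a sequential machine can be sketched as follows. The dictionary form of the personality, the function name `run_fsm`, and the split of work between PH1 and PH2 in the comments are assumptions made for illustration; the text states only that two non-overlapping phase clocks are used.

```python
from typing import Dict, List, Tuple

# Personality: (input bit, current state) -> (next state, output bit)
Personality = Dict[Tuple[str, str], Tuple[str, str]]


def run_fsm(table: Personality, inputs: str, state: str = "0") -> str:
    """Step a one-bit-state machine through an input sequence."""
    outputs: List[str] = []
    for bit in inputs:
        # PH1 (assumed): the array evaluates the input and latched state.
        next_state, out = table[(bit, state)]
        # PH2 (assumed): the state register captures the fed-back state.
        state = next_state
        outputs.append(out)
    return "".join(outputs)


# Personality A: output 1 when the current and previous inputs are both 1.
two_ones: Personality = {
    ("0", "0"): ("0", "0"), ("1", "0"): ("1", "0"),
    ("0", "1"): ("0", "0"), ("1", "1"): ("1", "1"),
}

# Personality B: output the running parity of the 1s seen so far.
parity: Personality = {
    ("0", "0"): ("0", "0"), ("1", "0"): ("1", "1"),
    ("0", "1"): ("1", "1"), ("1", "1"): ("0", "0"),
}

sequence = "0110111"
print(run_fsm(two_ones, sequence))
print(run_fsm(parity, sequence))
```

The same stepping function stands in for the fixed hardware; swapping the table changes the output sequence produced for an identical input sequence.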
7.3 Peripheral Device Controller
Peripheral devices normally run at low speed compared with CPU operations. An
interface between the computer and the peripheral device implemented by a WPLA,
as in Fig. 7.1, has the benefit of good programmability and updatability. The
personalities can be written into a WPLA chip through a microprocessor or a
computer data bus, or be modified to upgrade the performance. Moreover, a WPLA-based
design can become a standard interface for all slow speed peripheral devices. The
changeable personalities of a WPLA provide the different control sequences for the
associated peripheral devices.

Fig. 7.1 A WPLA PERIPHERAL CONTROLLER
(Block labels: DATA BUS, DATA, CONTROL, PERIPHERAL DEVICE, WPLA PERIPHERAL
CONTROLLER, MICROPROCESSOR, MEMORY)
7.4 Fast Turnaround Time Custom Design
While a computer or a digital system is being designed, the complete machine instruction
set, or the control sequence, may not be well defined until the designers have dealt
interactively with the customers and obtained final approval of the prototype. In the
design phase, a WPLA can easily change its personalities through operations such as
SEARCH, ERASE and WRITE, which supports a fast turnaround time.
In the final product, if the speed requirements are not critical and do not exceed the
maximum operating frequency of a WPLA, the WPLA-based design will continue to
provide the capability of fast engineering change and low modification cost for the
customer. Conversely, if the maximum operating frequency of a WPLA is lower than
the specification of the final product requires, the WPLA can easily be
replaced by a PLA with the same personalities and will still have contributed to a fast
turnaround time in the design phase.
7.5 An Emulator for Different Computer Systems
The need to execute the instruction set of another computer for which the user has
existing programs is quite common. Regardless of whether the control portion of the
computer conducting this operation uses PLAs or micro stores, the
instructions could be executed by simulation: they can be interpreted through a
program written in the original machine instruction set. However, simulation
usually incurs a high performance penalty. Even when multiple PLA
chips are used to emulate the different instruction sets, each enabled by a chip select
(as shown in Fig. 7.2), the hardware is still costly, lacks updatability,
and is limited to emulating a few machines.
Using a WPLA and a database on disk that stores the different control sequences for
each emulated system, a system can emulate a number of different systems with a
single WPLA chip. It can also use online operations to program or update the
personalities.
7.6 Testability and Reliability
Design for testability is one of the most important issues in VLSI design. Good
controllability and observability in the WPLA make the system easy to test, and a
testable design can save an engineer a lot of work. The tests can be conducted either
in the design phase or during the field application period.

Fig. 7.2 AN EMULATOR FOR DIFFERENT COMPUTER SYSTEMS
(a) A PLA APPROACH (block labels: SET 1 to SET n, SEL 1 to SEL n, each with AND and
OR planes; FLAG REG.; DATA BUS)
(b) WPLA APPROACH (block labels: WPLA with AND and OR planes; FLAG REG.; DATA PATH;
DATA BUS)
During the field application period, a testable design provides good
maintainability and availability for the system. Moreover, if the macro operations
are allowed to execute online, and the control sequences of the system are issued from
the WPLA, then fault tolerance tests in the data portion can be conducted by
reconfiguring the control sequences or by using the redundant rows in the WPLA to
recover from the fault. For example, if a built-in hardware multiplier in the system is
faulty and the fault is detectable, an adder with control sequences that simulate the
function of the multiplier can replace the faulty multiplier immediately and back up
this operation. The original control sequences for the fast multiplier are erased
through a macro operation, SEARCH-ERASE.
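
To make the multiplier example concrete, the following sketch shows one way an adder plus a control sequence could stand in for a multiplier. Shift-and-add is chosen here purely as an illustration; the text does not say which control sequence would be used, and the function name `multiply_with_adder` and the `width` parameter are hypothetical.

```python
def multiply_with_adder(a: int, b: int, width: int = 8) -> int:
    """Multiply two unsigned integers using only add and shift steps.

    Each loop pass stands for one step of a control sequence that drives
    the adder; `width` is the assumed multiplier word length in bits.
    """
    if a < 0 or b < 0:
        raise ValueError("operands must be unsigned")
    if b >> width:
        raise ValueError("multiplier does not fit in the given width")
    product = 0
    for _ in range(width):
        if b & 1:          # low multiplier bit selects an add step
            product += a   # the adder is the only arithmetic unit used
        a <<= 1            # shift the multiplicand left
        b >>= 1            # shift the multiplier right
    return product


print(multiply_with_adder(13, 11))
```

The loop takes `width` passes where a hardware multiplier would need a single operation, which reflects the kind of slowdown to be expected when a function is emulated by control sequences.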
Performance degradation is unavoidable in this approach, but it achieves fault
tolerance at the functional level instead of the circuit level. Since multiple P LNs
and OUT LNs may be activated concurrently, online fault isolation and
correction are difficult to perform in the WPLA itself. The WPLA design, so far, has
not explored the fault tolerant capability. However, by modifying the current WPLA
architecture, or by using error-correcting coding schemes, a fault
tolerant capability may be achieved in the future, extending fault tolerance both to the
data portion and to the control portion of a WPLA-based design.
8. Conclusion
A WPLA uses a PLA architecture with pseudo-static memory cells and employs a set
of macro operations. Without changing the design or the chip layout, the embedded
Boolean functions of a WPLA based design can be reconfigured simply
by dynamically modifying or reprogramming the stored personalities.
In contrast to an APLA, a WPLA uses an improved writing scheme in both the data
portion and the addressing mechanism. In the data portion of a WPLA, the
personality to be written is fed into the input buffers in parallel instead of serially.
In the addressing mechanism, a new pseudorandom addressing technique was
introduced to replace the address decoding approach and to accelerate the WRITE
speed by at least a factor of 18. In addition, the augmented functions, such as
content-addressable SEARCH and ERASE operations, also contribute to fast
updatability. Moreover, the ease of interfacing and user friendly features, together with a
good testability design, suit a WPLA-based system to situations that demand
fast turnaround time, customized design, and low volume applications.
Because additional circuits are used to implement the additional functions in a WPLA,
as mentioned above, more silicon area is needed than in a PLA or an
APLA. However, as the number of inputs, outputs, and product lines increases, the
internal control circuits are shared by the enlarged data portion, except for those
that must be duplicated in the row control circuit. This reduces the percentage of the
overall chip size taken by the internal control circuits.
Another difference between a WPLA and a PLA is that the rows in a WPLA may not
be fully occupied. The physical area for the vacant rows remains on the chip to
maintain updatability. Therefore, a WPLA based design with good functionality
may not be dense. In contrast, a PLA based approach implements a
set of specific Boolean functions or control sequences with a fixed number of
product lines, and hence no redundancy is embedded for future updating.
Generally, a PLA based design is denser and more compact than
a WPLA based design. Although the trade-off between functionality and chip size
is decided by the user, WPLAs with different configurations can be manufactured
like standard devices. The various forms of WPLAs give users choices
that best fit their designs with the least possible overhead while keeping a sufficient
number of redundant rows for updating or reconfiguring the personalities.
On the other hand, the low speed in the NORMAL operation is another disadvantage
of a WPLA. This penalty results from more internal control circuits leading to the
same phase clock, from the large chip size, and from the long wire connections. If the
pin count is not a critical constraint in future package technology, then the speed
of a WPLA, in all operation modes, can be improved by externally providing the
multiplexing phase clocks and mode control signals instead of using a built-in multiplexer
and a mode decoder. If the WPLA design is scaled for the coming 1 µm silicon
technology, its predicted speed of operation improves: the WPLA may be operated
at approximately 7.5 MHz with a dual phase clocking scheme and at 4.7 MHz with a
single phase clocking scheme. Then the functionality, speed, performance, and
smaller chip size may expand the use of WPLAs to other applications in
addition to the fast turnaround custom design field.
REFERENCES
1. [Agrawal 1983]
V. D. Agrawal and S. K. Jain, "Test Generation for MOS Circuits Using
D-Algorithm," Proc. 20th Design Automation Conference, June 1983, pp. 347-354.
2. [Bose 1982]
P. Bose and J. A. Abraham, "Test Generation for Programmable Logic Arrays,"
Proc. 19th Design Automation Conference, June 1982, pp. 574-580.
3. [Bozorgui-Nesbat 1984]
Saied Bozorgui-Nesbat and Edward J. McCluskey, "Lower Overhead Design for
Testability of Programmable Logic Arrays," Proc. International Test Conference,
1984, pp. 856-863.
4. [Bryant 1984]
R. E. Bryant, "A Switch-Level Model and Simulator for MOS Digital Systems,"
IEEE Trans. on Computers, vol. c-33, no. 2, Feb. 1984, pp. 160-177.
5. [Daehn 1981]
W. Daehn and J. Mucha, "A Hardware Approach to Self-Testing of Large
Programmable Logic Arrays," IEEE Trans. Computers, vol. c-30, Nov. 1981,
pp. 829-833.
6. [Fong 1984]
E. Fong, M. Converse, and P. Denham, "An Electrically Reconfigurable
Programmable Logic Array using a CMOS/DMOS Technology," IEEE J.
Solid-State Circuits, vol. sc-19, no. 6, Dec. 1984, pp. 1041-1043.
7. [Fujiwara 1985]
Hideo Fujiwara, R. Treuer and V. K. Agarwal, "A Low Overhead, High
Coverage, Built-in, Self-Test PLA Design," Digest 15th Int. Symp. on
Fault-Tolerant Computing, June 1985,
8. [Fujiwara 1983]
Hideo Fujiwara, Kewal K. Saluja, and K. Kinoshita, "An Easily Testable Design of
Programmable Logic Arrays for Multiple Faults," IEEE Trans. Computers, vol.
c-32, no. 11, Nov. 1983, pp. 1038-1046.
9. [Fujiwara 1981]
Hideo Fujiwara and K. Kinoshita, "A Design of Programmable Logic Arrays with
Universal Tests," IEEE Trans. Computers, vol. c-30, Nov. 1981, pp. 823-828.
10. [Hong 1980]
S. J. Hong and D. L. Ostapko, "FITPLA: A Programmable Logic Array for Functional
Independent Testing," Digest 10th Int. Symp. on Fault-Tolerant Computing, 1980,
pp. 131-136.
11. [Hua 1984]
K. A. Hua, J. Y. Jou and J. A. Abraham, "Built-In Tests for VLSI Finite State
Machine," Digest 14th Int. Symp. on Fault-Tolerant Computing, June 1984,
pp. 422-425.
12. [Hwang 1985]
Kai Hwang and Fayé A. Briggs, Computer Architecture and Parallel Processing,
Reading, Mass., 1985.
13. [Khakbaz 1983]
J. Khakbaz, "A Testable PLA Design with Low Overhead and High Fault
Coverage," Digest 13th Int. Symp. on Fault-Tolerant Computing, June 1983,
pp. 426-429.
14. [Kohonen 1982]
Teuvo Kohonen, Content Addressable Memory, Springer-Verlag, Reading,
Mass., 1982.
15. [Kohonen 1977]
Teuvo Kohonen, Associative Memory, Springer-Verlag, Reading, Mass., 1977.
16. [Marchand 1985]
J. F. Philippe Marchand, "An Alterable Programmable Logic Array," IEEE J.
Solid-State Circuits, vol. sc-20, no. 5, Oct. 1985, pp. 1061-1066.
17. [Min 1984]
Y. Min, "A PLA Design for Ease of Test Generation," Digest 14th Int. Symp. on
Fault-Tolerant Computing, June 1984, pp. 436-442.
18. [Treuer 1985]
Robert Treuer, Hideo Fujiwara, and Vinod K. Agarwal, "Implementing a Built-in
Self-Test PLA Design," IEEE Design & Test, Apr. 1985, pp. 37-48.
19. [Wood 1981]
R. A. Wood, Y. Hsieh, C. A. Price and P. Wang, "An Electrically Alterable PLA
for Fast Turnaround-Time VLSI Development Hardware," IEEE J. of Solid-State
Circuits, vol. sc-16, no. 5, Oct. 1981, pp. 570-577.

