The design and implementation of a read prediction buffer by Nowicki, Gary Joseph
Calhoun: The NPS Institutional Archive
Theses and Dissertations Thesis Collection
1992-12
The design and implementation of a read prediction buffer
Nowicki, Gary Joseph






Approved for public release; distribution is unlimited
THE DESIGN AND IMPLEMENTATION OF A READ PREDICTION BUFFER
by
Gary Joseph Nowicki
Lieutenant, United States Navy
B.S.E.E., University of South Carolina, 1987
Submitted in partial fulfillment of the
requirements for the degree of





Naval Postgraduate School
Supplementary notation: The views expressed in this thesis are those of the
author and do not reflect the official policy or position of the
Department of Defense or the US Government.
Subject terms: VLSI (very large scale integration) design; memory
address prediction; dynamic RAM; MAGIC; CMOS; cache performance
improvement
ABSTRACT
Cache memories, which are the level of memory between the
CPU and the main memory, hold small amounts of data and
instructions, and allow the CPU to access the contents in them
very quickly. This significantly reduces the read access time
for the CPU if the required information is available in the
cache. However, caches are small and can only hold the most
commonly used data and instructions required by the CPU. When
information requested does not appear in the cache, a "cache
miss" occurs and the CPU must fetch the required data from the
main memory. The Read Prediction Buffer reduces this
time-costly read access by attempting to predict the possible
miss address and pre-fetching the read data.




TABLE OF CONTENTS
I. INTRODUCTION
A. THEORY OF OPERATION
B. BASIC DYNAMIC RAM
1. Dynamic Memory Operation
C. RESEARCH GOALS
D. REQUIRED CAD TOOLS
E. THESIS STRUCTURE
II. READ PREDICTION ALGORITHM AND BUFFER DESIGN
A. THE READ PREDICTION ALGORITHM
B. RPB DESIGN
1. Finite State Machine
III. IMPLEMENTATION
A. REGISTER CELL
B. ADDER CELL
C. COMPARATOR CELL
D. MULTIPLEXER CELL
E. FINITE STATE MACHINE
F. POWER ESTIMATES
IV. SIMULATIONS
A. FINITE STATE MACHINE TESTING
B. RPB CIRCUIT TESTING
V. CONCLUSIONS AND RECOMMENDATIONS
A. CONCLUSIONS
B. RECOMMENDATIONS
APPENDIX A. FINITE STATE MACHINE DESIGN PRINTOUTS
A. PEG PROGRAM FOR FSM
B. PEG OUTPUT
C. EQNTOTT OUTPUT
APPENDIX B. BASIC CELL LAYOUTS
A. MAGIC LEGEND FOR LAYOUTS
B. BASIC REGISTER LAYOUT
1. Two's Complement Register
C. BASIC ADDER LAYOUT
D. BASIC COMPARATOR LAYOUT
E. BASIC MULTIPLEXER LAYOUT
F. FINITE STATE MACHINE
G. RPB FLOOR PLAN
H. READ PREDICTION BUFFER
APPENDIX C. SIMULATION DATA
A. FINITE STATE MACHINE ESIM RUN
B. READ PREDICTION BUFFER CIRCUIT ESIM RUN
LIST OF REFERENCES
INITIAL DISTRIBUTION LIST
LIST OF TABLES
Table I. ADDER TRUTH TABLE
Table II. BASIC CELL PEAK CURRENTS
Table III. CELL RESISTANCE AND VOLTAGE DROPS
LIST OF FIGURES
Figure 1. Basic RPB Algorithm Flow Chart
Figure 2. Read Prediction Buffer Block Diagram
Figure 3. Finite State Machine Flow Chart
Figure 4. Clocked Inverter
Figure 5. Latch Make-up (a), Latch Symbol (b)
Figure 6. Basic Register Cell
Figure 7. Basic Adder Cell
Figure 8. Basic Comparator Cell
Figure 9. Basic Multiplexer Cell
Figure 10. Block Diagram Simulation Labels
I. INTRODUCTION
A. THEORY OF OPERATION
Modern high performance microprocessor systems have
increased memory bandwidth, due in part to high speed cache
memories that are logically situated between the
microprocessor and main memory. The cache itself is much faster
than main memory, but it is also smaller and can hold only a
fraction of the contents of main memory. Caches are built from
high speed CMOS static memory, which has a typical access time
of 30 ns or less, whereas main memory is built from lower
speed dynamic memory circuits with typical access times of
around 70 ns. [Ref. 1: pp. 2-13]
The cache holds the most recently accessed data and
instructions. Whenever the CPU requires data or code, it first
checks whether the item is in the cache. If it is not present,
a "cache miss" occurs, and the required information must be
obtained from main memory. This causes a substantial delay
until the read access of main memory completes and the
required data is acquired. Cache miss rates can be quite high
and can significantly reduce system performance. [Ref. 2: pp.
408-428]
The Read Prediction Buffer predicts the address of the next
cache miss read request from the pattern shown by the previous
reads, pre-fetches the data or instruction at the predicted
address, and determines whether the predicted address matches
that of the next read access. In this manner, the information
for the future read will be ready and waiting for the CPU when
a cache miss occurs.
Reads rather than writes are predicted for two reasons.
First, once a prediction is made, a pre-read can be performed
to pre-fetch the data at the indicated address. A pre-write,
in contrast, makes no sense, since the data to be written is
not known and is unpredictable. Second, the CPU stalls whenever
a read operation is begun until the operation is completed. For
write operations the CPU does not stall, so no time would be
saved by making a prediction.
B. BASIC DYNAMIC RAM
Dynamic RAMs (DRAMs) store data in one-bit cells. Because
these cells are smaller, DRAMs offer higher density, smaller
package size, and lower cost than the faster static RAM
(SRAM). These advantages make DRAMs the choice for large main
memories.
However, there is a drawback. DRAM cells are capacitors
(unlike SRAM cells, which store bits in flip-flops), so the
stored charge dissipates over time, and the stored data could
be lost. Therefore, DRAMs require a periodic refresh of their
charge to maintain the stored data. These refreshes are done
at fixed intervals, usually about 4 to 8 ms. [Ref. 1: pp.
2-4 - 2-7]
1. Dynamic Memory Operation
Memory operations include the read access, the write
access, and the refresh operation. A read access occurs when
the processor requests data from a given address and the
memory responds with that data. A write access occurs when the
processor sends data to the memory to be written at a specific
address. The refresh operation can be done using several
methods, which are determined by the system parameters, but
basically refreshes are periodically interspersed between
memory accesses to ensure that a refresh is done at least
every 8 ms. The refresh operation causes a pause while the
refreshing is accomplished, during which each cell is first
read and then written back with the same data. A counter is
used to sequence the refresh through all the rows and columns
of the memory. The refresh timer is then reset, and normal
operation resumes.
The job of arranging just when these operations should
be done is accomplished by the Dynamic Memory Controller.
It uses control signals, ready request signals, and timers
to determine when a specific operation needs to be done, and
in what order. [Ref. 1: pp. 2-7 - 2-13]
C. RESEARCH GOALS
The motivation for this thesis is to decrease the apparent
read access time to main memory. To accomplish this, a read
prediction buffer IC has been designed, developed, and
implemented using CMOS VLSI. The IC has an address length of
22 bits for use with 4MEG memory chips. Data length is 9 bits
for use with 1-byte memory modules with parity. Simulations of
the buffer have been performed to ensure that it makes
reasonable predictions for read accesses, and to confirm the
layout for fabrication in silicon.
D. REQUIRED CAD TOOLS
The design and layout of the Read Prediction Buffer chip
were done on the Naval Postgraduate School's Sun SPARCstations
using the University of California at Berkeley's VLSI CAD
tool package.
1. Sun SPARCstation
The Sun Microsystems SPARCstation IPX is a desktop
computer that offers high-speed color graphics. Operating at
40 MHz, the system is equipped with 32 MB of RAM and a 207 MB
internal hard drive, and mounts several large file systems
from a remote server.
2. Magic
Magic is an interactive editor for creating Very Large
Scale Integration (VLSI) layouts. It runs under various
Unix-based systems, one of these being the Sun SPARCstation
with an integrated color display. Using Magic, the designer
can create basic cell layouts and combine them into larger
structures, or even complete chip layouts. Magic has a
built-in design rule checker that continuously checks layouts
as they are being created to ensure that the layout rules are
obeyed for the particular technology being used. It knows
about connectivity and transistors, and allows the user to
extract the created circuits for simulation. Magic only
permits Manhattan designs, which are designs whose edges are
horizontal or vertical; no diagonal or curved structures are
allowed. Magic has numerous further features, and the
reference manual is recommended for additional descriptions
of its abilities. [Ref. 3]
a. Peg
The PLA Equation Generator (Peg) compiles finite state
machines to generate a working PLA. It takes a high level
description of a finite state machine and translates it into
the logic equations required to implement that design. Peg
uses the Moore model for finite state machines, in which the
outputs are functions of the current state only. Inputs can be
provided in order to cause transitions to particular states.
The equations generated by Peg are in an "eqn" format, which
is compatible with the Eqntott program. For this specific
finite state machine, the Peg command used was:
peg -s -t infile > outfile
This allowed the infile, which in this case was newfsm, to
be processed into a truth table for the fsm, and summary
information to be created in a peg.summary file. Appendix A
includes copies of the finite state machine program newfsm
and its Peg output, newfsm.peg. [Ref. 3]
b. Eqntott
Eqntott is a generator that takes the Boolean equations
generated by Peg and creates a truth table from which a PLA
layout may be created. Through various options, the output
variables can be produced in varied ways. The option chosen
for this work was the -1 option, in which the input and output
variables are mutually exclusive. The truth table generated is
compatible with the MPLA program, which generates a layout.
The command used was:
eqntott -1 infile > outfile
The infile was newfsm.peg and the outfile was newfsm.eq. These
outputs may be seen in Appendix A. [Ref. 3]
c. Mpla
Mpla is a PLA generator that can generate Magic-compatible
PLA layouts in various styles and technologies. Of the several
PLA styles supported by the program, this work used the
scalable CMOS cis version (SCS3cis). It supports the MOSIS
1.5/2.0/3.0 micron SCMOS process and produces a pseudo-nmos
static PLA with p-channel pullups, with clocked inputs and
clocked outputs if required. The cis designation indicates
buried contacts, with inputs and outputs on the same side of
the PLA. Also available is SCS3trans, which is the same except
that the inputs and outputs are on opposite sides of the PLA.
A dynamic PLA style is also provided, called SCD3cis or
SCD3trans.
The read prediction buffer finite state machine
(PLA) utilized clocked current-state inputs and next-state
outputs only.
Further options allowed extra ground and power lines to
be added to ensure that power requirements and
electromigration were accounted for. The command used to
generate the PLA was:
mpla -I -0 -G4 -S4 -s SCS3cis -o outfile infile
The infile was newfsm.eq and the outfile was fsm4pla.mag. The
completed PLA exhibited a few minor design rule violations
that required correcting, and the current-state to next-state
connections had to be added. A plot can be seen in Appendix B,
part F. [Ref. 3]
d. Esim
Esim is the event driven switch level simulator for
NMOS or CMOS circuits. It can be entered after the circuit has
been extracted in Magic. Esim is used to watch various nodes,
to set or reset nodes, and to simulate the operation of the
circuit. The watched nodes can then be inspected, and the
circuit evaluated. There are numerous commands and options
that may be employed in the simulation, and the reader is
directed to the reference material for further clarification.
[Ref. 3]
e. SPICE3C1
SPICE3C1 is an updated version of SPICE, which is a
general purpose circuit simulation program. It can perform dc
analysis, ac small-signal analysis, and transient analysis.
SPICE can also provide extensive plotting of circuit
parameters and circuit behavior using its plot command. SPICE
is a powerful tool for observing how circuits behave, but it
has the limitation that large circuits take an extremely long
time to process. For this reason, it was used mainly for power
estimates, and Esim was used for logic simulations. [Ref. 4]
E. THESIS STRUCTURE
The algorithm description and fundamental block designs are
discussed in Chapter II. The implementation and basic cell
descriptions, along with electronic parameters, are given in
Chapter III. Simulations of the finite state machine and the
final complete circuit are presented in Chapter IV. Chapter V
outlines the thesis conclusions and further recommendations.
II. READ PREDICTION ALGORITHM AND BUFFER DESIGN
A. THE READ PREDICTION ALGORITHM
The RPB algorithm is an original idea of Professor Douglas
J. Fouts and has not yet been fully characterized and
evaluated. However, it is under study in a concurrent thesis
to determine its potential for improving computer performance.
[Ref. 5]
The Read Prediction Buffer (RPB) uses two addresses, a
previous address and a current address, to determine an
offset, which is the difference between the two addresses. Two
read requests are required to obtain these addresses. A basic
flow chart is shown in Figure 1.
The first read request gives the current address. The
second read request causes a transfer of the current address
into the previous address, and the second read request address
becomes the new current address. The previous address is now
subtracted from the current address to determine an offset.
The offset will be either a positive or a negative value,
depending on whether the current address is greater than or
less than the previous address. This offset is then added to
the current address to determine a new predicted address,
which is used to pre-fetch the predicted data from main memory.




When the next read request occurs (the third read) the
requested address is first compared to the predicted address.
If there is a match, the predicted data will then be
retrieved. If they do not match, a normal read access of main
memory will be conducted. Either way, the new read address
will become the current address, the old current address will
be transferred to the previous address, and a new predicted
address will be calculated.
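The algorithm described above can be sketched in software as follows. This is a minimal behavioral sketch in Python, not the hardware design: the class name ReadPredictionBuffer, the dictionary standing in for main memory, and the choice to discard all address history when a negative predicted address occurs are assumptions made for illustration.

```python
class ReadPredictionBuffer:
    """Behavioral sketch of the read prediction algorithm (not the hardware)."""

    def __init__(self, memory):
        self.memory = memory        # hypothetical stand-in for main memory (dict)
        self.previous = None        # Previous Address
        self.current = None         # Current Address
        self.predicted = None       # Predicted Address
        self.predicted_data = None  # pre-fetched Predicted Data

    def _predict(self):
        # Offset = current - previous (may be negative); prediction = current + offset.
        offset = self.current - self.previous
        predicted = self.current + offset
        if predicted < 0:
            # Assumption: a negative prediction discards all history,
            # so the next read is treated as a first read.
            self.previous = self.current = None
            self.predicted = self.predicted_data = None
            return
        self.predicted = predicted
        # Pre-fetch; an address absent from the dict yields None in this sketch.
        self.predicted_data = self.memory.get(predicted)

    def read(self, address):
        """Serve one read request and return (data, hit)."""
        hit = self.predicted is not None and address == self.predicted
        data = self.predicted_data if hit else self.memory.get(address)
        # Either way, shift the addresses and compute a new prediction.
        self.previous, self.current = self.current, address
        if self.previous is not None:
            self._predict()
        return data, hit
```

With reads at addresses 4 and 8, the sketch predicts address 12, so a third read at address 12 is served from the pre-fetched data without a new main memory access.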
Flow chart steps, in order:
1st Read Access
Load Read Address to Current Address
2nd Read Access
Transfer Current Address to Previous Address
Load Read Address to Current Address
Subtract Previous Address from Current Address, Obtain Offset
Add Offset to Current Address, Get Predicted Address
Use Predicted Address to Fetch Predicted Data
3rd Read Access
Compare 3rd Read Address to the Predicted Address (branches: Match, No Match)
Match: Send Predicted Data to CPU
Figure 1. Basic RPB Algorithm Flow Chart
B. RPB DESIGN
In order to implement the RPB algorithm, several functional
blocks are used, as shown in Figure 2. The Current Address
Register is a 22-bit register which receives the Read Address
on the Address Bus from the CPU. Its output goes to the
Previous Address Register, the Subtractor, and the Adder.
The Previous Address Register is a 22-bit register with a
two's complement converter, which provides the subtraction
operation of the Subtractor that receives its output.
The Subtractor is really a 23-bit (22 bits plus a sign bit)
adder that takes the two's complement output of the
Previous Address Register and adds it to the output of the
Current Address Register. Functionally, it appears as if the
Previous Address Register contents are subtracted from the
Current Address Register contents. This creates an offset,
which is then sent to another 23-bit Adder to be combined with
the output of the Current Address Register to form a Predicted
Address.
The Adder sends the Predicted Address output to the
Comparator, the Address Multiplexer, and a most significant
bit (sign bit NEG) to the Finite State Machine to indicate a
negative Predicted Address.
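The Subtractor and Adder arithmetic can be illustrated with a short Python sketch. It models only the 23-bit two's complement arithmetic described above; the function names and the way the sign bit is returned alongside the address are assumptions for illustration, not details taken from the layout.

```python
ADDR_BITS = 22
WORD_BITS = ADDR_BITS + 1            # 22 address bits plus a sign bit
WORD_MASK = (1 << WORD_BITS) - 1
ADDR_MASK = (1 << ADDR_BITS) - 1


def twos_complement(value):
    """Invert every bit and add one, keeping the result to 23 bits."""
    return ((~value) + 1) & WORD_MASK


def predict(current, previous):
    """Return (predicted_address, neg) using only 23-bit additions."""
    # "Subtractor": current + two's complement of previous gives the offset.
    offset = (current + twos_complement(previous)) & WORD_MASK
    # "Adder": current + offset gives the predicted address.
    total = (current + offset) & WORD_MASK
    # The most significant (sign) bit is reported as NEG. In this sketch a
    # sum that exceeds the 22-bit address range also sets this bit.
    neg = (total >> ADDR_BITS) & 1
    return total & ADDR_MASK, neg
```

For example, predict(0x100, 0xF0) yields the address 0x110 with NEG clear, while predict(2, 10) sets NEG because the offset of -8 drives the sum below zero.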
The Comparator XOR compares the Predicted Address output of
the Adder with the address on the Address Bus from the CPU,
which will be either a read or a write address. The output of
the Comparator XOR will be either MT for an address match or
NMT for an address no-match condition, and it goes to the
Finite State Machine.
The Address Multiplexer is a 22-bit 2-to-1 multiplexer that
receives the Predicted Address from the Comparator XOR and the
Read or Write Address from the CPU. One of these addresses is
selected and sent to the Main Memory (off-chip) under control
of the Finite State Machine.
The Data Multiplexer is a 9-bit 2-to-1 multiplexer that
receives the Read Data output from Main Memory and the Data
Line from the CPU and, under control of the Finite State
Machine, outputs to the Predicted Data Register.
The Predicted Data Register is a 9-bit register that
receives the selected predicted data and outputs it to the
Output Multiplexer.
The Output Multiplexer is another 9-bit 2-to-1 multiplexer,
which has as inputs the Read Data from Main Memory and the
Predicted Data from the Predicted Data Register. Its output
is controlled by the Finite State Machine. The selected data
is sent to the computer system's cache memory for use by the
CPU.
Figure 2. Read Prediction Buffer Block Diagram
1. Finite State Machine
The Finite State Machine is the brains of the Read
Prediction Buffer. It coordinates the control signals
necessary for correct sequential operation of the RPB. It is
basically a Programmable Logic Array, and its outputs and
required inputs are best described using the flow chart in
Figure 3.
Starting at State 0 (state numbers are written in
binary format on the flow chart), signals Not Data Valid (NDV)
and Refresh OK (ROK) are issued. NDV and ROK are sent to the
Dynamic Random Access Memory (DRAM) controller to indicate
that data is not valid and that REFRESH can be allowed to
occur. If a Write operation occurs, signals Not Read/Write
(!RDWR), Address Valid (AV), and No REFRESH (NRF) are received
and the transition to state 1 is taken. Here, output control
signals Enable Memory Access (EMA), Select Address MPXR for
Write Address (AMWR), and Enable Write pulse (EWR) are sent
out. EMA starts the memory access timer, AMWR allows the
write address from the CPU address bus to be gated to main
memory, and EWR sends a write pulse to main memory to
initiate a write operation.
When the Memory Access Complete input signal (MAC) is
received (a function of the memory access timer timing out),
state 2 is entered. At state 2, Data Valid (DV) is issued to
the DRAM controller.
Figure 3. Finite State Machine Flow Chart
When the controller sees DV, it drops its Address Valid
(NAV) and the state machine returns to state 0 to look for a
read or write operation. This sequence is referred to as the
write sequence; it is repeated each time a write operation is
encountered.
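The write sequence can be sketched as a small Moore machine in Python. Only states 0 through 2 are modeled, with outputs and transitions taken from the description above; holding the current state until the expected input arrives, and representing signals as strings in sets, are assumptions of this sketch rather than details of the PLA.

```python
# Outputs depend only on the current state (Moore model).
OUTPUTS = {
    0: {"NDV", "ROK"},          # idle: data not valid, refresh allowed
    1: {"EMA", "AMWR", "EWR"},  # start timer, gate write address, write pulse
    2: {"DV"},                  # data valid to the DRAM controller
}


def next_state(state, inputs):
    """Return the next state of the write sequence for a set of input signals."""
    if state == 0 and {"!RDWR", "AV", "NRF"}.issubset(inputs):
        return 1
    if state == 1 and "MAC" in inputs:
        return 2
    if state == 2 and "NAV" in inputs:
        return 0
    return state  # assumption: hold until the expected input arrives


def run(input_sequence, state=0):
    """Step through a sequence of input sets; return (state, outputs) per step."""
    trace = []
    for inputs in input_sequence:
        state = next_state(state, set(inputs))
        trace.append((state, sorted(OUTPUTS[state])))
    return trace
```

Feeding the sketch a write request, an idle cycle, MAC, and then NAV steps it through states 1, 1, 2, and back to 0.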
When a read operation is encountered, the signals
Read/Write (RDWR), AV, and NRF are received, and state 3 is
entered. In state 3, the memory access timer is started (EMA),
the Address MPXR allows the read address from the CPU to be
gated to main memory (AMRD), and the Current Address Register
is loaded (CAR) for the first read. Again, when the memory
access timer times out, MAC is received and the transition to
state 4 is taken.
State 4 sends out a signal to Enable the Output MPXR
for a Read Address (OMRD) to be gated to the system's RAM, and
DV is issued to the DRAM controller. The controller responds
with NAV, ending the read operation, and the state machine
goes to state 5. Here, operation is virtually the same as
described earlier: a write operation will cause a transition
to states 6 and 7, with the same signals produced as in the
write operation previously discussed. The return from this
write will be to state 5 rather than to state 0 as before.
A read operation (the second read) will proceed to
state 8, where in addition to the same signals generated in
state 3, a signal to Load the Previous Address Register (PAR)
will be issued before the CAR in order to transfer the
contents of the current address register to the previous
address register. This is done before the current address
register is loaded with the address of the second read. When
the MAC signal is received, state 9 is entered so that the
read operation can be concluded with the signals OMRD and DV.
Also, Subtract (SUB) and ADD are issued, which cause the
previous address register to be subtracted from the current
address register to obtain the offset, which is then added to
the current address register to get a predicted address.
State 10 is activated when the controller issues the
NAV signal, and it is here that a memory access operation to
pre-fetch the predicted data is done. In state 10, signals to
start the memory access (EMA), Enable Address MPXR for the
Predicted Address (AMPA) (to select the predicted address just
calculated to be sent to the main memory), and Not Data Valid
(NDV) (which is sent to the controller to let it know that the
state machine is ready for another read or write request) are
sent out. When MAC is received from the memory access timer,
state 11 enables the Data MPXR for Read Data (PDMR) and the
signal Load Predicted Data Register (PDR), which selects the
new predicted data from the main memory to be gated to the
Predicted Data Register.
The next state is state 12, which again monitors for
write or read operations. A write operation will send the
flow to state 13, but now with a slightly different sequence
than the previous two times. If a write operation should come
along after the second and subsequent reads, a write coherency
problem could be created.
A write coherency problem refers to the condition in which
a write operation occurs at an address that happens to also
be the predicted address. The data present in the predicted
data register would then be invalid, because the new write
operation would be changing it. Thus, in order to protect
against this condition, the write address is compared with the
predicted address to determine if they are the same. If they
are equal, then the new write data replaces the predicted
data.
In state 13, a compare (COM) is done between the write
address and the predicted address, the Address MPXR is
selected for a Write Address (AMWR), and the EMA timer is
started. State 14 will be entered upon a No Match (NMT)
condition, and will issue EWR to the main memory.
State 15 is entered if a Match (MT) condition is
encountered. Besides issuing EWR, the signals for the Data
MPXR for CPU Data (DMCP) and Loading the Predicted Data
Register are sent, which allows the new write data to replace
the predicted data in the Predicted Data Register.
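The write coherency check of states 13 through 15 can be summarized in a few lines of Python. This is a behavioral sketch only; the function name handle_write and the dictionary used for main memory are hypothetical and are not part of the design.

```python
def handle_write(address, data, memory, predicted_address, predicted_data):
    """Perform a write and return the (possibly updated) predicted data."""
    # EWR: the write always goes to main memory (states 14 and 15).
    memory[address] = data
    # COM: compare the write address with the predicted address.
    if address == predicted_address:
        # Match (MT), state 15: the new write data replaces the predicted data.
        return data
    # No Match (NMT), state 14: the predicted data is still valid.
    return predicted_data
```

A write to the predicted address therefore leaves the Predicted Data Register holding the newly written value, so a later matching read does not return stale data.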
MAC will transfer either state 14 or 15 to state 16 so
that Data Valid (DV) can be sent to the controller. The
controller then responds with NAV, which allows state 17 to
send out NDV and return the state machine to state 12.
If a read operation should occur, state 18 will be
entered where AMRD and COM are sent out, so that the read
address (the third and subsequent reads) can be compared to
the predicted address. Signals to load the Current and
Previous Address Registers CAR and PAR are also sent out at
this time to ensure the new and the old read addresses are
placed in their respective registers.
If there is no match (NMT), state 19 sends out EMA (for
a new read memory access) and enables the Output MPXR for Read
Data (OMRD). This causes the main memory to do a normal read
access, and the read data will be transferred to the system's
cache.
However, if a match occurs, then state 20 sends out DV,
enables the Output MPXR for Predicted Data (OMPR), and sends a
signal to the controller to cause the Memory Access Timer to
Complete early (MACE).
Both states 19 and 20 pass control to state 21 when MAC
is sent out by the DRAM controller. State 21 sets up the
entire prediction calculation sequence again by sending out
SUB, ADD, and DV, causing a new predicted address to be
calculated, and with NAV in, it sends the flow back to state
10. The state machine cycles again to fetch the predicted data
and monitor for the next read or write operations.
One other signal (NEG) is used for detecting the
possibility that a negative value for the predicted address
could occur from the addition of the offset and the current
address. Should this NEG signal be generated, the state
machine resets back to state 0.



III. IMPLEMENTATION
The basic logic design of each of the cells used in the
make-up of the Read Prediction Buffer is shown here to give
an understanding of the design characteristics of the cells.
The basic cell layouts are included in Appendix B so that the
reader can see how the basic cells described here look in
Magic layout format. The cell reproductions were created
using the cif2ps function. A legend for the cif2ps plot
representations is included to aid readability. The reader is
directed to the reference material for further information on
the cif2ps plotting function. [Ref. 3]
A. REGISTER CELL
The register cell is a Master Slave Flip Flop (MSFF), which
is composed of two latches. The latches are built from three
inverters, two of which are clocked inverters, as seen in
Figures 4 and 5.














Figure 4. Clocked Inverter
The clocked inverter (Figure 4), which is also called a
tri-state inverter, uses the clock signals CL and !CL along
with the input to drive the output. When CL = "0" (!CL = "1"),
the output is not driven by the input. When CL = "1" (!CL =
"0"), the output is driven to the inverse of the input.














Figure 5. Latch Make-up (a), Latch Symbol (b)
The latch internals are shown in Figure 5(a). The circuit
works functionally like a D latch: with a logic "1" input and
the clock at logic "1" (CL=1, !CL=0), the output will also be
a logic "1". When the clock is a logic "0" (CL=0, !CL=1), the
output remains at whatever level it was at, a holding
condition. Figure 5(b) shows the logic symbol for the latch.
[Ref. 8: pp. 358-360]
The MSFF is made up of two of these latches and an inverter,
as shown in Figure 6. Internally, the first latch acts as the
Master Latch, and the second latch acts as the Slave Latch.
The register cell is loaded whenever the clock is a logic "1"
(CL=1, !CL=0), and the input value is held in the master
latch.
Figure 6. Basic Register Cell
When the clock makes the transition from a logic "1" to a
logic "0" (CL=0, !CL=1), the master latch outputs to the slave
latch, and the register provides an output equal to the value
that was held while the clock was a logic "1". [Ref. 5: p. 11]
[Ref. 8: pp. 363-364]
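The latch and master-slave behavior described above can be modeled at the logic level with a short Python sketch. The class names are hypothetical, the data inversions inside the latches and the extra inverter of Figure 6 are ignored, and the slave latch is assumed to be enabled by the complementary clock.

```python
class Latch:
    """Level-sensitive D latch: follows D while CL=1, holds while CL=0."""

    def __init__(self):
        self.q = 0

    def step(self, d, cl):
        if cl:
            self.q = d  # transparent: output follows the input
        return self.q   # otherwise the previous level is held


class MasterSlaveFlipFlop:
    """Register cell sketch: two latches driven by opposite clock phases."""

    def __init__(self):
        self.master = Latch()
        self.slave = Latch()

    def step(self, d, cl):
        m = self.master.step(d, cl)        # master loads while CL=1
        return self.slave.step(m, 1 - cl)  # slave takes the value when CL=0
```

Loading a "1" while the clock is high does not change the register output; the new value appears only after the clock falls to "0", as described for the register cell.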
1. Two's Complement Register
This cell combines three components, the basic register
cell, an inverter, and the basic adder cell, to produce a
two's complement representation of the address input. The
two's complement of a binary number is created by inverting
each of the bits and adding one. Hence, the inverter performs
the inversion, and the adder allows one to be added to the
least significant bit of the inverted number. See Figures 6
and 7.




B. ADDER CELL
The basic adder cell is a combinational adder that follows
the truth table shown in Table I. Here, A and B are inputs,
C is the carry-in from the previous stage, SUM is the sum
output, and CARRY is the carry-out to the next stage.
Table I. ADDER TRUTH TABLE
C  A  B  A*B  A+B  A⊕B  SUM  CARRY
0  0  0   0    0    0    0     0
0  0  1   0    1    1    1     0
0  1  0   0    1    1    1     0
0  1  1   1    1    0    0     1
1  0  0   0    0    0    1     0
1  0  1   0    1    1    0     1
1  1  0   0    1    1    0     1
1  1  1   1    1    0    1     1
A * B is the generate signal that occurs when a carry is
generated internally from the cell. A + B is the propagate
signal that will pass a one to the CARRY output when C is a
one. The boolean equations for this truth table yield:
SUM = ABC + A!B!C + !A!BC + !AB!C
CARRY = AB + AC + BC = AB + C(A + B)
Notice that the sum equation reduces further to A
Exclusive ORed with B Exclusive ORed with C, which can be
verified from the truth table. These equations translate into
the gate cell shown in Figure 7. [Ref. 7: pp. 310-312]
Figure 7. Basic Adder Cell
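The adder cell equations can be checked against ordinary binary addition with a short Python sketch. The function names are hypothetical, and the ripple-carry chaining of cells shown in ripple_add is an assumption about how the cells are connected into a multi-bit adder.

```python
from itertools import product


def adder_cell(a, b, c):
    """One adder cell: returns (SUM, CARRY) for input bits A, B and carry-in C."""
    generate = a & b                    # A * B: carry generated inside the cell
    propagate = a | b                   # A + B: passes an incoming carry along
    carry = generate | (c & propagate)  # CARRY = AB + C(A + B)
    total = a ^ b ^ c                   # SUM = A xor B xor C
    return total, carry


def sum_of_products(a, b, c):
    """SUM = ABC + A!B!C + !A!BC + !AB!C, written out term by term."""
    na, nb, nc = 1 - a, 1 - b, 1 - c
    return (a & b & c) | (a & nb & nc) | (na & nb & c) | (na & b & nc)


def ripple_add(x, y, bits):
    """Chain adder cells from the least significant bit upward (assumed wiring)."""
    carry, result = 0, 0
    for i in range(bits):
        s, carry = adder_cell((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result, carry


# Exhaustive check of both forms of the SUM equation and of CARRY.
for a, b, c in product((0, 1), repeat=3):
    assert adder_cell(a, b, c) == ((a + b + c) % 2, (a + b + c) // 2)
    assert sum_of_products(a, b, c) == a ^ b ^ c
```

The exhaustive loop confirms that the sum-of-products form and the exclusive-OR form of SUM agree for all eight input combinations.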
C. COMPARATOR CELL
To compare two addresses, a bit-by-bit comparison is
necessary. Only a Match or No-Match condition is required;
therefore, XOR gates can be used to determine if any one of
the bits is different. Hence, the basic comparator is made up
of XOR gates. The simple make-up of the XOR is shown in
Figure 8.












Figure 8. Basic Comparator Cell
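The bit-by-bit comparison can be sketched in Python as follows. The function name compare is hypothetical, and combining the per-bit XOR outputs with an OR to form the match signal is an assumption; the text above states only that XOR gates detect differing bits.

```python
def compare(predicted, address, bits=22):
    """Return "MT" if all address bits match, otherwise "NMT"."""
    differs = 0
    for i in range(bits):
        # One XOR gate per bit position: output is 1 when the bits differ.
        differs |= ((predicted >> i) & 1) ^ ((address >> i) & 1)
    return "NMT" if differs else "MT"
```

A single differing bit anywhere in the 22-bit address is enough to produce the NMT result.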
D. MULTIPLEXER CELL
The multiplexers (MPXR) used in the Read Prediction Buffer
are all 2-to-1 MPXRs that act like a switch: they select one
line or the other depending on whether the select line is a 1
or a 0. However, these particular switches are unidirectional
devices that only allow data or information to flow from the
input to the output. The basic MPXR cell is shown in Figure 9.
The two input lines are X and Y. X is selected when SEL is a
"0", and Y is selected when SEL is a "1". The enable line ENBL
must be a logic "1" for either of the lines to be selected.
This gives the controller, or in our case the finite state
machine, the ability to turn the MPXR on or off as desired.
[Ref. 8: pp. 278-280]
Figure 9. Basic Multiplexer Cell
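The selection rule of the MPXR cell can be written as a small Python sketch. The function names are hypothetical, and returning None when ENBL is "0" is an assumption used here to represent an output that is not driven; the text does not state the disabled output level.

```python
def mpxr_cell(x, y, sel, enbl):
    """One-bit 2-to-1 multiplexer cell with enable."""
    if not enbl:
        return None          # assumption: output not driven when disabled
    return y if sel else x   # SEL = 0 selects X, SEL = 1 selects Y


def mpxr_word(x_word, y_word, sel, enbl, bits=9):
    """A multi-bit MPXR built from identical cells sharing SEL and ENBL."""
    if not enbl:
        return None
    out = 0
    for i in range(bits):
        out |= mpxr_cell((x_word >> i) & 1, (y_word >> i) & 1, sel, enbl) << i
    return out
```

With a 9-bit width, mpxr_word models the data-path multiplexers, where the finite state machine supplies both SEL and ENBL.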
E. FINITE STATE MACHINE
Creation of the finite state machine was not as simple and
straightforward as for the other cells. First, the flow chart of
Figure 3 was developed. Then, a PEG program was written for
the PLA generator CAD tool PEG. (See Appendix A.) The PEG
program defines the inputs, outputs, and state definitions.
The program is run through the PEG generator to create the
logic equations in a format that can be accepted by another
CAD program called EQNTOTT.
Eqntott generates a truth table representation of the PEG
program from a set of boolean equations which define the PLA
outputs in terms of the inputs. (See Appendix A.)
This output is then sent to MPLA, a PLA
generator that can produce the PLA in several different styles
and technologies. The technology selected for this work was
scalable CMOS cis version. (See Mpla section I.D.2.C.) From
this, a MAGIC layout was produced and then slightly
modified to obtain the final working layout that can be viewed
in Appendix B. Some work was required on the output of MPLA,
such as cleaning up various Design Rule Checker (DRC) faults
that were created by the MPLA program, and connecting the
next-state outputs to the current-state inputs. Also, because the
next-state outputs and current-state inputs are clocked
in the PLA design by choice, the current-state inputs are
clocked by signals clka and clkabar, while the next-state
outputs are clocked by signals clkb and clkbbar. To generate the
complemented clocks, two inverters were subsequently added, so
that only clka and clkb need be provided. [Ref. 9: pp. 1-10]
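The effect of clocking the current-state inputs on clka and the next-state outputs on clkb can be illustrated with a behavioral sketch. The Python below is not the PLA itself: it models only the write path (states s0, s1, s2) as read from the PEG listing in Appendix A, ignores RESET, and uses the hypothetical helper names next_state and run.

```python
def next_state(state, rdwr, av, nrf, mac, nav):
    """Write-path transitions s0 -> s1 -> s2 -> s0; anything else loops in place."""
    if state == 0:
        return 1 if (rdwr, av, nrf) == (0, 1, 1) else 0
    if state == 1:
        return 2 if mac else 1
    if state == 2:
        return 0 if nav else 2
    raise ValueError("state outside this sketch")


def run(clka, clkb, inputs):
    """Apply one full clock pattern per entry of inputs and record the latched state."""
    in_latch = out_latch = 0
    trace = []
    for signals in inputs:
        for a, b in zip(clka, clkb):
            if a == "1":   # clka: the fed-back state enters the PLA inputs
                in_latch = out_latch
            if b == "1":   # clkb: the computed next state is latched at the outputs
                out_latch = next_state(in_latch, **signals)
        trace.append(out_latch)
    return trace


idle = dict(rdwr=0, av=0, nrf=0, mac=0, nav=0)
trace = run("110000", "000110",
            [dict(idle, av=1, nrf=1), dict(idle, mac=1), dict(idle, nav=1)])
print(trace)
```

Because the two phases never overlap, the state seen at the PLA inputs cannot change while the outputs are being latched, so each pass through the pattern advances the machine by exactly one state.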
F. POWER ESTIMATES
Electromigration refers to the transport of metal ions
through a conductor that results from excessive current flow
through the conductor. To prevent electromigration, it is
required that the conductors be of sufficient size to handle
the power and ground currents of the circuit. If the conductor
is not wide enough to handle the load on it, it may ultimately
become an open circuit, causing the failure of the circuitry.
This is due to the fact that in a conductor of insufficient
size, a high current density causes a buildup of heat which, if
too great, can cause the conductor to blow like a fuse. In
order to determine the proper size of conductors, particularly
those for Vdd and GND (Vss), it was necessary to estimate the
current density in the Vdd and GND supply rails. [Ref. 7:
p. 144]
A rule-of-thumb current density of 1.0 mA per
sq. um of metal was used for both Vdd and GND. The design of the
Read Prediction Buffer uses both Metal1 and Metal2 for Vdd and
GND routing. The minimal cross-sectional area for metal is 3
microns wide by 0.5 micron thick. Therefore, a 3 micron strip
of metal would have a maximum current rating of 1.5 mA.
[Ref. 10: pp. 206-209]
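The current rating quoted above is simply the cross-sectional area multiplied by the rule-of-thumb current density. A minimal Python sketch of that arithmetic follows; the function names are assumptions, and the 10 mA load in the last line is an arbitrary illustrative value, not a figure from the design.

```python
DENSITY_MA_PER_UM2 = 1.0   # rule-of-thumb current density from the text
THICKNESS_UM = 0.5         # metal thickness from the text


def max_current_ma(width_um: float, thickness_um: float = THICKNESS_UM) -> float:
    """Maximum current for a strip: cross-sectional area times current density."""
    return width_um * thickness_um * DENSITY_MA_PER_UM2


def min_width_um(current_ma: float, thickness_um: float = THICKNESS_UM) -> float:
    """Narrowest strip that keeps a given current within the rule of thumb."""
    return current_ma / (thickness_um * DENSITY_MA_PER_UM2)


print(max_current_ma(3.0))   # rating of the minimum 3 micron strip, in mA
print(min_width_um(10.0))    # width needed for an illustrative 10 mA load, in microns
```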
Another consideration is the voltage drop of the
conductors. This is due to the resistance of a narrow section
of conductor which causes an IR drop on the line. Hence,
conductors must be of sufficient size in order not to
accumulate a large voltage drop and limit the voltage being
delivered to the circuit for proper operation. To determine
proper conductor widths, the formula used to calculate the
conductor resistance is
$$R = R_s \frac{L}{W}$$
where
$R_s$ = sheet resistance
$L$ = conductor length
$W$ = conductor width
Then, using the conductor resistance, the voltage drop on each
of the conductors can be calculated using Ohm's Law
(V = IR).
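The resistance and voltage drop calculation can be written out as a short worked example. This is a sketch under stated assumptions: the resistance is taken as sheet resistance times the number of squares (length divided by width), the 0.05 ohm sheet resistance is the nominal value used in this chapter, and the 300 micron by 6 micron rail carrying 10 mA is a hypothetical geometry chosen only to show the arithmetic.

```python
def conductor_resistance(sheet_ohms: float, length_um: float, width_um: float) -> float:
    """R = Rs * L / W: sheet resistance times the number of squares along the rail."""
    return sheet_ohms * length_um / width_um


def voltage_drop(current_ma: float, resistance_ohms: float) -> float:
    """Ohm's Law, V = I * R, with the current given in mA and the result in volts."""
    return (current_ma / 1000.0) * resistance_ohms


# Hypothetical rail: 300 microns long, 6 microns wide, carrying 10 mA.
r = conductor_resistance(0.05, 300.0, 6.0)
v = voltage_drop(10.0, r)
print(round(r, 3), "ohms;", round(v, 4), "volts")

# Doubling the width halves both the resistance and the IR drop.
print(round(voltage_drop(10.0, conductor_resistance(0.05, 300.0, 12.0)), 4), "volts")
```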
SPICE3C1, a general-purpose circuit simulation program that
provides, among other things, dc and ac analyses, was used
on all the basic circuit cells to determine the peak current
consumption of each circuit. Voltage drops were then
calculated using peak current estimates and a nominal value of
0.05 ohms per square for the sheet resistance of the conductors. Armed
with these voltage drop estimates, the sizes of the conductors
were varied in order to ensure proper power and ground
conductor sizing (see Table II and Table III). [Ref. 4]
From the data presented in Table II, the adder cell
presented the most difficulty because the current
could be as high as 11.58 mA whenever three "ones" were being
added in one cell. It was decided that, across 22
bits, this would only occur approximately 20% of the time, and
the current was calculated on that basis. To err on the side of
safety, one other technique was applied: extra supply points (from
both left and right sides) were added to distribute power and
ground. [Ref. 7: pp. 144-149]
Table II. BASIC CELL PEAK CURRENTS
CELL     BASIC CELL             22/23-BIT CELL    9-BIT CELL
MPXR     0.32 mA                7.04 mA           2.88 mA
REG      0.464 mA               10.672 mA         4.178 mA
COMP     0.1875 mA              4.125 mA          N/A
ADDER    0.437 mA               68.051 mA         N/A
         11.58 mA (3 "1"s)
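Most multi-bit entries in Table II are the basic cell current multiplied by the number of bits, which the following Python sketch checks. The adder line is a reading of the numbers rather than a calculation stated in the text: one combination that reproduces the tabulated 68.051 mA is 23 cells at 0.437 mA plus 5 cells (roughly 20% of 23) at about 11.6 mA, and that combination should be treated as an assumption.

```python
BASIC_MA = {"MPXR": 0.32, "REG": 0.464, "COMP": 0.1875, "ADDER": 0.437}


def array_peak_ma(cell: str, bits: int) -> float:
    """Peak current of a multi-bit array, assuming every cell draws its basic peak."""
    return BASIC_MA[cell] * bits


print(round(array_peak_ma("MPXR", 22), 3))   # compare with 7.04 mA in Table II
print(round(array_peak_ma("REG", 23), 3))    # compare with 10.672 mA
print(round(array_peak_ma("COMP", 22), 3))   # compare with 4.125 mA
print(round(array_peak_ma("MPXR", 9), 3))    # compare with 2.88 mA

# Assumed reconstruction of the adder figure: 23 basic cells plus 5 cells
# adding three ones at about 11.6 mA each.
adder_estimate = array_peak_ma("ADDER", 23) + 5 * 11.6
print(round(adder_estimate, 3))              # compare with 68.051 mA
```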
Table III shows the basic cell resistances and their
voltage drops as calculated using the data in Table II.
Table III. CELL RESISTANCE AND VOLTAGE DROPS
Cell Name
  Conductor    Left Side                   Right Side
Cur. Adrs. Reg.
  Vdd          1.92 Ohms, 0.0205 Volts     2.45 Ohms, 0.0261 Volts
  GND          1.74 Ohms, 0.0185 Volts     2.34 Ohms, 0.0250 Volts
Prev. Adrs. Reg.
  Vdd1         0.620 Ohms, 0.0068 Volts    1.01 Ohms, 0.0107 Volts
  Vdd2         0.159 Ohms, 0.0017 Volts    0.272 Ohms, 0.0029 Volts
  GND1         0.128 Ohms, 0.0014 Volts    0.396 Ohms, 0.0042 Volts
  GND2         0.080 Ohms, 0.0009 Volts    0.202 Ohms, 0.0022 Volts
Adder/Subtracter
  Vdd          0.176 Ohms, 0.0120 Volts    0.328 Ohms, 0.0223 Volts
  GND          0.087 Ohms, 0.0059 Volts    0.187 Ohms, 0.0128 Volts
Comparator
  Vdd          5.68 Ohms, 0.0234 Volts     6.97 Ohms, 0.0287 Volts
  GND          5.86 Ohms, 0.0240 Volts     7.44 Ohms, 0.0307 Volts
Adrs. MPXR
  Vdd          4.27 Ohms, 0.0300 Volts     2.21 Ohms, 0.0158 Volts
  GND          4.03 Ohms, 0.0284 Volts     1.29 Ohms, 0.0091 Volts
Data MPXR
  Vdd          3.07 Ohms, 0.0088 Volts     2.09 Ohms, 0.0061 Volts
  GND          2.24 Ohms, 0.0065 Volts     1.53 Ohms, 0.0044 Volts
Pred. Data Reg.
  Vdd          0.701 Ohms, 0.0029 Volts    0.422 Ohms, 0.0018 Volts
  GND          0.494 Ohms, 0.0021 Volts    0.210 Ohms, 0.0009 Volts
Output MPXR
  Vdd          1.40 Ohms, 0.0040 Volts     2.11 Ohms, 0.0061 Volts
  GND          0.639 Ohms, 0.0018 Volts    1.54 Ohms, 0.0044 Volts
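Since each Table III entry pairs a resistance with a voltage drop, dividing one by the other recovers the current that the entry implies. The Python sketch below does this for two left-side Vdd entries; the helper name implied_current_ma is an assumption, and the comparison with Table II is an observation about the numbers rather than a statement from the text.

```python
def implied_current_ma(volts: float, ohms: float) -> float:
    """Rearranged Ohm's Law, I = V / R, returned in mA."""
    return 1000.0 * volts / ohms


# Cur. Adrs. Reg., Vdd, left side: 1.92 ohms and 0.0205 volts.
print(round(implied_current_ma(0.0205, 1.92), 2))
# Adder/Subtracter, Vdd, left side: 0.176 ohms and 0.0120 volts.
print(round(implied_current_ma(0.0120, 0.176), 2))
```

Both results land close to the 23-bit register and adder peak currents listed in Table II (10.672 mA and 68.051 mA), which is consistent with the peak currents having been used for the drop estimates.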
IV. SIMULATIONS
All logic, including the finite state machine, was
simulated using the event-driven logic-level simulator Esim.
The registers, adders, comparator, and multiplexers were all
simulated and shown to perform their respective functions. Rather
than present redundant performance data on each, only the finite
state machine and the entire chip simulation data are
presented.
A. FINITE STATE MACHINE TESTING
The clka signal for the current-state inputs was provided
by an input vector that is a logic "1" for two clock periods
and then a logic "0" for four clock periods. The
clkb signal was declared a logic "0" for three
clock periods, then a logic "1" for two clock periods, then a
logic "0" for one clock period. The vectors appear as follows:
clka 110000
clkb 000110
These six clock periods set up the timing for the PLA so that
each clock is "on" for two clock periods while the other
clock is "off", followed by one clock period in which neither
is active. The simulator could then be run through a group of
cycles that repeat the clock pattern so that the output
states could be monitored to ensure that watched nodes changed as
designed. Appendix C part A contains the data collected from the
simulation run on the finite state machine. It is recommended
that the data be read alongside the finite state machine flow chart
in Figure 3 so that the state flow can be followed.
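The two clock vectors can be checked for the non-overlapping property described above. The following Python sketch is illustrative only; the helper names are assumptions, and the 12-period length is taken from the simulation note in Appendix C.

```python
CLKA = "110000"
CLKB = "000110"


def repeat_pattern(pattern: str, periods: int) -> str:
    """Repeat a clock vector until it covers the requested number of periods."""
    reps = -(-periods // len(pattern))   # ceiling division
    return (pattern * reps)[:periods]


def overlapping(a: str, b: str) -> bool:
    """True if both clocks are ever a logic 1 in the same period."""
    return any(x == "1" and y == "1" for x, y in zip(a, b))


def idle_periods(a: str, b: str) -> list:
    """Indices (counting from 0) of the periods in which neither clock is active."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x == "0" and y == "0"]


print(overlapping(CLKA, CLKB))
print(idle_periods(CLKA, CLKB))
print(repeat_pattern(CLKA, 12))
print(repeat_pattern(CLKB, 12))
```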
B. RPB CIRCUIT TESTING
The testing of the entire read prediction buffer required
that the Computer System, CPU Address Bus, Main Memory, and
the Data Line from the CPU all be simulated as inputs in order
to ensure the internal performance of the RPB. To accomplish
this, all data lines have been labeled so that each line could
be set or observed to ensure proper operation of the circuit.
Shown in Figure 10 is the convention used for labeling. The
circuit simulation run data is presented in Appendix C part B
for inspection. The simulation run is very similar to that of
the finite state machine, with the exception that address and
data lines are monitored for the correct values.
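When reading the address values in Appendix C part B, it can help to have a model of what the predicted address should be. The sketch below assumes the prediction is the current read address plus the difference between the current and previous read addresses, which is what the annotated values in that appendix suggest (reads at 2 and 10 followed by a match at 18, then a read at 32 followed by a match at 54). The function name, the 22-bit width, and the wrap-around behavior are assumptions.

```python
ADDRESS_BITS = 22                       # assumed from the 4MEG memory module
MASK = (1 << ADDRESS_BITS) - 1


def predict_next(previous: int, current: int) -> int:
    """Assumed rule: next address = current + (current - previous), wrapped to 22 bits."""
    stride = current - previous         # subtract step
    return (current + stride) & MASK    # add step


print(predict_next(2, 10))    # compare with the write address annotated as a match
print(predict_next(10, 32))   # compare with the read address annotated as a match
```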
(Figure 10 block labels legible in the source: Address Bus From CPU, Current Address Register, Previous Address Register.)
























Figure 10. Block Diagram Simulation Labels
V. CONCLUSIONS AND RECOMMENDATIONS
A. CONCLUSIONS
1) The RPB has been designed to be implemented on a single
chip to work with a 4MEG by 9 bit memory module. Although the
RPB uses a standard CMOS process, the functionality and
algorithm implemented are a new contribution to memory
subsystem technology.
2) Further testing or simulation in a working
system is required in order to determine the actual performance
gains from this circuit. However, thesis research is currently being done
to estimate the effects of the RPB on system-level performance
by simulation. [Ref. 5]
B. RECOMMENDATIONS
1) Do an architectural study to more accurately determine
the performance improvements that the RPB can provide. [Ref. 5]
2) Put the RPB chip onto a memory module and test it.
3) Design a DRAM controller to implement the RPB and
refresh logic on a single chip.
4) Develop a 1-bit version of the RPB for new DRAM
chip designs, so that systems can take advantage of this
innovation without making further modifications.
APPENDIX A. FINITE STATE MACHINE DESIGN PRINTOUTS
PEG PROGRAM FOR FSM
-- PEG program for the Finite State Machine.
-- Read Prediction Buffer Thesis
-- Author : Gary J. Nowicki
-- Date Written : June 16 1992
-- Revision Date : Rev2 Sept. 4 1992
-- Program Filename : newfsm
INPUTS : RDWR AV NRF MAC NAV NMT MT RESET;
OUTPUTS: NDV ROK EMA AMWR EWR DV AMRD CAR OMRD
PAR SUB ADD AMPA PDMR PDR DMCP COM OMPR
MACE;
s0 : assert NDV ROK;
case (RDWR AV NRF)
0 1 1 => s1;
1 1 1 => s3;
endcase => LOOP;
s1 : assert EMA AMWR EWR;
if MAC then s2 else LOOP;
s2 : assert DV;
if NAV then s0 else LOOP;
s3 : assert EMA AMRD CAR;
if MAC then s4 else LOOP;
s4 : assert OMRD DV;
if NAV then s5 else LOOP;
s5 : assert NDV ROK;
case (RDWR AV NRF)
0 1 1 => s6;
1 1 1 => s8;
endcase => LOOP;
s6 : assert EMA AMWR EWR;
if MAC then s7 else LOOP;
s7 : assert DV;
if NAV then s5 else LOOP;
s8 : assert EMA AMRD PAR CAR;
if MAC then s9 else LOOP;
s9 : assert OMRD DV SUB ADD;
if NAV then s10 else LOOP;
s10 : assert EMA AMPA NDV;
if MAC then s11 else LOOP;
s11 : assert PDMR PDR;





case (RDWR AV NRF)
0 1 1 => s13;
1 1 1 => s18;
endcase => LOOP;





s14 : assert EWR;
if MAC then s16 else LOOP;
s15 : assert EWR DMCP PDR;
if MAC then s16 else LOOP;
s16 : assert DV;
if NAV then s17 else LOOP;
s17 : assert NDV;
goto s12;





s19 : assert EMA OMRD;
if MAC then s21 else LOOP;
s20 : assert DV OMPR MACE;
if MAC then s21 else LOOP;
s21 : assert SUB ADD DV;











































(!NAV&!RESET& InSt0*&!InSt1*& InSt2*&!InSt3*& InSt4*)|
( MAC&!RESET& InSt0*&!InSt1*& InSt2*&!InSt3*&!InSt4*)|
(!RESET& InSt0*&!InSt1*&!InSt2*& InSt3*& InSt4*)|
( NMT&!MT&!RESET& InSt0*&!InSt1*&!InSt2*& InSt3*&!InSt4*)|
( NAV&!RESET& InSt0*&!InSt1*&!InSt2*&!InSt3*&!InSt4*)|
(!MAC&!RESET&!InSt0*& InSt1*& InSt2*& InSt3*& InSt4*)|
( MT&!RESET&!InSt0*& InSt1*& InSt2*&!InSt3*& InSt4*)|
(!NMT&!MT&!RESET&!InSt0*& InSt1*& InSt2*&!InSt3*& InSt4*)|
(!RDWR& AV& NRF&!RESET&!InSt0*& InSt1*& InSt2*&!InSt3*&!InSt4*)|
( MAC&!RESET&!InSt0*& InSt1*&!InSt2*&!InSt4*)|
(!NAV&!RESET&!InSt0*& InSt1*&!InSt2*&!InSt3*& InSt4*)|
(!RESET&!InSt0*&!InSt1*& InSt2*& InSt3*& InSt4*)|
( MAC&!RESET&!InSt0*&!InSt1*& InSt2*& InSt3*&!InSt4*)|







( NAV&!RESET& InSt0*&!InSt1*& InSt2*&!InSt3*& InSt4*)|
(!MAC&!RESET&!InSt1*&!InSt2*& InSt3*& InSt4*)|
( NMT&!RESET& InSt0*&!InSt1*&!InSt2*& InSt3*&!InSt4*)|
(!NMT&!MT&!RESET& InSt0*&!InSt1*&!InSt2*& InSt3*&!InSt4*)|
(!MAC&!RESET&!InSt0*& InSt1*& InSt2*& InSt3*)|
(!NMT& MT&!RESET&!InSt0*& InSt1*& InSt2*&!InSt3*& InSt4*)|
( NMT&!MT&!RESET&!InSt0*& InSt1*& InSt2*&!InSt3*& InSt4*)|
( RDWR& AV& NRF&!RESET&!InSt0*& InSt1*& InSt2*&!InSt3*&!InSt4*)|
(!RESET&!InSt0*& InSt1*&!InSt2*& InSt3*&!InSt4*)|
( NAV&!RESET&!InSt0*& InSt1*&!InSt2*&!InSt3*& InSt4*)|
(!NAV&!RESET&!InSt0*&!InSt1*& InSt2*& InSt3*& InSt4*)|
(!RESET&!InSt0*&!InSt1*& InSt2*& InSt3*&!InSt4*)|
(!RDWR& AV& NRF&!RESET&!InSt0*&!InSt1*& InSt2*&!InSt3*& InSt4*)|
(!NAV&!RESET&!InSt0*&!InSt1*&!InSt2*& InSt3*&!InSt4*)|
( MAC&!RESET&!InSt0*&!InSt1*&!InSt2*&!InSt3*& InSt4*)|
( RDWR& AV& NRF&!RESET&!InSt0*&!InSt1*&!InSt2*&!InSt3*&!InSt4*);
OutSt2*=
(!NAV&!RESET& InSt0*&!InSt1*& InSt2*&!InSt3*& InSt4*)|
(!RESET&!InSt1*& InSt2*&!InSt3*&!InSt4*)|
( MAC&!RESET&!InSt1*&!InSt2*& InSt3*& InSt4*)|
(!NMT& MT&!RESET& InSt0*&!InSt1*&!InSt2*& InSt3*&!InSt4*)|
(!RESET& InSt0*&!InSt1*&!InSt2*&!InSt3*& InSt4*)|
(!MAC&!RESET&!InSt0*& InSt1*& InSt2*& InSt3*)|
(!RESET&!InSt0*& InSt1*& InSt2*&!InSt3*& InSt4*)|
(!RDWR& AV& NRF&!RESET&!InSt0*& InSt1*& InSt2*&!InSt3*&!InSt4*)|
(!AV& NRF&!RESET&!InSt0*& InSt1*& InSt2*&!InSt3*&!InSt4*)|
(!NRF&!RESET&!InSt0*& InSt1*& InSt2*&!InSt3*&!InSt4*)|
(!RESET&!InSt0*& InSt1*&!InSt2*& InSt3*& InSt4*)|
(!RESET&!InSt0*&!InSt1*& InSt2*& InSt3*)|
(!RDWR& AV& NRF&!RESET&!InSt0*&!InSt1*& InSt2*&!InSt3*& InSt4*)|
(!AV& NRF&!RESET&!InSt0*&!InSt1*& InSt2*&!InSt3*& InSt4*)|
(!NRF&!RESET&!InSt0*&!InSt1*& InSt2*&!InSt3*& InSt4*);
OutSt1*=
( NAV&!RESET& InSt0*&!InSt1*& InSt2*&!InSt3*& InSt4*)|
(!RESET& InSt0*&!InSt1*&!InSt2*&!InSt3*& InSt4*)|
(!MAC&!RESET&!InSt0*& InSt1*& InSt2*& InSt3*)|
(!RESET&!InSt0*& InSt1*&!InSt3*& InSt4*)|
(!RDWR& AV& NRF&!RESET&!InSt0*& InSt1*& InSt2*&!InSt3*&!InSt4*)|




( RDWR& AV& NRF&!RESET&!InSt0*&!InSt1*& InSt2*&!InSt3*& InSt4*);
OutSt0*=
(!NAV&!RESET& InSt0*&!InSt1*& InSt2*&!InSt3*& InSt4*)|
(!RESET& InSt0*&!InSt1*&!InSt3*&!InSt4*)|
(!RESET& InSt0*&!InSt1*&!InSt2*& InSt3*)|
( MAC&!RESET&!InSt0*& InSt1*& InSt2*& InSt3*)|












































































































































-00101 1 00 1000000000000000000000
.-0—
-001 100 001 100000000000000000000

















































-001 100 001 100000000000000000000
-11—
-000000 100000000000000000000000
01 1—-000101 01 1000000000000000000000
01 1—001 100 101100000000000000000000
1 1 1—-000000 10000000000000000000000
1 1 1—-000101 000 100000000000000000000
1 1 1—-001 100 010010000000000000000000
APPENDIX B. BASIC CELL LAYOUTS






B. BASIC REGISTER LAYOUT
1. Two's Complement Register








C. BASIC ADDER LAYOUT
D. BASIC COMPARATOR LAYOUT




F. FINITE STATE MACHINE

















APPENDIX C. SIMULATION DATA
A. FINITE STATE MACHINE ESIM OUTPUT
sun17:/home3/nowicki/thesis/FSM
% ext2sim newfsmpla
Memory used: 242k
sun17:/home3/nowicki/thesis/FSM
% esim newfsmpla.sim
ESIM (V3.5 03/27/91)
703 transistors, 174 nodes (100 pulled up)
sim> I
initialization took 395 steps
sim> I
initialization took steps
sim> w RDWR AV NRF MAC NAV NMT MT RESET NDV ROK EMA
AMWR EWR DV AMRD CAR OMRD PAR SUB ADD
AMPA PDMR PDR DMCP COM OMPR MACE clka clkb
sim> | Set up clock
sim> V clka 110000
sim> V clkb 000110
sim> | Simulations will be for 12 clock pulses































sim> | To State 1 (00001)
sim> V AV 11

































sim> | To State 2 (00010)
sim> V AV 00
sim> V NRF 00

































sim> | Back to State 0
sim> V MAC 00

































sim> | To State 3 (00011)
sim> V NAV 00
sim> V RDWR 11
sim> V AV 11
































sim> | To State 4 (00100)
sim> V AV 00
sim> V RDWR 00
sim> V NRF 00
































sim> | To State 5 (00101)
sim> V MAC 00
































sim> | To State 6 (00110)
sim> V NAV 00
sim> V AV 11
































sim> | To State 7 (00111)
sim> V AV 00
sim> V NRF 00

































sim> | Back to State 5 (00101)
sim> V MAC 00
































sim> | To State 8 (01000)
sim> V NAV 00
sim> V RDWR 11
sim> V AV 11

































sim> | To State 9 (01001)
sim> V RDWR 00
sim> V AV 00
sim> V NRF 00
































sim> | To State 10 (01010)
sim> V MAC 00
































sim> | To State 11 (01011)
sim> V NAV 00
































sim> | To State 12 (01100)































sim> | To State 13 (01101)
sim> V AV 11


































sim> V AV 00
sim> V NRF 00
































sim> | To State 16 (10000)
sim> V NMT 00
































sim> | To State 17 (10001)
sim> V MAC 00
































sim> | Back to State 12 (01100)
































sim> | To State 13 (01101)
sim> V AV 11
































sim> | To State 15 (01111)
sim> V AV 00
sim> V NRF 00

































sim> | To State 16 (10000)
sim> V MT 00

































sim> | To State 17 (10001)
sim> V MAC 00
































sim> | Back to State 12 (01100)































sim> | To State 18 (10010)
sim> V RDWR 11
sim> V AV 11

































sim> | To State 19 (10011)
sim> V RDWR 00
sim> V AV 00
sim> V NRF 00
































sim> | To State 21 (10101)
sim> V NMT 00
































sim> | To State 10 (01010)
sim> V NAV 11






























>110000110000:clka
>000110000110:clkb
sim> | To State 11 (01011)
sim> V NAV 00
































sim> | To State 12 (01100)
































sim> | To State 18 (10010)
sim> V RDWR 11
sim> V AV 11
































sim> | To State 20 (10100)
sim> V RDWR 00
sim> V AV 00

































sim> | To State 21 (10101)
sim> V MT 00
































sim> | Check RESET to State 0 (00000)
sim> V MAC 00

































sim> | To State 1 (00001)
sim> V RESET 00
sim> V AV 111


































B. READ PREDICTION BUFFER CIRCUIT ESIM OUTPUT
Script started on Wed Sep 2 10:54:41 1992
sun17:/home3/nowicki/thesis/DONE/TEST
% ext2sim RPB
Memory used: 1802k
sun17:/home3/nowicki/thesis/DONE/TEST
% esim RPB.sim
ESIM (V3.5 03/27/91)
5036 transistors, 2548 nodes (149 pulled up)
sim> I
I
initialization took 8435 steps
sim> initialization took 495 steps
sim> I
initialization took 286 steps
sim> | Patchfile watches all input and output nodes
sim> @ patchfile.sim
sim> | Set up clock pulses for FSM



























































































































































































































sim> | Set first Adrs to 1, and set read data and data line from CPU
sim> h A0 H0 H1 H2 G4 G5 G6
sim> | To State 1
sim> V AV 11


































































































































































































































sim> | To State 2
sim> V AV 00
sim> V NRF 00



























































































































































































































sim> | To State 0
sim> V MAC 00



























































































































































































































sim> | To State 3
sim> V NAV 00
sim> | Set Read Adrs to 2
sim> l A0
sim> h A1
sim> V RDWR 11
sim> V AV 11


























































































































































































































sim> | To State 4
sim> V AV 00
sim> V RDWR 00
sim> V NRF 00


























































































































































































































sim> | To State 5
sim> V MAC 00


























































































































































































































sim> V NAV 00
sim> | Set next Write Adrs to 3
sim> h A0
sim> | To State 6
sim> V AV 11





























































































































































































































sim> | To State 7
sim> V AV 00
sim> V NRF 00






























































































































































































































sim> | To State 5
sim> V MAC 00



























































































































































































































sim> V NAV 00
sim> | Set Read Adrs to 10
sim> l A0
sim> h A3
sim> | To State 8
sim> V RDWR 11
sim> V AV 11































































































































































































































sim> | To State 9
sim> V RDWR 00
sim> V AV 00
sim> V NRF 00



























































































































































































































sim> | To State 10
sim> V MAC 00


















































>111111111111:C1
>111111111111:C2




































































































































































sim> | To State 11
sim> V NAV 00






















































































































































































































sim> | To State 12























































































































































































































sim> | Set Write Adrs to 14
sim> h A2
sim> | To State 13
sim> V AV 11






















































































































































































































sim> | To State 14
sim> V AV 00




























































































































































































































sim> | To State 16























































































































































































































sim> | To State 17
sim> V MAC 00























































































































































































































sim> | To State 12























































































































































































































sim> | Set Write Adrs to 18 (Matches Predicted Adrs)
sim> l A2 A3
sim> h A4
sim> | To State 13
sim> V AV 11
























































































































































































































sim> | To State 15






















































































































































































































sim> | To State 16



























































>111111111111:C9





























































































































































sim> | To State 17
sim> V MAC 00






















































































































































































































sim> | To State 12
















































>111111111111:C1
>111111111111:C2




































































































































































sim> | Set Read Adrs to 32
sim> l A1 A4
sim> h A5
sim> | To State 18
sim> V RDWR 11
sim> V AV 11












































































































































































































































sim> | To State 19
sim> V RDWR 00
sim> V AV 00






















































































































































































































sim> | To State 21






















































































































































































































sim> | To State 10
sim> V MAC 00























































































































































































































sim> | To State 11
sim> V NAV 00























































































































































































































sim> | Set Read Adrs to 54 (Matches Predicted Adrs)
sim> h A4 A2 A1
sim> | To State 18
sim> V RDWR 11
sim> V AV 11
sim> V NRF 11






















































































































































































































sim> | To State 20
sim> V RDWR 00
sim> V AV 00


























































































































































































































































sim> | To State 21






















































































































































































































sim> | Check RESET operation To State 0
sim> V MAC 00


























































































































































































































1. Advanced Micro Devices, Dynamic Memory Design, 1991/1992
Data/Handbook, 1991.
2. Hennessy, J. L., and Patterson, D. A., Computer
Architecture: A Quantitative Approach, Morgan Kaufmann
Publishers Inc., 1990.
3. Computer Science Division, University of California at
Berkeley, Berkeley Cad Tools User's Manual, University
of California at Berkeley, 1986.
4. Department of Electrical Engineering and Computer
Sciences, University of California at Berkeley, SPICE3C1
User's Guide, by T. Quarles, and others, 27 April 1987.
5. Billingsly, A., An Investigation of Memory Latency
Reduction Strategies for Uniprocessor Architectures,
Master's Thesis, Naval Postgraduate School, Monterey,
California, December 1992.
6. Northwest Laboratory for Integrated Systems, RNL 4.2
User's Guide, University of Washington, 1 September 1988.
7. Weste, N. H., and Eshraghian, K., Principles of CMOS VLSI
Design, Addison-Wesley Publishing Co., 1988.
8. Wakerly, J. F., Digital Design Principles and Practices,
Prentice Hall, 1990.
9. Computer Science Division, Electrical Engineering and
Computer Sciences, University of California at Berkeley,
Designing Finite State Machines with PEG, by G. Hamachi,
17 November 1985.











Department of Electrical and Computer Engineering
Naval Postgraduate School
Monterey, CA 93943-5000
Prof. Douglas J. Fouts, Code EC/Fs
Department of Electrical and Computer Engineering
Naval Postgraduate School
Monterey, CA 93943-5000
Prof. Herschel H. Loomis, Jr., Code EC/Lm




Naval Command Control and Ocean Surveillance Center
ISE West Coast Detachment
Vallejo, CA 94592-5017






