Floating point bit-sequential arithmetic units / by Blaker, David Mark
Lehigh University
Lehigh Preserve
Theses and Dissertations
1988
Follow this and additional works at: https://preserve.lehigh.edu/etd
Part of the Electrical and Computer Engineering Commons
This Thesis is brought to you for free and open access by Lehigh Preserve. It has been accepted for inclusion in Theses and Dissertations by an
authorized administrator of Lehigh Preserve. For more information, please contact preserve@lehigh.edu.
Recommended Citation
Blaker, David Mark, "Floating point bit-sequential arithmetic units /" (1988). Theses and Dissertations. 4909.
https://preserve.lehigh.edu/etd/4909
Floating Point Bit-Sequential 
Arithmetic Units 
by 
David Mark Blaker 
A Thesis 
Presented to the Graduate Committee 
of Lehigh University 
in Candidacy for the Degree of 
Master of Science 
in
Electrical Engineering 
Lehigh University 
1988 
This thesis is accepted and approved in partial fulfillment of the 
requirements for the degree of Master of Science. 
JULY 15, 1988 
(date) 
Professor in Charge
Chairman of Department 
Acknowledgements
I am profoundly grateful to my wife, Polly, and to my children, Sarah
Elizabeth and Nathan Isaac, for their support and understanding while this
thesis was created. They were generous and gracious in making do without me
for many evenings and weekends. I could not have finished this job without
their love. I would like to thank Dr. Wagh for his guidance and numerous
suggestions, without which this thesis would have been incomplete. Thanks are
also due to my managers and colleagues at AT&T Bell Laboratories, who have
generously allowed me the time and resources to do this work. Finally, I wish to
express my gratitude to my parents for raising me and guiding me.
Table of Contents
Abstract
1. Introduction
1.1 Parallel Processing and VLSI
1.2 Fixed Point and Floating Point
1.3 Thesis Outline
2. On the Suitability of Bit-Sequential Architectures for VLSI
2.1 Introduction
2.2 Wire Delays in VLSI
2.3 Bit-sequential architectures
2.4 Fixed Point Bit-Sequential Arithmetic Units
3. Floating Point Bit-sequential Multiplier
3.1 Introduction
3.2 Multiplication Strategy
3.3 Detailed Implementation
3.4 Conclusion
4. Floating Point Bit-sequential Adder
4.1 Introduction
4.2 Addition Strategy
4.3 Detailed Implementation
4.4 Conclusion
5. Applications for Bit-Sequential Floating-Point Arithmetic Units
5.1 Introduction
5.2 Matrix-Vector Multiplication
5.3 FFT Calculation
6. Conclusion
6.1 Discussion
6.2 Future Directions
References
Appendix A. Floating Point Bit-Sequential Multiplier Schematics
Appendix B. Floating Point Bit-Sequential Adder Schematics
Vita
List of Figures
Figure 2.1. Bit-Sequential Full Adder
Figure 2.2. Semi-Systolic Bit-Sequential Multiplier
Figure 2.3. Kaiser, Jackson and McDonald Bit-Sequential Multiplier Section
Figure 2.4. High Performance Bit-Sequential Multiplier Section
Figure 5.1. Matrix-Vector Multiplier
Figure 5.2. Butterfly Processor
Figure 5.3. FFT Calculator
Figure A.1. FPMPY.1
Figure A.2. MANMPY.1
Figure A.3. MPY0.1
Figure A.4. MPY0.2
Figure A.5. MPY.1
Figure A.6. MPY.2
Figure A.7. MPY23.1
Figure A.8. MPY23.2
Figure A.9. EXPFMT.1
Figure A.10. EXPFMT.2
Figure A.11. EXPFMT.3
Figure A.12. EXPFMT.4
Figure A.13. EXPFMT.5
Figure A.14. EXPFMT.6
Figure A.15. EXPFMT.7
Figure A.16. EXPFMT.8
Figure B.1. FPADD.1
Figure B.2. FPADD.2
Figure B.3. FPADD.3
Figure B.4. FPADD.4
Figure B.5. FPADD.5
Figure B.6. FPADD.6
Figure B.7. FPADD.7
Figure B.8. FPADD.8
Figure B.9. FPADD.9
Figure B.10. FPADD.10
Figure B.11. FPADD.11
Figure B.12. FL1P3AX.1
Figure B.13. SW2X2.1
Figure B.14. DNRMCTR.1
Figure B.15. DNRMCTR.2
Figure B.16. EXPCTR.1
Figure B.17. RNRMCTR.1
Abstract 
Computational architectures must be reevaluated in the light of VLSI. In
the past, processors and even arithmetic units were constructed from many 
microelectronic components, interconnected on some substrate. Today, 
arithmetic units and even whole processors are integrated into a single silicon 
die, whether in CMOS or bipolar technology. A fundamental problem as VLSI 
scales to smaller features and higher densities is that the delay associated with 
wires is becoming a larger and larger portion of the total delay, and threatens 
continued advances in system speed. At least for a certain class of problems, 
the answer is to switch to bit-sequential architectures. In these architectures, 
wire lengths are kept very short, and there is almost no global communication,
aside from the clock. For many problems in this class, floating point number
representation is necessary, but most of the published work in bit-sequential
arithmetic uses the fixed point format. This thesis describes the novel design of
a floating point bit-sequential multiplier and adder. Addition at 0.33 Mflops and
multiplication at 0.9 Mflops are achieved with 12,192 and 9,024 transistors,
respectively, using a 1.25 µm CMOS technology.
Chapter 1 
Introduction 
1.1 Parallel Processing and VLSI 
Recent advances in integrated circuit manufacturing have made custom Very
Large Scale Integrated (VLSI) circuits relatively inexpensive and widely
available. Computational architectures must be reevaluated in the light of VLSI
advances. In the past, processors and arithmetic units used many discrete
components placed on a substrate. The current VLSI technology allows
arithmetic units and even whole processors to be integrated into a single silicon
die, both in CMOS and bipolar circuits. There is even the possibility of
integrating multiple processors, at least simple ones, onto a single die. Certainly
many processors can be integrated onto a common substrate, and into a
common system. However, the design decisions made in that previous era are
not necessarily the correct ones in the present era of VLSI.
A fundamental problem as VLSI scales to smaller features and higher
densities is that the delay associated with wires is becoming a larger and larger
portion of the total delay, and threatens continued advances in system speed.
Barring any breakthrough in interconnection technology, such as
superconducting interconnects on VLSI chips, it is necessary to find a way to
exploit millions of transistors on a single silicon die, while shrinking the lengths
of the wires being used. At least for a certain class of problems, the answer is to
switch to bit-sequential architectures. In these architectures, wire lengths are
kept very short, and there are almost no global communications, aside from the
clock. Another problem in VLSI is the limitation of package pin counts. The
advantage of bit-sequential architecture can be more than an order of
magnitude reduction in pin counts.
1.2 Fixed Point and Floating Point
There are many times in signal processing when fixed point arithmetic is
inadequate, due to limited dynamic range. In these cases, floating point
arithmetic is necessary. To date there has been a plethora of papers published
about fixed point bit-sequential arithmetic units[1] [2] [3], and a dearth of papers
about floating point bit-sequential arithmetic units. This thesis attempts to
redress that lack by reporting the design of two fundamental floating point
bit-sequential arithmetic units, a multiplier and an adder.
1.3 Thesis Outline
Chapter 2 explores the argument for bit-sequential arithmetic in more detail,
and reviews some examples of fixed point bit-sequential arithmetic units. In
chapter 3, the design of a novel floating point bit-sequential multiplier is
reported in detail. Chapter 4 reveals the design of a novel floating point
bit-sequential adder. Both these processors accept two floating point operands in
VAX format bit-sequentially, and produce their results in the same format. In
chapter 5, two applications are described for the arithmetic units developed in
this thesis. Chapter 6 states the conclusions.
Chapter 2 
On the Suitability of Bit-Sequential 
Architectures for VLSI 
2.1 Introduction 
In programming a function on a digital computer, it is always possible to make
tradeoffs between the time to run the program, and the space, or amount of
memory, which the program uses. A similar, but more complicated situation
exists with regard to designing digital hardware, especially with regard to VLSI.
Again, a tradeoff can be made between time and space, or in this case amount
of hardware. However, in the case of the hardware implementation of a
function, there is a cost of communications which is not apparent in the case of
software. In the case of hardware, any information which must be passed
between two or more separate physical entities must pass over some physical
medium. Let us limit our discussion to the case of conducting wires.*

* The advent of optical communication may change the constraints in the near future, but for
now, optics is not available to the VLSI designer.
2.2 Wire Delays in VLSI
Wires take up both area and time. VLSI is essentially a planar technology,
therefore we discuss the area of wires embedded in a plane. It is a simple
matter to extend the argument to the 2½-dimensional case of multiple planes
which can communicate only by orthogonal contacts. The cost of a wire is its
area in the plane, which is proportional to its length and width.[4] There is
another cost, which is more subtle. Since wires are embedded in a plane or
planes, and have finite width, the perimeter of a module which must
connect to n wires is proportional to n. If the module is square, then the
minimum area it can possibly have is proportional to $(n/4)^2$. It can be seen,
then, that it is very important to balance the area of wires and the area of the
modules which they connect, lest the VLSI device be overwhelmed by the area
of wires. In fact, many VLSI devices today have about half of their area
devoted exclusively to wiring.
The other cost of wires is in their delay. As VLSI technology continues to
scale to smaller and smaller dimensions, the capacitance of the wiring per unit
length does not decrease. This is due to the fact that the capacitance per unit
length is proportional to the ratio of width to height above the substrate. As
the technology scales in both directions, this ratio remains approximately fixed.
Also, the resistance, which is inversely proportional to the width and to the
thickness of the wire, goes up as the square of the scaling. Therefore, the RC
time constant of the wire goes up quadratically with the scaling. Now, it is
possible to ignore the wire delay, and regard the wire as a lumped capacitor, if the RC
delay of the wire is significantly shorter than the gate delay.[5] Otherwise, the
wire delay becomes a significant part of the total delay. The delay of the gates
themselves, however, does scale down. Therefore, we see that the RC time
constant is going up approximately quadratically, while the gate delay is going
down as the channel lengths.
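The scaling argument in this paragraph can be written out as a short worked derivation. This is only a sketch under stated assumptions: the scale factor $\alpha > 1$, the per-unit-length symbols $r$ and $c$, the fixed wire length $L$, and the first-order delay form $rcL^{2}$ are notation introduced here for illustration and do not appear in the thesis.

```latex
% Assumed notation: every cross-sectional dimension (width w, thickness t,
% height h above the substrate) shrinks by 1/alpha; wire length L is fixed.
\begin{align*}
c' &= c
   && \text{(the ratio } w/h \text{ is unchanged)} \\
r' &= \alpha^{2}\, r
   && \text{(} r \propto 1/(w\,t) \text{)} \\
\tau'_{\mathrm{wire}} &= r' c' L^{2} = \alpha^{2}\, \tau_{\mathrm{wire}}
   && \text{(quadratic growth)} \\
\tau'_{\mathrm{gate}} &= \tau_{\mathrm{gate}} / \alpha
   && \text{(tracks channel length)} \\
\frac{\tau'_{\mathrm{wire}}}{\tau'_{\mathrm{gate}}}
   &= \alpha^{3}\, \frac{\tau_{\mathrm{wire}}}{\tau_{\mathrm{gate}}}
\end{align*}
```

Under these assumptions, the ratio of wire delay to gate delay for a wire of fixed length grows roughly as the cube of the scale factor.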
The result of this analysis is that the length of a "delayless" wire is
decreasing. Note, however, that the size of the chips does not decrease. As the
technology continues to scale down, designers keep the chip sizes the same, and
put in more devices. This is the point: if the architecture of the device is such
that there are wires in the path which determines the cycle time of
the device, then we will soon reach the point where further scaling, while it may
increase the density of the device, cannot improve its speed. What can we learn
from this argument? In order to gain increases in throughput which are
commensurate with the gains in the transistors themselves, it is necessary to
reduce the lengths of the wires in the critical paths of the architecture.
2.3 Bit-sequential architectures
The logical conclusion of the above discussion is to use a bit-sequential
architecture. In a bit-sequential architecture, data is processed
word-parallel-bit-serial, rather than word-serial-bit-parallel, as in a microprocessor. A
bit-sequential architecture is also the ultimate conclusion of a pipelined
architecture. In systolic architectures, multiple processing units operate in
lockstep with each other, and only communicate with nearest neighbors. In a
semi-systolic architecture, this constraint is relaxed somewhat to allow some
globally broadcast signals. By processing data one bit at a time, in systolic or
semi-systolic architectures, a VLSI device can be expected to run at a speed
which more closely tracks the fundamental limits of the technology than a
parallel architecture can.
There are other reasons for wishing to use a bit-sequential architecture.
Chief among these is the high degree of regularity in designing a larger function
as a collection of identical cells. This has the effect of reducing design time,
which is critical for VLSI. The problem with having a million transistors on a
single die is defining what to do with them, and designing them. Reusing a
small collection of modules in large regular arrays is one way of addressing these
issues. Another point is that not having to drive large loads means that most of
the transistors in a serial architecture will be small. This increases the
transistor density of the device. Also, since there are almost no global wires, the
density goes up again. Further, there are many fewer transistors, because data
is being processed only one bit at a time, rather than all at once. Therefore the
area of a bit-serial implementation is much reduced from a parallel
implementation. In the domain of delay, the critical paths are limited to very
few gates and very short wires, so the clock rate goes up. Balanced against this
higher clock rate and lower area is a much lower throughput.
An important figure of merit for VLSI is the ratio of throughput to area. If
the reduction of area counterbalances the decrease in throughput in equal or
greater proportion, then it is advantageous to use a bit-sequential architecture.
This is especially true in building parallel arrays of computational units to
perform computationally intensive calculations, i.e., systolic arrays.
In all fairness and honesty, one must look at when it is not a good idea to
use bit-sequential architecture. These seem to be the two extreme cases: when
the required throughput is so low that a general purpose processor is more than
a match, and when the data rate approaches the maximum clock rate of the
technology. When the data rate approaches the maximum clock rate, bit-serial
computational units must be interleaved to keep up. At that point, it may
become simpler to use a pipelined parallel implementation.
2.4 Fixed Point Bit-Sequential Arithmetic Units 
At this point, it is useful to survey some existing bit-sequential implementations.
Since this thesis is limited to a floating point multiplier and adder, the
discussion will be limited to fixed point multipliers and adders. A bit-sequential
full adder is simplicity itself (see Figure 2.1).
[Schematic not recoverable from the scan. Labeled elements: inputs A and B, Full Adder, Carry, two D flip-flops, Reset, output Sum.]
Figure 2.1. Bit-Sequential Full Adder
It consists of a full adder, two D flip-flops and an AND gate. Both operands are
input one bit at a time, least significant bit first. The output also appears one
bit at a time, least significant bit first, delayed from the input by one clock
cycle.
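The behavior just described can be illustrated with a small cycle-level model. This is a minimal sketch, not the thesis circuit: it assumes that one flip-flop holds the carry, that the other registers the sum output (which gives the one-clock delay), and that the AND gate clears the carry feedback when reset is asserted with the least significant bit. The names `BitSerialAdder`, `clock`, and `add_serial` are hypothetical.

```python
class BitSerialAdder:
    """Cycle-level model of a bit-sequential full adder (LSB first).

    Assumed wiring (not confirmed by the source): one D flip-flop holds the
    carry, a second registers the sum, and the AND gate blocks the stored
    carry while reset is asserted.
    """

    def __init__(self):
        self.carry = 0  # carry flip-flop
        self.sum_q = 0  # output flip-flop, source of the one-clock delay

    def clock(self, a, b, reset=0):
        """Apply one clock; return the sum bit visible during this cycle."""
        out = self.sum_q
        carry_in = self.carry & (1 - reset)  # AND gate drops a stale carry
        total = a + b + carry_in
        self.sum_q = total & 1
        self.carry = total >> 1
        return out


def add_serial(x, y, width):
    """Add two unsigned width-bit words one bit per clock (modulo 2**width)."""
    adder = BitSerialAdder()
    bits = []
    for i in range(width + 1):  # one extra clock flushes the last sum bit
        a = (x >> i) & 1 if i < width else 0
        b = (y >> i) & 1 if i < width else 0
        bits.append(adder.clock(a, b, reset=int(i == 0)))
    # bits[0] is the stale register content; the sum starts one clock later.
    return sum(bit << i for i, bit in enumerate(bits[1:]))
```

For example, `add_serial(5, 3, 4)` feeds the two 4-bit words in least significant bit first and collects the 4-bit sum one clock later.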
There are many bit-sequential multipliers in the literature. Two
fundamentally different approaches are described here. One is purely systolic,
and accepts both operands in bit-sequential form. The other is semi-systolic,
and is actually a parallel-serial multiplier. The semi-systolic multiplier receives
one of its operands in parallel, and one in serial form[6] (see Figure 2.2).
[Schematic not recoverable from the scan. Labeled elements: Coef n-1, Coef n-2, ..., Data, flip-flops (FF), AND gates, bit-sequential full adders (B-S FA), output Product.]
Figure 2.2. Semi-Systolic Bit-Sequential Multiplier
The multiplier multiplies two n-bit numbers, producing a 2n-bit result every 2n
clock cycles. The delay is 1 clock cycle. It is semi-systolic because the serial
input is broadcast to every cell.
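A cycle-level sketch of such a parallel-serial multiplier follows. It is an illustration under assumptions, not the circuit of Figure 2.2: operands are taken as unsigned, the coefficient bits are held in parallel, each cell ANDs its coefficient bit with the broadcast data bit, and the cells are chained through bit-sequential adders whose registered outputs supply the one-bit shift between neighbors. The function name `serial_parallel_multiply` is hypothetical.

```python
def serial_parallel_multiply(coef, data, n):
    """Unsigned n-bit by n-bit product, one broadcast data bit per clock.

    Cell i adds (coef bit i AND data bit) to the registered output of cell
    i + 1; each registered hop doubles the weight, so cell 0 emits the full
    product LSB first, one clock after the first data bit goes in.
    """
    coef_bits = [(coef >> i) & 1 for i in range(n)]  # parallel operand
    sum_q = [0] * (n + 1)  # registered sum per cell; sum_q[n] stays 0
    carry = [0] * n        # carry flip-flop per cell
    product = 0
    for t in range(2 * n + 1):
        if t >= 1:
            product |= sum_q[0] << (t - 1)  # output bit, one clock late
        d = (data >> t) & 1 if t < n else 0  # serial operand, zero padded
        new_sum = sum_q[:]
        for i in range(n):
            total = (coef_bits[i] & d) + sum_q[i + 1] + carry[i]
            new_sum[i] = total & 1
            carry[i] = total >> 1
        sum_q = new_sum
    return product
```

The loop runs for 2n clocks plus the one-clock output delay, which matches the stated rate of one 2n-bit result every 2n clock cycles.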
The original bit-serial multiplier seems to have been described by Jackson,
Kaiser and McDonald[7] (see Figure 2.3). n of these sections may be
concatenated to form an n bit multiplier. In this case, both operands are input
serially, and there are no broadcast signals. Two n-bit numbers are multiplied
to produce a truncated, n-bit product every n clock cycles. The truncate input
is the same signal as the reset to the bit-sequential full adder. The delay of
each section is two clock cycles. The truncation has the effect of right shifting
the partial product produced in each cell to align it with the partial product in
the next cell, much as is done in manual multiplication.

[Schematic not recoverable from the scan. Labeled elements: inputs X in, PP in, Truncate, Coeff.; flip-flops (FF), AND gate, bit-sequential full adder (B-S FA); outputs X out, PP out.]
Figure 2.3. Kaiser, Jackson and McDonald Bit-Sequential Multiplier Section
12 
/ 
' ' 
I. 
1 I''-'· t', "'i.-I ri1 trl\ , . 
•j 
.. 
'j"-·"· ' " ., .. ,, ,., ··1 ' ' ~ ' ~ • '. ~ -, ' •,-... ' . '" ,,...._. ,. • ,- -·-• • '•-" '' • ,. "•• [I '. 
, , . 
I' 
I ' 
... 
The fixed point multiplier which is used in this thesis is a variation of the 
I 
Kaiser, Jackson, McDonald multiplier, which was described by Scanlon and 
Fuchsl81 at the 1986 IEEE International CorJ,ference on Computer Design (see 
Figure 2.4 ). 
[Schematic not recoverable from the scan. Labeled elements: inputs X in, Y in, PP in, SP in, R in; flip-flops (FF), AND gates, OR gate, Full Adder; outputs X out, Y out, PP out, SP out, R out.]
Figure 2.4. High Performance Bit-Sequential Multiplier Section
This multiplier, which differs from that of Jackson, et al, only in the addition of 
one D flip-flop and one 3-input complex gate per bit, produces the 2n-bit 
product of two n-bit twos complement numbers every 2n clock cycles. The Y 
input corresponds to the coefficient input in the previous multiplier, which has 
been serialized. The PP signals are the partial products corresponding exactly
to the PP signals in the previous multiplier. The SP signals are the least 
significant n bits of the partial product, which were lost in the previous 
multiplier. Having the complete 2n bit product will be important in the next 
chapter on floating point multiplication. 
Chapter 3 
Floating Point Bit-sequential Multiplier
3.1 Introduction 
There are cases where the dynamic range and precision of fixed-point
representation are inadequate. Examples are large matrix operations, large
FFT's, etc. Also, many algorithms are initially designed and simulated in
software, using floating point arithmetic, on large general-purpose computers.
However, parallel floating point arithmetic units are large and very expensive.
Putting together many of them would be prohibitively costly in area, power and
dollars. For these reasons, a floating point bit-sequential multiplier, and a
floating point bit-sequential adder have been designed. This chapter describes
the multiplier, and the next chapter describes the adder.
The floating point number representation chosen for this multiplier is in
VAX F format. The mantissa is a 24 bit binary string. The exponent is an 8 bit
binary string in two's-complement format, biased by +128. The sign is a single
bit which is asserted for negative values. The value of the number is
$(-1)^{S} \times 2^{E-128} \times (M \times 2^{-23})$. For example, -32 would be represented by
S=1, E=10000101, M=100000000000000000000000. 1 would be represented by
S=0, E=10000000, M=100000000000000000000000. -63/64 would be
represented by S=1, E=01111111, M=111111000000000000000000. The
multiplier performs fair rounding according to the IEEE 754-1985 standard.[9] If
a result lies exactly halfway between two representable results, then the even
result is chosen.
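The value formula and the three examples can be checked with a short decoder. This is a sketch for illustration only: the function name `decode` and the use of exact fractions are choices made here, and the treatment of a zero exponent field as the value zero follows the convention stated later in this chapter.

```python
from fractions import Fraction


def decode(sign, exponent, mantissa):
    """Value of an (S, E, M) triple: (-1)^S * 2^(E - 128) * (M * 2^-23).

    exponent is the 8-bit biased field and mantissa the 24-bit field with
    the leading bit included. A biased exponent of 0 denotes zero.
    """
    if exponent == 0:
        return Fraction(0)
    magnitude = Fraction(mantissa, 2 ** 23) * Fraction(2) ** (exponent - 128)
    return -magnitude if sign else magnitude


# The three examples from the text.
minus_32 = decode(1, 0b10000101, 0b100000000000000000000000)
one = decode(0, 0b10000000, 0b100000000000000000000000)
minus_63_64 = decode(1, 0b01111111, 0b111111000000000000000000)
```

Exact fractions are used instead of binary floating point so that the decoded values can be compared without any rounding of their own.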
3.2 Multiplication Strategy
Floating point multiplication is not terribly more complicated than fixed point
multiplication. The algorithm for floating point multiplication is:[10]
PAR DO 
1. multiply the two mantissas, and obtain a full-
precision result 
2. add the two exponents 
END DO 
3. normalize the mantissa, and adjust the exponent if 
necessary (given normalized operands, only 1 bit of 
adjustment at most) 
4. round the full-precision mantissa to the original 
precision, and adjust the exponent if necessary 
5. if the result exponent is outside the representable 
range, then signal an underflow or overflow, 
respectively, and adjust the output 
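The five steps above can be sketched in software as follows. This is a word-level illustration, not the bit-sequential hardware: the function name `fp_multiply`, the tuple layout `(sign, biased_exponent, mantissa)`, and the flag names are assumptions made here. The overflow and underflow outputs (all ones, and all zeros) follow the behavior described later in this chapter.

```python
def fp_multiply(a, b):
    """Multiply two (sign, biased_exponent, mantissa) triples.

    Mantissas are 24-bit integers with the top bit set, exponents are biased
    by 128, and a biased exponent of 0 denotes zero. Returns the result
    triple and a set of flag names.
    """
    (sa, ea, ma), (sb, eb, mb) = a, b
    sign = sa ^ sb
    if ea == 0 or eb == 0:
        return (0, 0, 0), {"zero"}
    # Steps 1 and 2 are independent of each other (the PAR DO block).
    product = ma * mb         # 48-bit full-precision mantissa product
    exponent = ea + eb - 128  # remove one of the two biases
    # Step 3: normalize; at most one bit of adjustment is needed.
    shift = 23
    if product >> 47:
        shift, exponent = 24, exponent + 1
    # Step 4: round to nearest, ties to even.
    mantissa = product >> shift
    discarded = product & ((1 << shift) - 1)
    half = 1 << (shift - 1)
    flags = {"inexact"} if discarded else set()
    if discarded > half or (discarded == half and mantissa & 1):
        mantissa += 1
        if mantissa >> 24:    # rounding carried out of the top bit
            mantissa >>= 1
            exponent += 1
    # Step 5: range check on the biased exponent.
    if exponent > 255:
        return (sign, 0xFF, 0xFFFFFF), flags | {"overflow", "inexact"}
    if exponent < 1:
        return (0, 0, 0), flags | {"underflow", "inexact"}
    return (sign, exponent, mantissa), flags
```

As a usage example, `fp_multiply((0, 128, 1 << 23), (1, 133, 1 << 23))` multiplies 1 by -32 in the format defined above.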
Refer to Appendix A, Figure A.1, for a block diagram of the floating point
multiplier. The implementation receives each of the data words on two wires: a
mantissa wire, and an exponent wire. Every 24 clock cycles, a complete 24-bit
mantissa, including the hidden bit, can be entered into the multiplier, least
significant bit first, on the mantissa wires (MCDMAN and MPYMAN). Every
24 clock cycles, the corresponding exponent and sign are received on the
exponent wires (MCDEXP and MPYEXP). The eight bits of the exponent are
entered least significant bit first, followed by the sign bit, followed by 15 bits
which are undefined and ignored. There is also a reset input, which must be
entered coincident with the lsb's of the mantissa and the exponent. A block
diagram of the multiplier can be found in Figure 3.1.
3.3 Detailed Implementation
The multiplier is broken into two parts: a mantissa multiplier, and an
exponent/reformatter. The mantissa multiplier is a completely systolic
implementation, as described in the previous chapter. The delay from the
appearance of the lsb of the input to the lsb of the output is 25 clock cycles. A
full 48 bit result can be produced every 24 clock cycles. The mantissa multiplier
is fully pipelined, so that a multiplication can be performed every 24 clock
cycles.
The exponent/reformatter (EXPFMT) section takes the full-precision
mantissa product, and the two exponent and sign inputs, and formats the
outputs. Within the EXPFMT, there are two basic operations: finding the sign
and exponent of the result, and rounding and adjusting the output mantissa
and exponent. The exponent of the result is found with a simple bit-sequential
adder. However, since each of the two operands' exponents is biased by plus
128, one of the biases must be subtracted before the result exponent (XP) is
correct. The sign is found by just XORing together the signs of the two input
operands to obtain the sign of the result (NEG). Once the result exponent is
found, it is necessary to detect certain conditions.
The first condition is overflow. If the resulting unbiased exponent is greater
than +127, then it is outside the domain of the representation, and a flag is set
to indicate overflow (OVF). Since the multiplier pipelines three operations, the
flag is pipelined. A related condition is maximum exponent (MXP). If the
unbiased exponent is exactly +127, then an overflow results if either the
mantissa product is $\ge 2^{1}$, or rounding the mantissa forces a carry out of the
most significant bit. This flag is also pipelined.
The next condition is underflow. If the resulting unbiased exponent is less
than -128, then it is outside the domain of the representation in the other
direction, and the underflow flag (UNF) is set. However, an underflow only
occurs if the result is finite and too small to be represented in the dynamic
range of the representation, but not if the result is exactly zero. If either of the
two input exponents is exactly 0 (biased representation), then that input is 0,
and the result is exactly zero, and not an underflow. Therefore, if that case is
detected the zero flag (ZERO) is set, and the UNF flag is not set. A related
condition is minimum exponent minus one (MNP). Notice that although the
representable range of unbiased exponents is [-127,+127], UNF is not
asserted unless the resulting unbiased exponent is less than -128. If the
resulting unbiased exponent equals -128, then an underflow might or might not
occur, depending on the mantissa product. If the mantissa product overflows
either before or after rounding, then the exponent will be incremented by 1 to
-127, and no underflow will occur. If, however, no mantissa overflow occurs,
then an underflow will be signaled.
In summary, the result of the first stage of the EXPFMT is the sign bit
(NEG), the eight bit biased exponent (XP), and five flags: overflow (OVF),
maximum exponent (MXP), underflow (UNF), minimum exponent minus one
(MNP), and zero (ZERO). Since the most significant bit of the mantissa
product does not occur for 72 clock cycles, or three words, each of these
fourteen bits is pipelined three times, 24 clock cycles apart.
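The fourteen bits produced by this first stage can be summarized in a word-level sketch. The function name `exponent_stage` and the dictionary layout are hypothetical, and wrapping XP to eight bits is an assumption made here; the flag conditions themselves are taken from the text. Only UNF is suppressed for a zero operand, since that is the only gating the text states.

```python
def exponent_stage(sa, ea, sb, eb):
    """First EXPFMT stage: sign, biased exponent, and five condition flags.

    sa, sb are sign bits; ea, eb are 8-bit exponents biased by 128.
    """
    unbiased = (ea - 128) + (eb - 128)
    zero = ea == 0 or eb == 0          # a zero exponent field means the value 0
    return {
        "NEG": sa ^ sb,                     # XOR of the operand signs
        "XP": (ea + eb - 128) & 0xFF,       # assumed to wrap to 8 bits
        "OVF": unbiased > 127,              # certain overflow
        "MXP": unbiased == 127,             # overflow if the mantissa overflows
        "UNF": unbiased < -128 and not zero,  # not an underflow if exactly zero
        "MNP": unbiased == -128,            # underflow unless the mantissa overflows
        "ZERO": zero,
    }
```

For instance, `exponent_stage(0, 255, 0, 128)` sets only MXP, since the unbiased exponents 127 and 0 sum to exactly +127.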
The last stage of the EXPFMT, and of the bit-sequential floating point
multiplier, is rounding the product of the mantissas, and formatting the result.
Recall that the mantissa format is:
$2^{0} \,.\, 2^{-1} \ldots 2^{-23}$
and that the mantissa is a positive magnitude. Also, except for zero, the
mantissa must be normalized, i.e. the mantissa must lie in the range
$2^{0} \le \text{mantissa} \le 2^{1} - 2^{-23}$. Multiplying two 24 bit numbers results in a 48 bit
number, whose format is:
$2^{1}\, 2^{0} \,.\, 2^{-1} \ldots 2^{-46}$
and the range of the result is $2^{0} \le \text{mantissa product} \le 2^{2} - 2^{-21} + 2^{-46}$. If
the most significant bit of the result is set, then the mantissa overflows, the
mantissa is right shifted one bit, and the exponent is incremented by 1. If the
MXP flag has been set for this operation, then OVF is asserted, and the
mantissa and exponent are set to all ones. The sign of the result is unaffected.
The mantissa must be rounded to a 24 bit result. Since it is impossible to
know which will be the least significant bit until the most significant bit is
output by the mantissa multiplier, the 24 most significant bits of the result, a
guard bit (GRD), and two different rounding bits are saved: round if the most
significant product bit is asserted (RNDMSPT), and round if the most
significant product bit is negated (RNDMSPC). The fair rounding scheme
requires always rounding to an even result if the full-precision result is exactly
halfway between two results, i.e., the round bit is set, and all lower order bits
are zero. In that case, only round (increment the mantissa) if the mantissa is
odd. In the case that the mantissa product is $\ge 2^{1}$, then the least significant bit
of the product will be $2^{-22}$, and the result will be rounded up only if the $2^{-23}$
bit is set, and if either the least significant bit is set or any bit at all to the right
of the $2^{-23}$ bit is set. In other words, the result is rounded up only if the round
bit is set and the result is odd (so that it rounds to even), or if the round bit is
set and the bits to be truncated are > 1/2 the least significant bit. Similarly, in
the case that the mantissa product is $< 2^{1}$, then the least significant bit of the
product will be $2^{-23}$, and the result will be rounded up only if the $2^{-24}$ bit is
set, and if either the least significant bit is set, or if any bit at all to the right of
the $2^{-24}$ bit is set.
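The rounding rule for both cases can be sketched at word level as follows. The function names `round_up` and `round_product`, and the name `sticky` for the OR of all bits below the round bit, are illustrative choices made here rather than signal names from the thesis.

```python
def round_up(lsb, round_bit, sticky):
    """Fair rounding decision: increment only if the round bit is set and
    either the kept LSB is set (tie, round to even) or a lower bit is set."""
    return bool(round_bit and (lsb or sticky))


def round_product(product):
    """Round a 48-bit mantissa product to 24 bits.

    Returns (rounded_mantissa, shift). shift is 24 when the 2^1 bit of the
    product is set and 23 otherwise, which selects the round bit position
    (2^-23 or 2^-24) as described in the text.
    """
    shift = 24 if product >> 47 else 23
    kept = product >> shift
    round_bit = (product >> (shift - 1)) & 1
    sticky = (product & ((1 << (shift - 1)) - 1)) != 0
    return kept + round_up(kept & 1, round_bit, sticky), shift
```

The returned mantissa may equal 2^24 when the rounding carries out of the most significant bit; handling that case is the subject of the next paragraph.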
It is important to know whether rounding the mantissa will overflow the
mantissa, because that forces the exponent to be incremented, and may cause
an overflow if true, or an underflow if false. If it were necessary to wait to
detect this condition until after the complete result were obtained, then the
multiplier would have to add another pipeline stage. Fortunately, it is possible
to detect an overflow from rounding even before the rounding is done. It is only
necessary to detect that the 24 bits of the mantissa, assuming that the mantissa
product will not have the most significant bit asserted, and the round bit are all
equal to one. This is the only case which can force an overflow from rounding.
This case is detected, and if true, forces the mantissa to be right shifted and the
exponent to be incremented by one, with the possible consequences previously
described.
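The early detection can be compared against a straightforward reference that performs the rounding and then looks for a carry out. Both functions below are illustrative sketches with hypothetical names; bit positions are numbered from 0 at the least significant end of the 48-bit product, so the 24 mantissa bits and the round bit occupy bits 46 down to 22 when the most significant bit (bit 47) is clear.

```python
def rounding_will_overflow(product):
    """Early test: MSB clear, and the 24 mantissa bits plus round bit all ones."""
    msb_clear = (product >> 47) == 0
    window = (product >> 22) & 0x1FFFFFF  # bits 46..22, 25 bits in total
    return msb_clear and window == 0x1FFFFFF


def rounds_past_top(product):
    """Reference for an MSB-clear product: round (ties to even), then check
    whether the 24-bit mantissa carried out to 2^24."""
    kept, rest = divmod(product, 1 << 23)
    half = 1 << 22
    kept += rest > half or (rest == half and kept & 1)
    return kept >> 24 == 1
```

When the mantissa bits are all ones the kept result is odd, so a set round bit always causes an increment; that is why the bits below the round bit do not need to be examined.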
It then remains only to perform the actual rounding with a simple
bit-sequential half adder, and to increment the exponent with another
bit-sequential half adder. If an overflow occurs, the mantissa and exponent are set
to all ones, without affecting the sign bit. If an underflow occurs, or the result
is exactly zero, the mantissa, exponent and sign are all set to zero. If the result
does not exactly match the full-precision result, either because of truncating the
full-precision mantissa product, or because of an overflow or underflow, an
inexact flag (INX) is asserted. A new result appears on two wires (PRDMAN
and PRDEXP) every 24 clock cycles or less frequently. There is an output flag
(RST74), which indicates the least significant bit of a new result, and negative
(NEG), zero (ZRO), overflow (OVF), underflow (UNF) and inexact (INX) flags
which are also output.
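The output reformatting on overflow, underflow and zero can be summarized in a few lines. This is a sketch under stated assumptions: the function name, the tuple layout and the default field widths (`exp_bits`, `man_bits`) are hypothetical and chosen only for illustration.

```python
def format_result(sign, exponent, mantissa, *, overflow=False, underflow=False,
                  is_zero=False, exp_bits=8, man_bits=24):
    """Return (sign, exponent, mantissa) after exception formatting.

    Field widths are illustrative parameters, not values taken from the design.
    """
    if overflow:
        # Mantissa and exponent forced to all ones; the sign bit is left alone.
        return sign, (1 << exp_bits) - 1, (1 << man_bits) - 1
    if underflow or is_zero:
        # Mantissa, exponent and sign are all forced to zero.
        return 0, 0, 0
    return sign, exponent, mantissa
```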
3.4 Conclusion
The bit-sequential floating point multiplier is designed with 1.25µ CMOS
standard cells from the AT&T library.[11] That library has two sets of cells: an
area optimized set which uses 5µ transistors, and a performance optimized set
which uses 20µ transistors. This design used the area optimized set. It uses 381
flip-flops and 556 gates, or 9,024 transistors. The deepest path between two
flip-flops is 5 gates deep. Its maximum clock rate at 5.0V, 25°C and nominal
processing is 21.8 MHz. With a latency of 24 clock cycles, that is equivalent to 0.9
MFLOP. The delay from the least significant bit of the input to the least
significant bit of the output is 74 clock cycles. Since the multiplier only has 14
I/O's, it could be packaged in a 16-pin plastic DIP; that is a very cheap MFLOP.
The complete schematics of the multiplier are included as Appendix A, Figures
A.1-A.16.
Chapter 4 
Floating Point Bit-sequential Adder 
4.1 Introduction 
The floating point bit-serial adder uses the same format as the multiplier. It
has two additional inputs: a subtract augend input (SUBTRG), and a subtract
addend input (SUBTRD). When these signals are asserted, their respective
operands are negated by inverting their sign bits. The adder can therefore
implement a+b, a-b, -a+b and -a-b. The adder does, however, have a
slightly longer delay than the multiplier: 76 clock cycles instead of 74. The
adder is also not purely systolic. The exponent difference, mantissa
denormalization, arithmetic and renormalization are bit-serial, but the exponent
adjustment is done by a parallel counter, as the mantissa result is being
generated. For this reason, the floating point adder is the speed limiting
component in a bit-sequential system, rather than the multiplier.
4.2 Addition Strategy
The adder is not as easily broken into mantissa and exponent parts as the
multiplier. The algorithm for floating point addition is somewhat more
convoluted than for multiplication:[10]
1. find the difference in the exponents, select the larger
2. denormalize the mantissa of the number with the smaller exponent (right shift)
3. add or subtract the mantissas
4. normalize and round the mantissa, adjust the exponent
5. detect overflow or underflow
4.3 Detailed Implementation
The first step is to find the difference in the exponents, and to select the larger
number. The difference is found with one's complement arithmetic,
implemented with a bit-serial adder, a shift register, and a bit-serial half adder
and XOR. This implementation augments the algorithm by also finding the
larger number even when the exponents are equal. The smaller number, or the
AUGEND input if the numbers are exactly equal, is the one chosen to be
denormalized. This means that if a subtraction is performed, the result is
guaranteed to be non-negative. This will be important later. If the unbiased
result exponent (EXP) is +127, then the maximum exponent flag (MXP) is
asserted. If the rounded sum of the mantissas overflows, then the exponent will
be incremented by 1, and an overflow will occur.
The signs of the operands, after negating if the respective subtract inputs
are asserted, are XORed together. If they are different, the mantissa of the
smaller number is subtracted from the mantissa of the larger number after the
smaller number is denormalized. The sign of the result (NEG) is the sign of the
larger number.
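The sign handling described above reduces to a few XOR operations. The following sketch uses hypothetical argument names (`subtrg` and `subtrd` stand for the SUBTRG and SUBTRD inputs, and `augend_is_larger` stands for the outcome of the magnitude comparison); it is an illustration, not a transcription of the schematic.

```python
def effective_operation(sign_aug, sign_add, subtrg, subtrd, augend_is_larger):
    """Return (subtract, neg) for the mantissa datapath.

    sign_aug, sign_add -- sign bits of the augend and addend (0 or 1)
    subtrg, subtrd     -- subtract inputs; when set, the operand sign is inverted
    augend_is_larger   -- True when the augend has the larger magnitude
    """
    eff_aug = sign_aug ^ subtrg          # effective sign of the augend
    eff_add = sign_add ^ subtrd          # effective sign of the addend
    subtract = eff_aug ^ eff_add         # differing signs: subtract mantissas
    neg = eff_aug if augend_is_larger else eff_add   # sign of the larger operand
    return subtract, neg
```

For example, a-b with positive operands and b larger in magnitude gives a mantissa subtraction with a negative result sign.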
The next step is to denormalize the mantissa of the smaller number. This is
easily accomplished in the bit-serial format by loading the mantissa into a
parallel-in-serial-out register which also has a clock enable. The exponent
difference is loaded into a down counter (DNRMCTR), and counts down. As
long as the count is ≥ 0, the shift register does a logical right shift. The serial
output of the register goes into a three bit register, which consists of the guard
(GRD), round (RND) and sticky (STK) bits. The sticky bit is formed by ORing
together all the bits which are shifted into it. The purpose of the GRD bit is to
become the least significant bit if the mantissa is left shifted when the result is
normalized, and to be the round bit otherwise. The purpose of the RND bit is
to round the result if the mantissa is left shifted. The purpose of the STK bit is
more subtle. No matter how many bits are shifted past the RND bit, it is only
necessary to OR them together into the STK bit to get the correct result. If the
denormalized number is subtracted, then all the bits, including the STK bit, are
inverted, and one is added to the STK bit. The resulting STK bit will be the
OR of the result even if the subtraction were performed with full precision.
The resulting STK bit is also necessary to decide if the portion of the mantissa
to be truncated is exactly equal to one half the least significant bit. However,
since the adder is pipelined to accept a new operand every 24 clock cycles, only
shifts of 0 to 23 places can be made bit-serially. It is necessary to add a small
parallel shifter, which shifts the least significant bit of the denormalized
operand, the guard bit, the round bit and the sticky bit, between the output of
the denormalizing shift register and the next register in the pipeline.
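The behaviour of the guard, round and sticky bits during denormalization can be sketched as a bit-at-a-time right shift. The function name and the integer representation of the mantissa are assumptions for illustration; the sketch models only the logical behaviour, not the counter and clock-enable mechanism or the small parallel shifter.

```python
def denormalize(mantissa: int, shift: int):
    """Logically right-shift `mantissa` by `shift` places.

    Returns (mantissa, grd, rnd, stk), where grd and rnd are the two most
    recently shifted-out bits and stk is the OR of every older one.
    """
    grd = rnd = stk = 0
    for _ in range(shift):
        stk |= rnd            # bits leaving RND are ORed into the sticky bit
        rnd = grd             # the old guard bit moves into the round position
        grd = mantissa & 1    # the bit shifted out of the mantissa becomes GRD
        mantissa >>= 1
    return mantissa, grd, rnd, stk
```

However many places are shifted, only three extra bits are carried forward, which is why the sticky bit can stand in for an arbitrarily long tail.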
The third step is to perform the actual addition or subtraction. There are
only 24 clock cycles, but the denormalized mantissa is 27 bits long, including
the guard, round and sticky bits. Again, it is necessary to include a small
parallel adder to obtain the result for the least significant result bit, and the
guard, round and sticky results. The result of the summation is stored in a
shift register. It is possible for the result of the summation to overflow one bit,
or for there to be any number of leading zeroes from 0 to 24. In a parallel
floating point adder, the number of leading zeroes in the result is encoded, the
mantissa is renormalized and the exponent is decremented by that amount. In
the bit-serial case, it is possible to do these operations on-line, as the result is
being calculated.
As the result appears, one bit at a time, an exponent counter (EXPCTR)
decrements the exponent every time a zero appears in the result, and reloads
the original exponent every time a one appears, or if there is an overflow out of
the most significant bit. This is why it was necessary to select the numbers so
that the result could be guaranteed to be non-negative; otherwise this simple scheme
would not work. At the same time a renormalization counter (RNRMCTR)
counts the leading zeroes. Every time a one is encountered, or if there is an
overflow out of the most significant bit, RNRMCTR is cleared to zero. At the
same time, it is necessary to know if the result will overflow after rounding, so
that the exponent can be adjusted, and any overflow or underflow can be
detected. This condition is harder to detect in the adder than in the multiplier.
The mantissa in the multiplier could only overflow, not underflow. However, it
is possible to detect those results which will overflow after rounding, even before
performing the rounding.
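The on-line counting scheme can be sketched as a loop over the result bits in the order they are produced, least significant bit first. The function name is hypothetical, and the sketch deliberately leaves out the overflow-out-of-the-most-significant-bit case and the rounding adjustment; it shows only why "decrement on zero, reload on one" leaves the exponent reduced by the number of leading zeroes.

```python
def online_normalize_count(result_bits_lsb_first, exponent):
    """Model of the exponent counter and the renormalization counter.

    result_bits_lsb_first -- iterable of 0/1 result bits, least significant first
    exponent              -- exponent of the larger operand

    Returns (expctr, rnrmctr) once the most significant bit has been seen.
    """
    expctr = exponent
    rnrmctr = 0
    for bit in result_bits_lsb_first:
        if bit:
            expctr = exponent   # a one cancels all zeroes seen so far
            rnrmctr = 0
        else:
            expctr -= 1         # a zero may turn out to be a leading zero
            rnrmctr += 1
    return expctr, rnrmctr
```

Only the zeroes seen after the last one survive, and those are exactly the leading zeroes of the result. This depends on the result being non-negative, as the text notes.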
If the two input exponents differed by more than one, then the largest
possible left shift to renormalize the result is one. If, however, the two input
exponents were equal, or differed only by one, then the largest possible left shift
is 24. In this case, only the GRD bit can be set; the RND and STK bits must
be zero, because the denormalization shift was ≤ 1. Therefore, the mantissa can
only overflow after rounding if the 24 most significant bits are all one, and the
GRD bit is one, or if the most significant bit is zero, the next 23 bits are one,
and the GRD and RND bits are one. It is impossible for the two leading bits to
be zero if both the RND and STK bits are set. Therefore, there are no more
than these two ways for the 25 consecutive most significant bits of the
normalized result to be all ones.
Again, there is the problem that there are only 24 clock cycles to renormalize
and output the result, but there are 26 different possible realignments of the
result: right shift one (mantissa overflow), or left shift 0 to 24 places. This
problem was solved by combining a shift register with three pointers. If the
mantissa overflowed, or if the most significant bit would overflow after
rounding, then a shift right must be performed, and a pointer (SHR) is set
which points to the bit to the left of the lsb. This bit becomes the least
significant bit of the output of the result register. If the mantissa didn't
overflow, and the most significant bit is asserted, and the 24 most significant
bits and the GRD bit are not all ones, then no shifts are performed, and a
pointer (SHO) points to the least significant bit of the result, which becomes the
least significant bit of the output. If neither of these pointers is asserted, then
the least significant bit of the output is taken from the bit to the right of the
least significant bit of the result. At the same time that the pointers are
formed, the round flag (RND) is calculated based on which pointer is asserted.
It now remains to renormalize the result, round it, and output the result.
This is all done simultaneously, only 2 clock cycles delayed from producing the
most significant bit of the result. As in the multiplier, if overflow or underflow
occurs, the output is reformatted. The adder has a sum mantissa output
(SUMMAN), sum exponent and sign output (SUMEXP), negative flag (NEG),
zero flag (ZRO), overflow flag (OVF), underflow flag (UNF), and inexact flag
(INX). It also outputs a signal indicating the least significant bit of the output
(RST76).
4.4 Conclusion 
The adder is also designed with area optimized standard cells from the AT&T
library. It uses 432 flip-flops and 965 gates, or 12,192 transistors. The deepest
path between two flip-flops is 10 gates deep. Its maximum clock rate at 5.0V,
25°C and nominal processing is 8.0 MHz. With a latency of 24 clock cycles per
operation, the resulting throughput is 0.33 MFLOPS. The delay from the least
significant bit of the inputs to the least significant bit of the output is 76 clock
cycles; that is two more than for the multiplier. The deepest path in the adder
is twice as long as in the multiplier, and the adder frequency is less than half.
That is reasonable, since the adder is not a purely systolic system like the
multiplier. Further, no attempt was made to optimize the timing of the unit.
It is reasonable to expect a large improvement with more work, maybe as large
as a factor of two, simply by adding registers in the appropriate places, and
buffering heavily loaded wires, especially the RST wires, which control the
operation of the adder. Experience shows that often a very small number of
long paths can slow down an otherwise much faster circuit. The complete
schematics of the adder are included as Appendix B, Figures B.1-B.17.
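The throughput figures quoted for the two units follow directly from the clock rate and the 24 clock cycles between results. The short sketch below reproduces that arithmetic; the function names are hypothetical, and the clock rates and cycle counts are the ones stated in the text.

```python
def mflops(clock_mhz: float, cycles_per_result: int = 24) -> float:
    """Results per microsecond, i.e. millions of operations per second."""
    return clock_mhz / cycles_per_result


def delay_us(delay_cycles: int, clock_mhz: float) -> float:
    """Input-to-output delay in microseconds at the given clock rate."""
    return delay_cycles / clock_mhz


multiplier_rate = mflops(21.8)   # multiplier: 21.8 MHz clock, about 0.9 MFLOP
adder_rate = mflops(8.0)         # adder: 8.0 MHz clock, about 0.33 MFLOPS
adder_delay = delay_us(76, 8.0)  # adder delay of 76 clock cycles at 8.0 MHz
```

In a system that contains both units on one clock, the adder's 8.0 MHz limit sets the rate for the whole array.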
Chapter 5 
Applications for Bit-Sequential 
Floating-Point Arithmetic Units 
5.1 Introduction
The arithmetic units described in the two previous chapters are most useful
when applied to a heavily pipelined, functionally parallel architecture. This is
because the large delays of the units (> 3 words), and the large number of clock
cycles per input (24), would make using them unreasonable otherwise. On the
other hand, they are very cheap in absolute terms, so that if a very low
performance, cost-sensitive application existed that required floating-point
arithmetic and a higher performance than possible in a software
implementation, they would be applicable. Applications such as those described
occur frequently in digital signal processing. This chapter describes two
particular uses for these units: in a generic matrix-vector multiplier, and in a
parallel-pipelined FFT calculator.
5.2 Matrix-Vector Multiplication
In matrix-vector multiplication, an n×n matrix multiplies a length n input
column vector to produce a length n output column vector. Each element of the
output vector is the inner product of a row of the matrix with the input vector.
Hence the total computation requires n^2 multiplications and n(n-1) additions.
By providing n multipliers and (n-1) adders in a systolic array as shown in
Figure 5.1, it is possible to do matrix-vector multiplication in O(n) time.
Figure 5.1. Matrix-Vector Multiplier
If the multiplier and adder modules described in the previous chapters are used
in this application, then a new data point can start every 24 clock cycles.
Therefore the latency is 24P, where P is the clock period. The throughput is
1/24P. The multiplications occur in parallel, so that only the multiplication in
the first cell adds to the delay. Since the multiplier delay is 74 clock cycles, and
the adder delay is 76 clock cycles, the total delay is (74 + 76(n-1))P. The
floating point bit-sequential multiplier and adder described in the previous
chapters are particularly well-suited to such an application. They have very
limited pincounts, compared with a parallel implementation, so that many more
units can be included on one chip without being limited by packages. Also,
since they are so small, it is easy to imagine putting redundant units on a chip,
with switches to bypass a faulty unit.
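The operation counts and the delay of the matrix-vector array can be tabulated with a short sketch. The function name and the dictionary keys are hypothetical; the delays of 74 and 76 clock cycles and the 24 clock cycles per word are the figures given in the text.

```python
def matvec_costs(n: int, mult_delay: int = 74, add_delay: int = 76,
                 cycles_per_word: int = 24) -> dict:
    """Resource and timing summary for an n x n matrix-vector product."""
    return {
        "multiplications": n * n,          # one per matrix element
        "additions": n * (n - 1),          # n-1 per output element
        "multipliers": n,                  # hardware units in the array
        "adders": n - 1,
        # Only the first multiplication is on the critical path; each of the
        # n-1 adders then contributes its full delay.
        "delay_cycles": mult_delay + add_delay * (n - 1),
        "cycles_between_inputs": cycles_per_word,
    }
```

For n = 4, the array would use 4 multipliers and 3 adders, with a total delay of 74 + 3 x 76 = 302 clock cycles.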
5.3 FFT Calculation
The Fourier transform is an example of matrix-vector multiplication. However,
Cooley and Tukey, and others, have shown how to do this special case in O(n log n)
time. One could use four multipliers and six adders developed in chapters 3 and
4 to form a butterfly processor (see Figure 5.2). (n/2) log n of these butterfly
processors can be combined to form a parallel transform processor (see Figure
5.3). Again, the latency of the FFT calculator would be 24P, and the
throughput would be 1/24P. In this case, however, the delay of the array is
(74 + 2×76) log n. Since there are 3 words input to the butterfly and 2 words
output from the butterfly, there are only 10 data wires, 1 clock wire, 1 reset in
wire, 1 reset out wire, and 109,248 transistors per butterfly. Again, it should be
possible to package a 32-bit floating point butterfly processor in an inexpensive
16-pin package. A special memory chip can be designed which cycles through
the appropriate coefficients every 24 clock cycles. Its pincount would also be
extremely low, since all the coefficients are only two wires wide.
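A sketch of the butterfly arithmetic shows where the four multipliers and six adders go, and how the transistor count per butterfly follows from the counts given for the two units. The function names are hypothetical, the butterfly form X(0) = x(0) + W x(1), X(1) = x(0) - W x(1) is an assumption about the structure in Figure 5.2, and log n is taken to be base 2 with n a power of two.

```python
MULT_TRANSISTORS = 9_024    # multiplier size quoted in section 3.4
ADD_TRANSISTORS = 12_192    # adder size quoted in section 4.4


def butterfly(x0: complex, x1: complex, w: complex):
    """One butterfly using 4 real multiplications and 6 real additions."""
    # Four real multiplications.
    p_rr = x1.real * w.real
    p_ii = x1.imag * w.imag
    p_ri = x1.real * w.imag
    p_ir = x1.imag * w.real
    # Two additions form the complex product W * x1.
    t_re = p_rr - p_ii
    t_im = p_ri + p_ir
    # Four more additions form the two complex outputs.
    return (complex(x0.real + t_re, x0.imag + t_im),
            complex(x0.real - t_re, x0.imag - t_im))


def butterfly_transistors(multipliers: int = 4, adders: int = 6) -> int:
    """Transistors in one butterfly processor."""
    return multipliers * MULT_TRANSISTORS + adders * ADD_TRANSISTORS


def fft_array(n: int) -> dict:
    """Butterfly count and delay in clock cycles for a length-n transform."""
    if n < 2 or n & (n - 1):
        raise ValueError("n is assumed to be a power of two")
    stages = n.bit_length() - 1                 # log2(n)
    return {"butterflies": (n // 2) * stages,
            "delay_cycles": (74 + 2 * 76) * stages}
```

The per-butterfly total, 4 x 9,024 + 6 x 12,192, matches the 109,248 transistors quoted in the text.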
Figure 5.2. Butterfly Processor

Figure 5.3. FFT Calculator
Chapter 6 
Conclusion
6.1 Discussion 
This thesis makes an argument for the utility of bit-sequential architectures in
pipelined-parallel functions. Most of the work reported in the literature deals
only with fixed point number representation. The major contribution of this
thesis is the design of two completely sequential floating point processors for the
first time. The design of a floating point bit-sequential multiplier reported here
runs at 0.9 Mflops. The adder, the slower of the two, runs at least at 0.33
Mflops. An array of these units will have a throughput which is an integer
multiple of 0.33 Mflops. One should note that this performance is not
optimized, and can be improved with more careful implementation. The
applicability of these processors is demonstrated by single-chip implementations
of a matrix-vector multiplier, and an FFT butterfly operator.
6.2 Future Directions
The floating point bit-sequential multiplier and adder described in the
preceding chapters are useful by themselves. A number of directions suggest
themselves for expanding on what has been described here. One area of study
would be to implement these units, optimizing their area-throughput, to see
what results might be obtained compared with the estimates made here.
Another area of study would be to make a more complete set of operators,
including divide, square root, fixed<->floating convert, etc. Yet another area
of study would be to improve the units described here by making them
completely IEEE 754-1985 compatible, including denormalized numbers, NaN's
and infinity. Of course, there is the whole subject of various applications,
particularly those which take advantage of the small size of each individual unit
to incorporate fault tolerance into an array of operators.
REFERENCES
1. I-Ngo Chen and Robert Willoner. "An O(n) Parallel Multiplier with
Bit-Sequential Input and Output", IEEE Trans on Computers, vol. C-28, no. 10,
Oct 1979.
2. R. Gnanasekaran. "A Fast Serial-Parallel Binary Multiplier", IEEE Trans on
Computers, vol. C-34, no. 8, Aug 1985.
3. Tom Rhyne and Noel R. Strader, II. "A Signed Bit-Sequential Multiplier",
IEEE Trans on Computers, vol. C-35, no. 10, Oct 1986.
4. Charles E. Leiserson. Area-Efficient VLSI Computation. Cambridge, MA:
The MIT Press, 1983.
5. Neil Weste and Kamran Eshraghian. Principles of CMOS VLSI Design: a
Systems Perspective. Reading, MA: Addison-Wesley Publishing Company,
1985.
6. Paul M. Chau, Kay C. Chew and Walter H. Ku. "A Bit-Serial
Floating-Point Complex Multiplier-Accumulator For Fault-Tolerant Digital Signal
Processing Arrays", in 1987 International Conference on Acoustics, Speech,
and Signal Processing, pp. 483-486. Piscataway, NJ: IEEE, 1987.
7. Leland B. Jackson, James F. Kaiser and Henry S. McDonald. "An Approach
to the Implementation of Digital Filters", IEEE Transactions on Audio and
Electroacoustics, vol. AU-16, no. 3, pp. 413-421, Sept 1968.
8. Joseph T. Scanlon and W. Kent Fuchs. "High Performance Bit-Serial
Multiplication", Proc IEEE International Conference on Computer Design:
VLSI in Computers, pp. 114-117. Piscataway, NJ: IEEE, 1986.
9. ANSI/IEEE Std 754-1985: An American National Standard IEEE Standard
for Binary Floating-Point Arithmetic. New York: The Institute of Electrical
and Electronics Engineers, Inc., 1985.
10. Paul M. Chau and Walter H. Ku. "A VLSI Floating Point Signal Processor",
in Sun-Yuan Kung, Robert E. Owen and J. Greg Nash, eds. VLSI Signal
Processing, II. New York: IEEE Press, 1986.
11. 1.25µ CMOS Cell Library. AT&T, 1987.
Appendix A
Floating Point Bit-Sequential
Multiplier Schematics
This appendix includes the complete schematics for the floating point
bit-sequential multiplier described in chapter 3.
[Figures A.1-A.16. Floating point bit-sequential multiplier schematics (not reproduced)]
Appendix B
Floating Point Bit-Sequential
Adder Schematics
This appendix includes the complete schematics for the floating point
bit-sequential adder described in chapter 4.
I · .. 
' .' 
.,'' 
,. 
~ 
[Schematic page, floating point adder. Legible signal names include CLK, GM[2:25], GX[2:9], GMGTEDM, GMGTDMB, GGTD, GGTDB, SUBG, SUBD, NEGG, NEGD, SUBTRG, SUBTRD and RST[1:82]; legible cell types include FD1S3AX, FD1P3AX, ND2, OR2, AOI21, AOI22, AOI32 and XOR.]
[Schematic page, floating point adder. Legible signal names include CLK, XPD[2:9], XPC, GXGTDX, XPENB8, XPEN9, XP9B and XPDC; legible cell types include FA, AOI21, AOI22, XOR, FD1S3AX and FD1P3AX.]
[Schematic page, floating point adder. Legible signal names include GXZRO, GXZROB, DXZROB, MXPB, MXP, MXP27, SUB26, NEG28, XP[27:34] and DNRM; the page instantiates the SW2X2 block. Legible cell types include XNOR, OAI21, AOI21, NR2, OR2 and FD1P3AX.]
[Schematic page, floating point adder. Legible signal names include CLK, XPD[25:32], BM[23:0], LM[22:0], LM23, LM23B, DNRM, GRD, RND and STK; the page instantiates the DNRMCTR block. Legible cell types include ND2, OAI21, FD1P3AX and FL1P3AX.]
[Schematic page, floating point adder. Legible signal names include CLK, DZROB, LMOX, RNDB, STKB, GRD, RND, STK, BM23, CM23, LM23, MM23 and STK50; legible cell types include INRB, AND2, AOI22, AOI222, AOI2221, FD1S3AX and FL1S3AX.]
[Schematic page, floating point adder. Legible signal names include CLK, STK51, SUB50, NEG52, MXP51, SZROB51, NRGRD, CRY, CINB, SUMB, SM24, RND51 and GRD51; legible cell types include FA, NR2, NR3, XOR, XNOR, ND2, AOI22, AOI32, INRB, FD1P3AX and FD1S3AX.]
[Schematic page, floating point adder. Legible signal names include CLK, ZM74, SMAB, CLRCTR, CLRCTRB, UNFB74, ZMB, ICX74, SHR74 and OVFB74; the page instantiates the RNRMCTR and EXPCTR blocks. Legible cell types include ND2, ND3, OR2, AOI21, AOI22, AOI211, OAI211, INRB and FD1S3AX.]
[Schematic pages, floating point adder. Legible signal names include CLK, SHR74, INXB74, RDB1, RDB2, OVF75, UNF75, INX75, INCX75, SHFT, NEG52, NEG75, ZX[7:0], ZRO, OVF, UNF, INX and NEG; legible cell types include NR2, ND2, OR2, OR3, AOI32, OAI211, OAI311, OAI411, FD1P3AX, FD1S3AX, FL1P3AX and FL1S3AX.]
[Schematic page, floating point adder. Legible signal names include CLK, RMB, XPENB, XPEN, OSUM, OSUMB, SUMMAN, SUMCO, OEXP, OEXPB, SUMEXP and EXPCO; legible cell types include NR2, ND2, XOR, AOI21, AOI22, AOI222 and FD1S3AX.]
[Schematic page: a small sub-circuit built from INRB, AOI22 and FD1P3AX cells. The figure caption and most pin labels are not recoverable.]
[Schematic page, SW2X2 block. Legible signal names include GM, GMB, GZB, GGTD, GG, GGB, BM and LM; legible cell types include ND2, INRB and AOI22.]
[Schematic page, DNRMCTR block. Legible signal names include CLK, DZROB, CNT, CNT3, DOX5, DOR2 and the DOX and DOXB register bits; legible cell types include NR3, OR2, OR3, ND2, AND3, XNOR, FD1P3AX and FL1P3AX.]
[Schematic page, EXPCTR block. Legible signal names include CLK, UNF, LOB, ZRO, Z[7:0], ZB[7:0], NZ[7:0], T4, T6, T7 and the NB counter bits; legible cell types include INRB, AND2, AND3, ND2, ND4, NR2, NR4, OR2, XOR, XNOR, AOI22, FD1P3AX, FD1S3AX and FL1S3AX.]
[Schematic page, RNRMCTR block. Legible signal names include CLK, CLR, RSTB, SHF, SHFB, OUT, UP[4:0], UPCTR[4:0] and DNCTR[4:0]; legible cell types include ND2, AND2, NR2, XOR, XNOR, INRB and FD1S3AX.]
Vita
David Mark Blaker was born to Dr. J. Warren Blaker and Mrs. Cynthia
Geber Blaker on December 3, 1957, in Boston, Massachusetts. He graduated
from Arlington High School, in New York State, in 1974. He then went on to
receive the Bachelor of Science degree in Electrical Engineering from the
Massachusetts Institute of Technology in 1979. From there he went to the
Hewlett-Packard Corporation, in Colorado Springs, Colorado, where he designed
portions of timing analyzers and in-circuit emulators. In 1980, he went to ADR
Ultrasound in Tempe, Arizona, where he was coinventor of a scan converter for
ultrasound imaging, on which a U.S. patent was issued. He moved to Edge
Computer Corporation in 1984, where he designed part of the floating point
unit for a high performance work station. In 1985, he moved to AT&T Bell
Laboratories in Allentown, Pennsylvania, where he designs VLSI DSP devices
and is responsible for design for testability and built-in self test strategies.
David is married to Polly Jean Blaker (née Williams), and has two children:
Sarah Elizabeth Blaker, aged 2½ years, and Nathan Isaac Blaker, aged ½ year.
His personal interests include playing with his children, skiing, walking and
reading.
