Modular decomposition techniques for stored-logic digital filters by Mohamed A. Bin Nun (7204172)
MODULAR DECOMPOSITION TECHNIQUES 
FOR STORED-LOGIC DIGITAL FILTERS 
BY 
MOHAMED ARIF BIN NUN, 
B.Sc. (University of London), 
M.Sc. (Loughborough University of Technology). 
A Doctoral Thesis submitted in partial fulfilment of the
requirements for the award of Doctor of Philosophy of the
Loughborough University of Technology, June 1977.
Supervisor: M.E. WOODWARD, Ph.D. 
Department of Electronic and Electrical 
Engineering.
© by Mohamed Arif Bin Nun, 1977. 
To my wife, THYE KHIM, for her patience, sacrifice
and emotional support, and to my parents for their
understanding.
ACKNOWLEDGEMENTS 
I am grateful to the Malaysian Government for supporting me 
financially throughout my studies, and to Professor J.W.R. Griffiths, 
Head of the Department of Electronic and Electrical Engineering, 
for providing the research facilities. I would also like to thank 
my Research Supervisor, Dr. M.E. Woodward, who introduced me to 
the concept of 'closed' partitions, for his help and encouragement. 
In addition, I am indebted to Dr. D.J. Quarmby, Manager of 
the Signal Analysis Centre, and Dr. R.P. Knott from the Engineering 
Mathematics Department, for their invaluable specialist advice. 
I also appreciate Dr. R. Steele's interest in my work, and his 
optimistic encouragement. 
During my research, my close friends and colleagues, especially 
Mr. F.T. Sakane, Dr. U. Somaini and Dr. M.L. Rahman, provided a 
healthy mixture of humorous and serious discussions, and for them
I have only fond memories. 
Finally, I must mention, with gratitude, the efficient service 
of the Staff of the University Library, and the immaculate typing 
of my Thesis by Mrs. B. Wright. 
M.A. Bin Nun, June 1977. 
SYNOPSIS 
Digital filtering is an important signal processing technique 
whose theory is now well established. At present, however, there are 
no well defined and systematic methods available for realising digital
filters in hardware.
This project aims to develop such methods which are general and
technology independent, and adopts a systems and sub-systems design 
philosophy. The realisation problem is approached in a new way using 
concepts from finite-automata theory and implementing complete digital 
filter sections as stored-logic units. Two methods are introduced 
and developed. 
In the first, a complete basic second-order filter is directly 
modelled as a finite-state sequential machine (F.S.M.) and implemented 
with memory devices whose storage capacity is reduced by the application 
of a well known method of machine decomposition via 'closed' partitions. 
To initiate a systematic analysis of the partition structure of 
the F.S.M. digital filter, a study is made into the algebraic decomposition 
structure of the basic computational units making up the filtering 
algorithm. 
The insight gained is useful in a subsequent analysis which shows 
that a second-order filter section, suitably simplified and modelled 
as an F.S.M., may be decomposed into a parallel connection of smaller 
sub-machines, each of which is, in turn, composed of a 'nested'
cascade interconnection of still simpler components. The overall 
memory requirement of the decomposed realisation is considerably 
less than that of the direct one. 
The second method presents a technique of 'digit slicing' over 
a variable number base, using which a filter section may be realised 
as a regular interconnection of identical sub-filters of short 
wordlengths. The technique leads to a flexibility in hardware count 
and processing mode, and a modular expandability in computational
accuracy and filter order. It is suited to implementations using 
large-scale integrated (L.S.I.) devices. 
Practical prototypes are constructed using programmable and 
erasable memory modules. 
The methods developed provide a general theoretical basis for 
the hardware realisation of digital filters. It is hoped that their
main usefulness lies in bridging the gap between the initial analytical 
description of the desired frequency characteristics and corresponding 
filter transfer function, and the actual hardwired practical implementation. 
CONTENTS

CHAPTER 1  INTRODUCTION
  1.0  Introduction
  1.1  Background
  1.2  Motivation for project
  1.3  Design philosophy and problem formulation
  1.4  Scope of research and organisation of Thesis

CHAPTER 2  THEORY AND IMPLEMENTATION OF DIGITAL FILTERS
  2.0  Introduction
  2.1  Descriptions of a general digital filter
    2.1.0  Dynamics of digital filters
  2.2  Applications and advantages
  2.3  Design and realisation
    2.3.0  Mathematical design
      2.3.0.0  Design techniques for FIR and IIR filters
    2.3.1  Effects of finite-length registers
    2.3.2  Considerations in the real-time hardware implementation
    2.3.3  Existing hardware design approaches
    2.3.4  Conclusion

CHAPTER 3  ELEMENTARY STRUCTURE THEORY OF FINITE-STATE SEQUENTIAL MACHINES
  3.0  Introduction
  3.1  Descriptions of F.S.M's
  3.2  Interconnections of F.S.M's
  3.3  Problem areas in F.S.M. realisation
  3.4  Basic algebraic concepts
  3.5  Structural decompositions of F.S.M's
  3.6  State reduction using S.P. partitions
  3.7  Conclusion

CHAPTER 4  FINITE-STATE MACHINE MODELS OF STORED-LOGIC DIGITAL FILTERS
  4.0  Introduction
  4.1  General approach
  4.2  Stored-logic digital filters
  4.3  Examples of S.L. digital filters
    4.3.0  Second-order non-recursive section
    4.3.1  First-order recursive section
    4.3.2  Second-order autonomous recursive section
    4.3.3  Memory storage requirements
  4.4  F.S.M. models of digital filters
    4.4.0  F.S.M. model of a general S.L. non-recursive second-order section
      4.4.0.0  An application
      4.4.0.1  State-reduction of the general F.S.M. non-recursive section
      4.4.0.2  Partial state reduction
    4.4.1  F.S.M. model of second-order autonomous recursive section
    4.4.2  F.S.M. model of first-order recursive section
      4.4.2.0  Decomposition results for D.F.2 with different feedback coefficient values
  4.5  Discussion
  4.6  Conclusions

CHAPTER 5  PARTITION STRUCTURES OF STORED-LOGIC ARITHMETIC CIRCUITS
  5.0  Introduction
  5.1  F.S.M. model of a general arithmetic circuit
  5.2  Radix-2^N adders
    5.2.0  Example
      5.2.0.0  S.P. partitions of radix-2^3 half-adder
    5.2.1  The general modulo 2^N
      5.2.1.0  Generation of S.P. partitions
      5.2.1.1  Loop-free realisation of adders modulo 2^N
      5.2.1.2  Memory storage reduction
    5.2.2  Generation of the carry digit
    5.2.3  Addition of "carry-in" digit
  5.3  Radix-2^N parallel multipliers
    5.3.0  Example
    5.3.1  The general N x N bit multiplier
      5.3.1.0  Example
    5.3.2  Decomposition of the F.S.M. multiplier
    5.3.3  Improved model of N-bit parallel multiplier
  5.4  Conclusion

CHAPTER 6  NOVEL METHOD OF MODULO 2^N MULTIPLICATION USING CONSTRAINED OPERANDS
  6.0  Introduction
  6.1  Observations
  6.2  Modulo 2^N multiplication using 'forced' operands and product correction
    6.2.0  Example
  6.3  Internal algebraic structure of reduced modulo 2^N multipliers
    6.3.0  Example
    6.3.1  The group under modulo 2^N reduced multiplication
    6.3.2  Application of theoretical results
  6.4  General comparison with the direct implementation of modulo 2^N multipliers
  6.5  Conclusions

CHAPTER 7  DECOMPOSITION STRUCTURES OF MODULO M ADDERS AND MULTIPLIERS, AND OF A SIMPLIFIED MODEL OF A SECOND-ORDER DIGITAL FILTER
  7.0  Introduction
  7.1  Partition structure of modulo M adder
    7.1.0  Generation of the basic S.P. partitions
    7.1.1  General form of π_c
    7.1.2  The partition lattice of the general mod M adder
  7.2  S.P. partitions for a mod M multiplier
    7.2.1  Sub-lattice of multiplier's S.P. partitions
  7.3  Decomposition structures of digital filters
    7.3.0  Notation
    7.3.1  Simplified models of non-recursive filters
    7.3.2  Homomorphic images of (DF)_M
    7.3.3  Parallel connection of (DF)_b and (DF)_c
    7.3.4  Cascade decomposition structure of modulo p^a digital filter
      7.3.4.0  Notation
      7.3.4.1  Analysis of cascade structure of (DF)_(p^a)
    7.3.5  Lattice of homomorphic images of (DF)_M
  7.4  Conclusions

CHAPTER 8  MODULAR PARTITIONING OF BASIC SECOND-ORDER DIGITAL FILTER
  8.0  Introduction
  8.1  General modular partition theory
    8.1.0  Sequence elements represented as sequences
    8.1.1  Extraction of a basic convolution unit
    8.1.2  The primitive convolution cell
    8.1.3  Effect of modular partitioning on frequency analysis
      8.1.3.0  Frequency characteristics
      8.1.3.1  Digit frequency response templates
  8.2  Reprint of the article entitled "A modular approach to the hardware implementation of digital filters".
  8.3  Negative values of filter data
    8.3.0  Constant bias of filter input
    8.3.1  Distributed correction
    8.3.2  Example
    8.3.3  Circuit implementation of correction scheme
  8.4  Conclusions

CHAPTER 9  PRACTICAL HARDWARE IMPLEMENTATION USING MODULAR APPROACH
  9.0  Introduction
  9.1  General filter system
  9.2  Functional and circuit description of filter sub-systems
    9.2.0  p.R.O.M. module and data registers
      9.2.0.0  Programming the p.R.O.M.
    9.2.1  Accumulator
      9.2.1.0  Circuit implementation of accumulator
    9.2.2  Buffer logic
    9.2.3  Clock and control unit
      9.2.3.0  System clocks
      9.2.3.1  Timing pulses
  9.3  System performance
  9.4  Conclusion

CHAPTER 10  A UNIFIED FILTER REALISATION APPROACH USING PROGRAMMABLE STORED-LOGIC CONVOLUTION MODULES
  10.0  Introduction
  10.1  Basic implementations of digit-convolution module
  10.2  Novel implementation using complementary convolution module
  10.3  The Y ~ Y module in the parallel modular realisation
  10.4  Consequence of the concept of complementary Y ~ Y convolution module
  10.5  Application to time-varying digital filters
  10.6  General digital filter systems
    10.6.0  General second-order section
    10.6.1  General high-order filters
  10.7  Recent proposals for the hardware implementation of digital filters
  10.8  Conclusions

CHAPTER 11  REVIEW AND RECOMMENDATIONS
  11.0  Introduction
  11.1  Review of main results
  11.2  Possible directions for development
  11.3  Conclusions

REFERENCES

CHAPTER 1 
INTRODUCTION 
1.0 Introduction. 
In this Chapter we describe the background of and the 
motivation for the research project, state and discuss the nature 
and scope of the investigation, and finally outline the organisation 
of the various chapters of the Thesis. 
1.1 Background. 
The trend in the field of communication and signal processing 
is towards the digital format for representing, transmitting and 
operating on signals, the increasing use of pulse-code modulation 
(P.C.M.) and delta-modulation (Δ-M.) being two familiar examples.
This may be attributed mainly to the ever growing complexity and
flexibility of digital computers and the rapid advance in the 
technology of medium and large-scale (M.S.I. and L.S.I.) integrated 
circuits. 
In a general digital signal processor, one of the main 
components is the digital filter, which is basically a "black box"
which processes a digital input signal according to a computational 
algorithm to produce a digital output having some specified 
characteristics. Among its many applications, a digital filter 
is widely used for waveform shaping and spectral analysis and 
synthesis of signals. Some of the advantages of a digital filter 
over its analogue counter-part are its arbitrary guaranteed accuracy, 
predictable and reproducible performance, flexibility in parameter 
changes, and the possibility of time-multiplexing its major 
components. 
From its early start in the mid-60's, the theory of the analysis
and design of digital filters is now well advanced and fairly
complete, and comprehensive discussions of it may be found in many
excellent textbooks [1,2] and special issues [3] that are now available.
As such, in the next Chapter, we give only a brief review of the 
general theory and place more attention to discussing the problem 
of implementing digital filters in hardware. In contrast to its 
well developed theory, the practical aspect of digital filtering 
is far from satisfactory. Until a few years ago, with the exception
of the classic paper by Jackson et al. [4], most published papers
concentrated only on the off-line simulation of digital filters on 
general-purpose computers. 
The past few years, however, have seen a growing number of
papers [5-9] on the real-time hardware implementation of digital
filters. The design techniques seem to differ from each other, but
they invariably share a common philosophy, viz., a binary number
representation is assumed for the arithmetic operations, and the 
hardware implementation is implicitly accepted as only an exercise 
in switching circuit techniques and combinational logic design. 
1.2 Motivation for project. 
Consequently, we feel that there is a theoretical gap 
between the analytical design of digital filters and their final 
realisations in real-time hardware, and a need for a systematic 
realisation technique. If it is to be useful, any method developed 
should preferably be user-oriented and result in hardware structures 
that are modular and flexible for easy construction, testing, 
maintenance and reliable operation. 
1.3 Design philosophy and problem formulation. 
In our investigation, we decide to adopt the system and 
sub-system approach in the design philosophy, in which the hardware 
structures of digital filters are analysed from their input-output 
behaviour. Thus a macroscopic view is taken, rather than the 
conventional microscopic one in which logic elements and parts are 
put together to make up the overall filter circuit. 
Furthermore, we feel that a powerful tool with which to 
analyse such filter systems is the concept of finite-state
sequential machine (F.S.M.) or finite automata [10,11,12] (also
see Chapter 3), which is a useful model of the dynamics of
discrete-parameter systems.
As it happens, in the period 1960-65, a structural theory 
of sequential machines which is generally unified and complete 
was developed by Hartmanis [12]. Using this theory, it is possible
in general to decompose an F.S.M. into an interconnection of smaller 
and simpler sub-machines. The application of Hartmanis' theory 
to our filter systems is obviously attractive since a structural 
decomposition implies a modular system architecture. Furthermore, 
Howard [13] showed that the decomposition theory is still applicable
when an F.S.M. is realised as a table look-up unit implemented 
using a semiconductor read-only memory (R.O.M.), an L.S.I. device
that is rapidly becoming a popular alternative to random-logic [14,15].
Consequently, besides being an exploratory study in 
implementing digital filters using the systems approach in general, 
our research project investigates in particular the feasibility 
of realising a digital filter section as a table look-up unit and 
modelling its dynamics as a finite-state sequential machine in 
order to discover any structural property. 
We term such a filter a stored-logic (S.L.) digital filter, 
and consider its implementation using semiconductor memories. 
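As a rough illustration of the stored-logic idea, the sketch below models a toy first-order recursive section as a finite-state machine whose next-state function is held entirely in a look-up table. The word length, the integer feedback coefficient, the modulo-2^N wrap-around and all names (`N_BITS`, `COEFF`, `ROM`, `run_stored_logic`) are assumptions made for the example; they are not taken from the Thesis.

```python
# Hypothetical stored-logic (table look-up) model of a first-order recursive
# section. Word length, coefficient and modulo-2^N wrap-around are assumptions.
N_BITS = 3                 # assumed word length
M = 1 << N_BITS            # number of representable values (2^N)
COEFF = 3                  # assumed integer feedback coefficient


def direct_step(state, x):
    """Arithmetic definition: next output from the previous output and the input."""
    return (x + COEFF * state) % M


# "Program the memory": one stored word per (state, input) address.
ROM = [[direct_step(s, x) for x in range(M)] for s in range(M)]


def run_stored_logic(inputs, state=0):
    """Run the section as a finite-state machine using look-ups only."""
    outputs = []
    for x in inputs:
        state = ROM[state][x]   # the next state is also the output here
        outputs.append(state)
    return outputs


print(run_stored_logic([1, 0, 0, 0]))
```

For a word length of N bits this direct table holds 2^N x 2^N words, which is the kind of storage requirement that a decomposition into smaller sub-machines aims to reduce.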
1.4 Scope of research and organisation of Thesis. 
The results of our initial investigations along the lines 
proposed are described in Chapter 4, and we report some success in 
simplifying the memory requirement of an S.L. digital filter. We 
are not able, however, to generalise the technique used here to 
filters having arbitrary coefficients, especially with recursive 
filters, due to non-linearities introduced by arithmetic round-off. 
To gain further insight into the algebraic structure of these
S.L. filters, we then apply the F.S.M. modelling technique to the 
arithmetic components which make up the filter. This is not 
reverting to the traditional approach since the subsequent analytical 
treatment, which is discussed in Chapter 5, is still on the systems 
level. Analysing arithmetic circuits based on modulo 2^N arithmetic
(see ref. 16 for a discussion on modulo arithmetic), we derive
interesting loop-free [12] decomposition structures for adders and
multipliers modulo 2^N which require considerably less memory storage
in their implementation when compared with that required if a direct
table look-up is used.
We then extend the analysis to "real" arithmetic units,
where one has to account for the carry output in the case of a 
general N-bit adder, and the double-length product of an N-bit 
multiplier. 
In Chapter 6 we outline a novel approach to the implementation 
of modulo 2^N multipliers based on a transform which maps a sub-set
of the multiplication table onto the Cartesian product of modulo 2
and 2^(N-1) adders. Although the results are not directly relevant
to the synthesis of stored-logic filters, we have included the
Chapter because we feel that it is interesting and useful in its
own right. 
In Chapter 7 the theory on the algebraic F.S.M. decomposition 
of general stored-logic modulo arithmetic units and digital filter 
sections is developed in which the concept of a lattice of 
partitions on machine states (see Chapter 3) plays a central role. 
We show that for a general modulo M adder, its decomposition 
structure can be completely described. Although we are unable to 
do the same for the corresponding modulo M multiplier, we have 
managed to describe completely one possible sub-structure. 
The next three chapters, 8, 9 and 10, take on a more practical
tone. In Chapter 8, an attractive and novel modular hardware 
architecture for a second-order digital filter section is 
introduced. This uses the concept of digit slicing, which leads 
to what we term a sub-filter module. We show that a digital 
filter section may be realised as a regular interconnection of 
such modules, which are all identical in structure. 
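The digit-slicing idea can be sketched for the simplest case, a non-recursive section with non-negative integer data, where linearity lets each digit of the input be filtered separately by an identical sub-filter. The base, the number of digits and the function names below are assumptions for illustration only; the scheme actually developed in Chapter 8 (which also covers recursive sections and negative data) may differ in detail.

```python
def fir(coeffs, xs):
    """Non-recursive filter y[n] = sum_k coeffs[k] * xs[n-k], zero initial state."""
    return [sum(c * xs[n - k] for k, c in enumerate(coeffs) if n - k >= 0)
            for n in range(len(xs))]


def digit_slices(xs, base, n_digits):
    """Split each non-negative sample into digits, least significant first."""
    return [[(x // base**j) % base for x in xs] for j in range(n_digits)]


def sliced_fir(coeffs, xs, base=4, n_digits=4):
    """Filter every digit sequence with the same sub-filter, then recombine."""
    partial = [fir(coeffs, d) for d in digit_slices(xs, base, n_digits)]
    return [sum(base**j * partial[j][n] for j in range(n_digits))
            for n in range(len(xs))]


samples = [200, 17, 255, 0, 93]        # 8-bit data, four base-4 digits each
print(sliced_fir([1, -2, 3], samples) == fir([1, -2, 3], samples))
```

Each sub-filter here only ever sees short words (one base-4 digit), which is what makes a small look-up memory per module plausible.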
Using this technique we have also constructed a practical 
prototype 8-bit second-order digital filter section, in which the 
sub-filter module is implemented using a semiconductor programmable 
and erasable read-only memory (p.R.O.M.). Details of the circuit 
construction and testing are documented in Chapter 9. Useful 
indications are obtained on the tradeoff between hardware complexity 
and processing speed. 
Following this, we propose in Chapter 10 two ways in
which the technique of digit-slicing and the concept of stored-
logic sub-filter modules can be successfully incorporated into 
a general system architecture to achieve flexible and relatively 
inexpensive real-time digital filtering. In this chapter we also 
discuss briefly the state-of-the-art of practical digital filters 
and signal processors and suggest probable trends. 
Finally, we conclude the Thesis with Chapter 11, in which 
the investigation that has been carried out is reviewed. 
1.5 Conclusions. 
The research project is an attempt to provide a theoretical 
framework for the methodical implementation of real-time digital 
filters. The problem is approached in a novel way using a systems 
design philosophy in general, and the concept of finite automata 
in particular. 
CHAPTER 2 
THEORY AND IMPLEMENTATION OF 
DIGITAL FILTERS 
2.0 Introduction. 
The theory of digital filtering is briefly reviewed in this 
chapter, and we discuss the problems involved in implementing digital
filter hardware to process real-time signals. We also survey the 
different approaches to the problem that have been proposed in the 
literature. 
2.1 Descriptions of a general digital filter. 
A digital filter is basically a computational algorithm by
which an input number sequence {x_n} is transformed into an output
number sequence {y_n}. When used in a digital signal processing
system, as shown in Fig. 2.0, {x_n} is the time and amplitude
quantised version of an analogue signal input. If so required,
{y_n} may be converted back into the analogue form.
The filter algorithm is the following linear difference
equation,
$$y_n = \sum_{k=0}^{N} a_k x_{n-k} - \sum_{k=1}^{N} b_k y_{n-k}$$ ... (2.0)
where the a_k's and b_k's are termed the filter coefficients.
The filter described by equation (2.0) is known as a general
recursive filter. In many cases, the output y_n is explicitly
determined only by the present and past input values, i.e. all
b_k's = 0. The corresponding filter is then known as a non-recursive
one.
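Equation (2.0) can be evaluated sample by sample as in the sketch below, which assumes zero initial conditions and passes the coefficients as Python lists `a = [a_0, ..., a_N]` and `b = [b_1, ..., b_N]`. The function name is illustrative.

```python
def difference_equation(a, b, xs):
    """y[n] = sum_{k=0}^{N} a[k] x[n-k] - sum_{k=1}^{N} b_k y[n-k], zero initial state.

    a holds a_0..a_N; b holds b_1..b_N (so b[0] is b_1).
    """
    ys = []
    for n in range(len(xs)):
        acc = sum(ak * xs[n - k] for k, ak in enumerate(a) if n - k >= 0)
        acc -= sum(bk * ys[n - k] for k, bk in enumerate(b, start=1) if n - k >= 0)
        ys.append(acc)
    return ys


# Recursive example: y[n] = x[n] + 0.5 y[n-1], driven by a unit impulse.
print(difference_equation([1.0], [-0.5], [1, 0, 0, 0]))
# Non-recursive example: all b_k = 0, so the impulse response is just the a_k's.
print(difference_equation([1, 2, 3], [], [1, 0, 0, 0]))
```

With `b` empty the routine reduces to the non-recursive case, whose impulse response ends after N+1 samples.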
Fig. 2.0 Block representation of a digital signal processing system (analogue input x(t), sampler, x(nT), 8-bit quantizer and coder, digital input, digital filter, digital output).
Fig. 2.1 The direct form of a general digital filter.
In spectral analysis, due to the convenience of algebraic
manipulation, a digital filter is alternatively described by its
z-transform [1] transfer function H(z), where
$$H(z) = \frac{\sum_{k=0}^{N} a_k z^{-k}}{1 + \sum_{k=1}^{N} b_k z^{-k}}$$ ... (2.1)
and z^(-1) is the unit delay operator.
A canonical circuit realisation of (2.1) is shown in Fig. 2.1,
known as the direct form. Due to accuracy requirements, the following
cascade and parallel forms are preferred, i.e.
$$H(z) = a_0 \prod_{i=1}^{M} \frac{1 + \alpha_{1i} z^{-1} + \alpha_{2i} z^{-2}}{1 + \beta_{1i} z^{-1} + \beta_{2i} z^{-2}}$$ ... (2.2)
and
$$H(z) = \gamma_0 + \sum_{i=1}^{M} \frac{\gamma_{0i} + \gamma_{1i} z^{-1}}{1 + \beta_{1i} z^{-1} + \beta_{2i} z^{-2}}$$ ... (2.3)
where M is the integer part of (N+1)/2, and γ_0 = a_N/b_N.
The corresponding circuit realisations are shown in Figs. 2.2(a)
and (b), in which the basic building block is the second-order or
biquadratic section, which is shown in Fig. 2.3 and described by the
following relationship, i.e.,
$$y_n = \sum_{k=0}^{2} a_k x_{n-k} - \sum_{k=1}^{2} b_k y_{n-k}$$ ... (2.4)
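The second-order section of equation (2.4), and a cascade of such sections, can be sketched as follows. Zero initial conditions are assumed, each section is given as a pair `(a, b)` with `a = (a_0, a_1, a_2)` and `b = (b_1, b_2)`, and the overall gain factor of the cascade form is taken to be absorbed into the sections; the function names are illustrative.

```python
def biquad(a, b, xs):
    """Second-order section: y[n] = a0 x[n] + a1 x[n-1] + a2 x[n-2] - b1 y[n-1] - b2 y[n-2]."""
    x1 = x2 = y1 = y2 = 0.0          # delayed inputs and outputs
    ys = []
    for x in xs:
        y = a[0] * x + a[1] * x1 + a[2] * x2 - b[0] * y1 - b[1] * y2
        x2, x1 = x1, x               # shift the input delay line
        y2, y1 = y1, y               # shift the output delay line
        ys.append(y)
    return ys


def cascade(sections, xs):
    """Feed the output of each second-order section into the next one."""
    for a, b in sections:
        xs = biquad(a, b, xs)
    return xs


# Two identical sections with a single pole at z = 0.5 each.
section = ((1, 0, 0), (-0.5, 0))
print(cascade([section, section], [1, 0, 0, 0]))
```

The printed impulse response is that of a double pole at z = 0.5, i.e. (n+1)(0.5)^n.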
Fig. 2.2 The cascade form (a) and the parallel form (b) of a digital filter.
Fig. 2.3 Second-order section.
2.1.0 Dynamics of digital filters. 
The operational and functional behaviour of a digital filter 
can be analysed either in the time or frequency domain. 
In the former, we use the impulse response h(n), which is the
filter output response to a discrete-time impulse at k = 0 (a digital
impulse at k = k_0 is a signal x(k) such that x(k) = 1 when k = k_0
and x(k) = 0 when k ≠ k_0).
If h(n) = 0 for n > N_1 and for n < N_2, with N_1 ≥ N_2, the associated
filter is called a finite impulse response (FIR) filter. An infinite
impulse response (IIR) filter is one in which either N_1 = ∞ or
N_2 = -∞ or both.
Given an input sequence g(n) and the filter impulse response
h(n), the output f(n) is obtained by the discrete-time convolution
operation defined by
$$f(n) = \sum_{k=-\infty}^{\infty} h_k \, g_{n-k}$$ ... (2.5)
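For finite sequences the convolution sum (2.5) has only finitely many non-zero terms. The sketch below assumes that both h and g start at n = 0 and are zero elsewhere; the function name is illustrative.

```python
def convolve(h, g):
    """Discrete-time convolution f[n] = sum_k h[k] g[n-k] of two finite sequences."""
    f = [0] * (len(h) + len(g) - 1)
    for k, hk in enumerate(h):
        for m, gm in enumerate(g):
            f[k + m] += hk * gm      # term h[k] g[n-k] with n = k + m
    return f


print(convolve([1, 2, 3], [1, 1]))
```

Convolving with a unit impulse returns the impulse response itself, which is a quick sanity check on any implementation.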
Alternatively, a filter may be described by its frequency response,
which is the value of H(z) when evaluated on the unit circle,
i.e. |z| = 1, in the complex z-plane. When the frequency response
is expressed in polar form, its magnitude and its angle as functions
of frequency are called the amplitude and the phase response respectively.
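A minimal sketch of evaluating the transfer function (2.1) on the unit circle is given below. It assumes the coefficients are supplied as lists `a = [a_0, ..., a_N]` and `b = [b_1, ..., b_N]` and that frequency is expressed in radians per sample; the function names are illustrative.

```python
import cmath


def frequency_response(a, b, omega):
    """H(z) of equation (2.1) evaluated at z = exp(j*omega)."""
    z_inv = cmath.exp(-1j * omega)
    numerator = sum(ak * z_inv**k for k, ak in enumerate(a))
    denominator = 1 + sum(bk * z_inv**k for k, bk in enumerate(b, start=1))
    return numerator / denominator


def amplitude_and_phase(a, b, omega):
    """Polar form of the frequency response: (amplitude, phase in radians)."""
    value = frequency_response(a, b, omega)
    return abs(value), cmath.phase(value)


# Two-point averager: unity gain at zero frequency, a null at omega = pi.
print(amplitude_and_phase([0.5, 0.5], [], 0.0))
print(abs(frequency_response([0.5, 0.5], [], cmath.pi)))
```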
Other aspects of digital spectral analysis such as the Discrete
Fourier Transform (D.F.T.), and the algorithm for its efficient 
computation called the fast Fourier Transform (F.F.T.) may be found 
in the recommended references. 
2.2 Applications and advantages. 
Digital filters are extensively used in data reduction and 
system simulation experiments, and as integral parts of communication 
or signal processing systems. Specific applications include character 
extraction in speech processing and biomedical engineering, the study 
of new signal processing systems via computer simulation, e.g. 
vocoders, speech codecs, bandwidth compression schemes, the removal 
of interference noise and the compensation for perturbation in the 
transmission channels of communication systems. 
A digital filter has the following advantages over its analogue 
counterpart; 
(a) Theoretically, it can be designed to an arbitrarily high 
accuracy which is reproducible due to the absence of drift and 
component tolerance. 
(b) It is very flexible as the overall performance can be 
modified by simply altering the filter coefficients. 
(c) The time-multiplexing of the main hardware units is
possible, leading to simple filter banks. 
(d) There is no problem of impedance matching and also no 
restriction on critical frequencies. 
(e) Many practical signals today are already in digital form
anyway.
(f) Its configuration is not highly cross-connected and is
thus suitable for integrated circuit technology, which is currently
developing at a tremendous rate.
On the other hand digital filtering has its special problems 
in design and implementation. These will be discussed in the 
following sections. 
2.3 Design and realisation. 
The system designer's problem is basically to produce a
realisation which approximates as "closely" as possible a given 
specified filter response (time, frequency, group-delay etc.) in 
a prescribed manner. This realisation may be an off-line software 
routine or a real-time hardware implementation. 
There are three distinct stages that he has to go through, viz., 
that of 
(a) mathematical design assuming infinite precision arithmetic, 
(b) circuit or configuration design accounting for the effects 
of finite register lengths, and 
(c) real-time hardware architecture and device implementation.
2.3.0 Mathematical design. 
In general, the "filter design problem" is one in mathematical 
approximation and consists simply of finding the values of the 
coefficients a_k's and b_k's such that the response of the corresponding
filter approximates, in a prescribed manner, a desired characteristic. 
The theory on the design techniques is well developed and excellent 
documentation of established and proven methods may be found in the 
literature (e.g. References 1,2,17,18). Also new designs are constantly 
being published. As such, we will mention only briefly the main 
design procedures for both FIR and IIR filters. 
2.3.0.0 Design techniques for FIR and IIR Filters. 
There are three well known classes of design methods. 
The first is the window method, which is based on the expansion,
in a Fourier series form, of the periodic (in frequency) frequency
response H(e^(jω)) of any digital filter, i.e.,
$$H(e^{j\omega}) = \sum_{n=-\infty}^{\infty} h(n) e^{-j\omega n}$$
where h(n) are the Fourier coefficients. It is also easily shown 
that h(n) is identical to the impulse response of a digital filter. 
To obtain a realisable FIR filter, a finite weighting sequence w(n) 
is used to modify h(n) to control the convergence of the Fourier 
series. Some well known windows, as these w(n)'s are called, are
the rectangular, "generalised" Hamming, and Kaiser windows.
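The window method can be sketched for one common case: an ideal low-pass response truncated to an odd number of taps and weighted by a generalised Hamming window w(n) = α - (1 - α)cos(2πn/(L-1)), with α = 0.54 giving the usual Hamming window. The choice of a low-pass target, the parameter names and the function name are assumptions made for the example.

```python
import math


def windowed_lowpass(num_taps, cutoff, alpha=0.54):
    """FIR low-pass by the window method.

    num_taps: odd filter length L; cutoff: cut-off frequency in radians/sample;
    alpha: generalised Hamming parameter (0.54 = Hamming, 0.5 = Hanning).
    """
    mid = (num_taps - 1) // 2
    taps = []
    for n in range(num_taps):
        m = n - mid
        # Fourier coefficients of the ideal low-pass response, shifted to be causal.
        ideal = cutoff / math.pi if m == 0 else math.sin(cutoff * m) / (math.pi * m)
        window = alpha - (1 - alpha) * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps


taps = windowed_lowpass(31, math.pi / 4)
print(sum(taps))   # gain at zero frequency, expected to be close to 1
```

The taps come out symmetric about the centre, so the resulting filter has linear phase.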
The second method is that of frequency sampling in which an 
FIR filter is expressed in terms of its D.F.T. coefficients. The
continuous frequency response is thus approximated by sampling, in 
frequency, at N equidistant points around the unit circle. The 
continuous frequency response is then evaluated as an interpolation 
of the sampled frequency response. 
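A minimal sketch of frequency sampling is given below: the impulse response is taken as the inverse D.F.T. of N desired frequency samples, and the continuous response is then obtained by evaluating the resulting filter at any frequency. The particular sample values and the function names are assumptions for illustration.

```python
import cmath


def frequency_sampling_fir(samples):
    """Impulse response h(n) as the inverse D.F.T. of the N frequency samples H(k)."""
    n_pts = len(samples)
    return [sum(hk * cmath.exp(2j * cmath.pi * k * n / n_pts)
                for k, hk in enumerate(samples)) / n_pts
            for n in range(n_pts)]


def response_at(h, omega):
    """Continuous frequency response: sum_n h(n) exp(-j*omega*n)."""
    return sum(hn * cmath.exp(-1j * omega * n) for n, hn in enumerate(h))


# Eight samples around the unit circle describing a crude low-pass shape.
desired = [1, 1, 0, 0, 0, 0, 0, 1]
h = frequency_sampling_fir(desired)
print([round(abs(response_at(h, 2 * cmath.pi * k / 8)), 6) for k in range(8)])
```

At the N sampling frequencies the response reproduces the specified samples; between them it is the interpolation mentioned above.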
In the third method, the design problem is regarded as a 
Chebyshev approximation problem and consists of minimising the 
maximum absolute value of a weighted error of approximation E(e^(jω))
(see page 126 of Ref.l for its definition). 
For the IIR filters, there are two main classes of design 
techniques. In the first class, one first designs an appropriate
continuous time analogue filter. The design obtained is then 
digitised to determine its digital equivalent using procedures 
like the mapping of differentials to finite differences, the 
impulse invariant, the bilinear transform and the matched z-transform 
techniques. The second class is the open form approach using modern 
optimisation algorithms, like the minimum mean square and minimum 
absolute error methods, equiripple techniques and time domain 
optimisation.
2.3.1 Effects of finite-length registers. 
In a theoretical realisation it is assumed that infinite precision 
arithmetic is used. In practical realisations, however, (especially 
with special-purpose implementations), data words can only be stored 
in registers having finite lengths. Thus the filter data, coefficients 
and the results of intermediate operations have to be either truncated 
or rounded-off. These quantisation effects affect the overall filter 
performance in various ways, depending on the type of arithmetic 
used, the type of quantisation and the exact filter structure. 
(Comprehensive discussions on these effects are given in the review
papers by Oppenheim and Weinstein [19] and Liu [20]).
The first of these effects is the error introduced as a result 
of the A/D conversion of the filter input. This quantisation effect,
however, is not usually regarded by digital filter designers as an 
integral part of filter design. 
The second effect is when the filter coefficients are quantised, 
leading to the restriction of the possible values of the poles and 
zeros of the filter transfer function to a finite set. Consequently 
the actual filter response will differ slightly from the theoretically 
derived one. Some common approaches to the problem consist of 
computing the frequency response directly using the quantised
coefficients, performing an optimised search over the grid of
allowed pole/zero positions around the ideal positions, and
finding general structures which are less sensitive to coefficient
inaccuracies. 
The quantisation of the results of the arithmetic operations 
of multiplications and additions is the third effect of finite 
register lengths. Its analysis depends on whether truncation or 
round-off is used, whether we implement the operations with fixed-
point or floating point arithmetic, and on whether we represent 
negative numbers in the sign-magnitude, l's or 2's complement form. 
In many situations, the rounding effect at each multiplier is
statistically modelled as a discrete stationary white-noise source
uniformly distributed in amplitude between ± (1/2)2^(-b), b being
the product register's bit length. Also each source has a transfer 
function to the output. 
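The uniform noise model can be checked numerically. The sketch below rounds random products to b fractional bits (an assumed interpretation of the register length) and compares the mean-square error with 2^(-2b)/12, the variance of a uniform distribution of width 2^(-b). The operand distribution, the seed and the function names are assumptions made for the example.

```python
import random


def round_to_bits(value, b):
    """Round value to the nearest multiple of 2**-b (b fractional bits)."""
    step = 2.0 ** -b
    return round(value / step) * step


def rounding_errors(b, trials=10000, seed=0):
    """Rounding errors of products of two random operands in (-1, 1)."""
    rng = random.Random(seed)
    errors = []
    for _ in range(trials):
        product = rng.uniform(-1, 1) * rng.uniform(-1, 1)
        errors.append(round_to_bits(product, b) - product)
    return errors


b = 8
errors = rounding_errors(b)
mean_square = sum(e * e for e in errors) / len(errors)
print(max(abs(e) for e in errors) <= 0.5 * 2.0 ** -b)   # error stays within the stated bound
print(mean_square / (2.0 ** (-2 * b) / 12))             # ratio to the model variance
```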
For recursive filter structures, the quantisation of the 
multiplicative products produces stable periodic or non-zero constant
outputs when the inputs are zero or constant. These outputs are 
called small-scale limit cycles. 
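A small-scale limit cycle is easy to reproduce with a first-order recursive section y_n = x_n - Q(b_1 y_(n-1)), where Q rounds the product to the nearest integer. The sketch below assumes b_1 = 9/10, integer data and round-half-up rounding, and uses integer arithmetic only so that the behaviour is exact; these choices and the function name are illustrative.

```python
def quantised_first_order(x, n_steps, num=9, den=10):
    """y[n] = x[n] - Q(b1 * y[n-1]) with b1 = num/den and Q = round to nearest integer."""
    y_prev = 0
    outputs = []
    for n in range(n_steps):
        xn = x[n] if n < len(x) else 0                 # zero input after the data ends
        product = (num * y_prev + den // 2) // den     # rounded product, half rounds up
        y_prev = xn - product
        outputs.append(y_prev)
    return outputs


print(quantised_first_order([10], 12))
```

With infinite precision the response to this input would be 10(-0.9)^n and would decay to zero; with rounding the output becomes trapped in a persistent oscillation between +4 and -4.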
Another problem is when the result of some arithmetic operations 
overflows and falls outside the permitted set of representable 
values resulting in an incorrect in-range value. When this occurs 
in the feedback loops of certain second-order sections, stable and 
persistent full-scale oscillations result. They are known as 
large-scale limit cycles. Methods exist with which one may determine 
the scale factors of signal levels at certain points in the 
filter structure to prevent overflow and still maintain a 
maximum signal/round-off noise ratio. 
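The overflow mechanism can be illustrated with a zero-input second-order section in 4-bit two's-complement arithmetic. In the sketch below values are integers in units of 1/8, the feedback terms are +3/4 y_(n-1) and -3/4 y_(n-2) (i.e. b_1 = -3/4, b_2 = 3/4 in the notation of equation (2.4)), products are rounded with halves away from zero, and the adder wraps around on overflow. All of these numbers and names are assumptions chosen for the example.

```python
def wrap(value, bits=4):
    """Two's-complement wrap-around into [-2**(bits-1), 2**(bits-1) - 1]."""
    half = 1 << (bits - 1)
    return (value + half) % (1 << bits) - half


def round_quarter(numerator):
    """Round numerator/4 to the nearest integer, halves away from zero."""
    magnitude = (abs(numerator) + 2) // 4
    return magnitude if numerator >= 0 else -magnitude


def zero_input_response(y1, y2, n_steps):
    """y[n] = wrap(Q(3/4 y[n-1]) + Q(-3/4 y[n-2])) with zero input."""
    outputs = []
    for _ in range(n_steps):
        y = wrap(round_quarter(3 * y1) + round_quarter(-3 * y2))
        y2, y1 = y1, y
        outputs.append(y)
    return outputs


print(wrap(100 + 100, bits=8))            # an overflowing 8-bit sum gives an in-range but wrong value
print(zero_input_response(6, -6, 6))      # initial state y[-1] = 6/8, y[-2] = -6/8
```

The poles of this section lie inside the unit circle, yet the wrapped sum keeps the output swinging between -6/8 and +6/8 with no input applied.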
2.3.2 Considerations in the real-time hardware implementation. 
Although conceptually one simply interconnects adder, multiplier 
and storage units in order to mechanise the filter algorithm, in 
practice one is confronted with a bewildering array of factors and 
constraints in the choice of system or circuit structure and component 
and device technology. To achieve an efficient and economical system, 
the designer must consider initial costs, hardware complexity with 
respect to construction, testing and maintenance, power dissipation, 
space requirements, system modularity and flexibility etc. All 
these factors depend on specific needs and applications. 
Assuming that the would-be designer has obtained his filter 
coefficients, and an estimate of the bits required for the input, 
coefficient and internal data words, he must also realise that in 
real-time digital filtering, the filtering algorithm must be computed 
within the sampling period T of the input signal, with the maximum 
allowable value of T depending on the bandwidth or Nyquist frequency 
(see Ref. 1) of the signal. Operational speed thus has to be 
balanced by system cost and complexity. 
To obtain the system structure one has to decide on the number 
representation and the type of arithmetic to use. Floating-point 
arithmetic gives a larger dynamic range than that of fixed-point, 
but it requires a more complex hardware due to the need to align 
mantissas. The circuit complexity also depends on the particular 
representation of negative numbers. One must also remember that 
although the use of 2's complement arithmetic makes additions and 
subtractions easy, multiplication is much more convenient using 
the sign-magnitude representation. 
One must also define the basic functional units making up
the adders, multipliers and data stores, their interconnections,
and their processing modes, i.e. word parallel or bit serial
processing (see Lewin (Ref. 21) for more details on number representations
and arithmetic hardware). Basic units may also be time-multiplexed,
and one could also increase system throughput by the incorporation
of pipelining (Refs. 22, 23).
The basic adding unit is the full-adder (F.A.) (Ref. 21); a single
one is used in serial addition, and an N-bit parallel addition may
be achieved by connecting N F.A.'s in cascade. Fast additions
employ the familiar carry look-ahead technique (Ref. 21).
Multipliers are the most important, complex and expensive units. 
They range in structures from the simplest shift-and-add ones, 
through the serial-parallel varieties, to the fast two-dimensional 
array multipliers. An ultra-fast array combines the carry-save 
technique with a "tree" arrangement of adder rows. (See Chapter 8 
of Ref. 1 and also Refs. 24 & 25). 
Data are stored in bistable shift registers and/or M.S.I.
and L.S.I. memories. These memories are either static or dynamic,
the latter requiring refreshing (Ref. 26), and are further classified into
read-only memories (R.O.M's) and read-and-write memories (commonly referred
to as R.A.M's which, strictly speaking, can also be taken to mean
random access memories).
Apart from these main units, extra circuitry is required
for overflow detection and correction, intersection scaling, data
quantisation and system control.
For a given architecture, an appropriate device technology 
must be matched to it. Each particular technology is characterised 
by 
(a) the physical size of the basic device, e.g. a logic gate,
(b) its power dissipation, and 
(c) its switching speed. 
(a) and (b) usually determine the scale of integration, i.e. 
the number of devices per chip, while the ratio of (b) to (c) is 
roughly constant for a given technology and is often used as a 
figure of merit. 
At present the proven technologies are the bipolar saturated
transistor-transistor logic (T.T.L.) and linear emitter-coupled
logic (E.C.L.), and the unipolar metal-oxide semiconductor (M.O.S.)
technology (Ref. 26), having typical power-delay values of (10 mW - 10 nS),
(60 mW - 1 nS) and (0.2 mW - 300 nS) respectively. In general bipolar
circuits have achieved much higher speeds while M.O.S. chips have
attained a much higher degree of circuit integration.
Among the newer technologies are the use of sapphire substrates 
and the bipolar integrated-injection logic (I.I.L.) which promises 
a high packing density. 
Lastly, the trend in digital system design is rapidly moving 
from the use of discrete gates and simple logic packages to that of 
medium-scale (M.S.I.) and large-scale integrated (L.S.I.) techniques (Ref. 27).
2.3.3 Existing hardware design approaches. 
In contrast to the mathematical design theory, there is no 
systematic design technique for the real-time hardware implementation 
of digital filters. There are as many hardware structures as there 
are authors, and the continuing rapid change in integrated circuit 
technology makes the problem of the device implementation of these 
structures a dynamic one. 
In this section we discuss the major classes of design approach. 
In the author's opinion, the first three are becoming established 
designs due to their efficiency, simplicity and modularity, and 
have attracted the attention and enthusiasm of a host of workers 
in the field. 
The first design approach was proposed in the classic 1968
paper by Jackson et al. (Ref. 4). The corresponding filter structure uses
serial arithmetic and features a sign-magnitude serial-data and
parallel-coefficient multiplier as shown in Fig. 2.4(a). The problems
of excessive propagation delays and the need to quantise double-length
products to single-length registers were solved by efficient
pipelining and simple logic. A typical adder cell of the modified
multiplier is shown in Fig. 2.4(b). Jackson et al. also presented
a simple multiplexing method for multichannel or multifunction
processing, using a R.O.M. to store the coefficients. The scheme
was used in implementing an all-digital touch-tone receiver consisting
of high-pass, low-pass, band-stop and band-pass filters of various
orders from the 1st to the 6th, using multiplexed first and second-order
sections. The sampling rate is 10K samples/sec., input
quantisation is 7 bits, and 40 serial adders and 400 bits of shift-register
storage were used.
Of a more recent vintage is the design proposed by Croisier
et al. (Ref. 6), and further analysed and developed by Little (Ref. 28)
and Peled and Liu (Ref. 7), which promises high speed, fairly low power and low
package count. The design substitutes a table look-up R.O.M. for
the bit-wise multiplication of the filter coefficients by the data,
with the filter output obtained by the operation of adding and
shifting. A second-order section implemented in this way is shown
in Fig. 2.5 and requires a 32 x 8-bit R.O.M. This basic circuit
has a 20 MHz bit-rate, a package count of 20 I.C's and dissipates 9.6 W.
For a 12-bit input this section can handle up to 800 kHz bandwidth
signals. A parallel version (Ref. 7) of the technique requires 60 I.C's,
consumes 24 W, and allows a signal bandwidth of 10 MHz. A general
comparison with Jackson's approach is shown in Table 2.1.
Lockhart (Ref. 9) took a different approach by combining delta-modulation
encoding (Ref. 29) (instead of P.C.M.) with digital filtering. The versions
described by Croisier (Ref. 30) and Liu (Ref. 31) use R.O.M's for their mechanisation.
The design requires simple and inexpensive hardware, is particularly
appropriate to applications involving analogue-digital interfacing,
and has found favour with researchers working on speech signals.
We now mention briefly the work by other authors. In 1967,
Sypherd (Ref. 32) used R.O.M's for multiplication and multiple-input additions.
Gabel (Ref. 5) described a simple architecture using a time-shared multiplier/adder
unit in which the filter coefficients are represented in a
simplified floating-point form. Tran-Thong and Liu (Ref. 33) used
differential pulse-code modulation (D.P.C.M.) for the signal encoding
and evolved a design for a D.P.C.M. filter. A look-up table
technique using R.O.M's was proposed by Nussbaumer (Ref. 34) in which he
modified the familiar quarter-squares multiplication method (Ref. 35)
to reduce the overall number of additions and squarings.

[Fig. 2.4 Jackson's basic serial-parallel multiplier (a) and pipeline cell (b). Legend: F.F. = flip-flop; d = 1-bit delay.]

[Fig. 2.5 Read-only memory second-order section (after Croisier et al.).]

Table 2.1. Hardware and performance comparison of Peled and Liu, and
Jackson et al. methods.

Filter type                        Peled and Liu                                Jackson et al.
(12-bit word-lengths)              No. of I.C's                     Power (W)   No. of I.C's   Power (W)
8th-order, parallel,               72                               28          240            96
  1 MHz word-rate (w.r.)
8th-order, cascade,                33, memory size = 128 x 8 bits   14          60             24
  250 kHz w.r.
2nd-order, multiplexed,            18, memory size = 512 x 8 bits   10          60             24
  128 channels, each 8 kHz w.r.
10th-order, mux.,                  54                               22          190            100
  96 ch., 8 kHz w.r.
Another attractive approach is to replace multiplications by
simple shifting using multiplexers by restricting the values of
the filter coefficients to only integer powers of two or zero.
The design leads to simple and very fast filters, e.g. Tomozawa (Ref. 36)
used the approach to process real-time colour television signals.
Van Gerwen et al. (Ref. 37) published an excellent theoretical and
experimental study of this approach, and introduced a filter
consisting of a transversal part and a simple recursive network.
Other approaches are the bit-level counting technique of
Zohar (Ref. 8) (which is still a conceptual entity), and the use of
logarithmic arithmetic as suggested by Hall et al. (Ref. 38) and Kingsbury
and Rayner (Ref. 39). As far as the author knows, no hardware details of
the latter technique have been published.
Finally, custom-designed digital filter chips and packages are
now slowly making their appearance commercially, e.g. the Pye
TMC Ltd. p-M.O.S. dual second-order filter chip (Ref. 40) and the 3-chip
M.S.I./L.S.I. digital filter set by Advanced Micro Devices Inc. (Ref. 41),
which employs low-power Schottky bipolar technology.
2.3.4 Conclusion. 
We have reviewed briefly the theory and design of digital 
filters and the problems involved in their real-time hardware 
implementation, and also surveyed the state-of-the-art of the 
existing hardware design approaches. 
CHAPTER 3 
ELEMENTARY STRUCTURE THEORY 
OF 
FINITE-STATE SEQUENTIAL MACHINES 
3.0 Introduction. 
The theory of finite automata or finite-state sequential
machines (F.S.M's) is a special case of general systems theory
in which the input, state and output variables only assume discrete
values, and the functional relationship between them is described
by abstract algebra. An F.S.M. is a useful mathematical model
for digital computers, processors, the behaviour of nerve networks,
language structures and information-transmission systems to name
a few.
The "structure theory" for F.S.M's concerns the realisation
of an F.S.M. from a set of smaller component sub-machines, the
interconnection and the "information" flow between these components.
The theory provides a direct link between algebraic relationships
and physical realisations of machines.
In the following sections we introduce briefly the basic
ideas, concepts, terminology and results of this theory. The
books by Booth, Kohavi and Hartmanis (Refs. 11, 12 and 71) are excellent
introductory texts.
3.1 Descriptions of F.S.M's. 
Definition 3.0. A Mealy type sequential machine M is a
quintuple $(S, I, O, \delta, \lambda)$ where S, I, O are finite nonempty sets
of states, inputs and outputs respectively, and $\delta$, $\lambda$ are
the transition (next state) and the output functions given
by
$$\delta : S \times I \to S \quad\text{and}\quad \lambda : S \times I \to O.$$
When the output is a function of the present state only,
i.e. $\lambda : S \to O$, then the machine is known as a Moore type.
When, as in many cases, we are not interested in the output, the
corresponding machine is called a state machine defined by the
triplet $(S, I, \delta)$. The block representation of the Mealy type
F.S.M. is shown in Fig. 3.0. The behaviour of an F.S.M. is
commonly represented by a flow table or a state graph. Each row
of the flow table represents a machine state, while the columns
correspond to the inputs. The table entries indicate each state
and output transition. The nodes of the corresponding state
graph represent the states, while the arrow between nodes $s_1$ and
$s_2$, labelled by the ordered pair $(x, o)$, $x \in I$, $o \in O$, indicates
that $\delta(s_1, x) = s_2$ and $\lambda(s_1, x) = o$.
These two representations are illustrated in Figs. 3.1(a)
and (b) for the Mealy machine $M = (\{P,Q,R\}, \{a,b\}, \{0,1\}, \delta, \lambda)$.
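As an illustrative sketch, the flow table of Fig. 3.1(a) can be written as two look-up dictionaries for $\delta$ and $\lambda$ and stepped through an input sequence. The entries below are as read from Fig. 3.1(a); the function name `run` and the choice of start state are assumptions made only for this example.

```python
# Mealy machine M = ({P,Q,R}, {a,b}, {0,1}, delta, lam); entries as read from Fig. 3.1(a).
DELTA = {("P", "a"): "Q", ("P", "b"): "P",
         ("Q", "a"): "R", ("Q", "b"): "R",
         ("R", "a"): "P", ("R", "b"): "Q"}
LAM = {("P", "a"): 0, ("P", "b"): 0,
       ("Q", "a"): 0, ("Q", "b"): 1,
       ("R", "a"): 1, ("R", "b"): 0}


def run(state, inputs):
    """Apply an input sequence; return (final state, list of outputs)."""
    outputs = []
    for x in inputs:
        outputs.append(LAM[(state, x)])  # Mealy output: present state and input
        state = DELTA[(state, x)]        # then move to the next state
    return state, outputs
```

For instance, `run("P", "abab")` follows the arrows P to Q to R to P to P of the state graph while collecting one output per input symbol.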
In elementary machine decompositions, an important concept, 
which relates the behaviour of two machines, is that of machine 
homomorphism, which is an operation-preserving transformation. 
Definition 3.1. The sequential machine $M' = (S', I', O', \delta', \lambda')$
is a homomorphic image of the machine $M = (S, I, O, \delta, \lambda)$ iff
there exist three onto mappings;
$$h_1 : S \to S', \quad h_2 : I \to I' \quad\text{and}\quad h_3 : O \to O'$$
such that
$$h_1[\delta(s,a)] = \delta'[h_1(s), h_2(a)],$$
$$h_3[\lambda(s,a)] = \lambda'[h_1(s), h_2(a)].$$
The triple $(h_1, h_2, h_3)$ of mappings is referred to as a
homomorphism of M onto M'.

[Fig. 3.0. Block representation of a finite-state sequential machine.]

[Fig. 3.1. Flow table (a) and state graph (b) representations of an F.S.M.]
Flow table of Fig. 3.1(a):

Present     Next state     Output
state       a     b        a     b
P           Q     P        0     0
Q           R     R        0     1
R           P     Q        1     0
Definition 3.2. A state machine $M' = (S', I', \delta')$ is a homomorphic
image of M iff there exist two onto mappings;
$$h_1 : S \to S', \quad h_2 : I \to I'$$
such that
$$h_1[\delta(s,a)] = \delta'[h_1(s), h_2(a)].$$
When $h_2$ and $h_3$ are identity mappings, the homomorphism is
called a state homomorphism. When two machines are identical except
for a renaming of the states, inputs and outputs, we have an
isomorphism between them.
Definition 3.3. Two machines $M = (S, I, O, \delta, \lambda)$ and $M' = (S', I', O', \delta', \lambda')$
are isomorphic iff there exist three one-to-one mappings;
$$f_1 : S \to S', \quad f_2 : I \to I' \quad\text{and}\quad f_3 : O \to O'$$
such that
$$f_1[\delta(s,x)] = \delta'[f_1(s), f_2(x)],$$
$$f_3[\lambda(s,x)] = \lambda'[f_1(s), f_2(x)].$$
3.2 Interconnections of F.S.M's. 
When decomposing a machine M into, or realising it from, its
component sub-machines, it is important to know the possible ways
of interconnecting them.
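Definition 3.4(a). The serial connection of
$M_1 = (S_1, I_1, O_1, \delta_1, \lambda_1)$ and $M_2 = (S_2, I_2, O_2, \delta_2, \lambda_2)$ for which
$O_1 = I_2$, is the machine M, denoted by $M_1 \to M_2$, where
$$M = (S_1 \times S_2, I_1, O_2, \delta, \lambda)$$
such that
$$\delta[(s_1,s_2),x] = (\delta_1(s_1,x),\ \delta_2[s_2, \lambda_1(s_1,x)])$$
and
$$\lambda[(s_1,s_2),x] = \lambda_2[s_2, \lambda_1(s_1,x)].$$
The serial connection for state machines, however, is
slightly different.
Definition 3.4(b). Given two state machines
$M_1 = (S_1, I_1, \delta_1)$, $M_2 = (S_2, I_2, \delta_2)$ with $I_2 = S_1 \times I_1$, and
an output set O and an output function $\lambda : S_1 \times S_2 \times I_1 \to O$,
then the serial connection of $M_1$ and $M_2$ is the machine
$M = (S_1 \times S_2, I_1, O, \delta, \lambda)$ where
$$\delta[(s_1,s_2),x] = (\delta_1(s_1,x),\ \delta_2[s_2,(s_1,x)])$$
and
$$\lambda[(s_1,s_2),x] = \lambda(s_1,s_2,x).$$
These two different serial connections are shown in the
schematic diagrams in Figs. 3.2(a) and (b).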
Definition 3.4(c). The parallel connection of $M_1$ and $M_2$ is
the machine $M = M_1 \times M_2 = (S_1 \times S_2, I_1 \times I_2, O_1 \times O_2, \delta, \lambda)$,
where
$$\delta[(s_1,s_2),(x_1,x_2)] = [\delta_1(s_1,x_1),\ \delta_2(s_2,x_2)]$$
and
$$\lambda[(s_1,s_2),(x_1,x_2)] = [\lambda_1(s_1,x_1),\ \lambda_2(s_2,x_2)].$$

[Fig. 3.2. Serial connections of (a) general F.S.M's and (b) state machines.]

[Fig. 3.3. An F.S.M. realised with binary variables.]
3.3 Problem areas in F.S.M. realisation. 
Two major problems in the realisation and physical implementation 
of F.S.M's are those of state reduction and state assignment. 
The former concerns the concept of equivalence between the 
states of machines, and also between two machines. 
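Definition 3.5. Given two machines $M_1 = (S_1, I, O, \delta_1, \lambda_1)$ and
$M_2 = (S_2, I, O, \delta_2, \lambda_2)$ having the same input and output alphabets,
$s_1 \in S_1$ and $s_2 \in S_2$ are said to be equivalent iff
$$\bar{\lambda}_1(s_1, \bar{x}) = \bar{\lambda}_2(s_2, \bar{x})$$
where (for Mealy type) $\bar{x}$ is any finite non-null input sequence
and $\bar{\lambda}_1$, $\bar{\lambda}_2$ are the extended output functions of $M_1$ and $M_2$
respectively (see pp. 22-23 of Ref. 12).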
Definition 3.6. Two machines of the same type, $M_1$ and $M_2$,
are equivalent iff each $s_1$ in $S_1$ has an equivalent state
$s_2$ in $S_2$ and vice versa.
Definition 3.7. A machine M is reduced iff state $s_1$ equivalent
to state $s_2$ implies that $s_1 = s_2$.
It is easily shown that among all the machines equivalent to
a given machine M, there exists a unique equivalent reduced machine
$M_R$ which has the minimum number of states. Basically, for any
given finite input sequence, M and $M_R$ will give the same output
sequence. While there are standard techniques for the minimisation
of machine states, it is also possible to apply structure theory
to the state reduction problem as will be described in Section 3.6.
The second problem arises because in practice the inputs,
states and outputs of a machine M are invariably represented by
binary variables. Thus we may write
$$S \subseteq \{(y_1, \ldots, y_n)\}, \ \text{the set of all n-tuples on } \{0,1\},$$
$$I \subseteq \{(x_1, \ldots, x_m)\}$$
$$\text{and}\quad O \subseteq \{(z_1, \ldots, z_r)\}.$$
Also, each state and each output binary variable is a function
of $\{y_1, \ldots, y_n, x_1, \ldots, x_m\}$. The block diagram of an F.S.M.
expressed in this manner is shown in Fig. 3.3.
The state assignment problem is the selection of "desirable"
binary codes to represent the internal machine states. Although
this usually means the use of the fewest number of components, e.g.
logic gates, the relevant criteria are most often determined by
the dynamics of technology. In concept, however, it is reasonable
to assume that we can obtain economical state assignments and 
simplify the logic circuits in the physical implementation if 
we can reduce the number of present-state and input variables on 
which the next-state variables depend. 
Since the structure theory for F.S.M's deals with the general 
understanding of functional dependence and the realisations of 
machines from smaller components, it may be regarded as an approach 
to the state assignment problem. (Other approaches are listed in 
page 36 of Ref. 12). 
3.4 Basic algebraic concepts. 
Two key mathematical ideas that are important tools in
machine decompositions are the concept of partitions on a set,
and that of an algebraic lattice.
Definition 3.8. A partition $\pi$ on S is a collection of
disjoint subsets of S whose set union is S, i.e. $\pi = \{B_\alpha\}$
such that
$$B_\alpha \cap B_\beta = \emptyset \ \text{for}\ \alpha \neq \beta, \quad\text{and}\quad \bigcup \{B_\alpha\} = S.$$
The $B_\alpha$'s are called blocks of $\pi$ and the block containing
s is written as $B_\pi(s)$. Also we write $s \equiv t\,(\pi)$ iff s and t
are contained in the same block of $\pi$.
Partitions may be combined by the "product" or "$\cdot$", and
the "sum" or "+" operations as follows.
(i) $\pi_1 \cdot \pi_2$ is the partition on S such that $s \equiv t\,(\pi_1 \cdot \pi_2)$
iff $s \equiv t\,(\pi_1)$ and $s \equiv t\,(\pi_2)$.
(ii) $\pi_1 + \pi_2$ is the partition such that $s \equiv t\,(\pi_1 + \pi_2)$ iff
there exists a sequence in S, $s = s_0, s_1, s_2, \ldots, s_n = t$, for which
either $s_i \equiv s_{i+1}\,(\pi_1)$ or $s_i \equiv s_{i+1}\,(\pi_2)$.
As an example, let S = {A,B,C,D,E,F,G,H,I}, and
$\pi_1$ = {A,B; C,D; E,F; G,H,I} and $\pi_2$ = {A,F; B,C; D,E; G,H; I}.
Then we have,
$\pi_1 \cdot \pi_2$ = {A; B; C; D; E; F; G,H; I} and
$\pi_1 + \pi_2$ = {A,B,C,D,E,F; G,H,I}.
Partitions may also be ordered by the "larger than or equal",
i.e. $\geq$, relation. We say that $\pi_b \geq \pi_a$ iff every block of $\pi_a$
is contained in a block of $\pi_b$, i.e. $\pi_a \cdot \pi_b = \pi_a$.
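The three partition operations can be checked mechanically on the example above. The following is a minimal sketch in which a partition is a set of frozensets; the helper names `partition`, `product`, `psum` and `leq` are assumptions introduced only for this illustration.

```python
def partition(text):
    """Parse the notation used in the text, e.g. 'A,B; C,D; E,F'."""
    return {frozenset(x.strip() for x in blk.split(",")) for blk in text.split(";")}


def product(p, q):
    """Blocks of p.q are the non-empty intersections of a block of p with a block of q."""
    return {a & b for a in p for b in q if a & b}


def psum(p, q):
    """Blocks of p+q: chain together the blocks of p that a block of q overlaps."""
    blocks = [set(b) for b in p]
    for b in q:
        touching = [c for c in blocks if c & b]
        rest = [c for c in blocks if not (c & b)]
        blocks = rest + [set(b).union(*touching)]
    return {frozenset(b) for b in blocks}


def leq(p, q):
    """p <= q iff every block of p lies inside some block of q."""
    return all(any(a <= b for b in q) for a in p)


pi1 = partition("A,B; C,D; E,F; G,H,I")
pi2 = partition("A,F; B,C; D,E; G,H; I")
```

With these definitions `product(pi1, pi2)` and `psum(pi1, pi2)` give the product and sum of the example, and `leq` confirms that the product lies below, and the sum above, both operands.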
Definition 3.9. A lattice is a partially ordered set
$L = (S, \leq)$ in which every pair of elements has a least
upper bound (l.u.b.) and a greatest lower bound (g.l.b.).
(See pp. 6-7 of Ref. 12, or Herstein (Ref. 70), for definitions
of the underlined terms).
Alternatively, a lattice L is defined as a triplet
$L = (S, \cdot, +)$ where "$\cdot$" and "+" are binary operations satisfying
certain postulates (page 7 of Ref. 12).
The set of all partitions on a set, for example, is a lattice.
Definition 3.10. If $L = (S, \cdot, +)$ is a lattice, and $T \subseteq S$,
$T \neq \emptyset$, then $L_1 = (T, \cdot, +)$ is a sub-lattice of L iff x and y
$\in T$ implies that $x \cdot y$ and $x + y \in T$.
Definition 3.11. A lattice $L_1 = (S_1, \cdot, +)$ is homomorphic
to $L_2 = (S_2, \cdot, +)$ iff there exists an onto mapping
$h : S_1 \to S_2$, such that
$$h(x \cdot y) = h(x) \cdot h(y) \quad\text{and}\quad h(x + y) = h(x) + h(y).$$
Thus, $L_2$ is very simply a "coarse" version of $L_1$. If h is
a one-to-one onto mapping, then we say that the two lattices $L_1$
and $L_2$ are isomorphic.
3.5 Structural decompositions of F.S.M's.
The basis of machine decompositions is the modification of
the homomorphism concept to involve only one machine.
Definition 3.12. A partition $\pi$ on the set of states of the
machine $M = (S, I, O, \delta, \lambda)$ has the substitution property (S.P.)
iff $s \equiv t\,(\pi)$ implies that
$$\delta(s,a) \equiv \delta(t,a)\,(\pi)$$
for all a in I.
In other words, for each input, blocks of $\pi$, defined as
above, will be mapped into blocks of $\pi$. These blocks may now
be regarded as the states of a new machine defined by $\pi$ and M.
Definition 3.13. Let $\pi$ be an S.P. partition on the set of
states of M. Then the $\pi$-image of M is the state machine
$$M_\pi = (\{B_\pi\}, I, \delta_\pi)$$
with
$$\delta_\pi(B_\pi, x) = B'_\pi \quad\text{iff}\quad \delta(B_\pi, x) \subseteq B'_\pi.$$
It is easily shown that there is a one-to-one correspondence
between state homomorphisms and S.P. partitions. Also, if $\pi_1$ and
$\pi_2$ are S.P. partitions, so are the partitions $\pi_1 \cdot \pi_2$ and $\pi_1 + \pi_2$.
We will use the following theorem quite often.
Theorem 3.0. The set of all S.P. partitions on the set of
states of an F.S.M. M forms a lattice $L_M$, under the natural
partition ordering. Also $L_M$ contains the trivial partitions
$\pi(0)$ and $\pi(I)$.
Proof. (See page 41 of Ref. 12).
The lattice $L_M$ is useful because it displays visually all
the important multiple series-parallel state behaviour realisations,
and because algebraic lattice properties are reflected in machine
properties and vice versa.
A general procedure for finding all the S.P. partitions of
a machine consists of two steps;
(i) For every pair of states s and t, compute the smallest
S.P. partition $\pi_{s,t}$ which identifies the pair.
(ii) Find all possible sums of the $\pi_{s,t}$'s. These sums
constitute all the S.P. partitions.
Details of this procedure may be found in the recommended
texts.
To consolidate the ideas we have discussed so far, we
consider the machine M shown in Fig. 3.4. Using the above procedure
we find the following set of S.P. partitions:
$\pi(0)$ = { 1; 2; 3; 4; 5; 6; 7; 8 },
$\pi_1$ = { 1,2; 3,4; 5,6; 7,8 },
$\pi_2$ = { 1,2,3,4; 5,6,7,8 },
$\pi_3$ = { 1; 2; 3; 4,5; 6; 7; 8 },
$\pi_4$ = { 1,2; 3,4,5,6; 7,8 },
$\pi_5$ = { 1; 2; 3,6; 4; 5; 7; 8 },
$\pi_6$ = { 1; 2; 3,6; 4,5; 7; 8 },
$\pi(I)$ = { 1,2,3,4,5,6,7,8 }.
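The two-step procedure can be sketched in code and applied to the machine M. This is an illustrative implementation, not the one used in the thesis: the next-state table is as read from Fig. 3.4, with the entry for state 3 under input 1 taken as 6 (an assumption about a hard-to-read figure, chosen because it is the value for which the partitions listed above all have S.P.), and all function names are hypothetical.

```python
from itertools import combinations

# Next-state table of M as read from Fig. 3.4: state -> (next on input 0, next on input 1).
# The entry for state 3, input 1 is taken as 6 (assumption, see the lead-in).
DELTA = {1: (3, 7), 2: (4, 8), 3: (1, 6), 4: (2, 5),
         5: (2, 4), 6: (1, 3), 7: (4, 4), 8: (3, 3)}


def merge(p, s, t):
    """Return partition p with the blocks containing s and t united."""
    bs = next(b for b in p if s in b)
    bt = next(b for b in p if t in b)
    return p if bs == bt else (p - {bs, bt}) | {bs | bt}


def sp_closure(p):
    """Smallest S.P. partition >= p: keep uniting successor blocks until stable."""
    while True:
        q = p
        for b in p:
            for s, t in combinations(b, 2):
                for k in (0, 1):
                    q = merge(q, DELTA[s][k], DELTA[t][k])
        if q == p:
            return p
        p = q


def psum(p, q):
    """Partition sum: unite in p every pair of states that q identifies."""
    for b in q:
        first, *others = sorted(b)
        for t in others:
            p = merge(p, first, t)
    return p


def all_sp_partitions():
    zero = frozenset(frozenset({s}) for s in DELTA)
    # Step (i): smallest S.P. partition identifying each pair of states.
    found = {zero} | {sp_closure(merge(zero, s, t)) for s, t in combinations(DELTA, 2)}
    # Step (ii): close the collection under the sum operation.
    while True:
        new = {psum(p, q) for p in found for q in found} - found
        if not new:
            return found
        found |= new
```

Under the stated assumption, `all_sp_partitions()` returns eight partitions, which can be compared block by block with the list given above.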
[Fig. 3.4. F.S.M. M.]
Flow table of Fig. 3.4:

             I/P        O/P
State      0     1
  1        3     7       0
  2        4     8       0
  3        1     6       0
  4        2     5       0
  5        2     4       0
  6        1     3       1
  7        4     4       1
  8        3     3       0

[Fig. 3.5. Lattice $L_M$ of M.]

The corresponding lattice $L_M$ is shown in Fig. 3.5. Also
if we consider $\pi_4$ say, the $\pi_4$-image of M is shown in Fig. 3.6.
The concept of S.P. partitions is very useful in the serial
and parallel decompositions of an F.S.M. into its components,
as shown in the following theorems.
Theorem 3.1. The F.S.M. M has a non-trivial serial
decomposition of its state behaviour iff there exists a
nontrivial S.P. partition $\pi$ on the set of states of M.
Proof. (Pages 45-46, Ref. 12).
If the largest block of $\pi$ has k states, then M is defined
by $\pi$ and $\tau$, where $\tau$ is a k-block partition such that
$$\pi \cdot \tau = \pi(0) \qquad (3.0)$$
For example, for the machine shown in Fig. 3.7 the partition
$\pi$ = {1,2; 3,4,5} has S.P. One possible $\tau$ is then $\tau$ = {1,3; 2,4; 5}
because
{1,2; 3,4,5} $\cdot$ {1,3; 2,4; 5} = $\pi(0)$.
Theorem 3.2. The F.S.M. M has a non-trivial parallel
decomposition of its state behaviour iff there exist two
nontrivial S.P. partitions $\pi_1$ and $\pi_2$ on M such that
$$\pi_1 \cdot \pi_2 = \pi(0) \qquad (3.1)$$
Proof. (Pages 48-51, Ref. 12).
3.6 State reduction using S.P. partitions. 
Another application of S.P. partitions is in finding the
reduced machine that is equivalent to M. The technique is based
on the following.

[Fig. 3.6. A $\pi_4$-image of M.]

[Fig. 3.7. An F.S.M. to demonstrate serial-decomposition.]
Flow table of Fig. 3.7:

             I/P        O/P
State      0     1     0     1
  1        5     3     1     0
  2        3     4     0     0
  3        1     5     0     1
  4        2     3     0     0
  5        1     4     0     0
Definition 3.14. For a machine M, we define $\pi_R$ to be the
partition on M such that $s \equiv t\,(\pi_R)$ iff state s is equivalent
to state t.
It can be shown (page 55, Ref. 12) that $\pi_R$ has S.P. and $M_{\pi_R}$
with the output
$$\lambda_R(B_{\pi_R}, x) = \lambda(s, x) \quad\text{for } s \text{ in } B_{\pi_R}$$
is the reduced equivalent of M.
Thus once the S.P. lattice is obtained $M_{\pi_R}$ is easily found
by deciding which S.P. partition is $\pi_R$. An easy method for this
is based on the following.
Theorem 3.3. If M is an F.S.M., then $\pi_R$ is the maximal output-consistent
(O.C.) partition with S.P. Also an S.P. partition $\pi$
is O.C. iff $\pi \leq \pi_R$. (A partition $\pi$ on the states of M is
O.C. iff $s \equiv t\,(\pi)$ implies $\lambda(s,x) = \lambda(t,x)$ for all inputs x.)
Proof. (Page 56, Ref. 12).
It is also easily shown that the O.C. S.P. partitions form a
sub-lattice of the S.P. lattice. Furthermore it is easy to test
a partition to see if it is O.C. Thus once the S.P. lattice is
given, $M_{\pi_R}$, the reduced equivalent of M, may be determined in a
straightforward way.
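As a sketch of how Theorem 3.3 might be applied, the fragment below tests each S.P. partition listed in Section 3.5 for output consistency and picks the largest one. The outputs are as read from the single O/P column of Fig. 3.4 (so the machine is treated as Moore type, which is an assumption), and the dictionary keys and function names are illustrative only.

```python
# Output of each state of M, as read from the O/P column of Fig. 3.4.
OUTPUT = {1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 1, 7: 1, 8: 0}

# The S.P. partitions listed in Section 3.5, written as lists of blocks.
SP = {
    "pi(0)": [{1}, {2}, {3}, {4}, {5}, {6}, {7}, {8}],
    "pi1": [{1, 2}, {3, 4}, {5, 6}, {7, 8}],
    "pi2": [{1, 2, 3, 4}, {5, 6, 7, 8}],
    "pi3": [{1}, {2}, {3}, {4, 5}, {6}, {7}, {8}],
    "pi4": [{1, 2}, {3, 4, 5, 6}, {7, 8}],
    "pi5": [{1}, {2}, {3, 6}, {4}, {5}, {7}, {8}],
    "pi6": [{1}, {2}, {3, 6}, {4, 5}, {7}, {8}],
    "pi(I)": [{1, 2, 3, 4, 5, 6, 7, 8}],
}


def is_oc(blocks):
    """Output consistent: all states in a block produce the same output."""
    return all(len({OUTPUT[s] for s in b}) == 1 for b in blocks)


def reduction_partition():
    """Name of the maximal O.C. partition among the S.P. partitions."""
    oc = {name: blocks for name, blocks in SP.items() if is_oc(blocks)}
    # The O.C. S.P. partitions form a sub-lattice, so the maximal one has fewest blocks.
    return min(oc, key=lambda name: len(oc[name]))
```

With the outputs as read here, only $\pi(0)$ and $\pi_3$ pass the test, so the sketch selects $\pi_3$, i.e. states 4 and 5 would be merged in the reduced machine.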
3.7 Conclusion. 
We have presented a very brief introduction to the main 
concepts in the structural theory of decompositions of finite-state
sequential machines. A more detailed treatment of the theory,
which includes advanced concepts like partition-pair algebra and 
state-splitting may be found in any of the references given. 
CHAPTER 4 
FINITE-STATE MACHINE MODELS 
OF 
STORED-LOGIC DIGITAL FILTERS 
4.0 Introduction. 
In this chapter, we investigate the feasibility of applying 
the structure theory of finite-state sequential machines (F.S.M's) 
to the implementation of digital filters. The results we obtain 
give us a valuable insight into the problem of using this direct 
modelling technique to realise general filter sections. 
4.1 General approach. 
We have seen in Chapter 2 that the conventional way to implement 
the basic second-order section, shown in Fig. 2.3 and described by 
equation (2.4), is to use adder, multiplier and delay units. Also,
we know from Chapter 3 that by using the theory of state partitions
with the substitution property, it is possible to decompose an
F.S.M. into an interconnection of "smaller" machines.
In contrast to the conventional method we propose to realise 
a basic biquadratic section as a table look-up or stored logic 
unit. This unit is subsequently modelled as an F.S.M. which is then 
analysed using the method of S.P. partitions. 
4.2 Stored-logic digital filters. 
Conceptually, the method of table look-up is the most straightforward
way to realise combinational switching functions in general
and arithmetic circuits in particular (Refs. 42, 43, 44). (Maclean and
Aspinall (Ref. 42) used this technique to design a practical decimal adder
as early as 1957).
A general table look-up arithmetic unit is shown in Fig. 4.0(a),
in which the arithmetic function g is a function of n independent
variables $q_k$, k = 1,2,...,n, each $q_k$ being an i-valued variable.
Consequently, there are m possible values of g, where $m = i^n$.
Every value of g is precomputed and stored in a memory or storage
unit. A particular value of g is accessed by the corresponding n-tuple
$(q_1, q_2, \ldots, q_k, \ldots, q_n)$ which forms the memory address.
In practice, data are usually represented in the binary form,
in which case i = 2, and $q_k$ = 0 or 1. Also g will now be represented
by z bits. The resulting table look-up circuit is now as shown in
Fig. 4.0(b), and it is usual to characterise this memory circuit by
its capacity M given by,
$$M = 2^n \times z \ \text{word-bits (W-b)} \qquad (4.0)$$
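The binary look-up unit can be illustrated in a few lines. This is a minimal sketch in which a dictionary plays the role of the memory; the function names and the example function (the arithmetic sum of three 1-bit variables) are assumptions chosen only for illustration.

```python
from itertools import product


def build_table(g, n):
    """Precompute g for every binary n-tuple; the tuple acts as the memory address."""
    return {q: g(*q) for q in product((0, 1), repeat=n)}


def capacity(n, z):
    """Capacity in word-bits as in equation (4.0): 2**n words of z bits each."""
    return (2 ** n) * z


# Hypothetical example: g is the arithmetic sum of three 1-bit variables,
# so n = 3 address bits and z = 2 output bits are needed.
table = build_table(lambda a, b, c: a + b + c, 3)
```

Here `table[(1, 0, 1)]` is read out by addressing, with no arithmetic performed at access time, and `capacity(3, 2)` gives the size of the store in word-bits.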
At present, the table look-up operation is normally implemented 
using semiconductor bipolar or M.O.S. L.S.I. read-only or read-and-
write memory chips. A typical organisation of a read-only memory 
(R.O.M.) is shown in Fig. 4.1. 
Since the delay time is dependent only on the access time of the
memory store, circuits designed using the look-up technique are
obviously fast in operation and easy to construct, test and maintain.
Furthermore, the architecture of any digital system designed this
way is independent of device technology since the introduction of
memory stores of larger capacity and faster access time will only
result in a more efficient use of the basic system architecture.

[Fig. 4.0. General table look-up arithmetic units for (a) i-valued and (b) binary variables.]

[Fig. 4.1. Functional organisation of a read-only memory (R.O.M.).]
Although R.O.M's have been incorporated in the hardware 
structures of digital filters (recall Section 2.3.3), they are used 
only for the partial computation of the overall filter algorithm. 
Our approach, however, is different, in that we propose to 
implement a complete second-order digital filter section as a 
look-up table. Using equation (2.4) we first precompute the section
output for every combination of present input and past inputs and/or 
outputs. The resulting output values are then written into a suitable 
memory store. In operation, the present input and past inputs and/or 
outputs act as addresses of the memory to access the relevant filter 
output. 
A digital filter implemented in this manner will be termed a 
stored-logic (S.L.) digital filter. 
4.3 Examples of S.L. digital filters.
We now illustrate the approach by deriving the S.L. forms of 
a few typical digital filter-structures. 
4.3.0 Second-order non-recursive section. 
Consider a second-order non-recursive section whose data and
coefficients are represented by 2 bits, and whose coefficient values
are
$$a_0 = 1, \quad a_1 = 3 \quad\text{and}\quad a_2 = 2.$$
We now compute the maximum value of the filter output, $y_{n,\max}$, by
setting each $x_{n-k}$, k = 0,1,2, to its maximum value of 3. Using
equation (2.4) with $b_1 = b_2 = 0$, we find that, since
$$y_{n,\max} = (1 \times 3) + (3 \times 3) + (2 \times 3) = 18,$$
we require 5 bits to represent the filter output.
This filter, which we will label D.F.1, is shown in Fig. 4.2(a),
and its input-output relationship which is to be stored is given
in Table 4.0. The address inputs consist of $x_n$, $x_{n-1}$ and $x_{n-2}$,
and the data to be written into the look-up memory are given in
the last four columns. The corresponding S.L. filter is shown in
Fig. 4.2(b), and requires a 64 x 5 word-bit storage module.
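The stored-logic form of D.F.1 can be mimicked in software as a sketch. The coefficient values 1, 3 and 2 are those implied by the worked maximum above and by Table 4.0; the zero initial state and the names `df1_table` and `df1_filter` are assumptions made for the example.

```python
A0, A1, A2 = 1, 3, 2  # coefficients implied by the worked maximum and Table 4.0


def df1_table():
    """Map each address (x_n, x_{n-1}, x_{n-2}) to the precomputed output y_n."""
    levels = range(4)  # 2-bit data: values 0..3
    return {(x0, x1, x2): A0 * x0 + A1 * x1 + A2 * x2
            for x0 in levels for x1 in levels for x2 in levels}


TABLE = df1_table()


def df1_filter(samples):
    """Run the S.L. filter: each output is a pure table look-up."""
    x1 = x2 = 0  # assumed zero initial contents of the delay flip-flops
    out = []
    for x0 in samples:
        out.append(TABLE[(x0, x1, x2)])
        x1, x2 = x0, x1  # shift the delay line
    return out
```

The table has one word per 6-bit address, and its largest entry fixes the 5-bit word length quoted above.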
4.3.1 First-order recursive section.
Consider the first-order recursive filter labelled D.F.2
shown in Fig. 4.3(a) whose feedback coefficient $b_1 = 5/8 = 0.101_2$.
The input $x_n$ is represented by 2 bits, while 3 bits are used for
$b_1$ and the output $w_n$. Also, the 6-bit product $b_1 \times w_{n-1}$ is
quantised to 3 bits.
The values of $w_n$ for all possible combinations of present
input $x_n$ and past output $w_{n-1}$ are shown in Table 4.1. The S.L.
form of D.F.2 is given in Fig. 4.3(b), in which a store of 32 x 3
word-bits is used.
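The contents of the D.F.2 store can be regenerated with a short sketch. It assumes that the product $b_1 \times w_{n-1}$ is rounded to the nearest integer with halves rounded upward, which is the rule that reproduces the entries of Table 4.1; the function name is hypothetical.

```python
def df2_table():
    """Map each address (w_{n-1}, x_n) to the rounded output w_n (integer units)."""
    # (5*w + 4) // 8 rounds 5w/8 to the nearest integer, halves upward (assumed rule).
    return {(w, x): x + (5 * w + 4) // 8 for w in range(8) for x in range(4)}


DF2 = df2_table()
```

The 3-bit previous output and the 2-bit input together form the 5-bit address, giving the 32 words of the store.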
4.3.2 Second-order autonomous recursive ·section. 
This section D.F.3 is shown in Fig. 4.4(a) in which b l = 2 x 2-2 
and b2 = 3 x 2---2 simplify subsequent analyses, we let the input 
be zero and the past outputs w 1 and w 2 have non-trivial initial 
n- n-
values*. The data and coefficients are represented by 2 bits, while 
the sum of the double-length products, bl x wn_l and b2 x wn_2 , 
* See Appendix 4.0 for a fUrther explanation. 
---- - -- - - --- ---- - -- -----, 
I I 
Xn I 
I 
I I 
xn 
I X ADDER I 
: b f I I 
I . I 
I 0 0 1 1 I I I a 1 X X ala I 1 I 2 
I I L ________ 
I--
- ---- --
___ ...J 
F/F F/F F/F 
I-- flip-flop F/F F/F x x 
'---
- -
n 1 n 2 
(a) 
64word )( 5 bit 
STORAGE 
MODULE 
1\ 
• I-- F/Fs 
(b) 
Fig. 4.2. Conventional (a) and stored-logic (b) 
realisations of D.F.l. 
, 
39 
  Past inputs                                        Present input $x_n$
$x_{n-1}$  $x_{n-2}$   $a_1 x_{n-1}$   $a_2 x_{n-2}$      0     1     2     3
   0         0             0               0            0     1     2     3
   0         1             0               2            2     3     4     5
   0         2             0               4            4     5     6     7
   0         3             0               6            6     7     8     9
   1         0             3               0            3     4     5     6
   1         1             3               2            5     6     7     8
   1         2             3               4            7     8     9    10
   1         3             3               6            9    10    11    12
   2         0             6               0            6     7     8     9
   2         1             6               2            8     9    10    11
   2         2             6               4           10    11    12    13
   2         3             6               6           12    13    14    15
   3         0             9               0            9    10    11    12
   3         1             9               2           11    12    13    14
   3         2             9               4           13    14    15    16
   3         3             9               6           15    16    17    18
                                                       filter output $y_n$
Table 4.0. Input-output relationship of D.F.1
(all data to be represented in binary).
Fig. 4.3. Conventional (a) and stored-logic (b)
realisations of D.F.2.
Fig. 4.4. Conventional (a) and stored-logic (b)
realisations of D.F.3.
Previous output        Present input $x_n$ (x $2^{-3}$)
$w_{n-1}$ (x $2^{-3}$)     0     1     2     3
       0               0     1     2     3
       1               1     2     3     4
       2               1     2     3     4
       3               2     3     4     5
       4               3     4     5     6
       5               3     4     5     6
       6               4     5     6     7
       7               4     5     6     7
                       rounded output $w'_n$
Table 4.1. Input-output relationship of D.F.2.
is quantised to 2 bits. The state-output relationship of D.F.3
is shown in Table 4.2 and its S.L. form is given in Fig. 4.4(b),
which requires a 16 x 2 word-bit store.
4.3.3 Memory storage requirements.
The examples we have discussed demonstrate the implementation
of a few typical filter sections as stored-logic units. This
direct approach suffers from the following problems:
(a) In practical sections a tremendous amount of storage
will be required (a second-order 7-bit non-recursive section, for
example, requires a memory store of over two million words).
(b) Possible redundancies in the stored-table entries are
difficult to determine.
(c) It is also not easy to detect any structure or pattern
that may exist between the stored data.
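The growth in problem (a) follows directly from the address width. As a rough sketch, assuming the address of a direct non-recursive store is formed by concatenating the present input with all past inputs (the helper name `direct_store_words` is illustrative):

```python
def direct_store_words(order, data_bits):
    """Number of words in a direct look-up store for a non-recursive section.

    The address is assumed to hold the present input and `order` past
    inputs, each `data_bits` wide.
    """
    address_bits = (order + 1) * data_bits
    return 2 ** address_bits

words_df1 = direct_store_words(2, 2)     # the 2-bit section D.F.1
words_7bit = direct_store_words(2, 7)    # a second-order 7-bit section
print(words_df1, words_7bit)
```

For the 7-bit case the address is 21 bits wide, which gives the figure of over two million words quoted in (a).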
4.4 F.S.M. models of digital filters.
We will now describe how the S.L. filters (D.F.1-3) that we
discussed in the previous section may be modelled by F.S.M's. In
the traditional approach, the design of a table look-up circuit
is considered to be completed as soon as the input (address) -
output relationship has been determined. We hope to extend the
design problem by analysing look-up tables via F.S.M. models to
achieve a reduction in the memory requirement and a systematic
decomposition procedure for general S.L. filters.
 Past outputs                                   Double-length        Quantised output
 (x $2^{-2}$)      $b_1 w_{n-1}$   $b_2 w_{n-2}$    output $w_n$ (x $2^{-4}$)   $w'_n$ (x $2^{-2}$) by
$w_{n-1}$  $w_{n-2}$   (x $2^{-4}$)    (x $2^{-4}$)     $(w_n)_{10}$   $(w_n)_4$     truncation   round-off
   0        0          0            0             0          00            0            0
   0        1          0            3             3          03            0            1
   0        2          0            6             6          12            1            2
   0        3          0            9             9          21            2            2
   1        0          2            0             2          02            0            1
   1        1          2            3             5          11            1            1
   1        2          2            6             8          20            2            2
   1        3          2            9            11          23            2            3
   2        0          4            0             4          10            1            1
   2        1          4            3             7          13            1            2
   2        2          4            6            10          22            2            3
   2        3          4            9            13          31            3            3
   3        0          6            0             6          12            1            2
   3        1          6            3             9          21            2            2
   3        2          6            6            12          30            3            3
   3        3          6            9            15          33            3          4 -> 3
(In the last row round-off gives 4, which overflows; the maximum register
value, 3, is used.)
Table 4.2. State-output relationship of D.F.3.
4.4.0 F.S.M. model of a general S.L. non-recursive second-order
section.
The above model is derived very simply by redrawing the standard
configuration of the non-recursive filter to that shown in Fig. 4.5
such that it now corresponds to the familiar Mealy machine described
by the 5-tuple $(S, I, O, \delta, \lambda)$ [see Definition 3.0], via the
following mappings:
    $h_s$ :  $X_{n-1} \times X_{n-2} \to S$
             $X_n \to I$
    $h_o$ :  $Y_n \to O$
             $a \to \delta$
             $b \to \lambda$
where $X_{n-i}$, (i = 0,1,2), is the set of all possible values of $x_{n-i}$,
$Y_n$ is the set of all possible values of the filter output $y_n$,
    $a : ((x_{n-1}, x_{n-2}), x_n) \to (x_n, x_{n-1})$
and $b$ is the filter algorithm described by equation (2.4).
Thus, the "internal state" of the F.S.M. filter is represented
by the outputs of the two delay elements.
4.4.0.0 An application.
We now apply the modelling technique to D.F.1, and thus obtain
the flow table shown in Table 4.3, in which the states, represented
by the ordered pairs $(x_{n-1}, x_{n-2})$, have been appropriately labelled.
For simplicity, the corresponding state diagram is drawn only for
the inputs 0 and 2, as shown in Fig. 4.6.
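The flow table can be generated directly from the two mappings of Section 4.4.0. The following Python sketch assumes the D.F.1 coefficients $a_0 = 1$, $a_1 = 3$, $a_2 = 2$ and labels the ordered pairs A to P in lexicographic order. The names `delta`, `lam` and `flow` are illustrative.

```python
from itertools import product
from string import ascii_uppercase

A0, A1, A2 = 1, 3, 2          # assumed D.F.1 coefficients
R = 4                         # number of data levels (2-bit data)

# States are ordered pairs (x_{n-1}, x_{n-2}), labelled A..P in table order.
states = list(product(range(R), repeat=2))
label = {s: ascii_uppercase[i] for i, s in enumerate(states)}

def delta(state, x):
    """Next state: the new input shifts into the delay line."""
    return (x, state[0])

def lam(state, x):
    """Output: the non-recursive filter sum."""
    return A0 * x + A1 * state[0] + A2 * state[1]

# Each row lists (next state, output) for the inputs 0..3.
flow = {label[s]: [(label[delta(s, x)], lam(s, x)) for x in range(R)]
        for s in states}
for name, row in flow.items():
    print(name, row)
```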
Fig. 4.5. F.S.M. model of a general second-order
non-recursive filter.
  Present state                Input                   Input
ordered pair   label     0    1    2    3        0     1     2     3
   0   0         A       A    E    I    M        0     1     2     3
   0   1         B       A    E    I    M        2     3     4     5
   0   2         C       A    E    I    M        4     5     6     7
   0   3         D       A    E    I    M        6     7     8     9
   1   0         E       B    F    J    N        3     4     5     6
   1   1         F       B    F    J    N        5     6     7     8
   1   2         G       B    F    J    N        7     8     9    10
   1   3         H       B    F    J    N        9    10    11    12
   2   0         I       C    G    K    O        6     7     8     9
   2   1         J       C    G    K    O        8     9    10    11
   2   2         K       C    G    K    O       10    11    12    13
   2   3         L       C    G    K    O       12    13    14    15
   3   0         M       D    H    L    P        9    10    11    12
   3   1         N       D    H    L    P       11    12    13    14
   3   2         O       D    H    L    P       13    14    15    16
   3   3         P       D    H    L    P       15    16    17    18
                             Next-state                 Output
Table 4.3. Flow table of F.S.M. equivalent of D.F.1.
Fig. 4.6. State diagrams for the F.S.M. model of D.F.1
with respect to inputs (a) 0 and (b) 2 respectively.
This F.S.M. equivalent of D.F.1 is so "rich" in S.P. partitions
that it is impractical to generate them manually. Instead, a
computer program in Fortran 1900, which was written by one of the
author's colleagues [45, 46], was used. To obtain some idea of the
size of the S.P. partition set, it suffices to say that the program
found 120 basic partitions, while from the first level sums alone,
over 300 partitions were obtained.
It can be seen, however, that the next-state function $\delta$
is the simplest possible, since
    $d_1 = x_n$,  $d_1' = d_2 = x_{n-1}$  and  $d_2' = x_{n-2}$,
where $d_1, d_2$ and $d_1', d_2'$ are the inputs and outputs of the delay
elements $D_1$ and $D_2$ respectively.
Nevertheless, the existence of S.P. partitions is still useful
if some of them are output consistent (O.C.) as well, in which
case it is possible to minimise the F.S.M. It may then be
necessary to code the state variables*.
From Table 4.3 the following is the largest O.C. partition,
    $\tau$ = {A, B, C, DI, HM, J, E, F, G, K, L, N, O, P}.
$\tau$, however, is not S.P., since the blocks DI and HM imply
that AC, EG, IK, MO and BD, FH, JL, NP must be "identified", thus
leading to the partition $\pi$, where
    $\pi$ = {AC, BDIK, FHMO, EG, JL, NP}
and already with this initial implication, $\pi \not\leq \tau$. Thus $\tau$ is not
preserved for inputs, and hence the F.S.M. given in Table 4.3 is
a reduced machine.
* See Appendix 4.1.
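The argument above can be checked mechanically. The sketch below is not the Fortran program mentioned in the text. It is a small Python illustration, assuming the D.F.1 coefficients $a_0 = 1$, $a_1 = 3$, $a_2 = 2$, that groups states by identical output rows, tests the substitution property, and then refines the partition until it is stable. All function names are hypothetical.

```python
from itertools import product
from string import ascii_uppercase

A0, A1, A2 = 1, 3, 2          # assumed D.F.1 coefficients
R = 4
states = list(product(range(R), repeat=2))       # (x_{n-1}, x_{n-2})
name = {s: ascii_uppercase[i] for i, s in enumerate(states)}

def delta(s, x):
    return (x, s[0])

def lam(s, x):
    return A0 * x + A1 * s[0] + A2 * s[1]

def group(keyfunc):
    """Partition the state set into blocks of states sharing the same key."""
    blocks = {}
    for s in states:
        blocks.setdefault(keyfunc(s), []).append(s)
    return list(blocks.values())

def has_sp(partition):
    """True if every block maps into a single block under every input."""
    block_of = {s: i for i, b in enumerate(partition) for s in b}
    return all(len({block_of[delta(s, x)] for s in b}) == 1
               for b in partition for x in range(R))

def reduce_machine():
    """Coarsest O.C. partition with S.P., found by iterative refinement."""
    partition = group(lambda s: tuple(lam(s, x) for x in range(R)))
    while True:
        block_of = {s: i for i, b in enumerate(partition) for s in b}
        refined = group(lambda s: (block_of[s],) + tuple(
            block_of[delta(s, x)] for x in range(R)))
        if len(refined) == len(partition):
            return refined
        partition = refined

tau = group(lambda s: tuple(lam(s, x) for x in range(R)))
merged = sorted("".join(name[s] for s in b) for b in tau if len(b) > 1)
print(merged, has_sp(tau), len(reduce_machine()))
```

Under these assumptions the only multi-state blocks are DI and HM, the partition fails the S.P. test, and the refinement ends with sixteen one-state blocks, in agreement with the conclusion that the machine is already reduced.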
4.4.0.1 State-reduction of the general F.S.M. non-recursive
section.
Consider a general second-order non-recursive filter in
which each $x_{n-i}$ and $a_i$ may assume any value from the set $Z_R$,
where
    $Z_R = \{ x \mid x \text{ integer}, 0 \leq x < R \}$.
The corresponding F.S.M. equivalent will then have R possible
input values, and $R^2$ internal states, which are all the possible
combinations of the ordered-pair $(x_{n-1}, x_{n-2})$. The general form
of the flow table for this F.S.M. is shown in Table 4.4.
Definition 4.0. We define $\tau(i)$ to be the partition
on S, the set of states of the above F.S.M., such that
two ordered-pairs are in the same block of $\tau(i)$ only if
their i-th components are identical.
It is easily seen that $\tau(i)$ consists of R blocks, each
containing R states or ordered-pairs.
Lemma 4.0. $\tau(1)$ has the substitution property.
Proof. Consider any two distinct states, $p_1$ and $p_2$, in the
same block $B_k$ of $\tau(1)$, i.e.
    $p_1 = (k, g)$ and $p_2 = (k, h)$.
Using $\delta$ as defined in Section 4.4.0, the next states of $p_1$ and $p_2$
for any particular input $x_n = j$ are
    $\delta[p_1, j] = \delta[(k, g), j] = (j, k)$ and
    $\delta[p_2, j] = \delta[(k, h), j] = (j, k)$.
Present                        Present input
state            0            1          ...       (R-1)
(0, 0)         (0, 0)       (1, 0)       ...     (R-1, 0)
(0, 1)         (0, 0)       (1, 0)       ...     (R-1, 0)
  ...            ...          ...                   ...
(0, R-1)       (0, 0)       (1, 0)       ...     (R-1, 0)
(1, 0)         (0, 1)       (1, 1)       ...     (R-1, 1)
(1, 1)         (0, 1)       (1, 1)       ...     (R-1, 1)
  ...            ...          ...                   ...
(1, R-1)       (0, 1)       (1, 1)       ...     (R-1, 1)
(2, 0)         (0, 2)       (1, 2)       ...     (R-1, 2)
(2, 1)         (0, 2)       (1, 2)       ...     (R-1, 2)
  ...            ...          ...                   ...
(2, R-1)       (0, 2)       (1, 2)       ...     (R-1, 2)
  ...            ...          ...                   ...
(R-1, 0)       (0, R-1)     (1, R-1)     ...     (R-1, R-1)
(R-1, 1)       (0, R-1)     (1, R-1)     ...     (R-1, R-1)
  ...            ...          ...                   ...
(R-1, R-1)     (0, R-1)     (1, R-1)     ...     (R-1, R-1)
Table 4.4. Flow table for the F.S.M. equivalent of a
general non-recursive second-order filter.
Thus, for a given input, $p_1$ and $p_2$ have the same next state
and consequently no further implication of S.P. partition blocks
is possible. Since $p_1$ and $p_2$ are arbitrary states in $B_k$, it follows
that all the states in $B_k$ will be mapped to the same next state.
Also, as $B_k$ is an arbitrary block, $\tau(1)$ has the
substitution property.
As an example, see the state graphs in Fig. 4.6 for $x_n = 0$
and $x_n = 2$.
Lemma 4.1. Any non-trivial partition $\tau \leq \tau(1)$ cannot
be output-consistent.
Proof. Consider $\tau'$, the smallest form of $\tau$. This will have
one block $b_k$ containing two distinct elements, while the remaining
are just one-element blocks. Let $p_1$ and $p_2$ be defined as in Lemma 4.0.
For a particular input $x_n = q$, the corresponding filter
outputs will be given by
    $y_{n1} = a_0 q + a_1 k + a_2 g$        ...(4.1)
and
    $y_{n2} = a_0 q + a_1 k + a_2 h$        ...(4.2)
If $\tau'$ is output-consistent (O.C.) then we must have $y_{n1} = y_{n2}$
which implies that, since k and q are fixed, $a_2 g = a_2 h$. This is
only possible if g = h. By construction, however, $g \neq h$. Therefore
$\tau'$ is not O.C. Since in general $\tau$ must contain at least one block
with two elements, no $\tau$ can be O.C.
Lemma 4.2. Any non-trivial O.C. partition on S cannot
have the substitution property.
Proof. As a consequence of Lemma 4.1 we see that for any
partition on S to be O.C., any pair of states, $s_1$ and $s_2$, in a
block must have different values of their first components, i.e.
    $s_1 = (k_1, g_1)$ and $s_2 = (k_2, g_2)$, with $k_1 \neq k_2$.
For any particular input $x_n = j$, the j-successors of $s_1$ and $s_2$
are given by
    $\delta[s_1, j] = \delta[(k_1, g_1), j] = (j, k_1)$
and
    $\delta[s_2, j] = \delta[(k_2, g_2), j] = (j, k_2)$.
Consequently, $s_1$ and $s_2$ are mapped to distinct states in the same
block of $\tau(1)$, and hence, by Lemma 4.1, the transitions to next
states of $s_1$ and $s_2$ do not lead to the same output.
This means that any O.C. partition we start with will not be
"preserved" even for the next immediate input. Therefore no O.C.
partition can be S.P.
The previous Lemmas lead naturally to the following Theorem.
Theorem 4.0. For a general second-order non-recursive
digital filter in which each of the data and coefficients
comes from the set $Z_R$, the corresponding F.S.M. model is
already in the minimal form.
4.4.0.2 Partial state reduction.
Although it has now been shown that the F.S.M. equivalent
of a second-order non-recursive section is inherently minimised,
a simplification is still possible.
Suppose we represent the outputs of the F.S.M. equivalent of
D.F.1 shown in Table 4.3 in radix-4 arithmetic, i.e. $(14)_{10}$, say,
is written as $(0, 3, 2)_4$. If we consider only the least significant
digits for the moment, the modified output table shown in Table 4.5
will be obtained. From it, we find the following O.C. partition
$\tau_d$ given by
    $\tau_d$ = {A,C,J,L; B,D,I,K; E,G,N,P; F,H,M,O}
which is also S.P. Thus we may regard the blocks of $\tau_d$ as the
states of the reduced equivalent of the machine, whose output table
is shown in Table 4.6. Furthermore, this reduced machine has the
following useful S.P. partition
    $\pi$ = {Q,R; S,T}.
To realise this reduced machine we require a partition $\tau_a$
such that $\pi.\tau_a = \tau_d$ = {Q,R,S,T}. One such is the non-S.P.
partition {QS, RT}. The initial F.S.M. (which incorporates the
remaining output digits) is now easily implemented by using a
partition $\tau_b$ to distinguish between the states in the blocks of
$\tau_d$, i.e. we require that
    $\tau_d.\tau_b = \pi(0)$ = {A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P}.
The block diagram of the overall realisation is shown in
Fig. 4.7, which results in a saving of about one third of the
nominal storage of the direct form shown in Fig. 4.5.
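The partition on the least significant output digit can be obtained by grouping states with identical digit rows. The sketch below assumes the D.F.1 coefficients $a_0 = 1$, $a_1 = 3$, $a_2 = 2$ and uses illustrative names (`lsd`, `tau_d`, `is_sp`). It also tests the substitution property of the resulting partition.

```python
from itertools import product
from string import ascii_uppercase

A0, A1, A2 = 1, 3, 2          # assumed D.F.1 coefficients
R = 4
states = list(product(range(R), repeat=2))       # (x_{n-1}, x_{n-2})
name = {s: ascii_uppercase[i] for i, s in enumerate(states)}

def delta(s, x):
    return (x, s[0])

def lsd(s, x):
    """Least significant radix-4 digit of the filter output."""
    return (A0 * x + A1 * s[0] + A2 * s[1]) % R

# Group states whose least-significant-digit rows are identical.
blocks = {}
for s in states:
    blocks.setdefault(tuple(lsd(s, x) for x in range(R)), []).append(s)
tau_d = list(blocks.values())
block_of = {s: i for i, b in enumerate(tau_d) for s in b}

# S.P. check: each block must map into one block for every input.
is_sp = all(len({block_of[delta(s, x)] for s in b}) == 1
            for b in tau_d for x in range(R))
tau_d_names = ["".join(name[s] for s in b) for b in tau_d]
print(tau_d_names, is_sp)
```

With these coefficients the four blocks found are the four-state blocks of $\tau_d$ given in the text.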
It is possible to achieve further savings if the component
$\tau_b$ of the F.S.M. in Fig. 4.7 is simplified, by applying similar
analyses to the second and third digits respectively.
Present          Input
state        0    1    2    3
  A          0    1    2    3
  B          2    3    0    1
  C          0    1    2    3
  D          2    3    0    1
  E          3    0    1    2
  F          1    2    3    0
  G          3    0    1    2
  H          1    2    3    0
  I          2    3    0    1
  J          0    1    2    3
  K          2    3    0    1
  L          0    1    2    3
  M          1    2    3    0
  N          3    0    1    2
  O          1    2    3    0
  P          3    0    1    2
Table 4.5. Output (least significant digit) table of D.F.1.
Present             Input
state        0      1      2      3
  Q         Q,0    S,1    R,2    T,3
  R         Q,2    S,3    R,0    T,1
  S         R,3    T,0    Q,1    S,2
  T         R,1    T,2    Q,3    S,0
$\tau_d$ = {A,C,J,L; B,D,I,K; E,G,N,P; F,H,M,O} = {Q,R,S,T}
Table 4.6. Flow table of reduced F.S.M. equivalent of D.F.1.
Fig. 4.7. Cascade realisation of F.S.M. model of D.F.1
(stores of 8 x 1 bits for $\pi$, 16 x 3 bits for $\tau_a$ and 64 x 5 bits
for $\tau_b$; total storage 376 bits).
4.4.1 F.S.M. model of second-order autonomous recursive section.
This is the filter D.F.3 described in Section 4.3.2 (Figs.
4.4(a) and (b)). Its F.S.M. equivalent, shown in Fig. 4.8, is
obtained by the following mappings
    $h_1$ :  $W_{n-1} \times W_{n-2} \to S$
    $h_2$ :  $[W_n]_t$ (or $[W_n]_r$) $\to O$
    $\delta$ :  $[w_{n-1}, w_{n-2}](t_i) \to [w_{n-1}, w_{n-2}](t_i + 1)$
where $[W_n]_t$ and $[W_n]_r$ are the sets of truncated and rounded filter
outputs respectively, and $t_i$ is a particular time instant.
For these two forms of quantisation, the corresponding flow
tables and state graphs are shown in Tables 4.7(a) and (b), and
Figs. 4.9(a) and (b) respectively.
It is interesting to note that the state graphs illustrate
quite clearly the existence of limit cycles. For example, in
Fig. 4.9(a), if the F.S.M. is initiated at state M then, after two
state transitions, the F.S.M. will be alternating between states
J and G, resulting in the periodic output {$w'_n$} = ..., 1, 2, 1, 2, 1, 2, ...
The machine could also settle to a constant amplitude limit cycle
if, for instance, it is started at state N. Then, after three
transitions, with outputs 2,3,3, the machine stays in P with the
corresponding output of 3.
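Both limit cycles can be reproduced by iterating the autonomous recursion. The sketch below is an illustration only. It assumes $b_1 = 2$ and $b_2 = 3$ in units of $2^{-2}$, saturation at the 2-bit maximum on overflow, and writes the state Ø of the tables as the letter O. The names `step` and `run` are hypothetical.

```python
B1, B2 = 2, 3                 # coefficients in units of 2**-2
NAMES = "ABCDEFGHIJKLMNOP"    # state (w_{n-1}, w_{n-2}) -> letter

def step(state, rounding=False):
    """One autonomous transition: returns (next_state, quantised output)."""
    w1, w2 = state
    total = B1 * w1 + B2 * w2             # double-length sum, units of 2**-4
    w = (total + 2) // 4 if rounding else total // 4
    w = min(w, 3)                         # assumed saturation at the maximum
    return (w, w1), w

def run(start, n, rounding=False):
    """State letters visited and outputs produced over n transitions."""
    state = divmod(NAMES.index(start), 4)
    visited, outputs = [], []
    for _ in range(n):
        state, w = step(state, rounding)
        visited.append(NAMES[4 * state[0] + state[1]])
        outputs.append(w)
    return "".join(visited), outputs

print(run("M", 6))
print(run("N", 5))
```

Starting from M the machine passes through H and then alternates between J and G; starting from N it reaches P after three transitions and stays there.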
Fig. 4.8. F.S.M. model of D.F.3.
Fig. 4.9. State graphs for D.F.3 with output (a) truncation
and (b) rounding-off.
                        (a)                    (b)
  Present          Next                   Next
   state           state   Output         state   Output
 0,0    A            A       0              A       0
 0,1    B            A       0              E       1
 0,2    C            E       1              I       2
 0,3    D            I       2              I       2
 1,0    E            B       0              F       1
 1,1    F            F       1              F       1
 1,2    G            J       2              J       2
 1,3    H            J       2              N       3
 2,0    I            G       1              G       1
 2,1    J            G       1              K       2
 2,2    K            K       2              Ø       3
 2,3    L            Ø       3              Ø       3
 3,0    M            H       1              L       2
 3,1    N            L       2              L       2
 3,2    Ø            P       3              P       3
 3,3    P            P       3              P       4
Table 4.7. Flow tables of F.S.M. D.F.3 with output
(a) truncated and (b) rounded-off.
Now consider the F.S.M. described by Table 4.7(a), in which
we find that the largest O.C. partition is $\tau_1$, where
    $\tau_1$ = {ABE, CFIJM, DGHKN, LØP}.
This O.C. partition, however, is not S.P., as may be seen by considering
the pair of states C and F in the second block of $\tau_1$. By applying
the transition function $\delta$, we find that
    $\delta(C, I_0) = E$ and $\delta(F, I_0) = F$
where $I_0$ is the zero input. We see now that E and F are in
different blocks of $\tau_1$. Therefore $\tau_1$ is not preserved.
$\tau_1$, however, may be refined to $\tau_2$,
where $\tau_2$ = {ABE, C, F, K, IJM, DGH, N, LØP}
          = { a, b, c, d, e, f, g, h }
which can be shown to be S.P.
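The refinement of $\tau_1$ into $\tau_2$ can be sketched as repeated splitting of blocks whose members lead to different blocks. The code below transcribes Table 4.7(a), writing the state Ø as the letter O. The helper names `group` and `refine` are illustrative, and the procedure is the standard state-minimisation refinement rather than a method taken from the text.

```python
# Flow table of Table 4.7(a): state -> (next state, truncated output).
FLOW = {
    "A": ("A", 0), "B": ("A", 0), "C": ("E", 1), "D": ("I", 2),
    "E": ("B", 0), "F": ("F", 1), "G": ("J", 2), "H": ("J", 2),
    "I": ("G", 1), "J": ("G", 1), "K": ("K", 2), "L": ("O", 3),
    "M": ("H", 1), "N": ("L", 2), "O": ("P", 3), "P": ("P", 3),
}

def group(key):
    """Blocks of states sharing the same key, written as strings."""
    blocks = {}
    for s in FLOW:
        blocks.setdefault(key(s), []).append(s)
    return ["".join(b) for b in blocks.values()]

def refine(partition):
    """Split blocks whose members lead to different blocks."""
    block_of = {s: b for b in partition for s in b}
    return group(lambda s: (block_of[s], block_of[FLOW[s][0]]))

tau_1 = group(lambda s: FLOW[s][1])      # output-consistent blocks
tau_2 = tau_1
while len(refine(tau_2)) != len(tau_2):
    tau_2 = refine(tau_2)
print(tau_1)
print(sorted(tau_2))
```

One refinement step is enough here: the eight blocks obtained are stable under the transition function.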
Consequently, the F.S.M. in Table 4.7(a) may be reduced to
that shown in Table 4.8, in which some of the possible S.P. partitions
are
    $\pi_1$ = {abcd, efgh}
    $\pi_2$ = {abef, cdgh}
    $\pi_3$ = {abgh, cdef}
    $\pi_4$ = {ab, cd, ef, gh}
One possible realisation of $M_{\tau_2}$, the reduced machine, is to
use $\pi_4$ and $\tau_a$ = {aceg, bdfh}, because
    $\pi_4.\tau_a = \pi(0)$.
The corresponding block diagram is shown in Fig. 4.10, which
requires two 2 x 1 W-b memories and an 8 x 3 W-b memory.
$M_{\tau_2}$ may be simplified further when we assign binary variables
to the internal states. One such assignment is shown in Table 4.9,
which is the binary coded form of Table 4.8. From Table 4.9 we
observe that the next-state columns $Y_2$, $Y_1$ are identical to the
present-state columns $y_2$, $y_1$. Thus we may eliminate two delay
elements and use $y_2$ and $y_1$ as control variables, required only
to specify the initial state of the F.S.M. model of D.F.3.
The final realisation is shown in Fig. 4.11, requiring only
an 8-word store, thus representing a considerable simplification
over the direct form shown in Fig. 4.8.
4.4.2 F.S.M. model of first-order recursive section.
The above filter is D.F.2, which we described in Section 4.3.1
(Figs. 4.3(a) and (b)), and characterised by Table 4.1. By letting
$W_{n-1}$, the set of delayed output values, represent the state set
of the corresponding F.S.M. model we obtain the flow table shown
in Table 4.10. The direct realisation is shown in Fig. 4.12, in
which a 32 x 6 W-b memory is required.
From the state and output table we find that the following
partition $\pi_1$ has S.P. as well as being output-consistent, i.e.
    $\pi_1$ = {A, BC, D, EF, GH}.
This leads to the reduced F.S.M. shown in Table 4.11, which,
since it has five states, still requires three binary variables
in the state coding. Nevertheless, a modest simplification of
Present     Next
state       state     Output
  a           a          0
  b           a          1
  c           c          1
  d           d          2
  e           f          1
  f           e          2
  g           h          2
  h           h          3
Table 4.8. Flow table for $M_{\tau_2}$.
          Present state      Next state
           $y_2$ $y_1$ $y_0$       $Y_2$ $Y_1$ $Y_0$      Output
  a         0  0  0          0  0  0          0
  b         0  0  1          0  0  0          1
  c         0  1  0          0  1  0          1
  d         0  1  1          0  1  1          2
  e         1  0  0          1  0  1          1
  f         1  0  1          1  0  0          2
  g         1  1  0          1  1  1          2
  h         1  1  1          1  1  1          3
Table 4.9. State-assignment of $M_{\tau_2}$.
Fig. 4.10. Cascade realisation of reduced
F.S.M. model of D.F.3.
Fig. 4.11. Final simplified implementation
of D.F.3.
Present          Input (x $2^{-3}$)
state        0      1      2      3
  A         A,0    B,1    C,2    D,3
  B         B,1    C,2    D,3    E,4
  C         B,1    C,2    D,3    E,4
  D         C,2    D,3    E,4    F,5
  E         D,3    E,4    F,5    G,6
  F         D,3    E,4    F,5    G,6
  G         E,4    F,5    G,6    H,7
  H         E,4    F,5    G,6    H,7
Table 4.10. State and output table for
F.S.M. equivalent of D.F.2.
(Output scaling x $2^{-3}$)
Present          Input (x $2^{-3}$)
state        0      1      2      3
  P         P,0    Q,1    Q,2    R,3
  Q         Q,1    Q,2    R,3    S,4
  R         Q,2    R,3    S,4    S,5
  S         R,3    S,4    S,5    T,6
  T         S,4    S,5    T,6    T,7
$\pi_1$ = {A, BC, D, EF, GH}
      = {P, Q, R, S, T}
Table 4.11. Flow table of reduced D.F.2.
Fig. 4.12. Direct F.S.M. model of D.F.2.
Fig. 4.13. Cascade realisation of reduced F.S.M.
equivalent of D.F.2.
this reduced machine is possible by using its sole S.P. partition
$\pi_2$ = {P, QRST} in serial with $\tau_a$ such that
    $\pi_2.\tau_a = \pi(0)$ = {A, BC, D, EF, GH}.
Hence one possible $\tau_a$ is {PQ, R, S, T}. The cascade realisation
of this F.S.M. filter, shown in Fig. 4.13, uses an 8 x 1 W-b memory
and a 32 x 5 W-b memory for its look-up tables.
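The reduction of D.F.2 can be sketched in a few lines. Because the next state of this machine equals its output, two states are equivalent exactly when their rows are identical, so merging equal rows gives the reduced machine. The code assumes round-half-up quantisation of the product (consistent with Table 4.1), and the helper names are illustrative.

```python
def flow_table(k=5, n_inputs=4):
    """Rows of next-state/output values w_n for w_{n-1} = 0..7, b1 = k/8.

    Round-half-up quantisation of the product is assumed; the next state
    equals the output w_n.
    """
    return [[x + (k * w + 4) // 8 for x in range(n_inputs)] for w in range(8)]

def merge_equal_rows(table):
    """Blocks of states with identical rows (same successors and outputs)."""
    blocks = {}
    for w, row in enumerate(table):
        blocks.setdefault(tuple(row), []).append("ABCDEFGH"[w])
    return ["".join(b) for b in blocks.values()]

table = flow_table()
pi_1 = merge_equal_rows(table)
block_of = {s: i for i, b in enumerate(pi_1) for s in b}

# Successor block indices of the reduced machine (0..4 stand for P..T).
reduced = [[block_of["ABCDEFGH"[w]] for w in table["ABCDEFGH".index(b[0])]]
           for b in pi_1]

# pi_2 = {P, QRST}: block 0 alone, blocks 1..4 together.
pi_2_is_sp = all(len({reduced[i][x] > 0 for i in grp}) == 1
                 for grp in ([0], [1, 2, 3, 4]) for x in range(4))
print(pi_1, reduced, pi_2_is_sp)
```

The successor indices agree with the next-state entries of Table 4.11.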
4.4.2.0 Decomposition results for D.F.2 with different feedback
coefficient values.
The same modelling and decomposition techniques that we have
discussed so far will now be applied to the basic first-order recursive
filter section for various values of the feedback coefficient $b_1$,
from $b_1 = (0.001)_2$, i.e. 1/8, to $b_1 = (0.111)_2 = 7/8$. The F.S.M.
equivalent of the section having $b_1 = k/8$ will be labelled $M_k$.
The flow tables for the $M_k$'s, k = 1,2,3,4,6,7, are shown in
Tables 4.12(a) to (f). The number of possible input values is
not the same for all the $M_k$'s because the maximum input in each
F.S.M. is so chosen as to prevent section overflow (see Chapter 2).
Alongside each flow table, the corresponding set of basic
S.P. partitions is given, as well as a subset of those partitions
generated from higher level sums. The partitions in this subset
are chosen for their convenient and useful number of blocks and
block sizes, and are selected by the manual inspection of a very
much larger collection of possible S.P. partitions generated using
the computer program mentioned in Section 4.4.0.0.
The $M_k$'s are first analysed for output-consistent S.P.
Present                       Input (x $2^{-3}$)
state         0      1      2      3      4      5      6
0   A        A,0    B,1    C,2    D,3    E,4    F,5    G,6
1   B        A,0    B,1    C,2    D,3    E,4    F,5    G,6
2   C        A,0    B,1    C,2    D,3    E,4    F,5    G,6
3   D        A,0    B,1    C,2    D,3    E,4    F,5    G,6
4   E        B,1    C,2    D,3    E,4    F,5    G,6    H,7
5   F        B,1    C,2    D,3    E,4    F,5    G,6    H,7
6   G        B,1    C,2    D,3    E,4    F,5    G,6    H,7
7   H        B,1    C,2    D,3    E,4    F,5    G,6    H,7
Useful S.P. partitions:
    $\pi_1$ = {ABCD, EFGH}
    $\pi_2$ = {AB, CD, EF, GH}
    $\pi_3$ = {AC, BD, EG, FH}
    $\pi_4$ = {AD, BC, EH, GF}
(a) Machine $M_1$
Tables 4.12(a) to (f). Flow tables and useful S.P. partitions for $M_k$'s, the
F.S.M. equivalents of first-order recursive filters.
partitions in order to determine the possibility of machine
minimisation. The reduced equivalent machines $m_k$'s are described
by Tables 4.13(a) to (f). These reduced F.S.M's are in turn
analysed for useful S.P. partitions which may lead to parallel
or cascade realisations. These implementations are illustrated
in Figs. 4.14(a) to (f).
As a result of the analysis described above, the following
observations are made:
(i) Machine $M_1$. $\pi_1$ is O.C. Hence $M_1$ is reduced to $m_1$,
(denoted by $M_1 \xrightarrow{R} m_1$), where $m_1$ is a 2-state machine.
(ii) Machine $M_2$. $\pi_9$ is O.C. Therefore we have $M_2 \xrightarrow{R} m_2$,
which is a 3-state machine.
(iii) Machine $M_3$. $M_3 \xrightarrow{R} m_3$, a 4-state machine. Although
$m_3$ possesses the S.P. partition $\pi$ = {PQR, S}, (see Table 4.13(c)),
it is not useful because its largest block contains three states.
Consequently, the successor component alone in the corresponding
cascade realisation will require two binary variables to code its
states.
(iv) Machine $M_4$. The S.P. partition $\pi_1$ = {A, BC, DE, FG, H} is
O.C. Therefore $M_4 \xrightarrow{R} m_4$, a 5-state machine. Also $m_4$ has the
S.P. partition $\pi_2$ = {P, QRST}, and using $\tau$ such that
$\pi_2.\tau = \pi'(0) = \pi_1$, i.e. $\tau$ = {PQ, R, S, T}, the cascade
realisation shown in Fig. 4.14(d) is obtained.
Present                    Input
state        0      1      2      3      4      5
  A         A,0    B,1    C,2    D,3    E,4    F,5
  B         A,0    B,1    C,2    D,3    E,4    F,5
  C         B,1    C,2    D,3    E,4    F,5    G,6
  D         B,1    C,2    D,3    E,4    F,5    G,6
  E         B,1    C,2    D,3    E,4    F,5    G,6
  F         B,1    C,2    D,3    E,4    F,5    G,6
  G         C,2    D,3    E,4    F,5    G,6    H,7
  H         C,2    D,3    E,4    F,5    G,6    H,7
(b) Machine $M_2$
Basic S.P. partitions                      Useful S.P. partitions
$\pi_1$ = {AB, C, D, E, F, G, H}             $\pi_9$ = {AB, CDEF, GH}
$\pi_2$ = {A, B, CD, E, F, G, H}             $\pi_{10}$ = {AB, CD, EF, GH}
$\pi_3$ = {A, B, CE, D, F, G, H}             $\pi_{11}$ = {AB, CE, DF, GH}
$\pi_4$ = {A, B, CF, D, E, G, H}             $\pi_{12}$ = {AB, CF, DE, GH}
$\pi_5$ = {A, B, C, DE, F, G, H}
$\pi_6$ = {A, B, C, DF, E, G, H}
$\pi_7$ = {A, B, C, D, EF, G, H}
$\pi_8$ = {A, B, C, D, E, F, GH}
Present             Input
state        0      1      2      3
  A         A,0    B,1    C,2    D,3
  B         B,1    C,2    D,3    E,4
  C         B,1    C,2    D,3    E,4
  D         C,2    D,3    E,4    F,5
  E         C,2    D,3    E,4    F,5
  F         D,3    E,4    F,5    G,6
  G         D,3    E,4    F,5    G,6
  H         E,4    F,5    G,6    H,7
(d) $M_4$
Present          Input
state        0      1      2
  A         A,0    B,1    C,2
  B         B,1    C,2    D,3
  C         C,2    D,3    E,4
  D         C,2    D,3    E,4
  E         D,3    E,4    F,5
  F         E,4    F,5    G,6
  G         F,5    G,6    H,7
  H         F,5    G,6    H,7
(e) $M_6$
S.P. partitions of $M_6$:
$\pi_1$ = {A, BCDEFGH}               $\pi_4$ = {A, B, C, D, E, F, GH}
$\pi_2$ = {A, B, CD, E, F, G, H}     $\pi_5$ = {A, B, CD, E, F, GH}
$\pi_3$ = {A, B, CDEFGH}
Present       Input
state        0      1
  A         A,0    B,1
  B         B,1    C,2
  C         C,2    D,3
  D         D,3    E,4
  E         E,4    F,5
  F         E,4    F,5
  G         F,5    G,6
  H         G,6    H,7
(f) $M_7$
Present                              Input
state            0      1      2      3      4      5      6
ABCD -> P       P,0    P,1    P,2    P,3    Q,4    Q,5    Q,6
EFGH -> Q       P,1    P,2    P,3    Q,4    Q,5    Q,6    Q,7
(a) $m_1$
Present                          Input
state            0      1      2      3      4      5
AB   -> P       P,0    P,1    Q,2    Q,3    Q,4    Q,5
CDEF -> Q       P,1    Q,2    Q,3    Q,4    Q,5    R,6
GH   -> R       Q,2    Q,3    Q,4    Q,5    R,6    R,7
(b) $m_2$
Present                      Input
state            0      1      2      3      4
AB   -> P       P,0    P,1    Q,2    Q,3    R,4
CD   -> Q       P,1    Q,2    Q,3    R,4    R,5
EFG  -> R       Q,2    Q,3    R,4    R,5    R,6
H    -> S       Q,3    R,4    R,5    R,6    S,7
(c) $m_3$
Present                  Input
state            0      1      2      3
A    -> P       P,0    Q,1    Q,2    R,3
BC   -> Q       Q,1    Q,2    R,3    R,4
DE   -> R       Q,2    R,3    R,4    S,5
FG   -> S       R,3    R,4    S,5    S,6
H    -> T       R,4    S,5    S,6    T,7
(d) $m_4$
Present              Input
state            0      1      2
A    -> P       P,0    Q,1    R,2
B    -> Q       Q,1    R,2    R,3
CD   -> R       R,2    R,3    S,4
E    -> S       R,3    S,4    T,5
F    -> T       S,4    T,5    U,6
GH   -> U       T,5    U,6    U,7
(e) $m_6$
Present           Input
state            0      1
A    -> P       P,0    Q,1
B    -> Q       Q,1    R,2
C    -> R       R,2    S,3
D    -> S       S,3    T,4
EF   -> T       T,4    T,5
G    -> U       T,5    U,6
H    -> V       U,6    V,7
(f) $m_7$
Tables 4.13(a) to (f). Reduced equivalent machines
$m_k$'s of $M_k$'s.
(v) Machine $M_6$. $\pi_5$ is O.C. Hence $M_6 \xrightarrow{R} m_6$, a 6-state machine.
As $m_6$ does not possess any useful S.P. partition, only a direct
realisation is possible, as shown in Fig. 4.14(e).
(vi) Machine $M_7$. Its reduced equivalent $m_7$, which contains seven
states, is obtained using the S.P. partition {A, B, C, D, EF, G, H}.
This reduced machine $m_7$ possesses no useful S.P. partitions.
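The state counts of the reduced machines in the observations above can be cross-checked with a short calculation. This is a sketch under two assumptions: the product $b_1 w_{n-1}$ is quantised by round-half-up, and the next state equals the output, so that two states are equivalent exactly when their rounded feedback terms coincide. The function name is hypothetical.

```python
def reduced_state_count(k):
    """Number of states of the reduced machine for b1 = k/8 (3-bit data).

    Round-half-up quantisation is assumed. Because the next state equals
    the output, two states are equivalent exactly when their rounded
    feedback terms coincide.
    """
    return len({(k * w + 4) // 8 for w in range(8)})

counts = {k: reduced_state_count(k) for k in range(1, 8)}
print(counts)
```

The counts grow with k, which is the trend described for increasing feedback coefficient values.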
4.5 Discussion.
Some interesting features of the F.S.M. models of non-recursive
and recursive stored-logic digital filters have been brought out
as a consequence of our analysis.
We see that with the second-order non-recursive filter, although
its F.S.M. equivalent is already in the minimal form, further
simplifications are possible if the filter output is represented
as a multi-digit number with each digit regarded as a separate
output for analysis.
With the autonomous 2-bit second-order recursive section, the
direct realisation in Fig. 4.8 is simplified quite considerably,
using S.P. partitions, to that shown in Fig. 4.11. In this example,
there are still useful S.P. partitions after the state minimisation
process. One of the problems encountered when the complete section
is analysed directly is that for the same filter but with different
values of the coefficients $b_1$ and $b_2$, the corresponding flow tables
and state graphs are considerably different from one another.
Consider, for example, when the coefficients are $b_1 = 3 \times 2^{-2}$
and $b_2 = 2 \times 2^{-2}$. The state-output relationship is given by
Table 4.14,
Figs. 4.14(a) - (f). Implementations of reduced machines $m_k$'s.
 Past outputs                                  Double-length     Quantised output
 (x $2^{-2}$)      $b_1 w_{n-1}$   $b_2 w_{n-2}$    output $w_n$       $w'_n$ by
$w_{n-1}$  $w_{n-2}$   (x $2^{-4}$)    (x $2^{-4}$)     (x $2^{-4}$)     truncation   round-off
   0        0          0            0             0             0            0
   0        1          0            2             2             0            1
   0        2          0            4             4             1            1
   0        3          0            6             6             1            2
   1        0          3            0             3             0            1
   1        1          3            2             5             1            1
   1        2          3            4             7             1            2
   1        3          3            6             9             2            2
   2        0          6            0             6             1            2
   2        1          6            2             8             2            2
   2        2          6            4            10             2            3
   2        3          6            6            12             3            3
   3        0          9            0             9             2            2
   3        1          9            2            11             2            3
   3        2          9            4            13             3            3
   3        3          9            6            15             3            3
Table 4.14. State-output relationship of second-order
autonomous recursive section D.F.4 with
coefficients $b_1$ = 3 and $b_2$ = 2.
and the flow table and partial state graph of the F.S.M. model
are shown in Tables 4.15(a) and (b) and Figs. 4.15(a) and (b)
respectively. It is seen directly that they are very different
in structure to Tables 4.7(a) and (b) and Figs. 4.9(a) and (b).
The same dependence of state structure on filter coefficient
values is also true for first-order recursive filters, as evidenced
by the variety of different realisations shown in Fig. 4.14(a) to
(f). We also note that simplifications of the F.S.M. models of the
first-order sections are mainly due to state reductions. Among
the machines analysed only $M_4$ and $M_5$ have reduced equivalents, $m_4$
and $m_5$, that could be simplified further via S.P. partitions.
Furthermore, it is also observed that state reduction becomes
increasingly difficult with increasing values of $b_1$, the feedback
coefficient. This is illustrated in Fig. 4.16(a). A similar result
is also obtained when the word-length is increased to 4 bits (see
the graph in Fig. 4.16(b)).
One difficult problem with both types of recursive digital
filters is the inherent non-linearity of the system as a result of
output and state quantisation, either by truncation or round-off.
As an example, consider the reduced F.S.M. model of the first-
order section whose flow table is given in Table 4.11, and two
input sequences {$q_1$} and {$q_2$} given by
    {$q_1$} = 1, 0, 0, 0, ...
and
    {$q_2$} = 2, 0, 0, 0, ...
                        (a)                       (b)
  Present          Next    Truncated         Next    Rounded
   state           state    output           state    output
 0,0    A            A        0                A        0
 0,1    B            A        0                E        1
 0,2    C            E        1                E        1
 0,3    D            E        1                I        2
 1,0    E            B        0                F        1
 1,1    F            F        1                F        1
 1,2    G            F        1                J        2
 1,3    H            J        2                J        2
 2,0    I            G        1                K        2
 2,1    J            K        2                K        2
 2,2    K            K        2                Ø        3
 2,3    L            Ø        3                Ø        3
 3,0    M            L        2                L        2
 3,1    N            L        2                P        3
 3,2    Ø            P        3                P        3
 3,3    P            P        3                P        4
Table 4.15. Flow table of F.S.M. equivalent of filter
shown in Table 4.14.
Fig. 4.15. State graphs for D.F.4 with output (a) truncation
and (b) rounding-off.
Fig. 4.16. Effect of coefficient values on state reduction
of F.S.M. models of (a) a 3-bit and (b) a 4-bit
recursive filters. (Number of states plotted against $b_1$, in
units of $2^{-3}$ in (a) and $2^{-4}$ in (b), for rounding-off and
truncation.)
The corresponding state transitions and output sequences for
initial state T, say, are
    {$s_1$} = S, R, Q, Q, ..., Q, ...
    {$o_1$} = 5, 3, 2, 1, 1, ..., 1, ...
and
    {$s_2$} = T, S, R, Q, Q, ..., Q, ...
    {$o_2$} = 6, 4, 3, 2, 1, 1, ..., 1, ...
Let {$q_3$} be the sum of the input sequences {$q_1$} and {$q_2$}, i.e.
    {$q_3$} = 3, 0, 0, ...
The corresponding state and output sequences, {$s_3$} and {$o_3$},
are given by
    {$s_3$} = T, S, R, Q, Q, ...
    {$o_3$} = 7, 4, 3, 2, 1, 1, ...
Clearly {$o_3$} is not the sum of {$o_1$} and {$o_2$}.
The consequence of this non-linear effect is that any result
of the analysis of recursive digital filters with short word-lengths
cannot be easily generalised to large word-length recursive sections.
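The failure of superposition in this example can be reproduced by stepping through the flow table of Table 4.11. The table below is transcribed from the text; the function name `respond` and the sequence length of five samples are illustrative choices.

```python
# Table 4.11: state -> list of (next state, output) for inputs 0..3.
REDUCED_DF2 = {
    "P": [("P", 0), ("Q", 1), ("Q", 2), ("R", 3)],
    "Q": [("Q", 1), ("Q", 2), ("R", 3), ("S", 4)],
    "R": [("Q", 2), ("R", 3), ("S", 4), ("S", 5)],
    "S": [("R", 3), ("S", 4), ("S", 5), ("T", 6)],
    "T": [("S", 4), ("S", 5), ("T", 6), ("T", 7)],
}

def respond(inputs, state="T"):
    """Output sequence produced by an input sequence from a given state."""
    outputs = []
    for x in inputs:
        state, y = REDUCED_DF2[state][x]
        outputs.append(y)
    return outputs

o1 = respond([1, 0, 0, 0, 0])
o2 = respond([2, 0, 0, 0, 0])
o3 = respond([3, 0, 0, 0, 0])
superposition_holds = [a + b for a, b in zip(o1, o2)] == o3
print(o1, o2, o3, superposition_holds)
```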
4.6 Conclusions.
In general, the structure theory of finite-state sequential
machines is conceptually attractive in the simplification of digital
filters realised as stored-logic units. In practice the main problem
is to generalise the method such that it may be applied, without
resorting to exhaustive manual or even computer search, to digital
filters of both the recursive and non-recursive types, having
varying word-lengths and coefficients. Even if a complete listing
of S.P. partitions is possible, it is still extremely difficult
to select the best subset of these partitions which will lead to
a good realisation. Furthermore, it is not desirable to have to
perform a complete analysis for every different filter specification.
In view of this the non-recursive section appears to be the most
promising candidate for a general analysis.
APPENDIX 4.0
In the example in Section 4.3.2, the input to D.F.3 is assumed
to be zero in order to simplify the subsequent analysis. The non-
trivial initial values of the ordered-pair $(w_{n-1}, w_{n-2})$ may be set
up by presetting the relevant delay registers. Alternatively, one
can assume that prior to our analysis, D.F.3 has received the
appropriate input sequence to 'send' the filter to a particular
initial ordered-pair. As illustrated in Fig. A.4.0, a unique input
sequence can always be found to connect the trivial state ordered-
pair (0,0) to any other ordered-pair.
Consequently, in the state diagram shown in Fig. 4.9(a), the
starting states C, M, D, F, K and N which lead to limit-cycle oscillations
may be reached from the trivial state (0,0), i.e. A, by the application
of the input sequences {2, -1}, {3}, {3, -1}, {1, 1}, {2, 1} and
{1, 3} respectively.
The above discussion assumes that the output is truncated, but
the treatment is similar when the output is rounded-off instead.
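The input sequences quoted above can be checked by driving the recursion from the trivial state. The sketch below assumes $b_1 = 2$ and $b_2 = 3$ in units of $2^{-2}$, output truncation, and an input added to the quantised feedback sum. The name `drive` is hypothetical, and the state Ø is written as the letter O.

```python
B1, B2 = 2, 3                 # D.F.3 coefficients in units of 2**-2
NAMES = "ABCDEFGHIJKLMNOP"    # (w_{n-1}, w_{n-2}) -> letter

def drive(inputs, state=(0, 0)):
    """Apply an input sequence to D.F.3 (output truncation) from `state`."""
    for x in inputs:
        w1, w2 = state
        w = x + (B1 * w1 + B2 * w2) // 4   # input plus truncated feedback
        state = (w, w1)
    return NAMES[4 * state[0] + state[1]]

sequences = [[2, -1], [3], [3, -1], [1, 1], [2, 1], [1, 3]]
reached = [drive(seq) for seq in sequences]
print(reached)
```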
Fig. A.4.0. State diagram of D.F.3 for non-trivial input sequences
of length $\leq$ 2 (with output truncation).
APPENDIX 4.1
Consider a general state ordered-pair $(s_1, s_2)$ of the F.S.M.
model of D.F.1, and let its x-successor be $(s_1', s_2')$. From the
discussion in Section 4.4.0.0 we can easily see that $s_1' = x$ and
$s_2' = s_1$. Therefore the next-state function $\delta$ simply consists of
the two identity mappings $x \to s_1'$ and $s_1 \to s_2'$.
Thus in practice the implementation of $\delta$ consists of the direct
connection of x to $s_1$ and $s_1$ to $s_2$ via the two delay registers.
Suppose now there exists an n-block O.C. partition which has
also S.P. Then the F.S.M. equivalent of D.F.1 may be reduced to an
n-state machine. In such a case some combinational logic may be
required for the $\delta$ state transition mapping.
CHAPTER 5 
PARTITION STRUCTURES 
OF 
STORED-LOGIC ARITHMETIC CIRCUITS
5.0 Introduction. 
In the previous chapter we encountered the limitation
of the direct modelling of the complete digital filter as a
finite-state sequential machine. To resolve some of the questions
that were brought out there, we investigate in this chapter the
application of S.P. partition techniques to the analysis of the
arithmetic units that make up the filter algorithm. It is hoped
that an insight into the algebraic structure of the overall section
will be gained as a result of knowing the partition structures of
its component units.
A general F.S.M. model is first introduced which will then be
used as a basis for the structural analysis of N-bit adder and
N-bit by N-bit multiplier modules.
5.1 F.S.M. model of a general arithmetic circuit.
Consider the case of an arithmetic function, g, of two variables
or operands A and B, where
$$A, B = Z_M, \qquad Z_M = \{x : x \text{ integer},\ 0 \le x \le M-1\}.$$
For our applications, the range of g is $Z_C$, where
$$Z_C = \{x : x \text{ integer},\ 0 \le x \le C-1\},$$
$C-1$ being the maximum value of g(A,B).
This function is represented by the combinational or stored-
logic circuit enclosed in the broken lines in Fig. 5.0.
The corresponding F.S.M. model for this arithmetic circuit
is obtained by first separating g(A,B) into two components $G_\ell$
and $G_u$ such that the elements of the former are identical to those
of one of the operands, say B. For completeness we regard $G_\ell$ to
be linked to B by an imaginary feedback. The mappings below follow
naturally.
$$h_1 : A \to I$$
$$h_2 : B, G_\ell \to S$$
$$h_3 : G_u \to O$$
$$h_4 : g_u \to \lambda$$
$$h_5 : g_\ell \to \delta$$
where $g_u : A \times B \to G_u$ and $g_\ell : A \times B \to G_\ell$,
and g is now written as $g : A \times B \to (G_u, G_\ell)$.
Thus, an arithmetic function g described by the four-tuple
$(A, B, Z_C, g)$ may now be modelled by an F.S.M. $(S, I, O, \delta, \lambda)$.*
* A discussion on the motivation behind and the theoretical 
constraints of the above model is given in Appendix 5.0. 
Fig. 5.0. F.S.M. model of a general arithmetic circuit.
Fig. 5.1. An F.S.M. radix-$2^N$ 'half-adder'.
5.2 Radix-$2^N$ adders.
Conventionally, when two numbers are to be added, binary
arithmetic is invariably used, and N-bit additions are realised
using N modulo 2 adders (i.e. the familiar half and full adders)
connected in cascade. One of the disadvantages is that the final
sum is obtained only after the internally generated "carries"
have propagated through the whole word-length.
The trend towards the widespread use of large-scale integrated
(L.S.I.) digital circuits is leading to the hardware design of
arithmetic circuits based on radices greater than 2. The immediate
consequences are the reduction of packages, the simplification of
interconnections, and a relatively fast circuit operation because
"carries" are now between groups of digits, the size of the group
depending on the radix used and the degree of parallelism required.
We will study here the specific case when the radix is of the
form $2^N$, N a non-zero integer. Using the general model in Fig. 5.0,
the F.S.M. model of a radix-$2^N$ "half adder" is easily derived
by letting
$$A = I, \qquad B = S, \qquad C_0 = G_\ell, \qquad C_1 = G_u,$$
where $C_0$ is the modulo $2^N$ sum of A and B, $C_1$ is the carry-out
of the "half adder", and A and B are the two N-bit numbers that are
to be added.
The block diagram of this radix-$2^N$ half-adder is shown in
Fig. 5.1.
5.2.0 Example.
Consider the addition of two 3-bit numbers A and B (i.e. N = 3).
The corresponding modulo $2^3$ sum and the carry tables are shown in
Tables 5.0(a) and (b) respectively. These tables may now be
regarded as the state and output tables, respectively, of the F.S.M.
equivalent of this radix-$2^3$ half-adder, and implemented using
memory modules as look-up tables as shown in Fig. 5.2. The sum
and carry circuits require a 64 x 3 and a 64 x 1 W-b memory store
respectively.
This direct implementation, however, will not be practical for
operands having large word lengths.
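The direct realisation can be sketched in a few lines of Python. This is only an illustration of the look-up idea, not the hardware of Fig. 5.2; the function names are hypothetical, and the tables are indexed with B as the present state (row) and A as the input (column), which is an assumed layout.

```python
def half_adder_tables(n_bits):
    """Return (sum_table, carry_table) for a radix-2**n_bits half-adder.

    Assumed layout: table[b][a], with B the present state and A the input.
    """
    size = 1 << n_bits
    sum_table = [[(a + b) % size for a in range(size)] for b in range(size)]
    carry_table = [[(a + b) // size for a in range(size)] for b in range(size)]
    return sum_table, carry_table


def direct_storage_bits(n_bits):
    """Word-bits of the direct look-up stores as (sum store, carry store)."""
    words = 1 << (2 * n_bits)      # one word per (A, B) pair
    return words * n_bits, words   # N-bit sum words, 1-bit carry words


sum_table, carry_table = half_adder_tables(3)
print(sum_table[5][6], carry_table[5][6])   # B = 5, A = 6 -> 3 1
print(direct_storage_bits(3))               # -> (192, 64)
```

For N = 3 this gives the 64 x 3 and 64 x 1 stores quoted above; the word count grows as $2^{2N}$, which is why the direct form becomes impractical for long words.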
5.2.0.0 S.P. partitions of radix-$2^3$ half-adder.
For the moment consider only the modulo $2^3$ sum table (i.e.
Table 5.0(a)) that is realised by the state machine shown in
Fig. 5.2, with the elements of A and B being regarded as the set
of machine inputs and internal states respectively.
This F.S.M. possesses the following S.P. partitions,
$\pi_1$ = {0,2,4,6; 1,3,5,7} and $\pi_2$ = {0,4; 2,6; 1,5; 3,7}.
A cascade realisation is thus possible using either $\pi_1$ or $\pi_2$
in conjunction with a non-S.P. partition $\tau_1$ or $\tau_2$ respectively,
such that
$$\pi_1 \cdot \tau_1 = \pi_2 \cdot \tau_2 = \pi(0), \text{ the zero partition.}$$
Possible values of $\tau_1$ and $\tau_2$ are
$\tau_1$ = {0,1; 2,3; 4,5; 6,7} and $\tau_2$ = {0,1,2,3; 4,5,6,7}.
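The claims above can be checked mechanically. The following Python sketch tests the substitution property and forms partition products for the modulo $2^3$ adder; the helper names are hypothetical and the representation of a partition as a list of sets is an assumption made for the illustration.

```python
def block_index(partition):
    """Map each state to the index of the block that contains it."""
    return {s: i for i, block in enumerate(partition) for s in block}


def has_substitution_property(partition, next_state, inputs):
    """True if every input maps each block into a single block."""
    index = block_index(partition)
    for block in partition:
        for x in inputs:
            if len({index[next_state(s, x)] for s in block}) != 1:
                return False
    return True


def product(p1, p2):
    """Partition product: the non-empty pairwise block intersections."""
    blocks = [set(a) & set(b) for a in p1 for b in p2]
    return sorted(sorted(b) for b in blocks if b)


def add_mod8(state, x):
    return (state + x) % 8


pi1 = [{0, 2, 4, 6}, {1, 3, 5, 7}]
pi2 = [{0, 4}, {2, 6}, {1, 5}, {3, 7}]
tau1 = [{0, 1}, {2, 3}, {4, 5}, {6, 7}]
tau2 = [{0, 1, 2, 3}, {4, 5, 6, 7}]
zero = [[s] for s in range(8)]

print(has_substitution_property(pi1, add_mod8, range(8)))   # True
print(has_substitution_property(tau1, add_mod8, range(8)))  # False
print(product(pi1, tau1) == zero, product(pi2, tau2) == zero)
```

Both products reduce to singleton blocks, i.e. the zero partition, while only $\pi_1$ and $\pi_2$ pass the S.P. test.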
Table 5.0. (a) Modulo $2^3$ sum and (b) carry output tables
for radix-$2^3$ "half-adder".
Fig. 5.2. Direct memory realisation of a radix-$2^3$ 'half-adder'.
Fig. 5.3. Cascade memory realisations of modulo $2^3$
adder using (a) $\pi_1$, $\tau_1$ and (b) $\pi_2$, $\tau_2$.
Also, the machine input set may be partitioned in a similar way.
The state tables for the component machines of the realisation
using $\pi_1$ and $\tau_1$ are shown in Tables 5.1(a) and (b) while those of
the realisation using $\pi_2$ and $\tau_2$ are given in Tables 5.2(a) and (b).
The corresponding block diagrams of these two possible cascade
realisations are shown in Figs. 5.3(a) and (b), with corresponding
memory storage of {(4 x 1) + (64 x 2)} W-b and {(16 x 2) + (64 x 1)}
W-b respectively, i.e. 132 and 96 bits. (It is useful to note
here the advantages of using a successor component having as few
blocks as possible.)
A much better realisation, however, will be to use $\pi_2$ and $\tau_2$
to obtain $\pi(0)$, with $\pi_2$, in its turn, being derived from $\pi_1$ and
$\tau_1'$, where
$$\pi_1 \cdot \tau_1' = \pi_2.$$
The state table for this 'successor' component is drawn in
Table 5.3, and the overall realisation of the modulo $2^3$ addition
using $\pi_1$, $\tau_1'$ and $\tau_2$ is shown in Fig. 5.4. This realisation uses
three memory modules, of overall capacity 84 bits, and compares
favourably with the two realisations discussed previously. Each
memory circuit is a single-output store and the interconnection
pattern between the memory stores is highly regular.
This particular form is known as a loop-free implementation
and will now be discussed in detail.
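The three storage figures quoted above follow from the block counts alone. A minimal sketch, assuming each store is addressed by one code per listed block set and holds just enough bits to encode its output blocks (the function name and argument layout are hypothetical):

```python
def store_bits(address_blocks, output_blocks):
    """Word-bits of one look-up store.

    address_blocks: block counts of every variable addressing the store.
    output_blocks:  number of blocks the stored word must distinguish.
    """
    words = 1
    for n in address_blocks:
        words *= n
    bits_per_word = (output_blocks - 1).bit_length()
    return words * bits_per_word


# (a) pi1 predecessor (2 blocks) followed by tau1 successor (4 blocks)
realisation_a = store_bits([2, 2], 2) + store_bits([2, 2, 4, 4], 4)
# (b) pi2 predecessor (4 blocks) followed by tau2 successor (2 blocks)
realisation_b = store_bits([4, 4], 4) + store_bits([4, 4, 2, 2], 2)
# loop-free form: pi1, then tau1' (2 blocks), then tau2 (2 blocks)
loop_free = (store_bits([2, 2], 2)
             + store_bits([2, 2, 2, 2], 2)
             + store_bits([4, 4, 2, 2], 2))

print(realisation_a, realisation_b, loop_free)   # 132 96 84
```

The address lists pair each state-side block set with the matching input-side block set, since the input set is partitioned in the same way as the state set.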
Table 5.1. State tables for component machines of cascade
realisation shown in Fig. 5.3(a).

$\pi_1$ = {0,2,4,6; 1,3,5,7} = {A; B} = {I; J}
$\tau_1$ = {0,1; 2,3; 4,5; 6,7} = {a; b; c; d} = {i; j; k; ℓ}

(a)
                 Input
                 I   J
    state   A    A   B
            B    B   A

(b)
    Augmented input
      state block    A A A A   A A A A   B B B B   B B B B
      input block    I I I I   J J J J   I I I I   J J J J
      input block    i j k ℓ   i j k ℓ   i j k ℓ   i j k ℓ
    state   a        a b c d   a b c d   a b c d   b c d a
            b        b c d a   b c d a   b c d a   c d a b
            c        c d a b   c d a b   c d a b   d a b c
            d        d a b c   d a b c   d a b c   a b c d
                 Input
                 E   F   G   H
    state   P    P   Q   R   S
            Q    Q   P   S   R
            R    R   S   Q   P
            S    S   R   P   Q

$\pi_2$ = {0,4; 2,6; 1,5; 3,7} = {P; Q; R; S} = {E; F; G; H}
$\tau_2$ = {0,1,2,3; 4,5,6,7} = {p; q} = {e; f}

Table 5.2(a). State table of predecessor component of cascade
realisation shown in Fig. 5.3(b).
    Augmented input
      state block    P P P P P P P P   Q Q Q Q Q Q Q Q
      input block    E E F F G G H H   E E F F G G H H
      input block    e f e f e f e f   e f e f e f e f
    state   p        p q p q p q p q   p q q p p q q p
            q        q p q p q p q p   q p p q q p p q

    Augmented input
      state block    R R R R R R R R   S S S S S S S S
      input block    E E F F G G H H   E E F F G G H H
      input block    e f e f e f e f   e f e f e f e f
    state   p        p q p q p q q p   p q q p q p q p
            q        q p q p q p p q   q p p q p q p q

Table 5.2(b). State table of successor component of cascade
realisation in Fig. 5.3(b).
$\tau_1'$ = {0,4,1,5; 2,6,3,7} = {M; N} = {m; n}
$\pi_2$ (see Table 5.2(a)) = {P; Q; R; S} = {E; F; G; H}
Hence $\tau_1'$ = {P,R; Q,S} = {E,G; F,H}.
Table 5.3. State table for successor component of the machine
realisation of $\pi_2$ from $\pi_1$ and $\tau_1'$.
Fig. 5.4. Loop-free memory realisation of modulo $2^3$
adder using $\pi_1$, $\tau_1'$ and $\tau_2$.
Fig. 5.5. Generalised realisation of modulo $2^N$ adders.
5.2.1 The general modulo $2^N$ adder.
The decomposition technique in the example may be generalised
for any value of N by using the following result, the proof of
which may be found in pages 379-380 of Reference 71.
Theorem 5.0. If there exists a set of S.P. partitions
$\{\pi_1, \pi_2, \ldots, \pi_n\}$ for an F.S.M. M such that
$\pi_1 \geq \pi_2 \geq \cdots \geq \pi_n$ and $\pi_n = \pi(0)$, then M is
realisable as a serial loop-free connection of n components
$m_1, m_2, \ldots, m_n$ in which $m_i$ is a predecessor of $m_j$
iff $\pi_i \geq \pi_j$. All the components operate concurrently.
5.2.1.0 Generation of S.P. partitions.
We have seen in Chapter 3 that the generation of all possible
S.P. partitions is initiated by identifying (i.e. putting in the same
block) all possible pair combinations of the states.
For the case of modulo $2^N$ adders, however, it is sufficient
to consider only the identification of 0 and the integer d, where
$$d \in N' = \{1, 2, \ldots, 2^N - 1\}.$$
This is because the first row and the first column of the
modulo $2^N$ addition table merely duplicate the inputs and the
present states respectively. Furthermore, each successive column,
going from column 1 to column $2^N - 1$, is identical to its predecessor
except that the top entry is shifted to the bottom, and every entry
is shifted up by a unit step.
Thus, the identification of 0 and d automatically implies the
identification of $(0 + k)$ and $(0 + k) + d$ for all k where
$1 \le k \le 2^N - 1$. As a consequence, all elements that are d units
apart will be identified, and hence for an arbitrary $a_i$,
$0 \le a_i \le 2^N - 1$, $a_i$ and $a_i + kd$ (modulo $2^N$) will be in the
same partition block.
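The identification rule just described amounts to collecting, for each starting state, every state reachable in steps of d modulo $2^N$. A minimal Python sketch (the function name is hypothetical):

```python
def partition_from_identification(n_bits, d):
    """Blocks obtained by identifying 0 with d in the modulo 2**n_bits adder."""
    size = 1 << n_bits
    seen, blocks = set(), []
    for start in range(size):
        if start in seen:
            continue
        block, s = [], start
        while s not in block:          # step round the circle d units at a time
            block.append(s)
            s = (s + d) % size
        seen.update(block)
        blocks.append(sorted(block))
    return blocks


print(partition_from_identification(3, 2))   # [[0, 2, 4, 6], [1, 3, 5, 7]]
print(partition_from_identification(3, 4))   # [[0, 4], [1, 5], [2, 6], [3, 7]]
print(partition_from_identification(3, 3))   # [[0, 1, 2, 3, 4, 5, 6, 7]]
```

For N = 3 the choices d = 2 and d = 4 reproduce the two S.P. partitions of the example, while an odd d collapses all states into a single block.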
The following lemmas will now be proved. (A useful aid to
the proofs is to regard the $a_i$'s to be placed consecutively on the
circumference of a circle, the 'distance' between $a_i$ and $a_{i+1}$
being of 'unit' length.)
Lemma 5.0. If d is odd, there are no S.P. partitions
apart from the trivial ones $\pi(I)$ and $\pi(0)$, the 'identity'
and 'zero' partitions respectively.
Proof. If d = 1, then all state elements that are a unit
distance from each other will be identified, thus leading to a
partition block which contains all the state elements, in other
words $\pi(I)$.
Consider now the general case in which $d = (2q + 1)$,
$1 \le q \le 2^{N-1} - 1$.
For an arbitrary $a_i$, any state of the form $a_i + kd$, k = 1, 2, ...,
will be identified with it. That this will eventually lead to
$\pi(I)$ is clearly seen by the following observation.
Consider the case when, starting from $a_i$ and going around the
circle, $a_i$ is picked up again after k steps of d units each, i.e.
$$a_i + k(2q + 1) \equiv a_i \pmod{2^N},$$
i.e.
$$(2q + 1)k \equiv 0 \pmod{2^N}.$$
As the above implies that $2^N$ divides $(2q + 1)k$, and also
$(2q + 1)$, being odd, is relatively prime to $2^N$, then $2^N$ must
divide k, i.e.
$$k = g\,2^N, \qquad g = 0, 1, \ldots$$
Since k > 0, then g must be greater than zero, and hence the
first solution for k is when g = 1, leading to $k = 2^N$. Therefore
$2^N$ different states will be identified before any starting state
is repeated.
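As a worked instance of this argument (the values N = 3, d = 3 and $a_i = 0$ are chosen here purely for illustration), stepping round the circle in steps of 3 gives

$$0 \to 3 \to 6 \to 1 \to 4 \to 7 \to 2 \to 5 \to 0 \pmod{2^3},$$

so all $2^3 = 8$ states are visited before 0 recurs, the first return occurs at $k = 8 = 2^N$, and the resulting partition is $\pi(I)$.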
Lemma 5.1. If $d = 2^p$, p = 1, 2, 3, ..., N, there exists
an S.P. partition $\pi_p$. Any $\pi_p$, derived from $d = 2^p$,
contains $2^p$ blocks of equal size, and if the elements in
any one block are arranged in ascending magnitude, adjacent
elements will differ by $2^p$ units.
Proof. Following a similar argument as in Lemma 5.0, we obtain
$$(2^p)k \equiv 0 \pmod{2^N},$$
i.e.
$$k = \frac{g\,2^N}{2^p}.$$
As $2^p$ always divides $2^N$, repetition of any initial state can occur
before the full cycle of $2^N$ steps can be completed.
It follows that the number of elements, $m(\pi_p)$, in one block
of $\pi_p$ is given by
$$m(\pi_p) = \frac{2^N}{2^p} = 2^{N-p},$$
and the number of blocks of $\pi_p$, $\#(\pi_p)$, is given by
$$\#(\pi_p) = \frac{\text{Total number of states}}{\text{Number of elements in a partition block}} = \frac{2^N}{2^{N-p}} = 2^p.$$
Lemma 5.2. Let d, D be two integers, $1 < d, D \le 2^N$.
If d divides D, then $\pi_d \geq \pi_D$.
Proof. Let $a_i$ and $a_j$ be two elements in a block of $\pi_D$.
Then, by construction, we have
$$a_j \equiv a_i + kD \pmod{2^N}.$$
Since D is divisible by d, i.e. $D = \ell d$, $\ell$ an integer, then
$$a_j \equiv a_i + k\ell d \pmod{2^N},$$
implying that $a_i$ and $a_j$ are also contained in a block of $\pi_d$.
5.2.1.1 Loop-free realisation of adders modulo $2^N$.
The following theorem follows naturally from the three lemmas
we discussed in the previous section.
Theorem 5.1. The F.S.M. model of a general modulo $2^N$
adder possesses N S.P. partitions $\pi_1, \pi_2, \ldots, \pi_p, \ldots, \pi_N$
such that
$$\pi_1 \geq \pi_2 \geq \cdots \geq \pi_p \geq \cdots \geq \pi_N.$$
In the implementation of the adder, any of these partitions
$\pi_p$ can be used with any non-S.P. partition $\tau_p$, so long as
$$\pi_p \cdot \tau_p = \pi(0).$$
A more economical implementation, however, will be to use
all the S.P. partitions in a systematic way as follows.
Consider $\pi_{N-1}$. A valid realisation will be to use $\tau_{N-1}$
given by
$$\pi_{N-1} \cdot \tau_{N-1} = \pi(0).$$
From Lemma 5.1 we obtain
$$m(\pi_{N-1}) = 2.$$
Hence $\tau_{N-1}$ will have to be a 2-block partition in order to
distinguish between the elements of each block of $\pi_{N-1}$.
$\pi_{N-1}$, in turn, is realised from $\pi_{N-2}$ in the same manner,
i.e. using again another 2-block partition $\tau_{N-2}$, such that
$$\pi_{N-2} \cdot \tau_{N-2} = \pi_{N-1}.$$
By repeating this procedure for the remaining S.P. partitions,
we arrive at the following iterative relationship,
$$\pi(0) = \tau_{N-1} \cdot \pi_{N-1}$$
$$\vdots$$
$$\pi_p = \tau_{p-1} \cdot \pi_{p-1}$$
$$\vdots$$
$$\pi_2 = \tau_1 \cdot \pi_1$$
Consequently, one can implement an adder modulo $2^N$ as a set
of loop-free interconnected component machines as described in
Theorem 5.0. As these sub-machines operate concurrently, there
is no carry propagation at all.
If the 'input' A is assigned the same binary code as for
the 'present state' B, the resulting hardware implementation using
memory modules is as shown in Fig. 5.5.
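A software analogue of this loop-free form can be sketched as follows. It assumes the natural binary coding for both A and B, so that the blocks of $\pi_p$ correspond to the p least significant bits; under that assumption store i is addressed by the i+1 low-order bits of each operand and holds one sum bit. The function names and the dictionary-based stores are hypothetical.

```python
def build_stores(n_bits):
    """One single-output store per sum bit (natural binary coding assumed)."""
    stores = []
    for i in range(n_bits):
        width = i + 1
        mask = (1 << width) - 1
        table = {}
        for a in range(1 << width):
            for b in range(1 << width):
                table[(a, b)] = ((a + b) >> i) & 1   # bit i of the sum
        stores.append((mask, table))
    return stores


def add_loop_free(stores, a, b):
    """Read every store independently and assemble the sum bits."""
    total = 0
    for i, (mask, table) in enumerate(stores):
        total |= table[(a & mask, b & mask)] << i
    return total


stores = build_stores(3)
print(add_loop_free(stores, 6, 5))                  # (6 + 5) mod 8 = 3
print(sum(len(table) for _, table in stores))       # 4 + 16 + 64 = 84
```

Each store is read directly from the operand bits, so no store waits on the output of another, which mirrors the absence of carry propagation noted above.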
5.2.1.2 Memory storage reduction.
It will now be shown that the storage required for the loop-
free form of adders modulo $2^N$ is considerably less than that
required in the direct realisation.
If $M_o$ is the memory storage* of the direct form and $M_r$ the
overall storage of all the sub-machines, then
$$M_o = N \cdot 2^{2N},$$
where N is the word-length of each of the operands A and B.
* For simplicity of subsequent explanation, the unit for memory
storage is understood to be 'word-bits'.
Since A and B are coded in the same way, then
$$M_r = 2^2 + 2^4 + 2^6 + \cdots + 2^{2N} \qquad (5.0)$$
$$\phantom{M_r} = p + p^2 + p^3 + \cdots + p^N \qquad (5.1)$$
where $p = 2^2$. If we multiply (5.1) by p, we get
$$pM_r = p^2 + p^3 + p^4 + \cdots + p^{N+1} \qquad (5.2)$$
By subtracting (5.1) from (5.2), we obtain
$$M_r(p - 1) = p^{N+1} - p$$
or
$$M_r = \frac{p(p^N - 1)}{p - 1} = \frac{4}{3}\,(2^{2N} - 1).$$
Thus, the reduction ratio is given by
$$R = \frac{M_r}{M_o},$$
and if $2^{2N} \gg 1$, then we have
$$R \approx \frac{4}{3} \cdot \frac{1}{N},$$
which is a considerable reduction. Also, as can be seen from the
graph in Fig. 5.6, this reduction improves, i.e. R becomes smaller,
as the word length N is increased.
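The closed form and the approximation can be compared numerically with a short Python sketch; the function names are hypothetical, and storage is counted in word-bits as in the text.

```python
def direct_bits(n_bits):
    """Direct form: 2**(2N) words of N bits each."""
    return n_bits * 4 ** n_bits


def loop_free_bits(n_bits):
    """Loop-free form: single-bit stores of 4, 16, ..., 4**N words."""
    return sum(4 ** k for k in range(1, n_bits + 1))


for n in (2, 4, 8, 16):
    exact = loop_free_bits(n) / direct_bits(n)
    approx = 4 / (3 * n)
    print(n, round(exact, 4), round(approx, 4))
```

For N = 3 the two functions return 192 and 84 word-bits, in agreement with the example of Section 5.2.0.0.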
5.2.2 Generation of the carry digit.
When A and B are added modulo $2^N$, a table can be drawn to
show when a carry digit has to be generated. One such table for
N = 3 is shown in Table 5.0(b), which is also the output table
for the F.S.M. model.
For a general N, let the rows and columns of the output
table be denoted by i and j respectively $(i, j = 0, 1, 2, \ldots, 2^N - 1)$.
It was observed that below the diagonal described by the entries
$$(i, k) \quad \text{for all } i = 0, 1, \ldots, 2^N - 1 \text{ and } k = (2^N - 1) - i,$$
the table entries are all '1's. Again this is clearly illustrated
in Table 5.0(b). This fact suggests a simple method of realising
the output table.
A carry is generated, i.e. $C_1 = 1$, only if
$$A + B > 2^N - 1, \quad \text{i.e.} \quad A > (2^N - 1) - B.$$
Fig. 5.6. Effect of loop-free decomposition on overall memory
storage. (Axes: ratio $M_r/M_o$ against word length N in bits.)
Fig. 5.7. Carry-out circuit of radix-$2^3$ 'half-adder'.
Fig. 5.8. Stored-logic realisation of a radix-$2^N$ 'full-adder'.
Since the right hand side of the inequality is simply the one's
complement of B, then if
$$B = \sum_{\ell=0}^{N-1} b_\ell\, 2^\ell,$$
we can write this one's complement B' as
$$B' = \sum_{\ell=0}^{N-1} \bar{b}_\ell\, 2^\ell,$$
where $b_\ell$ = 0 or 1 and $\bar{b}_\ell$ is the logical negation of $b_\ell$.
Thus the output table may be realised using an N-bit inverting
network and a standard N-bit M.S.I. binary comparator. This form
of realisation for N = 3 is shown in Fig. 5.7.
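The inverter-plus-comparator rule can be checked exhaustively in software. A minimal sketch, with a hypothetical function name, in which the bitwise complement stands in for the inverting network and the `>` test for the comparator:

```python
def carry_by_comparator(a, b, n_bits):
    """Carry-out of A + B from a one's complement and a magnitude compare."""
    mask = (1 << n_bits) - 1
    b_complement = ~b & mask            # N-bit inverting network: (2**N - 1) - B
    return 1 if a > b_complement else 0


n = 3
agrees = all(
    carry_by_comparator(a, b, n) == (a + b) >> n
    for a in range(1 << n)
    for b in range(1 << n)
)
print(agrees)   # True
```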
Of course, the generation of the carry output may be
incorporated in the general loop-free design as discussed in
Section 5.2.1.1 by regarding the addition to be modulo $2^{N+1}$ instead
of $2^N$ and assuming the last input bit and state bit to be at a
constant '0' value.
5.2.3 Addition of "carry-in" digit.
For a full radix-$2^N$ adder design, the "carry-in" digit from
the previous full adder must be incorporated. If $S_N$ is the modulo $2^N$
sum of A and B and $C_i$ the "carry-in" digit, then the addition of
$S_N$ and $C_i$ is carried out in exactly the same way as described in
Section 5.2.1.
This time, however, since $C_i$ is only a one-bit variable, the
required storage $M_r$ is much less and is given by
$$M_r = 2^2 + 2^3 + \cdots + 2^{N+1}.$$
Therefore, the corresponding ratio $R = M_r/M_o$ is given by
$$R \approx \frac{2}{N}.$$
The carry-out circuit for this part is relatively simple
since a carry is generated only if $S_N = 2^N - 1$ and $C_i = 1$. Thus
we would require only an (N+1)-input AND gate. The block diagram
of the complete full adder is shown in Fig. 5.8.
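The two-stage structure can be sketched in Python as below. The way the two stage carries are merged (a logical OR) is an assumption made for this illustration; the text specifies only the AND-gate condition for the second stage. The function name is hypothetical.

```python
def full_adder(a, b, c_in, n_bits):
    """Radix-2**n_bits full adder built from two modulo-2**n_bits stages."""
    size = 1 << n_bits
    # Stage 1: half-adder on A and B.
    s_n = (a + b) % size
    c1 = (a + b) // size
    # Stage 2: add the one-bit carry-in to S_N.
    s = (s_n + c_in) % size
    c2 = 1 if (s_n == size - 1 and c_in == 1) else 0   # (N+1)-input AND
    # Assumed combination: the two carries are never 1 together, so OR suffices.
    return s, c1 | c2


print(full_adder(5, 2, 1, 3))   # 5 + 2 + 1 = 8 -> (0, 1)
print(full_adder(7, 7, 1, 3))   # 7 + 7 + 1 = 15 -> (7, 1)
```

The OR is safe because a first-stage carry implies $S_N \le 2^N - 2$, so the second-stage condition cannot hold at the same time.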
5.3 Radix-$2^N$ parallel multipliers.
We will now investigate the modelling of a parallel N x N
bit multiplier by an F.S.M. Two models will be presented, the
first being the straightforward application of the general model
shown in Fig. 5.0, while the second is derived by regarding the
radix-$2^N$ full multiplication as being equivalent to two
multiplications, modulo $2^N$ and modulo $2^N - 1$ respectively, operating
in parallel.
5.3.0 Example.
Consider the multiplication of two 3-bit numbers A and B,
giving a 6-bit product P. The direct look-up table is given in
Table 5.4, and the corresponding memory module implementation,
requiring 384 storage bits, is shown in Fig. 5.9.
The F.S.M. model of this multiplier is obtained by separating
the direct table into two simpler component tables as shown in
Tables 5.5(a) and (b). The latter is simply a modulo $2^3$
multiplication table, while the former consists of the values for
the most significant 3 bits of the product P. These Tables (a) and
(b) may now be regarded as the output and state tables of the
equivalent F.S.M. respectively.
5.3.1 The general N x N bit multiplier.
The F.S.M. model of the general N x N bit parallel multiplier
will now be derived.
When two N-bit numbers, A and B, are multiplied, the result
P is a 2N-bit product, i.e. if
$$A = \sum_{i=0}^{N-1} a_i\, 2^i \quad \text{and} \quad B = \sum_{i=0}^{N-1} b_i\, 2^i,$$
where $a_i$ = 0 or 1 and $b_i$ = 0 or 1, then
$$A \times B = P = \sum_{j=0}^{2N-1} p_j\, 2^j, \qquad p_j = 0 \text{ or } 1. \qquad (5.3)$$
This product P can be expressed as a 2-digit number in the radix
$2^N$ as follows:
$$P = \sum_{j=0}^{2N-1} p_j\, 2^j = P_1\, 2^N + P_0,$$
where
$$P_1 = \sum_{m=0}^{N-1} p_{N+m}\, 2^m \qquad (5.4)$$
and
$$P_0 = \sum_{k=0}^{N-1} p_k\, 2^k. \qquad (5.5)$$
                           A
            0   1   2   3   4   5   6   7
       0    0   0   0   0   0   0   0   0
       1    0   1   2   3   4   5   6   7
       2    0   2   4   6   8  10  12  14
    B  3    0   3   6   9  12  15  18  21
       4    0   4   8  12  16  20  24  28
       5    0   5  10  15  20  25  30  35
       6    0   6  12  18  24  30  36  42
       7    0   7  14  21  28  35  42  49

Table 5.4. Direct multiplication table for a 3 bit x 3 bit
parallel multiplier.
Fig. 5.9. A 3-bit parallel stored-logic multiplier.
Fig. 5.10. F.S.M. model of a 3-bit parallel multiplier.
(a)
                           A
            0   1   2   3   4   5   6   7
       0    0   0   0   0   0   0   0   0
       1    0   0   0   0   0   0   0   0
       2    0   0   0   0   1   1   1   1
    B  3    0   0   0   1   1   1   2   2
       4    0   0   1   1   2   2   3   3
       5    0   0   1   1   2   3   3   4
       6    0   0   1   2   3   3   4   5
       7    0   0   1   2   3   4   5   6

(b)
                           A
            0   1   2   3   4   5   6   7
       0    0   0   0   0   0   0   0   0
       1    0   1   2   3   4   5   6   7
       2    0   2   4   6   0   2   4   6
    B  3    0   3   6   1   4   7   2   5
       4    0   4   0   4   0   4   0   4
       5    0   5   2   7   4   1   6   3
       6    0   6   4   2   0   6   4   2
       7    0   7   6   5   4   3   2   1

Table 5.5. (a) Output and (b) state tables for the F.S.M. model
of a 3-bit parallel multiplier.
The direct look-up table represented by equation (5.3) is
now separated into two simpler tables, the $P_1$ and $P_0$ tables
represented by equations (5.4) and (5.5), where $P_1$ is the N
most significant bits of the real product P, and $P_0$ is the N-bit
modulo $2^N$ product. The F.S.M. model is obtained by regarding
$P_1$ and $P_0$ as the $G_u$ and $G_\ell$ respectively of the general model
as shown in Fig. 5.0.
5.3.1.0 Example.
For the 3-bit multiplier we discussed before, let A = 7
and B = 5. Thus we have
A x B = 7 x 5 = 35
      = (4 x 8) + 3.
Here $P_1$ = 4 and $P_0$ = 3. This particular operation is illustrated
in Fig. 5.10.
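The separation into output and state tables can be reproduced with a short Python sketch (function names hypothetical; tables indexed as table[b][a], an assumed layout):

```python
def split_product(a, b, n_bits):
    """Return (P1, P0): the upper and lower N-bit halves of A x B."""
    return divmod(a * b, 1 << n_bits)


def multiplier_tables(n_bits):
    """Output (P1) and state (P0) tables of the F.S.M. multiplier model."""
    size = 1 << n_bits
    output = [[(a * b) >> n_bits for a in range(size)] for b in range(size)]
    state = [[(a * b) % size for a in range(size)] for b in range(size)]
    return output, state


output_table, state_table = multiplier_tables(3)
print(split_product(7, 5, 3))                   # (4, 3)
print(output_table[5][7], state_table[5][7])    # 4 3
```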
5.3.2 Decomposition of the F.S.M. multiplier.
Considering the $P_0$ table given in Table 5.5(b), the following
S.P. partitions are found,
$\pi_1$ = {0,2,4,6; 1,3,5,7},
$\pi_2$ = {0,4; 1,5; 2,6; 3,7}.
Following a similar argument to that used in deriving the
loop-free realisation of adders modulo $2^N$, the complete multiplier
is realised by using two non-S.P. partitions $\tau_1$ and $\tau_2$ such that
$$\pi_1 \cdot \tau_1 = \pi_2 \quad \text{and} \quad \pi_2 \cdot \tau_2 = \pi(0).$$
Possible values for $\tau_1$ and $\tau_2$ are
$\tau_1$ = {0,4,1,5; 2,6,3,7},
$\tau_2$ = {0,1,2,3; 4,5,6,7}.
The $P_0$ table may first be rearranged according to $\pi_1$ and $\pi_2$,
as shown in Tables 5.6(a) and (b), and the $\pi_1$ and $\pi_2$ images of the
F.S.M. multiplier, viz. $M_{\pi_1}$ and $M_{\pi_2}$ respectively, are obtained
by considering operations only between partition blocks. The
corresponding 'state' tables are shown in Tables 5.7(a) and (b).
Also, the two successor components derived from the non-S.P.
partitions $\tau_1$ and $\tau_2$ are shown in Tables 5.8 and 5.9 respectively.
It is interesting to note that $M_{\pi_1}$ and $M_{\pi_2}$ are isomorphic
to a modulo 2 and a modulo 4 multiplier respectively. This
observation leads to the following theorem.
Theorem 5.2. A partition $\pi_p$ that has S.P. for a modulo
$2^N$ addition table also has S.P. for a modulo $2^N$
multiplication table, where $\pi_p$ is defined as in Lemma 5.1.
Proof. Let $a_i$ and $a_j$ be two elements in a block of $\pi_p$ and
assume that $a_j > a_i$. Then we have
$$a_j - a_i = kd, \qquad d = 2^p.$$
Multiplying $a_i$ and $a_j$ by an arbitrary element b, $0 \le b \le 2^N - 1$,
we get
$$c_j \equiv b \times a_j, \qquad c_i \equiv b \times a_i \pmod{2^N},$$
(a)
     x  |  0  2  4  6  |  1  3  5  7
    ----+--------------+-------------
     0  |  0  0  0  0  |  0  0  0  0
     2  |  0  4  0  4  |  2  6  2  6
     4  |  0  0  0  0  |  4  4  4  4
     6  |  0  4  0  4  |  6  2  6  2
    ----+--------------+-------------
     1  |  0  2  4  6  |  1  3  5  7
     3  |  0  6  4  2  |  3  1  7  5
     5  |  0  2  4  6  |  5  7  1  3
     7  |  0  6  4  2  |  7  5  3  1

(b)
     x  |  0  4  |  1  5  |  2  6  |  3  7
    ----+--------+--------+--------+-------
     0  |  0  0  |  0  0  |  0  0  |  0  0
     4  |  0  0  |  4  4  |  0  0  |  4  4
    ----+--------+--------+--------+-------
     1  |  0  4  |  1  5  |  2  6  |  3  7
     5  |  0  4  |  5  1  |  2  6  |  7  3
    ----+--------+--------+--------+-------
     2  |  0  0  |  2  2  |  4  4  |  6  6
     6  |  0  0  |  6  6  |  4  4  |  2  2
    ----+--------+--------+--------+-------
     3  |  0  4  |  3  7  |  6  2  |  1  5
     7  |  0  4  |  7  3  |  6  2  |  5  1

Table 5.6. The modulo $2^3$ multiplication table organised
by (a) $\pi_1$ and (b) $\pi_2$.
(a) $M_{\pi_1}$
                 Input
                 E   F
    state   A    A   A
            B    A   B

$\pi_1$ = {0,2,4,6; 1,3,5,7} = {A; B} = {E; F}
$\tau_1$ = {0,4,1,5; 2,6,3,7} = {a; b} = {e; f}

(b) $M_{\pi_2}$
                 Input
                 I   J   K   L
    state   P    P   P   P   P
            Q    P   Q   R   S
            R    P   R   P   R
            S    P   S   R   Q

$\pi_2$ = {0,4; 1,5; 2,6; 3,7} = {P; Q; R; S} = {I; J; K; L}
$\tau_2$ = {0,1,2,3; 4,5,6,7} = {p; q} = {m; n}

Table 5.7. State tables for the (a) $\pi_1$ and (b) $\pi_2$ images of
a 3-bit F.S.M. multiplier.
    Augmented input
      state block    A A A A   B B B B
      input block    E E F F   E E F F
      input block    e f e f   e f e f
    state   a        a a a a   a b a b
            b        a a b b   a b b a

Table 5.8. State table of successor machine $M_{\tau_1}$.

Also
$\pi_1$ = {A; B} = {E; F}
     = {P,R; Q,S} = {I,K; J,L}
$\tau_1$ = {a; b} = {e; f}
     = {P,Q; R,S} = {I,J; K,L}
P → (A, a)    Q → (B, a)
R → (A, b)    S → (B, b)
    Augmented input
      state block    P P P P P P P P   Q Q Q Q Q Q Q Q
      input block    I I J J K K L L   I I J J K K L L
      input block    m n m n m n m n   m n m n m n m n
    state   p        p p p p p p p p   p q p q p q p q
            q        p p q q p p q q   p q q p p q q p

    Augmented input
      state block    R R R R R R R R   S S S S S S S S
      input block    I I J J K K L L   I I J J K K L L
      input block    m n m n m n m n   m n m n m n m n
    state   p        p p p p q q q q   p q p q q p p q
            q        p p q q q q p p   p q q p q p q p

Table 5.9. State table of successor machine $M_{\tau_2}$.
i.e.
$$b \times a_j = q_j\, 2^N + c_j \quad \text{and} \quad b \times a_i = q_i\, 2^N + c_i.$$
Hence
$$c_j - c_i = b(a_j - a_i) + (q_i - q_j)\, 2^N = b(kd) + (q_i - q_j)\, 2^N. \qquad (5.6)$$
Since $d = 2^p$ and $2^N$ is divisible by $2^p$, the R.H.S. of equation
(5.6) is divisible by $2^p$. Consequently $2^p$ divides $c_j - c_i$, i.e.
we may write
$$c_j \equiv c_i\ (\pi_p), \quad \text{i.e.} \quad \delta(a_j, b) \equiv \delta(a_i, b)\ (\pi_p).$$
Thus $c_j$ and $c_i$, the 'next states' of $a_j$ and $a_i$ respectively,
are in the same block of $\pi_p$ and therefore $\pi_p$ has S.P.
It must be mentioned here that the above theorem does not
mean that the S.P. partitions for the modulo $2^N$ multiplication
are confined only to those having the form of $\pi_p$. The
partition {e; 1,2,3,4,5,6,7}, for instance, has S.P.
Eventually, however, since one has to consider the interconnection
of both adders and multipliers, the use of compatible S.P.
partitions for both will eliminate the need for coding and decoding
between partitions having different structures in terms of the
number of partition blocks and their sizes.
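Theorem 5.2 can be checked exhaustively for small word-lengths. The sketch below assumes, as in Lemma 5.1, that the blocks of $\pi_p$ are the residue classes modulo $2^p$, so two states share a block exactly when they agree modulo $2^p$; the function name is hypothetical.

```python
def pi_p_has_sp_for_multiplication(n_bits, p):
    """Check that pi_p (residue classes mod 2**p) has S.P. under x mod 2**n_bits."""
    size, d = 1 << n_bits, 1 << p
    for b in range(size):                  # every machine input
        for a in range(size):
            a_next = (a + d) % size        # neighbour of a in the same block
            # Successors must again agree modulo 2**p.
            if ((a * b) % size) % d != ((a_next * b) % size) % d:
                return False
    return True


print(all(pi_p_has_sp_for_multiplication(3, p) for p in (1, 2, 3)))   # True
print(all(pi_p_has_sp_for_multiplication(4, p) for p in (1, 2, 3, 4)))  # True
```

Checking only neighbours d apart is sufficient, because every pair of states in a block is linked by a chain of such neighbours.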
The $P_1$ or 'output' table is more difficult to analyse than the
$P_0$ or 'state' table. This is because it contains no two rows that
are identical, so there are no output-consistent partitions,
which might otherwise lead to a state reduction. There is also no
obvious internal structure or pattern that we can exploit.
Consequently, the problem is also difficult to generalise to an
arbitrary N.
Although the $P_1$ or output function can be implemented using
conventional combinational logic techniques,* a better method
is presented in the following section.
5.3.3 Improved model of N-bit parallel multiplier.
A model will now be derived for the N x N bit multiplication
which results in both $P_0$ and $P_1$ tables having regular algebraic
structures.
The product P is first written in two different ways, as a
2-digit number in the radices $2^N$ and $2^N - 1$ respectively, i.e.
$$P = P_1\, 2^N + P_0 \qquad (5.7)$$
and
$$P = Q_1 (2^N - 1) + Q_0. \qquad (5.8)$$
The implementation of $P_0$ has already been discussed and that
of $Q_0$ will be analysed in detail in Chapter 7. Our immediate
problem now is to determine $P_1$ in equation (5.7) knowing only
$P_0$ and $Q_0$.
Equation (5.7) may be subtracted from equation (5.8) to give
$$P_1\, 2^N - Q_1 (2^N - 1) = Q_0 - P_0. \qquad (5.9)$$
The L.H.S. of (5.9) may be written as
$$P_1\, 2^N - P_1 + P_1 - Q_1 (2^N - 1) = P_1 (2^N - 1) + P_1 - Q_1 (2^N - 1). \qquad (5.10)$$
* Another obvious solution is to regard the N x N bit multiplication
as that of modulo $2^{2N}$, i.e. one simply extends the range of "modulo
multiplication" by letting N → 2N.
Substituting (5.10) into (5.9), we get
$$P_1 = (Q_0 - P_0) + (Q_1 - P_1)(2^N - 1) = R_0 + K(2^N - 1), \qquad (5.11)$$
where
$$R_0 = Q_0 - P_0 \quad \text{and} \quad K = Q_1 - P_1.$$
The maximum value of P, $P_{max}$ say, is given by
$$P_{max} = A_{max} \times B_{max} = (2^N - 1)(2^N - 1),$$
since A, B are N-bit numbers. Thus
$$P_{max} = (2^N - 1)\,2^N - (2^N - 1) = (2^N - 1)\,2^N - 2^N + 1 = (2^N - 2)\,2^N + 1.$$
Hence
$$P_{1\,max} = 2^N - 2.$$
Using equation (5.11), since $P_1 \le P_{1\,max}$, then
$$R_0 + K(2^N - 1) \le 2^N - 2. \qquad (5.12)$$
Since by definition $0 \le P_0 \le 2^N - 1$ and $0 \le Q_0 \le 2^N - 2$, then,
since $R_0 = Q_0 - P_0$, it is at its maximum value when
$Q_0 = 2^N - 2$ and $P_0 = 0$.
Substituting $R_{0\,max}$ into equation (5.12), we obtain
$$K = 0.$$
Similarly, $R_0$ is minimum when $Q_0 = 0$ and $P_0 = 2^N - 1$,
and hence
$$R_{0\,min} = -(2^N - 1).$$
Since the minimum value of $P_1$ is 0, then K in equation (5.11)
must be 1. Consequently, equation (5.11) may be written either as
$$P_1 = R_0 = Q_0 - P_0 \qquad (5.13)$$
or
$$P_1 = R_0 + 1 \times (2^N - 1) = Q_0 - P_0 + (2^N - 1). \qquad (5.14)$$
Equation (5.14) is more general and will 'cover' equation (5.13),
because adding $(2^N - 1)$ to any number x, $0 \le x \le 2^N - 2$, will not
alter its value provided any end-round carry is taken into account,
since
$$x + (2^N - 1) = 2^N + (x - 1),$$
where $2^N$ is now the overflow or carry bit. Adding this to the least-
significant bit of $(x - 1)$, we obtain
$$(x - 1) + 1 = x.$$
Therefore, the general expression for $P_1$, the most significant
half of the product P, is given by
$$P_1 = Q_0 - P_0 + (2^N - 1) = Q_0 + [(2^N - 1) - P_0],$$
i.e.
$$P_1 = Q_0 + P_0', \qquad (5.14)$$
where $P_0'$ is the additive inverse of $P_0$ with respect to $(2^N - 1)$.
We have now, in effect, modelled the N x N bit multiplication
as two simpler multiplications in parallel, viz. that of modulo $2^N$
and modulo $(2^N - 1)$. Consequently, the original F.S.M. multiplier
can be regarded as two simpler F.S.M. multipliers operating in
parallel.
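The parallel model can be exercised in software as follows. This is a sketch under stated assumptions: the two modular products are computed arithmetically rather than read from stores, the function name is hypothetical, and the all-ones adder output is read as zero, which is one way of handling the second representation of zero that an end-round-carry adder produces.

```python
def multiply_improved(a, b, n_bits):
    """Return (P1, P0) using only the modulo 2**N and modulo (2**N - 1) products."""
    size = 1 << n_bits
    ones = size - 1                    # 2**N - 1

    p0 = (a * b) % size                # modulo 2**N multiplier
    q0 = (a * b) % ones                # modulo (2**N - 1) multiplier

    p0_inverse = ones - p0             # inverter with respect to 2**N - 1
    total = q0 + p0_inverse            # N-bit adder ...
    if total >= size:                  # ... with end-round carry
        total = total - size + 1
    if total == ones:                  # assumed: all ones is the second zero
        total = 0
    return total, p0


print(multiply_improved(7, 5, 3))      # (4, 3), i.e. 35 = 4 x 8 + 3
```

Since $P_1$ never exceeds $2^N - 2$, the all-ones pattern cannot be a genuine result, so treating it as zero loses nothing.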
The corresponding block diagram of this parallel realisation
is shown in Fig. 5.11. As the modulo $2^N$ product is already in the
binary form no decoding is necessary. $P_1$ is easily obtained from
$Q_0$ and $P_0$ by using a conventional $2^N$ adder with an end-round carry.
Although there will be two representations for zero, this is not
a problem since we know that $P_1 = 2^N - 1$ cannot happen since
$P_1 \le (2^N - 2)$.
5.4 Conclusions.
Stored-logic radix-$2^N$ full adders and multipliers are
analysed on a systems level by modelling them as finite-state
sequential machines. The algebraic structures of these F.S.M's
are then analysed using S.P. partitions. For the F.S.M. models
of both the modulo $2^N$ sum and product, respectively, of two N-bit
numbers, theorems have been derived showing that these F.S.M's
possess cascade loop-free decomposition structures. The corresponding
implementations require substantially less memory storage than
those of the direct form, and this advantage improves with increasing
word-length N.
Fig. 5.11. Improved model of a general N x N bit
parallel multiplier.
Two models have been found for the radix-$2^N$, i.e. N x N
bit, parallel multiplier. The second model is extremely useful
because although it requires some simple additional circuitry,
it enables the N-bit most significant half of the multiplication
product to be determined in a systematic way for a general N.
APPENDIX 5.0
Some observations on the F.S.M. model of stored-logic arithmetic
circuits.
Although the stored-logic arithmetic circuits that we are
modelling as F.S.M's are strictly combinational switching circuits
(thus necessitating the imaginary feedback from $G_\ell$ to B in Fig. 5.0),
the approach enabled the application of the useful results from the
structure theory of machine decompositions. Furthermore, some
familiar arithmetic units do have real feedbacks, e.g. digital
accumulators. Thus the model is quite general.
There is however a theoretical constraint. At the particular
'instant' that the product A x B is required it has to be assumed
that the result of a 'previous' multiplication is such that its
less significant half $G_\ell$ must be equal to B. This implies that
from a starting state $s_i$ there is an input sequence x such that
$$\delta(s_i, x) = B.$$
This constraint is only academic since in practice this condition
is satisfied all the time.
ApPENDIX 5.1. 
Reprint of an article entitled, "Half-adders 
modulo 2N using read-only memories", published in 
Electronics Letters, 30th May 1974, Vol. 10, No.ll. 
HALFADDERS MODULO $2^N$ USING READ-ONLY MEMORIES
Indexing terms: Adders, Digital arithmetic, Read-only storage,
Sequential machines
Halfadders modulo $2^N$ are regarded as finite-state sequential
machines, and are implemented with read-only memories.
The application of the theory of 'closed' partitions is shown to
lead to considerable savings in the memory storage required,
which improve with increasing word lengths, and gives a
very regular interconnection pattern and parallel operation.
Introduction: With the advent of m.s.i./l.s.i. techniques, the
trend in the design of logic systems is moving from the
discrete-logic-gate level towards the system and subsystem
level. Thus there is a case for investigating the hardware
realisation of arithmetic operations in number systems having
radices greater than two. In this letter, halfadders modulo $2^N$,
where N ranges through the set of positive integers, are
considered as basic arithmetic modules, and are studied to
find whether they contain useful algebraic structures that can
lead to practical and economical hardware implementations.
ELECTRONICS LETTERS 30th May 1974 Vol. 10 No. 11
Moduli of the form $2^N$ are chosen to avoid the need for
complicated circuitry for the conversion to and from the
binary (modulo-2) system. Also, the proposed design
method is compatible with the familiar 'carry/look-ahead'
technique in high-speed addition, in which case N then
represents the number of digit pairs in a 'carry/look-ahead'
stage.
Table 1 MODULO-2^3 SUM TABLE
              A
    +   0 1 2 3 4 5 6 7
    0   0 1 2 3 4 5 6 7
    1   1 2 3 4 5 6 7 0
    2   2 3 4 5 6 7 0 1
B   3   3 4 5 6 7 0 1 2
    4   4 5 6 7 0 1 2 3
    5   5 6 7 0 1 2 3 4
    6   6 7 0 1 2 3 4 5
    7   7 0 1 2 3 4 5 6
Table 2 CARRY TABLE
              A
    +   0 1 2 3 4 5 6 7
    0   0 0 0 0 0 0 0 0
    1   0 0 0 0 0 0 0 1
    2   0 0 0 0 0 0 1 1
B   3   0 0 0 0 0 1 1 1
    4   0 0 0 0 1 1 1 1
    5   0 0 0 1 1 1 1 1
    6   0 0 1 1 1 1 1 1
    7   0 1 1 1 1 1 1 1
Example: Consider adding two numbers A and B modulo 8,
i.e. N = 3. The modulo-sum and carry tables are shown in
Tables 1 and 2, respectively. Two read-only memories acting
as table-look-up units may be used as the hardware (Reference 1). As the
two operands are used to address $2^{(3+3)}$ memory locations
for each of the two r.o.m.s, and since each location for the
modulo-sum and carry tables contains a 3-bit and a 1-bit
word, respectively, the memory storage required will
correspondingly be 192 and 64 bits. For large word lengths, however,
this direct implementation will lead to excessive storage.
Fortunately, the memory storage can be reduced considerably
by applying the theory of 'closed' partitions (References 2 and 3) to
decompose the direct realisation into an interconnection of
smaller and simpler substructures. This is done by regarding
Tables 1 and 2 as the state transition and output tables,
respectively, of a finite-state machine (f.s.m.), having A and B
as the 'input' and 'internal state'. Partitioning the machine
states (Reference 3), we find the following nontrivial 'closed' partitions:
$\pi_1 = (0,2,4,6/1,3,5,7)$    $\pi_2 = (0,4/2,6/1,5/3,7)$
A 'serial' decomposition is possible using either $\pi_1$ or $\pi_2$ in
conjunction with a nonclosed partition $\lambda_1$ or $\lambda_2$, respectively,
provided that $\pi_1 \cdot \lambda_1 = \pi_2 \cdot \lambda_2 = \pi(0)$, where
$\pi(0) = (0/1/2/3/4/5/6/7)$
is the 'zero' partition, and the $\cdot$ signifies a partition
multiplication. Notice, however, that, since $\pi_1$ is 'greater' than $\pi_2$ (Reference 3),
$\pi_2$, in turn, can be derived from $\pi_1$ and $\lambda'_1$, where $\lambda'_1$ is
another nonclosed partition such that $\pi_1 \cdot \lambda'_1 = \pi_2$. Thus we
obtain
$\lambda_2 = (0,2,1,3/4,6,5,7)$    $\lambda'_1 = (0,4,1,5/2,6,3,7)$
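The partition relations quoted above can be checked mechanically. The following is a minimal sketch, not the authors' procedure: it models the modulo-8 adder as an f.s.m. with next state (A + B) mod 8, and the helper names `is_closed`, `multiply` and `as_set` are hypothetical, introduced only for illustration.

```python
from itertools import product

N_STATES = 8  # modulo-2^3 adder: states (and inputs) 0..7


def block_of(partition, s):
    """Return the index of the block of `partition` that contains state s."""
    for i, block in enumerate(partition):
        if s in block:
            return i
    raise ValueError(s)


def is_closed(partition):
    """'Closed' test: under any input a, each block must map into one block."""
    for a in range(N_STATES):
        for block in partition:
            images = {block_of(partition, (a + b) % N_STATES) for b in block}
            if len(images) != 1:
                return False
    return True


def multiply(p, q):
    """Partition product: all non-empty pairwise block intersections."""
    return {frozenset(x & y) for x, y in product(p, q) if x & y}


def as_set(partition):
    """Order-independent form of a partition, for comparisons."""
    return {frozenset(b) for b in partition}


pi1 = [{0, 2, 4, 6}, {1, 3, 5, 7}]
pi2 = [{0, 4}, {2, 6}, {1, 5}, {3, 7}]
lam2 = [{0, 2, 1, 3}, {4, 6, 5, 7}]    # nonclosed partition used with pi2
lam1p = [{0, 4, 1, 5}, {2, 6, 3, 7}]   # nonclosed partition lambda'_1
zero = [{s} for s in range(N_STATES)]  # the 'zero' partition pi(0)

print(is_closed(pi1), is_closed(pi2), is_closed(lam2))
print(multiply(pi1, lam1p) == as_set(pi2))
print(multiply(pi2, lam2) == as_set(zero))
```

The three printed checks correspond to the statements in the text: $\pi_1$ and $\pi_2$ are closed while $\lambda_2$ is not, $\pi_1 \cdot \lambda'_1 = \pi_2$, and $\pi_2 \cdot \lambda_2 = \pi(0)$.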
The hardware for this form of realisation is shown in Fig. 1,
where the 'input' has been given the same assignment as the
'internal state'. The overall memory storage of the three
r.o.m. modules used is only 84 bits for the modulo-8 sum,
compared with the 192 bits obtained previously. Also, each
of the modules is a single-output r.o.m. and the
interconnection pattern is very regular. The above
implementation is known as the loopfree realisation of an f.s.m. (Reference 4).
The carry or 'output' function can be realised as a
straightforward combinational matrix, and will not be discussed
further in this letter.
[Fig. 1 Read-only-memory realisation: three r.o.m. modules of 4, 16 and 64 bits, with A and B as inputs and the modulo-2^3 sum as output.]
General modulo-$2^N$ halfadder: To generate all possible
'closed' partitions for an adder modulo $2^N$, in general, it is
sufficient only to 'identify', i.e. put in the same partition
block, state 0 with each of the other states d in turn, since
this invariably identifies any state $a_j$ with the state $a_j + d$
(mod $2^N$), where $d, a_j = 1, 2, 3, \ldots, 2^N - 1$. This is because the
first row and first column of Table 1 merely duplicate the
input and present states, respectively. As a consequence, all
elements that are a distance d apart will be identified; i.e.
for an arbitrary state $a_j$, $a_j$ and $a_j + kd$ (mod $2^N$) will be in the
same partition block, k being any integer. It is found that
adders modulo $2^N$ possess algebraic properties, as will be
shown by the following lemmas, whose proofs can be found
in Reference 5.
Lemma 1: If d is odd, there are no 'closed' partitions apart
from the trivial partitions $\pi(I)$ and $\pi(0)$, the 'identity' and
'zero' partitions.
Lemma 2: If $d = 2^p$, $p = 1, 2, 3, \ldots, N$, there exists a set of
'closed' partitions $(\pi_1, \pi_2, \ldots, \pi_p, \ldots, \pi_N)$. Any $\pi_p$ contains
$2^p$ blocks of equal size, and, if the elements in any one block
are arranged in ascending magnitude, adjacent elements will
differ by $2^p$ units.
It follows that the number of elements $m(\pi_p)$ in a block of
$\pi_p$ is given by
$m(\pi_p) = 2^N / 2^p$
and the number of blocks of $\pi_p$, $\#(\pi_p)$, is given by
$\#(\pi_p) = \dfrac{\text{number of states}}{\text{number of elements in a block}} = 2^p$
Lemma 3: Let d and D be integers, $1 < d, D \le 2^N$. If d
divides D, $\pi_d > \pi_D$.
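The 'identification' procedure and Lemmas 1 and 2 can be illustrated with a short sketch. It assumes only what the text states, namely that identifying state 0 with state d places $a$ and $a + kd$ (mod $2^N$) in the same block; the function name `closed_partition` is hypothetical.

```python
def closed_partition(n_bits, d):
    """Blocks obtained by identifying every state a with a + d (mod 2^n_bits)."""
    modulus = 1 << n_bits
    seen, blocks = set(), []
    for a in range(modulus):
        if a in seen:
            continue
        # Follow a, a + d, a + 2d, ... until the orbit closes on itself.
        block, s = [], a
        while s not in block:
            block.append(s)
            s = (s + d) % modulus
        seen.update(block)
        blocks.append(sorted(block))
    return blocks


# Lemma 1: an odd d collapses everything into the single 'identity' block.
print(closed_partition(3, 3))
# Lemma 2: d = 2^p gives 2^p blocks whose neighbouring elements differ by 2^p.
print(closed_partition(3, 2))
print(closed_partition(3, 4))
```

For N = 3 the last two calls reproduce the partitions $\pi_1$ and $\pi_2$ of the modulo-8 example.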
Loopfree realisation of adders modulo $2^N$: As a consequence
of these lemmas, adders modulo $2^N$ are seen to possess N
'closed' partitions $(\pi_1, \pi_2, \ldots, \pi_p, \ldots, \pi_N)$, such that
$\pi_1 > \pi_2 > \ldots > \pi_p > \ldots > \pi_N$,    $\pi_N = \pi(0)$
As is well known (Reference 4), any finite-state machine that has the above
algebraic properties is realisable as a serial loopfree
connection of N components $m_1, m_2, \ldots, m_p, \ldots, m_N$, all operating
concurrently. Although any of the partitions $\pi_p$ can be used
with any nonclosed partition $\lambda_p$, as long as $\pi_p \cdot \lambda_p = \pi(0)$, a
more economical realisation will be to use all the available
'closed' partitions in the following manner:
Using $\pi_{N-1}$, a valid realisation will be $\pi_{N-1} \cdot \lambda_{N-1} = \pi_N$.
Similarly, $\pi_{N-1}$, in turn, is obtained using $\pi_{N-2} \cdot \lambda_{N-2} = \pi_{N-1}$.
We thus have the following iterative relationship:
$\pi(0) = \pi_{N-1} \cdot \lambda_{N-1}$
   ...
$\pi_p = \pi_{p-1} \cdot \lambda_{p-1}$
   ...
From lemmas 2 and 3, $\lambda_p$, $p = 1, 2, 3, \ldots, N-1$, is a
2-block partition. Therefore adders modulo $2^N$ can be
implemented as a set of loopfree interconnected component
machines all operating concurrently. If the 'input' A is assigned
the same code as the 'internal state' B, the hardware
implementation using single-output read-only memories is as shown
in Fig. 2A.
[Fig. 2A Generalised realisation]
[Fig. 2B Effect of loopfree decomposition on memory storage, plotted against word length N. Reduced storage = (4/3N) x original storage]
Memory-storage reduction: The ratio R of the overall memory
of the submachines to the memory of the direct machine is
given by (Reference 5)
$R = \dfrac{(4/3)(2^{2N} - 1)}{N\,2^{2N}} \approx \dfrac{4}{3N}$    if $2^{2N} \gg 1$
Fig. 2B illustrates this considerable reduction in memory
size, which improves as N, the word length, is increased.
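The storage figures can be tabulated with a few lines of code. This is a sketch under one stated assumption: that the p-th single-output module is addressed by 2p bits and so holds $4^p$ bits. That matches the 4, 16 and 64 bits of Fig. 1 and the closed form $(4/3)(2^{2N} - 1)$, but it is not spelled out module by module in the letter. The function names are hypothetical.

```python
from fractions import Fraction


def direct_bits(n):
    """Direct table look-up: 2^(2n) locations, each holding an n-bit sum."""
    return n * 4 ** n


def loopfree_bits(n):
    """Loopfree form: n single-output modules, module p assumed to hold 4^p bits."""
    return sum(4 ** p for p in range(1, n + 1))


def ratio(n):
    """R = (memory of submachines) / (memory of direct machine)."""
    return Fraction(loopfree_bits(n), direct_bits(n))


for n in (3, 8, 16):
    print(n, loopfree_bits(n), direct_bits(n), float(ratio(n)), 4 / (3 * n))
```

For N = 3 this gives 84 and 192 bits, the figures quoted in the example, and the exact ratio approaches 4/(3N) quickly as N grows.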
Conclusion: A design procedure for halfadders modulo $2^N$
has been proposed in which the hardware realisation requires
less memory storage than that of the direct implementation,
and it also results in a regular interconnection pattern and
parallel operation. Consequently, this simple, effective and
economical method appears to be promising, considering
that the cost of semiconductor memories is falling all the
time.
M. A. BIN NUN                                   25th April 1974
M. E. WOODWARD
Department of Electronic & Electrical Engineering
University of Technology
Loughborough, Leics. LE11 3TU, England
References
1 KRAMME, F.: 'Standard read-only memories simplify complex logic
design', Electronics, 1970, 43, (1), pp. 88-95
2 HOWARD, D. V.: 'Partition methods for read-only sequential machines',
Electron. Lett., 1972, 8, pp. 334-336
3 HARTMANIS, J.: 'On the state assignment problem for sequential
machines-I', IRE Trans., 1961, EC-10, pp. 157-165
4 HARTMANIS, J.: 'Loopfree structure of sequential machines',
Information & Control, 1962, 5, (1), pp. 25-43
5 BIN NUN, M. A.: 'Adder modules using residue arithmetic'.
Loughborough University of Technology Departmental Memorandum 88,
1974
CHAPTER 6
NOVEL METHOD OF MODULO $2^N$ MULTIPLICATION
USING CONSTRAINED OPERANDS
6.0 Introduction.
In the previous chapter, we saw that a modulo $2^N$ multiplier,
modelled as an F.S.M., may be realised as a cascade loop-free
interconnection of submachines. As indicated before, this is just
one possible structure, and it would be more useful if there exist
one or more parallel decompositions. Besides, each sub-machine in
the loop-free decomposition does not appear to possess a structure
regular enough to be generalised. The modulo $2^N$ multiplication table,
as it stands however, is not as easy to analyse as the modulo $2^N$
addition table.
In this chapter, a novel technique is presented in which the
multiplication table may be modified in such a way that it is then
possible to determine a definite algebraic structure that may be
generalised to arbitrary N. Its features make it an interesting
alternative to the loop-free configuration. This approach was
initiated by the following observation.
6.1 Observations.
Consider the modulo $2^3$ multiplication table discussed in Section
5.3.0 and shown in Table 5.4. Some of the possible S.P. partitions*
for this multiplier are
$\pi_1 = \{ 0,4;\ 2,6;\ 1,5;\ 3,7 \}$
$\pi_2 = \{ 0,4;\ 2,6;\ 1,3;\ 5,7 \}$
$\pi_3 = \{ 0,4;\ 2,6;\ 1,7;\ 3,5 \}$.
* It may be noted that $\pi_2$ and $\pi_3$ cannot be derived using the loop-free
structure described by Theorems 5.1 and 5.2.
In the above partitions, it is observed that their first two blocks
are identical, i.e. (0,4) and (2,6). Also, if we form the partition
product of any two of the above partitions we will obtain $\pi_4$, where
$\pi_4 = \{ 0,4;\ 2,6;\ 1;\ 3;\ 5;\ 7 \}$.
If we now restrict the values of the operands A and B, and the
multiplication product, to the set (1,3,5,7), then the following
partitions may be derived from $\pi_1$, $\pi_2$, $\pi_3$, i.e.
$\pi_a = \{ 1,5;\ 3,7 \}$
$\pi_b = \{ 1,3;\ 5,7 \}$
$\pi_c = \{ 1,7;\ 3,5 \}$
Any two of the above S.P. partitions may be used in a parallel
realisation to obtain the modified modulo $2^3$ multiplication table,
e.g. $\pi_a \cdot \pi_b = \pi(0)$.
These observations suggest that parallel decompositions are
possible if the original multiplier is modified by restricting the
operands only to odd values. The corresponding table is shown in
Table 6.0. The actual product may then be obtained using a simple
correction circuit.
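These observations are easy to confirm by exhaustive checking. The sketch below assumes the F.S.M. model in which the next state is the product of input and state modulo 8; `has_sp` and `multiply` are hypothetical helper names, not taken from the thesis.

```python
from itertools import combinations


def has_sp(partition, modulus):
    """S.P. test: for every input a, each block must map into a single block."""
    index = {x: i for i, blk in enumerate(partition) for x in blk}
    for a in index:  # inputs range over the same set as the states
        for blk in partition:
            if len({index[(a * b) % modulus] for b in blk}) != 1:
                return False
    return True


def multiply(p, q):
    """Partition product: all non-empty pairwise block intersections."""
    return {frozenset(x & y) for x in p for y in q if x & y}


# S.P. partitions of the full modulo-8 multiplier
pi1 = [{0, 4}, {2, 6}, {1, 5}, {3, 7}]
pi2 = [{0, 4}, {2, 6}, {1, 3}, {5, 7}]
pi3 = [{0, 4}, {2, 6}, {1, 7}, {3, 5}]

# Their restrictions to the odd set (1, 3, 5, 7)
pi_a = [{1, 5}, {3, 7}]
pi_b = [{1, 3}, {5, 7}]
pi_c = [{1, 7}, {3, 5}]
zero_odd = {frozenset({x}) for x in (1, 3, 5, 7)}

print([has_sp(p, 8) for p in (pi1, pi2, pi3, pi_a, pi_b, pi_c)])
print([multiply(p, q) == zero_odd for p, q in combinations((pi_a, pi_b, pi_c), 2)])
```

Every pair of the restricted partitions multiplies to the zero partition on the odd set, which is the condition for a parallel realisation.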
Table 6.0. Reduced modulo $2^3$ multiplication table.
Table 6.1. Reduced modulo $2^4$ multiplication table.
6.2 Modulo $2^N$ multiplication using 'forced' operands and product
correction.
The approach proposed is based on a reduced* multiplier, which
is the original modulo $2^N$ multiplier whose operands and product are
constrained to take on only the odd values from the set
$Z_N = \{ 0, 1, 2, \ldots, 2^N - 1 \}$.
Hence, if A, B and A', B' are the operands to the original and reduced
multipliers respectively, then $A', B' \in Z_D$, where
$Z_D = \{ x : x \text{ odd integer},\ 1 \le x \le 2^N - 1 \}$.
These modified operands may be derived from the originals by the
mappings $g_A$ and $g_B$, where
$g_A : A \to A' = A + c$
and
$g_B : B \to B' = B + d$
such that
c = 0 for A odd,    c = 1 for A even,
d = 0 for B odd,    d = 1 for B even.
In other words, whenever any of the original operands is even,
we 'force' it to be odd by adding a '1' to it. In practice, this
simply means that the least significant bits (L.S.B's) of A' and B'
are assumed to be '1' all the time. Similarly for the product $P'_o$
of the reduced modulo $2^N$ multiplier. The remaining N-1 bits of
A' and B' are identical to those of A and B.
* The meaning of "reduced" here is different from that defined
in Chapter 3 in the context of state minimisation.
Multiplying these 'forced' operands, we obtain
$P'_o = A' \times B' \equiv (A + c)(B + d)$    modulo $2^N$
$\phantom{P'_o} = AB + dA + cB + cd$    modulo $2^N$.
Hence the required product $P_o \equiv A \times B$ modulo $2^N$ is given by
$AB = (A' \times B') - (dA + cB + cd)$    mod $2^N$    ... (6.0)
This congruence relationship expresses our proposed multiplication
scheme, in which (A' x B') describes the reduced multiplication
and C = (dA + cB + cd) the correction required to obtain the actual
product. The block diagram of the overall configuration is shown
in Fig. 6.0.
The various values of C corresponding to all possible combinations
of a and b, the L.S.B's of A and B respectively, are given below.
a   b   c   d   C
0   0   1   1   A+B+1
0   1   1   0   B
1   0   0   1   A
1   1   0   0   0
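Equation (6.0) and the table of corrections can be exercised directly in software. The following is an illustrative sketch only: the function name `forced_multiply` is hypothetical, and an ordinary integer product stands in for the reduced multiplier hardware.

```python
def forced_multiply(a, b, n_bits):
    """Modulo-2^n_bits product via 'forced' odd operands and a correction term."""
    modulus = 1 << n_bits
    c = 1 - (a & 1)                # c = 1 when A is even, else 0
    d = 1 - (b & 1)                # d = 1 when B is even, else 0
    a_forced = a + c               # A': L.S.B. forced to '1'
    b_forced = b + d               # B': L.S.B. forced to '1'
    reduced = (a_forced * b_forced) % modulus       # P'_o, always odd
    correction = (d * a + c * b + c * d) % modulus  # C of equation (6.0)
    return (reduced - correction) % modulus


# Exhaustive comparison against the ordinary modulo-16 product
mismatches = [(a, b) for a in range(16) for b in range(16)
              if forced_multiply(a, b, 4) != (a * b) % 16]
print(mismatches)
```

An empty list of mismatches confirms that subtracting C from the reduced product recovers A x B modulo $2^N$ for every operand pair.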
This leads to a very simple correction circuit consisting of
a modulo $2^N$ adder, which is just the conventional $2^N$ adder with the
carry digit excluded, and two gating circuits, each consisting of
(N-1) 2-input AND gates and one inverter. The output C from this
correction unit is then subtracted from $P'_o$ to obtain the actual
product $P_o$. This correction process is shown in Fig. 6.1. Further
simplification in its detailed realisation is possible and will
be described in the following example.
[Fig. 6.0. Block diagram of modulo $2^N$ multiplication using 'forced' operands and product correction.]
[Fig. 6.1. Correction circuit for multiplication product.]
6.2.0 Example.
Consider first a modulo $2^4$ multiplier whose operands have been
constrained to only odd values. The corresponding multiplication
table is shown in Table 6.1. The F.S.M. model of this reduced
multiplier has the following basic S.P. partitions.
$\pi_1 = \{ 1,3,9,11;\ 5,7,13,15 \}$
$\pi_2 = \{ 1,5,9,13;\ 3,7,11,15 \}$
$\pi_3 = \{ 1,7,9,15;\ 3,5,11,13 \}$
$\pi_4 = \{ 1,7;\ 3,5;\ 9,15;\ 11,13 \}$
$\pi_5 = \{ 1,9;\ 3,11;\ 5,13;\ 7,15 \}$
$\pi_6 = \{ 1,15;\ 3,13;\ 5,11;\ 7,9 \}$
By forming the higher level partition sums, it may be shown
that these partitions, $\pi_1$ - $\pi_6$, are the only possible S.P. partitions
for the F.S.M. model of the reduced modulo $2^4$ multiplier. The
corresponding partition lattice is shown in Fig. 6.2.
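The claim that $\pi_1$ to $\pi_6$ are the only non-trivial S.P. partitions can also be confirmed by brute force, since the eight odd residues have only 4140 set partitions. This sketch is an alternative check, not the partition-sum method used in the thesis; all function names are hypothetical.

```python
ODD16 = [1, 3, 5, 7, 9, 11, 13, 15]


def set_partitions(items):
    """Yield every partition of `items` as a list of blocks (lists)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for smaller in set_partitions(rest):
        # Put `first` into each existing block in turn ...
        for i in range(len(smaller)):
            yield smaller[:i] + [[first] + smaller[i]] + smaller[i + 1:]
        # ... or into a new block of its own.
        yield [[first]] + smaller


def has_sp(partition, modulus=16):
    """S.P. test for the multiplier: each block maps into one block per input."""
    index = {x: i for i, blk in enumerate(partition) for x in blk}
    return all(
        len({index[(a * b) % modulus] for b in blk}) == 1
        for a in index
        for blk in partition
    )


def canonical(partition):
    """Order-independent form of a partition."""
    return frozenset(frozenset(blk) for blk in partition)


# Keep only non-trivial partitions (more than 1 block, fewer than 8 blocks).
sp_partitions = {
    canonical(p)
    for p in set_partitions(ODD16)
    if 1 < len(p) < len(ODD16) and has_sp(p)
}
for p in sorted(sp_partitions, key=len):
    print(sorted(sorted(blk) for blk in p))
```

The search returns six partitions, three with two blocks and three with four blocks, matching the list above.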
There are a variety of ways in which this reduced multiplier*
may be implemented using the above partitions. For example, any
two of the 4-block partitions $\pi_4$, $\pi_5$ and $\pi_6$ will lead to a parallel
realisation. The case of using $\pi_4$ and $\pi_5$ is shown in Fig. 6.3(a),
in which 64 bits of memory are required. Alternatively, any one
of these 4-block partitions, $\pi_5$ say, along with a relevant non-S.P.
partition $\tau_5$, will realise a cascade configuration, as in Fig. 6.3(b),
requiring a 96-bit memory store.
* In our discussion, it is assumed that A' or 'input', and B' or
'state' of the reduced multiplier are assigned the same partition code.
Similarly, with the 2-block partitions $\pi_1$, $\pi_2$ and $\pi_3$, a cascade
form is possible if $\pi_1$ is used in conjunction with a non-S.P. partition
$\tau_1$ as shown in Fig. 6.3(c). Also, any two of them, $\pi_2$ and $\pi_3$ say,
may be operated in parallel to realise $\pi_5$ since $\pi_2 \cdot \pi_3 = \pi_5$. This
parallel form, shown in Fig. 6.3(d), is then used with $\pi_4$, as in
Fig. 6.3(a), to realise the reduced multiplier, which now requires only
40 bits of memory storage.
An even better realisation has been found in which $\pi_3$ and $\pi_2$
(or $\pi_1$) in parallel realise $\pi_5$, and the same $\pi_3$ in cascade with a
non-S.P. $\tau_3$ implements $\pi_4$ (or $\pi_6$). The resulting partitions $\pi_5$ and
$\pi_4$ are then operated in parallel. Unlike the scheme shown in Fig. 6.3(a),
these components now share a common variable between them. This hybrid
or composite configuration, shown in Fig. 6.4, requires a storage
of only 24 bits, which is a considerable reduction from the 192 bits
required in the direct realisation.
It may be shown, by deriving the logical functions of the components
represented by $\pi_2$, $\pi_3$ and $\tau_3$, that the binary assignment shown in
Table 6.2 is a good one. The blocks of $\pi_5$ and $\pi_4$ are encoded by the
variables $(y_2, y_1)$ and $(y_2, y_3)$, while those of $\pi_3$ and $\pi_2$ are encoded
by the variables $y_2$ and $y_1$ respectively, as shown below.
[Fig. 6.2. Partition lattice of F.S.M. model of modulo $2^4$ reduced multiplier.]
[Figs. 6.3(a) - (d). Possible decompositions of modulo $2^4$ reduced multiplier.]
[Fig. 6.4. Hybrid (composite) realisation of modulo $2^4$ reduced multiplier, with $\pi_2$ and $\pi_3$ components of 4 x 1 bits each and a $\tau_3$ component of 16 x 1 bits.]
Table 6.2. Binary codings for operands of reduced modulo $2^4$ multiplier.
Operands    Binary number      Actual binary
A, B        representation     assignment
            x3 x2 x1 x0        y3 y2 y1 y0
  1          0  0  0  1         0  0  0  1
  3          0  0  1  1         0  1  1  1
  5          0  1  0  1         0  1  0  1
  7          0  1  1  1         0  0  1  1
  9          1  0  0  1         1  0  0  1
 11          1  0  1  1         1  1  1  1
 13          1  1  0  1         1  1  0  1
 15          1  1  1  1         1  0  1  1
(x0 and y0 are permanent '1's.)
π5:            y2 y1          π4:            y2 y3
    (1,9)   →  0  0               (1,7)   →  0  0
    (3,11)  →  1  1               (3,5)   →  1  0
    (5,13)  →  1  0               (9,15)  →  0  1
    (7,15)  →  0  1               (11,13) →  1  1
and
π3:                 y2        π2:                 y1
    (1,7,9,15)   →  0             (1,5,9,13)   →  0
    (3,5,11,13)  →  1             (3,7,11,15)  →  1
If the original operands A and B come from an external
environment, then they are invariably coded in the binary number
representation. Consequently, except for their L.S.B's (which are
kept at '1'), the 'forced' operands A' and B' are also in the binary
format. In this situation the partition variables $y_1$, $y_2$ and $y_3$
have to be encoded from the binary number code represented by the
variables $x_1$, $x_2$ and $x_3$. Similarly, the output of the reduced
multiplier has to be decoded back to the conventional binary format.
The logical relationship for this encoding and decoding process,
however, is simple. From Table 6.2 we find that
$y_1 = x_1$
$y_2 = x_2 \oplus x_1$
$y_3 = x_3$.
Similarly, for the decoding, we obtain
$x_1 = y_1$
$x_2 = y_2 \oplus x_1 = y_2 \oplus y_1$
and
$x_3 = y_3$.
If the reduced multiplier is in an "internal" processor then
the partition variables may be used directly.
The functional structures of the $\pi_2$, $\pi_3$ and $\tau_3$ components are
obtained by first reorganising the multiplication table in the ways
shown by Tables 6.3(a) and (b), and Table 6.5.
Consider first the $\pi_2$ and $\pi_3$ images of the F.S.M. model of the
multiplier, whose state tables are shown in Tables 6.4(a) and (b)
respectively. It is observed that in each of the two cases, the
operation between the partition blocks is simply that of the
Exclusive-OR function. Hence the $\pi_2$ and $\pi_3$ components are
straightforward to implement.
With the table organised by $\tau_3$, we observe that if the operation
between blocks is considered, this would have resembled that of the
Ex-OR were it not for the entries shown in the dotted boxes.
Assigning the variable $y_3$ to the blocks of $\tau_3$, we obtain
τ3:                  y3
    (1,7,3,5)     →  0
    (9,15,11,13)  →  1
From this Table 6.5, it is seen that if the Ex-OR function is
to be used for the operation between the blocks of $\tau_3$, the output
of this successor component has to be modified by logically inverting
it whenever both A' and B' come from the set (3,5,11,13). An Ex-OR
output is assumed if either or both A' and B' are from the set
(1,7,9,15). This information is easily obtained since these two
blocks are also the blocks of $\pi_3$.
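The three component functions described above can be written down as Boolean expressions and compared with direct multiplication. The sketch below is one reading of the text, under the coding of Table 6.2: the `pi_2` and `pi_3` components are plain Ex-ORs, and the `tau_3` component is an Ex-OR inverted when both operands have $y_2 = 1$, i.e. both lie in (3,5,11,13). The function names are hypothetical.

```python
def encode(x):
    """Odd 4-bit operand -> partition variables (y3, y2, y1), as in Table 6.2."""
    x1, x2, x3 = (x >> 1) & 1, (x >> 2) & 1, (x >> 3) & 1
    return x3, x2 ^ x1, x1


def decode(y3, y2, y1):
    """Partition variables -> odd 4-bit number (L.S.B. permanently '1')."""
    x1 = y1
    x2 = y2 ^ y1
    x3 = y3
    return (x3 << 3) | (x2 << 2) | (x1 << 1) | 1


def reduced_multiply(a, b):
    """Reduced modulo-16 product built from the three component functions."""
    a3, a2, a1 = encode(a)
    b3, b2, b1 = encode(b)
    y1 = a1 ^ b1                # pi_2 component: Ex-OR
    y2 = a2 ^ b2                # pi_3 component: Ex-OR
    y3 = a3 ^ b3 ^ (a2 & b2)    # tau_3 component: Ex-OR, inverted if both y2 = 1
    return decode(y3, y2, y1)


odd = range(1, 16, 2)
print(all(reduced_multiply(a, b) == (a * b) % 16 for a in odd for b in odd))
```

In table look-up form the two Ex-OR components need 4 x 1 bits each and the `tau_3` component, with four address variables, needs 16 x 1 bits, consistent with the 24-bit total quoted for the hybrid realisation.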
                      A'
     ⊗16 |  1  5  9 13 |  3 15 11  7
     ----+-------------+------------
       1 |  1  5  9 13 |  3 15 11  7
       5 |  5  9 13  1 | 15 11  7  3
       9 |  9 13  1  5 | 11  7  3 15
B'    13 | 13  1  5  9 |  7  3 15 11
     ----+-------------+------------
       3 |  3 15 11  7 |  9 13  1  5
      15 | 15 11  7  3 | 13  1  5  9
      11 | 11  7  3 15 |  1  5  9 13
       7 |  7  3 15 11 |  5  9 13  1
                     (a)
                      A'
     ⊗16 |  1  7  9 15 |  3  5 11 13
     ----+-------------+------------
       1 |  1  7  9 15 |  3  5 11 13
       7 |  7  1 15  9 |  5  3 13 11
       9 |  9 15  1  7 | 11 13  3  5
B'    15 | 15  9  7  1 | 13 11  5  3
     ----+-------------+------------
       3 |  3  5 11 13 |  9 15  1  7
       5 |  5  3 13 11 | 15  9  7  1
      11 | 11 13  3  5 |  1  7  9 15
      13 | 13 11  5  3 |  7  1 15  9
                     (b)
Table 6.3. Reduced modulo $2^4$ multiplication
table reorganised by (a) $\pi_2$ and (b) $\pi_3$.
(a)       E  F              (b)       U  V
     E    E  F                   U    U  V
     F    F  E                   V    V  U
     (1,5,9,13)  → E             (1,7,9,15)  → U
     (3,15,11,7) → F             (3,5,11,13) → V
Table 6.4. (a) $\pi_2$ and (b) $\pi_3$ images respectively
of reduced multiplier.
                      A'
     ⊗16 |  1  7  3  5 |  9 15 11 13
     ----+-------------+------------
       1 |  1  7  3  5 |  9 15 11 13
       7 |  7  1  5  3 | 15  9 13 11
       3 |  3  5  9 15 | 11 13  1  7
B'     5 |  5  3 15  9 | 13 11  7  1
     ----+-------------+------------
       9 |  9 15 11 13 |  1  7  3  5
      15 | 15  9 13 11 |  7  1  5  3
      11 | 11 13  1  7 |  3  5  9 15
      13 | 13 11  7  1 |  5  3 15  9
(The entries in dotted boxes are those for which both A' and B'
come from the set (3,5,11,13).)
Table 6.5. Multiplication table reorganised by
$\tau_3 = \{ 1,7,3,5;\ 9,15,11,13 \}$.
The detailed circuit structure of the component machines of
the reduced modulo $2^4$ multiplier has now been derived and is shown
in Fig. 6.5. Furthermore, using equation (6.0) the corresponding
correction circuit may be obtained as shown in Fig. 6.6.
6.3 Internal algebraic structure of reduced modulo $2^N$ multipliers.
We have already seen that in the particular cases of modulo $2^3$
and modulo $2^4$ multipliers, the restriction of the multiplicative
operands to only odd values led to the discovery of a variety of
useful S.P. partitions on the input and state sets of their corresponding
F.S.M. models. That these partitions could not have been predicted
by Theorems 5.1 and 5.2 implies that algebraic structures other than
the cascade loop-free form are possible.
In this section the algebraic structure of a general reduced
modulo $2^N$ multiplier is investigated in depth in order to determine
its general nature and pattern.
6.3.0 Example.
Consider the modulo $2^4$ multiplier discussed in Section 6.2.0,
and in particular the reduced multiplication table reorganised by
$\pi_2$, i.e. Table 6.3(a). There it was shown that the operation between
the blocks of $\pi_2$ is analogous to the logical Exclusive-OR operation
which, in turn, is simply the familiar modulo-2 addition. Consequently,
the $\pi_2$-image of the reduced modulo $2^4$ multiplier is identical to a
modulo-2 adder.
[Fig. 6.5. Logic circuit implementation of modulo $2^4$ reduced multiplier.]
[Fig. 6.6. Implementation of correction circuit for modulo $2^4$ multiplier.]
Table 6.3(a) also possesses another interesting feature. Consider
first the multiplication operation between elements of the block
(1,5,9,13), which is described by the upper left-hand quadrant.
The structure of this quadrant, which is redrawn in Table 6.6,
becomes obvious if the following mapping is performed.
modulo $2^4$ multiplication            →  modulo $2^2$ addition
confined to elements (1,5,9,13)
     1                                 →  0
     5                                 →  1
     9                                 →  2
    13                                 →  3
By comparing Table 6.6 with the modulo 4 addition table shown
in Table 6.7, it may be deduced that this particular quadrant of
the reduced multiplier table is isomorphic to a modulo 4 adder.
The upper right and lower left-hand quadrants may be analysed in
a similar manner. The lower right-hand quadrant, although similar
in structure, differs from Table 6.6 by two row shifts upwards.
Hence, as shown in Fig. 6.7, the reduced modulo $2^4$ multiplier can
be regarded as a parallel connection of a modulo 2 adder and a modulo 4
adder incorporating the row shift mentioned.
A more elegant structure may be obtained if we consider the
4-block partition $\pi_4$ and reorganise the multiplication table according
to its blocks as shown in Table 6.8. The corresponding $\pi_4$-image
(see Table 6.9) is isomorphic to a modulo 4 adder via the mapping
    P → 0
    Q → 1
    R → 2
    S → 3
     ⊗16 |  1  5  9 13            ⊕4 | 0 1 2 3
     ----+------------            ---+--------
       1 |  1  5  9 13             0 | 0 1 2 3
       5 |  5  9 13  1             1 | 1 2 3 0
       9 |  9 13  1  5             2 | 2 3 0 1
      13 | 13  1  5  9             3 | 3 0 1 2
Table 6.6. Part of modulo $2^4$       Table 6.7. A modulo 4
multiplication table.               addition table.
                      A'
     ⊗16 |  1  7 |  3  5 |  9 15 | 11 13
     ----+-------+-------+-------+------
       1 |  1  7 |  3  5 |  9 15 | 11 13
       7 |  7  1 |  5  3 | 15  9 | 13 11
     ----+-------+-------+-------+------
       3 |  3  5 | [9 15]| 11 13 |  1  7
B'     5 |  5  3 |[15  9]| 13 11 |  7  1
     ----+-------+-------+-------+------
       9 |  9 15 | 11 13 |  1  7 |  3  5
      15 | 15  9 | 13 11 |  7  1 |  5  3
     ----+-------+-------+-------+------
      11 | 11 13 |  1  7 |  3  5 |  9 15
      13 | 13 11 |  7  1 |  5  3 | 15  9
(The bracketed entries form the typical block enclosed by broken lines.)
Table 6.8. Reduced multiplication table organised by $\pi_4$.
          P  Q  R  S
     P    P  Q  R  S            (1,7)   → P
     Q    Q  R  S  P            (3,5)   → Q
     R    R  S  P  Q            (9,15)  → R
     S    S  P  Q  R            (11,13) → S
Table 6.9. $\pi_4$-image of reduced multiplier.
[Fig. 6.7. Pseudo-parallel decomposition of modulo $2^4$ reduced multiplier.]
[Fig. 6.8. Implementing reduced multiplier (a) as a parallel connection of modulo 2 and modulo 4 adders, (b) with the modulo 4 adder further decomposed.]
Also, the entries in Table 6.8 are grouped into blocks of
four elements, a typical block being shown enclosed by the broken
lines.
It is observed that the structure of every such block is
identical to that of a modulo 2 adder. Hence the resulting
implementation, shown in Fig. 6.8(a), consists of a modulo 2 adder and a
modulo 4 adder operating in parallel, with neither component
requiring any correction.
In addition, this modulo 4 adder may be further decomposed
using the loop-free technique described by Theorem 5.1. The final
realisation of the reduced modulo $2^4$ multiplier, shown in Fig. 6.8(b),
is thus obtained, which requires a memory store of 24 bits.
Although this approach led to an overall circuit configuration
and memory requirement identical to those shown in Figs. 6.4 and
6.5, it will now be shown to be more systematic and directly applicable
to the general modulo $2^N$ reduced multipliers than the approach used
in Section 6.2.0.
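The parallel structure of Fig. 6.8(a) amounts to writing each odd residue as a pair of coordinates, one added modulo 4 and one added modulo 2. The sketch below makes this explicit under an assumption of our own: the elements 3 and 7 are chosen as generators, so that q labels the $\pi_4$ block (P, Q, R, S as 0, 1, 2, 3) and e the position within the block. The thesis does not name generators, and `to_pair` is a hypothetical helper.

```python
def to_pair(x):
    """Write odd x (mod 16) as 3^q * 7^e and return the exponents (q, e)."""
    for q in range(4):
        for e in range(2):
            if (pow(3, q, 16) * pow(7, e, 16)) % 16 == x:
                return q, e
    raise ValueError(x)


odd = list(range(1, 16, 2))

# q labels the pi_4 blocks: (1,7) -> 0, (3,5) -> 1, (9,15) -> 2, (11,13) -> 3
print({x: to_pair(x) for x in odd})

# Multiplication modulo 16 becomes (modulo-4 addition, modulo-2 addition).
ok = all(
    to_pair((a * b) % 16)
    == ((to_pair(a)[0] + to_pair(b)[0]) % 4, to_pair(a)[1] ^ to_pair(b)[1])
    for a in odd for b in odd
)
print(ok)

# The quadrant mapping 1, 5, 9, 13 -> 0, 1, 2, 3 is the power map of 5.
print([pow(5, k, 16) for k in range(4)])
```

Because the two coordinates never interact, the modulo 2 and modulo 4 adders can operate side by side without any correction, as stated for Fig. 6.8(a).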
6.3.1 The group under modulo $2^N$ reduced multiplication.
Consider the general modulo $2^N$ multiplier whose two operands
(and hence product) are constrained to odd values, i.e. to values
from the set $Z_D$, and denote this multiplier as the tuple $(Z_D, \otimes_{2^N})$.
We will now prove the following lemmas.
Lemma 6.0. $(Z_D, \otimes_{2^N})$ is closed.
Proof. Consider $a, b \in Z_D$. Using the familiar division
algorithm, the real product a x b may be expressed as
$a \times b = q\,2^N + r$, where $0 \le r < 2^N$.
Since a, b are odd and hence not divisible by 2, so is their product.
Thus the right-hand side of the above equation is not divisible by
2. Hence r, the modulo $2^N$ product of a and b, must also be odd and
in the range 0 to $2^N - 1$, i.e. $r \in Z_D$.
Lemma 6.1. $(Z_D, \otimes_{2^N})$ is associative, commutative
and has identity 1.
Proof. This follows naturally from the properties of
the original modulo $2^N$ multiplier.
Lemma 6.2. Every element $d \in Z_D$ has a multiplicative
inverse w.r.t. modulo $2^N$ multiplication, i.e. $x \in Z_D$ may
be found such that $dx \equiv 1$ (modulo $2^N$).
Proof. Consider $d \in Z_D$. From Lemma 6.0, d x d is also
in $Z_D$, and similarly for $d \times d \times \ldots \times d = d^k$, k an integer. Since
$Z_D$ is a finite set, then for a particular k, say k', there must be
a repetition, i.e.
$d^{k'} \equiv d$ (modulo $2^N$),
i.e.
$d\,(d^{k'-1} - 1) \equiv 0$ (modulo $2^N$).
Since d is coprime to 2 and hence $2^N$, the above equation implies
that $(d^{k'-1} - 1)$ is divisible by $2^N$. Therefore we can write
$d^{k'-1} \equiv 1$ (modulo $2^N$)
or
$d \cdot d^{k'-2} \equiv 1$ (modulo $2^N$),
i.e. for any $d \in Z_D$, we can find its inverse, which is $d^{k'-2}$.
Lemma 6.3. $Z_D$ contains $2^{N-1}$ elements.
Proof. All the even elements from the original $Z_N$
are of the form
$0,\ 2 \times 1,\ 2 \times 2,\ \ldots,\ 2 \times \ell,\ \ldots,\ 2 \times (2^{N-1} - 1)$,    $\ell = 0, 1, \ldots, 2^{N-1} - 1$.
There are $(2^{N-1} - 1) + 1 = 2^{N-1}$ even elements. Consequently,
the number of odd elements, which we denote by $|Z_D|$, is given by
$|Z_D| = 2^N - 2^{N-1} = 2^{N-1}$.
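Lemmas 6.0 to 6.3 can be spot-checked numerically for small word lengths. The sketch below follows the construction in the proof of Lemma 6.2, in which the inverse is found as $d^{k'-2}$ once the powers of d repeat; the function names are hypothetical and the check is an illustration, not a substitute for the proofs.

```python
def odd_residues(n_bits):
    """The set Z_D: odd integers from 1 to 2^n_bits - 1."""
    return list(range(1, 1 << n_bits, 2))


def inverse_by_powers(d, n_bits):
    """Return d^(k'-2), where k' > 1 is the first exponent with d^k' = d (mod 2^n_bits)."""
    modulus = 1 << n_bits
    power, k = (d * d) % modulus, 2
    while power != d % modulus:
        power = (power * d) % modulus
        k += 1
    return pow(d, k - 2, modulus)


n_bits = 4
modulus = 1 << n_bits
z_d = odd_residues(n_bits)

print(len(z_d) == 2 ** (n_bits - 1))                             # Lemma 6.3
print(all((a * b) % modulus in z_d for a in z_d for b in z_d))   # Lemma 6.0
print({d: inverse_by_powers(d, n_bits) for d in z_d})            # Lemma 6.2
```

For modulo 16 the printed inverse table pairs 3 with 11 and 5 with 13, while 1, 7, 9 and 15 are their own inverses.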
The above four le\llIDas lead naturally to the following. theorem 
. N 
with which the overall structure of the general reduced modulo 2 
multiplier may be described. 
Theorem 6.0. The set of odd integers, i.e. $Z_D = \{x : x$
odd integer, $1 \le x \le 2^N - 1\}$, forms an Abelian group* under
multiplication modulo $2^N$. This group, which we denote by
$G(2^N)$, has order $|G(2^N)| = 2^{N-1}$.
6.3.1 Derivation of detailed algebraic structure of reduced
modulo $2^N$ multipliers.
Before we present our main results, we state below some well
known results and concepts in algebraic number theory that we will
be using.
Definition 6.0. Let $x \in Z_m$, where $Z_m = \{x : x$ integer,
$0 \le x < m\}$, and suppose $x^{\phi(m)} \equiv 1$ (modulo m), where $\phi(m)$ is the
number of elements of $Z_m$ coprime to m. Also let $x^n \not\equiv 1$ (modulo m) for
$0 < n < \phi(m)$. Then x is called a primitive root of 1 modulo m or
simply a primitive root of m.
* References 68-70 are excellent introductions to the concepts and
terminology of elementary group theory.
Theorem 6.1. $Z_m$ has a primitive root of 1 modulo m
if and only if
a) $m = p^N$,
b) $m = 2p^N$ or
c) m = 1, 2, 4, i.e. $m = 2^0, 2^1, 2^2$,
where p is any odd prime.
Proof. See Theorem 2.25 of Reference 47.
We see that (c) can be used directly to describe the structure
of our reduced multipliers for moduli $2^0$, $2^1$ and $2^2$, by determining
the relevant primitive root and using the following Theorem 6.2.
Theorem 6.2. If $Z_{p^N}$ has a primitive root of 1 modulo $p^N$,
then $G(p^N)$ is a cyclic group, where $Z_{p^N} = \{x : x$ integer,
$0 \le x < p^N\}$, and $G(p^N)$ is the group consisting of the
set of elements of $Z_{p^N}$ that are coprime to $p^N$, and the
modulo $p^N$ multiplication.
Proof. Suppose $Z_{p^N}$ has x as a primitive root of $p^N$.
Then x must be coprime to $p^N$, and hence $x \in G(p^N)$. Let H be the
subgroup of $G(p^N)$ generated by x. Then $|H| = O(x) = \phi(p^N)$, i.e.
$|H| = |G(p^N)|$, where $|H|$ and $O(x)$ are the orders of H and the element
x respectively. Since $H \subseteq G(p^N)$ we must have $H = G(p^N)$, i.e.
$G(p^N)$ is cyclically generated by x.
Theorem 6.3. If G is an Abelian group of order $p^N$ for
some prime p and natural number N, then, for some natural number q,
$G = H_1 \times H_2 \times \dots \times H_i \times \dots \times H_q$,
where each $H_i$ is cyclic of order $p^{k_i}$ and $\sum_i k_i = N$.
Proof. See Theorem 5.1.11 of Ref. 48.
From Theorems 6.1 and 6.2 we now know that for each of the
moduli $2^0$, $2^1$ and $2^2$, the complete multiplication table of the
corresponding reduced multiplier can be generated by a single element
$x \in Z_D$, if x is a primitive root of $2^N$, N = 0, 1, 2.
In our following two lemmas and one theorem we extend the
analysis to cases where $N \ge 3$ and will show that the table of a
general reduced modulo $2^N$ multiplier, $N \ge 3$, can still be described
completely but this time two elements $z_i, z_j \in Z_D$ are required to
generate it.
Lemma 6.4. $3^{2^n} = 2^{n+2}\, x(n) + 1$ for $n \ge 1$, where
x(n) is an odd number for all n.
Proof. $3^{2^1} = 9 = 8 + 1 = 2^3 + 1 = (2^{1+2}) \times 1 + 1$.
So the expression is true for n = 1, where x(1) = 1. We now assume
that it is true for n = k, i.e.
$3^{2^k} = 2^{k+2}\, x(k) + 1$.
Writing the expression for $3^{2^{k+1}}$ we get
$3^{2^{k+1}} = 3^{2^k \cdot 2} = (3^{2^k})^2$.
Substituting for $3^{2^k}$, we have
$3^{2^{k+1}} = [2^{k+2}\, x(k) + 1]^2$
$= (2^{k+2}\, x(k))^2 + 2\,(2^{k+2}\, x(k)) + 1$
$= 2^{2k+4}\, (x(k))^2 + 2^{k+3}\, x(k) + 1$
$= 2^{k+3}\, [2^{k+1}\, (x(k))^2 + x(k)] + 1$
$= 2^{k+3}\, x(k+1) + 1$,
where we have let $[2^{k+1}\, (x(k))^2 + x(k)]$ be equal to x(k+1), which
is odd since x(k) is odd and $2^{k+1}\, (x(k))^2$ is even. Therefore the
Lemma is proved by induction.
(We note that x(1) = 1 and that x(n) is defined recursively
using the expression
$x(n) = 2^n\, \{x(n-1)\}^2 + x(n-1)$.)
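The identity of Lemma 6.4 and the recursion for x(n) can be checked numerically. The following is a minimal sketch in Python; the function names `x_seq` and `lemma_6_4_holds` are illustrative only and do not appear in the thesis.

```python
def x_seq(n_max):
    """Return [x(1), ..., x(n_max)] with x(1) = 1 and
    x(n) = 2**n * x(n-1)**2 + x(n-1), as in the note to Lemma 6.4."""
    xs = [1]
    for n in range(2, n_max + 1):
        prev = xs[-1]
        xs.append(2**n * prev**2 + prev)
    return xs


def lemma_6_4_holds(n_max):
    """Check 3**(2**n) == 2**(n+2) * x(n) + 1, with x(n) odd,
    for every n from 1 to n_max (exact integer arithmetic)."""
    return all(
        3 ** (2**n) == 2 ** (n + 2) * x + 1 and x % 2 == 1
        for n, x in enumerate(x_seq(n_max), start=1)
    )


if __name__ == "__main__":
    print(x_seq(3))            # first few values of x(n)
    print(lemma_6_4_holds(8))  # identity checked for n = 1..8
```

Python integers are unbounded, so the comparison is exact rather than modular.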
Corollary. The element 3 in $Z_{2^N}$ has order $2^{N-2}$ in $G(2^N)$.
Proof. Since 3 is in $G(2^N)$, and the order of $G(2^N)$ is
$2^{N-1}$ (see Lemma 6.3), then 3 has order $2^k$ in $G(2^N)$ such that
$0 < k \le N-1$. Thus we may write
$3^{2^k} \equiv 1$ modulo $2^N$.
Using Lemma 6.4, we can substitute for $3^{2^k}$ thus obtaining
$2^{k+2}\, x(k) + 1 \equiv 1$ modulo $2^N$,
i.e. $2^{k+2}\, x(k) \equiv 0$ modulo $2^N$,
implying that $2^{k+2}$ is divisible by $2^N$ since x(k) is odd. Thus we
obtain
$2^{k+2} = q\, 2^N$, q = 1, 2, ...
The first (or least non-zero) value of k to satisfy the above
equation is when q = 1, resulting in k+2 = N, and hence k = N-2.
Therefore 3 has order $2^{N-2}$ in $G(2^N)$.
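The Corollary can be confirmed by brute force for small word lengths. This is a minimal sketch; `order_mod` is a hypothetical helper name, and the range of N tested is an arbitrary choice.

```python
from math import gcd


def order_mod(a, modulus):
    """Multiplicative order of a modulo `modulus`.

    Raises ValueError if a is not coprime to the modulus, since such an
    element has no power congruent to 1.
    """
    if gcd(a, modulus) != 1:
        raise ValueError("a must be coprime to the modulus")
    k, power = 1, a % modulus
    while power != 1:
        power = (power * a) % modulus
        k += 1
    return k


if __name__ == "__main__":
    for N in range(3, 13):
        # The Corollary to Lemma 6.4 predicts order 2**(N-2).
        print(N, order_mod(3, 2**N), 2 ** (N - 2))
```

The same helper can be applied to 5, which the text later uses as an alternative generator.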
Lemma 6.5. There are four elements, $\pm 1$ and $2^{N-1} \pm 1$,
in $G(2^N)$ having order 2.
Proof. Let $x \in G(2^N)$ have order 2,
i.e.
$x^2 \equiv 1$ modulo $2^N$
or
$x^2 - 1 \equiv 0$ modulo $2^N$.
Since x is odd, we can write
$x = 2q + 1$.
Therefore
$(2q+1)^2 - 1 \equiv 0$ modulo $2^N$,
i.e.
$4q^2 + 4q + 1 - 1 \equiv 0$ modulo $2^N$
or
$2^2\, q(q+1) \equiv 0$ modulo $2^N$,
i.e.
$q(q+1) \equiv 0$ modulo $2^{N-2}$.
In this congruence, we see that if q is even then q+1 is odd and
vice versa, so that $2^{N-2}$ must divide whichever of the two factors
is even.
Case (i). Let q be even. This implies that q is divisible
by $2^{N-2}$, i.e.
$q = 0,\ 1 \times 2^{N-2},\ 2 \times 2^{N-2},\ \dots,\ \ell \times 2^{N-2}$,
where $\ell$ is an integer. x can be written as
$x = 2\,[\ell\, 2^{N-2}] + 1$.
For $\ell$ even, i.e. $\ell = 2u$, say, we obtain
$x = 2\,[2u\, 2^{N-2}] + 1$
$= 2\,[u\, 2^{N-1}] + 1$
$= u\, 2^N + 1$.
Therefore
$x \equiv 1$ (modulo $2^N$) ... (6.0)
If $\ell$ is odd, i.e. $\ell = 2v + 1$ say, then
$x = 2\,[(2v+1)\, 2^{N-2}] + 1$
$= v\, 2^N + (2^{N-1} + 1)$.
Therefore
$x \equiv 2^{N-1} + 1$ (modulo $2^N$) ... (6.1)
Case (ii). Let q be odd, thus implying that (q+1) is divisible
by $2^{N-2}$, i.e.
$(q+1) = \ell\, 2^{N-2}$, $\ell = 0, 1, \dots$
or
$q = \ell\, 2^{N-2} - 1$.
Hence
$x = 2\,[\ell\, 2^{N-2} - 1] + 1$
$= \ell\, 2^{N-1} - 1$.
For $\ell$ even, i.e. $\ell = 2u$, we obtain
$x = 2u\, 2^{N-1} - 1 = u\, 2^N - 1$,
so that
$x \equiv -1$ (modulo $2^N$) ... (6.2)
For $\ell$ odd, i.e. $\ell = 2v + 1$, then
$x = (2v+1)\, 2^{N-1} - 1$
$= v\, 2^N + (2^{N-1} - 1)$,
so that
$x \equiv 2^{N-1} - 1$ (modulo $2^N$) ... (6.3)
Consequently, from equations (6.0) - (6.3), we obtain the
result that if x is to have order 2 then
$x \equiv 1,\ 2^{N-1} + 1,\ -1$ and $2^{N-1} - 1$ (modulo $2^N$).
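The four solutions of $x^2 \equiv 1$ (modulo $2^N$) can also be found by exhaustive search over the odd residues. The sketch below assumes $N \ge 3$; the function name is illustrative.

```python
def square_roots_of_one(N):
    """All odd x in [1, 2**N - 1] with x*x congruent to 1 modulo 2**N."""
    modulus = 2**N
    return sorted(x for x in range(1, modulus, 2) if (x * x) % modulus == 1)


def predicted_roots(N):
    """The four values given by equations (6.0)-(6.3), reduced modulo 2**N."""
    modulus, half = 2**N, 2 ** (N - 1)
    return sorted({1, half + 1, modulus - 1, half - 1})


if __name__ == "__main__":
    for N in range(3, 9):
        print(N, square_roots_of_one(N), predicted_roots(N))
```

Note that the search includes x = 1, which satisfies the congruence although its order is 1, in line with the way the Lemma counts its four elements.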
Corollary. (a) The values $x_h \equiv 1$ and $x_i \equiv 2^{N-1} + 1$
(modulo $2^N$) may be expressed as powers of 3 (modulo $2^N$).
(b) The values $x_j \equiv -1$ and $x_k \equiv 2^{N-1} - 1$
(modulo $2^N$) cannot be powers of 3.
Proof. Using Lemma 6.4 and letting n = N-2 and N-3
respectively, we obtain the expressions
$3^{2^{N-2}} = 2^N\, x(N-2) + 1$ ... (6.4)
and
$3^{2^{N-3}} = 2^{N-1}\, x(N-3) + 1$ ... (6.5)
The values in (a) above may be written as
$x_h = 2^N\, q_h + 1$ ... (6.6)
and
$x_i = 2^{N-1}\, q_i + 1$ ... (6.7)
where $q_h$ and $q_i$ are integers. We may now, by comparing the coefficients
of the terms $2^N$ in equations (6.4) and (6.6), and the terms $2^{N-1}$
in equations (6.5) and (6.7), deduce that if we let $q_h = x(N-2)$ and
$q_i = x(N-3)$ then
$x_h = 3^{2^{N-2}}$
and
$x_i = 3^{2^{N-3}}$,
thus proving (a).
The values $x_j$ and $x_k$ may be written as
$x_j \equiv x_h - 2$ ... (6.8)
and
$x_k \equiv x_i - 2$ ... (6.9)
In the above equations, we know that $x_h$ and $x_i$, being powers of 3,
must be divisible by 3. However 2 is not. Therefore $x_j$ and $x_k$ are
not divisible by 3, and since this is a necessary condition for
them to be powers of 3, then we have proved (b).
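Both parts of the Corollary can be cross-checked by listing the powers of 3 modulo $2^N$. The sketch below is restricted to $N \ge 4$, since Lemma 6.4 is applied with $n = N-3 \ge 1$. As an independent observation (not taken from the thesis), every power of 3 is congruent to 1 or 3 modulo 8, whereas $-1$ and $2^{N-1} - 1$ are congruent to 7 modulo 8 when $N \ge 4$, which gives the same conclusion for part (b). Function names are illustrative.

```python
def powers_of_three(N):
    """Set of all distinct powers of 3 modulo 2**N."""
    modulus = 2**N
    seen, p = set(), 1
    while p not in seen:
        seen.add(p)
        p = (p * 3) % modulus
    return seen


def check_corollary(N):
    """True if 1 and 2**(N-1)+1 are powers of 3 modulo 2**N,
    while -1 and 2**(N-1)-1 are not."""
    modulus, half = 2**N, 2 ** (N - 1)
    K = powers_of_three(N)
    return (1 in K and half + 1 in K
            and modulus - 1 not in K and half - 1 not in K)


if __name__ == "__main__":
    print(sorted(powers_of_three(4)))
    print(all(check_corollary(N) for N in range(4, 11)))
```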
Consider now the set K given by
$K = \{k_0, k_1, \dots, k_i, \dots, k_{2^{N-2}-1}\}$,
where $k_i \in G(2^N)$ and $k_i \equiv 3^i$ modulo $2^N$.
For any pair $k_i, k_j \in K$,
we therefore have
$k_i \times k_j \equiv 3^i \times 3^j \equiv 3^{(i+j)}$ modulo $2^N$.
(i+j) may be written as $q\, 2^{N-2} + r$, where q = 0 or 1 and $0 \le r < 2^{N-2}$, so that
$k_i \times k_j \equiv 3^{q\, 2^{N-2}}\, 3^r$ modulo $2^N$
$\equiv 3^r$ modulo $2^N$, if q = 0.
If q = 1, we use the Corollary to Lemma 6.4 to obtain
$k_i \times k_j \equiv (Q\, 2^N + 1)\, 3^r$ modulo $2^N$, Q an integer,
$\equiv 3^r$ modulo $2^N$.
Since $0 \le r < 2^{N-2}$,
$k_r \equiv (k_i \times k_j) \in K$.
Therefore K is closed and hence is a subgroup of $G(2^N)$.
Consider now the set L, given by
$L = \{\ell_0, \ell_1\}$,
where
$\ell_i \equiv x^i$ modulo $2^N$, for i = 0, 1.
The value of x is either $x \equiv -1$ modulo $2^N$ or $x \equiv 2^{N-1} - 1$ modulo $2^N$
(see Corollary (b) of Lemma 6.5).
To show that L is closed and hence a subgroup of $G(2^N)$, it is sufficient
to demonstrate that $x \times x = x^2 \equiv 1$ modulo $2^N$, which holds according to Lemma 6.5.
Therefore $x^2 \in L$.
We are now in a position to present a detailed description of
the algebraic structure of our reduced modulo $2^N$ multiplier, $N \ge 3$,
via the following Theorem.
Theorem 6.4. The group $G(2^N)$, as described by Theorem 6.0,
for $N \ge 3$, is isomorphic to the direct product group K x H,
where K and H are cyclic groups of orders $2^{N-2}$ and 2
respectively.
Proof. $G(2^N)$, $N \ge 3$, cannot be cyclic since if it were,
it would have a primitive root of $2^N$ and this contradicts Theorem 6.1.
So by Theorem 6.3, $G(2^N) = H_1 \times H_2 \times \dots \times H_q$, $q \ge 2$. Let K be the
subgroup generated by 3, so that $|K| = 2^{N-2}$. K is $H_1$ say,
since K is cyclic (being generated by a single element), and $G(2^N)$
is not. By order considerations then $G(2^N) = H_1 \times H_2$ where $H_2$
is cyclic of order 2. This gives our result.
Since L is not a subset of K, we also have $H_2 = L$.
6.3.2 Application of theoretical results.
The subgroups K and L may now be used to organise $G(2^N)$ by
forming the relevant cosets* in the usual way.
Let the cosets w.r.t. K be $V_0$ and $V_1$ and those w.r.t. L be
$W_0, W_1, \dots, W_n$, $n = (2^{N-2} - 1)$, given by
$V_0 = (v_{0,0}, \dots, v_{0,i}, \dots, v_{0,n}) = K$,
$V_1 = (v_{1,0}, \dots, v_{1,i}, \dots, v_{1,n})$
and
$W_i = (w_{i,0}, w_{i,1})$,
where $v_{d,i} \equiv x^d\, 3^i$ modulo $2^N$, d = 0 or 1, $0 \le i \le n$,
and $w_{i,d} \equiv 3^i\, x^d$, in which the value of x is either
$x_j$ or $x_k$ as described by Corollary (b) of Lemma 6.5.
For example, if N = 4, then the elements of $G(2^4)$ are
1,3,5,7,9,11,13,15, and hence $K = (3^0, 3^1, 3^2, 3^3) = (1,3,9,11)$
modulo $2^4$, and L is either (1,7) or (1,15). Consequently, taking
x = 7, the relevant cosets are
$V_0 = (1,3,9,11)$, $V_1 = (7,5,15,13)$,
and
$W_0 = (1,7)$, $W_1 = (3,5)$, $W_2 = (9,15)$, $W_3 = (11,13)$.
* Since the multiplication operation is commutative the elements of
K (or L) can be multiplied by any particular element of $G(2^N)$ either
from the left or from the right.
In general, each element $g_{\ell,m}$ of $G(2^N)$ can be represented by
the modulo $2^N$ product of powers of x and 3 via the following
congruence, i.e.
$x^{\ell}\, 3^m \equiv g_{\ell,m}$ modulo $2^N$,
where $\ell = 0$ or 1 and $0 \le m \le n$.
From the corollary to Lemma 6.4, and the results in Lemma 6.5,
the components $3^m$ and $x^{\ell}$ will go through $2^{N-2}$ and 2 values
respectively before repeating themselves. Therefore this will
generate $(2^{N-2}) \times 2 = 2^{N-1}$ different values of $x^{\ell}\, 3^m$ (modulo $2^N$).
Since there are also $2^{N-1}$ different elements of $G(2^N)$, the above
congruence describes $g_{\ell,m}$ uniquely.
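The representation $g_{\ell,m} \equiv x^{\ell}\, 3^m$ can be tabulated directly. The following sketch builds the table for a given N and a chosen x of order 2 outside K; the function name and the default choice $x = 2^N - 1$ are assumptions made for illustration.

```python
def representation_table(N, x=None):
    """Map each index pair (l, m) to g = x**l * 3**m modulo 2**N.

    x should be an element of order 2 that is not a power of 3,
    e.g. 2**N - 1 (the default) or 2**(N-1) - 1 for N >= 4.
    """
    modulus = 2**N
    if x is None:
        x = modulus - 1
    table = {}
    for l in range(2):
        for m in range(2 ** (N - 2)):
            table[(l, m)] = (pow(x, l, modulus) * pow(3, m, modulus)) % modulus
    return table


if __name__ == "__main__":
    table = representation_table(4, x=7)
    for pair, g in sorted(table.items()):
        print(pair, g)
```

For N = 4 and x = 7 the eight values produced are exactly the odd residues modulo 16, each appearing once, which is the uniqueness claimed above.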
Let us now express $G(2^N)$ in terms of its cosets, i.e.
$G(2^N) = \{V_0, V_1\}$ w.r.t. the subgroup K,
and
$G(2^N) = \{W_0, W_1, \dots, W_n\}$ w.r.t. the subgroup L.
If we consider any two elements of $G(2^N)$, say $g_{\ell',m'}$ and
$g_{\ell'',m''}$, then, as shown in Appendix 6.0, their product P is
given by
$P \equiv g_{i,j}$ modulo $2^N$,
where
$(\ell' + \ell'') \equiv i$ modulo 2
and
$(m' + m'') \equiv j$ modulo $2^{N-2}$.
Since the subscripts $\ell$ and m denote the cosets w.r.t. the subgroups K and L
respectively, then we know that
$g_{\ell',m'} \in V_{\ell'}$ and also $\in W_{m'}$,
and
$g_{\ell'',m''} \in V_{\ell''}$ and also $\in W_{m''}$.
Furthermore, their modulo $2^N$ product $g_{i,j}$ belongs to both cosets
$V_i$ and $W_j$, where i and j are the modulo 2 and modulo $2^{N-2}$ sums
of $\ell', \ell''$ and m', m'' respectively.
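Consequently, if we denote the operation* between any two
cosets (w.r.t. K) by $\Box_\ell$, and that between any two (w.r.t. L) by
$\Box_m$, then it is not difficult to see that
$V_{\ell'} \,\Box_\ell\, V_{\ell''} = V_{(\ell'+\ell'') \text{ modulo } 2}$ ... (6.10)
and
$W_{m'} \,\Box_m\, W_{m''} = W_{(m'+m'') \text{ modulo } 2^{N-2}}$ ... (6.11)
In other words, if each coset is mapped onto its corresponding
index, i.e.
$V_{\ell'} \to \ell'$, $V_{\ell''} \to \ell''$
and
$W_{m'} \to m'$, $W_{m''} \to m''$,
then operations between cosets may be mapped onto modulo addition
operations between indices as shown by the two commutative diagrams
below.
(a)
$$\begin{array}{ccc}
(V_{\ell'},\, V_{\ell''}) & \xrightarrow{\;\Box_\ell\;} & V_{\ell'} \,\Box_\ell\, V_{\ell''} = V_i \\
\downarrow & & \downarrow \\
(\ell',\, \ell'') & \xrightarrow{\;\oplus_2\;} & i = (\ell' + \ell'') \text{ modulo } 2
\end{array}$$
(b)
$$\begin{array}{ccc}
(W_{m'},\, W_{m''}) & \xrightarrow{\;\Box_m\;} & W_{m'} \,\Box_m\, W_{m''} = W_j \\
\downarrow & & \downarrow \\
(m',\, m'') & \xrightarrow{\;\oplus_{2^{N-2}}\;} & j = (m' + m'') \text{ modulo } 2^{N-2}
\end{array}$$
* i.e. the modulo $2^N$ multiplication of any member of any one coset
with any member of the same or any other coset.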
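Finally, the complete reduced modulo $2^N$ multiplication may be
described in a compact way as follows.
Let $f_g$ and $f_p$ be the mappings given by
$f_g : g_{\ell,m} \to (\ell, m)$
and
$f_p : \otimes_{2^N} \to \odot$,
where
$g_{\ell,m} \in G(2^N)$, $\ell \in \{0, 1\}$, $m \in \{0, 1, 2, \dots, 2^{N-2} - 1\}$
and $\odot$ is the parallel component-wise operation between any two
ordered pairs $(\ell', m')$ and $(\ell'', m'')$, i.e.
$(\ell', m') \odot (\ell'', m'') = ([\ell' \oplus_2 \ell''],\, [m' \oplus_{2^{N-2}} m'']) = (i, j)$.
Thus, for any two elements of $G(2^N)$, say $g_{\ell',m'}$ and $g_{\ell'',m''}$,
we have the following useful commutative diagram.
(c)
$$\begin{array}{ccc}
(g_{\ell',m'},\, g_{\ell'',m''}) & \xrightarrow{\;\otimes_{2^N}\;} & (g_{\ell',m'} \times g_{\ell'',m''}) \text{ modulo } 2^N = g_{i,j} \\
\downarrow f_g & & \downarrow f_g \\
((\ell',m'),\, (\ell'',m'')) & \xrightarrow{\;\odot\;} & (\ell',m') \odot (\ell'',m'') = ([\ell' \oplus_2 \ell''],\, [m' \oplus_{2^{N-2}} m'']) = (i,j)
\end{array}$$
It is now easily seen that the mapping-pair $f_g$, $f_p$ transforms
the original reduced multiplier into two adders, modulo 2 and
modulo $2^{N-2}$ respectively, operating in parallel.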
To illustrate this isomorphism between the multiplier and the
adder-pair, consider again the case when N = 4. We have already
seen on page 132 that two possible organisations of $G(2^4)$ into sets
of cosets are
(a) $G(2^4) = \{(1,3,9,11);\ (7,5,15,13)\}$
and
(b) $G(2^4) = \{(1,7);\ (3,5);\ (9,15);\ (11,13)\}$.
Using these cosets, the modulo $2^4$ multiplication table may
be 'rearranged' as shown in Tables 6.10 and 6.12, and the corresponding
operations between cosets are shown in Tables 6.11 and 6.13. The
tables illustrated there are easily seen to be identical to the
addition tables modulo 2 and modulo ($2^{4-2} = 4$) respectively.
Consider multiplying, modulo 16, the number 9 by 11. Using
the mappings shown in Tables 6.11 and 6.13, and the commutative
diagram (c), we may substitute additions modulo 2 and modulo 4 for
our original modulo 16 multiplication as shown below.
$$\begin{array}{ccc}
(9,\, 11) & \xrightarrow{\;\otimes_{16}\;} & (9 \times 11) \text{ modulo } 16 = 3 \\
\downarrow f_g & & \downarrow f_g \\
((0,2),\, (0,3)) & \xrightarrow{\;\odot\;} & (0,2) \odot (0,3) = ([0 \oplus_2 0],\, [2 \oplus_4 3]) = (0,1)
\end{array}$$
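The substitution of two parallel additions for the multiplication can be sketched in a few lines of Python. The names `make_index_maps` and `multiply_via_adders` are illustrative, and the lookup tables stand in for the encoding and decoding circuits; this is not the thesis's hardware design, only a check of the arithmetic.

```python
def make_index_maps(N, x, w=3):
    """Build f_g (element -> (l, m)) and its inverse for G(2**N),
    using g = x**l * w**m modulo 2**N with l in {0, 1}."""
    modulus = 2**N
    to_pair = {}
    for l in range(2):
        for m in range(2 ** (N - 2)):
            g = (pow(x, l, modulus) * pow(w, m, modulus)) % modulus
            to_pair[g] = (l, m)
    from_pair = {pair: g for g, pair in to_pair.items()}
    return to_pair, from_pair


def multiply_via_adders(a, b, N, x, w=3):
    """Multiply odd a and b modulo 2**N by adding their index pairs:
    first components modulo 2, second components modulo 2**(N-2)."""
    to_pair, from_pair = make_index_maps(N, x, w)
    (l1, m1), (l2, m2) = to_pair[a], to_pair[b]
    pair = ((l1 + l2) % 2, (m1 + m2) % 2 ** (N - 2))
    return pair, from_pair[pair]


if __name__ == "__main__":
    # The worked example from the text: 9 x 11 modulo 16, with x = 7.
    print(multiply_via_adders(9, 11, N=4, x=7))
```

With N = 4 and x = 7 the call reproduces the index pair (0, 1) and the product 3 obtained in the diagram above.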
Instead of 3, we can also use 5 to generate the subgroup of
order 4, and 15 may be chosen for the subgroup of order 2, thus
obtaining K = (1,5,9,13) and L = (1,15) respectively. The corresponding
tables reorganised by these subgroups are shown in Tables 6.14 and
6.15 respectively. The sets of cosets are now {(1,5,9,13); (15,11,7,3)}
and {(1,15); (5,11); (9,7); (13,3)} and the tables for the operations
between these cosets can be derived in the way discussed previously.
Other possible pairs of K and L are {(1,3,9,11); (15,13,7,5)},
{(1,15); (3,13); (9,7); (11,5)} and {(1,5,9,13); (7,3,15,11)},
{(1,7); (5,3); (9,15); (13,11)}.
Rows: A'; columns: B'.
 ⊗16 |  1   3   9  11 |  7   5  15  13
-----+----------------+----------------
   1 |  1   3   9  11 |  7   5  15  13
   3 |  3   9  11   1 |  5  15  13   7
   9 |  9  11   1   3 | 15  13   7   5
  11 | 11   1   3   9 | 13   7   5  15
-----+----------------+----------------
   7 |  7   5  15  13 |  1   3   9  11
   5 |  5  15  13   7 |  3   9  11   1
  15 | 15  13   7   5 |  9  11   1   3
  13 | 13   7   5  15 | 11   1   3   9
Table 6.10. Reduced multiplication table
organised by subgroup (1,3,9,11).
 ⊕2 | 0  1
----+------
  0 | 0  1
  1 | 1  0
⊗16 → ⊕2
(1,3,9,11) → 0
(7,5,15,13) → 1
Table 6.11. Operation between blocks.
Rows: A'; columns: B'.
 ⊗16 |  1   7 |  3   5 |  9  15 | 11  13
-----+--------+--------+--------+--------
   1 |  1   7 |  3   5 |  9  15 | 11  13
   7 |  7   1 |  5   3 | 15   9 | 13  11
-----+--------+--------+--------+--------
   3 |  3   5 |  9  15 | 11  13 |  1   7
   5 |  5   3 | 15   9 | 13  11 |  7   1
-----+--------+--------+--------+--------
   9 |  9  15 | 11  13 |  1   7 |  3   5
  15 | 15   9 | 13  11 |  7   1 |  5   3
-----+--------+--------+--------+--------
  11 | 11  13 |  1   7 |  3   5 |  9  15
  13 | 13  11 |  7   1 |  5   3 | 15   9
Table 6.12. Reduced multiplier table
organised by subgroup (1,7).
 ⊕4 | 0  1  2  3
----+------------
  0 | 0  1  2  3
  1 | 1  2  3  0
  2 | 2  3  0  1
  3 | 3  0  1  2
⊗16 → ⊕4
(1,7) → 0, (3,5) → 1,
(9,15) → 2 and (11,13) → 3
Table 6.13. Operation between blocks.
Rows: A'; columns: B'.
 ⊗16 |  1   5   9  13 | 15  11   7   3
-----+----------------+----------------
   1 |  1   5   9  13 | 15  11   7   3
   5 |  5   9  13   1 | 11   7   3  15
   9 |  9  13   1   5 |  7   3  15  11
  13 | 13   1   5   9 |  3  15  11   7
-----+----------------+----------------
  15 | 15  11   7   3 |  1   5   9  13
  11 | 11   7   3  15 |  5   9  13   1
   7 |  7   3  15  11 |  9  13   1   5
   3 |  3  15  11   7 | 13   1   5   9
Table 6.14. Modulo $2^4$ reduced multiplication
table organised by subgroup (1,5,9,13).
Rows: A'; columns: B'.
 ⊗16 |  1  15 |  5  11 |  9   7 | 13   3
-----+--------+--------+--------+--------
   1 |  1  15 |  5  11 |  9   7 | 13   3
  15 | 15   1 | 11   5 |  7   9 |  3  13
-----+--------+--------+--------+--------
   5 |  5  11 |  9   7 | 13   3 |  1  15
  11 | 11   5 |  7   9 |  3  13 | 15   1
-----+--------+--------+--------+--------
   9 |  9   7 | 13   3 |  1  15 |  5  11
   7 |  7   9 |  3  13 | 15   1 | 11   5
-----+--------+--------+--------+--------
  13 | 13   3 |  1  15 |  5  11 |  9   7
   3 |  3  13 | 15   1 | 11   5 |  7   9
Table 6.15. Multiplication table organised
by subgroup (1,15).
If the reduced modulo $2^N$ multiplier is modelled as a finite-state
machine, we recall that the sets of values that the 'forced'
operands A' and B' can take are regarded as the 'input' set and
the 'internal state' set respectively, and are both equal to $Z_D$.
In such a case, the sets of cosets w.r.t. the subgroups K and L
are equivalent to S.P. partitions on the input and state sets of
the F.S.M. model respectively. If these partitions are denoted
by $\pi_K$ and $\pi_L$, we also observe that as a direct consequence of the
isomorphism between $G(2^N)$ and K x L, we have $\pi_K \cdot \pi_L = \pi(0) = Z_D$.
Furthermore, the tables in which the 'inputs' and 'states' are the
cosets w.r.t. K and L can now be looked upon as the homomorphic
$\pi_K$-image and $\pi_L$-image, respectively, of the F.S.M. model of the
reduced multiplier.
In practice, the results that we have derived are easily applied
to the implementation of the general modulo $2^N$ reduced multiplier.
The subgroup K is first generated by simply forming, modulo $2^N$, the
successive powers, up to the $(2^{N-2} - 1)$th, of 3 or 5, e.g.
$K = \{3^0 = 1,\ 3^1,\ 3^2,\ \dots,\ 3^{2^{N-2}-1};\ 3^{2^{N-2}} = 1\}$,
either manually or by
means of a straightforward computer program for large values of N.
The corresponding $\pi_K$-image, being isomorphic to a modulo $2^{N-2}$ adder,
may now be structurally decomposed using the loop-free technique* for
adders as in Section 5.2.1.1. The generation of L is trivial.
* Unlike the direct loop-free structure of the multiplier (see
Chap. 5), the loop-free configuration of a general modulo $2^N$ adder
is composed of sub-machines or components whose algebraic structures
are regular, and are easily described and generalised.
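The 'straightforward computer program' mentioned above is not given in the text; one possible form, assuming only the procedure just described (successive powers of the generator modulo $2^N$), is sketched below. The function name is illustrative.

```python
def generate_subgroup(generator, N):
    """List the powers generator**0 .. generator**(2**(N-2) - 1), modulo 2**N.

    For generator 3 or 5 and N >= 3 these are the elements of the
    cyclic subgroup K, in order of increasing exponent.
    """
    modulus = 2**N
    elements, power = [], 1
    for _ in range(2 ** (N - 2)):
        elements.append(power)
        power = (power * generator) % modulus
    return elements


if __name__ == "__main__":
    print(generate_subgroup(3, 4))
    print(generate_subgroup(3, 5))
    print(generate_subgroup(5, 5))
```

The loop performs one modular multiplication per element, so the cost grows with the size of K, i.e. with $2^{N-2}$.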
For example, if N = 5, i.e. $2^N = 32$, then the two possible
forms of K are
$K(3) = (1,\ 3^1 = 3,\ 3^2 = 9,\ 3^3 = 27,\ 3^4 = 17,\ 3^5 = 19,\ 3^6 = 25,\ 3^7 = 11;\ 3^8 = 1)$
and
$K(5) = (1,\ 5^1 = 5,\ 5^2 = 25,\ 5^3 = 29,\ 5^4 = 17,\ 5^5 = 21,\ 5^6 = 9,\ 5^7 = 13;\ 5^8 = 1)$.
Similarly, for the subgroup L, we have
L(15) = (1,15) and L(31) = (1,31).
We thus have the S.P. partitions
$\pi_{K(3)}$ = {1,3,9,27,17,19,25,11; 5,15,13,7,21,31,29,23}
$\pi_{K(5)}$ = {1,5,25,29,17,21,9,13; 3,15,11,23,19,31,27,7}
$\pi_{L(15)}$ = {1,15; 3,13; 9,7; 27,21; 17,31; 19,29; 25,23; 11,5}
$\pi_{L(31)}$ = {1,31; 3,29; 5,27; 7,25; 9,23; 11,21; 13,19; 15,17}.
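The partitions listed above are the cosets of the chosen subgroup, and can be generated mechanically. The sketch below takes the subgroup as an explicit list; `coset_partition` is an illustrative name, and the order in which blocks are emitted (by smallest unused odd representative) is a choice made here, not one prescribed by the text.

```python
def coset_partition(subgroup, N):
    """Partition the odd residues modulo 2**N into cosets of `subgroup`.

    Each block is [g*h mod 2**N for h in subgroup], where g is the
    smallest odd residue not already placed in an earlier block.
    """
    modulus = 2**N
    blocks, seen = [], set()
    for g in range(1, modulus, 2):
        if g in seen:
            continue
        block = [(g * h) % modulus for h in subgroup]
        blocks.append(block)
        seen.update(block)
    return blocks


if __name__ == "__main__":
    K3 = [1, 3, 9, 27, 17, 19, 25, 11]   # powers of 3 modulo 32
    print(coset_partition(K3, 5))
    print(coset_partition([1, 15], 5))
    print(coset_partition([1, 31], 5))
```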
6.4 General comparison with the direct implementation of modulo $2^N$
multipliers.
In this Chapter we have been mainly occupied in the theoretical
derivation of an algebraic structure for the general modulo $2^N$ multiplier
which is found to be an interesting alternative to the loop-free
configuration described in Chapter 5. As such we have not made a
detailed comparison of our proposed method of implementing a modulo $2^N$
multiplier with that of the direct approach in which the first N bits
of the partial products are summed using rows of full-adders. Some
general observations, however, may be made.
In both approaches, the number of full-adders (F.A's) required
can give some indication of the overall hardware complexity. With
the direct method we can easily work out that the number of F.A's
needed is $\sum_{n=1}^{N-1} n$. With the proposed approach, we would need
{(N-2) + 1} F.A's for the modulo 2 and modulo $2^{N-2}$ adder-pair, along
with (N-1) F.A's for each of the two (N-1)-bit adders used in the
correction circuit, giving a total of 3(N-1) F.A's.
The effect on the full-adder requirement with increasing word length
N is shown in the graph in Fig. 6.9. We see that with the method
proposed, the full-adder count increases linearly with N, while that
of the direct approach is proportional to $N^2$. For N > 6, the proposed
implementation technique requires considerably fewer full-adders.
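The two counts quoted above are easily tabulated. The sketch below uses the closed form $N(N-1)/2$ for the sum $\sum_{n=1}^{N-1} n$; the function names are illustrative.

```python
def full_adders_direct(N):
    """Full-adders for the direct method: 1 + 2 + ... + (N-1) = N(N-1)/2."""
    return N * (N - 1) // 2


def full_adders_proposed(N):
    """Full-adders for the proposed method: (N-1) for the adder-pair
    plus 2(N-1) for the two correction adders, i.e. 3(N-1)."""
    return 3 * (N - 1)


if __name__ == "__main__":
    print(" N  direct  proposed")
    for N in range(2, 17, 2):
        print(f"{N:2d}  {full_adders_direct(N):6d}  {full_adders_proposed(N):8d}")
```

The two expressions coincide at N = 6, which is where the crossover stated in the text occurs.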
Furthermore, with the direct approach the propagation delay 
through the circuit, apart from the ripple delays through each row 
of F.A's, is dependent on N. With our method, however, the system 
delay is basically constant, and is the sum of the delays through
the first correction adder, a circuit for encoding into partition 
blocks, the adder-pair, a circuit for decoding from the partition 
blocks, and the final correction adder. 
6.5 Conclusions. 
A novel method of implementing a general modulo $2^N$ multiplier
has been presented, and consists of constraining the operands to
odd values for a modified or reduced multiplier. The output of this
reduced multiplier is then corrected to obtain the actual modulo $2^N$
product.
The algebraic structure of the reduced multiplier has been
analysed in detail. As a result, it was shown that a reduced modulo $2^N$
multiplier is isomorphic to two adders, modulo 2 and $2^{N-2}$
respectively, operating in parallel.
[Fig. 6.9. Complexity in full-adder requirement for (a) direct
implementation and (b) proposed implementation of modulo $2^N$
multiplier. Vertical axis: no. of full-adders; horizontal axis:
wordlength N (bits).]
Finally, it was observed that when compared with the direct
way of implementing modulo $2^N$ multipliers, the proposed approach
leads to a circuit which requires considerably fewer full-adder
units and possesses a basically constant system propagation delay.
APPENDIX 6.0
Let $g_{\ell',m'}$ and $g_{\ell'',m''}$ be any two elements from $Z_D$. Then,
their product P may be expressed by
$P = g_{\ell',m'} \times g_{\ell'',m''} = (x^{\ell'}\, w^{m'}) \times (x^{\ell''}\, w^{m''})$,
i.e.
$P = x^{\ell'+\ell''}\, w^{m'+m''}$,
where we write $\ell' + \ell'' = 2 q_\ell + r_\ell$ and $m' + m'' = 2^{N-2} q_m + r_m$, with
$r_\ell \equiv \ell' + \ell''$ modulo 2, and hence $r_\ell = 0$ or 1,
$r_m \equiv m' + m''$ modulo $2^{N-2}$, $0 \le r_m < 2^{N-2}$,
and $q_\ell = 0$ or 1 and $q_m = 0$ or 1 since the maximum value of $(\ell' + \ell'')$ =
1 + 1 = 2, and that of $(m' + m'') = (2^{N-2} - 1) + (2^{N-2} - 1) = 1 \times 2^{N-2} +
(2^{N-2} - 2)$ respectively.
Consider now the case when $\ell', \ell''$ and m', m'' are such that
$q_\ell = q_m = 1$. Then we have
$P = K\, x^{r_\ell}\, w^{r_m}$,
where
$K = (x^2\, w^{2^{N-2}})$.
From Lemmas 6.4 and 6.5, we know that
$w^{2^{N-2}} \equiv 1$, and $x^2 \equiv 1$ (modulo $2^N$),
i.e. $w^{2^{N-2}} = Q_w 2^N + 1$ and $x^2 = Q_x 2^N + 1$, with $Q_w$, $Q_x$ integers, so that
$K = F\, 2^N + 1$,
where
$F = (Q_x Q_w 2^N + Q_x + Q_w)$,
i.e.
$P = (F\, 2^N + 1)\, x^{r_\ell}\, w^{r_m}$
$= (F\, x^{r_\ell}\, w^{r_m})\, 2^N + x^{r_\ell}\, w^{r_m}$
$\equiv x^{r_\ell}\, w^{r_m}$ modulo $2^N$.
Since the term $x^{r_\ell}\, w^{r_m}$ may be written as
$x^{r_\ell}\, w^{r_m} = Q\, 2^N + g_{r_\ell, r_m}$, Q an integer,
we have
$P \equiv g_{r_\ell, r_m}$ modulo $2^N$.
Using these results, we see that the two elements $g_{\ell',m'}$ and
$g_{\ell'',m''}$ are mapped, under modulo $2^N$ multiplication, to the element
$g_{i,j} \in Z_D$ such that
$i \equiv (\ell' + \ell'')$ modulo 2 ... (A.6.0)
and
$j \equiv (m' + m'')$ modulo $2^{N-2}$ ... (A.6.1)
The cases for the remaining possible values of $q_\ell$ and $q_m$ may
be treated in a similar way to derive results that are identical to
equations (A.6.0) and (A.6.1).
CHAPTER 7
DECOMPOSITION STRUCTURES OF MODULO M ADDERS
AND MULTIPLIERS, AND OF A SIMPLIFIED MODEL
OF A SECOND-ORDER DIGITAL FILTER
7.0 Introduction.
In this chapter we extend and generalise the main ideas developed
in Chapters 4 and 5.
After a brief analysis of the partition structures of both modulo-M
adders and multipliers, we will show how the non-recursive second-order
digital filter can be simplified such that the resulting model is easier
to analyse.
It is then shown that this simplified filter may be decomposed into
a parallel and/or a 'nested' cascade interconnection of submachines.
A partition lattice of these submachines is developed and is shown to
be related in a simple way to the familiar lattice of integers under
the 'factor' relation.
7.1 Partition structure of modulo M adder. 
The generation of the set of S.P. partitions for a mod-M adder
modelled as an F.S.M. is described. The lattice structure of this F.S.M. 
is then developed and is shown to be related in a simple way to the 
lattice of the divisors of M under the 'factor' relation. 
7.1.0 Generation of the basic S.P. partitions. 
The general mod-M addition table is shown in Table 7.0. Its
modelling into an F.S.M. and the algebraic analysis of the resulting
model are similar to those discussed in Sections 5.1 and 5.2.
Furthermore, in order to generate all the basic S.P. partitions
of the F.S.M. mod-M adder, the basic arguments presented in Section
5.2.1.0 for the mod-$2^N$ adder are still applicable.
Thus we may say that it is sufficient to only 'identify' the
state 0 and every other state C, where
$C \in \{x : 0 \le x \le M-1,\ x \text{ integer}\}$.
We will also show later that even with this simplification, only
certain values of C need to be considered.
When 0 and C are identified, we automatically identify every other
element x with (x + C) mod-M. The resulting pairs will in turn lead
to similar implications.
Consider first the pairing of 0 with C. One particular chain
of implied pairs is
$0, C \to C, (C + C) \to 2C, (2C + C) \to \dots \to (k-1)C, kC$.
Using the transitive property of partitions, all the above pairs
have to be put in the same block.
We thus see that for the pair 0, C we have the linked or chain
connection of all the multiples of C, i.e.
$0 \to C \to 2C \to \dots \to kC$.
When $kC \equiv 0$ mod M, then the identification of all the elements in
the block containing 0 and C will be complete.
Thus, for a given C, we may apply the same argument to the implied
pair x and (x + C) mod-M, to obtain
$x \to x + C \to x + 2C \to \dots \to x + kC$
Table 7.0. General mod-M addition table.
 ⊕12 |  0   1   2   3   4   5   6   7   8   9  10  11
-----+------------------------------------------------
   0 |  0   1   2   3   4   5   6   7   8   9  10  11
   1 |  1   2   3   4   5   6   7   8   9  10  11   0
   2 |  2   3   4   5   6   7   8   9  10  11   0   1
   3 |  3   4   5   6   7   8   9  10  11   0   1   2
   4 |  4   5   6   7   8   9  10  11   0   1   2   3
   5 |  5   6   7   8   9  10  11   0   1   2   3   4
   6 |  6   7   8   9  10  11   0   1   2   3   4   5
   7 |  7   8   9  10  11   0   1   2   3   4   5   6
   8 |  8   9  10  11   0   1   2   3   4   5   6   7
   9 |  9  10  11   0   1   2   3   4   5   6   7   8
  10 | 10  11   0   1   2   3   4   5   6   7   8   9
  11 | 11   0   1   2   3   4   5   6   7   8   9  10
Table 7.1. Mod 12 addition table.
and hence for completion of the identification of the corresponding
block, we have
$x + kC \equiv x$ mod M ... (7.0)
or
$kC \equiv 0$ mod M, i.e. $kC = pM$ ... (7.1)
where p is an integer.
As equation (7.1) tells us that k does not depend on x, we see
that every block generated this way will contain k elements.
In general, for a given C, a basic S.P. partition is generated
by first forming the block corresponding to the pair 0 and C, and then
repeating the process for every pair x and (x + C) not contained in the
preceding blocks. The resulting set of such blocks is then, by
construction, a basic S.P. partition on the set of M states of the
F.S.M. adder. This partition, which we call $\pi_C$, consists of
$\#(\pi_C) = M/k$ blocks, with each block containing $m(\pi_C) = k$ elements,
k being obtained from equation (7.1).
E.g. Let M = 12 and C = 3 with the mod-12 addition table shown
in Table 7.1. The initial pair 0 and 3 leads to the sequence
$0 \to 3 \to 6 \to 9 \to 0$ (repeats) etc.,
giving the first block (0,3,6,9).
The initial pair (0,3) also implies the pairs (1,4) and (2,5).
Consequently, we have the chain sequences
$1 \to 4 \to 7 \to 10 \to 1$ etc. and $2 \to 5 \to 8 \to 11 \to 2$ etc.,
thus resulting in the blocks
(1,4,7,10) and (2,5,8,11).
Therefore
$\pi_{C=3}$ = {0,3,6,9; 1,4,7,10; 2,5,8,11}.
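The chain construction described in this section translates directly into a short routine. The following is a minimal sketch, with `basic_sp_partition` as an illustrative name; blocks are listed in the order in which the chains visit the states.

```python
def basic_sp_partition(M, C):
    """Blocks of the basic S.P. partition of a mod-M adder generated by
    identifying state 0 with state C.

    Starting from each state not yet placed, follow the chain
    x -> x + C -> x + 2C -> ... (mod M) until it returns to its start.
    """
    blocks, seen = [], set()
    for start in range(M):
        if start in seen:
            continue
        block, x = [], start
        while x not in seen:
            block.append(x)
            seen.add(x)
            x = (x + C) % M
        blocks.append(block)
    return blocks


if __name__ == "__main__":
    print(basic_sp_partition(12, 3))
    print(basic_sp_partition(12, 8))
    print(basic_sp_partition(12, 5))
```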
7.1.1 General form of $\pi_C$
For a given modulus M, the number of blocks $\#(\pi_C)$, and the number
of elements in each block $m(\pi_C)$, of the basic S.P. partition $\pi_C$ depend
on the actual values of C and M.
In general, let the greatest common divisor of C and M be d.
Hence we have
$C = c'd$ and $M = m'd$ ... (7.2)
where c' and m' are now coprime.
Substituting these values in equation (7.1) we obtain
$kc'd = pm'd$ ... (7.3)
i.e. $kc' = pm'$ ... (7.4)
The number of steps k is given by
$k = \dfrac{pm'}{c'}$ ... (7.5)
Since the number of steps must by definition be an integer,
then the right-hand side of equation (7.5) must also be an integer.
Therefore pm' must be divisible by c', and since c' is coprime to
m', then p must be a multiple of c', say p = qc', so that
$k = \dfrac{qc'm'}{c'} = qm'$ ... (7.6)
k is least when q = 1. Consequently, in the generation of $\pi_C$,
the repetition that was mentioned in the previous section occurs at
the kth step where
$k = 1 \cdot m' = (M/d)$.
We conclude that
$m(\pi_C) = k = (M/d)$
and
$\#(\pi_C) = (M)/(M/d) = d$.
Consider as examples, the cases below.
(i) C = 1, i.e. d = 1.
(ii) C = M, i.e. d = M.
With these two cases, it is easily shown that the $\pi_C$'s are the
trivial S.P. partitions $\pi(I)$ and $\pi(0)$ respectively.
(ii)' C and M are coprime.
Here the greatest common divisor of C and M is obviously 1.
Therefore, as in case (i), $\pi_C = \pi(I)$.
e.g. M = 12, C = 5. Then we have the following chain
$0 \to 5 \to 10 \to 15\,(= 3 \text{ mod } 12) \to 8 \to 1 \to 6 \to 11 \to 4 \to 9 \to 2 \to 7 \to 0$.
Similarly for C = 7 and 11.
(iii) C divides M.
In this case d = C, and hence
$k = (M/d) = (M/C) = m(\pi_C)$
and
$\#(\pi_C) = d = C$.
e.g. M = 12, C = 4. Thus d = C = 4,
$k = m(\pi_4) = (12/4) = 3$ and
$\#(\pi_4) = d = C = 4$.
Consequently, we get
$\pi_4$ = {0,4,8; 1,5,9; 2,6,10; 3,7,11}.
In the general case, let M = 12 and C = 8 say. By the direct
method we have the chain sequences
$0 \to 8 \to 4$, $1 \to 9 \to 5$, $2 \to 10 \to 6$, $3 \to 11 \to 7$,
giving
$\pi_8$ = {0,8,4; 1,9,5; 2,10,6; 3,11,7} = $\pi_4$.
The main result in this section is that, in the generation of
the basic partitions $\pi_C$, we need to consider, apart from the trivial
cases of C = 1, C = M and C coprime to M, only those values of C
that have different values of d.
This greatly simplifies the generation of the lattice of S.P. 
partitions of a mod-M adder. 
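The dependence of $\pi_C$ on d = g.c.d.(C, M) alone can be checked exhaustively for a given modulus. In the sketch below a partition is represented as a set of blocks so that the order of blocks and of elements is immaterial; the name `pi` and this representation are choices made here for illustration.

```python
from math import gcd


def pi(M, C):
    """The basic S.P. partition pi_C of a mod-M adder, as a frozenset of
    frozenset blocks. The block of x is {x + k*C mod M : k = 0..M-1}."""
    return frozenset(
        frozenset((x + k * C) % M for k in range(M)) for x in range(M)
    )


if __name__ == "__main__":
    M = 12
    for C in range(1, M + 1):
        d = gcd(C, M)
        blocks = pi(M, C)
        # number of blocks, block size, and whether pi_C equals pi_d
        print(C, d, len(blocks), M // d, blocks == pi(M, d))
```

For M = 12 only the divisors 1, 2, 3, 4, 6 and 12 therefore need to be considered.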
7.1.2 The partition lattice of the general mod M adder. 
A simple method is presented with which the complete partition
lattice of the general adder modulo M may be derived from simply knowing
the divisors of M.
We begin by analysing the nature of the partition 'sums' and
'products' of pairs of $\pi_C$'s derived as discussed in the previous section.
Lemma 7.0. If d, D are divisors of M, and d divides D, then
$\pi_d \ge \pi_D$.
Proof. Let d < D, and a, b be any two elements in a block of
$\pi_D$. Then from the results in Section 7.1.1, we have
$b = a + QD$, Q an integer.
Since d divides D, i.e. D = qd say, we obtain
$b = a + Q(qd) = a + Q'd$, $Q' = Qq$.
This means a and b are also in the same block of $\pi_d$. Since this
applies to any pair in any block of $\pi_D$, then
$\pi_d > \pi_D$.
When d = D, we have the trivial case $\pi_d = \pi_D$.
Corollary. If $d_0, d_1, \dots, d_i, d_{i+1}, \dots, d_n$ are divisors of M
such that $d_i$ divides $d_{i+1}$, then $\pi_{d_i} \ge \pi_{d_{i+1}}$ and hence
$\pi_{d_0} \ge \pi_{d_1} \ge \dots \ge \pi_{d_n}$.
Proof. The result is obtained by applying the Lemma to successive
pairs of divisors.
Since, using equation (7.5), a block of $\pi_{d_i}$ contains $M/d_i$ elements,
and that of $\pi_{d_{i+1}}$ contains $M/d_{i+1}$ elements and is furthermore contained
in a block of $\pi_{d_i}$, then a block of $\pi_{d_i}$ will contain
$(M/d_i)/(M/d_{i+1}) = d_{i+1}/d_i$ blocks of $\pi_{d_{i+1}}$.
Lemma 7.1. If $d_1$, $d_2$ are divisors of M and they do not divide
each other, then
(i) $\pi_{d_1} + \pi_{d_2} = \pi_d$ and (ii) $\pi_{d_1} \cdot \pi_{d_2} = \pi_D$,
where d and D, also divisors of M, are the greatest common
divisor (g.c.d.) and the least common multiple (l.c.m.)
respectively of $d_1$ and $d_2$.
Proof. Let d' divide both $d_1$ and $d_2$; this implies, from
Lemma 7.0, that
$\pi_{d'} \ge \pi_{d_1}$
and
$\pi_{d'} \ge \pi_{d_2}$.
From case (iii) in Section 7.1.1, we know that $\pi_{d'}$ has d' blocks, each
containing (M/d') elements. As d' increases, so will the number of
blocks, while the number of elements of each gets fewer. In other
words $\pi_{d'}$ 'decreases'. Finally, when d' attains its greatest value,
i.e. d, the corresponding $\pi_d$ will be the 'smallest'.
Thus $\pi_d$ is the least upper bound (l.u.b.) of $\pi_{d_1}$ and $\pi_{d_2}$, and
using the result in page 7 of Ref. 12, we can write
$\pi_d$ = l.u.b. $(\pi_{d_1}, \pi_{d_2}) = \pi_{d_1} + \pi_{d_2}$.
To prove (ii) of the Lemma, we let D' be a common multiple of
$d_1$ and $d_2$. Again, from Lemma 7.0, we can say that
$\pi_{D'} \le \pi_{d_1}$ and $\pi_{D'} \le \pi_{d_2}$.
In the way similar to that for the proof of (i), it can be seen
that when D' is minimum, i.e. D, then $\pi_D$ will be the 'largest' to
satisfy the simultaneous inequality. Thus
$\pi_D$ = g.l.b. $(\pi_{d_1}, \pi_{d_2}) = \pi_{d_1} \cdot \pi_{d_2}$.
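As an example, let M = 12, $d_1 = 3$ and $d_2 = 2$, from which we have
d = g.c.d. (3,2) = 1, and D = l.c.m. (3,2) = 6.
Using the results in Section 7.1.0, we obtain
$\pi_3$ = {0,3,6,9; 1,4,7,10; 2,5,8,11}
and
$\pi_2$ = {0,2,4,6,8,10; 1,3,5,7,9,11}.
Forming their sum and product we have
$\pi_3 + \pi_2$ = {0,1,2,3,4,5,6,7,8,9,10,11} = $\pi_1$
and
$\pi_3 \cdot \pi_2$ = {0,6; 1,7; 2,8; 3,9; 4,10; 5,11} = $\pi_6$.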
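The partition sum and product used in Lemma 7.1 can be computed directly from their definitions (product: non-empty pairwise intersections of blocks; sum: merging of all blocks that are chained together by overlaps). The sketch below is one way of doing so; the helper names are illustrative, and `pi` builds $\pi_d$ as a set of blocks.

```python
def pi(M, d):
    """pi_d for a mod-M adder as a set of frozenset blocks."""
    return {frozenset((x + k * d) % M for k in range(M)) for x in range(M)}


def partition_product(p, q):
    """Greatest lower bound: all non-empty intersections of a block of p
    with a block of q."""
    return {a & b for a in p for b in q if a & b}


def partition_sum(p, q):
    """Least upper bound: start from the blocks of p and, for each block
    of q, merge every current block that overlaps it."""
    blocks = [set(b) for b in p]
    for b in q:
        overlapping = [blk for blk in blocks if blk & b]
        untouched = [blk for blk in blocks if not (blk & b)]
        blocks = untouched + [set().union(b, *overlapping)]
    return {frozenset(b) for b in blocks}


if __name__ == "__main__":
    p3, p2 = pi(12, 3), pi(12, 2)
    print(sorted(sorted(b) for b in partition_sum(p3, p2)))
    print(sorted(sorted(b) for b in partition_product(p3, p2)))
```

For $d_1 = 3$ and $d_2 = 2$ with M = 12 the sum has a single block and the product has six blocks of two elements, in agreement with d = 1 and D = 6.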
We now need the following definition.
Definition 7.0. An M-integer lattice is the set $S_D$ of all the
divisors of M, M an integer, which is partially ordered by the
relation 'is a factor of', and the operations between pairs of
$d_x, d_y \in S_D$ of finding their greatest common divisor and least
common multiple, denoted respectively by $\wedge$ and $\vee$ say, i.e.
g.c.d. $(d_x, d_y) \to d_x \wedge d_y$,
and
l.c.m. $(d_x, d_y) \to d_x \vee d_y$.
This 'factor' relation can be conveniently represented by a
Hasse diagram, as shown in Figs. 7.0(a) - (c) for M = 8, 18 and 60.
Using Lemmas 7.0 and 7.1 we may now state the following theorem. 
Theorem 7.0. The set $S_\pi$ of S.P. partitions of a mod M adder, partially ordered by the partition inequality $\le$, is isomorphic to the M-integer lattice, the isomorphism being described by the one-to-one mappings $h_i$, $h_j$ and $h_k$ given by
$h_i : d \to \pi_d$,
$h_j : \oplus \to +$,
$h_k : \odot \to \cdot$
such that
$h_i(d_x \oplus d_y) = h_i(d_x) + h_i(d_y)$
and
$h_i(d_x \odot d_y) = h_i(d_x) \cdot h_i(d_y)$.
Theorem 7.0 is merely a formal statement of the principal results discussed in Lemmas 7.0 and 7.1, and presents us with a very simple method of constructing the lattice of S.P. partitions of a mod M adder from just knowing the divisors of M.
As an example, Fig. 7.1(a) shows the partition lattice of a mod 12 adder, and the isomorphic lattice of the divisors of 12 is shown in Fig. 7.1(b).
In practice, the partition lattice of the adder may be obtained directly by regarding the divisors d as subscripts for the corresponding partitions $\pi_d$, and geometrically reorientating the M-integer lattice as shown in Fig. 7.2 for M = 6.
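The correspondence between the divisor lattice and the adder's partition lattice can be checked numerically. The following is a minimal sketch, assuming (as in case (iii) of Section 7.1.1) that the partition with subscript d has d blocks, each holding the residues congruent modulo d; the function names are illustrative and are not taken from the thesis.

```python
from math import gcd


def divisors(M):
    """All positive divisors of M."""
    return [d for d in range(1, M + 1) if M % d == 0]


def adder_partition(M, d):
    """pi_d for a mod-M adder: d blocks, block r = {r, r+d, r+2d, ...}."""
    return frozenset(frozenset(range(r, M, d)) for r in range(d))


def partition_product(p1, p2):
    """Greatest lower bound: all non-empty pairwise block intersections."""
    return frozenset(b1 & b2 for b1 in p1 for b2 in p2 if b1 & b2)


def partition_sum(p1, p2):
    """Least upper bound: merge blocks of p1 that are chained by blocks of p2."""
    blocks = [set(b) for b in p1]
    for b in p2:
        touching = [blk for blk in blocks if blk & b]
        merged = set().union(*touching)
        blocks = [blk for blk in blocks if not (blk & b)] + [merged]
    return frozenset(frozenset(blk) for blk in blocks)


def lattice_matches_divisors(M):
    """Check sum <-> g.c.d. and product <-> l.c.m. for every pair of divisors."""
    for d1 in divisors(M):
        for d2 in divisors(M):
            g = gcd(d1, d2)
            l = d1 * d2 // g
            p1, p2 = adder_partition(M, d1), adder_partition(M, d2)
            if partition_sum(p1, p2) != adder_partition(M, g):
                return False
            if partition_product(p1, p2) != adder_partition(M, l):
                return False
    return True
```

Calling `lattice_matches_divisors(12)` exercises every pair of divisors of 12, including the pair (3, 2) used in the example of Lemma 7.1.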
7.2 S.P. partitions for a mod M multiplier. 
In contrast to that of the mod M adder, the partition structure 
of a mod M multiplier is difficult to describe completely due to an 
apparent lack of a convenient regularity. As such, we have only been 
able to give a complete description of the lattice made up of a 
subset of the possible S.P. partitions. The knowledge of this sub-
lattice, however, is sufficient for our subsequent search for useful 
decomposition structures of stored-logic digital filters. 
Fig. 7.0. Some M-integer lattices. (a) M = 8 (the chain 8, 4, 2, 1); (b) M = 18.
Fig. 7.1. Lattices of (a) S.P. partitions of a mod 12 adder, and (b) the divisors of 12. The partitions shown in (a) include
$\pi_2 = \{0,2,4,6,8,10;\ 1,3,5,7,9,11\}$
$\pi_3 = \{0,3,6,9;\ 1,4,7,10;\ 2,5,8,11\}$
$\pi_4 = \{0,4,8;\ 1,5,9;\ 2,6,10;\ 3,7,11\}$
$\pi_6 = \{0,6;\ 1,7;\ 2,8;\ 3,9;\ 4,10;\ 5,11\}$
Fig. 7.2. Diagrammatic derivation of the partition lattice of a mod 6 adder from the corresponding integer lattice, with $\pi_2 = \{0,2,4;\ 1,3,5\}$ and $\pi_3 = \{0,3;\ 1,4;\ 2,5\}$.
Fig. 7.3. Complete S.P. partition lattices of typical mod-M multipliers (M = 4 and M = 6) showing the relevant sub-lattices (in broken lines).
7.2.1 Sub-lattice of multiplier's S.P. partitions.
The following theorem is basically a generalisation of Theorem 5.2 to the general mod-M multiplier.
Theorem 7.1. The lattice of S.P. partitions of a modulo M adder* is a sub-lattice of the S.P. partitions of a mod M multiplier.
Proof. Consider the S.P. partition $\pi_d$ of a mod M adder, d a divisor of M, and let x and y be any two elements of a block of $\pi_d$. Multiplying each by an element $a \in Z_M$ we obtain
$ax \equiv b \mod M$ ... (7.7)
and
$ay \equiv c \mod M$ ... (7.8)
Subtracting equation (7.8) from (7.7), we get
$b - c \equiv a(x - y) \mod M$
or
$b - c = a(x - y) + qM$, q an integer. ... (7.9)
Since x and y come from a block of $\pi_d$, then if say x > y, x = y + fd, f an integer. Also, because d divides M, we can write M as pd, p an integer.
Equation (7.9) can now be written as
$b - c = a(fd) + q(pd) = (af + qp)d = Q'd$
where Q' = (af + qp). Hence
$b = c + Q'd$,
which means that the products b and c are still in a block of $\pi_d$.
Hence $\pi_d$ is preserved under the modulo M multiplication operation.
Furthermore, since
$\pi_{d_1} + \pi_{d_2} = \pi_{d_3}$
and
$\pi_{d_1} \cdot \pi_{d_2} = \pi_{d_4}$,
where $d_1$, $d_2$, $d_3$, $d_4$ are all divisors of M, the lattice is also preserved.
Hence the result.
* Recall Section 3.4.
Some examples of these sub-lattices are shown in Fig. 7.3. 
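The preservation argument in the proof of Theorem 7.1 can be tested by brute force. The sketch below assumes that two elements lie in the same block of the adder partition exactly when they are congruent modulo d; the function names are hypothetical. It also shows that a modulus which does not divide M (5 for M = 12) gives a partition that multiplication fails to preserve.

```python
def divisors(M):
    """All positive divisors of M."""
    return [d for d in range(1, M + 1) if M % d == 0]


def preserved_under_multiplication(M, d):
    """True if x = y (mod d) always gives a*x = a*y (mod d) after reduction mod M."""
    for a in range(M):
        for x in range(M):
            for y in range(x % d, M, d):  # every y in the same block as x
                if ((a * x) % M) % d != ((a * y) % M) % d:
                    return False
    return True
```

For example, `all(preserved_under_multiplication(12, d) for d in divisors(12))` holds, whereas `preserved_under_multiplication(12, 5)` does not.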
The ideas and experience gained in the preceding sections were found to offer a helpful insight into the analysis of the decomposition structures of digital filters.
7.3 Decomposition structures of digital filters.
The general second-order non-recursive digital filter, suitably transformed and modelled, is shown to be systematically decomposable. Also, the lattice of the component sub-machines is developed. This lattice provides a simple representation of the operation of subsets of these sub-machines.
7.3.0 Notation. 
The symbols used in the subsequent discussion are briefly 
explained below. 
If I and J are positive integers, with I > J say, then we can write I as
$I = kJ + p$ ... (7.10)
where k, p are integers such that
$0 \le k \le I/J$ and $0 \le p < J$,
i.e. k and p are the quotient and remainder respectively, obtained when I is divided by J.
We denote k by $Q_J(I)$ and p by $R_J(I)$. Sometimes, for $R_J(I)$ we may also use I mod J or $(I)_J$ instead.
Equation (7.10) may be written as
$I = J\,Q_J(I) + R_J(I)$ ... (7.11)
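As a small illustration of the quotient and remainder notation, the two operators map directly onto integer division. The helper names `Q` and `R` below simply mirror the symbols of this section and take the modulus J as the first argument.

```python
def Q(J, I):
    """Quotient Q_J(I) obtained when I is divided by J."""
    return I // J


def R(J, I):
    """Remainder R_J(I), also written I mod J or (I)_J."""
    return I % J
```

With I = 17 and J = 5 this gives Q = 3 and R = 2, and 17 = 5 x 3 + 2 as in equation (7.11).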
Also, if a, b, c, d are positive integers such that
$a + b \equiv c$, and $a \times b \equiv d \mod J$,
then we denote c and d as
Finally, if G is the n-component vector $\{g_1, g_2, \dots, g_n\}$, then
and 
7.3.1 Simplified models of non-recursive filters. 
Our subsequent analyses will be greatly assisted if we first 
derive a simplified version of the original non-recursive second-
order section as follows. 
If the actual filter has the coefficients $a_i$ and data $x_{n-i}$, with its output $Z_n$ given by
$Z_n = \sum_{i=0}^{2} a_i x_{n-i}$
then its simplified version, which we call a modulo-d filter or $(DF)_d$, is one with coefficients $(a_i)_d$ and data $(x_{n-i})_d$ given by
$(a_i)_d = R_d(a_i)$ and $(x_{n-i})_d = R_d(x_{n-i})$
respectively, and whose output $(Z_n)_d$ is given by
$(Z_n)_d = R_d\left[\sum_{i=0}^{2} (a_i)_d (x_{n-i})_d\right]$ ... (7.12)
i.e. the filter is now operating in modulo-d arithmetic.
Using the general ideas developed in Chapter 4, we may now model $(DF)_d$ as a finite-state sequential machine (F.S.M.). Thus we may describe $(DF)_d$ by the quintuple
$(DF)_d = (S_d, I_d, O_d, \delta_d, \lambda_d)$ ... (7.13)
where, if $s_d \in S_d$, $i_d \in I_d$ and $o_d \in O_d$, then
$s_d = [(x_{n-1})_d, (x_{n-2})_d]$,
$i_d = (x_n)_d$,
$o_d = (Z_n)_d$,
such that
$\delta_d\{s_d; i_d\} = \delta_d\{[(x_{n-1})_d, (x_{n-2})_d]; (x_n)_d\} = [(x_n)_d, (x_{n-1})_d]$ ... (7.14)
and
$\lambda_d\{s_d; i_d\} = (Z_n)_d$ ... (7.15)
(Obviously if d = W, where W is the maximum value of the output of the original filter, then $(DF)_W$ is identical to this filter).
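The F.S.M. view of the modulo-d filter can be sketched in a few lines. This is only an illustration under stated assumptions: the state is taken to be the pair of the two most recent past samples, the next state shifts the new input in (as Table 7.2 indicates), and the output is the modulo-d sum of products of equation (7.12). The function `make_df` and the coefficient values used with it are not part of the thesis text at this point.

```python
def make_df(d, coeffs):
    """Return (delta, lam) for a modulo-d second-order non-recursive filter."""
    a = [c % d for c in coeffs]  # coefficients reduced modulo d

    def delta(state, x):
        """Next state: shift the reduced input in, drop the oldest sample."""
        x1, _x2 = state
        return (x % d, x1)

    def lam(state, x):
        """Output: sum of products, reduced modulo d."""
        x1, x2 = state
        return (a[0] * (x % d) + a[1] * x1 + a[2] * x2) % d

    return delta, lam


# Illustrative use with d = 27 and assumed coefficients (21, 17, 16).
delta, lam = make_df(27, (21, 17, 16))
next_state = delta((11, 24), 15)
output = lam((11, 24), 15)
```

Only `lam` needs to be stored as a look-up table in a stored-logic realisation, since `delta` is a pure shift.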
In a practical filter system, the coefficients and data are each, in the simple case, constrained to a maximum positive integer value of (M-1) say. In this case, only its output need be reduced modulo-M in order to derive the corresponding modulo-M filter, i.e.
$(Z_n)_M = R_M\left[\sum_{i=0}^{2} a_i x_{n-i}\right]$.
Consequently, it is not unreasonable to consider $(DF)_M$ as a 'good' simplified model of our original second-order section.
The block diagram of $(DF)_M$ is shown in Fig. 7.4. Its state transition table is shown in Table 7.2, while, for given values of $a_0$, $a_1$ and $a_2$, the corresponding output table may be easily constructed in a way similar to that described in Chapter 4.
If a stored-logic approach is adopted, only the output matrix need be realised as a look-up table. Since each $x_{n-i}$ can have M possible values, a store of $(M)^3$ words will be required. Furthermore, the output $Z_n$ also has M possible values. Consequently, the overall stored-logic capacity is $(M)^3 \times q$ word-bits*, where q is the integer $\ge \log_2 M$.
* The unit 'word-bit' is more general than the commonly used 'bit' to denote the storage capacity of a memory unit. This is because $(M)^3$ need not be a power of 2 and the generalised unit anticipates the time when programmable logic arrays (P.L.A's) will be used as commonly as R.O.M's are today.
Fig. 7.4. F.S.M. model of a general modulo-M second-order non-recursive digital filter, with input $(x_n)_M = i_M$, transition function $\delta_M$, output function $\lambda_M$ and output $(Z_n)_M = o_M$.
Table 7.2. Flow table for the F.S.M. equivalent of a modulo-M digital filter.

Present state $s_M$     Next state for input $(x_n)_M$ =
                        0         1         2         ...   k         ...   M-1
0, 0                    0, 0      1, 0      2, 0      ...   k, 0      ...   M-1, 0
0, 1                    0, 0      1, 0      2, 0      ...   k, 0      ...   M-1, 0
...
0, M-1                  0, 0      1, 0      2, 0      ...   k, 0      ...   M-1, 0
1, 0                    0, 1      1, 1      2, 1      ...   k, 1      ...   M-1, 1
1, 1                    0, 1      1, 1      2, 1      ...   k, 1      ...   M-1, 1
...
1, M-1                  0, 1      1, 1      2, 1      ...   k, 1      ...   M-1, 1
...
k, 0                    0, k      1, k      2, k      ...   k, k      ...   M-1, k
k, 1                    0, k      1, k      2, k      ...   k, k      ...   M-1, k
...
k, M-1                  0, k      1, k      2, k      ...   k, k      ...   M-1, k
...
M-1, 0                  0, M-1    1, M-1    2, M-1    ...   k, M-1    ...   M-1, M-1
M-1, 1                  0, M-1    1, M-1    2, M-1    ...   k, M-1    ...   M-1, M-1
...
M-1, M-1                0, M-1    1, M-1    2, M-1    ...   k, M-1    ...   M-1, M-1
7.3.2 Homomorphic images of $(DF)_M$.
As will be shown, the concept of a homomorphic image* of an F.S.M. is a powerful aid in the structural decomposition of the general modulo M digital filter.
Let b and c be factors of M and the corresponding F.S.M. filters operating in the arithmetic modulo b and modulo c be denoted by $(DF)_b$ and $(DF)_c$ respectively.
Using equation (7.13), we obtain the quintuples
$(DF)_b = (S_b, I_b, O_b, \delta_b, \lambda_b)$
and
$(DF)_c = (S_c, I_c, O_c, \delta_c, \lambda_c)$.
Theorem 7.2. Iff b divides c, then $(DF)_b$ is a homomorphic image of $(DF)_c$, with the homomorphism defined by
$h_1 : I_c \to R_b(I_c) = I_b$
$h_2 : S_c \to R_b(S_c) = S_b$
$h_3 : O_c \to R_b(O_c) = O_b$
such that
$R_b[\delta_c\{s_c; i_c\}] = \delta_b\{R_b(s_c); R_b(i_c)\} = \delta_b\{s_b; i_b\}$ ... (7.16)
and
$R_b[\lambda_c\{s_c; i_c\}] = \lambda_b\{R_b(s_c); R_b(i_c)\} = \lambda_b\{s_b; i_b\}$ ... (7.17)
* See Definition 3.1 in Chapter 3, and also Ref. 12 for the significance of homomorphic images in general.
where 
Proof. Using the results in the previous section, we can write
$s_c = [(x_{n-1})_c, (x_{n-2})_c]$ and $i_c = (x_n)_c$.
Similarly for $s_b$ and $i_b$.
Expanding the left-hand side of equation (7.16) we have
$R_b[\delta_c\{s_c; i_c\}] = R_b[(x_n)_c, (x_{n-1})_c] = [(x_n)_b, (x_{n-1})_b]$,
which is the right-hand side of equation (7.16).
To simplify the proof of (7.17), we let, with no loss in generality, $a_2 = 0$. Let
$y'' = \sum_i (a_i)_c (x_{n-i})_c$ ... (7.18)
and
$y' = \sum_i (a_i)_b (x_{n-i})_b$ ... (7.19)
where i = 0, 1.
Subtracting (7.19) from (7.18) and rearranging terms, we obtain
$y = y'' - y' = \{(a_0)_c - (a_0)_b\}(x_n)_c + \{(x_n)_c - (x_n)_b\}(a_0)_b + \{(a_1)_c - (a_1)_b\}(x_{n-1})_c + \{(x_{n-1})_c - (x_{n-1})_b\}(a_1)_b$ ... (7.20)
If $a \in \{0, 1, \dots, c-1\}$, we may express it using equation (7.11) as
$\{a - R_b(a)\} = b\,Q_b(a)$ ... (7.21)
We observe now that in the R.H.S. of equation (7.20), every term in the curly brackets is of the form $\{a - R_b(a)\}$ which, from equation (7.21), implies that it is divisible by b.
Therefore y itself is divisible by b and may be written as
$y = y'' - y' = qb$, q an integer,
or
$y'' = qb + y'$ ... (7.22)
If y'' and y' are now written in the form shown in equation (7.11), the above equation may be expressed as
$c\,Q_c(y'') + R_c(y'') = qb + b\,Q_b(y') + R_b(y')$
or
$R_c(y'') = b[q + Q_b(y')] - c\,Q_c(y'') + R_b(y')$. ... (7.23)
We have said, however, that b divides c, i.e. let c = kb say. Then
$R_c(y'') = bG + R_b(y')$ ... (7.24)
where $G = [q + Q_b(y') - k\,Q_c(y'')]$.
One may easily work out that $R_c(y'')$ and $R_b(y')$ are actually $(Z_n)_c$ and $(Z_n)_b$ respectively, as described by equation (7.12).
Furthermore, we may express them in the form given by equation (7.15). As a result we can now express equation (7.24) as
$\lambda_c\{s_c; i_c\} = bG + \lambda_b\{s_b; i_b\}$,
thus proving equation (7.17).
Consequently, the triple mappings $(h_1, h_2, h_3)$ are preserved for both state and output transitions.
Finally, to show that it is necessary that b divides c, we first observe that c and 0 are both divisible by c, i.e.
$R_c(c) = 0$ and $R_c(0) = 0$.
Also, $R_b(0) = 0$.
If c is not a multiple of b, then
where 
Although $R_c(c) = R_c(0) = 0$,
$R_b(c) \ne R_b(0) = 0$.
Thus, the element $c \equiv 0 \mod c$ has two distinct images under the mapping $R_b$, in which case the mapping is not a morphism.
As an example, let b = 3 and c = 6. In order to simplify the illustration, we will consider only the state transition or flow table.
That for $(DF)_6$ is shown in Table 7.3, in which the row states are reordered to demonstrate the homomorphism. In this table we also include the images of the states of $(DF)_6$ w.r.t. the mapping $h_2$, e.g., the particular subset of state-pairs [(1,2), (1,5), (4,2), (4,5)] is mapped to the single state-pair [1, 2] of $(DF)_3$, the homomorphic image of $(DF)_6$.
The flow table for this homomorphic image is shown in Table 7.4.
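The two conditions of Theorem 7.2 lend themselves to an exhaustive check over all states and inputs. The sketch below is an illustration only: the coefficient triple (5, 1, 4) is an arbitrary assumption, and the helper names are not from the thesis. It confirms the homomorphism for b = 3, c = 6 and shows it failing for b = 4, c = 6, where b does not divide c.

```python
from itertools import product


def delta(m, state, x):
    """State transition of the modulo-m filter: shift the input in."""
    return (x % m, state[0] % m)


def lam(m, coeffs, state, x):
    """Output of the modulo-m filter for the given (unreduced) coefficients."""
    a0, a1, a2 = (c % m for c in coeffs)
    return (a0 * (x % m) + a1 * (state[0] % m) + a2 * (state[1] % m)) % m


def is_homomorphic_image(b, c, coeffs):
    """Check equations (7.16) and (7.17) for every state and input of (DF)_c."""
    for x1, x2, x in product(range(c), repeat=3):
        s_c = (x1, x2)
        s_b = (x1 % b, x2 % b)
        nxt = delta(c, s_c, x)
        if (nxt[0] % b, nxt[1] % b) != delta(b, s_b, x % b):
            return False
        if lam(c, coeffs, s_c, x) % b != lam(b, coeffs, s_b, x % b):
            return False
    return True
```

Note that the state transition condition alone would not expose the failure for b = 4, c = 6; it is the output condition that breaks.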
In general, a homomorphic image of a modulo-M filter is a 
"coarse" version of it which still retains its essential characteristics. 
7.3.3 Parallel connection of $(DF)_b$ and $(DF)_c$.
We now analyse the parallel operation of two homomorphic image filters $(DF)_b$ and $(DF)_c$ in which b does not divide c and vice versa, but which have the greatest common divisor d, i.e.
b = b'd and c = c'd say, where b' and c' are coprime. ... (7.25)
With $(DF)_b$ and $(DF)_c$ described by the quintuples as in Section 7.3.2, let $(DF)_p$ be their parallel connection. Then, if Definition 3.4b in Chapter 3 is applied, $(DF)_p$ is given by
$(DF)_p = (DF)_b \,\|\, (DF)_c = [(S_b \times S_c), (I_b \times I_c), (O_b \times O_c), \delta_p, \lambda_p]$
where
$\delta_p\{(s_b, s_c); (i_b, i_c)\} = \{\delta_b(s_b, i_b), \delta_c(s_c, i_c)\}$ ... (7.26)
and
$\lambda_p\{(s_b, s_c); (i_b, i_c)\} = \{\lambda_b(s_b, i_b), \lambda_c(s_c, i_c)\}$ ... (7.27)
We will now determine the relationship between p and the pair b, c.
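Table 7.3. Flow table for the modulo-6 digital filter $(DF)_6$. The rows are grouped by the image $h_2 = R_3[s_6]$ of the present state $s_6 = [(x_{n-1})_6, (x_{n-2})_6]$; the entries are next states.

Image      Present     Next state for input $(x_n)_6$ =
$R_3[s_6]$ state $s_6$ 0      1      2      3      4      5
0, 0       0, 0        0, 0   1, 0   2, 0   3, 0   4, 0   5, 0
           0, 3        0, 0   1, 0   2, 0   3, 0   4, 0   5, 0
           3, 0        0, 3   1, 3   2, 3   3, 3   4, 3   5, 3
           3, 3        0, 3   1, 3   2, 3   3, 3   4, 3   5, 3
0, 1       0, 1        0, 0   1, 0   2, 0   3, 0   4, 0   5, 0
           0, 4        0, 0   1, 0   2, 0   3, 0   4, 0   5, 0
           3, 1        0, 3   1, 3   2, 3   3, 3   4, 3   5, 3
           3, 4        0, 3   1, 3   2, 3   3, 3   4, 3   5, 3
0, 2       0, 2        0, 0   1, 0   2, 0   3, 0   4, 0   5, 0
           0, 5        0, 0   1, 0   2, 0   3, 0   4, 0   5, 0
           3, 2        0, 3   1, 3   2, 3   3, 3   4, 3   5, 3
           3, 5        0, 3   1, 3   2, 3   3, 3   4, 3   5, 3
1, 0       1, 0        0, 1   1, 1   2, 1   3, 1   4, 1   5, 1
           1, 3        0, 1   1, 1   2, 1   3, 1   4, 1   5, 1
           4, 0        0, 4   1, 4   2, 4   3, 4   4, 4   5, 4
           4, 3        0, 4   1, 4   2, 4   3, 4   4, 4   5, 4
1, 1       1, 1        0, 1   1, 1   2, 1   3, 1   4, 1   5, 1
           1, 4        0, 1   1, 1   2, 1   3, 1   4, 1   5, 1
           4, 1        0, 4   1, 4   2, 4   3, 4   4, 4   5, 4
           4, 4        0, 4   1, 4   2, 4   3, 4   4, 4   5, 4
1, 2       1, 2        0, 1   1, 1   2, 1   3, 1   4, 1   5, 1
           1, 5        0, 1   1, 1   2, 1   3, 1   4, 1   5, 1
           4, 2        0, 4   1, 4   2, 4   3, 4   4, 4   5, 4
           4, 5        0, 4   1, 4   2, 4   3, 4   4, 4   5, 4
2, 0       2, 0        0, 2   1, 2   2, 2   3, 2   4, 2   5, 2
           2, 3        0, 2   1, 2   2, 2   3, 2   4, 2   5, 2
           5, 0        0, 5   1, 5   2, 5   3, 5   4, 5   5, 5
           5, 3        0, 5   1, 5   2, 5   3, 5   4, 5   5, 5
2, 1       2, 1        0, 2   1, 2   2, 2   3, 2   4, 2   5, 2
           2, 4        0, 2   1, 2   2, 2   3, 2   4, 2   5, 2
           5, 1        0, 5   1, 5   2, 5   3, 5   4, 5   5, 5
           5, 4        0, 5   1, 5   2, 5   3, 5   4, 5   5, 5
2, 2       2, 2        0, 2   1, 2   2, 2   3, 2   4, 2   5, 2
           2, 5        0, 2   1, 2   2, 2   3, 2   4, 2   5, 2
           5, 2        0, 5   1, 5   2, 5   3, 5   4, 5   5, 5
           5, 5        0, 5   1, 5   2, 5   3, 5   4, 5   5, 5

Table 7.4. Flow table for the $R_3[(DF)_6]$ homomorphic image.

Present     Next state for input =
state       0      1      2
0, 0        0, 0   1, 0   2, 0
0, 1        0, 0   1, 0   2, 0
0, 2        0, 0   1, 0   2, 0
1, 0        0, 1   1, 1   2, 1
1, 1        0, 1   1, 1   2, 1
1, 2        0, 1   1, 1   2, 1
2, 0        0, 2   1, 2   2, 2
2, 1        0, 2   1, 2   2, 2
2, 2        0, 2   1, 2   2, 2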
First, consider the element $y \in \{0, 1, 2, \dots, k-1\}$ and define the mapping $\psi$ as
$\psi(y) = [R_b(y), R_c(y)]$ ... (7.28)
i.e. y is reduced modulo b and modulo c concurrently.
Lemma 7.2. If k = bc/d then $\psi$ is a one-to-one mapping.
Proof. For an arbitrary y, we first form the sequence
{y; y+1; y+2; ...; y+i; ...} mod k.
Since $y + k \equiv y \mod k$, the above sequence will first repeat at the k th step.
As y is incremented, so will its image pair $R_b(y)$ and $R_c(y)$. Furthermore, we have
$R_b(y) + q'b \equiv R_b(y) \mod b$
and
$R_c(y) + q''c \equiv R_c(y) \mod c$.
Consequently, the pair $\{R_b(y), R_c(y)\}$ repeats when q' and q'' are such that
$q'b = q''c$. ... (7.29)
Applying equation (7.25), we have
$q'b'd = q''c'd$,
i.e. $q'b' = q''c'$,
which says that q'b' is a multiple of c'.
As b' and c' are co-prime, this is only possible if q' itself is a multiple of c', i.e. q' = tc' say.
We can write q'b as
$(tc')b = t(c/d)b$.
The smallest integer value of q'b is when t = 1, giving us
$q'b = (bc)/d$.
Therefore, the pair $\{R_b(y), R_c(y)\}$ will first repeat itself at the (bc)/d th step. Since y first repeats at the k th step, then we have
$k = (bc)/d$.
Each of the values {y; y+1; ...; y+(k-1)} mod k has a unique $\psi$-image in the sequence
$\{\psi(y); \psi(y+1); \dots; \psi(y+(k-1))\}$.
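Lemma 7.2 can be confirmed by counting distinct image pairs. The following sketch assumes only the definition of the mapping given in equation (7.28); the function names are illustrative. It also shows that the mapping stops being one-to-one as soon as the range exceeds k = bc/d.

```python
from math import gcd


def psi(y, b, c):
    """Reduce y modulo b and modulo c concurrently."""
    return (y % b, y % c)


def psi_is_one_to_one(b, c, k):
    """True if the k values 0..k-1 have k distinct images under psi."""
    return len({psi(y, b, c) for y in range(k)}) == k


b, c = 4, 6
k = b * c // gcd(b, c)  # k = bc/d with d = g.c.d.(b, c)
```

Here b = 4 and c = 6 share the divisor d = 2, so k = 12; the pair repeats for the first time at y = 12.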
We are now in a position to state the following theorem.
Theorem 7.3. The filter $(DF)_k$ is isomorphic to $(DF)_p$, the parallel connection of $(DF)_b$ and $(DF)_c$, with the mapping $\psi$ given by
$\psi : i_k \to [R_b(i_k), R_c(i_k)]$
$\psi : s_k \to [R_b(s_k), R_c(s_k)]$
$\psi : o_k \to [R_b(o_k), R_c(o_k)]$
such that
$\psi[\delta_k\{s_k; i_k\}] = \delta_p\{\psi(s_k); \psi(i_k)\}$ ... (7.30)
and
$\psi[\lambda_k\{s_k; i_k\}] = \lambda_p\{\psi(s_k); \psi(i_k)\}$ ... (7.31)
Proof. Considering the state transition function first, we expand the left-hand side of equation (7.30), thus obtaining
$\psi[\delta_k\{s_k; i_k\}] = (R_b[\delta_k\{s_k; i_k\}], R_c[\delta_k\{s_k; i_k\}])$ ... (7.32)
If we apply equation (7.16) of Theorem 7.2, we will have
$R_b[\delta_k\{s_k; i_k\}] = \delta_b\{s_b; i_b\}$
and
$R_c[\delta_k\{s_k; i_k\}] = \delta_c\{s_c; i_c\}$.
Hence
$\psi[\delta_k\{s_k; i_k\}] = (\delta_b\{s_b; i_b\}, \delta_c\{s_c; i_c\})$ ... (7.33)
Applying equation (7.26) to the R.H.S. of (7.33) we finally obtain
$\psi[\delta_k\{s_k; i_k\}] = \delta_p\{(s_b, s_c); (i_b, i_c)\} = \delta_p\{\psi(s_k); \psi(i_k)\}$
and hence the proof.
The proof of equation (7.31) may be obtained in a similar way.
The resulting isomorphism described by Theorem 7.3 is shown diagrammatically in Fig. 7.5.
Of course if k = (bc)/d = M, then the parallel connection of $(DF)_b$ and $(DF)_c$ realises $(DF)_M$.
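A brute-force check of Theorem 7.3 follows the same pattern as for the homomorphic images. The sketch assumes the shift-register state transition and sum-of-products output used throughout this chapter; the coefficient triple (7, 5, 11) and the function names are arbitrary illustrations rather than values from the thesis.

```python
from itertools import product
from math import gcd


def delta(m, state, x):
    """State transition of the modulo-m filter."""
    return (x % m, state[0] % m)


def lam(m, coeffs, state, x):
    """Output of the modulo-m filter."""
    a0, a1, a2 = (c % m for c in coeffs)
    return (a0 * (x % m) + a1 * (state[0] % m) + a2 * (state[1] % m)) % m


def parallel_realises(b, c, coeffs):
    """Check (7.30), (7.31) and that the output mapping is one-to-one."""
    k = b * c // gcd(b, c)
    for x1, x2, x in product(range(k), repeat=3):
        s_k = (x1, x2)
        s_b, s_c = (x1 % b, x2 % b), (x1 % c, x2 % c)
        nxt = delta(k, s_k, x)
        image = ((nxt[0] % b, nxt[1] % b), (nxt[0] % c, nxt[1] % c))
        if image != (delta(b, s_b, x % b), delta(c, s_c, x % c)):
            return False
        out = lam(k, coeffs, s_k, x)
        if (out % b, out % c) != (lam(b, coeffs, s_b, x % b),
                                  lam(c, coeffs, s_c, x % c)):
            return False
    return len({(y % b, y % c) for y in range(k)}) == k
```

For b = 4 and c = 6 the check runs over all 12 x 12 x 12 state and input combinations of the modulo-12 filter.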
Fig. 7.5. Decomposition of $(DF)_k$, k = (bc)/d, into a parallel connection of $(DF)_b$ and $(DF)_c$.
7.3.4 Cascade decomposition structure of a modulo $p^a$ digital filter.
In general, a modulo M non-recursive digital filter may be realised as a cascade connection of its homomorphic image $(DF)_d$, which may be regarded as a 'predecessor' component, and a 'successor' component.
In particular, it is usual in practice for M to be of the form $p^a$, where p is a prime and a an integer. In such a case, a detailed analytical description of the cascade decomposition of $(DF)_{p^a}$ can be derived, thus characterising completely the structures of the 'predecessor' and 'successor' components, and also the combinational mapping between them.
7.3.4.0 Notation.
Let $a^* \in \{0, 1, \dots, p^a - 1\}$ be an input, state-component, or output element of $(DF)_{p^a}$, and d be an integer < a.
Dividing $a^*$ by $p^d$ and $p^{a-d}$ in turn we obtain the following:
$a^* = p^d\, Q_{p^d}(a)^* + R_{p^d}(a)^*$ ... (7.34)
and
$a^* = p^{a-d}\, Q_{p^{a-d}}(a)^* + R_{p^{a-d}}(a)^*$ ... (7.35)
where
$0 \le R_{p^d}(a)^* < p^d$,
$0 \le Q_{p^d}(a)^* < p^{a-d}$, i.e. $Q_{p^d}(a)^* \in \{0, 1, 2, \dots, p^{a-d} - 1\}$,
and
$0 \le R_{p^{a-d}}(a)^* < p^{a-d}$.
In the subsequent discussion, we often interchange the notations
$Q_{p^d}(a)^* \leftrightarrow Q'(a)^*$,
$Q_{p^{a-d}}(a)^* \leftrightarrow Q''(a)^*$,
and similarly for $R_{p^d}(a)^*$ and $R_{p^{a-d}}(a)^*$.
Finally, if we multiply equation (7.35) by $p^d$, we get
$p^d a^* = p^a\, Q''(a)^* + p^d\, R''(a)^*$ ... (7.36)
Since we are operating in modulo $p^a$ arithmetic, however, we thus have
$p^d a^* \equiv p^d\, R''(a)^* \mod p^a$ ... (7.37)
7.3.4.1 Analysis of cascade structure of $(DF)_{p^a}$.
In the following, we will show that $(DF)_{p^a}$ can be decomposed into a cascade connection of two image filters $(DF)_{p^d}$ and $(DF)_{p^{a-d}}$, with a simple combinational mapping between them.
Now let $(a_i)^*$, $(x_{n-i})^*$ and $(z_n)^*$, i = 0, 1, 2, be the coefficients, data and output of $(DF)_{p^a}$. Thus, the filter algorithm is given by
$(z_n)^* \equiv \sum_{i=0}^{2} (a_i)^* (x_{n-i})^* \mod p^a$ ... (7.38)
(From now on, we will assume that it is understood that $(z_n)^*$ is computed in modulo $p^a$ arithmetic).
If we now express $(x_{n-i})^*$ in the form shown in equation (7.34), we get
$(z_n)^* \equiv \sum_{i=0}^{2} (a_i)^* \{p^d\, Q'(x_{n-i})^* + R'(x_{n-i})^*\}$
or
$(z_n)^* \equiv \sum_{i=0}^{2} \{p^d (a_i)^*\, Q'(x_{n-i})^* + (a_i)^*\, R'(x_{n-i})^*\}$.
Using equation (7.37) to replace $p^d (a_i)^*$ by $p^d R''(a_i)^*$, we obtain
$(z_n)^* \equiv \sum_{0}^{2} \{p^d\, R''(a_i)^*\, Q'(x_{n-i})^* + (a_i)^*\, R'(x_{n-i})^*\}$ ... (7.39)
where
$E = R''(a_i)^*\, Q'(x_{n-i})^*$ and $F = (a_i)^*\, R'(x_{n-i})^*$.
In equation (7.39) above, we express the terms
$\sum_0^2 E$ and $\sum_0^2 F$
in the forms given by equations (7.35) and (7.34) respectively. We get
$(z_n)^* \equiv p^d \{p^{a-d}\, Q''\{\sum_0^2 E\} + R''\{\sum_0^2 E\}\} + p^d\, Q'\{\sum_0^2 F\} + R'\{\sum_0^2 F\}$ ... (7.40)
Writing $Q'\{\sum_0^2 F\}$ in the form given in equation (7.35), we have
$p^d\, Q'\{\sum_0^2 F\} = p^d \{p^{a-d}\, Q''[Q'\{\sum_0^2 F\}] + R''[Q'\{\sum_0^2 F\}]\} \equiv p^d\, R''[Q'\{\sum_0^2 F\}]$.
Substituting this value into equation (7.40) we obtain
$(z_n)^* \equiv p^d \{R''\{\sum_0^2 E\} + R''[Q'\{\sum_0^2 F\}]\} + R'\{\sum_0^2 F\}$ ... (7.41)
If we now substitute the actual expressions for E and F, we obtain from equation (7.41)
$(z_n)^* \equiv p^d \{R''\{\sum_0^2 R''(a_i)^*\, Q'(x_{n-i})^*\} + R''[Q'\{\sum_0^2 (a_i)^*\, R'(x_{n-i})^*\}]\} + R'\{\sum_0^2 (a_i)^*\, R'(x_{n-i})^*\}$ ... (7.42)
In the above equation, since the term in the final curly brackets is computed in modulo $p^d$ arithmetic, we may replace $(a_i)^*$ by $R'(a_i)^*$.
We finally obtain
$(z_n)^* \equiv p^d \{R''\{\sum_0^2 R''(a_i)^*\, Q'(x_{n-i})^*\} + R''[Q'\{\sum_0^2 (a_i)^*\, R'(x_{n-i})^*\}]\} + R'\{\sum_0^2 R'(a_i)^*\, R'(x_{n-i})^*\} \mod p^a$ ... (7.43)
From equation (7.34) in Section 7.3.4.0 we know that $Q'(x_{n-i})^*$ comes from the set
$\{0, 1, 2, \dots, p^{a-d} - 1\}$,
which is the same one that $R''(a_i)^*$, by definition (i.e. equation (7.35)), comes from.
In equation (7.43) above, we observe that each of the terms in the curly brackets is identical in form to that given by equation (7.12) in Section 7.3.1, which described a general modulo filter.
Therefore, we can say that two basic sub-machines of $(DF)_{p^a}$ are actually the modulo filters $(DF)_{p^d}$ and $(DF)_{p^{a-d}}$, whose respective outputs $(z_n)'$ and $(z_n)''$ are given by
$(z_n)' = R'\{\sum_{i=0}^{2} R'(a_i)^*\, R'(x_{n-i})^*\}$ ... (7.44)
and
$(z_n)'' = R''\{\sum_{i=0}^{2} R''(a_i)^*\, Q'(x_{n-i})^*\}$ ... (7.45)
Let
$f = R''(Q'\{\sum_{i=0}^{2} (a_i)^*\, R'(x_{n-i})^*\})$ ... (7.46)
The output of $(DF)_{p^a}$ is given by
$(z_n)^* \equiv p^d [(z_n)'' + f] + (z_n)' \equiv p^d K + (z_n)' \mod p^a$ ... (7.47)
where
$K = R''[(z_n)'' + f]$,
i.e. $0 \le K < p^{a-d}$ ($\therefore K = Q'(z_n)^*$).
Also $0 \le (z_n)' < p^d$.
Equation (7.47) is the describing equation for our original modulo $p^a$ digital filter.
The block diagram of the corresponding circuit realisation is shown in Fig. 7.6, in which stored-logic units are used to implement the relevant functions.
The storage capacities of the three components of $(DF)_{p^a}$ are shown below.
$(DF)_{p^d}$: $(p^d)^3 \log_2(p^d)$ word-bits
$(DF)_{p^{a-d}}$: $(p^{a-d})^3 \log_2(p^{a-d})$ word-bits
f-combinational matrix: $(p^d)^3 \log_2(p^{a-d})$ word-bits.
The mapping $\phi$ shown in Fig. 7.6 transforms the input $(x_n)^*$ of $(DF)_{p^a}$ into the pair shown below, i.e.
$\phi(x_n)^* = [Q'(x_n)^*, R'(x_n)^*]$.
If $s^*$, $i^*$, $o^*$ are the state, input and output elements of $(DF)_{p^a}$, and $s_c$, $i_c$, $o_c$ those of its equivalent realised as a cascade realisation of $(DF)_{p^d}$ and $(DF)_{p^{a-d}}$, then we have
$s_c = [[Q'(x_{n-1})^*, R'(x_{n-1})^*], [Q'(x_{n-2})^*, R'(x_{n-2})^*]]$
and 
The cascade decomposition technique that we have presented here may of course be applied to each of the components $(DF)_{p^d}$ and $(DF)_{p^{a-d}}$ to simplify them still further, until we reach the point when the two components are each a mod-p filter.
Fig. 7.6. Cascade realisation of $(DF)_{p^a}$ from $(DF)_{p^d}$, $(DF)_{p^{a-d}}$ and a combinational matrix. The input mapping $\phi$ splits $(x_n)$ into $Q'(x_n)$ and $R'(x_n)$; the $(DF)_{p^{a-d}}$ store holds $(p^{a-d})^3 \log(p^{a-d})$ word-bits and the $(DF)_{p^d}$ store $(p^d)^3 \log(p^d)$ word-bits; the outputs $Q'(z_n)$ and $R'(z_n)$ are recombined by $\phi^{-1}$ to give $(z_n)$.
The consequence of this chain of decomposition levels is that it allows one to select the pair of component machines that is most suited, in terms of stored-logic capacity, to available devices.
To have an idea of the effect of the cascade decomposition on the storage capacity of the overall realisation, let a = 2d and consider only the first level decomposition.
Then we have the component filters $(DF)_{p^d}$ and $(DF)_{p^{a-d}} = (DF)_{p^{2d-d}} = (DF)_{p^d}$.
All three components of $(DF)_{p^a}$ have identical stored-logic capacity equal to
$(p^d)^3 \log_2(p^d)$ word-bits,
resulting in an overall capacity of
$3 (p^d)^3 \log_2(p^d)$ word-bits.
The direct stored-logic implementation of $(DF)_{p^a}$ will require
$(p^a)^3 \log_2(p^a) = (p^{2d})^3\, 2d \log_2(p)$ word-bits.
The ratio of the capacity required for the cascade realisation to that of the direct stored-logic implementation is
$\frac{3 (p^{3d})\, d \log_2 p}{(p^{6d})\, 2d \log_2 p} = \frac{3}{2} \left(\frac{1}{p^{3d}}\right)$.
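The storage comparison can be reproduced numerically. The sketch below uses the three component capacities listed for the cascade realisation and the direct capacity of this section, with the logarithm left unrounded as in those formulas; the function names are illustrative only.

```python
from math import isclose, log2


def direct_capacity(p, a):
    """Word-bits for a single look-up table realising the modulo p**a filter."""
    return (p ** a) ** 3 * log2(p ** a)


def cascade_capacity(p, a, d):
    """Word-bits for the two image filters plus the f-combinational matrix."""
    low = (p ** d) ** 3 * log2(p ** d)              # modulo p**d filter
    high = (p ** (a - d)) ** 3 * log2(p ** (a - d))  # modulo p**(a-d) filter
    f_matrix = (p ** d) ** 3 * log2(p ** (a - d))    # combinational mapping f
    return low + high + f_matrix


# Case a = 2d with p = 2, d = 2: a modulo-16 filter split into modulo-4 parts.
ratio = cascade_capacity(2, 4, 2) / direct_capacity(2, 4)
```

For this case the cascade needs 384 word-bits against 16384 for the direct table, in agreement with the ratio 3/(2 p^(3d)) derived above.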
As an example, let p = 3, a = 3 and d = 2. Then the modulus $M = p^a = 27$, $p^d = 3^2 = 9$, and $p^{a-d} = 3^1 = 3$. Also let the resulting $(DF)_{p^a}$ have the coefficient values
$(a_0)^* = 21$, $(a_1)^* = 17$ and $(a_2)^* = 16$.
Let the data values at a particular sampling instant be
$(x_n)^* = 15$, $(x_{n-1})^* = 11$ and $(x_{n-2})^* = 24$.
The direct approach will yield
$(z_n)^* \equiv \sum_{i=0}^{2} (a_i)^* (x_{n-i})^* \mod (3^3)$
$\equiv 21 \times 15 + 17 \times 11 + 16 \times 24 \mod (3^3)$
$\equiv 18 + 25 + 6 \mod (3^3)$
$\equiv 22$.
Using the decomposition technique developed, we first obtain
$\phi(x_n)^* = \phi(15) = (1, 6) = (Q'(15), R'(15))$
$\phi(x_{n-1})^* = \phi(11) = (1, 2) = (Q'(11), R'(11))$
$\phi(x_{n-2})^* = \phi(24) = (2, 6) = (Q'(24), R'(24))$
and
$R'(a_0)^* = 3$, $R'(a_1)^* = 8$, $R'(a_2)^* = 7$,
$R''(a_0)^* = 0$, $R''(a_1)^* = 2$, $R''(a_2)^* = 1$.
From equations (7.44), (7.45) and (7.46) we have
$(z_n)' = R'\{(3 \times 6) + (8 \times 2) + (7 \times 6)\} = R'\{0 + 7 + 6\} = 4$,
$(z_n)'' = R''\{(0 \times 1) + (2 \times 1) + (1 \times 2)\} = R''\{0 + 2 + 2\} = 1$
and
$f = R''(Q'[(21 \times 6) + (17 \times 2) + (16 \times 6)]) = R''(Q'[256]) = R''(28) = 1$.
Hence
$K = R''[(z_n)'' + f] = R''[2] = 2$
and finally from equation (7.47) we obtain
$(z_n)^* \equiv 3^2 \cdot 2 + 4 = 22$, i.e. $\phi(z_n) = (2, 4)$.
If we apply the $\phi$ function to the $(z_n)^*$ obtained via the direct approach, we also get
$\phi[22] = (2, 4)$.
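The worked example can be replayed, and the decomposition checked against the direct computation for every possible data triple. The sketch below follows equations (7.44) to (7.47) as they are used in the example; the function names are illustrative and not taken from the thesis.

```python
from itertools import product


def cascade_output(p, a, d, coeffs, xs):
    """Return (K, z1, z) from the cascade structure of the modulo p**a filter."""
    pd, pad = p ** d, p ** (a - d)
    # (7.44): output of the modulo p**d image filter
    z1 = sum((c % pd) * (x % pd) for c, x in zip(coeffs, xs)) % pd
    # (7.45): output of the modulo p**(a-d) image filter
    z2 = sum((c % pad) * (x // pd) for c, x in zip(coeffs, xs)) % pad
    # (7.46): combinational mapping f between the two components
    f = (sum(c * (x % pd) for c, x in zip(coeffs, xs)) // pd) % pad
    # (7.47): recombine the two parts
    K = (z2 + f) % pad
    return K, z1, pd * K + z1


def direct_output(p, a, coeffs, xs):
    """Sum of products reduced modulo p**a."""
    return sum(c * x for c, x in zip(coeffs, xs)) % p ** a


example = cascade_output(3, 3, 2, (21, 17, 16), (15, 11, 24))
```

For the example values, `example` evaluates to (2, 4, 22), matching the pair (2, 4) and the direct result 22 obtained above.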
7.3.5 Lattice of homomorphic images of $(DF)_M$.
From the ideas developed in Sections 7.3.1 to 7.3.4, we see that a general modulo-M filter $(DF)_M$ may be decomposed into a parallel and/or cascade connection of submachines, i.e. its homomorphic images.
The relationship between pairs of these images can be compactly and visually represented by a lattice developed below.
Let d, $d_1$, $d_2$, D be factors of M and the corresponding modulo filters be $(DF)_d$, $(DF)_{d_1}$, $(DF)_{d_2}$ and $(DF)_D$.
Also let d = g.c.d.$(d_1, d_2)$ and D = l.c.m.$(d_1, d_2)$, and $F_D$ be the set of all unique images of $(DF)_M$, i.e.
$F_D = \{h \mid h = (DF)_d$, where d is a factor of M$\}$.
Now let us define a relation '$\triangleleft$' on $F_D$ to mean that if
$(DF)_b \triangleleft (DF)_c$,
then $(DF)_b$ 'is a homomorphic image of' $(DF)_c$.
We have
$(DF)_d \triangleleft (DF)_{d_1}$ and $(DF)_d \triangleleft (DF)_{d_2}$.
Since d is the greatest common divisor of both $d_1$ and $d_2$, then $(DF)_d$ is the greatest (in terms of input, state component, and output symbols) modulo F.S.M. filter that is common to both $(DF)_{d_1}$ and $(DF)_{d_2}$.
Similarly
$(DF)_{d_1} \triangleleft (DF)_D$ and $(DF)_{d_2} \triangleleft (DF)_D$.
As D is the least common multiple of $d_1$ and $d_2$, then $(DF)_D$ is the smallest modulo filter that $(DF)_{d_1}$ and $(DF)_{d_2}$ are the images of. ($(DF)_D$ is identical to $(DF)_k$ in Section 7.3.3.)
Thus, the set $F_D$ is partially ordered by $\triangleleft$ and hence $(F_D, \triangleleft)$ is a lattice, which has a least upper bound $(DF)_D$ and a greatest lower bound $(DF)_d$ for every pair of images $(DF)_{d_1}$ and $(DF)_{d_2}$.
It is not difficult to see that this lattice of homomorphic images is identical to the lattice of divisors of M with the 'factor' relation discussed in Section 7.1.2.
7.4 Conclusions. 
The F.S.M. models for general modulo-M adders and multipliers have been successfully analysed for S.P. partitions. The lattice of these partitions for the adder is related in a simple way to the well known lattice of the divisors of M with the 'factor' relation, and was shown to be a sub-lattice of that for the mod-M multiplier.
The general non-recursive second-order digital filter has been suitably transformed and modelled to make it more amenable to algebraic partition analysis.
This simplified model was shown to be structurally decomposable into a parallel and/or a nested cascade connection of submachines, whose lattice is identical to that of the divisors of M mentioned previously.
In general, in the author's opinion, the simplified model of the filter section is not unrealistic, since in practice it may be regarded as being 'embedded' in the actual section. Furthermore, although the decomposed realisations of $(DF)_M$ require input and output combinational mappings, which, if implemented with stored-logic devices, will restrict the wordlengths of the filter's data and coefficients, this may be overcome by developing practical filter sections of short wordlengths.
In the next chapter we will see how this may be achieved.
CHAPTER 8 
MODULAR PARTITIONING OF BASIC 
SECOND-ORDER DIGITAL FILTER 
8.0 Introduction. 
In this and subsequent chapters an approach different from that 
discussed in previous chapters is developed to partition the basic 
second-order digital filter. 
The design philosophy is initiated by the fact that, as explained 
in Chapter 2, a general digital filter of a high order is realised, 
not directly, but as a parallel or cascade connection of basic second-
order sections, each being identical in structure. 
The open question then arising is whether or not it is possible
to apply a similar idea to the basic second-order section itself and 
factor or partition it into a systematic interconnection of smaller, 
preferably structurally identical, modules. 
In response to this question, we have successfully extracted a 
basic computational unit from the algorithm of the general second-
order filter. This unit, which we have termed the digit convolution 
module has many desirable features in terms of hardware realisation. 
Furthermore, we have also derived the simplest elementary form of the 
convolution module which we have called the primitive convolution cell. 
The proposed modular approach also has a useful consequence in 
the frequency domain analysis of digital filters. 
In the following discussion, the general theory is presented 
first, followed by a detailed study of a special case which will be 
useful in practical implementations. A short discussion on the handling 
of negative sample values is also given. 
8.1 General modular partition theory. 
In this section we show how the digit convolution module is extracted, and the technique logically extended to derive the primitive convolution cell. The concept of digit templates for frequency analysis will also be explained.
8.1.0 Sequence elements represented as sequences. 
In the purely analytical design and analysis of digital filters, and even in their off-line simulations on general-purpose computers, there is the tendency to regard each element of the input and impulse response sequences of a digital filter, i.e. $\{X_{n-i}\}$ and $\{A_i\}$ respectively, as a single conceptual entity.
In the conventional approach, there is also the assumption that once the filter coefficients have been derived, the theoretical design problem is completed. The subsequent hardware implementation is then regarded as essentially an exercise in switching circuit theory, with hardware designed at the bit level.
As an attempt to bridge the gap between formal filter design and practical hardware realisations with a systematic theory, we propose the following approach.
We observe, first of all, that number elements are most frequently represented as the sum of weighted digits, that is to say, if N is a natural number, then
$N = \sum_{k=0}^{L-1} n_k w_k$, .... (8.0)
where the $n_k$'s and $w_k$'s are the digits and weights respectively.
The most common form of this weighted digit representation is one in which the $w_k$'s are integer powers of a fixed number or base, R say.
Then we have
$N = \sum_{k=0}^{L-1} n_k R^k$ .... (8.1)
In both cases, N may be represented as an L-tuple digit vector, i.e.,
$N = \{n_{L-1}, n_{L-2}, \dots, n_k, \dots, n_0\}$ .... (8.2)
(In equation (8.2), it is implicitly understood that any vector element $n_k$ say is weighted accordingly by $R^k$).
Consider, now, for simplicity, just the non-recursive part of
the second-order section. If $Z_n$ is the corresponding output, then
$$Z_n = \sum_{i=0}^{2} A_i X_{n-i}$$ .... (8.3)
The filter impulse response is given by
$$\{A_i\} = \{A_0, A_1, A_2\}$$
and at a particular sampling period nT, the present and past input
samples are given by the sequence
$$\{X_{n-i}\} = \{X_n, X_{n-1}, X_{n-2}\}$$
If the vector representation in equation (8.2) is applied to
the elements of the sequences $\{A_i\}$ and $\{X_{n-i}\}$, we now see that each
of their elements is itself a sequence, i.e.,
$$A_i = \{A_{i,L''-1}, A_{i,L''-2}, \ldots, A_{i,\ell''}, \ldots, A_{i,0}\}$$
and
$$X_{n-i} = \{X_{n-i,L'-1}, \ldots, X_{n-i,\ell'}, \ldots, X_{n-i,0}\}$$
Thus, while the overall filtering operation consists of the
convolution between the sequences $\{A_i\}$ and $\{X_{n-i}\}$, the internal
computation during a sampling period T is actually composed of operations
between digit sequences. The detailed nature of these internal operations
will now be presented.
8.1.1 Extraction of a basic convolution unit.
The filter algorithm described by equation (8.3) is normally
carried out as shown in the block diagram in Fig. 8.0. If, however,
we use the vector representation for the data and coefficient words,
we arrive at the block diagram shown in Fig. 8.1. There, we have shown
the operation between the $\ell''$th digits of the digit vectors of the $A_i$'s,
and the $\ell'$th digits of the vectors of the inputs $X_{n-i}$'s.
As shown in Fig. 8.1, using these digits, we then form the typical
partial convolution given by
$$Z_{n,\ell',\ell''} = \left\{ \sum_{i=0}^{2} (A_{i,\ell''})(X_{n-i,\ell'}) \right\} (R'')^{\ell''} (R')^{\ell'}$$ .... (8.4)
The overall or actual convolution product is finally obtained
by summing over all such typical partial convolutions, thus obtaining,
$$Z_n = \sum_{\ell'=0}^{L'-1} (R')^{\ell'} \sum_{\ell''=0}^{L''-1} (R'')^{\ell''} \left\{ \sum_{i=0}^{2} (A_{i,\ell''})(X_{n-i,\ell'}) \right\}$$ .... (8.5)
where the $A_i$'s and $X_{n-i}$'s are expressed as L''-tuple and L'-tuple digit
vectors respectively.
Fig. 8.0. Direct implementation of algorithm of non-recursive second-order filter, $Z_n = \sum_{i=0}^{2} A_i X_{n-i}$.
Fig. 8.1. Block diagram of 'internal' computation of filter algorithm and the extraction of a typical convolution unit, $Z_{n,\ell',\ell''} = \{\sum_{i=0}^{2} A_{i,\ell''} X_{n-i,\ell'}\}(R')^{\ell'}(R'')^{\ell''}$.
If we now compare the term in the curly brackets in either equation
(8.4) or (8.5), i.e.,
$$\left\{ \sum_{i=0}^{2} (A_{i,\ell''})(X_{n-i,\ell'}) \right\}$$ .... (8.6)
with the expression for the normal convolution as given in equation
(8.3), we see that they are both identical in form and hence in
hardware structure.
The implementation of the term in (8.6), however, is much simpler
in its hardware requirements, especially in terms of register lengths
because
$$A_{i,\ell''} \in Z_{R''} = [0, 1, \ldots, R''-1]$$
and
$$X_{n-i,\ell'} \in Z_{R'} = [0, 1, \ldots, R'-1]$$
while, before partitioning, we have,
$$A_i \in Z_{(R'')^{L''}} = [0, 1, \ldots, (R'')^{L''}-1]$$
and
$$X_{n-i} \in Z_{(R')^{L'}} = [0, 1, \ldots, (R')^{L'}-1]$$
Since in practice L'' and L' are invariably greater than 1, it
is easy to see that $Z_{R''}$ and $Z_{R'}$ are smaller than $Z_{(R'')^{L''}}$ and $Z_{(R')^{L'}}$
respectively.
We feel that the structure shown in (8.6) is a useful and also
practical basic computational unit in digital convolutions, and so have
termed it, not surprisingly, a digit convolution module (D.C.M.).
The process of extracting this module from the second-order section
may be visualised conceptually as shown in Fig. 8.2, and is analogous
to looking at an object through the wrong end of a telescope.
Fig. 8.2. Conceptual projection of second-order filter structure onto digit convolution modules of decreasing complexity.
As will be explained in Section 8.2, the digital designer, by 
the proper choice of R" and R', can have complete control over the 
hardware complexity of his D.C. module, tailoring it according to 
existing technology, component availability, processing speeds, etc. 
From equation (8.5), we see that the original convolution is
now the sum of digit convolutions. Consequently, the basic second-order
section can be realised as a regular interconnection of D.C.
modules, each being identical in architecture. At any sampling
instant, the filter output is obtained by summing weighted outputs
of these D.C.M's. The block diagram of this modular realisation is
shown in Fig. 8.3.
The practical features and applications of our proposed approach
are discussed in detail in Section 8.2, when we apply the partitioning
technique to the case when R is an integer power of 2.
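The decomposition of equation (8.5) can be checked numerically with a short Python sketch. The function names and the sample values below are illustrative assumptions only; the code evaluates the digit convolution of (8.6) for every pair of digit positions and sums the weighted results.

```python
def digits_lsd_first(n, radix, length):
    """Digits of n in the given radix, least significant digit first."""
    out = []
    for _ in range(length):
        n, d = divmod(n, radix)
        out.append(d)
    return out


def digit_convolution(coef_digits, data_digits):
    """Term in curly brackets of (8.6): sum over i of A_{i,l''} * X_{n-i,l'}."""
    return sum(a * x for a, x in zip(coef_digits, data_digits))


def modular_output(A, X, r_coef, len_coef, r_data, len_data):
    """Filter output built from weighted digit convolutions, as in (8.5)."""
    A_d = [digits_lsd_first(a, r_coef, len_coef) for a in A]
    X_d = [digits_lsd_first(x, r_data, len_data) for x in X]
    z = 0
    for lp in range(len_data):          # index l' over the data digits
        for lpp in range(len_coef):     # index l'' over the coefficient digits
            module = digit_convolution([a[lpp] for a in A_d],
                                       [x[lp] for x in X_d])
            z += module * (r_data ** lp) * (r_coef ** lpp)
    return z


A = [16, 23, 5]    # A_0, A_1, A_2 (radix 3, three digits each)
X = [7, 2, 11]     # X_n, X_{n-1}, X_{n-2} (radix 2, four digits each)
direct = sum(a * x for a, x in zip(A, X))   # equation (8.3)
print(direct, modular_output(A, X, 3, 3, 2, 4))
```

Different radices may be used for the data and the coefficients, which is why the two are passed separately.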
8.1.2 The primitive convolution cell. 
By carrying the modular partition technique to its logical 
conclusion, we can derive the most elementary form of the D.C. module. 
The resulting unit may then be regarded as an 'atomic' building block 
of the filter algorithm. 
The simplest form of the D.C. module described in equation (8.6)
is when the fixed bases for the digit vectors of the data and coefficient
words are both chosen to be 2, i.e. R' = R'' = 2.
In such a case, the data and coefficients of a typical D.C. module
are simply two-valued words, i.e.,
$$A_{i,\ell''},\ X_{n-i,\ell'} = 0 \text{ or } 1$$
Fig. 8.3. Modular realisation of second-order digital filter.
The structure of such a module is shown in Fig. 8.4, in which
the data bits $X_{n,\ell'}$, $X_{n-1,\ell'}$, $X_{n-2,\ell'}$ are gated by the
coefficient bits $A_{0,\ell''}$, $A_{1,\ell''}$, $A_{2,\ell''}$, and the resulting
bit products summed by the full-adder.
From this module, it is a short step to arrive at an even simpler
one which operates now in the unary base, i.e. by counting. We have
termed such a unit a primitive convolution cell (P.C.C.), whose circuit
structure we show in Fig. 8.5.
In this primitive cell, the data and coefficient bits are
recirculated internally, and the bit products $(A_{2,\ell''})(X_{n-2,\ell'})$,
$(A_{1,\ell''})(X_{n-1,\ell'})$ and $(A_{0,\ell''})(X_{n,\ell'})$ formed in time succession.
These products enable or inhibit the clock input to the two-bit
counter which simply counts the number of these products that are
at logical '1'.
As a further explanation of the operation of the P.C.C. we have
shown in Fig. 8.6 the contents of the data and coefficient flip-flops
at successive count cycles during the filter sampling interval T.
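The counting operation of the primitive cell may be modelled behaviourally as below. This is a sketch under our own assumptions (the function names are hypothetical and no timing or flip-flop detail is modelled): each bit product either enables or inhibits one count, so the result never exceeds 3 and fits a two-bit counter.

```python
def primitive_cell(coef_bits, data_bits):
    """Count the bit products A_{i,l''} AND X_{n-i,l'} that are at logical 1."""
    count = 0
    for a, x in zip(coef_bits, data_bits):   # products formed one after another
        if a & x:                            # product enables the counter clock
            count += 1
    return count                             # 0..3, i.e. a two-bit count


def bit(word, k):
    """k-th bit of a non-negative integer, bit 0 being least significant."""
    return (word >> k) & 1


def filter_output_from_cells(A, X, coef_len, data_len):
    """Sum the cell counts, each weighted by 2**(l' + l''), as in (8.5) with R' = R'' = 2."""
    z = 0
    for lp in range(data_len):
        for lpp in range(coef_len):
            count = primitive_cell([bit(a, lpp) for a in A],
                                   [bit(x, lp) for x in X])
            z += count << (lp + lpp)
    return z


print(primitive_cell([1, 1, 0], [1, 0, 1]))
print(filter_output_from_cells([6, 13, 9], [12, 5, 7], 4, 4))
```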
8.1.3 Effect of modular partitioning on frequency analysis. 
The modular approach proposed is also useful when digital filters 
are analysed in the frequency domain. 
8.1.3.0 Frequency characteristics. 
The frequency responses (amplitude and phase or real and imaginary)
of digital filters are usually obtained by using, as inputs, sampled
complex exponentials of the form $e^{j\omega nT}$, where T is the sampling period.⁴⁹
Fig. 8.4. Base 2 digit convolution module. (F/F: flip-flop.)
Fig. 8.5. Circuit structure of primitive convolution cell.
Fig. 8.6. Bit patterns of primitive cell during successive count cycles during period T.
For the simple second-order non-recursive filter with impulse
response $\{A_i\}$, i = 0,1,2, we know that the filter output Z(nT) is
the sum of the present and past two inputs appropriately scaled by
the $A_i$'s, i.e.,
$$Z(nT) = \sum_{i=0}^{2} A_i e^{j\omega (n-i)T} = \left( \sum_{i=0}^{2} A_i e^{-j\omega Ti} \right) e^{j\omega nT}$$ .... (8.7)
Thus, the output Z(nT) is the original input $e^{j\omega nT}$ modified
by the complex number $H(j\omega)$, called the frequency response of the
filter, given by
$$H(j\omega) = \sum_{i=0}^{2} A_i e^{-j\omega Ti}$$ .... (8.8)
8.1.3.1 Digit frequency response templates.
If each of the coefficients $A_i$ is an L''-digit number in the
radix R'', then from equation (8.8), we see that there are $((R'')^{L''})^3$
possible combinations of $A_0$, $A_1$, $A_2$. Consequently, during the
analytical design stage, one apparently has to deal with a very large
number of different frequency responses. Thus, in the binary
representation, i.e. R'' = 2, if L'' = 8 bits, then there is, using
the direct method, a total of $(2^8)^3 \approx 16$ million possible frequency
responses.
We recall, however, for radix-R'' decomposition, that our typical
digit convolution module as described in Section 8.1.1 has the impulse
response $\{A_{i,\ell''}\}$, where
$$\{A_{i,\ell''}\} = \{A_{0,\ell''}, A_{1,\ell''}, A_{2,\ell''}\}, \quad 0 \le \ell'' < L'',$$
and
$$A_{i,\ell''} \in Z_{R''} = [0, 1, \ldots, R''-1].$$
Using equation (8.8), the frequency response $H_{\ell''}(j\omega)$ of a typical
D.C. module is given by
$$H_{\ell''}(j\omega) = \sum_{i=0}^{2} (A_{i,\ell''}) e^{-j\omega Ti}$$ .... (8.9)
and there are now only $(R'')^3$ different frequency responses involved,
and the frequency response of any D.C.M. comes from this set.
A particular response from this set we have termed a digit
frequency response template (D.F.R.T).
Analogous to the realisation of the second-order section from
D.C. modules, the general frequency response of the filter can be built
up simply by scaling and summing the appropriate D.F.R. templates, i.e.,
$$H(j\omega) = \sum_{\ell''=0}^{L''-1} (H_{\ell''}(j\omega)) (R'')^{\ell''}$$ .... (8.10)
As an example, let $A_0 = 16$, $A_1 = 23$ and $A_2 = 5$. Let R'' = 3, and
use equation (8.2) to obtain
$$A_0 = (1,2,1)$$
$$A_1 = (2,1,2)$$
and
$$A_2 = (0,1,2).$$
There are thus three forms of D.C.M's having the impulse responses,
$$\{A_{i,2}\} = \{1,2,0\}$$
$$\{A_{i,1}\} = \{2,1,1\}$$
and
$$\{A_{i,0}\} = \{1,2,2\}$$
Hence the frequency response of the filter is obtained by
summing the following weighted D.F.R. templates, viz.,
$$[1 + 2e^{-j\omega T}]\,3^2,$$
$$[2 + e^{-j\omega T} + e^{-j\omega 2T}]\,3^1 \text{ and}$$
$$[1 + 2e^{-j\omega T} + 2e^{-j\omega 2T}]\,3^0$$
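The build-up of the overall response from weighted templates, equation (8.10), can be illustrated for the above example with the following Python sketch. The helper names are our own assumptions, and T is arbitrarily set to 1; the code compares the direct response of (8.8) with the sum of scaled templates of (8.9).

```python
import cmath


def response(taps, w, T=1.0):
    """Frequency response sum_i taps[i] * exp(-j*w*T*i), as in (8.8) and (8.9)."""
    return sum(a * cmath.exp(-1j * w * T * i) for i, a in enumerate(taps))


def digit_taps(A, radix, length):
    """Impulse responses {A_{i,l''}} of the digit modules, for l'' = 0 .. length-1."""
    rows = []
    for a in A:
        digits = []
        for _ in range(length):
            a, d = divmod(a, radix)
            digits.append(d)               # least significant digit first
        rows.append(digits)
    return [[row[l] for row in rows] for l in range(length)]


def response_from_templates(A, radix, length, w, T=1.0):
    """Scale and sum the digit frequency response templates, as in (8.10)."""
    return sum(response(taps, w, T) * radix ** l
               for l, taps in enumerate(digit_taps(A, radix, length)))


A = [16, 23, 5]
print(digit_taps(A, 3, 3))                 # taps for l'' = 0, 1, 2
w = 0.7                                    # an arbitrary test frequency (rad/s)
print(abs(response(A, w) - response_from_templates(A, 3, 3, w)))
```

The printed difference should be zero to within floating-point rounding at any frequency, since (8.10) is a linear identity.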
For the simple case in which the $A_i$'s are represented by B'' bits
and each coefficient is then partitioned into B'' blocks of 1 bit each,
the set of non-trivial D.F.R. templates is small indeed, consisting,
as shown in Fig. 8.7, of only six different frequency responses.
Although at this stage our analysis is only preliminary, there 
is a good indication that the concept of digit frequency response 
templates may prove to be useful in the off-line designs and especially 
in the interactive simulations of digital filters. 
Figs. 8.7 (i)-(vi). Real (a) and imaginary (b) frequency responses of D.F.R. templates for radix R=2 partition, for $\{A_{i,\ell''}\}$ = {0,1,0}, {0,0,1}, {1,1,0}, {0,1,1}, {1,0,1} and {1,1,1}.
8.2 Reprint of the article entitled
"A modular approach to the hardware
implementation of digital filters",
by
M.A. Bin Nun and M.E. Woodward
published in
The Radio and Electronic Engineer,
Vol. 46, No. 8/9, pp. 393-400, Aug./Sept. 1976.
UDC 621.3.049.771.12/14 : 621.372.54 : 621.374
A modular approach
to the hardware
implementation of
digital filters
M. A. BIN NUN, B.Sc., M.Sc.*
and
M. E. WOODWARD, B.Sc., Ph.D.*
SUMMARY 
Recent advances in the technology of medium and
large scale integrated circuits (m.s.i. and l.s.i.)
have made possible economical hardware implementations
for real-time digital filtering. A flexible
design approach for such implementations is
presented. The processing mode can be varied to
give any hybrid structure between the purely serial
and parallel realizations. This leads to a design
approach which can be adjusted to suit hardware
availability. The resulting structures are modular
and are in line with current trends in m.s.i. and l.s.i.
technology in that they lend themselves readily to
implementations using semiconductor read-only
or random access memories.
* Department of Electronic and Electrical Engineering, University
of Technology, Loughborough, Leicestershire LE11 3TU.
1 Introduction 
The theory in the analysis and design of digital filters
is well established, and their advantages over conventional
analogue filters, made up of resistors, capacitors,
inductors and crystals, have been widely discussed.¹,²
Until quite recently, the implementation of digital filtering
has been confined mainly to simulation on general-purpose
computers. The rapid development in the
technology of medium and large-scale integrated
circuits (m.s.i. and l.s.i.), however, is making possible the
construction of special-purpose hardware for real-time
digital filtering. Conventional implementations reported
in the literature invariably compute the filter algorithm
in the familiar binary arithmetic, either in the serial³ or
in the parallel⁴ mode. Furthermore, the actual hardware
synthesis is usually at the discrete gate level, and the
structures proposed are mainly for specific configurations.
In this paper, a modular approach to the hardware
implementation of digital filters is proposed. This
approach is general, flexible and is at the system and sub-system
level, and is thus very suited to m.s.i. and l.s.i.
devices. In this approach, a basic second-order digital
filter section may be constructed as a regular interconnection
of simple identical 'sub-filter modules'. The
structure of a typical module and the processing mode of
the overall section are flexible and may be adjusted to
suit specific requirements. As there is a very wide range
of logic families (t.t.l., e.c.l., m.o.s., etc.) and of m.s.i. and
l.s.i. devices currently on the market, only a general
guide as to the trade-off between circuit complexity and
operating speed will be described.
The hardware implementation of the proposed
approach using semiconductor memories is also discussed.
2 Digital Filtering 
In general, the term 'digital filter' refers to any device
which operates on an input number sequence to produce
a second sequence of numbers by means of a computational
algorithm. If the digital filter is part of a signal
processing system, like that shown in Fig. 1, the input
number sequence is usually the digital version of an
analogue signal. The output sequence may be converted
to the analogue form if required.
Fig. 1. Block representation of a digital signal processing system (analogue input, sampler of period T, 8-bit quantizer and coder, digital input, digital filter, digital output).
High-order digital filters are normally realized as either
a cascade or a parallel network of basic second-order
sections,¹,² which, in the former case, are ordered for
minimum round-off noise and have outputs suitably
scaled.⁵,⁶
The Radio and Electronic Engineer, Vol. 46, No. 8/9, pp. 393-400, August/September 1976
A typical second-order section is shown in Fig. 2. The
input and output sequences, $\{X_n\}$ and $\{Y_n\}$ respectively,
are related by the following difference equation:
$$Y_n = \sum_{i=0}^{2} A_i X_{n-i} - \sum_{i=1}^{2} B_i Y_{n-i}$$ (1)
where $A_i$ and $B_i$ are the filter coefficients obtainable from
its transfer function.
The filter network in Fig. 2 consists of a non-recursive
and a recursive part. Both are essentially the same in both
structure and operation in that each may be represented
by an expression of the form
$$V_n = \sum_{i=0}^{2} C_i U_{n-i}$$ (2)
where, for the recursive part, $C_0 = B_0 = 0$.
In the subsequent discussion of the proposed design
approach, it is therefore only necessary to consider the
more general non-recursive part, which has the input-output
relationship
$$Z_n = \sum_{i=0}^{2} A_i X_{n-i}$$ (3)
3 Design Approach 
The proposed design approach is based on computing
the filtering algorithm given by equation (3), not only in
the conventional binary system, but in the general radix
R arithmetic, where R is an integer power of 2, i.e.
$$R = 2^p, \quad p = 1, 2, 3, \ldots \text{ etc.}$$ (4)
It is assumed that fixed-point arithmetic is used, and
that, in order to process equation (3) to a specified
accuracy, B' and B'' binary digits (bits) are required to
represent each of the data and coefficient words respectively.
Also, to simplify the discussion on the design
approach, the data and coefficient words are assumed to
be non-negative integers, i.e.
$$0 \le X_{n-i} \le 2^{B'} - 1$$
and
$$0 \le A_i \le 2^{B''} - 1$$
In practice, the data and coefficients are represented as 
binary fractions and the two's complement" 6, 8 notation 
is most commonly used to handle negative numbers. 
Since any B-bit binary number M can be represented
in the form
$$M = \sum_{k=0}^{B-1} m_k 2^k, \quad m_k = 0 \text{ or } 1$$ (5)
the binary forms of the data and coefficients will be
$$X_{n-i} = \sum_{k=0}^{B'-1} x_{n-i,k} 2^k$$ (6)
and
$$A_i = \sum_{j=0}^{B''-1} a_{i,j} 2^j$$ (7)
where
$$i = 0, 1, 2, \quad x_{n-i,k},\ a_{i,j} = 0 \text{ or } 1$$
Conventionally, equations (6) and (7) are substituted 
directly into equation (3) for the subsequent computation 
of the filter output $Z_n$. A comprehensive discussion on the
possible hardware organizations and processing modes 
for implementations based on binary arithmetic is given 
by Freeny in his tutorial paper.' 
In the proposed modular approach, a B-bit binary
number M is first partitioned into b blocks, each of p
bits, where
$$B = b \times p,$$ b and p being integers (8)
(p = 3 and p = 4 result in the familiar octal and hexadecimal
systems respectively).
Thus equation (5) may now be represented as
$$M = (m_{B-1}2^{p-1} + \ldots + m_{B-p+1}2^{1} + m_{B-p}2^{0})(2^p)^{b-1} + \ldots + (m_{p(k+1)-1}2^{p-1} + \ldots + m_{pk+1}2^{1} + m_{pk}2^{0})(2^p)^{k} + \ldots + (m_{p-1}2^{p-1} + \ldots + m_{1}2^{1} + m_{0}2^{0})(2^p)^{0}$$
or
$$M = \sum_{k=0}^{b-1} M_k (2^p)^k$$ (9)
where
$$M_k = \sum_{h=0}^{p-1} m_{pk+h} 2^h$$ (10)
and
$$0 \le M_k \le 2^p - 1$$
Equations (9) and (10) simply mean that the B-bit
binary number in equation (5) is now represented as a
b-digit number in the radix $2^p$, where each digit is a p-bit
binary number.
Fig. 2. Second-order digital filter section with sample period T, consisting of a non-recursive part and a recursive part.
3.1 Example 
Let M be the 6-bit (B = 6) binary number, 1 0 1 1 0 1.
Expressing this in terms of equation (5), then,
$$M = 1\times2^5 + 0\times2^4 + 1\times2^3 + 1\times2^2 + 0\times2^1 + 1\times2^0.$$
If M is partitioned into three blocks, each of two bits
(b = 3, p = 2), then M can be expressed as
$$M = (1\times2^5 + 0\times2^4) + (1\times2^3 + 1\times2^2) + (0\times2^1 + 1\times2^0)$$
or, in terms of equation (9)
$$M = (1\times2^1 + 0\times2^0)(2^2)^2 + (1\times2^1 + 1\times2^0)(2^2)^1 + (0\times2^1 + 1\times2^0)(2^2)^0$$
Thus, M is now represented as a 3-digit number in the
radix $2^2$, where the digits, $M_k$ of equation (9), are 2-bit
binary words, and, using equation (10) are given by
$$M_0 = 01, \quad M_1 = 11 \quad \text{and} \quad M_2 = 10$$
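The partitioning of equations (9) and (10) amounts to masking and shifting, as the following Python sketch suggests. The function names are illustrative assumptions; block 0 is taken to be the least significant block, in line with $M_0$ above.

```python
def partition_blocks(m, p, b):
    """Return [M_0, ..., M_{b-1}], the b blocks of p bits of m, as in equation (10)."""
    if m < 0 or m >= 1 << (p * b):
        raise ValueError("m does not fit in b blocks of p bits")
    mask = (1 << p) - 1
    return [(m >> (p * k)) & mask for k in range(b)]


def recombine(blocks, p):
    """Equation (9): sum of M_k * (2**p)**k."""
    return sum(block << (p * k) for k, block in enumerate(blocks))


blocks = partition_blocks(0b101101, p=2, b=3)
print([format(block, "02b") for block in blocks])   # M_0, M_1, M_2 as 2-bit words
print(recombine(blocks, 2) == 0b101101)
```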
3.2 Computing in the Radix $2^p$
In general, each data word may be partitioned into b'
blocks each of p' bits, and each coefficient word into b''
blocks of p'' bits.
Using equation (9), equation (3) can be rewritten, in
which $Z_n$, the output of the non-recursive filter section,
is expressed as a triple sum,
$$Z_n = \sum_{i=0}^{2} \left[ \sum_{k''=0}^{b''-1} A_{i,k''} (2^{p''})^{k''} \right] \left[ \sum_{k'=0}^{b'-1} X_{n-i,k'} (2^{p'})^{k'} \right]$$ (11)
where
$$A_{i,k''} = \sum_{h''=0}^{p''-1} a_{i,p''k''+h''} 2^{h''}$$ (12)
and
$$X_{n-i,k'} = \sum_{h'=0}^{p'-1} x_{n-i,p'k'+h'} 2^{h'}$$ (13)
for i = 0, 1, 2, and
$$(b'')(p'') = B'', \quad (b')(p') = B'$$
The order of summation in equation (11) is then
changed, resulting in
$$Z_n = \sum_{k'=0}^{b'-1} (2^{p'})^{k'} \sum_{k''=0}^{b''-1} (2^{p''})^{k''} \sum_{i=0}^{2} (A_{i,k''})(X_{n-i,k'})$$ (14)
Equation (14) forms the basis of the proposed modular
approach to the hardware implementation of digital
filters.
3.2.1. Example 
Consider a second-order non-recursive filter having the
coefficients
$$A_0 = 6_{10}, \quad A_1 = 13_{10} \quad \text{and} \quad A_2 = 9_{10}$$
Also, suppose that at a particular sampling instant the
data consists of
$$X_n = 12_{10}, \quad X_{n-1} = 5_{10} \quad \text{and} \quad X_{n-2} = 7_{10}$$
If both data and coefficients are represented by 4-bit
binary numbers, (B' = B'' = 4), then
$A_0$ = 0 1 1 0, $A_1$ = 1 1 0 1, $A_2$ = 1 0 0 1
and
$X_n$ = 1 1 0 0, $X_{n-1}$ = 0 1 0 1 and $X_{n-2}$ = 0 1 1 1
Each of these words is now split into two blocks (b' = b''
= 2), each of two bits (p' = p'' = 2), say. The filter
output $Z_n$ at this particular sample instant may then be
computed by the substitution of the actual values of the
data and coefficients, now represented in the radix $2^2$,
into equation (3). This computation is illustrated by
Table 1.
Table 1
Coefficient      A_0 = 0110     A_1 = 1101         A_2 = 1001
Data             X_n = 1100     X_{n-1} = 0101     X_{n-2} = 0111     Sum of partial products in like rows
Weight R^0       0000           0001               0011               0100
Weight R^1       0000           0011               0110               1001
Weight R^1       0110           0001               0001               1000
Weight R^2       0011           0011               0010               1000
Each 4-bit partial product is the result of a 2-bit by
2-bit parallel multiplication, i.e. the data and coefficient
blocks are multiplied in radix R = $2^2$ arithmetic. The
partial products in like rows are now added. This
corresponds to the first summation of equation (14). The
remaining stages of summation, as specified by equation
(14) for the computation of the section output $Z_n$, are
shown in Table 2.
Table 2
Second and final summations according to equation (14)
Data block k' = 0:    01 00 (weight R^0) + 10 01 (weight R^1) = 10 10 00
Data block k' = 1:    10 00 (weight R^0) + 10 00 (weight R^1) = 10 10 00
Filter output Z_n:    10 10 00 (weight R^0) + 10 10 00 (weight R^1) = 11 00 10 00
As a result, the original filter, whose data and coefficients
are represented by 4-bit binary words, is now
regarded as being made up of four simpler units whose
data and coefficients consist of only 2-bit binary words.
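The worked example can be reproduced with a short Python sketch of equation (14). The function names are hypothetical; the code returns the module sums (the rows of Table 1), the group sums and the final output.

```python
def block(word, p, k):
    """k-th p-bit block of a non-negative integer, block 0 being least significant."""
    return (word >> (p * k)) & ((1 << p) - 1)


def module_output(A, X, p_coef, k_coef, p_data, k_data):
    """Innermost sum of equation (14) for one sub-filter module."""
    return sum(block(a, p_coef, k_coef) * block(x, p_data, k_data)
               for a, x in zip(A, X))


def section_output(A, X, p_coef, b_coef, p_data, b_data):
    """Evaluate equation (14); also return the group sums and module sums."""
    modules, groups, z = {}, [], 0
    for kd in range(b_data):                    # index k' over data blocks
        group = 0
        for kc in range(b_coef):                # index k'' over coefficient blocks
            m = module_output(A, X, p_coef, kc, p_data, kd)
            modules[(kd, kc)] = m
            group += m << (p_coef * kc)         # weight (2**p'')**k''
        groups.append(group)
        z += group << (p_data * kd)             # weight (2**p')**k'
    return z, groups, modules


A = [6, 13, 9]     # A_0, A_1, A_2
X = [12, 5, 7]     # X_n, X_{n-1}, X_{n-2}
z, groups, modules = section_output(A, X, 2, 2, 2, 2)
print(z, groups, modules)
```

With these values the result agrees with the direct evaluation of equation (3), 6 × 12 + 13 × 5 + 9 × 7 = 200.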
4 Possible Realizations 
Two possible realizations for the computation of
equation (14) are shown in Figs. 3 and 4. They differ both
in hardware complexity and operating speed.
4.1 Parallel Processing
In the direct realization illustrated in Fig. 3, the second-order
non-recursive section consists of a parallel interconnection
of, what will be termed, sub-filter modules.
These modules, enclosed by the broken lines in Fig. 3,
are organized into b' groups, each group containing b''
modules, where b' and b'' are the number of partition
blocks as described by equation (11). For the overall
section, b' × b'' modules would be required in all.
A typical module has the same general structure and
computing algorithm as that of the overall section. Each
of the data and coefficients of a module, however, are
now only p' bit and p'' bit words respectively.
In operation, these sub-filter modules implement the
first summation in equation (14). The output of each
group is obtained by adding the weighted outputs of
all the modules in that particular group. Similarly, the
section output $Z_n$ is obtained by summing the weighted
outputs of all the groups, as specified by the outer
summation of equation (14).
In this direct realization, the output weightings are
done by hard-wired shifts.
4.2 Sequential Processing 
In contrast to the realization shown in Fig. 3, where
b' × b'' modules operate concurrently, a single module,
performing b' × b'' module computations in time succession,
may be used.
This sequential mode of processing is illustrated in
Fig. 4, in which a basic sub-filter module is time-shared
among the data and coefficient blocks. The accumulator
keeps a running sum of successive module outputs and
also incorporates the required weightings to them.
The blocks of each of the data words are accommodated
in a (b', p') register store while those of each of the coefficients
are stored in a (b'', p'') circulating register store,
where a typical (b, p) register is one having b stages, each
stage accommodating a p-bit word, as shown in Fig. 5(a).
For every clock shift of the data registers these circulating
coefficient stores go through a complete cycle of b''
shifts. Since the data registers have to be clocked b'
times, the required section output, $Z_n$, will be obtained in
b' × b'' register clock periods after the arrival of the
section input, $X_n$, at a particular sampling instant.
The data and coefficient blocks are so arranged as to
be in increasing order of significance at the start of every
sampling instant.
Fig. 4. Time-sharing of a single sub-filter module. (Timing circuit not shown.)
Fig. 3. Modular circuit configuration of a non-recursive digital filter section.
The B'-bit input, $X_n$, is loaded in parallel into an input
register of the form shown in Fig. 5(b). In the subsequent
processing, the blocks of $X_n$ are accessed sequentially, the
accumulator being reset to zero prior to every sampling
instant nT.
The control of the overall section can consist of a
counter and simple logic circuitry to account for the
different clock rates of the data and coefficient registers.
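The sequential mode may be modelled behaviourally as in the Python sketch below. This is an illustration under our own assumptions (hypothetical function name, no register-level timing): a single module is reused for every pair of blocks, the accumulator applies the weightings, and the count of register clock periods is returned with the output.

```python
def sequential_section(A, X, p_coef, b_coef, p_data, b_data):
    """Time-share one sub-filter module; return (output, register clock periods)."""
    mask_c = (1 << p_coef) - 1
    mask_d = (1 << p_data) - 1
    # Block stores, least significant block first
    coef_store = [[(a >> (p_coef * k)) & mask_c for k in range(b_coef)] for a in A]
    data_store = [[(x >> (p_data * k)) & mask_d for k in range(b_data)] for x in X]
    accumulator = 0                    # reset before every sampling instant
    clock_periods = 0
    for kd in range(b_data):           # one shift of the data registers
        for kc in range(b_coef):       # full cycle of the circulating coefficient stores
            module = sum(c[kc] * d[kd] for c, d in zip(coef_store, data_store))
            accumulator += module << (p_coef * kc + p_data * kd)   # weighting by shifts
            clock_periods += 1
    return accumulator, clock_periods


print(sequential_section([6, 13, 9], [12, 5, 7], 2, 2, 2, 2))
```

The number of clock periods equals b' × b'', in agreement with the text above.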
Fig. 5. Store and input registers. (a) A (b, p) register store. (b) Input register.
4.3 Features 
In the direct realization, as shown in Fig. 3, the circuit
configuration of the overall filter section is highly modular.
All the component units have an identical structure, and
the interconnection between them is very regular. In
consequence, the hardware implementation of the section
is systematic and straightforward. Furthermore, testing
and fault diagnosis are greatly simplified.
Since a typical module has the same computing
algorithm as that of the original section, the 'feel' for the
overall filtering operation is retained when interconnecting
modules. Also, the hardware requirement of a
module is determined only by the manner in which the
original data and coefficient words have been partitioned.
The structure is therefore easily adjusted to suit particular
requirements and available hardware components. To
illustrate this, consider a non-recursive section, whose
data and coefficients are represented by 6-bit and 4-bit
binary words respectively. Then Table 3 shows the
possible ways in which these words may be partitioned
into blocks, according to equation (8).
Table 3
                          Data           Coefficient
Number of blocks          6  3  2  1     4  2  1
Number of bits/block      1  2  3  6     1  2  4
The structure of the basic module depends very much
on the size of its component multipliers. For this particular
filter section there are, altogether, 4 × 3 = 12
different multiplier sizes, which range from a 1-bit × 1-bit
to a 6-bit × 4-bit configuration, with one convenient size
being the 2-bit × 2-bit one. An interesting size is the 1-bit
(data) × 4-bit, as it is of the type used in the familiar
shift-and-add technique for multiplication." 7. 8. 9 
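The entries of Table 3 follow directly from equation (8), and can be enumerated with the short Python sketch below (the function name is an illustrative assumption).

```python
def block_partitions(B):
    """All (b, p) pairs with b * p == B, ordered by decreasing number of blocks."""
    return [(b, B // b) for b in range(B, 0, -1) if B % b == 0]


data_parts = block_partitions(6)     # 6-bit data words
coef_parts = block_partitions(4)     # 4-bit coefficient words

# Each choice of block lengths (p', p'') gives one multiplier size.
multiplier_sizes = [(p_data, p_coef)
                    for _, p_data in data_parts
                    for _, p_coef in coef_parts]

print(data_parts)
print(coef_parts)
print(len(multiplier_sizes))
```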
A final feature of the proposed approach is that, after
the structure of the basic module has been decided upon,
the actual mode of processing the filter algorithm is
flexible. The parallel and sequential realizations, discussed
previously and shown in Figs. 3 and 4, are just
two extremes, hybrid forms being possible. For example,
one hybrid realization might consist of a set of basic
modules, operating concurrently, this being regarded as a
basic time-shared unit for subsequent sequential processing.
Another hybrid form might be one in which sets of
data blocks are processed in parallel by a number of time-shared
basic modules each operating sequentially.
In general, in between the parallel and the completely
sequential realizations there is a spectrum of hardware
structures and processing modes, the final choice being
left to the system designer.
4.3.1. Example 
Consider a non-recursive section having 8-bit data and
coefficient words, (i.e. B' = B'' = 8). If each of these
words is partitioned into four blocks, each of two bits
(b' = b'' = 4, p' = p'' = 2), the resulting basic module
has a word length of 2 bits. The direct realization of this
section, as in Fig. 3, would require b' × b'' = 16 of these
basic modules. The completely sequential mode is shown
in Fig. 6(a), while Figs. 6(b) and (c) illustrate two possible
hybrid realizations. In the former, two basic modules
make up the time-shared unit, while in the latter the
input $X_n$ is split into two parallel halves, each of which
is then processed sequentially. It is seen that when both
examples of hybrid processing are compared with the
completely sequential one, two basic modules are required.
Their computing time, however, is reduced by
half. The parallel mode, of course, has an even shorter
computing time which, in this example, is one-sixteenth
of that of the completely sequential mode.
5 Practical Considerations 
The performance of the overall filter section depends
primarily on the structure of the basic module and the
manner in which the computing algorithm is processed.
The hardware requirement and implementation of a
typical sub-filter module are described below, and the
computation time for the section output is derived for the
two extreme modes of processing. The trade-off between
circuit complexity and operating speed is also discussed.
5.1 Hardware Implementation of Sub-Filter Module
The hardware organization of a typical module is
shown in Fig. 7. The required arithmetic operations are
three p' bit × p'' bit multiplications and two (p' + p'') bit
additions. These operations may be implemented by any
suitable m.s.i. multiplier and adder chips currently on the
market. An attractive alternative, however, is to implement
the module using semiconductor memories, (either
read-only (r.o.m.) or random access (r.a.m.)), acting as 
stored look-up arithmetic tables." . 
One way of using these memory chips is to replace
each p' bit × p'' bit multiplier, shown in Fig. 7, by a r.o.m.
or r.a.m. of suitable storage. Variable and fixed coefficient
multiplications using r.o.m.s are illustrated in
Fig. 8(a) and (b). The former offers versatile operation
at the expense of large memory storage when the word
lengths of the data and coefficient blocks are large. The
fixed coefficient multiplication requires less memory
storage but is less versatile.
Fig. 6. Processing modes using 2-bit basic modules. (a) Completely sequential. (b), (c) Two possible hybrid forms.
Fig. 7. Hardware configuration of a sub-filter module.
The configuration in Fig. 8(c), however, combines
partially-variable coefficient capability with reasonable
memory storage requirements. A total of 2^q different
coefficients can be stored in the r.o.m.
For data and coefficient blocks of short lengths, i.e.
p' and p'' small, even the complete sub-filter modules
may be implemented as a look-up store using a r.o.m. of
sufficiently large memory storage, as shown in Fig. 9.
There is thus no necessity for the two P-bit (P = p' + p'')
adders previously required.
In general, the implementation of digital filters using
l.s.i. semiconductor memories is simple, straightforward
and incorporates programmability. It also offers the
possibility of volume production of digital filter i.c. chips
using existing manufacturing facilities. As digital filters
are still not being used extensively enough, there is
obviously a reluctance to custom-design and manufacture
special i.c.s apart from very simple filter configurations (Ref. 10).
The market demand for semiconductor memories, however,
is great enough to support its own technology.
5.2 Operating Speed of Filter Section 
The minimum value of the sampling period T for the
basic nonrecursive section depends on the time it takes
to compute the output $Z_n$ after the arrival of a particular
input $X_n$.
If $t_M$ is the time to compute the output of a typical
sub-filter module, then
$$t_M = t_m + t_a \tag{15}$$
where $t_m$ = time to perform a p' bit × p'' bit multiplication, and
$t_a$ = time to sum three (p' + p'') bit words.
For the realizations shown in Figs. 8(a) to (c), $t_m$ will
be the access time of any particular r.o.m. used. Similarly,
for the realization shown in Fig. 9, $t_M$ corresponds to the
access time of the r.o.m. implementing the complete
sub-filter module.
Fig. 8. R.o.m. realizations of p' bit × p'' bit multipliers.
(a) P × 2^P bit r.o.m. (b) P × 2^p' bit r.o.m. (c) P × 2^(p'+q) bit r.o.m. with q bit coefficient program input.
For the direct realization shown in Fig. 3, the total
time, $T_p$, required to compute $Z_n$ is given by
$$T_p = t_M + t_g + t_s \tag{16}$$
where $t_g$ = time to sum the outputs of all the modules in
any particular group
and $t_s$ = time to sum the outputs of all the groups.
Details on the propagation delay during the process of
addition can be found in any standard text on digital
arithmetic (e.g. Ref. 8).
Fig. 9. R.o.m. realization of a sub-filter module.
If equation (14) is processed sequentially (see Section
4.2, Fig. 4), the computing time, $T_q$, is given by
$$T_q = (t_M + t_c) \times b' \times b'' \tag{17}$$
where $t_c$ = time to add the current module output to
the accumulator output of one register clock period, $\Delta$,
previously, $\Delta$ being the period of the register clock (see
Fig. 4).
In equation (17), it is assumed that the time taken to
clock the accumulator output is much less than the
computation time for the module output.
If $f_p$, $f_q$ are the maximum possible sampling frequencies
for the section in the parallel and sequential
realizations respectively, then
$$f_p \le \frac{1}{T_p} \quad \text{and} \quad f_q \le \frac{1}{T_q}$$
The computation time for hybrid realizations may be
determined using the general principles discussed.
5.3 Trade-off Between Circuit Complexity and 
Operating Speed 
The relative advantages of the various processing
modes depend on their respective circuit complexity,
module count and operating speeds. The parallel mode
has the fastest processing speed and requires virtually no
control circuitry. The number of sub-filter modules
needed, however, is a maximum (being b' × b'' modules
in total). At the other extreme, the sequential mode
requires only one module and an accumulator, but
operates about b' × b'' times slower than the parallel realization.
Also, some control logic is necessary for the proper
accumulation and weighting of the module output. The
hybrid mode offers a compromise by enabling the designer
to select the most suitable combination of module count
and processing speed to match his specific requirement.
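The timing relations of Section 5.2 lend themselves to a quick numerical comparison of the two extreme modes. The sketch below is illustrative only: the function and parameter names are hypothetical, and the delay figures in the usage lines are assumed values, not measurements from the paper.

```python
def parallel_time(t_module, t_group_sum, t_groups_sum):
    """Equation (16): module delay plus the two summation stages."""
    return t_module + t_group_sum + t_groups_sum


def sequential_time(t_module, t_accumulate, b_data, b_coeff):
    """Equation (17): one module reused b' x b'' times per sample."""
    return (t_module + t_accumulate) * b_data * b_coeff


def max_sampling_rate(compute_time):
    """Upper bound on the sampling frequency for a given computing time."""
    return 1.0 / compute_time


# Assumed figures: 1 us module look-up, 0.1 us per addition stage,
# four data blocks and two coefficient blocks per word.
t_p = parallel_time(1e-6, 0.1e-6, 0.1e-6)
t_q = sequential_time(1e-6, 0.1e-6, 4, 2)
print(max_sampling_rate(t_p), max_sampling_rate(t_q))
```

With these assumed figures the parallel form uses eight modules and the sequential form one, while the ratio of computing times is somewhat below eight because the parallel form carries two summation stages.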
6 General Second-order Section 
As the recursive and non-recursive parts of the general
second-order digital filter section (Fig. 2) have basically
the same structure, the modular approach already discussed
can be directly applied to realize this general
section.
The resulting basic module then consists of two
modules, each similar to that shown in Fig. 7. The block
diagram of the direct modular realization of the general
second-order section is shown in Fig. 10.
Since $Y_n$, the section output, is now in a feedback loop,
it has to be truncated or rounded off to prevent the
number of bits required for its representation from
increasing indefinitely. Also $Y_n$ has to be scaled, usually
by simple powers of two (Refs. 4, 11). Other general practical
considerations such as overflow detection, limit cycle
oscillations, and manipulation of negative numbers
using the two's complement code, have been adequately
discussed by previous authors (Refs. 5, 6, 7).
Fig. 10. Modular organization of a general second-order filter section. (Delay units not shown.)
7 Conclusions 
A method has been presented for the hardware design 
of general second-order digital filter sections. The 
procedure is systematic, flexible, and is in accordance 
with current hardware trends in that it makes use of
m.s.i. or l.s.i. technology. The resulting hardware
structures are modular, have uniform interconnection 
patterns, and variable processing modes. 
The versatility and flexibility of the proposed technique
should make possible the economical design of special-purpose
digital filter hardware for any applications
requiring real-time processing.
8 References 
1. Gold, B. and Rader, C. M., 'Digital Processing of Signals' (McGraw-Hill, New York, 1969).
2. Rabiner, L. R. and Rader, C. M. (eds.), 'Digital Signal Processing' (IEEE Press, New York, 1972).
3. Jackson, L. B., Kaiser, J. F. and McDonald, H. S., 'An approach to the implementation of digital filters', IEEE Trans. on Audio and Electroacoustics, AU-16, No. 3, pp. 413-21, September 1968.
4. Gabel, R. A., 'A parallel arithmetic hardware structure for recursive digital filtering', IEEE Trans. on Acoustics, Speech and Signal Processing, ASSP-21, No. 4, pp. 255-8, August 1974.
5. Liu, B., 'Effect of finite word length on the accuracy of digital filters-a review', IEEE Trans. on Circuit Theory, CT-18, No. 6, pp. 670-7, November 1971.
6. Oppenheim, A. V. and Weinstein, C. J., 'Effects of finite register length in digital filtering and the fast Fourier transform', Proc. IEEE, 60, No. 8, pp. 957-76, August 1972.
7. Freeny, S. L., 'Special-purpose hardware for digital filtering', Proc. IEEE, 63, No. 4, pp. 633-48, April 1975.
8. Lewin, D., 'Theory and Design of Digital Computers' (Wiley, New York, 1972).
9. Peled, A. and Liu, B., 'A new hardware realization of digital filters', IEEE Trans. on Acoustics, Speech and Signal Processing, ASSP-22, No. 6, pp. 456-62, December 1974.
10. Pye TMC Ltd., London, 'Monolithic Modular Digital Filters', IEEE International Solid-State Circuits Conference, February 1973.
11. Croisier, A., Esteban, D. J., Levilion, M. E. and Riso, V., U.S. Patent 3,771,130, December 1973.
12. McDowell, l., 'Large Bipolar ROMS and PROMS Revolutionize Conventional Logic and System Design', Monolithic Memories Inc., Applications Seminar, April 19th, 1973.
Manuscript first received by the Institution on 9th June 1975 and in final form on 4th December 1975. (Paper No. 1730/CC 261.)
© The Institution of Electronic and Radio Engineers, 1976
The Radio and Electronic Engineer, Vol. 46, No. 8/9
8.3 Negative values of filter data. 
In our previous discussions, we have assumed, for simplicity, 
that the data words of the second-order section are positive. In 
this section we outline a simple method to account for the negative 
values as well. This method is very convenient to use when the 
complete digit convolution module is implemented as a stored-logic 
unit using a read-only memory (R.O.M.) as shown in Fig. 9 on page 201. 
8.3.0 Constant bias of filter input. 
A particular structure of a D.C.M. is one in which the coefficients
are not partitioned at all, and each of the B'-bit data words is
partitioned into B' blocks of 1 bit each. This corresponds to the
R.O.M. digital filter proposed by Croisier et al. With such a structure,
the filter input is in two's-complement representation. (The
mechanisation of this filter is fully discussed in References 6 and 50.)
For our modular realisation using D.C. modules, we propose a
simple and interesting alternative in which the filter input is given a
constant bias, with a constant correction (for a particular impulse
response) at the output.
Let $X^*_{n-i}$ be the actual signal samples, and $X_{n-i}$ be the input
samples of the second-order section. Also, assume that B'-bit registers
are available to hold the data samples.
Before going into the filter, the signal is given a positive bias†
of $2^{B'-1}$, thus resulting in the filter input given by
$$X_{n-i} = X^*_{n-i} + 2^{B'-1}.$$
† Most analogue to digital convertors have this bias already built in,
giving their digital outputs in the so-called offset binary.
The resulting modified data are now processed according to
equation (14) on page 197.
Consequently, if the signal has the range given by
$$-(2^{B'-1}-1) \le X^*_{n-i} \le +(2^{B'-1}-1) \tag{8.11}$$
then the filter data is bounded by
$$1 \le X_{n-i} \le 2^{B'-1} + (2^{B'-1}-1) = 2^{B'}-1.$$
The filter output $Z_n$ is thus given by
$$Z_n = \sum_{i=0}^{2} A_i X_{n-i} = \sum_{i=0}^{2} A_i X^*_{n-i} + 2^{B'-1}\sum_{i=0}^{2} A_i.$$
Since $\sum_{i=0}^{2} A_i X^*_{n-i}$ is the true output, $Z^*_n$ say, then we have
$$Z^*_n = Z_n - 2^{B'-1}\sum_{i=0}^{2} A_i \tag{8.12}$$
where $2^{B'-1}\sum_{i=0}^{2} A_i$ is the constant correction term.
Expressing $\sum_{i=0}^{2} A_i$ as a G-bit binary number, we have
$$\sum_{i=0}^{2} A_i = C = \sum_{g=0}^{G-1} c_g 2^g.$$
The output correction term is given by
$$2^{B'-1} C = c_{G-1}2^{G+B'-2} + \dots + c_0 2^{B'-1} + 0\times 2^{B'-2} + \dots + 0\times 2^0.$$
Thus, the first B'-1 least significant bits of the filter output
$Z_n$ need not be corrected. The overall scheme is shown in Fig. 8.8.
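The direct correction of equation (8.12) can be checked numerically. The following is a minimal sketch rather than a description of the hardware: the function name `biased_filter_output` is hypothetical, and the values used are those of the example in Section 8.3.2.

```python
def biased_filter_output(coeffs, signal, word_bits):
    """Return (Z_n, correction term, Z*_n) for the constant-bias scheme.

    coeffs    : filter coefficients A_0, A_1, A_2
    signal    : signed samples X*_n, X*_{n-1}, X*_{n-2}
    word_bits : data word length B'
    """
    bias = 1 << (word_bits - 1)                      # 2**(B'-1)
    offset_data = [x + bias for x in signal]         # non-negative filter input
    z = sum(a * x for a, x in zip(coeffs, offset_data))
    correction = bias * sum(coeffs)                  # constant for a given impulse response
    return z, correction, z - correction


z, correction, z_true = biased_filter_output((5, 3, 7), (-6, 2, -5), 4)
print(z, correction, z_true)

# The correction is a multiple of 2**(B'-1), so the B'-1 least
# significant bits of Z_n and Z*_n coincide.
print(z % 8 == z_true % 8)
```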
8.3.1 Distributed correction. 
The direct correction method has the disadvantage that the sum
$\sum_{i=0}^{2} A_i$ has to be computed separately and held in an extra G-bit register.
Also, one would like to make the correction scheme compatible and
consistent with the philosophy of modularity and the concept of digit
convolutions.
We will now show how the relevant segments of the correction
term may be distributed and absorbed into the appropriate digit modules.
Firstly, we observe that the constant positive bias $2^{B'-1}$ is a
B'-bit number whose first (B'-1) digits are all zeros, i.e.,
$$2^{B'-1} = 1\times 2^{B'-1} + 0\times 2^{B'-2} + \dots + 0\times 2^{1} + 0\times 2^{0}. \tag{8.13}$$
This bias may be partitioned as shown in equations (9) and (10) on
page 196, thus resulting in
$$2^{B'-1} = \left[1\times 2^{p'-1} + 0\times 2^{p'-2} + \dots + 0\times 2^{0}\right](2^{p'})^{b'-1}. \tag{8.14}$$
Also, using equation (9) each coefficient may be written as
$$A_i = \sum_{k''=0}^{b''-1} (A_{i,k''})(2^{p''})^{k''}.$$
Fig. 8.8. Direct correction scheme.
Fig. 8.9. One mechanisation of distributed correction for sequential mode.
Consequently, the correction term can be expressed as
$$2^{B'-1}\sum_{i=0}^{2} A_i = \left\{\sum_{k''=0}^{b''-1}(2^{p''})^{k''}\sum_{i=0}^{2}(A_{i,k''})(2^{p'-1})\right\}(2^{p'})^{b'-1} \tag{8.15}$$
where the order of the double summation has been interchanged.
The filter output $Z_n$, as expressed by equation (14) on page 197,
is now written in a slightly different form as shown below, i.e.,
$$Z_n = \left\{\sum_{k''=0}^{b''-1}(2^{p''})^{k''}\sum_{i=0}^{2}(A_{i,k''})(X_{n-i,b'-1})\right\}(2^{p'})^{b'-1} + \sum_{k'=0}^{b'-2}(2^{p'})^{k'}\left\{\sum_{k''=0}^{b''-1}(2^{p''})^{k''}\sum_{i=0}^{2}(A_{i,k''})(X_{n-i,k'})\right\} \tag{8.16}$$
Equation (8.12) can now be written in terms of equations (8.15)
and (8.16). Thus the real filter output $Z^*_n$ is given by
$$Z^*_n = \left\{\sum_{k''=0}^{b''-1}(2^{p''})^{k''}\sum_{i=0}^{2}\left[(A_{i,k''})(X_{n-i,b'-1}) - (A_{i,k''})(2^{p'-1})\right]\right\}(2^{p'})^{b'-1} + \sum_{k'=0}^{b'-2}(2^{p'})^{k'}\left\{\sum_{k''=0}^{b''-1}(2^{p''})^{k''}\sum_{i=0}^{2}(A_{i,k''})(X_{n-i,k'})\right\} \tag{8.18}$$
Equation (8.18) above indicates that, in the modular circuit
configuration shown in Fig. 3 on page 198, only the last group of b''
D.C. modules needs to have the correction incorporated.
When the D.C. modules are implemented as look-up tables, their 
contents are stored in the two's complement form since it will then 
be easy to perform the correction subtraction. 
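As a hedged numerical check of the distributed form, the sketch below subtracts $2^{p'-1}$ from the most significant data block only, inside the corresponding modules, and then weights and sums the module outputs. The helper names (`split_blocks`, `distributed_output`) are hypothetical; the arithmetic uses Python signed integers in place of two's-complement table contents.

```python
def split_blocks(value, block_bits, n_blocks):
    """Partition a non-negative word into n_blocks blocks, least significant first."""
    mask = (1 << block_bits) - 1
    return [(value >> (block_bits * k)) & mask for k in range(n_blocks)]


def distributed_output(coeffs, offset_data, p_data, b_data, p_coeff, b_coeff):
    """True output Z*_n computed from offset-binary data, block by block."""
    a_blocks = [split_blocks(a, p_coeff, b_coeff) for a in coeffs]
    x_blocks = [split_blocks(x, p_data, b_data) for x in offset_data]
    half = 1 << (p_data - 1)                       # 2**(p'-1), the bias segment
    total = 0
    for k1 in range(b_data):                       # data block index k'
        for k2 in range(b_coeff):                  # coefficient block index k''
            module = 0
            for a, x in zip(a_blocks, x_blocks):
                # Only the last group (k' = b'-1) absorbs the correction.
                block = x[k1] - half if k1 == b_data - 1 else x[k1]
                module += a[k2] * block
            total += module << (p_data * k1 + p_coeff * k2)
    return total


# Offset data 2, 10, 3 correspond to signal values -6, 2, -5 with a bias of 8.
print(distributed_output((5, 3, 7), (2, 10, 3), 2, 2, 2, 2))
```

Every module with k' below b'-1 is left untouched, which is the property the text relies on when it confines the correction to the last group of modules.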
8.3.2 Example. 
Let the filter coefficients have the values
$A_0 = 5$, $A_1 = 3$ and $A_2 = 7$.
Also, at a particular sampling instant let the signal values be
$X^*_n = -6$, $X^*_{n-1} = 2$ and $X^*_{n-2} = -5$.
If 4 bits are used to represent both data and coefficient words, then
the constant positive bias will be $2^{4-1} = 8$.
Consequently, the offset data values are
$X_n = -6 + 8 = 2$,
$X_{n-1} = 2 + 8 = 10$
and $X_{n-2} = -5 + 8 = 3$.
Using equation (8.12), the actual filter output is given by
$$Z^*_n = (5)(2) + (3)(10) + (7)(3) - 8(5+3+7) = 61 - 120 = -59.$$
If we apply the method of distributed correction as discussed
in Section 8.3.1, then the stages in the computation of $Z^*_n$ will be as
shown in Tables 8.0 and 8.1.
Here we have selected $R'' = R' = R = 2^2$.
                              i = 0       i = 1       i = 2
Coefficient A_i               01 01       00 11       01 11
Data X_{n-i}                  00 10       10 10       00 11
(A_{i,0})(X_{n-i,0})          0010        0110        1001
(A_{i,1})(X_{n-i,0})          0010        0000        0011
(A_{i,0})(X_{n-i,1})          0000        0110        0000
[(A_{i,0})(2^{p'-1})]         [0010]      [0110]      [0110]
(A_{i,1})(X_{n-i,1})          0000        0000        0000
[(A_{i,1})(2^{p'-1})]         [0010]      [0000]      [0010]
Table 8.0. Internal computations in filter algorithms. (Distributed correction
segments are enclosed in square brackets.)

Module (k', k'')              Output (2's-complement)     Value
(0, 0)                        010001                      17
(0, 1)                        000101                       5
Sum for k' = 0                00100101                    37
(1, 0), corrected             1111000                     -8
(1, 1), corrected             111100                      -4
Sum for k' = 1                101000                     -24
Filter output Z*_n            1000101                    -59
Table 8.1. Summation of outputs of D.C. modules according
to equation (8.18). (Z*_n is in 2's-complement.)
8.3.3 Circuit implementation of correction scheme. 
As described in Section 4.3 on page 199, there are many possible
hardware structures and processing modes in the modular implementation
of the second-order filter. In the light of this, only a general
comment will be given on the incorporation of the distributed correction
scheme into the final filter structure.
We recall that in Section 8.3.1 it was mentioned that for the
completely parallel realisation, only the b'' modules belonging to
the b'th input partition block need to be corrected. With the
completely sequential mode (Section 4.2, page 198), this corresponds
to the correction being applied only during the last period of the
data register clock cycle.
If the time-shared convolution module is implemented using a
R.O.M. as in Fig. 9 on page 201, an extra control bit will be necessary
which effectively doubles the original memory size.
One possible alternative is to retain the same memory capacity
at the expense of some additional simple circuitry as shown in Fig. 8.9.
Also, during the last or b'-th data register period an additional b''
coefficient register cycles will be required. At the appropriate
instants, the leading bits of the data blocks are 'forced' to logical
1's and the R.O.M. output two's-complemented.
8.4 Conclusions. 
A comprehensive and systematic theory, based on the novel concept
of a digit convolution module, has been proposed as an attempt to
bridge the gap between the formal analytical design of digital filters
and the implementation of their hardware structures. The theory,
which has been developed in some detail in this chapter, enables a
general second-order digital filter to be realised in a modular form
and in a variety of processing modes. The proposed modular approach
is also well suited to the technology of large-scale integrated
(L.S.I.) circuits.
CHAPTER 9 
PRACTICAL HARDWARE IMPLEMENTATION 
USING MODULAR APPROACH
9.0 Introduction. 
In this chapter, the essential ideas of the modular approach 
are consolidated, and the practical implications of the desirable 
features of the proposed technique brought out by a practical example. 
A detailed description is presented of the design of a non-recursive 
second-order digital filter and its practical hardware realisation 
for real-time operations. 
After describing the processing system in general, we go on to 
the functional and circuit details of the main sub-system units. 
Results on simple input-output tests on the filter system are also 
given. 
9.1 General filter system.
As the hardware implementation was restricted by a modest budget,
the architecture that was adopted was mainly the result of a compromise
between the need to reduce component count and the desire to keep a
filter processing rate that is realistic for practical real-time
signals.
Consequently, the filter consists of only a single digit convolution
module operating in a sequential mode (see Section 4.2 on page 198).
Also the filter data and coefficients are represented by 8-bit words.
Furthermore, the complete D.C. module is implemented as a look-up
table using the Intel 1702* 256 × 8-bit programmable read-only
memory (p.R.O.M.), it being the only large scale integrated (L.S.I.)
chip that was readily available to the author at the time of design.
The complete filter system is shown in Fig. 9.0 with its
functional sub-systems shown in Fig. 9.1.
The filter proper consists of the data registers, the p.R.O.M.
module, and the accumulator.
The data registers enable each of the data $X_n$, $X_{n-1}$ and $X_{n-2}$
to be processed two bits at a time. These bit-pairs form the first
six bits of the p.R.O.M.'s address lines. The remaining two are used
to select the different functions of the convolution module.
During every sampling period T secs., the filter system goes
through eight cycles of internal computation. At each cycle, the
bit-pairs access the relevant stored function of the convolution
module. The time successive outputs from the p.R.O.M. are added by
the accumulator, with each partial result being appropriately shifted
to ensure the correct relative weightings between the module outputs.
(This accumulator is set to zero at every sampling instant nT.)
After the actual filter output $Z_n$ has been computed, the buffer
logic is enabled and $Z_n$ is converted to the analogue form by a 12-bit
digital to analogue converter† (D.A.C.).
* See Appendix 9.0.
† See Appendix 9.0.
Finally, to prevent the aliasing (Reference 53) of the frequency spectrum of
the filter transfer function, the output of the D.A.C. is band-limited
by an analogue low-pass reconstruction filter* which has its -3 dB
point at 3.4 kHz.
Fig. 9.0. Second-order digital filter system.
Fig. 9.1. Functional sub-systems of second-order digital filter.
* See Appendix 9.1.
9.2 Functional and circuit description of filter sub-systems.
In the following, the sub-systems that are described in some
detail are the data registers and p.R.O.M. module, the accumulator,
and the filter system control unit. In their corresponding circuit
diagrams, only the essential wiring and pin connections are shown
and labelled. Further details on the I.C. packages used may be found
in the relevant manufacturers' manuals (References 51 and 52).
9.2.0 p.R.O.M. module and data registers.
In terms of processing speed, hardware count and some flexibility
of operation, it was considered reasonable to partition each 8-bit
data word into four blocks of 2 bits, and each coefficient word into
two blocks of 4 bits.
Thus we may write $X_{n-i}$ and $A_i$ in the form
$$X_{n-i} = X_{n-i,3}(2^2)^3 + X_{n-i,2}(2^2)^2 + X_{n-i,1}(2^2)^1 + X_{n-i,0}(2^2)^0 \tag{9.0}$$
and
$$A_i = A_{i,1}(2^4)^1 + A_{i,0}(2^4)^0. \tag{9.1}$$
Using equation (14) on page 197, we can express the output
$Z_n$ of our second-order filter as
$$Z_n = \sum_{k'=0}^{3}\sum_{k''=0}^{1}\left\{\sum_{i=0}^{2}(A_{i,k''})(X_{n-i,k'})\right\}(2^4)^{k''}(2^2)^{k'} \tag{9.2}$$
where the term in the curly brackets above defines a digit
convolution module ($R' = 2^2$, $R'' = 2^4$) having data and coefficient
word-lengths of 2 bits and 4 bits respectively. Furthermore it may
be easily shown that the maximum value of this convolution module can
be represented completely by 8 bits: each of the three products is at
most 15 × 3 = 45, so the module output never exceeds 135.
To see how this module may be implemented as a stored-logic unit,
we first expand the corresponding digit algorithm, thus obtaining,
$$Z_{n,k',k''} = \left\{\sum_{i=0}^{2}(A_{i,k''})(X_{n-i,k'})\right\} = (A_{0,k''})(X_{n,k'}) + (A_{1,k''})(X_{n-1,k'}) + (A_{2,k''})(X_{n-2,k'}) \tag{9.3}$$
Thus, for a given filter impulse response $\{A_i\}$, the partial
convolution output $Z_{n,k',k''}$ is essentially a function of the triplet
of data blocks, i.e.,
$$Z_{n,k',k''} = \Phi_{k''}(X_{n,k'}, X_{n-1,k'}, X_{n-2,k'}) \tag{9.4}$$
Since each $X_{n-i,k'}$ is 2 bits in length, the function $\Phi_{k''}$, for
a given value of k'', has $(2^2)^3 = 64$ possible combinations. Hence,
to store this function, we would require a memory of 64 words, each
8 bits long.
As there are two values of k'', i.e. 0 and 1, as can be seen
in equation (9.1), we need another (64 × 8)-bit memory space.
Furthermore, since the p.R.O.M. that we have consists of
256 8-bit words, we can make use of the 128 locations remaining to
repeat the above procedure for a different filter impulse response,
$\{B_i\}$ say.
The overall organisation of the p.R.O.M. memory space is shown
in Fig. 9.2, in which the contents in locations (0,0,$\xi$); (0,1,$\xi$);
(1,0,$\xi$); and (1,1,$\xi$) are shown, where $\xi$ is the value of the triplet
$(X_{n,k'}, X_{n-1,k'}, X_{n-2,k'})$ at a particular computation cycle. The
first two bits of the address locations shown are the 8th and 7th
address bits of the p.R.O.M.
The segmentation of the 8 address lines and the allocation of
the resulting segments to the relevant variables is shown in Fig. 9.3.
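The memory organisation just described can be sketched as a table generator. This is an illustration under assumptions, not the author's programming procedure: the function name `build_prom` is hypothetical, and the ordering of the two bits inside each data pair (lower-numbered address bit taken as the less significant) is assumed.

```python
def build_prom(response_a, response_b):
    """Generate the 256 words of the convolution-module look-up table.

    Assumed address layout (bit 1 = least significant):
      bits 1-2 : X_{n,k'}     bits 3-4 : X_{n-1,k'}     bits 5-6 : X_{n-2,k'}
      bit 7    : coefficient block select k''
      bit 8    : impulse response select ({A_i} or {B_i})
    """
    prom = []
    for addr in range(256):
        coeffs = response_b if (addr >> 7) & 1 else response_a
        k2 = (addr >> 6) & 1
        data_blocks = [(addr >> (2 * i)) & 0b11 for i in range(3)]
        coeff_blocks = [(c >> (4 * k2)) & 0xF for c in coeffs]
        prom.append(sum(a * x for a, x in zip(coeff_blocks, data_blocks)))
    return prom


prom = build_prom((255, 255, 255), (1, 2, 3))
print(len(prom), max(prom))   # every word fits in 8 bits
```

Locations 0 to 127 then serve the response {A_i} and locations 128 to 255 the response {B_i}, matching the four sections of Fig. 9.2.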
The hardware implementation of the D.C. module and the data
registers, and the corresponding circuit diagram are shown in Figs.
9.4 and 9.5 respectively.
At the start of every sampling instant nT, with the mode control
at '1', the 8-bit data $X_n$ is loaded in parallel into data registers
R1 and R2. After the first memory access, the mode control is brought
to '0', and, for the remaining cycles of the internal computation,
these registers shift their bit-contents serially. The other registers,
R3 - R6, are permanently connected in the serial mode. All the data
registers R1 - R6 are clocked once for every two internal cycles.
Two bits, in turn, of each $X_{n-i}$ are used as address lines to pins
3, 2; 1, 21; and 20, 19 of the p.R.O.M.
Pin 18 is connected to a control variable which alternates between
'0' and '1' at every internal clock cycle. Thus, for a particular
impulse response $\{A_i\}$ say, memory locations 0 to 63, and 64 to 127
will be made available alternately.
Memory location     Contents                                     Address bits 8, 7
0 - 63              $\sum_{i=0}^{2}(A_{i,0})(X_{n-i,k'})$        0 0
64 - 127            $\sum_{i=0}^{2}(A_{i,1})(X_{n-i,k'})$        0 1
128 - 191           $\sum_{i=0}^{2}(B_{i,0})(X_{n-i,k'})$        1 0
192 - 255           $\sum_{i=0}^{2}(B_{i,1})(X_{n-i,k'})$        1 1
Fig. 9.2. The organisation of the 256 words of the
p.R.O.M. convolution module into four basic
sections.

Memory address bits     Allocation
8                       bit select for impulse response
7                       bit select for coefficient blocks
6, 5                    $X_{n-2,k'}$
4, 3                    $X_{n-1,k'}$
2, 1                    $X_{n,k'}$
Fig. 9.3. Segmentation of p.R.O.M. address lines.
Fig. 9.6. Time successive addition of p.R.O.M. outputs.
Fig. 9.4. p.R.O.M. convolution module and associated data registers.
Fig. 9.5. Circuit diagram of convolution module and data registers.
The M.S.B. of the address, pin 17, is connected to a manual
switch, and is used to select either the impulse response $\{A_i\}$ or
$\{B_i\}$. As such, the implementation incorporates a simple 'in-situ'
programmability.
Successive outputs of the p.R.O.M. are appropriately weighted
(by shifting) and added together by the accumulator.
9.2.0.0 Programming the p.R.O.M. 
Some general information on the Intel 1702A 256 × 8 bit p.R.O.M.
used in the implementation is given in Appendix 9.0.
Basically, it is made of enhancement field effect transistors
(F.E.T's), with a floating gate, i.e. one which is embedded in an
insulating layer of silicon dioxide.
The p.R.O.M. is programmed by injecting high energy electrons,
produced by a controlled avalanche breakdown, through the oxide layer
to form a charge on the gate.
The p.R.O.M. can be reprogrammed by first erasing its previous
contents, which is done by irradiating the chip with ultra-violet
light for about 15 minutes.
For our implementation the programming is straightforward. The
contents of locations 0 to 255, as shown in Fig. 9.2, for given $\{A_i\}$
and $\{B_i\}$ are precomputed and the resulting data are punched onto a
standard 8-bit paper tape. Then this tape is used as the input to a
p.R.O.M. programmer (made by Data I/O Corporation), with the p.R.O.M.
to be programmed placed in the socket provided. After initiating
the machine, the rest of the programming is automatic.
In operation the Intel 1702A p.R.O.M. has an access time of 1 μs.
9.2.1 Accumulator. 
The accumulator adds, in time succession, the outputs of the
p.R.O.M. convolution module. Also, it weights each successive output
by spatially shifting the previous partial result.
Its operation is best illustrated if we first expand equation
(9.2) as follows:
$$Z_n = \left\{\sum_{i=0}^{2}(A_{i,0})(X_{n-i,0})\right\}(2^4)^0(2^2)^0 + \left\{\sum_{i=0}^{2}(A_{i,1})(X_{n-i,0})\right\}(2^4)^1(2^2)^0 + \dots + \left\{\sum_{i=0}^{2}(A_{i,1})(X_{n-i,3})\right\}(2^4)^1(2^2)^3 \tag{9.5}$$
In equation (9.5), we now write each term
$$\left\{\sum_{i=0}^{2}(A_{i,k''})(X_{n-i,k'})\right\}(2^4)^{k''}(2^2)^{k'}$$
in the form
$$\{Y_{n,k'',k'}\}\,2^{4k''+2k'}, \qquad k'' = 0,1;\; k' = 0,1,2,3,$$
where $\{Y_{n,k'',k'}\} = \left\{\sum_{i=0}^{2}(A_{i,k''})(X_{n-i,k'})\right\}$ is the output of the p.R.O.M.
convolution module.
The module outputs are added in time succession in the order
shown in Table 9.0, with each output being weighted by $2^{4k''+2k'}$.
In the sequential accumulation, this weighting is equivalent to
shifting the module output, at a particular computation cycle,
(4k'' + 2k') bits to the left, relative to the L.S.B. of the filter
output, as shown in Fig. 9.6. Alternatively, the module outputs
can be alternately shifted 4 bits to the left and 2 bits to the
right of each other, as shown in the last column of Table 9.0.

Computation   p.R.O.M. output    No. of bit shifts to left     Shift relative to
step          Y_{n,k'',k'}       w.r.t. L.S.B. (4k'' + 2k')    previous p.R.O.M. output
1             Y_{n,0,0}          0                             0
2             Y_{n,1,0}          4                             4 left
3             Y_{n,0,1}          2                             2 right
4             Y_{n,1,1}          6                             4 left
5             Y_{n,0,2}          4                             2 right
6             Y_{n,1,2}          8                             4 left
7             Y_{n,0,3}          6                             2 right
8             Y_{n,1,3}          10                            4 left
Table 9.0. Order of addition of successive outputs
of convolution module.
Since, in our circuit implementation, the p.R.O.M. output is
hardwired, we have to shift, instead, the partial results of the
running sum of the module outputs. Consequently, the 4-bit left
and 2-bit right shifts must now be replaced by 4-bit right and 2-bit
left shifts respectively.
Furthermore, we have designed the accumulator to truncate the
filter output to 12 bits, by shifting out the two least significant
bits of the partial result with every 4-bit right shift.
The mechanisation we have described is illustrated by the example
shown in Fig. 9.7 in which the successive 8-bit outputs of the p.R.O.M.
are enclosed in rectangles. The last 2-bit left shift shown is not
a physical shift but only a reinterpretation of the binary
point in the final filter output.
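The shift-and-add sequence can be simulated as follows. This is a sketch under explicit assumptions rather than a model of the actual circuit: the p.R.O.M. output is assumed to enter a 12-bit register two bit positions above its L.S.B. (the constant `GUARD_BITS`), data and coefficients are taken as unsigned 8-bit words, and the names `module_output` and `accumulate` are hypothetical.

```python
# (k'', k') in the order of Table 9.0
ORDER = [(0, 0), (1, 0), (0, 1), (1, 1), (0, 2), (1, 2), (0, 3), (1, 3)]
GUARD_BITS = 2          # assumed position of the p.R.O.M. output in the register


def module_output(coeffs, data, k2, k1):
    """Y_{n,k'',k'}: sum of 4-bit coefficient block times 2-bit data block."""
    return sum(((a >> (4 * k2)) & 0xF) * ((x >> (2 * k1)) & 0b11)
               for a, x in zip(coeffs, data))


def accumulate(coeffs, data):
    """Add the eight module outputs while shifting the running sum instead."""
    reg = 0
    for step, (k2, k1) in enumerate(ORDER):
        if step > 0:
            # 4-bit right shift before a k''=1 output, 2-bit left shift otherwise
            reg = reg >> 4 if k2 == 1 else reg << 2
        reg += module_output(coeffs, data, k2, k1) << GUARD_BITS
        assert reg < (1 << 12)          # the running sum stays within 12 bits
    return reg


coeffs, data = (5, 3, 7), (2, 10, 3)
full = sum(a * x for a, x in zip(coeffs, data))
print(accumulate(coeffs, data), full >> 8)
```

Under these assumptions each 4-bit right shift discards only bits that no later module output can affect, so the final register content equals the full-precision sum with its eight least significant bits truncated.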
9.2.1.0 Circuit implementation of accumulator. 
The accumulator hardware and its circuit diagram are shown in
Figs. 9.8 and 9.9 respectively.
In Fig. 9.9, the three 4-bit adders add the p.R.O.M. module output
to the shifted and delayed partial result.
The necessary 4-bit right and 2-bit left shifts are provided by
the six dual 4 line to 1 line data multiplexers. Furthermore, one
data input of each multiplexer is permanently connected to a logical
'0'. After the 8th computation cycle, the select lines 2, 14 are
such that an all zeros combination is loaded into the three 4-bit
parallel accumulator registers at the next sampling instant (n+1)T.
This effectively resets the accumulator at every system sample period.
The outputs of the 4-bit adders are connected to the buffer logic
as well as to a row of light-emitting diodes (L.E.D's).
Fig. 9.7. Successive addition and truncation in filter accumulator.
Fig. 9.8. Accumulator unit.
Fig. 9.9. Circuit diagram of accumulator unit.
9.2.2 Buffer logic. 
The buffer logic consists simply of eight 2-input And-gates
and three parallel 4-bit registers.
One input of every And-gate is tied to a common control line,
while the other is connected to an accumulator output bit. These
gates 'mask' the partial results, and are enabled only when the actual
filter output $Z_n$ is obtained.
At every sampling instant, $Z_n$ is loaded into the 4-bit registers
and held there until the next output $Z_{n+1}$ has been computed.
9.2.3 Clock and control unit. 
This functional unit provides the clock pulses for the accumulator 
and data registers, and the necessary timing pulses to synchronise 
the other sub-systems. 
The basic circuit and its timing diagram are shown in Figs. 9.10
and 9.11. A simple modification to this basic unit for operating the
filter with real-time signals is shown in Fig. 9.12. The corresponding
timing diagram is shown in Fig. 9.13.
9.2.3.0 System clocks.
The basic clocks of the filter system consist of a manual
'bounce-free' switch, and a simple 1 MHz square wave generator made
up of two standard monostable multivibrators.
Either of these two clocks is selected via a 2 to 1 multiplexer
made up of one triple 3-input Nand I.C. package.
When used in a dynamic operation, the system sampling clock
comes from an external generator.
As can be seen in Fig. 9.10, the output of the 2:1 multiplexer
is divided by a 4-bit counter, whose A (pin 12) and B (pin 9) outputs
(see also the timing diagram in Fig. 9.11) are used to clock the
accumulator and the data registers respectively.
9.2.3.1 Timing pulses. 
To ensure that the relevant timing and select signals are set
up before the corresponding clock pulses, the outputs of the counter
are delayed by 500 nS, by clocking the parallel 4-bit register with
the pulse output of a monostable, which is triggered at every
computation cycle by the A output.
The mode control of data registers R1 and R2 comes from the
output of the Nand-gate N2. It is set to '1' prior to every sampling
instant and reverts to '0' after the first computation cycle, remaining
so until after the 8th cycle. As a consequence, the filter input
$X_n$ is converted from an 8-bit parallel word to a 2-bit sequential
one by the input registers R1 and R2.
The delayed B output of the counter, and the output of the
Nand-gate N1, are used as the select A, B inputs (pins 14 and 2
respectively) of the 4:1 data multiplexers of the accumulator.
The output of N1 is also connected to the input 2:1 multiplexer,
as shown in Fig. 9.10, such that when the filter is operating with
the 1 MHz clock, this clock is inhibited just after the 8th computation
cycle. This effectively halts the computation process once the actual
filter output $Z_n$ has been obtained.

Fig. 9.10. Basic timing circuit. (Elements shown: inputs from the manual
clock, the auto clock and the select switch; SN7493 4-bit counter;
SN74121 monostable with 6.8 kΩ timing resistor; SN7495 register;
Nand-gates N1 and N2; outputs to the mode control pins 6 of registers
R1, R2, to the multiplexers' select inputs (pins 14 and 2), and to the
accumulator and data registers' clock inputs.)

Fig. 9.11. Timing diagram of basic circuit. (Traces shown: counter
outputs A and B, delayed counter outputs, Nand N1, Nand N2.)
To enable the system to operate on real-time signals, the 2:1
input multiplexer is rewired as shown in Fig. 9.12. As can be
followed from the timing diagram given in Fig. 9.13, at every sampling
instant the external clock triggers the monostable, which in turn
provides an output pulse about 5 µS wide. This acts as a 'window'
to allow at least two clock pulses from the 1 MHz generator to initiate
the internal computation. The remaining pulses necessary to complete
the internal processing are provided by connecting the output of
the Nand-gate N3 as shown.
Finally, the And-gates of the buffer logic are controlled by the
output of N1, and its registers are clocked by the Q output of the
input monostable.
9.3 System performance. 
Before the sub-systems were connected together, the basic clock
and control circuit was tested by selecting the 1 MHz clock and
monitoring, on an oscilloscope, the counter outputs, the delayed
counter outputs and the outputs of Nand-gates N1 and N2.
The complete filter system was then assembled, and for its
preliminary tests, eight manual switches were used as inputs, and
the system output (from the output of the 12-bit adders of the
accumulator) was monitored by the row of L.E.D's.
After programming the p.R.O.M. with some simple filter coefficients,
say $A_0 = A_1 = A_2 = 255 \times 2^{-8}$, the accumulator unit was
checked for correct operation. The manual clock was selected, and the
filter was stepped through its computation cycles. At each step the
partial result in the accumulator was visually compared with the
calculated value.

Fig. 9.12. Simple modification of control unit. (Elements shown: two
SN74121 monostables, with timing components 6.8 kΩ, and 1 nF with
8.2 kΩ; connection to pin 14 of the counter; Q output to the buffer
register.)

Fig. 9.13. Timing diagram of modified input 2:1 multiplexer. (Traces
shown: sampling clock at the n-th and (n+1)-th instants, sampling
pulse, and the Nand-gate outputs.)
Following this check, the overall processing of the filter was
simulated by setting up successive values of $X_n$ via the input
switches. With every value of $X_n$, the manual clock was selected
and clocked twice, thus loading $X_n$ into the data registers R1 and R2.
The 1 MHz clock was then selected and the computation subsequently
completed automatically. After $Z_n$, the filter output, was obtained
(and displayed on the L.E.D's), the filter system was prepared
for the next input value $X_{n+1}$.
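The two basic forms of the input signal used are the digital
impulse and step sequences, given by $\{X_t\}$ where

\[ X_t = \begin{cases} 255 \times 2^{-8} & \text{for } t = 0 \\ 0 & \text{for } t \neq 0 \end{cases} \]

and

\[ X_t = 255 \times 2^{-8} \quad \text{for all positive values of } t, \]

respectively.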
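
The 'calculated value' against which the L.E.D. display is compared can be sketched as follows. This is only an illustration: it uses exact fractions and ignores the rounding performed in the hardware, it assumes the test coefficients $A_0 = A_1 = A_2 = 255 \times 2^{-8}$ mentioned earlier, and it assumes a step amplitude equal to the impulse amplitude. The function name `fir_response` is hypothetical.

```python
from fractions import Fraction


def fir_response(coeffs, inputs):
    """Non-recursive section: Z_n = sum_i A_i * X_{n-i}, with X_t = 0 for t < 0."""
    outputs = []
    for n in range(len(inputs)):
        acc = Fraction(0)
        for i, a in enumerate(coeffs):
            if n - i >= 0:
                acc += a * inputs[n - i]
        outputs.append(acc)
    return outputs


AMP = Fraction(255, 256)                 # 255 x 2^-8 (all eight input bits set)
A = [AMP, AMP, AMP]                      # assumed test coefficients A_0 = A_1 = A_2

impulse = [AMP] + [Fraction(0)] * 5      # digital impulse sequence
step = [AMP] * 6                         # digital step sequence (assumed amplitude)

impulse_out = fir_response(A, impulse)   # three equal samples, then zero
step_out = fir_response(A, step)         # ramps up over three samples, then constant

for n, (zi, zs) in enumerate(zip(impulse_out, step_out)):
    print(n, float(zi), float(zs))
```

With these assumptions the impulse response lasts exactly three samples, and the step response settles after the third sample at three times the single-tap product.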
The system was then tested for real-time operation by using
a 10 kHz square wave as the basic sampling clock. The control unit
was first modified as described in Section 9.2.3, and the modified
circuit was checked by comparing the outputs (see Fig. 9.12) of the
monostables, and those of the Nand-gates Nb, Nc and Na, with the
pulses in the timing diagram shown in Fig. 9.13.
The filter dynamic characteristics, for given impulse responses
$\{A_i\}$, were then tested by observing the response of the filter to
a digital impulse. Also, the frequency content of the resulting
impulse response was measured using a Fourier Analyzer*.
The digital impulse was derived by first connecting together
all the input bits of $X_n$ to a common line, which is driven by the
output of a pulse generator. An input pulse 90 µS wide, with a
repetition period of 30 mS, was used to obtain the impulse response.
The repetition period was chosen such that successive impulse
responses of the filter do not overlap in time. Also, the period
is of a much larger duration than the 'time window'† used in the
Analyzer measurements.
The step responses of the filter were also obtained by simply
widening the pulse width of the test input to more than six times
the period of the sampling clock.
Some typical results are shown in Figs. 14 to 17, for the impulse
responses

$\{A_i\}' = (255/256), (94/256), (35/256)$

and

$\{B_i\}' = (63/256), (127/256), (63/256)$.
Figs. 14 and 15 show the time domain responses of the system
to a digital impulse, while Figs. 16(a) and (b) and Figs. 17(a) and
(b) show the discrete Fourier transforms (magnitude and phase) of
the time waveforms in Figs. 14 and 15 respectively.
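
As a rough cross-check on such measurements, the ideal magnitude and phase of the two three-term impulse responses can be evaluated directly. The sketch below is not the analyzer's procedure: it evaluates $H(f) = \sum_i A_i e^{-j 2\pi f i / f_s}$ for unquantised coefficients, and it assumes a sampling clock of 10 kHz (the value quoted for the real-time test). The names `freq_response` and `FS` are hypothetical.

```python
import cmath
import math


def freq_response(coeffs, f, fs):
    """Ideal response of a non-recursive filter at frequency f (Hz)."""
    return sum(a * cmath.exp(-2j * math.pi * f * i / fs)
               for i, a in enumerate(coeffs))


A = [255 / 256, 94 / 256, 35 / 256]      # impulse response {A_i}'
B = [63 / 256, 127 / 256, 63 / 256]      # impulse response {B_i}'
FS = 10_000.0                            # assumed sampling frequency in Hz

for f in (0.0, 1250.0, 2500.0, 3750.0, 5000.0):
    ha = freq_response(A, f, FS)
    hb = freq_response(B, f, FS)
    print(f"{f:7.1f} Hz  |A| = {abs(ha):.4f}  |B| = {abs(hb):.4f}  "
          f"phase B = {math.degrees(cmath.phase(hb)):7.2f} deg")
```

Because $\{B_i\}'$ is symmetric about its centre term, its phase is linear in frequency and its magnitude falls almost to zero at half the sampling frequency.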
The filter system implemented has a maximum sampling frequency 
of about 60 kHz. 
*, † General information on the Fourier Analyzer used for our experiments,
and the relevant parameter settings, are given in Appendix 9.2.

Fig. 14. Time impulse response of filter having coefficients $\{A_i\}'$
(see text).
Fig. 15. Time impulse response of filter having coefficients $\{B_i\}$.
(Scale: Vertical 2 V/cm; Horizontal 0.2 mS/cm.)

Fig. 16. Amplitude (a) and phase (b) responses of waveform in Fig. 14.
(a) Scale: Vertical 5 x 10^-2 V/div. (b) Scale: Vertical 45°/div.
(Horizontal scale: 2.5 kHz/div.)

Fig. 17. Amplitude (a) and phase (b) responses of waveform in Fig. 15.
(a) Scale: Vertical 1 x 10^-1 V/div. (b) Scale: Vertical 45°/div.
(Horizontal scale: 2.5 kHz/div.)
9.4 Conclusion. 
The modular approach proposed in Chapter 8 has been applied 
to the practical design and hardware implementation of a second-
order digital filter system operating on real-time signals. 
The design of the main functional sub-systems was described 
in some detail, with emphasis on the features particular to the 
modular approach. 
The filter system was successfully constructed and tested. 
APPENDIX 9.0
General technical data on
(i) the Intel 1702A M.O.S. erasable and electrically
programmable read-only memory, and
(ii) the Datel DAC-HY12BC 12 bit hybrid digital to
analog converter.
Silicon Gate MOS 1702A 
2048 BIT ELECTRICALLY PROGRAMMABLE 
READ ONLY MEMORY 
1702A - ERASABLE & ELECTRICALLY REPROGRAMMABLE
• Fast Programming -- 2 minutes for all 2048 bits
• All 2048 bits guaranteed* programmable -- 100% factory tested
• Fully Decoded, 256 x 8 organization
• Static MOS -- No Clocks Required
• Inputs and Outputs DTL and TTL compatible
• Three-state Output -- OR-tie Capability
• Simple Memory Expansion -- Chip select input lead
The 1702A is a 256 word by 8 bit electrically programmable ROM ideally suited for uses where fast turn-around and pattern
experimentation are important. The 1702A has undergone complete programming and functional testing on each bit position
prior to shipment, thus insuring 100% programmability.
The 1702A is packaged in a 24 pin dual in-line package with a transparent quartz lid.
The transparent quartz lid allows the user to expose the chip to ultraviolet light to erase the bit pattern. A new pattern can then
be written into the device. This procedure can be repeated as many times as required.
The circuitry of the 1702A is entirely static; no clocks are required.
The 1702A is fabricated with silicon gate technology. This low threshold technology allows the design and production of higher
performance MOS circuits and provides a higher functional density on a monolithic chip than conventional MOS technologies.
PIN CONFIGURATION
[Pin configuration diagram of the 24 pin package.]

PIN NAMES
A0-A7          Address Inputs
CS             Chip Select Input
DOUT1-DOUT8    Data Outputs

BLOCK DIAGRAM
[Block diagram; labels include DATA OUT 1 to DATA OUT 8, PROGRAM, drivers, and address inputs A0, A1, ..., A7.]

NOTE: In the read mode a logic 1 at the address inputs
and data outputs is a high and logic 0 is a low.

PIN CONNECTIONS
The external lead connections to the 1702A differ, depending on whether the device is being programmed or used in
read mode. (See following table.)

MODE          12      13              14     15      16                   22      23
              (Vcc)   (Program)       (CS)   (VBB)   (VGG)                (Vcc)   (Vcc)
Read          Vcc     Vcc             GND    Vcc     VGG                  Vcc     Vcc
Programming   GND     Program Pulse   GND    VBB     Pulsed VGG (VIL4P)   GND     GND

DATEL SYSTEMS, INC.
12 Bit Binary or 3 Digit BCD 
Pin-Programmable Outputs 
Internal Reference & Output Amp. 
Miniature Hermetic Glass Package 
±15VDC Supply Only 
Fast Settling Time 
LOW COST, 12 BIT HYBRID
DIGITAL TO ANALOG CONVERTERS
$24. IN 100's

The DAC-HY12BC and DAC-HY12DC are low cost 12 bit binary and 3 digit
BCD digital-to-analog converters manufactured in volume in Datel
Systems' modern in-house thin film hybrid facility. A new level of
performance has been achieved for 12 bit D/A converters at a price far
below that of previously available models. These converters are
complete, including a precision internal reference and a fast output
operational amplifier. A high degree of application flexibility has
been achieved with voltage and current outputs of 0 to -2mA, ±1mA,
0 to +5V, 0 to +10V, ±2.5V, ±5V, and ±10V, all available by external
pin connection. These devices are available in a miniature
1.3 x 0.8 x .15 inch hermetically sealed glass package.

Nonlinearity is ±½LSB maximum for the DAC-HY12BC and ±¼LSB maximum
for the DAC-HY12DC. Temperature coefficient of gain is ±30ppm/°C
maximum and temperature coefficient of zero is ±5ppm/°C of full scale
maximum. Output settling time is 300 nsec. to ½LSB for current output
and 3 µsec. to ½LSB for voltage output with a 10V change. Input coding
is complementary binary, complementary BCD, and complementary offset
binary. Power supply requirement is ±15VDC at 35mA. No 5 volt logic
supply is necessary.

The internal design of these hybrid converters consists of 12 weighted
current sources, 2 thin film resistor networks, a precision zener
reference, reference control circuit, and an output operational
amplifier. The current source switches consist of monolithic
quad-current sources in conjunction with a Nichrome thin film resistor
network which is functionally laser trimmed to precisely set the
8-4-2-1 weighting. The superior tracking capability of the thin film
resistors in conjunction with the tightly matched quad-current sources
results in a differential linearity tempco of only ±2ppm/°C, assuring
monotonic operation over the full 0°C to 70°C temperature range. For
excellent long term stability both the thin film resistor networks and
the thin film substrate are passivated.

Second source devices for the DAC-HY12BC and DAC-HY12DC are Burr-Brown
series DAC80 and DAC85 which are pin for pin equivalents.
[Photograph of the D/A converter DAC-HY12BC (actual size).]

[Block diagram; labels include: weighted current switches, thin film
resistor network, precision ref. -6.3V, output amp, N.C., ref. in,
ref. out, 10V range, volt. out, current out, bipolar off., gain adj.
Notes: For BCD model these resistors are 4 kΩ. For BCD model this
resistor is open circuit.]

MECHANICAL DIMENSIONS, INCHES (MM)
[Package outline drawing.]

INPUT/OUTPUT CONNECTIONS

PIN   FUNCTION      PIN   FUNCTION
 1    BIT 1 IN      13    NO CONN.
 2    BIT 2 IN      14    -15VDC
 3    BIT 3 IN      15    VOLT. OUT
 4    BIT 4 IN      16    REF. IN
 5    BIT 5 IN      17    BIPOLAR OFF.
 6    BIT 6 IN      18    10V RANGE
 7    BIT 7 IN      19    20V RANGE
 8    BIT 8 IN      20    CURRENT OUT
 9    BIT 9 IN      21    GROUND
10    BIT 10 IN     22    +15VDC
11    BIT 11 IN     23    GAIN ADJ.
12    BIT 12 IN     24    REF. OUT
APPENDIX 9.1
Circuit diagram of reconstruction filter. (Component values shown:
resistors 12 kΩ, 10 kΩ, 10 kΩ, 12 kΩ, 5.6 kΩ, 10 kΩ, 4.7 kΩ, 10 kΩ,
4.7 kΩ; capacitors 1000 pF, 3300 pF, 4700 pF, 0.033 µF, 0.015 µF,
0.01 µF.)
APPENDIX 9.2
The Hewlett-Packard 5451A Fourier analyzer system [54, 55] used in our
experiment utilizes the HP2100A digital computer to calculate the
Fourier transform of a time-varying voltage x(t), i.e. the transform
given by

\[ S_x(f) = \int_{-\infty}^{+\infty} x(t)\, e^{-j 2\pi f t}\, dt . \]

In the digital implementation of this transform the input x(t)
has to be sampled at finite, usually uniform, intervals of time,
$\Delta t$ say.
Thus, we calculate instead,

\[ S''_x(f) = \Delta t \sum_{n=-\infty}^{+\infty} x(n\Delta t)\, e^{-j 2\pi f (n\Delta t)} . \]

This describes accurately the spectrum of x(t) up to some
maximum frequency $F_{max}$ which is dependent upon the sample spacing
$\Delta t$.
Furthermore, in practice, only a time-limited record of the input
signal can be taken. Thus if the signal is 'observed' from some zero
time reference to time T secs., then N = number of samples = $T/\Delta t$.
As a result we cannot now calculate the spectrum of x(t) at an
infinite number of frequencies from 0 Hz to $F_{max}$.
Thus, we end up with what is called the discrete finite transform
(D.F.T.) given by

\[ S'_x(m\Delta f) = \Delta t \sum_{n=0}^{N-1} x(n\Delta t)\, e^{-j 2\pi (m\Delta f)(n\Delta t)} . \]

In our experiments, the following parameters were used:
N = block size of sampled data = 256
$\Delta t$ = sampling period of analyzer's analogue to digital converter
= 20 µS
$F_{max}$ = 25 kHz
$\Delta f$ = 50 Hz.
CHAPTER 10
A UNIFIED FILTER REALISATION APPROACH
USING PROGRAMMABLE STORED-LOGIC
CONVOLUTION MODULES*
10.0 Introduction.
We have shown in Chapter 8 how a second-order section may be
realised in a modular way by using digit-convolution modules. Further
to this, we propose and develop, in this chapter, a novel method of
implementing the basic digit-convolution module which combines the
fast operating speed of a table look-up form with the flexibility of
one realised from standard arithmetic units.
The proposed method has the added attraction that it may
be further generalised to enable the concept of stored-logic convolution
modules to be applied in a general-purpose computer, and also to digital
filters with time-varying coefficients.
In this chapter we also describe the extension of the modular
approach to a general second-order digital filter, which now includes
the recursive part, and discuss the general mechanisation of high-order
digital filters.
We conclude the chapter by briefly surveying other significant
approaches to the hardware implementation of digital filters that
have been proposed recently.
* Sections 10.1 to 10.4 are based on a paper to be presented at a
forthcoming conference [58].
10.1 Basic implementations of digit-convolution module. 
After having decided upon the word sizes of the data and 
coefficient blocks of the basic digit-convolution module, a digital 
designer is faced with essentially two basic ways with which to 
implement the module in hardware. These are shown in Figs. 10.0(a) and (b).
In the former, the digit module is built directly from standard
multipliers and adders using known techniques [21]. While this form
offers maximum flexibility in terms of filter coefficients, it is rather
expensive, requires considerable wiring, and its operating speed is
dependent on the gate propagation delays.
The stored-logic form shown in Fig. 10.0(b), on the other hand, is
compact with reduced wiring and power dissipation, and is extremely
fast in operation. Its disadvantage is that different R.O.M's are
needed for different filter transfer functions. Even if erasable and
programmable R.O.M's like those described in Chapter 9 are used, it
still requires a considerable amount of time to erase previously
stored contents of the p.R.O.M's and to prepare the paper tapes for
the updated look-up table.
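
The contrast between the two forms can be illustrated by a small software model. This is a minimal sketch only, assuming unsigned data and coefficient blocks and an address formed by packing the three data blocks side by side; the function names and the packing order are hypothetical and are not taken from the hardware described.

```python
def direct_module(coeff_block, data_vector):
    """Direct form: three multiplications and additions per output."""
    return sum(a * x for a, x in zip(coeff_block, data_vector))


def pack_address(data_vector, p_data):
    """Pack (X_n, X_{n-1}, X_{n-2}), each p_data bits wide, into one address."""
    address = 0
    for j, x in enumerate(data_vector):
        address |= x << (p_data * j)
    return address


def build_rom(coeff_block, p_data):
    """Stored-logic form: tabulate the module output for every data vector."""
    mask = (1 << p_data) - 1
    rom = []
    for address in range(1 << (3 * p_data)):
        vector = [(address >> (p_data * j)) & mask for j in range(3)]
        rom.append(direct_module(coeff_block, vector))
    return rom


P_DATA = 2                       # assumed data block length in bits
COEFFS = (3, 1, 2)               # assumed coefficient block values
rom = build_rom(COEFFS, P_DATA)  # 2**(3 * 2) = 64 stored words

vector = (2, 3, 1)
print(direct_module(COEFFS, vector), rom[pack_address(vector, P_DATA)])
```

Once the table has been built, each output costs a single memory read, but a change of coefficients means rebuilding all of the stored words, which is the drawback noted above.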
10.2 Novel implementation using complementary convolution module. 
The method to be described is an application of a recent proposal 
by the author (See Appendix 10.0). 
We resolve the dilemma in the previous section by realising that 
instead of having to decide on one of the two forms in Fig. 10.0, we 
may actually use both in a unified structure to produce an effective 
combination. 
The basic scheme shown in Fig. 10.1, in which the modular
second-order configuration is implemented in the sequential processing
mode, consists of a slow non-real-time simulated filter and a fast
real-time stored-logic part.

Fig. 10.0. Two basic hardware implementations of digit module: (a) direct
method, and (b) using a memory unit with control inputs. (Labels:
3-component data block vector $X_{n,k'}$, $X_{n-1,k'}$, $X_{n-2,k'}$;
coefficient block store holding $A_{0,0}$ to $A_{2,b''-1}$, where each
$A_{i,k''}$ is p''-bits; R.O.M.; q-bit control line, where q is the
integer $\geq \log_2 b''$; output $Z_{n,k',k''}$.)

Fig. 10.1. Complementary Y-Y convolution module. (Labels: real-time
environment, containing the real-time table look-up digital filter with
input buffer*, 2:1 MUX, array of fast R.A.M. and output buffer†, between
x(t) and y(t); non-real-time standard digital filter, containing the
data block vector simulator and the 'slow' realisation of the
digit-convolution module; read/write control; coefficient-block select;
data-in.)
* A/D converter, data registers.
† Accumulator, D/A converter and reconstruction filter.
We have termed this combination the complementary Y-Y* convolution
module. The 'slow' filter part of this module is basically similar to
the circuit shown in Fig. 10.0(a) with two main differences, viz.,
(a) the data block vectors $\underline{X}_{k'} = (X_{n,k'}, X_{n-1,k'}, X_{n-2,k'})$
are now simulated by a 3p'-bit binary counter (p' = bit length of a data
block), and
(b) since this 'slow' filter is not working in real-time, the
arithmetic unit needed to compute the module output for a given vector
$\underline{X}_{k'}$ can be constructed as a serial configuration using
slow and inexpensive components.
One such realisation is shown in Fig. 10.2, in which the multiplexers
allow for the multiplications $(A_{0,k''})(X_{n,k'})$, $(A_{1,k''})(X_{n-1,k'})$ and
$(A_{2,k''})(X_{n-2,k'})$ to be done in time succession. Thus, for a given
vector $\underline{X}_{k'}$ and coefficient block, the corresponding module
output is obtained after three multiplexer periods.
Associated with the slow simulated filter is its fast table look-up
version operating in the real-time environment. This counterpart
consists of a fast read/write memory (R.A.M.), the multiplexer for the
data registers and the data block simulator, and the relevant interfaces
from and to the real-time environment.
* An abbreviation of the term Yin-Yang, a term used in Chinese philosophy
to indicate 'the active and passive principles of the universe
.... From their interaction all things come into existence',
'Encyclopedia Americana', Vol. 29.

Fig. 10.2. 'Slow' serial simulation of digit-convolution modules.
(Labels: data block vector simulator (3p'-bit binary counter), with
outputs $X_{n,k'}$ to $X_{n-2,k'}$ and to the R.A.M. address; data block
vector multiplexer with output $Q_x$; coefficient block multiplexer fed
with $A_{0,k''}$, $A_{1,k''}$, $A_{2,k''}$ from the coefficient block store;
mux. select lines; simple p' x p'' multiplier; module accumulator, with
output to the data inputs of the R.A.M.)
Before actual real-time processing, the Y-Y module is first
switched to its slow half. For every combination of the vector
block simulator, the coefficient block registers go through the
complete sequence of coefficient blocks, i.e.
$\{A_{i,0}\}, \ldots, \{A_{i,k''}\}, \ldots, \{A_{i,b''-1}\}$.
For a particular vector $\underline{X}_{k'}$ and coefficient block
$\{A_{i,k''}\}$, the module output

\[ Z_{n,k',k''} = \sum_{i=0}^{2} \{A_{i,k''}\}\{X_{n-i,k'}\} \]

is computed and written into the R.A.M. store at the location specified
by the vector $\underline{X}_{k'}$.
Each coefficient block is associated with a particular combination
of those address bits that are allocated as the control variables.
The programming mode is completed after the vector simulator has
exhausted all possible combinations of $\underline{X}_{k'}$.
The Y-Y convolution module is now switched to its active real-time
mode and operates as a fast stored-logic digital filter.
A practical digital filter using this complementary convolution
module idea, and based on the author's basic circuit designs, has been
successfully constructed as a Final Year Project [57].
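
The two operating phases can be modelled in software as follows. This is a hedged sketch rather than a description of the constructed hardware: the class name `YinYangModule`, the unsigned block values, and the address layout (coefficient-block index in the upper bits, data block vector in the lower 3p' bits) are all assumptions made for illustration.

```python
class YinYangModule:
    """Hypothetical model of a programmable stored-logic convolution module."""

    def __init__(self, p_data, coeff_blocks):
        self.p = p_data                       # bit length p' of a data block
        self.vec_bits = 3 * p_data            # width of the simulated vector
        self.blocks = list(coeff_blocks)      # one (A0, A1, A2) per block k''
        self.ram = [0] * (len(self.blocks) << self.vec_bits)
        self.programmed = False

    def _unpack(self, count):
        """Split the counter value into (X_n, X_{n-1}, X_{n-2})."""
        mask = (1 << self.p) - 1
        return tuple((count >> (self.p * j)) & mask for j in range(3))

    def _address(self, x_vec, k2):
        """Control bits (block index) above the data block vector bits."""
        address = k2 << self.vec_bits
        for j, x in enumerate(x_vec):
            address |= x << (self.p * j)
        return address

    def program(self):
        """Slow half: a binary counter steps through every data block vector."""
        for count in range(1 << self.vec_bits):
            x_vec = self._unpack(count)
            for k2, block in enumerate(self.blocks):
                z = 0
                for a, x in zip(block, x_vec):    # three serial multiply-adds
                    z += a * x
                self.ram[self._address(x_vec, k2)] = z
        self.programmed = True

    def lookup(self, x_vec, k2):
        """Fast half: one memory read per module output."""
        if not self.programmed:
            raise RuntimeError("module has not been programmed")
        return self.ram[self._address(x_vec, k2)]


module = YinYangModule(p_data=2, coeff_blocks=[(1, 2, 3), (3, 0, 1)])
module.program()
z0 = module.lookup((1, 2, 3), 0)
z1 = module.lookup((1, 2, 3), 1)
print(z0, z1)
```

The arithmetic appears only in `program`; after programming, the real-time path performs no multiplication at all.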
10.3 The Y-Y module in the parallel modular realisation.
The proposed approach can be easily applied to the direct parallel
form of the modular realisation of the second-order section.
To illustrate this, consider the case when the B'-bit data words
are each partitioned into two blocks, each of p' = B'/2 bits, and the
B''-bit coefficient words are each partitioned into two blocks, each
of p'' = B''/2 bits.
The resulting parallel form is shown in Fig. 10.3 and consists
of two groups of digit-convolution modules, each group containing
two modules.
The groups may be programmed simultaneously, while in a particular
group, each convolution module is programmed in turn, each module
being selected by the control signals.
The non-real-time module used is identical to that described
previously (see also Fig. 10.2), but is used in a slightly different
way.
The particular block vector $\underline{X}_{k'}$ in a given group, say,
is not connected in parallel to the address of the stored-logic modules.
Instead, only one entry port is used, via the $X_{n,k'}$ data block as
shown in Fig. 10.3. Also, the address vector used, i.e.
$\underline{X}_{k'}$, is obtained sequentially. Each value of $Q_x$, the
output of the data block multiplexer, is input and shifted horizontally
along the registers of the groups.
After three such shifts, the correct combination is addressing
the modules. By this time also, the output for the module currently
being written into will have been computed.
The remaining steps in the programming are as discussed previously.
10.4 Consequence of the concept of complementary Y-Y convolution module.
Apart from its obvious practical usefulness, the primary consequence 
of the above concept is that it can be a useful tool to unify what 
have previously been apparently different approaches to the realisation 
of digital filters. 
Fig. 10.3. Parallel realisation of second-order section
using programmable digit-convolution modules. (Labels: input $X_n$;
two register groups holding $X_{n,1}$, $X_{n-1,1}$, $X_{n-2,1}$ and
$X_{n,0}$, $X_{n-1,0}$, $X_{n-2,0}$, each fed through a MUX; non-real-time
module* with input $Q_x$ and module output; modules $A_{i,1}$ and
$A_{i,0}$ in each group; chip select; outputs $Z_{n,1,1}$, $Z_{n,1,0}$,
$Z_{n,0,1}$, $Z_{n,0,0}$.)
* See Fig. 10.2.
In the literature [1, 23, 50], there is firstly the division between
slow but flexible realisations using a general-purpose computer
(G.P.C.) and real-time special-purpose realisations using hard-wired
circuits. Further to this, even with special-purpose processors,
there is the division between those built from standard arithmetic
units, and those using R.O.M's as look-up tables.
We have already described how the last two forms can be combined
together as one complementary unit. By generalising the concept, it
is also possible to combine the general and special-purpose organisations.
The block diagram of a G.P.C. organised as a complementary Y-Y
convolution module is shown in Fig. 10.4.
The vector simulator and the 'slow' filter in Fig. 10.1 have
now been replaced by a software routine, quite probably in assembler
language, and the stored-logic module is now implemented using an
allocated memory space in the computer core store.
The data block vector is taken in and acts, via the direct memory
access [21] (D.M.A.) input-output interface, as the address to the
reserved memory space. In practice, this vector may have to be modified
in order to match it with the address format of the computer's memory.
The software routine computes basically the sum of products
algorithm for the module. The results of the computations are loaded
into the allocated store at the address specified by the vector simulator.
This method is attractive in view of the current trend in
microprocessor technology.
Fig. 10.4. Embedding of a complementary Y-Y convolution
module within a general purpose computer. (Labels: real-time part of
G.P.C., containing the input buffer, I/O D.M.A.* interface, R.A.M. space
in core store addressed by the address vector, and output buffer,
between x(t) and y(t) in the real-time environment; off-line part of
G.P.C., containing the software routine which simulates the data block
vector and implements the filter algorithm, with data-in to the R.A.M.
space.)
* Direct memory access.
10.5 Application to time-varying digital filters. 
Many practical signal processing applications require the use of digital
filters whose coefficients have to be varied at specified time intervals,
e.g. in the simple digital modelling of speech production (Chapter 12,
Ref. 1), the vocal tract is simulated by a digital filter whose
coefficients vary, on average, every 10 msec.
Previous hardware filter designs based on R.O.M's cannot be used
in such cases. Our proposed technique, however, can be used, provided
the period of coefficient updating is greater than the time required
for the completion of the programming phase.
The general scheme is shown in Fig. 10.5, which is basically the
structure shown in Fig. 10.1 with the addition of the extra R.A.M. and
the multiplexers MUX-2 and MUX-3.
Each R.A.M. operates alternately in real-time. While R.A.M.1,
say, is filtering real-time signals, R.A.M.2 is being loaded with the
look-up table of the new filter characteristics.
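
The alternation of the two memories, and the timing condition stated above, can be sketched as follows. The sketch assumes a single coefficient block, unsigned values, and a fixed write time per table entry; the names `PingPongFilter`, `build_table` and `update_is_feasible`, and the numerical values used, are hypothetical.

```python
def build_table(coeffs, p_data):
    """Look-up table of sum_i A_i * X_{n-i} over all data block vectors."""
    mask = (1 << p_data) - 1
    table = []
    for address in range(1 << (3 * p_data)):
        x_vec = [(address >> (p_data * j)) & mask for j in range(3)]
        table.append(sum(a * x for a, x in zip(coeffs, x_vec)))
    return table


class PingPongFilter:
    """Two R.A.M.s: one serves real-time look-ups while the other is loaded."""

    def __init__(self, p_data, initial_coeffs):
        self.p = p_data
        self.rams = [build_table(initial_coeffs, p_data), None]
        self.active = 0

    def load_standby(self, coeffs):
        """Programming phase for the new coefficients (off the real-time path)."""
        self.rams[1 - self.active] = build_table(coeffs, self.p)

    def swap(self):
        """Switch the multiplexers so that the freshly loaded R.A.M. goes live."""
        if self.rams[1 - self.active] is None:
            raise RuntimeError("standby R.A.M. has not been loaded")
        self.active = 1 - self.active

    def output(self, x_vec):
        address = 0
        for j, x in enumerate(x_vec):
            address |= x << (self.p * j)
        return self.rams[self.active][address]


def update_is_feasible(update_period_s, table_entries, write_time_s):
    """The update period must exceed the time needed to reprogram one table."""
    return update_period_s > table_entries * write_time_s


filt = PingPongFilter(p_data=2, initial_coeffs=(1, 1, 1))
before = filt.output((1, 2, 3))
filt.load_standby((2, 0, 1))
during = filt.output((1, 2, 3))      # still the old characteristics
filt.swap()
after = filt.output((1, 2, 3))       # new characteristics
print(before, during, after)
```

For example, with an assumed 64-entry table and an assumed 1 µS per write, `update_is_feasible(10e-3, 64, 1e-6)` holds comfortably for coefficients that change every 10 msec.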
10.6 General digital filter system.
Up to now we have illustrated our various proposals for the
realisation of digital filters using a non-recursive second-order
section. The methods may be directly generalised to include the
recursive part as well. Once the general second-order section has
been implemented, standard techniques may be employed to realise
higher order filters.
10.6.0 General second-order section. 
As explained in Chapter 2, the general second-order digital
filter consists of both a non-recursive and a recursive part
described by the difference equation

\[ Y_n = \Big[ \sum_{i=0}^{2} A_i X_{n-i} - \sum_{i=1}^{2} B_i Y_{n-i} \Big]_Q^{*} \qquad \ldots (10.0) \]

\[ \phantom{Y_n} = [Z_n + W_n]_Q \]

where $Z_n = \sum_{i=0}^{2} A_i X_{n-i}$ and $W_n = -\sum_{i=1}^{2} B_i Y_{n-i}$.
(See also Fig. 2 on page 196).
For simplicity, we assume that $B_i$ and $Y_{n-i}$ are represented by
the same number of bits required to represent $A_i$ and $X_{n-i}$
respectively. Furthermore, $B_i$ and $Y_{n-i}$ are partitioned in the
same way as $A_i$ and $X_{n-i}$ are.
Applying the method in Section 8.2 and using equation (14) on
page 197, we obtain the following expression for the modular realisation
of the general second-order digital filter, i.e.

\[ Y_n = \Big[ K \Big\{ \sum_{i=0}^{2} (A_{i,k''})(X_{n-i,k'}) - \sum_{i=1}^{2} (B_{i,k''})(Y_{n-i,k'}) \Big\} \Big]_Q \qquad \ldots (10.1) \]

where

\[ K = \sum_{k'=0}^{b'-1} (2^{p'})^{k'} \sum_{k''=0}^{b''-1} (2^{p''})^{k''} . \]

* $[\,\cdot\,]_Q$ means that the value in the brackets is rounded-off to
Q bits.
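
The identity underlying this modular expression can be checked numerically. The sketch below is an illustration under simplifying assumptions: all words are treated as unsigned integers, the rounding operation is omitted, and the 8-bit words are split into four 2-bit data blocks and two 4-bit coefficient blocks purely as an example. The function names are hypothetical.

```python
def split_blocks(value, block_bits, n_blocks):
    """Partition a word into n_blocks digits of block_bits each, LSB first."""
    mask = (1 << block_bits) - 1
    return [(value >> (block_bits * k)) & mask for k in range(n_blocks)]


def modular_sum(coeffs, data, p_c, b_c, p_d, b_d):
    """Weighted sum over k', k'' of the digit-convolution module outputs."""
    c_blocks = [split_blocks(a, p_c, b_c) for a in coeffs]
    d_blocks = [split_blocks(x, p_d, b_d) for x in data]
    total = 0
    for k1 in range(b_d):                 # data block index k'
        for k2 in range(b_c):             # coefficient block index k''
            module = sum(c[k2] * d[k1] for c, d in zip(c_blocks, d_blocks))
            total += module << (p_d * k1 + p_c * k2)
    return total


def general_section(A, X, B, Y_prev, p_c=4, b_c=2, p_d=2, b_d=4):
    """Unrounded output: non-recursive part minus recursive part."""
    z = modular_sum(A, X, p_c, b_c, p_d, b_d)
    w = modular_sum(B, Y_prev, p_c, b_c, p_d, b_d)
    return z - w


A = (255, 94, 35)          # A_0, A_1, A_2 as 8-bit integers
X = (200, 17, 129)         # X_n, X_{n-1}, X_{n-2}
B = (63, 127)              # B_1, B_2
Y_prev = (50, 9)           # Y_{n-1}, Y_{n-2}

direct = sum(a * x for a, x in zip(A, X)) - sum(b * y for b, y in zip(B, Y_prev))
modular = general_section(A, X, B, Y_prev)
print(direct, modular)
```

Each inner `module` value is what a single digit-convolution module would supply, so the double loop corresponds to the weighted summation denoted by K.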
The term in the curly brackets is the generalised digit-convolution
module, and consists of a pair of modules each having a hardware
structure similar to that of the simple basic digit-convolution module.
The modular realisation of the general second-order section
resulting from equation (10.1) is shown in Figs. 10.6 and 10.7.
If the generalised convolution module is implemented as a stored-
logic unit, two R.O.M./R.A.M. units are needed. Of course, if memory
units of sufficient storage capacity are available, the complete
generalised digit-convolution module may be implemented as a look-up
table.
10.6.1 General high-order filters. 
It is well known (Chapter 2) that it is convenient in practice
to realise a high-order filter as either a cascade or a parallel
connection of basic second-order sections. Also, to take advantage
of digital techniques, these connections are usually implemented using
a single time-multiplexed basic second-order section.
Typical schemes are shown in Figs. 10.8(a) and (b). In the former,
after the output for the first section has been computed it is fed back
to the basic section via the input multiplexer. Outputs for successive
sections are obtained in this way. The coefficients for the second-order
sections are obtained from a circulating store.
The implementation of the parallel connection is shown in Fig. 10.8(b)
in which the outputs for successive sections are accumulated and
rounded-off.
These multiplexing techniques are well described in the
literature [1,5,4,7,50]. Also a recent paper [67] on a filter hardware
laboratory is very instructive.
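The two multiplexing schemes can be sketched in software as follows. This is only an illustration under stated assumptions: exact integer arithmetic with no round-off, one private set of delayed values per section, and a plain list standing in for the circulating coefficient store. All names are hypothetical.

```python
def second_order_section(coeffs, state, x):
    """One evaluation of the shared basic section.

    coeffs = (A0, A1, A2, B1, B2); state = [x1, x2, y1, y2] holds the
    delayed inputs and outputs belonging to the section being served.
    """
    a0, a1, a2, b1, b2 = coeffs
    x1, x2, y1, y2 = state
    y = a0 * x + a1 * x1 + a2 * x2 - b1 * y1 - b2 * y2
    state[:] = [x, x1, y, y1]     # shift the delay lines in place
    return y


def cascade_multiplexed(coeff_store, states, x):
    """Cascade form: each section output is fed back as the next input."""
    value = x
    for coeffs, state in zip(coeff_store, states):
        value = second_order_section(coeffs, state, value)
    return value


def parallel_multiplexed(coeff_store, states, x):
    """Parallel form: every section sees x and the outputs are accumulated."""
    acc = 0
    for coeffs, state in zip(coeff_store, states):
        acc += second_order_section(coeffs, state, x)
    return acc
```

For example, cascading a first-difference section with an accumulator section returns the input impulse unchanged, which is a quick sanity check that the feedback path through the single shared section is wired correctly.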
Fig. 10.5. Time-varying filter using programmable
convolution modules.
(Block labels legible in the figure: coefficient block store, address,
RAM 1, RAM 2, simulated filter.)
Fig. 10.6. Modular realisation of the general
second-order filter.
(Labels legible in the figure: groups $G_{b'-1}, \ldots, G_{k'}, \ldots, G_0$
fed by the digits $X_{n,k'}$ and $Y_{n,k'}$; adder and round-off; output
buffer; $Y_n$.)
Fig. 10.7. Typical $G_{k'}$ group implemented from
recursive and non-recursive pairs of
digit-convolution modules.
(Labels legible in the figure: registers, NRM, RM, $G_{k'}$ output.)
Fig. 10.8. Multiplexing a single second-order section
to implement a high-order filter realised
in the cascade (a) and parallel (b) forms.
(Labels legible in the figure: (a) second-order section [S.O.S.],
multiplexers, circulating store of coefficients $\{A_i, B_i\}$;
(b) [S.O.S.], accumulator, coefficient store.)
10.7 Recent proposals for the hardware implementation of digital filters. 
In recent years, most of the published proposals have been either
extensions [58-61] or generalisations [62,63] of the basic structures
proposed by Croisier et al. [6] and developed by Peled and Liu [7]. Some
other approaches, however, have been suggested.
De Mori et al. [64] describe a special-purpose processor for both
digital filtering and fast Fourier transformation (FFT), using
emitter-coupled logic (ECL) hardware components. Its main feature is a
parallel multiplier which requires about 15% less hardware than normal
implementations. This is because the least significant weight bits
in the partial products are not processed. Instead a correcting bias
is added. This multiplier performs a multiplication and two
double-precision additions simultaneously.
Peled [65], on the other hand, proposed a machine organisation of a
dedicated digital signal processor in which the filter coefficients are
represented in the specialised canonical signed-digit code. The
resulting realisation requires the minimum number of add/subtract
operations to implement the required multiplications and additions.
It promises a significantly better performance than existing realisations
using standard multiplier packages.
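To illustrate why a signed-digit coefficient code reduces the add/subtract count, the sketch below recodes an integer coefficient into the standard canonical signed-digit (non-adjacent) form and multiplies by shift-and-add. It is an assumption that this textbook recoding is close to the "specialised" code of [65]; the function names are hypothetical and integer coefficients are assumed.

```python
def to_csd(value):
    """Return signed digits in {-1, 0, +1}, least significant first,
    such that no two adjacent digits are both non-zero."""
    digits = []
    while value != 0:
        if value % 2 == 0:
            digits.append(0)
        else:
            # +1 when value is 1 modulo 4, -1 when value is 3 modulo 4
            d = 2 - (value % 4)
            digits.append(d)
            value -= d
        value //= 2
    return digits


def csd_multiply(x, coefficient):
    """Multiply x by an integer coefficient using one add or subtract
    per non-zero signed digit of the coefficient."""
    acc = 0
    for k, d in enumerate(to_csd(coefficient)):
        if d == 1:
            acc += x * (1 << k)
        elif d == -1:
            acc -= x * (1 << k)
    return acc
```

For instance, the coefficient 23 is 10111 in binary (four non-zero bits), whereas its signed-digit form 32 - 8 - 1 has three non-zero digits, so one add/subtract operation is saved.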
De Mori also proposed an interesting implementation scheme [66] based
on logic-in-memory cellular arrays. The resulting structures allow
very fast filters to be designed because the time required for a single
multiplication (due to a large overlapping between the execution of
the overall multiplications) gives only an additive contribution to
the total time required to compute an output sample.
The iterative nature of De Mori's filter structure is most suitable
for special custom-designed L.S.I. implementation of high-frequency
digital filters.
As the research on finding good hardware structures for digital
filters is actively progressing, there are certainly other interesting
ideas yet to come.
10.8 Conclusions. 
An interesting approach to the realisation of digital
filters using programmable stored-logic digit-convolution modules
has been proposed and developed. The scheme promises to unify what
were previously apparently different realisation approaches.
The modular approach has also been extended to the general
second-order digital filter by the concept of a generalised
digit-convolution module.
Finally, other interesting implementation proposals published
recently were briefly mentioned and commented upon.
APPENDIX 10.0 
APPLIED IDEAS
Versatile digital arithmetic unit with rams
A common method of implementing fast arithmetic circuits
is to realise them as look-up tables using semiconductor read-only
memories, but they are still expensive for the general user to
purchase and programme. In addition, their contents cannot
be altered to suit different operating parameters. Even with
field-programmable and erasable roms, it still takes time to prepare
the data paper tapes and to erase previously stored contents
with an ultra-violet source.
A simple and efficient alternative makes use of random access
read/write memories instead. As shown in Fig. 1, the ram, when
operating in real time, is addressed by the signal from the
environment and outputs the relevant word to it.
The ram is volatile, so that it has to have its contents written
every time the system is switched on. But because the contents are
computable, the ram is easily programmed using an operand
simulator (os) and slow arithmetic unit (sau). The os generates
all possible combinations of input values, and the sau, which
is easily designed with conventional serial arithmetic techniques,
computes the required arithmetic function.
As an example, a binary digital filter is shown in Fig. 2. It has
four six-bit weighting coefficients, $A_0$, $A_1$, $A_2$ and $A_3$, and its
output $Y_n$ is given by:

$$Y_n = \sum_{k=0}^{3} X_{n-k} A_k \qquad (1)$$

A table look-up realisation of the portion of the filter which is
enclosed by the broken lines in Fig. 2 would require a 16-word
by eight-bit ram addressed by the binary vector
$(X_n, X_{n-1}, X_{n-2}, X_{n-3})$.

Fig. 1: Digital arithmetic unit.
Fig. 2: Binary digital filter with six-bit coefficients.
Fig. 3: Implementation of binary filter using a ram.
The complete circuit is detailed in Fig. 3. The os is a
simple four-bit counter type SN7493, while the sau is a four-word
serial one-bit adder which is made up of a pair of full
adders type SN74183 and an SN7482 two-bit adder. The filter
output $Y_n$ is computed as follows. Expressing the weighted
coefficients in binary:

$$A_k = \sum_{j=0}^{5} a_{k,j}\, 2^j \qquad (2)$$

where k = 0, 1, 2, 3 and $a_{k,j}$ = 0 or 1. By putting (2) in (1)
and re-ordering the double summation:

$$Y_n = \sum_{j=0}^{5} \sum_{k=0}^{3} X_{n-k}\, a_{k,j}\, 2^j \qquad (3)$$
(Fig. 1, legible labels: mode switch, enable/inhibit, write/read, slow
arithmetic unit, operand simulator, random access memory, real-time
input, real-time output. Fig. 2, legible labels: $X_{n-3}$, 1 x 4-bit
multipliers.)
Thus coefficient bits of the same significance are added in one
bit time, each bit $a_{k,j}$ being weighted by the relevant simulated
data bit $X_{n-k}$, for every combination of the os output.
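The programming phase described above can be sketched in software. This is a minimal model under stated assumptions, not the circuit of Fig. 3: a counter plays the operand simulator, bit k of the counter value is taken to stand for the data bit $X_{n-k}$ (the actual address wiring is not recoverable from the text), and the inner loop mimics the bit-serial summation of equation (3). The function and parameter names are illustrative.

```python
def program_ram(coeffs, coeff_bits=6):
    """Fill a look-up table holding Y_n for every combination of data bits.

    coeffs = [A_0, A_1, A_2, A_3], each a non-negative coeff_bits-bit integer.
    """
    ram = [0] * (1 << len(coeffs))
    for address in range(len(ram)):              # operand simulator (counter)
        # Assumed mapping: bit k of the address represents X_{n-k}.
        x = [(address >> k) & 1 for k in range(len(coeffs))]
        acc = 0
        for j in range(coeff_bits):              # one bit time per significance
            # Add the coefficient bits of significance j, gated by the data bits.
            column = sum(x[k] * ((coeffs[k] >> j) & 1)
                         for k in range(len(coeffs)))
            acc += column * (1 << j)
        ram[address] = acc                       # "write enable" strobe
    return ram
```

Once the table is filled, real-time operation is a single indexed read per output sample, which is the source of the speed advantage claimed for the technique. With four six-bit coefficients the largest stored word is 4 x 63 = 252, so eight bits per word suffice.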
In operation, the weights $A_0$ to $A_3$ are entered serially, via the
SN74157 two-to-one multiplexer, into the SN7491 eight-bit registers.
This is done by connecting the programme clock
line to S4, which is used as a manual clock. Each weight is
padded by two zeroes following its most significant bit. S1 is set
to one, the counters are reset to zero, and the ram address is now
switched to the os output via S2.
The programme clock line is reconnected to the clock, S4 set
to one, and the clock is initiated.
The three-bit and four-bit counters and the associated nand
logic are designed so that, after every eight clock pulses, the
write enable of the ram is strobed to zero, writing in the
relevant filter output which has meanwhile been computed. The
next clock pulse brings the write enable back to one and
clocks the os to a new four-bit address, with two nand gates
between the counters preventing data being written into the wrong
address. After another eight pulses the process is repeated.
The system has been designed to stop automatically after the os
output has reached (1,1,1,1) and the necessary arithmetic
corresponding to this address has been duly completed. S4 is
now set to zero, disenabling the clock and the counters reset to
zero, thus holding the write enable to one. After switching
the ram address back to the environment input, the memory
is now ready for real-time application.
Digital arithmetic units built with this technique are fast in
operation, with a 30 ns data rate typical for the example given,
simple and inexpensive, since rams are general purpose msi/lsi
devices, and extremely versatile, since operand parameters may be
altered quickly.
The slow arithmetic unit and the ram may be used in their
more traditional roles when the system is not operating in the
fast mode.
M. A. Bin Nun, Department of Electronics and Electrical
Engineering, University of Technology, Loughborough, Leics.
(Fig. 3, legible labels: coefficient switches $A_{3,j}$, $A_{2,j}$, $A_{1,j}$,
$A_{0,j}$; devices 74157, 7408, 74183, 7491 and 7489; select, strobe,
clock, mode, write and memory enables; input signal (RAM address).)
Electronic Engineering June 1976 19 
CHAPTER 11 
REVIEW AND RECOMMENDATIONS 
11.0 Introduction.
We recall that the purpose of the research reported in this 
Thesis is to develop a systematic hardware realisation theory of 
digital filters which will logically link their formal analytical 
designs and their hardwired practical implementations. 
In contrast to existing techniques, a systems approach was 
adopted for the general investigation of possible modular architectures 
for the basic second-order digital filter. In particular, we proposed 
and developed two novel methods. In the first, we modelled the 
complete second-order section as a finite-state sequential machine 
(F.S.M.), which was then analysed using the theory of machine 
decomposition using S.P. partitions. In the second method, we 
analysed the internal computation of the filter algorithm by developing
the idea of digit convolutions. 
Below we review briefly the main results of our investigations 
and in Section 11.2 we recommend the possible directions along which 
the foundation presented in this Thesis may be extended. 
11.1 Review of main results.
When a non-recursive second-order digital filter is modelled
directly as an F.S.M., we showed that the state transition function
of the resulting model is already in its simplest form. In addition,
we proved that this model is also a minimally reduced machine. A
partial state reduction is possible, however, if the filter output
is expressed as a multi-component word. For the recursive sections,
it is possible to minimise their F.S.M. models. Some of the reduced
machines also contain S.P. partitions. State minimisation, however,
becomes less useful with increasing wordlengths, and the
non-linearities introduced in the filter transfer function by quantisation
effects make it difficult to generalise the results found.
By applying the same modelling and analysis technique to the
adder and multiplier units making up the filter, we first showed
that modulo $2^N$ adders and multipliers may be realised as loop-free
cascade interconnections of sub-machines which require less memory
space to implement than the direct stored-logic implementation. In
addition, we further developed the stored-logic implementation of
the conventional N-bit parallel adder, and two useful F.S.M. models
of the N-bit by N-bit parallel multiplier.
We then generalised our findings to adders and multipliers
modulo an arbitrary base M, and showed that the partition lattices 
of their F.S.M. models are easily generated from the lattice of the 
divisors of M under the 'factor' relation. The understanding gained 
was useful in showing that a second-order section, suitably and 
realistically simplified, possesses a regular algebraic decomposition 
structure. We also introduced the concept of the homomorphic images 
of an F.S.M. filter and their corresponding lattice. 
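The link between the divisors of M and closed (S.P.) partitions can be checked by brute force. The sketch below assumes a simple machine model that is not taken from the thesis: the state is a residue modulo M, the input is the other operand, and the next state is their sum or product modulo M. It then tests whether grouping states by their residue modulo d is preserved by every transition. All names are hypothetical.

```python
def divisors(m):
    """All positive divisors of m, in increasing order."""
    return [d for d in range(1, m + 1) if m % d == 0]


def add_mod(s, x, m):
    return (s + x) % m


def mul_mod(s, x, m):
    return (s * x) % m


def has_substitution_property(m, d, op):
    """True if states in the same residue class modulo d always move,
    under every input, to states that again share a class modulo d."""
    for x in range(m):                      # every input symbol
        for s in range(m):
            for t in range(s + d, m, d):    # t lies in the same block as s
                if op(s, x, m) % d != op(t, x, m) % d:
                    return False
    return True
```

Under this model every divisor d of M yields a closed partition, and the corresponding quotient machine is simply the modulo-d adder or multiplier, so the closed partitions found this way are ordered exactly as the divisors are under divisibility. A non-divisor such as d = 5 for M = 12 fails the test.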
As an interesting 'spin-off', we found that, as an alternative 
to the loop-free structure, a modulo $2^N$ multiplier may be implemented
in a novel way which requires a low full-adder count and a propagation 
delay that is essentially independent of wordlengths. 
Using our second method, we extracted one possible basic
computational unit for digital convolution which we termed the
digit-convolution module (D.C.M.). The second-order digital filter
may now be regarded as a regular interconnection of D.C.M's. This
modular approach favours the digital designer since it is easy to
construct, test and maintain the filter hardware, the circuit
structure is directly expandable in terms of computational accuracy,
and is also flexible in its processing modes.
The modular theory was consolidated and its essential attractive 
features brought out by the construction of a practical real-time 
prototype filter using semiconductor memories. 
Finally, we introduced and developed the concept of the Y ~ Y
complementary pair of 'slow' and 'fast' digit-convolution modules 
which unified what were previously apparently different approaches 
to digital filter realisations. 
11.2 Possible directions for development. 
The basic research that we have carried out has led to a useful
theoretical framework for the implementation of digital filters.
This may be used as a foundation for further research along the
following possible directions.
As an attractive alternative to read-only memories (R.O.M's),
a study may be made on the use of the newer programmable logic arrays
(P.L.A's) to implement the homomorphic images of modulo-M filters
and digit-convolution modules as stored-logic structures. Since, with
P.L.A's, only selected minterms of the logic variables need be
programmed, their use should lead to a more efficient 'packing' of
stored information.
With the modular realisation using D.C.M's, one could investigate
the application of pipelining to increase the overall throughput 
or computation rate of the filter section. This study may incorporate 
the analogue-to-digital and digital-to-analogue converters since 
they are also usually organised in groups of data digits. 
In spite of the rapid progress made in the technology of
microprocessors, they are still considered slow for most real-time
work if programmed to implement the digital convolution algorithm
directly. A more realistic processing rate should be possible,
however, if a microprocessor is used to implement only the
digit-convolution module. The overall filter is now realised as an array
of microprocessors. Even better performance may be obtained
if the newer bit-slice bipolar microprocessors are used instead.
Furthermore, each microprocessor in the array may be configured
into a Y ~ Y complementary module pair. The resulting filter will
be extremely flexible and fast. This approach is attractive as
semiconductor memories for microprocessors are getting larger in
capacity and faster in access time.
We also believe that the concept of the digit-convolution 
module is useful as a unit of hardware complexity and as a means 
to measure the comparative usefulness between different digital 
filter implementations. A theoretical study on this should result 
in a convenient analytical tool. 
Finally, as a specialist's project, the attractive implementation
of modulo $2^N$ multipliers using adder-pairs should be developed
further, especially to discover whether simple algorithms exist
for the necessary coding and decoding.
11.3 Conclusions. 
As there is, at the moment, tremendous activity in the
search for good hardware structures for practical digital filters,
it will not be long before real-time digital filters are as
common and as easy to build as active filters are today, with the
added features that are only possible with digital processing.
If the author's findings are seen to contribute in a modest
way towards that objective, it will more than recompense the effort
that has gone into the research reported in this Thesis.
REFERENCES 
1. Rabiner, L.R. and Gold, B., 'Theory and Application of Digital 
Signal Processing' (Prentice-Hall, 1975). 
2. Oppenheim, A.V. and Schafer, R.W., 'Digital Signal
Processing', (Prentice-Hall, Englewood Cliffs, N.J.,
1975).
3. Rabiner, L.R. and Rader, C.M. (eds.), 'Digital Signal Processing',
(IEEE Press, New York, 1972).
4. Jackson, L.B., Kaiser, J.F. and McDonald, H.S., 'An approach
to the implementation of digital filters', IEEE Trans. on
Audio and Electroacoustics, AU-16, No.3, pp.413-21,
September 1968.
5. Gabel, R.A., 'A parallel arithmetic hardware structure for
recursive digital filtering', IEEE Trans. on Acoustics,
Speech and Signal Processing, ASSP-22, No.4, pp.255-8,
August 1974.
6. Croisier, A., Esteban, D.J., Levilion, M.E. and Riso, V.,
U.S. Patent 3,777,130, December 1973.
7. Peled, A. and Liu, B., 'A new hardware realization of digital
filters', IEEE Trans. on Acoustics, Speech and Signal
Processing, ASSP-22, No.6, pp.456-62, December 1974.
8. Zohar, S., 'New hardware realisations of non-recursive digital
filters', IEEE Trans. on Computers, Vol. C-22, No.4, April 1973.
9. Lockhart, G.B., 'Digital encoding and filtering using delta 
modulation', Conference on Digital Processing of Signals 
in Communications, University of Technology, Loughborough, 
11-13 April 1972. 
10. Minsky, M.L., 'Computation: Finite and Infinite Machines' 
(Prentice-Hall International, 1972). 
11. Booth, T.L., 'Sequential Machines and Automata Theory'
(John Wiley & Sons, 1967).
12. Hartmanis, J. and Stearns, R.E., 'Algebraic Structure Theory
of Sequential Machines' (Prentice-Hall, 1966).
13. Howard, B.V., 'Partition methods for read-only memory sequential
machines', Electronics Letters, Vol. 8, No.13, pp.334-336,
29 June 1972.
14. Lewin, D., 'Outstanding problems in logic design', The Radio
and Electronic Engineer, Vol. 44, No.1, pp.9-17, January 1974.
15. Kvamme, F., 'Standard read only memories simplify complex
logic design', Electronics, 43, No.1, pp.88-95, 1 January 1970.
16. Uspensky, J.V. and Heaslet, M.A., 'Elementary Number Theory'
(McGraw-Hill, U.S.A., 1939).
17. Ackroyd, M.H., 'Digital Filters', (Butterworths, 1973). 
18. Bogner, R.E. and Constantinides, A.G., Eds., 'Introduction to
Digital Filtering' (Wiley, 1975).
19. Oppenheim, A.V. and Weinstein, C.J., 'Effects of finite register
length in digital filtering and the fast Fourier transform',
Proc. IEEE, Vol. 60, No.8, pp.957-976, August 1972.
20. Liu, B., 'Effect of finite word length on the accuracy of
digital filters - a review', IEEE Trans. on Circuit Theory,
CT-18, No.6, pp.670-7, November 1971.
21. Lewin, D., 'Theory and Design of Digital Computers', (Wiley, 
New York, 1972). 
22. Freeny, S.L., 'Special-purpose hardware for digital filtering', 
Proc. IEEE, 63, No.4, pp.633-48, April, 1975. 
23. Allen, J., 'Computer architecture for signal processing',
Proc. IEEE, Vol. 63, No.4, pp.624-633, April 1975.
24. Dadda, L., 'Some schemes for parallel multipliers', Alta
Frequenza, Vol. XXXIV, No.5, pp.349-356, Maggio 1965.
25. Dadda, L. and Ferrari, D., 'Digital multipliers: a unified
approach', Alta Frequenza, Vol. XXXVII, No.11, pp.1079-1086,
Novembre 1968.
26. Barna, A. and Porat, D.I., 'Integrated Circuits in Digital
Electronics', (John Wiley & Sons, 1973).
27. Blakeslee, T.R., 'Digital Design with Standard MSI and LSI', 
(John Wiley and Sons, USA, 1975). 
28. Little, W.D., 'An algorithm for high-speed digital filter', 
IEEE Trans. on Computers. Vol. C-23, No.5, pp.466-469, 
May 1974. 
29. Steele, R., 'Delta Modulation Systems', (Pentech Press,
London, 1975).
30. Croisier, A. and Riso, V., 'Digital filter for delta modulated
information', British Patent: 1346 216.
31. Peled, A. and Liu, B., 'A new approach to the realisation of
nonrecursive digital filters', IEEE Trans. on Audio and
Electroacoustics, Vol. AU-21, No.6, pp.477-484, December 1973.
32. Sypherd, A.D., 'Design of digital filters using read-only
memories', Proc. N.E.C., Vol. 25, pp.691-693, December 1969.
33. Tran-Thong and Liu, B., 'A recursive digital filter using d.p.c.m.',
IEEE Trans. on Communications, Vol. COM-24, No.1, pp.2-11,
January 1976.
34. Nussbaumer, H., 'Digital filters using read-only memories',
Electronics Letters, Vol.12, No.11, pp.294-295, 27 May 1976.
35. Chang, T.L., 'Binary read-only memory multiplier', Electronics
Letters, Vol. 9, No.25, pp.580-581, 13 December 1973.
36. Tomozawa, A., 'Nonrecursive digital filters with coefficients
of powers of two', International Conference on Communications,
Minneapolis-Minnesota, U.S.A., 17-19 June 1974.
37. Van Gerwen, P.J. et al, 'A new type of digital filter for
data transmission', IEEE Trans. on Communications, Vol. COM-23,
No.2, pp.222-234, February 1975.
38. Hall, E.L., Lynch, D.D., Dwyer, S.J., 'Generation of products
and quotients using approximate binary logarithms for
digital filtering applications', IEEE Trans., C-19, pp.97-105,
1970.
39. Kingsbury, N.G. and Rayner, P.J.W., 'Digital filtering using
logarithmic arithmetic', Electronics Letters, Vol.7, No.2,
28th January 1971.
40. Pye TMC, Ltd., London, 'Monolithic Modular Digital Filters',
IEEE International Solid-State Circuits Conference, 
February 1973. 
41. Electronics Review, 'Digital filter set costs under $200', 
Electronics, pp.38-40, 8 January 1976. 
42. Maclean, M.A. and Aspinall, D., 'A decimal adder using a stored
addition table', Proc. IEE, Paper No. 2389 M, pp.129-135,
July 1957.
43. Johnson, N., 'Improved binary multiplication system',
Electronics Letters, Vol. 9, No.1, pp.6-7, 11 January 1973.
44. McDowell, J., 'Large Bipolar ROMS and PROMS Revolutionize
Conventional Logic and System Design', Monolithic Memories
Inc., Applications Seminar, April 19th, 1973.
45. Almaini, A.E.A., 'A digital computer program for the generation
of closed partitions for sequential machines', Departmental
Memorandum, 98, Loughborough University of Technology, 1974.
46. Almaini, A.E.A. and Woodward, M.E., 'Computer program for
S.P. partitions of sequential machines', Electronics
Letters, Vol. 10, No.21, pp.445-446, 17 October 1974.
47. Niven, I. and Zuckermann, H.A., 'An Introduction to the Theory
of Numbers', (Wiley, 1973).
48. Scott, W.R., 'Group Theory', (Prentice-Hall, 1964).
49. Steiglitz, K., 'An Introduction to Discrete Systems', (John 
Wiley & Sons, 1974). 
50. Peled, A. and Liu, B., 'Digital Signal Processing', (John 
Wiley & Sons, 1976). 
51. Texas Instruments, 'Digital Integrated Circuits', Data Book 
Two, July 1971. 
52. Texas Instruments, 'System 74-Designer's Manual', 1973. 
53. Cattermole, K.W., 'Principles of Pulse Code Modulation', 
(ILIFFE Books, London, 1969). 
54. Fourier Analyzer Training Manual, Application Note, 140-0,
(Hewlett-Packard Co.). 
55. Fourier Analyzer System 5451A System Operating Manual, 
(Hewlett-Packard Co., 1972). 
56. Bin Nun, M.A. and Woodward, M.E., 'Realisation of programmable
digital filters using digit-convolution modules',
Conference on 'Digital Processing of Signals in Communications',
(IERE, IEE, IEEE), to be held at University of Technology,
Loughborough, Leics., 6-8 September 1977.
57. Lee, B.B., 'A programmable real-time digital filter', Final
year (1977) project report, Dept. of Electronic and Electrical
Engineering, University of Technology, Loughborough.
58. Yiu, K., 'On sign bit assignment for a vector multiplier',
Proc. IEEE, Vol. 64, No.3, pp.372-373, March 1976.
59. Yuen, C.K., 'On Little's digital filtering algorithm',
IEEE Trans. on Computers, Vol. C-26, No.3, p.309, March 1977.
60. Peled, A., Liu, B. and Steiglitz, K., 'A note on implementation
of digital filters', IEEE Trans. on Acoustics, Speech and
Signal Processing, Vol. ASSP-23, No.4, pp.387-389,
August 1975.
61. Büttner, M. and Schüßler, H., 'On structures for the implementation
of the distributed arithmetic', Nachrichtentechn. Z. 29 (1976)
H.6, S.472-477.
62. White, S.A., 'On mechanization of vector multiplication',
Proc. IEEE, Vol. 63, No.4, pp.730-731, April 1975.
63. Claasen, T.A.C.M., Mecklenbräuker, W.F.G. and Peek, J.B.H.,
'Some considerations on the implementation of digital
systems for signal processing', Philips Research Reports,
30, pp.73-84, 1975.
64. De Mori, R., Rivoira, S. and Serra, A., 'A special-purpose
computer for digital signal processing', IEEE Trans. on
Computers, Vol. C-24, No.12, pp.1202-1211, December 1975.
65. Peled, A., 'On the hardware implementation of digital signal
processors', IEEE Trans. on Acoustics, Speech, and Signal
Processing, Vol. ASSP-24, No.1, pp.76-86, February 1976.
66. De Mori, R., 'Cellular structures for implementing recursive
and non-recursive digital filters', The Radio and Electronic
Engineer, Vol. 46, No.4, pp.173-181, April 1976.
67. Bass, C.S., Gibson, D.J. and Leon, B.J., 'A laboratory for
digital filter instruction', IEEE Trans. on Circuits and
Systems, Vol. CAS-23, No.4, pp.212-221, April 1976.
--------
68. Mason, J., 'Group - A Concrete Introduction using Cayley 
Cards', (Transworld Publishers Ltd., 1975). 
69. Fraleigh, J.B., 'A First Course in Abstract Algebra' 
(Addison-Wesley Publishing Co., 1967). 
70. Herstein, I.N., 'Topics in Algebra' (Xerox College Publishing, 
1964). 
71. Kohavi, Z., 'Switching and Finite Automata Theory' 
(McGraw-Hill, 1970). 

