Low bit rate speech communication based on charge coupled device fourier transform processors by Davie, Malcolm Craig
LOW BIT RATE SPEECH COMMUNICATION 
BASED ON CHARGE COUPLED DEVICE 
FOURIER TRANSFORM PROCESSORS 
A thesis submitted to the Faculty of Science of the 
University of Edinburgh for the degree of 
Doctor of Philosophy 
by 
M C DAVIE B.Sc. 
Department of Electrical Engineering 	 Sept 1979 
DECLARATION OF ORIGINALITY 
This thesis, composed entirely by myself, reports on 
work conducted by myself in the Department of Electrical 
Engineering, University of Edinburgh, and the Advanced 
Development Division, Racal Group Services, Reading. 
ACKNOWLEDGEMENTS 
The author would like to express his sincere gratitude 
to Dr.P.M.Grant, Mr.R.J.Preston, Dr.M.A.Jack, Dr.J.M.Hannah
and Professor J.H.Collins for their supervision and 
encouragement given throughout this work. He would like to 
acknowledge Racal Group Services Ltd., Reading and the 
Science Research Council for financial assistance. 
Sincere thanks are also due to the many friends and 
members of staff, both in the Department and at Racal for 
their useful comments relating to this work. The advice and 
help given by Mr.J.N.Holmes and his colleagues at the Joint 
Speech Research Unit, Cheltenham are gratefully 
acknowledged. 	Finally, thanks are due to Mrs.M.C.Davie for 




Title Page	i
Abstract	ii
Declaration of Originality	iii
Acknowledgements	iv
Contents	v
Glossary of Abbreviations	x
CHAPTER 1 : INTRODUCTION	1
1.1	Advanced Analogue Signal Processing	1
1.2	Layout of Thesis	3
CHAPTER 2 : SPEECH AND VOCODERS	5
2.1	Human Speech Production	6
2.2	Pitch Detection	11
2.3	The Channel Vocoder	18
2.4	The Linear Predictive Vocoder	27
2.5	Other Vocoder Principles	29
CHAPTER 3 : THE CHARGE COUPLED DEVICE	33
3.1	Basic Principles	34
3.2	Charge Input and Output	38
3.2.1	Input Techniques	39
3.2.2	Output Techniques	42
3.3	Device Limitations and Defects	45
3.3.1	Transfer Efficiency	46
3.3.2	Noise	48
3.3.3	Dark Current	49
3.3.4	Peripheral On-chip Circuitry	50
3.4	The Transversal Filter	51
CHAPTER 4 : CCD FOURIER TRANSFORM PROCESSORS	53
4.1	Conventional Spectrum Analysers	54
4.2	The Fourier Transform	56
4.2.1	The Discrete Fourier Transform	58
4.2.2	The Fast Fourier Transform	62
4.2.3	On the Use of Weighting Functions	66
4.3	The Chirp-z Transform	69
4.3.1	Derivation	69
4.3.2	Implementation	74
4.3.3	Hardware Reduction	77
4.3.4	Inaccuracies and Limitations	82
4.4	The Sliding Chirp-z Transform	84
4.5	The Prime Transform	86
4.5.1	Derivation	86
4.5.2	Implementation	88
4.5.3	Hardware Reduction	89
4.5.4	Errors and Limitations	90
4.6	Comparison of Real-time Spectrum Analysers	91
CHAPTER 5 : THE DESIGN AND CONSTRUCTION OF A CCD
CHIRP-Z TRANSFORM PROCESSOR	94
5.1	Design Objectives	94
5.2	Computer Simulation	95
5.2.1	Graphical Analysis of the CZT	97
5.2.2	Premultiplier Quantisation Errors	103
5.2.3	CCD Tap Weight Tolerance	106
5.2.4	Charge Transfer Efficiency	107
5.2.5	Analogue Multiplier Accuracy	111
5.2.6	Phase Shifter Errors	111
5.2.7	Summary of Simulation Results	113
5.3	Implementation	115
5.3.1	The Premultiplier	116
5.3.2	The Convolver	120
5.3.3	Post Circuitry	124
5.3.4	Timing	129
5.3.5	90-Degree Phase Difference Network	134
5.3.6	Low-pass Filter	135
5.3.7	Physical Construction	140
5.4	Hardware Performance	140
CHAPTER 6 : THE ON-LINE COMPUTER SIMULATION OF A CCD
CHANNEL VOCODER	151
6.1	Computing Facilities	151
6.2	The Channel Analyser	155
6.2.1	Speech Input	156
6.2.2	Spectrum and Cepstrum Computation	160
6.2.3	Data Reduction and Quantisation	169
6.2.4	Pitch Extraction	176
6.2.5	Performance Comparison	186
6.2.6	Summary of Analyser Simulation
Conclusions	191
6.3	Channel Synthesiser Simulation	192
6.3.1	Impulse Response Generation	193
6.3.2	Synthesiser Excitation Sources	197
6.3.3	Reconstruction by Convolution	198
6.3.4	Synthesiser Performance	202
6.3.5	Summary of Synthesiser Simulation
Conclusions	206
CHAPTER 7 : THE OPTIMAL DESIGN OF A CCD BASED
CHANNEL VOCODER	208
7.1	Analyser Configuration	208
7.2	Synthesiser Configuration	217
7.3	Comparison with a CCD Parallel
Filter Bank Vocoder	222
CHAPTER 8 : CONCLUSIONS	226
References	230
Appendix A	238
Appendix B	242
Appendix C	247
GLOSSARY OF ABBREVIATIONS 
agc	automatic gain control
ARAM	Analogue Random Access Memory
BBD	Bucket Brigade Device
bps	bits per second
CCD	Charge Coupled Device
CMOS	Complementary Metal Oxide Silicon
CTD	Charge Transfer Device
CZT	Chirp Z-transform
DCT	Discrete Cosine Transform
DFT	Discrete Fourier Transform
DSAT	Double Sample Alternate Tap
FFT	Fast Fourier Transform
FILO	First In Last Out
FIR	Finite Impulse Response
FM	Frequency Modulation
FT	Fourier Transform
FTP	Fourier Transform Processor
IDFT	Inverse Discrete Fourier Transform
IF	Intermediate Frequency
IFT	Inverse Fourier Transform
LSB	Least Significant Bit
MDAC	Multiplying Digital to Analogue Convertor
MOS	Metal Oxide Semiconductor
N/S	Noise to Signal Ratio
PCM	Pulse Code Modulation














RAM	Random Access Memory
rms	root mean square
ROM	Read Only Memory
Sliding Chirp Z-transform
SIPO	Serial-In-Parallel-Out
TF	Transversal Filter
TRF	Tuned Radio Frequency
TTL	Transistor-Transistor Logic




CHAPTER 1
INTRODUCTION
1.1 ADVANCED ANALOGUE SIGNAL PROCESSING 
Low bit rate speech communication systems (vocoders)
have been available for many years now, but their
application areas have always remained extremely
specialised. One of the main reasons for this trend has
been the large size and expense associated with such
equipment, especially when the synthetic speech quality
achieved is rather poor. Within the last few years,
however, vocoder interest has been rekindled and stimulated
by the demand for digital communications. Efficient data
compression techniques are necessary since the increased
bandwidth inherent in digital coding is contrary to the
overriding philosophy that bandwidth conservation is
requisite [1].
At present, the two established systems for low bit
rate (2400bps) speech compression are the channel vocoder
[2] and the linear predictive vocoder [3]. The linear
predictive vocoder is becoming increasingly popular because
of its more elegant digital implementation. However, recent
advances in analogue signal processing may yet produce a
channel vocoder which is even more attractive in terms of
engineering premiums.
One technology which offers high density analogue
signal processing is the Charge Coupled Device (CCD) [4].
The CCD is a shift register which stores samples of analogue
information directly as charge packets. These charge
packets can be transferred along the register under the
control of an external clock to form a variable delay line
structure. Further, by adding a non-destructive tapping
technique, the CCD permits fabrication of very compact
Transversal Filters (TFs) [5], which form the basic building
blocks for many powerful signal processing modules [6].
Since the conception of CCD at the turn of the last
decade, industry's acceptance of CCD for analogue signal
processing has been laboured. This has been due to several
factors which include (a) the increased competition from
digital componentry, (b) the limited CCD operating
characteristics and (c) the slow development of CCD
peripheral integration. However, the CCD is now being
manufactured as a fully modular 'black box' which frees the
consumer from many of the 'setting-up' problems, and
advanced components such as Fourier Transform Processors
(FTPs) [6], adaptive filters [7] and correlators [8] are
finding extensive real-time application.
The intention of this thesis is to examine and
demonstrate the potential of analogue CCD in the realisation
of complex signal processing systems. In particular, the
application of CCD FTPs in a low bit rate channel vocoder is
investigated, since the market for a low power, small size
and low cost vocoder is potentially vast.
1.2 LAYOUT OF THESIS
Chapter 2 gives an introduction to the subject of
speech and vocoders, highlighting the particular areas in
which CCD may have application. The channel vocoder is 
described in detail since this algorithm is examined in 
later chapters. Some of the basic CCD principles are 
explained in chapter 3, along with a summary of the most 
important operational characteristics. The transversal 
filter, one of the most powerful analogue signal processing 
blocks, is introduced in this chapter. 
Chapter 4 investigates several of the many algorithms 
which have been proposed for real-time Fourier 
transformation and, in particular, compares the advantages 
and disadvantages of the CCD Chirp-Z Transform (CZT) and the 
CCD Prime Transform (PT). The detailed design and 
construction of a CCD CZT processor together with the 
performance limitations are presented in chapter 5.
Practical aspects of CCD system design are emphasised. 
In chapter 6, an extensive computer simulation of a 
novel channel vocoder is reported. Both the analyser and 
the synthesiser are based upon discrete Fourier transform 
processors. The simulation involved the design and
construction of a specialised 'intelligent' computer
terminal and this is described briefly. Practical
experience in CCD processors and the results of the above
simulation are merged in chapter 7 to suggest an optimal
hardware configuration for a CCD channel vocoder.
Finally, the most important achievements of this work
are summarised in chapter 8 and the conclusions suggest
suitable areas for research continuation.
CHAPTER 2 
SPEECH AND VOCODERS 
The advantages of digital communication are well known. 
For example, binary waveforms may be regenerated at stages 
along the transmission path without cumulative addition of 
noise and distortion. Also, the user is free to scramble 
the message in complex ways for secure or private 
transmission. The price paid for these important advantages 
is additional transmission bandwidth. In order to transmit 
a speech signal with 3kHz bandwidth and 40dB signal to noise 
ratio using direct Pulse Code Modulation (PCM), a data rate
of approximately 64000 bits per second (bps) is required. A 
normal 3kHz analogue communication channel will handle only 
2400 bps without equalisation. 
The solution is therefore to find an efficient coding
algorithm for speech, which permits more economical use of
the spectrum. If one examines the information content of
speech, assuming a vocabulary of $2^{14}$ words and a speaking
rate of 10 words per second, only 140 bps are required. Of
course, this figure does not include other information such
as emotional content and speaker characteristics.
Nevertheless, it is clear that the analogue speech signal
contains considerable redundancy because the human vocal
mechanism generates sounds by relatively slow articulatory
movements. A system which attempts to exploit this
redundancy is called a vocoder (short for voice coder). In
general, a vocoder operates by analysing speech in the
transmitter, generating a set of control parameters at a
much lower bit rate and synthesising the original speech in
the receiver. The amount of information which can be thrown
away depends on the application: military systems might
require only intelligibility whereas telephone systems
demand high quality.
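The bit rates quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the 8 kHz sampling rate and 8 bits per sample are assumptions chosen to reproduce the 64000 bps PCM figure, and the vocabulary of 2^14 words is assumed because it is the size that yields 140 bps at 10 words per second. The function names are hypothetical.

```python
import math

def pcm_bit_rate(sample_rate_hz, bits_per_sample):
    """Bit rate of direct PCM: one code word per sample."""
    return sample_rate_hz * bits_per_sample

def lexical_bit_rate(vocabulary_size, words_per_second):
    """Bits per second needed to identify each spoken word in the vocabulary."""
    return words_per_second * math.log2(vocabulary_size)

# Assumed PCM parameters (not stated in the text): 8 kHz sampling, 8 bits/sample.
pcm = pcm_bit_rate(8000, 8)

# Assumed vocabulary of 2**14 words, spoken at 10 words per second.
lexical = lexical_bit_rate(2 ** 14, 10)

# Compression needed to fit PCM speech into a 2400 bps channel.
compression = pcm / 2400

print(pcm, lexical, round(compression, 1))
```

On these assumptions a 2400 bps vocoder must compress direct PCM by a factor of roughly 27, while still carrying far more than the lexical minimum.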
Section 2.1 reviews briefly the most important
characteristics of the human speech mechanism. This is
necessary since any vocoder algorithm must capitalise on
various aspects of speech production; the optimum vocoder
will model the human mechanism exactly. Section 2.2
examines pitch detection algorithms which are vital for good
quality vocoding, and sections 2.3 through 2.5 summarise
some of the most important, established low bit rate vocoder
techniques. Other speech coding algorithms such as adaptive
delta modulation [9] and time encoded digital speech [10],
which are generally considered to have application in higher
bit rate communication channels (i.e. 4800 - 16000bps), are
not considered here.
2.1 HUMAN SPEECH PRODUCTION [11,12]
A cross-section through the human vocal mechanism is
shown schematically in Fig.2.1. The main vocal tract is a
non-uniform acoustical tube which starts at the pharynx and






Fig.2.1 Cross-section through the human vocal mechanism (labels: Larynx (Voice Box), Trachea (Windpipe), Oral Cavity, Teeth, Lips)
articulators, namely the lips, the tongue, the jaw and the 
velum, to provide the resonances and anti-resonances (poles 
and zeroes) which modify the energy/frequency distribution 
of the excitation source. The resonances of the vocal tract 
are normally known as the "formants" of speech. An
ancillary path for sound transmission is formed by the nasal 
tract, extending from the velum to the nostrils, and has 
essentially fixed characteristics. 
The excitation for the vocal tract is a controlled flow 
of air from the lungs which first passes through the larynx 
or voice-box. The larynx has a cartilage frame and houses 
two lips of ligament and muscle called the vocal cords.
When "voiced" speech is produced, the vocal cords are
held tensioned by cartilages and the Bernoulli effect makes
the slit between the cords (the glottis) open and close at a
rate determined by both the sub-glottal pressure and the
cord tension. The quasi-periodic pulses of air have a
triangular shape and the repetition rate, which is closely
related to the perceived pitch of the speech, lies between
50 and 400 times per second. Because of the triangular
shape, the vocal cord excitation source has a line spectrum
which falls off at approximately -12dB/octave (Fig.2.2a).
Voiced speech includes the vowels and several consonants
such as /l, r, m, n/ and a typical speech segment is shown in
Fig.2.2c.
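The -12dB/octave trend of a triangular excitation can be illustrated numerically. The sketch below is a simplification and not a glottal model: it assumes a symmetric triangular wave that fills the whole pitch period, sampled at 256 points, and compares the first and third harmonics (the even harmonics of this idealised wave are zero). The function names are hypothetical.

```python
import cmath
import math

def triangle_wave(n_samples):
    """One period of a symmetric triangular wave, peaking at mid-period."""
    half = n_samples // 2
    return [1.0 - abs(n - half) / half for n in range(n_samples)]

def harmonic_amplitude(signal, k):
    """Magnitude of the k-th harmonic from a direct DFT sum over one period."""
    n = len(signal)
    acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
              for i, x in enumerate(signal))
    return abs(acc) / n

period = triangle_wave(256)
a1 = harmonic_amplitude(period, 1)
a3 = harmonic_amplitude(period, 3)

# Level change between harmonics 1 and 3, expressed per octave.
octaves = math.log2(3)
slope_db_per_octave = 20 * math.log10(a3 / a1) / octaves

print(round(slope_db_per_octave, 1))
```

The harmonic amplitudes of the triangle fall as the inverse square of frequency, which is the source of the 12dB/octave figure.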
Fig.2.2 Typical Voiced Speech Segment (vowel /i/) (panel titles: vocal tract impulse response and transfer function; speech waveform, (a) * (b))
The second main excitation source is random noise and
is used in the production of "unvoiced" speech. In this
case the vocal cords are held wide apart and the air passes
uninterrupted through the larynx. A subsequent stricture in
the tract causes turbulent air flow creating acoustic random
noise. An example of unvoiced speech is the fricative
consonant /f/ which is produced by a labio-dental stricture
(upper teeth on lower lip). Another group of speech sounds,
called "voiced fricatives" (e.g. /z, v/), uses both the
turbulent and the vocal cord sources simultaneously.
The final type of excitation results from the build-up
of pressure that occurs when the vocal tract is completely
closed at some point. A sudden release of this pressure
causes a transient excitation of the vocal tract. If the
vocal cords were vibrating immediately before the closure,
the sound is called a "voiced stop consonant" (e.g. /b, d/)
and if the closure is preceded by silence, an "unvoiced stop
consonant" is produced (e.g. /p, t/).
Speech production represents only half of the human
communication system; the acoustic signal has to be received
by the ear and decoded into the appropriate neural stimuli.
Several aspects of the receiving process give an insight
into speech waveform redundancy. The acoustic pressure wave
received by the ear is converted into a mechanical vibration
by the tympanic membrane (eardrum) in the middle ear. This
vibration is amplified and transmitted to the cochlea in the
inner ear by a system of levers. The fluid filled cochlea
is partitioned by the basilar membrane which tapers in width
from base to apex. Because the cochlea is a rigid
structure, the input vibrations pump the cochlea fluid back
and forward and the basilar membrane vibrates at a position
dependent on the frequency. The organ of Corti, which rests
along the length of the basilar membrane and contains some
30,000 sensory cells, detects and converts the vibration
into electrical pulses to be fed in parallel along the
auditory nerve to the brain. How the brain decodes these
pulses is still unknown. However, it is clear that the ear
effectively performs a spectrum analysis and passes spectral
amplitude information to the brain. Both the frequency and
amplitude are logarithmically resolved. In addition, it has
been shown that the relative phase of the input signal does
not affect human hearing [11].
2.2 PITCH DETECTION
In almost all speech synthesisers, use is made of the
fact that, to a first approximation, the excitation source
and the vocal tract may be treated independently; electrical
models of speech production consist of a set of waveform
generators feeding a filter bank which represents the vocal
tract (Fig.2.3). The excitation source in Fig.2.3 does not
cater for voiced fricatives and stops because, in general,
the extra complexity does not provide a significant speech
quality improvement. Recognition of these sounds is left to
human perception.
Fig.2.3 Electrical Model for Speech Synthesis (excitation source: generator (V) and random noise (UV); vocal tract: oral cavity with articulator controls; output: speech)
Two control parameters are required for the synthesiser
excitation source: (a) a voiced/unvoiced (V/UV) input to
select the appropriate waveform generator and (b) a number
representing the pitch period for the voiced generator. In
a vocoder, the analyser has to estimate these parameters
automatically. The V/UV decision can either be computed
separately or can be a by-product of the pitch detection.
The most common V/UV detector compares the speech energy in
two frequency bands. For example, a typical system might
compare the energy in the band 200-600Hz with that in the
range 5000-7000Hz [13]. A high ratio of low frequency to
high frequency indicates voiced sound and a low ratio (<<1)
indicates unvoiced sound.
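The two-band energy comparison can be sketched as follows. This is a minimal illustration under stated assumptions, not the detector of Ref.[13]: the 16 kHz sampling rate, the 20 ms frame, the decision threshold of 1 and the use of a direct DFT to measure band energy are all choices made here, and pure tones stand in for voiced and unvoiced speech. The function names are hypothetical.

```python
import cmath
import math

def band_energy(frame, fs, f_lo, f_hi):
    """Sum of squared DFT magnitudes for bins lying between f_lo and f_hi."""
    n = len(frame)
    k_lo = math.ceil(f_lo * n / fs)
    k_hi = math.floor(f_hi * n / fs)
    energy = 0.0
    for k in range(k_lo, k_hi + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        energy += abs(acc) ** 2
    return energy

def is_voiced(frame, fs, threshold=1.0):
    """Voiced if the 200-600Hz energy exceeds threshold times the 5000-7000Hz energy."""
    low = band_energy(frame, fs, 200, 600)
    high = band_energy(frame, fs, 5000, 7000)
    return low > threshold * high

FS = 16000   # assumed sampling rate, high enough to cover the 7000Hz band edge
N = 320      # assumed 20 ms frame

# Stand-in test signals: a 300Hz tone (voiced-like) and a 6000Hz tone (fricative-like).
voiced_like = [math.sin(2 * math.pi * 300 * i / FS) for i in range(N)]
unvoiced_like = [math.sin(2 * math.pi * 6000 * i / FS) for i in range(N)]

print(is_voiced(voiced_like, FS), is_voiced(unvoiced_like, FS))
```

A practical detector would use real band-pass filters and a threshold tuned on speech; the comparison is written as a product rather than a ratio so that a silent high band does not cause a division by zero.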
At first glance, it would appear that the accurate and
reliable measurement of pitch period is relatively
straightforward. However, this is not the case for several
reasons and, in fact, pitch measurement turns out to be one
of the most difficult vocoder operations. In principle, the
fundamental frequency may be extracted by a low-pass filter.
In practice, the fundamental frequency varies over 3 or 4
octaves so that a fixed low-pass filter often extracts more
than the fundamental. Also, in many practical situations,
the fundamental frequency is not present or is greatly
attenuated (e.g. 300-3000Hz telephone channel). Another
reason for pitch detector inaccuracy is that the glottal
excitation waveform is not a perfect train of periodic
pulses. This results in a speech waveform varying both in
period and in the detailed structure within a period.
Over the years, considerable effort has been invested
in the search for a reliable pitch detector. The reason for
this quest is that the intonation and intelligibility of
synthetic speech depend to a large extent on the correct
pitch. Pitch detector errors cause very objectionable
effects. If, for example, the pitch detector selects the
second harmonic instead of the fundamental, the resultant
"squeak" not only sounds unnatural but causes the listener
to lose concentration temporarily, thereby masking a longer
section of the speech.
Pitch extraction techniques may be classed in two main 
categories: (a) time domain and (b) frequency domain. The 
most common time domain pitch detector in an analogue 
implementation is a tracking band-pass filter. This 
attempts to follow the fundamental frequency by assuming 
that the fundamental component has the largest amplitude. 
If the input speech is high quality (wide-band) with a good 
signal to noise ratio, then reasonable performance can be
maintained over a limited frequency range. However, because 
of the filter response time, fast pitch inflexions may 
temporarily unlock the filter. 
Another pitch detector makes use of parallel processing 
in the time domain [15] and relies on the philosophy that 
one simple measurement is unlikely to be satisfactory but, 
by combining the results of several measurements performed 
in parallel and taking a majority vote, a reliable answer is 
obtained. The pitch detector in Ref.[15] low-pass filters
the input speech to 900Hz and makes six parallel estimates 
of the pitch, based on peak and valley measurements. 
In recent years, autocorrelation pitch detectors have
become popular in digital systems. There are many
variations on this basic theme, but perhaps the most
successful is autocorrelation of centre clipped speech [16].
The centre clipping tends to remove the formant structure of
the speech and effectively flattens the spectrum. It is
then much easier to resolve the autocorrelation peak due to
pitch (i.e. autocorrelation peaks due to the formants are
suppressed). The pitch period is measured as the time lag
from a reference to the largest autocorrelation peak and,
normally, logic circuits after the correlator check the
















Fig.2.4 Centre Clipped Auto-correlation Pitch Detector (blocks: Auto Corr, Logic; signals: Peak Position, Pitch, V/UV)
A hardware implementation of the above approach has been 
reported [17] which uses both centre clipping and infinite 
clipping to ease computational problems. Another approach 
computes the average magnitude difference function [18] 
instead of the autocorrelation function so that 
multiplications are replaced by additions. 
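The two time domain estimators described above can be sketched in a few lines. Everything specific here is an assumption made for illustration: the 8 kHz sampling rate, the synthetic test signal (a damped 700Hz resonance re-excited every 80 samples, i.e. a 100Hz pitch), the clipping level of 30% of the peak and the lag search range of 20 to 160 samples. The function names are hypothetical and the code is not the implementation of Refs.[16-18].

```python
import math

FS = 8000      # assumed sampling rate, Hz
PERIOD = 80    # pitch period of the synthetic signal in samples (100Hz)

def synthetic_voiced(n_samples):
    """Damped 700Hz 'formant' restarted every PERIOD samples."""
    out = []
    for n in range(n_samples):
        m = n % PERIOD
        out.append(math.exp(-m / 20.0) * math.sin(2 * math.pi * 700 * m / FS))
    return out

def centre_clip(x, fraction=0.3):
    """Zero everything below the clipping level; shift the rest towards zero."""
    c = fraction * max(abs(v) for v in x)
    return [v - c if v > c else v + c if v < -c else 0.0 for v in x]

def autocorrelation_pitch(x, min_lag=20, max_lag=160):
    """Lag of the largest autocorrelation value of the centre clipped signal."""
    y = centre_clip(x)

    def r(lag):
        return sum(y[n] * y[n + lag] for n in range(len(y) - lag))

    return max(range(min_lag, max_lag + 1), key=r)

def amdf_pitch(x, min_lag=20, max_lag=160):
    """Lag of the deepest null of the average magnitude difference function."""
    def d(lag):
        count = len(x) - lag
        return sum(abs(x[n] - x[n + lag]) for n in range(count)) / count

    return min(range(min_lag, max_lag + 1), key=d)

speech = synthetic_voiced(480)
print(autocorrelation_pitch(speech), amdf_pitch(speech))
```

The average magnitude difference function needs only subtractions and additions, which is the saving referred to in the text; on real speech both estimators would be followed by decision logic to reject octave errors and unvoiced frames.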
The final time domain technique of significance makes
use of adaptive filtering. In one example, the coefficients
of a digital filter are updated to minimise the mean square
error function and the resulting residue approximates to the
glottal pulse train [19]. One advantage of this technique
is that in a linear predictive vocoder (section 2.4) the
filter structure already exists and the pitch period is
therefore a by-product. A slightly different technique
based on adaptive principles employs a recursive comb filter
which homes in on the speech line spectrum by minimising the
mean square output of the filter [20]. Since this method
involves only addition, it is faster than the inverse
filtering method.
A frequency domain pitch extractor has been described
by Noll [21] and is based on the cepstrum of speech, which
is defined as the Fourier transform of the logarithm of the
power spectrum. The non-linear operation on the spectrum
equalises the line harmonic amplitudes and effectively
de-emphasises the formant structure. Mathematically, the
log operation deconvolves the effect of the vocal tract and
the excitation source. The second transform measures
periodicity in the frequency domain. Fig.2.5 illustrates
the operations involved in the cepstral computation. The
time domain signal in Fig.2.5a is transformed to give the
line spectrum in Fig.2.5b which is subsequently logged
(Fig.2.5c) and Fourier transformed a second time to provide
the cepstrum (Fig.2.5d). (The cepstrum variable is called
"quefrency" and has units of time.) The cepstrum has a main
peak due to the speech periodicity and a series of smaller
peaks at low quefrencies due to the compressed formants.
When the cepstrum is computed for unvoiced speech, there is
no line spectrum and therefore no cepstral pitch peak, so
that an automatic V/UV decision can be made. As in the
autocorrelation technique, logic circuits are usually
necessary to decide if the pitch measurement is feasible.
It is generally considered that the cepstrum is an extremely
powerful technique because the input signal does not require
to be high quality.
Fig.2.5 Illustration of the Cepstrum: (a) time domain signal, (b) line spectrum, (c) log (power spectrum), (d) cepstrum
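The cepstral computation (transform, log of the power spectrum, second transform, peak search) can be sketched with a direct DFT. The details below are assumptions made for illustration and are not taken from Ref.[21]: a 256-point frame, a Hann window, a small floor added before the logarithm, a quefrency search range of 16 to 120 samples, and an idealised test signal that repeats exactly every 32 samples. The function names are hypothetical.

```python
import cmath
import math

def dft(seq):
    """Direct discrete Fourier transform (adequate for short frames)."""
    n = len(seq)
    twiddle = [cmath.exp(-2j * math.pi * k / n) for k in range(n)]
    return [sum(seq[i] * twiddle[(k * i) % n] for i in range(n))
            for k in range(n)]

def cepstrum(frame, floor=1e-12):
    """Magnitude of the transform of the log power spectrum of a windowed frame."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, x in enumerate(frame)]
    log_power = [math.log(abs(v) ** 2 + floor) for v in dft(windowed)]
    return [abs(v) for v in dft(log_power)]

def cepstral_pitch(frame, min_q=16, max_q=120):
    """Quefrency (in samples) of the largest cepstral peak in the search range."""
    c = cepstrum(frame)
    return max(range(min_q, max_q + 1), key=lambda q: c[q])

PERIOD = 32   # pitch period of the idealised test signal, in samples
frame = [math.exp(-(n % PERIOD) / 6.0) * math.sin(2 * math.pi * (n % PERIOD) / 7.0)
         for n in range(256)]

print(cepstral_pitch(frame))
```

The lower limit of the search range excludes the low quefrency region that carries the spectral envelope. On real speech the peak is much less pronounced than for this exactly periodic signal, which is why decision logic normally follows the peak picker.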
A comparative performance study of several of the above
pitch detectors was carried out by Rabiner et al. [22]. The
conclusion was that each detector has its own strengths and
weaknesses and no single detector was top ranked for all
cases of input signal. For example, the cepstrum was poor
on high pitch speakers whereas time domain techniques were
poor on low pitch speakers. Overall, the cepstrum technique
proved to be the best all-round performer but was probably
the most inefficient in terms of hardware implementation.
2.3 THE CHANNEL VOCODER
Historically, the channel vocoder was invented in 1939
by Homer Dudley [23] of Bell Telephone Laboratories. During
the 1940s and 1950s it was realised that vocoders might
have a useful role in communication systems and by the late
1950s practical systems were being developed. The basic
principles described by Dudley in 1939 remain today as an
established speech compression technique.
A block diagram of the channel vocoder is shown in
Fig.2.6. In the analyser, the main processing block is a
bank of contiguous band-pass filters arranged to cover
continuously the speech bandwidth of interest. The outputs
from these filters are rectified and low-pass filtered so
that an approximation to the short-time spectral envelope
[11] of the speech is available. Normally, the amplitude
components of the smoothed spectral envelope are sampled,
quantised logarithmically, multiplexed with pitch and
voicing information (section 2.2) into frames and
transmitted serially to the synthesiser. Data reduction is
achieved because:
1. phase information is not transmitted
2. only the smoothed envelope of the voiced speech
line spectrum is transmitted (in addition to the
line spectrum fundamental)
3. both amplitude and frequency are logarithmically
quantised.
In the synthesiser, the received data are inversely
decoded and fed in parallel to the appropriate circuit
elements. Speech is synthesised by summing the outputs from
a contiguous filter bank, similar to that in the analyser,
which has been input with weighted versions of an excitation
source. The particular source is selected by the voicing
control and the period of the periodic source is given by
the received pitch information. Finally, the synthetic
speech is filtered to remove the analyser pre-equalisation
and to compensate for the non-triangular excitation source.
Fig.2.6 The Channel Vocoder (labels: speech input, BPF (100 Hz), BPF (200 Hz), LPF (50 Hz), encoder, pitch pulse generator; BPF = band pass filter)
It is generally accepted that the minimum data rate 
which can be achieved by a channel vocoder (without special 
coding techniques [24]) is in the order of 2400 bps. At
this data rate, the speech has a mechanical quality but 
still maintains good intelligibility [2]. 
The channel vocoder fidelity depends on the design of
the contiguous filter banks [2]. Ideally, these should
consist of steep-sided filters narrow enough for no more
than a single harmonic to enter any one filter during
voicing. The spectrum envelope generated would thus be a
correct measure of the speech spectral energy at that time
and synthetic speech could be constructed exactly from this
data. The disadvantage of this filter bank, disregarding
practical considerations, would be that relatively little
bandwidth compression would result. For example, if the
lower limit of the pitch frequency is 50Hz then 80 parallel
filters are required to analyse a 4kHz bandwidth. As the
filter bandwidths are increased so that the total number of
filters can be reduced, a spectral distortion becomes
evident. This is because more than one harmonic appears in
some of the filters at low pitch frequencies. It is
exceptionally difficult to quantify this distortion in terms
of synthetic speech quality and generally, the number of
filters in the bank is chosen through practical experience.
Channel vocoder designs typically employ between 16 and 32
logarithmically spaced filters to cover a 4kHz bandwidth.
The filter characteristics for a 19-channel vocoder are
given in Table 2.1 [25].
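The trade-off between filter count and harmonic resolution can be made concrete with a short calculation. The sketch below is illustrative: it treats each filter as an ideal passband of the stated width (an assumption; real filters have skirts) and uses the first analysis channel of Table 2.1, centred at 240Hz with 120Hz bandwidth, as the example. The function names are hypothetical.

```python
import math

def filters_for_single_harmonic(bandwidth_hz, lowest_pitch_hz):
    """Number of contiguous filters needed so each holds at most one harmonic."""
    return math.ceil(bandwidth_hz / lowest_pitch_hz)

def harmonics_in_filter(centre_hz, filter_bw_hz, pitch_hz):
    """Count pitch harmonics falling inside an ideal passband (edges included)."""
    lo = centre_hz - filter_bw_hz / 2
    hi = centre_hz + filter_bw_hz / 2
    first = max(1, math.ceil(lo / pitch_hz))
    last = math.floor(hi / pitch_hz)
    return max(0, last - first + 1)

# 4kHz bandwidth with a 50Hz lowest pitch: the 80 filters quoted in the text.
print(filters_for_single_harmonic(4000, 50))

# Channel 1 of Table 2.1 (240Hz centre, 120Hz bandwidth) at low and high pitch.
print(harmonics_in_filter(240, 120, 50), harmonics_in_filter(240, 120, 250))
```

At a 50Hz pitch three harmonics share the first channel, which is the spectral distortion described above; at 250Hz only one harmonic is present.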
The individual filter characteristic is also an
important consideration. It can be deduced that sharp
cut-off filters will give a more accurate spectral
measurement; in practice, sharp cut-off filters have long
settling times so that their use would result in a smearing
of rapid spectral changes and subsequent reverberation
effects in the synthetic speech. Unequal filter time
delays, as might be the case if different channels employ
different bandwidths, also give rise to a temporal smearing
effect. The compromise utilised by most channel vocoders is
a 2-pole Butterworth characteristic [2].
Channel   Filter Centre   Analysis Filter   Synthesis Filter
Number    Freq. (Hz)      BW (Hz)           BW (Hz)
1         240             120               40
2         360             120               40
3         480             120               40
4         600             120               40
5         720             120               40
6         840             120               40
7         1000            150               40
8         1150            150               40
9         1300            150               40
10        1450            150               40
11        1600            150               40
12        1800            200               60
13        2000            200               60
14        2200            200               60
15        2400            200               60
16        2700            300               60
17        3000            300               60
18        3300            300               60
19        3760            500               60 (f0 3600)
19a       -               -                 500 (f0 3750)
Analysis Filters are second order Butterworth
Synthesis Filters are single tuned with alternate outputs summed in antiphase
Note: (a) Only 19 analysis channels
Synthesis filter 19 excited during voiced sounds
Synthesis filter 19a excited during unvoiced sounds
Table 2.1 Parameters for a 19 Channel Vocoder
The cut-off frequencies of the low-pass filters which
smooth or average the rectified band-pass filter outputs
have to be chosen to follow the slowly varying spectral
content of speech. Practical experience [26] has shown that
the spectral content of speech is fairly constant for
periods of 20ms but has probably changed significantly after
40ms. The smoothing filter is usually chosen to have a 3dB
attenuation at 25-35Hz and an 18dB/octave roll-off [2].
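The analyser chain of band-pass filter, rectifier and smoothing filter can be sketched for the channel allocation of Table 2.1. The code below is a simplified stand-in and makes several assumptions: an 8 kHz sampling rate, a two-pole digital resonator in place of each second order Butterworth filter, and a single-pole 30Hz smoother in place of the 18dB/octave low-pass filter. The function names are hypothetical.

```python
import math

FS = 8000  # assumed sampling rate, Hz

# (centre frequency, analysis bandwidth) in Hz, taken from Table 2.1.
CHANNELS = [(240, 120), (360, 120), (480, 120), (600, 120), (720, 120),
            (840, 120), (1000, 150), (1150, 150), (1300, 150), (1450, 150),
            (1600, 150), (1800, 200), (2000, 200), (2200, 200), (2400, 200),
            (2700, 300), (3000, 300), (3300, 300), (3760, 500)]

def channel_envelope(signal, centre_hz, bw_hz, smooth_hz=30.0):
    """Resonator band-pass, full-wave rectifier and one-pole smoother."""
    r = math.exp(-math.pi * bw_hz / FS)            # pole radius sets bandwidth
    c1 = 2 * r * math.cos(2 * math.pi * centre_hz / FS)
    c2 = -r * r
    g = (1 - r * r) / 2                            # roughly unity peak gain
    alpha = 1 - math.exp(-2 * math.pi * smooth_hz / FS)
    x1 = x2 = y1 = y2 = env = 0.0
    for x in signal:
        y = g * (x - x2) + c1 * y1 + c2 * y2       # band-pass filter
        x2, x1 = x1, x
        y2, y1 = y1, y
        env += alpha * (abs(y) - env)              # rectify and smooth
    return env

def analyse(signal):
    """Final smoothed envelope value for each of the 19 channels."""
    return [channel_envelope(signal, fc, bw) for fc, bw in CHANNELS]

# A 1000Hz test tone should produce the largest output in channel 7.
tone = [math.sin(2 * math.pi * 1000 * n / FS) for n in range(1600)]
envelopes = analyse(tone)
print(envelopes.index(max(envelopes)) + 1)
```

In a complete analyser each envelope would be sampled once per frame and quantised logarithmically before multiplexing with the pitch and voicing data.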
As in other speech processing systems, the large
dynamic range (> 60dB) associated with speaker variations
and conditions causes practical circuit
problems in vocoder implementations. Several techniques are
used to ease the situation. Pre-equalisation in the
analyser is designed to boost the high frequencies so that
the spectral energy is spread more evenly between the
channel filters. This boost is typically 6dB/octave from
1kHz [25]. (The vocal cord excitation source has a general
trend of approximately -12dB/octave which is differentiated
when the acoustic pressure wave is launched from the human
lips [11] so that the input speech to a vocoder has a
general trend of -6dB/octave.) Automatic gain control (agc)
may be applied to save up to 20dB dynamic range [27] but
this practice is not generally desirable since agc distorts
the speech and the vocoder is a non-linear system.
More recently, fully digital implementations of the
channel vocoder have been reported. In one example [28]
use is made of the filtering structure in Fig.2.7a. Speech
is simultaneously modulated by two quadrature sine waves of
the same frequency and the resultant waveforms are
separately low-pass filtered. The modulus of the quadrature
channels gives the spectral amplitude of the speech input
evaluated at the frequency of the modulating sinewaves. By
sequentially filtering the same segment of speech using
different modulating frequencies an equivalent to the
channel vocoder filter bank is achieved. In digital
hardware, the modulating frequencies are stored in read only
memory and the low-pass filters are accumulate and dump
algorithms.

[Fig.2.7 Digital Filtering for the Channel Vocoder: (a) Heterodyne Analyser; (b) 2nd Order Recursive Filter]
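The heterodyne analysis described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the 10kHz sample rate, the 20mS block length and the test tone are hypothetical choices and are not taken from [28]. The accumulate and dump low-pass filter is modelled as a plain sum over the block.

```python
import math

def heterodyne_amplitude(samples, freq_hz, sample_rate_hz):
    """Estimate the spectral amplitude of `samples` at `freq_hz`.

    The block is multiplied by two quadrature sinusoids and each
    product is low-pass filtered by an accumulate-and-dump sum.
    """
    acc_i = 0.0  # in-phase accumulator
    acc_q = 0.0  # quadrature accumulator
    for n, x in enumerate(samples):
        phase = 2.0 * math.pi * freq_hz * n / sample_rate_hz
        acc_i += x * math.cos(phase)
        acc_q += x * math.sin(phase)
    # Modulus of the quadrature channels, scaled so that a sine wave
    # of amplitude A at exactly freq_hz reads A over whole cycles.
    return 2.0 * math.hypot(acc_i, acc_q) / len(samples)

# Hypothetical test signal: 1 kHz tone, amplitude 0.5, 20 ms block
FS = 10_000.0
BLOCK = [0.5 * math.sin(2.0 * math.pi * 1000.0 * n / FS) for n in range(200)]

# The same block is analysed sequentially at several modulating
# frequencies, which is equivalent to a bank of channel filters.
CHANNELS = [240.0, 1000.0, 2000.0]
AMPLITUDES = [heterodyne_amplitude(BLOCK, f, FS) for f in CHANNELS]
```

Only the channel centred on the tone reports a significant amplitude; the small reading in the 240Hz channel is leakage caused by the block not containing a whole number of cycles at that frequency.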
An alternative digital channel vocoder [29] operates by
multiplexing a recursive band-pass filter. The filter shown
in Fig.2.7b has a Z transfer function given by
$$H(z) = \frac{G\,b\,(1 - z^{-2})}{1 - (2 - a - b)\,z^{-1} + (1 - b)\,z^{-2}} \qquad (2.1)$$
which has a centre frequency and a Q-factor approximately
proportional to $\sqrt{a}$ and $\sqrt{a}/b$ respectively. Using a filter of
this type it is relatively straightforward to update the
filter coefficients (stored in read only memory) and produce
a logarithmically spaced filter bank.
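As a hedged illustration of how the two coefficients set the filter, the sketch below assumes the transfer function has the form H(z) = G b (1 - z^-2) / (1 - (2 - a - b) z^-1 + (1 - b) z^-2). It computes the pole position, from which the centre frequency and an approximate Q-factor follow, and also runs the corresponding difference equation. The function names and the coefficient values are illustrative assumptions, not values from [29].

```python
import cmath
import math

def resonator_params(a, b, sample_rate_hz):
    """Centre frequency (Hz) and approximate Q of the assumed filter.

    Assumed denominator: 1 - (2 - a - b) z^-1 + (1 - b) z^-2
    """
    c1 = 2.0 - a - b
    c2 = 1.0 - b
    # Upper pole of the conjugate pair: root of z^2 - c1*z + c2 = 0
    pole = (c1 + cmath.sqrt(c1 * c1 - 4.0 * c2)) / 2.0
    radius = abs(pole)
    angle = abs(cmath.phase(pole))  # radians per sample
    centre_hz = angle * sample_rate_hz / (2.0 * math.pi)
    # Narrow-band approximation: 3 dB bandwidth ~ 2*(1 - radius) rad/sample
    q_factor = angle / (2.0 * (1.0 - radius))
    return centre_hz, q_factor

def bandpass(samples, a, b, gain=1.0):
    """Run the assumed second order difference equation."""
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x in samples:
        y = gain * b * (x - x2) + (2.0 - a - b) * y1 - (1.0 - b) * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out

# Illustrative coefficients only
A_COEFF, B_COEFF, FS = 0.04, 0.01, 10_000.0
CENTRE_HZ, Q_FACTOR = resonator_params(A_COEFF, B_COEFF, FS)
IMPULSE_RESPONSE = bandpass([1.0] + [0.0] * 63, A_COEFF, B_COEFF)
```

For small a and b the pole angle is close to the square root of a radians per sample and the Q-factor is close to the square root of a divided by b, so a filter bank can be retuned simply by reading a new coefficient pair from memory.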
Both of these digital filtering 	techniques 	have 
potential for Very Large Scale Integration (VLSI) and must 
be considered serious contenders for the single 	chip 
vocoder. 	However, at the present time, typical digital 
vocoder implementations employ in the region of 200 discrete 
integrated circuits and consume 15W of power [29].
One other channel vocoder implementation has recently 
been reported [30]. Here the filter bank is fabricated as a
single integrated circuit using 19 parallel finite impulse 
response (FIR) CCD filters. This approach to the CCD 
channel vocoder will be discussed in more detail in chapter 
7 and compared to the alternative CCD channel vocoder 
studied by the author. 
2.4 THE LINEAR PREDICTIVE VOCODER
An increasingly popular technique for low bit rate 
speech analysis and synthesis employs the properties of 
linear prediction [31]. This method is suited to
sampled-data implementation and utilises either an all-pole
recursive filter [3] or a lattice filter [32] to synthesise
the speech. The filter coefficients represent a linear 
prediction of the analyser input speech. 
Briefly, linear prediction consists of predicting or 
estimating the present value of a signal using a linear 
weighted sum of delayed signals. The linear weights that 
minimise, for example, the least mean square predictor error 
are then information parameters which characterise the 
properties of the signal under analysis. These linear 
weights or predictor values can be transmitted directly or 
can be further processed in a variety of ways for different 
applications [3].
A useful by-product of the prediction process is that 
once the weights have converged, the error signal is an 
approximation to the excitation source. Pitch information 
and a V/UV decision can therefore be derived by means of a
peak picking algorithm. 
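The prediction process described above can be illustrated with a short Python sketch. It uses the autocorrelation method with the Levinson-Durbin recursion, which is one common way of finding least mean square predictor weights; the source does not say which solution method was used, so this choice is an assumption. The second order test signal and its weights (1.3 and -0.6) are hypothetical.

```python
import random

def autocorrelation(signal, max_lag):
    """Short-time autocorrelation r[0..max_lag] of a block of samples."""
    n = len(signal)
    return [sum(signal[i] * signal[i - lag] for i in range(lag, n))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Least mean square predictor weights from autocorrelation values.

    The prediction is x_hat[n] = sum(weights[k] * x[n - 1 - k]).
    Returns (weights, residual_energy).
    """
    a = [0.0] * order
    err = r[0]
    for i in range(order):
        # Reflection coefficient for this stage of the recursion
        k_i = (r[i + 1] - sum(a[k] * r[i - k] for k in range(i))) / err
        new_a = a[:]
        new_a[i] = k_i
        for k in range(i):
            new_a[k] = a[k] - k_i * a[i - 1 - k]
        a = new_a
        err *= (1.0 - k_i * k_i)
    return a, err

# Hypothetical test: white noise excitation driving an all-pole filter
rng = random.Random(0)
TRUE_WEIGHTS = [1.3, -0.6]
excitation = [rng.gauss(0.0, 1.0) for _ in range(4000)]
signal = []
for n, e in enumerate(excitation):
    x1 = signal[n - 1] if n >= 1 else 0.0
    x2 = signal[n - 2] if n >= 2 else 0.0
    signal.append(TRUE_WEIGHTS[0] * x1 + TRUE_WEIGHTS[1] * x2 + e)

weights, residual_energy = levinson_durbin(autocorrelation(signal, 2), 2)

# Prediction error: an approximation to the excitation source
residual = [signal[n] - weights[0] * signal[n - 1] - weights[1] * signal[n - 2]
            for n in range(2, len(signal))]
residual_power = sum(v * v for v in residual) / len(residual)
```

The recovered weights lie close to those used to generate the signal, and the power of the prediction error is close to that of the excitation, which is the property exploited for pitch and V/UV extraction.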
Speech synthesis from a recursive digital filter using
predictor coefficients is shown in Fig.2.8.

[Fig.2.8 Linear Predictive Synthesis (recursive filter controlled by the predictor coefficients a_k)]
The excitation source is selected in exactly the same manner 
as in the channel vocoder (section 2.3) and is input to the
filter via an amplitude control. The amplitude level is 
derived from the rms value of the analyser input speech and 
is necessary since the predictor coefficients contain only 
information concerning the spectral shape. 
Due consideration to the quantisation of 	control 
parameters [33] results in vocoders which operate down to 
2400 bps with more natural sounding synthetic speech than 
the equivalent channel vocoder [3]. The predictor typically
requires 10-12 coefficients and a sample rate of 10kHz for
good quality speech. Charge coupled device programmable
transversal filters (chapter 3) are suitable for use in this
application but, because only 10-12 taps are required, it
seems likely that digital implementations will almost always 
be preferred. 
2.5 OTHER VOCODER PRINCIPLES
Since the introduction of the digital computer which 
facilitated simulation studies of complex systems, the 
interest in speech research has grown enormously and many 
other vocoding techniques have been reported. Some of these 
are still too complex for real-time hardware implementation, 
but others are now realistic. 
One such system is the homomorphic vocoder [34] which
has the potential for real-time hardware implementation
using charge coupled devices. It relies on the
deconvolution of the speech excitation source and the vocal
tract impulse response by homomorphic filtering [35];
homomorphic filtering is based on the computation of the
cepstrum. The vocoder's block diagram is shown in Fig.2.9
where it can be seen that the analyser is identical to the
cepstral pitch detector (section 2.2) except for a gating
operation which extracts the quefrencies due to the vocal
tract (Fig.2.5d). These quefrencies are then coded and
transmitted to the synthesiser where the vocal tract impulse
response is constructed by an inverse set of operations.
Synthetic speech is produced by convolving the vocal tract
impulse response with the excitation source. Computer
simulations [34] have shown that high quality speech may be
reconstructed at 7800 bps and compression to 4000 bps may be
obtained by more complex coding [36]. Simplification in
synthesiser hardware and a bit rate reduction to less than
2000 bps are possible by using a log magnitude approximation
filter [37].
A vocoder technique which promises to give an almost
optimum bit rate reduction (600 bps) is the formant vocoder.
Research has been continuing for many years but still no
practical solution has been found. The principle is to
extract the centre frequencies and bandwidths of the main
formants (there are three in male speech below 3kHz) and to
use this information to control a resonant model of speech
production [11]. Formant extraction techniques generally
rely on formant tracking algorithms which are based on
accumulated and detailed experience of speech waveforms.
These algorithms have been designed to operate on the
short-time spectra of speech [38], autocorrelation functions
[39] and linear prediction spectra [40] but in general,

philosophy that applies to formant extraction is
analysis-by-synthesis. Here an educated guess is made of
the formant parameters and a spectrum is generated which is
compared to the actual speech spectrum. The formant
parameters are then varied until the difference between the
two is minimised according to some criterion [41]. The
latter technique has some advantages because the entire
spectral shape is considered and not simply the spectral
peaks.
CHAPTER 3 
THE CHARGE COUPLED DEVICE 
The Charge Coupled Device (CCD) is essentially an 
analogue shift register which can be fabricated as an 
integrated circuit using Metal Oxide Silicon (MOS) 
technology. Discrete samples of input signal are stored as
charge packets in potential wells and these may be moved 
along the CCD register by applying a sequence of clock 
pulses. The CCD therefore provides the flexibility of a 
time-quantised clock variable system which does not require 
analogue to digital conversion. 
The CCD was first reported by Boyle [42] in 1970 and is 
a member of the more general Charge Transfer Device (CTD) 
family which includes the earlier Bucket Brigade Device 
(BBD) [43]. In recent years the CCD has become important
because:
1. the technology (MOS) is standard and hence is low
cost
2. the silicon area required per stage is very small
3. the CCD can be used to process either analogue or
digital electrical signals, or optical signals
4. the range of applications is very wide.
Section 3.1 explains the basic principles of CCD charge
storage and charge transfer whilst section 3.2 discusses the
merits of several methods of charge input and output. The
third section summarises the main defects and causes of
degradation in CCDs and, finally, section 3.4 describes one
of the most powerful analogue signal processing blocks, the 
transversal filter. 
3.1 BASIC PRINCIPLES
The fundamental concepts of CCD operation [4441 have 
been developed directly from the established theory of MOS
transistors [45] and, as shown in Fig.3.1, the basic CCD
delay line structure resembles a rather large multi-gate MOS
transistor. The input diffusion converts a sample of the 
input signal into a charge packet of minority carriers, 
which is subsequently transferred along the register at a 
rate controlled by the clock waveforms. After some delay, 
this charge packet can be sensed at the output diffusion. 
To understand the CCD operation it is best to start by
examining the basic CCD storage element [45], the MOS
capacitor shown in Fig.3.2. With zero potential on the gate
there is a uniform distribution of majority carriers (holes
in this case) in the p type semiconductor. When the gate
terminal is pulsed more positive than the substrate,
majority carriers from the silicon/silicon dioxide (Si/SiO2)
interface immediately below the gate are repelled, thereby
creating a depletion region (Fig.3.2b). If the pulse
amplitude exceeds the threshold voltage, Vth, minority
carriers (electrons) can be attracted towards the interface
to form an extremely thin "inversion layer" (Fig.3.2c). As
the amount of charge stored is increased the extent of the
depletion region must decrease to preserve charge neutrality
in the system. The creation of this layer corresponds to
the formation of a channel in a MOS transistor. These
minority carriers can be stored in the inversion layer for
typically hundreds of milli-seconds before thermally
generated minority carriers from within the depletion region
significantly distort the charge packet.

[Fig.3.1 (caption not recovered) CCD delay line structure; labels recovered: i/p, n diffusion, p type silicon substrate; all dimensions typical]
[Fig.3.2 The CCD Storage Element: panels (a), (b) and (c)]
An extremely useful model for visualising the operation 
of CCD structures results from the potential well concept 
[46]. If a potential well is formed by applying a gate
pulse greater than the threshold voltage, then the
introduction of minority carriers is analogous to liquid
being poured into a well. The maximum quantity of charge
which can be stored in the structure (typically 1pC) depends
on the volume of the potential well: the depth of the well
is related to both the magnitude of the gate pulse and the 
oxide thickness, and the area of the well is defined by the 
electrode area. 
The next operation is to transfer the stored packet of
charge to an adjacent CCD element. Consider the structure
illustrated in Fig.3.3a where there is a charge packet
stored under the Ø1 electrode. If each electrode is
physically very close to its neighbour (<3µm separation)
then when Ø2 turns on, the depletion regions under Ø1 and Ø2
will merge and the charge will redistribute itself
(Fig.3.3b). By slowly reducing the potential on Ø1 the
charge remaining under Ø1 will spill over into the Ø2 well
to make the transfer complete at time t4. It is therefore
possible to transfer charge packets along an entire register
by applying the appropriate time sequence of pulses. In the
simple structure shown in Fig.3.3, a three phase clocking
system is necessary to propagate the charge unambiguously in
one direction. This structure has, however, certain
practical limitations (some of which are discussed in
section 3.3) and many other more sophisticated electrode
arrangements have been developed [43].

[Fig.3.3 Charge Transfer in CCD]
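The three phase transfer sequence can be pictured with a deliberately idealised Python model. It assumes complete transfer at every step (no charge left behind), and the list-of-wells representation and the function name are hypothetical conveniences rather than a description of any particular device.

```python
def clock_step(wells, active_phase):
    """Advance a three phase register by one clock phase.

    wells[i] holds the charge under electrode i, and electrode i is
    driven by phase i % 3 (0, 1, 2 standing for phases 1, 2, 3).
    When `active_phase` turns on, charge held under the previously
    active phase spills forward into the adjacent electrode.
    """
    prev_phase = (active_phase - 1) % 3
    new_wells = [0.0] * len(wells)
    for i, charge in enumerate(wells):
        if i % 3 == prev_phase:
            if i + 1 < len(wells):
                new_wells[i + 1] += charge
            # charge under the final electrode leaves at the output
        else:
            new_wells[i] += charge
    return new_wells

# One packet under the first phase 1 electrode of a 3-stage register
wells = [1.0] + [0.0] * 8
history = [wells]
for step in range(3):
    wells = clock_step(wells, (step + 1) % 3)
    history.append(wells)
```

After one complete clock cycle of three phases the packet has advanced by three electrodes, i.e. by one stage. Because the electrode behind the packet is always held off, the charge has only one neighbouring well to move into, which is why three phases are sufficient to fix the direction of propagation in this simple structure.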
3.2 CHARGE INPUT AND OUTPUT
When the CCD is used in an analogue mode, the linearity 
of charge input and output is extremely important. In the 
following section, four serial input schemes of varying 
complexity and performance will be compared and three output 
techniques, one serial and two parallel, will be discussed. 
3.2.1 Input Techniques 
Dynamic current injection [47], which uses the input
structure in Fig.3.4a, is one of the simplest input
techniques. The input signal is applied to the input
diffusion and the input gate is held at a relatively low
d.c. potential. When the Ø1 potential well is created,
charge flows from the diffusion across the gate into the
well. The size of the injected charge packet is determined
by the input diode potential, the channel conductance and
the available injection time (governed by the CCD clock).
This process is inherently non-linear and the resulting
distortion is quite severe. For a sinusoidal input
giving full well capacity, the 2nd
harmonic is typically at -16dB and the third harmonic at
-30dB [48].
In a diode cut-off scheme [49] (Fig.3.4b) the signal
is normally capacitively coupled to a reverse biassed input
diode. During Ø1 the input gate is pulsed on to allow
minority carriers from the diffusion to flow into the Ø1
potential well. The surface potential is therefore set
directly by the diode potential and the sample is trapped
when the input gate is turned off. The trailing edge of the
gate pulse has to be designed carefully because: (a) if the
channel is cut-off too quickly, some charge from within the
channel will be emptied into the signal packet (partition
noise) and (b) if the channel is cut-off too slowly, the

[Fig.3.4 Charge Input Techniques: (a) Dynamic Current Injection; (b) Diode Cut-off; remaining panel titles not recovered]
However, even with an ideal gate cut-off, this method still 
has 	an 	inherent 	non-linearity due to the depletion 
capacitance changing with the diode potential. These 
effects give rise to second harmonics in the order of -26dB
[48].
A significant improvement in linearity may be obtained
by using a fill and spill method [50] of charge input. In
the variation shown in Fig.3.4c, the input signal is applied
to the control gate and the input diode is pulsed to a low
potential to fill the well created under Ø1. When this
pulse is returned to a high potential, the excess charge
drains back into the diffusion, leaving a charge packet
proportional to the gate potential. Two second order
distortions are present: (a) spurious noise on the Ø1
driving waveform enters the signal packet directly and (b)
the signal dependent fringe field from the input gate alters
the effective area of the Ø1 potential well. However, both
of these problems may be eliminated by a more complex "pump
priming" method of fill and spill [51]. Second harmonic
distortion components of less than -40dB have been reported
[50].
The technique described above helps to linearise the
CCD input structure; feedback linearisation [52], however,
attempts to linearise the complete CCD input to output
transfer function. An essential feature of the input
structure in Fig.3.4d is the inclusion of an extra
non-destructive output tap to monitor the input charge. The
output is compared with the original signal to generate an
error which subsequently corrects the stored charge. If the
monitoring tap is electrically identical to all the other
output taps, the CCD transfer function will tend to be
linearised and the total harmonic distortion will
theoretically be reduced by the open loop gain of the system
(in practice to less than -40dB [53]). The disadvantage of
this technique is the need for a high quality differential
amplifier which may be difficult to integrate with the CCD.
3.2.2 Output Techniques 
When the CCD is used as a serial delay line, an output
diffusion [46] (Fig.3.5a) senses the magnitude of the charge
packet. The pn junction is normally held reverse biassed
and is positioned so that its depletion region couples with
that of the last storage element. An extra gate held at a
constant bias is generally included to help minimise
capacitive pick-up from the last transfer electrode. When
Ø3 is turned off, any charge in the Ø3 potential well will
be collected by the output diode to appear as a current
change in the output circuitry. A voltage output can be
produced simply by incorporating a resistor. However,
output changes are very small because of the minute charge

[Fig.3.5 Charge Output Techniques: (a) Output Diffusion; (b) Floating Output Diffusion with Reset; (c) Plan View of Split Gate CCD; remaining panel title not recovered]
to perform on-chip amplification. This is best accomplished 
by a MOS transistor with its gate connected directly to the 
sense diffusion. Since in this case the sense diffusion is 
floating, an extra diffusion and control gate are required 
to reset the sense diffusion after the detection of each 
charge packet (Fig.3.5b).
Split electrode tapping [54] is an extremely elegant 
method for the implementation of fixed weight transversal 
filters (section 3.4). The basic split electrode
arrangement for use in a 3 phase CCD is shown in Fig.3.5c.
Here the third electrode in each cell is divided into two
sections and each is connected to either the Ø3+ or the Ø3-
clock lines. As charge transfers into the region under a
gate, an opposite charge is induced onto the electrode from
the clock line. Assuming that the oxide capacitance is very
much greater than the depletion capacitance and that the
latter may be considered constant, the induced current in a
Ø3 clock line due to one section of a split electrode is
proportional to the amount of charge in that potential well 
times the area of the section. The transversal filter is 
obtained by differencing the total current change in each 
clock line. 	Thus, a split in the middle of an electrode 
corresponds to a weighting factor of zero. 	This technique 
does not interfere with the signal charge packet in any way 
and is therefore non-destructive. 
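The relation between the split position and the tap weight can be made concrete with a small Python sketch. It assumes an idealised electrode in which the induced signal on each section is exactly proportional to charge times section area, with weights normalised to the range -1 to +1; the function names are hypothetical.

```python
def split_position(weight):
    """Fraction of the electrode width tied to the '+' clock line.

    A weight of +1 puts the whole electrode on the '+' line, -1 puts
    it all on the '-' line, and 0 places the split in the middle.
    """
    if not -1.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [-1, 1]")
    return (1.0 + weight) / 2.0

def differenced_output(charges, weights):
    """Difference of the signals induced on the two clock lines."""
    plus = sum(q * split_position(h) for q, h in zip(charges, weights))
    minus = sum(q * (1.0 - split_position(h)) for q, h in zip(charges, weights))
    return plus - minus

# Illustrative charge packets (arbitrary units) and tap weights
CHARGES = [1.0, 2.0, 3.0]
WEIGHTS = [0.5, 0.0, -1.0]
OUTPUT = differenced_output(CHARGES, WEIGHTS)
```

Differencing the two lines gives the sum of charge times weight for every tap, which is the transversal filter output; the middle tap in the example, split at its centre, contributes nothing.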
A very powerful and flexible mode of CCD operation is
made possible by the floating gate tapping technique, in which
serial charge packets may be sensed
nondestructively and their magnitudes output in parallel.
The applications of this configuration include programmable
transversal filtering [57], correlation [8] and adaptive
filtering [8].
Fig.3.5d shows the floating gate tap schematic for a
three phase CCD with pseudo two phase clocking. The CCD tap
electrode is directly connected to the gate of a sense
transistor and also to a reset diffusion. Assuming that the
reset transistor is off and that the tapped CCD electrode is
floating at a potential of Vgg, the transfer of minority
carriers into the potential well below this electrode
induces a charge redistribution which causes a related
change in electrode potential. This voltage change is
buffered by a MOS source follower to provide an output
signal at a low impedance. After transfer of this charge
packet to the next potential well (under Ø1) the reset
transistor is pulsed on to reset the tap electrode potential
in preparation for the next cycle.
3.3 DEVICE LIMITATIONS AND DEFECTS
3.3.1 Transfer Efficiency
The charge transfer efficiency is an important measure
of device performance in analogue signal processing. When a
charge packet is transferred from under one electrode to the
next, some of the charge is left behind and some lost
completely. Two main effects are responsible for this
inefficiency, the first of which is due to "interface
states" at the Si/SiO2 boundary [58]. As each charge packet
is passed along the device, interface states are filled
almost instantaneously by minority carriers and then, when
the charge packet moves on, the states are emptied much more
slowly. Some of the emitted charges return to the correct
packet but others empty into trailing packets. The primary
effect of these states can be reduced considerably by
passing a background charge or "fat zero" continuously along
the device. The second source of inefficiency is caused by
the transfer mechanism itself [59]. When the transfer
process begins, minority carriers move across quickly under
the influence of a drift field. As the charge in the new
well builds up, the drift field is reduced and the
predominant transfer process becomes thermal diffusion,
which is a relatively slow process characterised by a time
constant defined by the electrode length and the carrier
mobility. This time constant therefore gives a trade-off
between clock frequency and transfer efficiency. The
efficiency can be maximised by careful design of the driving
waveforms and by making the interelectrode spacing as small
as possible.
The effect of charge transfer efficiency can be
visualised by impulsing a CCD delay line. The delayed
output will consist of an attenuated version of the original
pulse, followed by a time series of smaller residual pulses.
In practical CCD delay lines only the first residual is
normally significant. This smearing gives rise to a low
pass filter characteristic in the frequency domain, reducing
the device bandwidth. A simple analytical expression
relating the transfer efficiency to the frequency response
has been developed by Vanstone et al. [60]. For n transfers
the amplitude response is given by
$$A_n(\omega) = \left[\frac{\alpha^2}{1 + \varepsilon^2 - 2\varepsilon\cos(\omega T)}\right]^{n/2} \qquad (3.1)$$
and the phase response, excluding the linear phase of the
ideal delay, is given by
$$\phi_n(\omega) = -n\tan^{-1}\left[\frac{\varepsilon\sin(\omega T)}{1 - \varepsilon\cos(\omega T)}\right] \qquad (3.2)$$
where α is the transfer efficiency per stage, ε is the
transfer inefficiency (=1-α), T is the sampling period and
ω is the angular frequency of the input. Fig.3.6 shows a
plot of the normalised amplitude transfer function for
various values of the transfer inefficiency product nε.
In current CCDs, charge transfer inefficiency is in the
order of 00001 which restricts the number of serial stages 
to 10000 However, this depends to a great extent on the 
application and the amount of permissible signal
degradation. Several techniques have been developed to
compensate for charge transfer inefficiency [61,62,63], but
in general these result in considerable circuit complexity
or redundancy.

[Fig.3.6 CCD Amplitude Frequency Response (horizontal axis: Normalised Frequency)]
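The low pass effect of transfer inefficiency can be explored numerically with the sketch below. It assumes a simple single-pole model in which each transfer passes a fraction 1-ε of the packet forward and leaves a fraction ε behind to join the following packet, so that one transfer has the z-domain response (1-ε)z^-1 / (1 - εz^-1). This model, the function names and the example values (ε = 0.0001, n = 1000) are assumptions chosen for illustration and should not be read as the exact expressions or figures of [60].

```python
import math

def cti_amplitude(freq_norm, n_stages, epsilon):
    """Amplitude response after n_stages transfers (unity at d.c.).

    freq_norm is the input frequency divided by the clock frequency,
    so 0.5 corresponds to the Nyquist frequency.
    """
    wt = 2.0 * math.pi * freq_norm
    per_stage_sq = (1.0 - epsilon) ** 2 / (
        1.0 + epsilon ** 2 - 2.0 * epsilon * math.cos(wt))
    return per_stage_sq ** (n_stages / 2.0)

def cti_excess_phase(freq_norm, n_stages, epsilon):
    """Phase shift in radians, excluding the ideal n-stage delay."""
    wt = 2.0 * math.pi * freq_norm
    return -n_stages * math.atan2(epsilon * math.sin(wt),
                                  1.0 - epsilon * math.cos(wt))

# Illustrative device: inefficiency product n*epsilon = 0.1
N_STAGES, EPSILON = 1000, 1e-4
RESPONSE = [(f / 10.0, cti_amplitude(f / 10.0, N_STAGES, EPSILON))
            for f in range(0, 6)]
NYQUIST_GAIN = cti_amplitude(0.5, N_STAGES, EPSILON)
```

Under this model the loss depends almost entirely on the product nε: at the Nyquist frequency the gain is very nearly exp(-2nε), about 0.82 for nε = 0.1.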
3.3.2 Noise [64]
Noise sources can be classified into four different
categories: input, storage, transfer and output. The

range. 
The input and output noise sources depend largely on
the particular techniques used. For example, dynamic
current injection suffers from random fluctuations in
voltage levels and pulse jitter in the clocking waveforms
whereas diode cut-off has associated partition noise and
sampling jitter. In the split electrode tapping technique,
the summing amplifier noise tends to dominate all other
sources.
Of the inherent noise groups, storage and transfer, the 
most significant sources are due to the shot noise in dark 
current (storage) and the fluctuating trapping of fast
interface states (transfer). In typical signal processing
applications, this last source causes most concern and may 
be minimised by careful design. Signal to noise ratios in 
the order of 70-80dB have been achieved in current devices. 
3.3.3 Dark Current [44]
Dark current is the equivalent of "leakage current" in
MOS transistors and is caused by the thermal generation of
minority carriers both in the bulk semiconductor and at the
Si/SiO2 interface. This extra charge accumulates in the
potential wells, thereby degrading the stored information.
At normal temperatures, dark current limits the maximum
storage time to several hundred milli-seconds. In
continuously clocked delay lines, the dark current effect
simply reduces the dynamic range, whereas in transversal
filters (SIPO) there is a non-uniform noise distribution.
The level of dark current approximately doubles for every
ten degrees centigrade increase in substrate temperature,
necessitating careful consideration of the total on-chip
power dissipation.
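The temperature rule quoted above is easily turned into a rough calculation. The sketch below assumes the doubling holds exactly and that the usable storage time is inversely proportional to the dark current; both are simplifying assumptions, and the 400 milli-second reference figure is purely illustrative.

```python
def dark_current_scale(delta_temp_c, doubling_interval_c=10.0):
    """Factor by which dark current grows for a temperature rise."""
    return 2.0 ** (delta_temp_c / doubling_interval_c)

def storage_time_ms(ref_storage_ms, delta_temp_c):
    """Storage time after a temperature rise, assuming it is
    inversely proportional to the dark current."""
    return ref_storage_ms / dark_current_scale(delta_temp_c)

# Illustrative: 400 ms at the reference temperature, chip runs 20 degrees hotter
HOT_STORAGE_MS = storage_time_ms(400.0, 20.0)
```

On these assumptions a 20 degree rise in substrate temperature, for example from on-chip dissipation, cuts the storage time to a quarter of its reference value.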
3.3.4 Peripheral On-chip Circuitry 
To make the CCD appear as a "black box" which may be
readily configured for any system, it is highly desirable to
integrate along with the CCD many of the necessary
peripherals. For example, the operation of a CCD requires
clock drivers, timing logic, input and output amplifiers,
anti-aliasing filters and sample and hold gates. All of
these functions have, of course, to be realisable in a 
compatible technology. 
The integration of the clock drivers and timing logic
is relatively easy, but their inclusion normally limits the
maximum operating frequency to about 1MHz. This limit is
due to the power dissipated when driving capacitive clock
lines. MOS amplifiers have been improved considerably in
recent years and suitable amplifier designs have been
reported [65] which operate at over 1MHz bandwidth with very
low d.c. drift. The anti-aliasing filters and sample and
hold circuits should be clock variable; switched capacitor
techniques [66] provide suitable characteristics at audio
frequencies.
3.4 THE TRANSVERSAL FILTER
The transversal filter structure shown in Fig.3.7 is an
extremely powerful and flexible building block in analogue
(and digital) signal processing [5].

[Fig.3.7 Transversal Filter Schematic]

It consists of an N-stage shift register with
non-destructive taps after each delay, T. Each tap output
is multiplied by a weighting coefficient, h_k (k=1,2,...,M),
and the results are summed. The filter output is given by
$$V_o(nT) = \sum_{k=1}^{M} V_i(nT - kT + T)\,h_k \qquad (3.3)$$
which is the discrete convolution of the input with an
impulse response function (correlation may be obtained by
time reversing the impulse response). The frequency
response of this filter is given by the discrete Fourier
transform of the weighting coefficients. Therefore by
modifying the weights appropriately, any linear Finite
Impulse Response (FIR) filter [67] can be constructed.
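The tapped delay line sum and its frequency response can be written out directly. The Python sketch below implements the sum V_o(nT) = Σ V_i(nT - kT + T) h_k for k = 1 to M, with the delay line assumed empty at the start, and evaluates the discrete Fourier transform of the weights at a single frequency. The function names and the three tap weights are illustrative assumptions.

```python
import cmath
import math

def transversal_filter(inputs, weights):
    """Weighted sum of tapped delay line contents for each sample.

    Tap k (k = 1..M) sees the input delayed by (k - 1) sample
    periods; samples before the start are taken as zero.
    """
    m = len(weights)
    outputs = []
    for n in range(len(inputs)):
        acc = 0.0
        for k in range(1, m + 1):
            idx = n - k + 1
            if idx >= 0:
                acc += inputs[idx] * weights[k - 1]
        outputs.append(acc)
    return outputs

def frequency_response(weights, freq_norm):
    """DFT of the weights at freq_norm (fraction of the clock rate)."""
    return sum(h * cmath.exp(-2j * math.pi * freq_norm * k)
               for k, h in enumerate(weights))

# Illustrative 3-tap low-pass weighting
WEIGHTS = [0.25, 0.5, 0.25]
IMPULSE_RESPONSE = transversal_filter([1.0, 0.0, 0.0, 0.0], WEIGHTS)
DC_GAIN = abs(frequency_response(WEIGHTS, 0.0))
NYQUIST_GAIN = abs(frequency_response(WEIGHTS, 0.5))
```

Applying a unit impulse reads the weights out in sequence, confirming that the weights are the impulse response; this particular weighting passes d.c. with unity gain and has a null at half the clock frequency.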
Split electrode weighting (see section 3.2.2) is an
extremely efficient technique for the implementation of
fixed weight transversal filters in CCD, and one powerful
application of these filters is in the CCD Chirp-Z transform
processor (chapters 4 and 5). If, however, the weights are
made electrically programmable, then a vast range of
sophisticated analogue signal processing applications is
possible, e.g. time variant adaptive filtering [7] and
programmable correlation [8,68].
CHAPTER 4 
CCD FOURIER TRANSFORM PROCESSORS
When analysing electrical signals, it is most common to 
display the waveform in the time domain. This gives the 
necessary amplitude and timing information. However, in 
certain cases, it can be more illuminating to picture the 
same information from an entirely different viewpoint, 
i.e. from the frequency domain. In much the same way, it
is sometimes much more powerful to perform signal processing 
in the frequency domain. 
This chapter discusses the advantages and disadvantages
of several techniques for time to frequency domain
conversion (and the inverse) when applied to signal
processing. Section 4.1 reviews the conventional filtering
spectrum analysers, one of which has recently been
implemented in CCD. The fundamental mathematical tools
relating the two domains, both theoretically and
practically, are summarised in section 4.2 along with an
efficient discrete transform algorithm suited to
microprocessor implementation. Sections 4.3, 4.4 and 4.5
investigate three algorithms which can be realised
efficiently using CCD transversal filters to give real-time
operation up to several megahertz. The final section
compares their performance.
4.1 CONVENTIONAL SPECTRUM ANALYSERS
One of the first systems used to resolve the frequency
components of time domain signals employed a variable centre
frequency filter to scan the temporal signal. This type of
analyser, known as a Tuned Radio Frequency (TRF) analyser,
is simple and inexpensive but suffers from several
disadvantages. Firstly, because the TRF analyser has a
swept filter, its sweep width is limited (usually one
decade); secondly, since the swept filter bandwidth is not
normally constant with frequency, the resolution is
dependent on frequency.
A significant development in spectrum analysis was 
initiated by the invention of the heterodyne principle. In 
contrast to the TRF analyser, the heterodyne spectrum 
analyser uses a bandpass filter with fixed characteristics. 
The input signal is mixed with a swept local oscillator 
before being filtered. An output from the filter will be 
present only when the difference frequency (or the sum
frequency) falls within the passband. The advantages of
this technique are considerable. It obtains high
sensitivity through the use of IF amplifiers and many
decades in frequency can be covered. Also, the resolution
can be varied by changing the bandwidth of the IF filter and
the sweep rate of the local oscillator.
Both the TRF and heterodyne analysers discussed so far
are swept tuned and hence the frequency components of a
spectrum are sampled sequentially in time. This is
sufficient for "off-line" spectrum analysis, but to
transform signals continuously "on-line", the output rate of
the processor has to be the same as or greater than the
input rate. Equivalently, the processor's bandwidth must be
at least that of the input signal. Such processors are
normally termed real-time.
One way of achieving this real-time performance is to
use a bank of staggered band-pass filters, each with equal
bandwidths, and to process the input signal in parallel. The
frequency range and resolution of this analyser is normally
restricted by the amount of hardware required and typical
applications (e.g. the channel vocoder) have fewer than 30
resolution bins across the bandwidth. The contiguous filter
bank arrangement obviously lacks flexibility and is always
used with fixed parameters. More recently, many of the
engineering disadvantages of this approach have been
relieved. For example, an integrated circuit has recently
been reported [30] which houses 19 parallel CCD FIR filters.
Alternatively, switched capacitor filters [66] may be used
in audio frequency applications.
Another technique, similar to the analogue filter bank
but rather more flexible, is the multiplexed digital filter.
The filter coefficients are stored in Read Only Memory (ROM)
and accessed when required. For a single integrated circuit
realisation, the serial digital processing restricts the
real-time operation, at present, to several kilohertz.
Hardwired versions may be constructed to operate
considerably faster (hundreds of kilohertz in real-time) but
the power consumption and physical size increase
accordingly.
4.2 THE FOURIER TRANSFORM
The principal mathematical tool for time to frequency
domain conversion is the Fourier Transform (FT). The
definition given in equation 4.1 transforms the time domain
signal, f(t), into its frequency domain counterpart F(ω).
The inverse process, from the frequency to the time domain,
is given by the Inverse Fourier Transform (IFT) in equation
4.2. These equations are used widely in communication
theory and are fundamental to spectrum analysis.
$$F(\omega) = \int_{-\infty}^{\infty} f(t)\, e^{-j\omega t}\, dt \qquad (4.1)$$
$$f(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} F(\omega)\, e^{j\omega t}\, d\omega \qquad (4.2)$$
In general, F(ω) and f(t) are complex quantities. The
amplitude and phase components may be extracted by taking
the modulus and argument in the usual way, i.e. for a
complex number x=a+jb
$$|x| = \sqrt{a^2 + b^2} \qquad (4.3)$$
and
$$\arg(x) = \tan^{-1}(b/a) \qquad (4.4)$$
The FT pair are well defined for most signals
encountered in practical systems. One sufficient but not
necessary condition for the existence of the FT is that the
time signal, f(t), should have finite energy. Periodic
signals, commonly represented by the Fourier series, have
infinite energy and are therefore excluded by this
condition. However, by the introduction of the Dirac
impulse function, δ(t), which has an infinite amplitude, an
infinitesimal width and unit area, it can be shown [69] that
the FT of both periodic and singular functions can be
defined.
If the time signal, f(t), is zero for all negative
time, then the FT is equivalent to the evaluation of the
Laplace transform on the imaginary axis in the "s" plane
(complex frequency). The FT is therefore a special case of
the more general Laplace transform.
One reason for the popularity of the FT pair is its
wide range of useful properties [70]. Possibly the most
important of these is given by the convolution theorem. If
h(t) is the linear, time-invariant impulse response of a
system, and this system is excited by the input signal,
x(t), then the output, y(t), is given by the convolution
integral
$$y(t) = \int_{-\infty}^{\infty} x(\tau)\, h(t - \tau)\, d\tau \qquad (4.5)$$
The convolution theorem states that the FT of the output,
Y(ω), is equal to the product of the system transfer
function, H(ω), and the transformed input signal, X(ω), viz.
$$Y(\omega) = X(\omega)\, H(\omega) \qquad (4.6)$$
Therefore, convolution in the time domain is the same as
multiplication in the frequency domain. In addition, it can
be shown that convolution in the frequency domain is
equivalent to multiplication in the time domain.
4.2.1 The Discrete Fourier Transform 
When there is a need to calculate the FT by computer,
the definition given in equation 4.1 must be modified
because it requires an infinite amount of processing.
Firstly, the input function has to be band-limited and
sampled at discrete time instants, and secondly, this
sequence has to be time truncated to say N points. The
resulting definition is an approximation to the FT and is
called the Discrete Fourier Transform (DFT). The DFT pair
corresponding to equations 4.1 and 4.2 is given by
$$X_k = \sum_{n=0}^{N-1} x_n\, e^{-j2\pi nk/N}, \quad k = 0,1,\ldots,N-1 \qquad (4.7)$$
$$x_n = \frac{1}{N} \sum_{k=0}^{N-1} X_k\, e^{j2\pi nk/N}, \quad n = 0,1,\ldots,N-1 \qquad (4.8)$$
where $X_k$ is the kth Fourier coefficient, $x_n$ is the nth sample
of the input data and N is the number of points in the
transform.
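The DFT pair of equations 4.7 and 4.8 can be evaluated directly from the definitions. The following is a minimal sketch in Python using only the standard library; the function names `dft` and `idft` are illustrative choices, not taken from the thesis, and the 1/N factor is placed on the inverse transform as is conventional.

```python
import cmath


def dft(x):
    """Direct N-point DFT: X_k = sum_n x_n exp(-j 2 pi n k / N)."""
    n_pts = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / n_pts) for n in range(n_pts))
        for k in range(n_pts)
    ]


def idft(big_x):
    """Inverse DFT: x_n = (1/N) sum_k X_k exp(+j 2 pi n k / N)."""
    n_pts = len(big_x)
    return [
        sum(big_x[k] * cmath.exp(2j * cmath.pi * n * k / n_pts) for k in range(n_pts)) / n_pts
        for n in range(n_pts)
    ]


if __name__ == "__main__":
    data = [1, 2, 3, 4]
    spectrum = dft(data)
    print(spectrum)        # Fourier coefficients X_0 .. X_3
    print(idft(spectrum))  # recovers the input to rounding error
```

Each output coefficient is a sum over all N inputs, so this direct form costs on the order of N squared complex multiplications.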
In discrete sampled data analysis, the Z-transform
plays the same role as does the Laplace transform in
continuous analysis. The "s" plane is related to the "z"
plane through the expression z=exp(sT) and the imaginary
axis in the s plane maps to the unit circle in the z plane.
The equivalence of the FT and the Laplace transform
evaluated on the imaginary axis is therefore analogous with
that of the N-point DFT and the Z-transform evaluated at N
equidistant points round the unit circle.
To find out how closely the DFT approximates to the
continuous FT it is necessary to examine each stage in the
development of the DFT. Firstly, consider the effect of
sampling in the time domain. If the sampling period is Ts,
then the output spectrum will contain not only the correct
result, but also an infinite number of aliased replicas each
separated in frequency by fs, where fs=1/Ts. As long as the
input function is band-limited to fs/2 (Nyquist), the
aliased spectra will not overlap and there will be no
distortion. In the practical case, the band-limiting filter
cannot have an infinitely sharp cut-off and so it is usual
to sample several times faster than the Nyquist limit. For
a fixed filter, the only way to reduce aliasing errors is to
increase the sample rate.
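The aliasing effect can be illustrated numerically. In the sketch below (Python, standard library only) the sample rate of 8 kHz and the two tone frequencies are arbitrary assumptions chosen for illustration: a tone at f and a tone at fs - f produce identical sample sequences, so once sampled they cannot be told apart.

```python
import math


def sample_cosine(freq_hz, fs_hz, count):
    """Return `count` samples of cos(2 pi f t) taken at intervals Ts = 1/fs."""
    return [math.cos(2 * math.pi * freq_hz * n / fs_hz) for n in range(count)]


if __name__ == "__main__":
    fs = 8000.0                                # assumed sample rate, Nyquist limit 4 kHz
    in_band = sample_cosine(1000.0, fs, 16)    # below fs/2
    aliased = sample_cosine(7000.0, fs, 16)    # above fs/2, equals fs - 1000
    worst = max(abs(a - b) for a, b in zip(in_band, aliased))
    print("largest sample difference:", worst)
```

This is why the input must be band-limited before sampling: no later processing can separate the two tones.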
Secondly, the input data are truncated to N points.
This is equivalent to multiplying the time signal by a
rectangular window of length N·Ts and its effect is to
convolve the true FT with a (sin x)/x response (which is the
FT of a rectangular window). Fig.4.1 compares the true FT
of a sinewave (impulse function) with that of the windowed
version. It can be seen that the (sin x)/x main lobe width
limits the frequency resolution i.e. the transform's
ability to distinguish between adjacent frequencies. The
main lobe width is inversely proportional to the time domain
window length and to increase the inherent frequency
resolution therefore requires an increase in window length.
(Note that if the window is increased to infinity then the
ideal impulse function results.) In addition, the (sin x)/x
sidelobes create a "leakage" effect and this limits the
amplitude resolution. The most significant sidelobes are
the first pair, their amplitudes being -13dB with respect to
the main lobe peak. However, weighting functions (section
4.2.3) can be employed to increase the amplitude resolution
at the expense of frequency resolution.
The final modification necessary to obtain the DFT is
sampling in the frequency domain. This is achieved by
assuming that N samples of the input function are one period
of a periodic waveform. The output spectrum will then
consist of N discrete samples, each spaced by
1/(N·Ts) = fs/N. No information is lost by this sampling
but great care has to be exercised in the interpretation of
such spectra. For example, consider the DFT of a sinewave.
If the sinewave is a basis vector (i.e. it has an integral
number of periods within the truncation window) then the
resulting frequency domain (sin x)/x will fall exactly on
the sampling grid (Fig.4.2a) and the output sequence will be
all zeroes except for a non-zero value at the appropriate
frequency sample. If, however, the sinewave is not a basis
vector, the (sin x)/x will be offset from the sampling grid
to give an output similar to that shown in Fig.4.2b.
[Fig.4.2 DFT of a Sinewave: ideal FT; (a) truncation length equal to an integral multiple of sinewave periods; (b) non-integral multiple of sinewave periods]
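The two cases of Fig.4.2 can be reproduced with a direct DFT. The sketch below is illustrative only: the transform length of 32 and the choices of 4 and 4.5 cycles per window are assumptions, and the helper names are not from the thesis. For a real sinewave the non-zero value appears at bin k and at its mirror image N-k.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitudes of the direct N-point DFT of a sample list."""
    n_pts = len(samples)
    return [
        abs(sum(s * cmath.exp(-2j * cmath.pi * n * k / n_pts)
                for n, s in enumerate(samples)))
        for k in range(n_pts)
    ]


def sinewave(cycles, n_pts):
    """A cosine with `cycles` periods inside an n_pts sample window."""
    return [math.cos(2 * math.pi * cycles * n / n_pts) for n in range(n_pts)]


if __name__ == "__main__":
    on_grid = dft_magnitudes(sinewave(4, 32))     # basis vector: 4 whole periods
    off_grid = dft_magnitudes(sinewave(4.5, 32))  # not a basis vector
    print([round(m, 3) for m in on_grid[:10]])
    print([round(m, 3) for m in off_grid[:10]])
```

In the off-grid case every output bin samples the (sin x)/x response away from its nulls, which is the leakage discussed above.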
4.2.2 The Fast Fourier Transform 
It can be seen from equation 4.7 that $N^2$ complex
multiplications and associated additions are required to
compute an N-point DFT. Since the processing time and hence
the cost are usually proportional to the number of
multiplications, the DFT calculation for large N (>64)
becomes prohibitive. A Fast Fourier Transform (FFT) is an
algorithm which significantly reduces the number of
multiplications needed to calculate the exact DFT.
The first FFT algorithm to achieve widespread acclaim
was developed by Cooley and Tukey [71] in 1965 and remains
today as the foundation for most other FFT algorithms. The
mechanics of the Cooley-Tukey FFT algorithm are well
documented [72] and it is sufficient to note that the key to
its efficiency results from the periodicity of the function
$W^{nk}$, where
$$W = e^{-j2\pi/N} \qquad (4.9)$$
Using the properties
$$W^{nk} = 1 \quad \text{for all } nk = pN,\ p = 0,1,\ldots,N$$
$$W^{nk+N/2} = -W^{nk}$$
and
$$W^{nk} = W^{nk\ \mathrm{modulo}(N)}$$
where nk modulo(N) is the remainder upon division of nk by
N, it is possible to structure the DFT to minimise the
number of multiplications. Fig.4.3a shows the flow diagram
for a decimation in time radix-2 FFT algorithm with N=8.
The fundamental operation in this algorithm is the
"butterfly" represented by a circle in the flow diagram.
Each butterfly takes two complex inputs, A and B, and
combines them to give P and Q through the operations
$$P = A + W^{nk} B \qquad (4.10)$$
$$Q = A - W^{nk} B \qquad (4.11)$$
where $W^{nk}$ are the so called "twiddle factors". To evaluate
the complex P and Q using real arithmetic involves four
multiplications, three additions and three subtractions
(Fig.4.3b).
The complex input sequence $\{x_n\}$, n=0,1,...,N-1, is
initially reordered and the first set of butterflies
performs what is essentially a 2-point DFT on pairs of input
data. The second set of butterflies combines the 2-point
DFTs using twiddle factors to give two 4-point DFTs of the
even and odd numbered input data. Finally, these are
combined to achieve the 8-point DFT.
[Fig.4.3b Implementation of a Radix-2 Butterfly]
From this description, it is clear that N must be
restricted to an integral power of 2 for efficient
implementation i.e. $N=2^{\gamma}$, where $\gamma$ is an integer. There are
therefore $\gamma$ or $\log_2 N$ stages each with N/2 complex
butterflies so that a total of $(N/2)\log_2 N$ complex
butterflies are required to provide the N-point DFT. In
terms of real arithmetic, this is a total of $2N\log_2 N$
multiplications, which compares with $4N^2$ for the direct
calculation of the DFT. For N=1024, $2N\log_2 N$=20,480 and
$4N^2$=4,194,304; in this case, a saving of approximately
200:1 in processing time has been achieved.
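The butterfly of equations 4.10 and 4.11 and the multiplication counts quoted above can be sketched as follows. This is a recursive formulation written for clarity, not the in-place flow diagram of Fig.4.3a; the function names are illustrative, and the counts simply evaluate the two expressions given in the text.

```python
import cmath
import math


def fft_radix2(x):
    """Recursive decimation-in-time radix-2 FFT; len(x) must be a power of 2."""
    n_pts = len(x)
    if n_pts == 1:
        return list(x)
    if n_pts % 2:
        raise ValueError("transform length must be a power of 2")
    even = fft_radix2(x[0::2])  # DFT of the even numbered inputs
    odd = fft_radix2(x[1::2])   # DFT of the odd numbered inputs
    half = n_pts // 2
    out = [0j] * n_pts
    for k in range(half):
        twiddle = cmath.exp(-2j * cmath.pi * k / n_pts)
        product = twiddle * odd[k]
        out[k] = even[k] + product         # P = A + W B
        out[k + half] = even[k] - product  # Q = A - W B
    return out


def real_multiplications(n_pts):
    """Return (2 N log2 N, 4 N^2): real multiplies for the FFT and the direct DFT."""
    return 2 * n_pts * int(math.log2(n_pts)), 4 * n_pts ** 2


if __name__ == "__main__":
    fft_count, direct_count = real_multiplications(1024)
    print(fft_count, direct_count, direct_count / fft_count)
```

For N=1024 the ratio of the two counts is 204.8, which is the "approximately 200:1" saving quoted in the text.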
The FFT accuracy is limited by the finite word lengths
used in digital machines [73]. The error sources can be
divided into three categories: (1) the analogue input
quantisation, (2) the finite word lengths used to represent
the twiddle factors and (3) the truncation and round-off
within the butterflies. The last error source is the most
important because its effects are cumulative and depend on
the transform length. Each Fourier coefficient is processed
through $\log_2 N$ butterfly operations, which indicates that
higher accuracy is necessary for longer transforms.
Many other FFT algorithms have since been developed
either to capitalise on particular properties of the input
data or to optimise the speed-storage trade-off. For
example, if a larger memory is tolerable, a faster transform
may be obtained by increasing the FFT radix [67].
An FFT algorithm can be implemented in one of two ways: 
(a) in software on a general purpose mini- or 
micro-computer, or (b) as a specialised hardware structure. 
The software implementations tend to be used for 
non-real-time applications as the transform rate is limited 
(typically 600mS for 1024 complex points on microprocessor
based systems [74]). Hardware structures commonly employ a
single multiplexed high-speed butterfly [75] or make use of
pipelining to achieve a transform rate of up to 0065mS for 
1024 complex points [76]. However, the cost, power
consumption and size of these array processors greatly 
restrict the range of possible applications. 
4.2.3 On The Use Of Weighting Functions 
As discussed in section 4.2.1, the result of
transforming a finite sample of the input data (i.e. a
rectangular window) is to convolve the output with a
(sin x)/x response. This reduces the frequency resolution
and limits the amplitude resolution. The purpose of a
weighting function is to reduce the sidelobes without
significantly broadening the main lobe.
A weighting function is normally multiplied into the
input data before transformation and, in general, brings the
data smoothly to zero at the window edges. Many different
weighting schemes are available for this and Ref.[77] gives
an excellent summary of the most important of these.
Alternative figures of merit are given to facilitate the
most appropriate choice.
One of the most popular functions is the Hamming window
defined by
$$w_n = 0.54 - 0.46\cos(2\pi n/N) \qquad (4.12)$$
Theoretically, this window gives sidelobes which are
approximately -43dB down on the main peak, and broadens the
3dB main lobe width by 13 (when compared to a (sin x)/x),
which is accepted as a good compromise. Fig.4.4 compares
the rectangular window with that of the Hamming window.
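The effect of equation 4.12 on leakage can be checked numerically. The sketch below is illustrative: it assumes a 32-point window, a test tone of 4.5 cycles (deliberately off the sampling grid) and a comparison bin at k=16, none of which come from the thesis. It returns the magnitude of a distant bin relative to the spectral peak, first for the rectangular window and then for the Hamming window.

```python
import cmath
import math


def hamming(n_pts):
    """Hamming weights w_n = 0.54 - 0.46 cos(2 pi n / N), n = 0 .. N-1."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / n_pts) for n in range(n_pts)]


def leakage_ratio(samples, far_bin):
    """Magnitude of one distant DFT bin relative to the largest bin."""
    n_pts = len(samples)
    mags = [
        abs(sum(s * cmath.exp(-2j * cmath.pi * n * k / n_pts)
                for n, s in enumerate(samples)))
        for k in range(n_pts)
    ]
    return mags[far_bin] / max(mags)


def far_bin_leakage(n_pts, cycles, far_bin):
    """Return (rectangular, Hamming) leakage ratios for an off-grid cosine."""
    tone = [math.cos(2 * math.pi * cycles * n / n_pts) for n in range(n_pts)]
    weighted = [w * t for w, t in zip(hamming(n_pts), tone)]
    return leakage_ratio(tone, far_bin), leakage_ratio(weighted, far_bin)


if __name__ == "__main__":
    rect, ham = far_bin_leakage(32, 4.5, 16)
    print("rectangular:", rect, "Hamming:", ham)
```

The weighted data show the lower far-out leakage, at the cost of a wider main lobe.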
Unfortunately, the use of weighting functions 
inevitably leads to loss of data at the window edges. For 
on-line signal processing applications, it becomes necessary 
to overlap successive windows. For instance, if the 
transform is being used to detect short duration signals, 
the non-overlapped analysis could miss the event if it 
occurred near the boundaries. The amount of overlap depends 
on the weighting function used but is almost always between 




























4.3 THE CHIRP Z-TRANSFORM
The Chirp Z-Transform (CZT) [78], as suggested by its
name, is a restricted version of the Z-transform. It is
however considerably more general than the DFT. The main
additional freedoms offered by the CZT are: (a) the number
of time samples does not have to equal the number of samples
of the Z-transform, and (b) the summation contour in the Z
plane need not be a circle, but can spiral in or out with
respect to the origin. Historically, the CZT was not
considered as useful as the DFT since the special symmetries
which are exploited in an FFT derivation are absent.
However, because the CZT can be structured to allow
efficient real-time implementation of the DFT using analogue
CTD transversal filters, the CZT has recently become an
extremely important algorithm [79,80,81].
4.3.1 Derivation 
The finite Z-transform, $\{X_k\}$, of a sequence, $\{x_n\}$,
n=0,1,2,...,N-1, is defined as
$$X_k = \sum_{n=0}^{N-1} x_n\, z_k^{-n} \qquad (4.13)$$
The CZT can be derived from equation 4.13 by substituting
the restricted contour
$$z_k = A\, W^{-k}, \quad k = 0,1,\ldots,M-1 \qquad (4.14)$$
where M is an arbitrary integer and both A and W are
arbitrary complex numbers of the form
$$A = A_0 \exp(j2\pi\theta_0) \qquad (4.14a)$$
$$W = W_0 \exp(j2\pi\phi_0) \qquad (4.14b)$$
This general Z plane contour, Fig.4.5, begins at the
point z=A and, depending on the value of $W_0$, spirals in or
out with respect to the origin.
[Fig.4.5 The General Z-plane Contour of the CZT]
If $W_0$=1 then the contour is an arc of a circle. The angular
spacing of the samples is $2\pi\phi_0$. The special case of A=1,
M=N and W=exp(-j2π/N) corresponds to the DFT.
The key to the usefulness of the CZT is an equality
given by Bluestein [82]
$$2nk = n^2 + k^2 - (k - n)^2 \qquad (4.15)$$
The substitution of this equation into the restricted
Z-transform
$$X_k = \sum_{n=0}^{N-1} x_n\, A^{-n}\, W^{nk} \qquad (4.16)$$
gives
$$X_k = W^{k^2/2} \sum_{n=0}^{N-1} \left( x_n\, A^{-n}\, W^{n^2/2} \right) W^{-(k-n)^2/2} \qquad (4.17)$$
On close inspection equation 4.17 can be decomposed
into a three step process:
1. pre-multiplication of the input sequence, $\{x_n\}$, by
a weighting function to give an intermediate
sequence, $\{p_n\}$,
$$p_n = x_n\, A^{-n}\, W^{n^2/2} \qquad (4.18)$$
2. convolution of the $\{p_n\}$ with a sequence, $\{h_n\}$,
where
$$h_n = W^{-n^2/2} \qquad (4.19)$$
to form the sequence $\{q_k\}$,
$$q_k = \sum_{n=0}^{N-1} p_n\, h_{k-n} \qquad (4.20)$$
3. post-multiplication of $\{q_k\}$ by $W^{k^2/2}$
to give $\{X_k\}$
$$X_k = q_k\, W^{k^2/2} \qquad (4.21)$$
This three stage operation is illustrated in Fig.4.6.
[Fig.4.6 The Chirp Z-Transform]
The symbol * is used to represent convolution. The
advantage of the CZT in a practical implementation is now
clear, since a fixed kernel convolution can be performed by
a transversal filter in real-time i.e. data can be output as
fast as they can be input because the filter multiplies N
samples in parallel.
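The three step process (pre-multiply, convolve, post-multiply) can be sketched in software and checked against a direct evaluation of the Z-transform on the same contour. The code below is an illustration in Python, not the thesis hardware. The names `czt` and `z_transform_on_contour` are hypothetical, and the half-integer powers of W are formed from the principal logarithm of W, which is an implementation assumption made so that the three chirp factors recombine exactly into $W^{nk}$.

```python
import cmath


def czt(x, m_out, a, w):
    """Chirp Z-transform of x at z_k = a * w**(-k), k = 0 .. m_out-1."""
    n_in = len(x)
    half_log_w = cmath.log(w) / 2

    def chirp(exponent):
        # W ** (exponent / 2) on a single, consistent branch
        return cmath.exp(exponent * half_log_w)

    # Step 1: pre-multiplication, p_n = x_n A^-n W^(n^2/2)
    p = [x[n] * a ** (-n) * chirp(n * n) for n in range(n_in)]
    out = []
    for k in range(m_out):
        # Step 2: convolution with the fixed kernel h_n = W^(-n^2/2)
        q_k = sum(p[n] * chirp(-(k - n) ** 2) for n in range(n_in))
        # Step 3: post-multiplication by W^(k^2/2)
        out.append(q_k * chirp(k * k))
    return out


def z_transform_on_contour(x, m_out, a, w):
    """Direct evaluation of X_k = sum_n x_n z_k^-n, used only as a check."""
    return [
        sum(x[n] * (a * w ** (-k)) ** (-n) for n in range(len(x)))
        for k in range(m_out)
    ]


if __name__ == "__main__":
    data = [1, 2 - 1j, 0.5j, -3, 4]
    unit = cmath.exp(-2j * cmath.pi / len(data))
    print(czt(data, len(data), 1.0, unit))  # the DFT special case
```

Only step 2 depends on both n and k, and its kernel is fixed, which is what allows a transversal filter to perform it.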
Before proceeding to translate the mathematical CZT into
a realisable hardware configuration, the application of the
CZT's generality is examined. Firstly, is there any
advantage in transforming on contours other than the unit
circle (DFT)? In linear systems analysis, there is often a
need to determine the poles and zeroes. By making the CZT
contour pass close to these, the pole and zero positions
will be enhanced [83]. This particular advantage is an
exception. Most systems are characterised by their response
on the unit circle (i.e. the frequency response) and any
deviation from this standard would lead to confusion. In
signal processing, no advantage would be gained unless there
was a requirement to "zoom in" on features of interest by
adaptively shifting the contour. From the practical point
of view, major computational inaccuracies can arise when the
contour is moved significantly from the unit circle. This
is because the function $W_0^{\pm n^2/2}$ is required in the CZT
evaluation. ($W_0$ controls the rate of contour spiral). For
large N (e.g. 1000), if $W_0$ differs by very much from 1.0,
$W_0^{\pm n^2/2}$ can become very large or small when n becomes large
[83]. The second CZT freedom is that M, the number of
output samples, can be chosen independently from N, the
number of input samples. Also, the starting frequency of
the contour can be selected. This makes the CZT ideal for
high resolution, narrow-band analyses [83]. When using the
DFT for such analyses, many of the output points are of no
interest and therefore represent wasted processing.
It can be concluded from the above discussion that the
CZT's generality is not particularly useful in signal
processing, as most applications either require or prefer
the now standardised DFT. In the following sections, the
CZT will therefore be restricted to the special case
corresponding to the DFT.
4.3.2 Implementation 
The CZT algorithm for the DFT reduces to
$$X_k = e^{-j\pi k^2/N} \sum_{n=0}^{N-1} \left( x_n\, e^{-j\pi n^2/N} \right) e^{j\pi (k-n)^2/N} \qquad (4.22)$$
The pre-multiplying sequence is a constant amplitude complex
"chirp" or linear FM function (hence the name Chirp
Z-Transform) and the filter needed to perform the
convolution has an impulse response which is the time
reverse of the pre-multiplying signal. To perform the
arithmetic in equation 4.22 using real components demands a
parallel or interleaved structure with separate real and
imaginary channels. For example, the multiplication of the
two complex numbers $x_k = a_k + jb_k$ and $y_m = c_m + jd_m$ ($a_k$ etc. all
real) gives
$$x_k\, y_m = (a_k c_m - b_k d_m) + j(a_k d_m + b_k c_m) \qquad (4.23)$$
Thus the complex CZT processor requires four convolution
filters, four separate multipliers for pre-multiplication
and four multipliers for post-multiplication. Figure 4.7
shows the expanded block diagram. The pre- and
post-multiplying complex chirp functions are now represented
by their real and imaginary parts [viz. $\cos(\pi n^2/N)$ and
$\sin(\pi n^2/N)$].
A circular convolution is necessary in Equ.4.22
i.e. those values that are shifted from one end of the
summation interval are circulated into the other [84].
Transversal filters can only perform linear convolution.
However, a linear convolution can appear as a circular
convolution by doubling the length of the filter and padding
the input sequence with zeroes [84]. The transversal
filters shown in Fig.4.7 therefore need 2N-1 stages and have
impulse responses $C_m$ and $S_m$ given by
$$C_m = \cos[\pi (m - N)^2 / N], \quad m = 1,2,\ldots,2N-1$$
$$S_m = \sin[\pi (m - N)^2 / N], \quad m = 1,2,\ldots,2N-1 \qquad (4.24)$$
where m=k-n+N and m is the mth filter stage. (This can be
confirmed by noting that the range of the index k-n is
2N-1).
[Fig.4.7 Chirp Z-Transform Implementation: chirp pre-multiplication, real and imaginary channel transversal filters, chirp post-multiplication]
The operation of the processor is as follows. N
sequential samples of the input data are shifted into the
processor, pre-multiplied and loaded into the transversal
filters. At this point in time these contain N-1 leading
zeroes and N data points and after post-multiplication the
first output is available. The data are then shifted by one
stage and a zero is input to each filter. The second output
sample is now available. This operation is repeated until
all N output samples have been calculated, at which point
the transversal filters contain N leading data points and
N-1 trailing zeroes. When a new frame of N data samples is
shifted into the processor, the old data are shifted out.
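The real arithmetic structure described above can be modelled in software: cosine and sine chirp pre-multipliers, four real convolutions against the 2N-1 stage tap weights of equation 4.24, and cosine and sine chirp post-multipliers. The sketch below is a behavioural model only, under the assumption that each filter forms an ideal sum of products; it ignores all analogue effects, and the function name is hypothetical.

```python
import math


def czt_dft_real_arithmetic(re_in, im_in):
    """N-point DFT via the CZT using only real multiplies and adds."""
    n_pts = len(re_in)
    cos_chirp = [math.cos(math.pi * n * n / n_pts) for n in range(n_pts)]
    sin_chirp = [math.sin(math.pi * n * n / n_pts) for n in range(n_pts)]

    # Pre-multiplication by exp(-j pi n^2 / N) = cos - j sin
    p_re = [re_in[n] * cos_chirp[n] + im_in[n] * sin_chirp[n] for n in range(n_pts)]
    p_im = [im_in[n] * cos_chirp[n] - re_in[n] * sin_chirp[n] for n in range(n_pts)]

    # Tap weights C_m and S_m for stages m = 1 .. 2N-1
    tap_c = [math.cos(math.pi * (m - n_pts) ** 2 / n_pts) for m in range(1, 2 * n_pts)]
    tap_s = [math.sin(math.pi * (m - n_pts) ** 2 / n_pts) for m in range(1, 2 * n_pts)]

    spectrum = []
    for k in range(n_pts):
        q_re = 0.0
        q_im = 0.0
        for n in range(n_pts):
            m = k - n + n_pts  # filter stage, in the range 1 .. 2N-1
            c_m = tap_c[m - 1]
            s_m = tap_s[m - 1]
            # Four real convolutions: (p_re + j p_im)(C_m + j S_m)
            q_re += p_re[n] * c_m - p_im[n] * s_m
            q_im += p_re[n] * s_m + p_im[n] * c_m
        # Post-multiplication by exp(-j pi k^2 / N)
        x_re = q_re * cos_chirp[k] + q_im * sin_chirp[k]
        x_im = q_im * cos_chirp[k] - q_re * sin_chirp[k]
        spectrum.append(complex(x_re, x_im))
    return spectrum


if __name__ == "__main__":
    print(czt_dft_real_arithmetic([1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 0.0, 0.0]))
```

For each output k only N of the 2N-1 stages carry data, which corresponds to the inefficient filter use noted in the text.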
Several undesirable features of this implementation are
now apparent. The output must be blanked during the loading
of the data and the input must be set to zero during the
calculation of the coefficients. This means that the
processor has a duty cycle of approximately only 50%
(100N/(2N-1)%). In addition, inefficient use is made of the
filters since only half of each contains useful information
at any point in time.
The use of weighting functions (section 4.2.3) to
reduce (sin x)/x sidelobes can be readily incorporated into
the pre-multiplying chirps without the addition of extra
hardware.
4.3.3 Hardware Reduction
The implementation described in the previous section is
configured to process a complex input (real and imaginary
parts) and produce a complex output. In many applications,
this complexity is not required and savings in hardware are
possible.
Table 4.1 summarises several properties of the FT which
may be used to reduce the CZT hardware. If the input data
are either real or imaginary only (properties 1 and 2 in
Table 4.1) then two of the pre-multipliers and both of the
input summers are redundant. The reduced pre-multiplication
structure is shown in Fig.4.8. For the restricted inputs
described by cases 3 to 8 in Table 4.1, there is a similar
situation in the post-multiplication circuitry and the
hardware savings are shown in Fig.4.9.

      Time Domain h(t)              Frequency Domain H(f)
  1   Real                          Real part even, imaginary part odd
  2   Imaginary                     Real part odd, imaginary part even
  3   Real even, imaginary odd      Real
  4   Real odd, imaginary even      Imaginary
  5   Real and even                 Real and even
  6   Real and odd                  Imaginary and odd
  7   Imaginary and even            Imaginary and even
  8   Imaginary and odd             Real and odd
  9   Complex and even              Complex and even
 10   Complex and odd               Complex and odd
Table 4.1 Properties of the FT

[Fig.4.10 Modulus Circuit for Power Spectra]
When only the power spectrum is wanted i.e. when the
phase spectrum is irrelevant, the complex
post-multiplication may be replaced by a modulus circuit
consisting of two squarers and one summer (Fig.4.10). It
can be seen from equation 4.22 that the complex
post-multiplication function, $\exp(-j\pi k^2/N)$, contributes only
to the phase of $\{X_k\}$.
Finally, a technique exists to process simultaneously
the DFT of two independent N-point real sequences in a
single N-point transformer [69]. A slight increase in
peripheral circuitry is required, but by doubling the
throughput a substantial increase in hardware efficiency is
achieved. If the sequences $\{h_n\}$ and $\{g_n\}$, n=0,1,2,...,N-1, are
purely real, then the complex sequence $\{x_n\}$ can be formed by
taking one of the sequences to be imaginary, i.e.
$$x_n = h_n + j g_n \qquad (4.25)$$
The DFT of this sequence, $\{X_k\}$, is
$$X_k = R_k + j I_k, \quad k = 0,1,\ldots,N-1 \qquad (4.26)$$
where $R_k$ and $I_k$ are the real and imaginary parts of $X_k$
respectively. By making use of the even and odd properties
of the FT, in particular 1 and 2 in Table 4.1, it can be
shown that
$$2 H_k = (R_k + R_{N-k}) + j (I_k - I_{N-k}) \qquad (4.27)$$
and
$$2 G_k = (I_k + I_{N-k}) - j (R_k - R_{N-k}) \qquad (4.28)$$
where $\{H_k\}$ and $\{G_k\}$ are the DFTs of $\{h_n\}$ and $\{g_n\}$
respectively.
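Equations 4.27 and 4.28 can be checked with a short numerical sketch. The code below is illustrative Python, with hypothetical function names; it uses a direct DFT as the "single N-point transformer" and takes the index N-k modulo N so that k=0 pairs with itself, which is an assumption about how the end point is handled.

```python
import cmath


def dft(x):
    """Direct N-point DFT used as the single complex transformer."""
    n_pts = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / n_pts) for n in range(n_pts))
        for k in range(n_pts)
    ]


def two_real_dfts(h, g):
    """DFTs of two real sequences from one complex DFT of x_n = h_n + j g_n."""
    n_pts = len(h)
    big_x = dft([complex(h_n, g_n) for h_n, g_n in zip(h, g)])
    big_h = []
    big_g = []
    for k in range(n_pts):
        mirror = big_x[(n_pts - k) % n_pts]  # X_{N-k}, with X_N taken as X_0
        r_k, i_k = big_x[k].real, big_x[k].imag
        r_m, i_m = mirror.real, mirror.imag
        big_h.append(complex(r_k + r_m, i_k - i_m) / 2)     # equation 4.27
        big_g.append(complex(i_k + i_m, -(r_k - r_m)) / 2)  # equation 4.28
    return big_h, big_g


if __name__ == "__main__":
    h_spec, g_spec = two_real_dfts([1.0, 2.0, 0.0, -1.0], [0.5, 0.0, 3.0, 2.0])
    print(h_spec)
    print(g_spec)
```

The separation needs only sums and differences of mirrored outputs, consistent with the summers and reversing stores described in the text.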
The extra hardware required to sort out the results, as 






Fig.4.11 Doubling CZT Throughput for Real Inputs
The main components are two N/2 stage first-in last-out
analogue delay lines. The first N/2 CZT output points are
read into the stores and then read out in reverse order in
parallel with the second N/2 CZT output points. After the
appropriate summations, the intended data are available.
Note that in this implementation the output data are in
reverse order and only the unique N/2 points in each
sequence are output. If all N output points are called for
the real outputs can be reflected and the imaginary outputs
reflected and inverted (property 1 in Table 4.1). Thus a
doubling in throughput may be obtained by adding N stages of
delay and four summers to the complex CZT structure.
4.3.4 Inaccuracies and Limitations 
In a CCD implementation of the CZT, the main error
sources [85] are due to (1) transversal filter weight
accuracy (2) pre- and post-multiplier quantisation (assuming
that the sequences are stored in digital ROM) (3) thermal
noise and (4) charge transfer efficiency. These errors will
be discussed in terms of an r.m.s. noise to signal ratio
(N/S) [86] for the purposes of comparison with other
transform techniques. In section 5.2 a more practical
error definition, the peak error to peak signal ratio, is
used in a computer simulation of the CZT.
The weighting coefficient accuracy depends on the
particular technique employed. If the tap weights are
formed by floating gates (section 3.2.2) and external
resistors then a percentage error is appropriate. This is
dealt with in section 5.2. The most common tapping
technique for fixed weight transversal filters is the split
gate (section 3.2.2) where the error depends on the
photolithographic resolution. It has been shown [6] that
errors for a 500 point CZT are in the order of 0.08%
(-62dB). The N/S ratio is approximately independent of N,
the number of transform points.
Pre- and post-multiplier sequences are stored typically
as 8 bits (including sign) and this quantisation gives a
dominant random error in the CZT of about 0.3% (-50dB) [6].
The thermal noise in CCDs gives rise to a signal
independent error source analogous to input quantisation in
the FFT. N/S ratios of less than -70dB are possible in
CCDs, making this error source relatively insignificant.
The final CCD error source is charge transfer
efficiency. Since this is a coherent error, its effects can
be expressed both as degradation in CZT frequency resolution
and as an N/S ratio. The study of a 64 point CZT [86] has
shown that for a charge transfer inefficiency, ε, of 000001
the N/S ratio is -40dB. In terms of frequency resolution
degradation, the sensitivity is found to be three times
worse for high frequency than for low frequency inputs.
Since longer transforms call for longer transversal filters,
the N/S ratio increases with N. For a given ε, the N/S
ratio increases by about 5dB for each doubling of N.
Techniques are available for charge transfer compensation
(section 3.3) but these tend to be impractical.
The performance limitations of the CCD CZT are set by
various aspects of the analogue circuitry. The realistic
real-time bandwidth of the processor is limited to 5MHz by
peripheral electronics (10MHz clock rate) and significantly
less (1MHz) for fully integrated implementations. Charge
transfer efficiency and also physical chip size limit the
number of transform points to a maximum of 500 (1000 stage
convolvers) and thermally generated dark current in the CCDs
restricts the total storage time to several hundred
milliseconds. This sets the maximum resolution. Due to the
processing gain in matched chirp filters (which is defined
as the square root of the time-bandwidth product), the CZT's
linear dynamic range is limited not by the CCDs but by the
output analogue multipliers. In a processor configured for
power spectrum output the linear dynamic range is limited to
40dB by the squaring multipliers [87]. The overall accuracy
of a 500 point CZT can be likened to an equivalent 13-bit
FFT [6].
4.4 THE SLIDING CHIRP Z-TRANSFORM
The sliding variation of the direct CZT permits a
reduction in transform hardware but does not give the true
DFT for a general input [6]. The Sliding CZT (SCZT) can be
defined as
$$X_k = e^{-j\pi k^2/N} \sum_{n=k}^{k+N-1} x_n\, e^{-j\pi n^2/N}\, e^{j\pi (k-n)^2/N} \qquad (4.29)$$
The difference between the direct CZT and equation 4.29
is the summation index which is incremented for each new
spectral component so that the current N-point transform
depends on data from the immediately following N input
points. The only class of input signal for which the SCZT
gives the exact DFT is a periodic waveform in N,
i.e. $x_n = x_{n+N}$; this is an extremely restricted input.
However, if only the power spectral density is required, the
range of inputs can be expanded to cover any stationary
signal because the indexing only results in a modified
phase. (A stationary signal has a constant amplitude
spectrum even though each N point time record is different).
Examination of equation 4.29 reveals the main advantage
of the SCZT: for an N point transform, the convolution
process demands the filters to have only N stages. The
filter impulse responses are defined by
$$C_m = \cos(\pi (m - N)^2 / N), \quad m = 1,2,\ldots,N \qquad (4.30)$$
$$S_m = \sin(\pi (m - N)^2 / N), \quad m = 1,2,\ldots,N \qquad (4.31)$$
where m=k-n+N and m is the mth filter stage. Apart from the
obvious hardware savings, the reduced filter lengths mean
that the transform degradation due to imperfect charge
transfer efficiency is less in the SCZT than in the direct
CZT. In addition, since one new data point is input for
each spectral coefficient output, the processor has a 100%
duty cycle and blanking is not necessary.
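The behaviour of equation 4.29 can be explored with a direct software evaluation. The sketch below is illustrative Python, and the function name is hypothetical. It needs 2N-1 input samples because coefficient k is formed from samples k to k+N-1. For an input that repeats with period N the result equals the DFT of one period. For a single complex exponential that is not periodic in N only the phase differs, so the magnitudes still match those of the DFT of the first N samples.

```python
import cmath


def sliding_czt(x, n_pts):
    """Evaluate the Sliding CZT of equation 4.29 for k = 0 .. N-1."""
    if len(x) < 2 * n_pts - 1:
        raise ValueError("at least 2N-1 input samples are required")
    out = []
    for k in range(n_pts):
        acc = 0j
        for n in range(k, k + n_pts):
            pre = x[n] * cmath.exp(-1j * cmath.pi * n * n / n_pts)       # chirp pre-multiply
            acc += pre * cmath.exp(1j * cmath.pi * (k - n) ** 2 / n_pts)  # chirp filter
        out.append(cmath.exp(-1j * cmath.pi * k * k / n_pts) * acc)       # post-multiply
    return out


if __name__ == "__main__":
    period = [1.0, -2.0, 3.5, 0.0, 2.0, 1.0, -1.0, 4.0]
    print(sliding_czt(period + period, 8))  # periodic input: equals the DFT of one period
```

Within each sum the index k-n runs from 0 down to -(N-1), which is why the filters need only N stages.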
4.5 THE PRIME TRANSFORM
The Prime Transform (PT) is an alternative algorithm
for computing the DFT. It is suited to CCD implementation
because the bulk of the computation is performed in
transversal filters. Using a concept from number theory,
Rader [88] has demonstrated that the DFT can be calculated
from the correlation of two sequences if the number of data
samples is prime.
4.5.1 Derivation 
The derivation begins by writing the DFT in a more
convenient form
$$X_0 = \sum_{n=0}^{N-1} x_n \qquad (4.32)$$
$$X_k = x_0 + \sum_{n=1}^{N-1} x_n\, W^{nk}, \quad k = 1,2,\ldots,N-1 \qquad (4.33)$$
where W=exp(-j2π/N). If N is chosen to be prime, there
exists at least one integer R, called a primitive root,
which will produce a one to one mapping of the integers n'
to the integers n according to the relationship
$$n = R^{n'}\ \mathrm{modulo}(N), \quad n, n' = 1,2,\ldots,N-1 \qquad (4.34)$$
Similarly,
$$k = R^{k'}\ \mathrm{modulo}(N), \quad k, k' = 1,2,\ldots,N-1 \qquad (4.35)$$
Taking advantage of the cyclic properties of W, it can be
shown [88] that equation 4.33 can be rewritten as
$$X_{((R^{k'}))} = x_0 + \sum_{n'=1}^{N-1} x_{((R^{n'}))}\, W^{((R^{(k'+n')}))} \qquad (4.36)$$
where ((a)) represents a modulo(N). Equations 4.32 and 4.36
together form the PT. The term $X_0$ has to be calculated
separately because $((R^{k'})) \neq 0$.
Equation 4.36 thus represents a circular correlation of 
the permuted (reordered) input sequence (x R 	} with the 
permuted values of the complex sinusoid { } 	The 
resulting sequence {x 	 } is a permuted sequence of the 
DFT coefficients {X}0 
Fig.4.12 The Prime Transform (blocks: sample at n=0; permutation n = R^{n'} modulo(N); correlation)
Fig.4.12 gives the hardware configuration for the complex
PT.
4.5.2 Implementation 
The PT architecture has to be configured for complex
arithmetic using real components (Fig.4.13) in much the same
way as the CZT.
Fig.4.13 Expanded PT Architecture (inputs Re(x_n), Im(x_n); outputs Re(X_k), Im(X_k))
Note that in Fig.4.13, the calculation of X_0 and the
addition of the spectral offset, x_0, are not included for
clarity. It can be seen that the basic difference between
the CZT and the PT is the replacement of the analogue
multipliers by analogue permuters. Data permutation,
however, does not involve complex arithmetic and hence
yields the simpler structure. The permuters can be
implemented using Analogue Random Access Memory (ARAM) [89]
or a CCD store in conjunction with an analogue demultiplexer
[90].
The circular correlations are performed in transversal
filters. Since these filters inherently perform linear
convolution, one of the two sequences has to be time
reversed. Examination of equation 4.36 reveals that a total
of 2N-3 stages are required in each filter. The filter
impulse responses are given by

    C^P_m = \cos\left[2\pi ((R^{m+1})) / N\right], \quad m = 1, 2, \ldots, 2N-3 \qquad \ldots (4.37)
    S^P_m = \sin\left[2\pi ((R^{m+1})) / N\right], \quad m = 1, 2, \ldots, 2N-3 \qquad \ldots (4.38)

where m=k'+n'-1 and m is the mth filter stage. The
operation of this convolution is similar to that in the CZT
and a duty cycle of approximately 50% results.
The inverse permutations at the output are included to
reorder the PT coefficients, thereby giving the conventional
DFT. However, if subsequent processing involves inverse
Prime transforming, these may be omitted.
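The permute, correlate and inverse-permute sequence described above can be checked numerically. The following is a minimal sketch only, assuming NumPy is available; the function names `primitive_root` and `prime_transform` are hypothetical and the brute-force root search is chosen for clarity rather than speed. It evaluates equations 4.32 and 4.36 directly in software and makes no attempt to model the CCD hardware.

```python
import numpy as np

def primitive_root(N):
    """Smallest primitive root of the odd prime N (brute-force search)."""
    for R in range(2, N):
        # R is a primitive root if its powers visit every integer 1..N-1
        if len({pow(R, e, N) for e in range(1, N)}) == N - 1:
            return R
    raise ValueError("no primitive root found; N must be an odd prime")

def prime_transform(x):
    """DFT of a prime-length sequence via the permuted correlation of eq. 4.36."""
    x = np.asarray(x, dtype=complex)
    N = len(x)
    R = primitive_root(N)
    W = np.exp(-2j * np.pi / N)
    # perm[n'-1] = ((R^n')) for n' = 1 .. N-1 : the input permutation code
    perm = np.array([pow(R, e, N) for e in range(1, N)])
    permuted_input = x[perm]
    X = np.empty(N, dtype=complex)
    X[0] = x.sum()                                   # eq. 4.32, computed separately
    for kp in range(1, N):
        # reference sequence W^((R^(k'+n'))) for n' = 1 .. N-1
        ref = W ** np.array([pow(R, kp + n, N) for n in range(1, N)])
        # correlation output is stored at the permuted index ((R^k'))
        X[perm[kp - 1]] = x[0] + np.sum(permuted_input * ref)
    return X
```

Writing each result to index ((R^{k'})) plays the role of the inverse permuter; for a prime length such as N=67 the result should agree with a conventional DFT routine to within rounding error.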
4.5.3 Hardware Reduction
In contrast to the CZT, special cases of input signal
result in significant hardware savings. For the case of
real data only, two permuters, two convolvers and two
summers become redundant (Fig.4.14). A further reduction is
possible if the data are purely real and also even. In this
case there is no imaginary output and the PT is reduced to
the Discrete Cosine Transform (DCT), requiring only one
correlator (Fig.4.15).
Fig.4.14 PT Hardware Reduction for Real I/P Data Only
Fig.4.15 PT for Real and Even Input Data (DCT) (signal path: x_n, permute, cos correlation, inverse permute, Re(X_k))
4.5.4 Errors and Limitations
The error sources in the PT are similar to those in the
CZT with the exception of the multiplier quantisation which
has been replaced by a permuter error.
It has been found [86] that the effect of tap weight
error in the PT is the same as that in the CZT. The effect
of charge transfer efficiency on the PT is not as
straightforward as in the CZT case because the correlation
for a particular DFT coefficient depends on the permutation
code and not on the linear position in the filter. It is
therefore not possible to treat charge transfer efficiency
as a simple degradation in frequency resolution. For a 67
point PT with a charge transfer inefficiency of 00001 the 
N/S ratio is approximately -41dB [86], which is marginally
better than the equivalent CZT.
The Achilles' heel in the PT is the analogue permuter.
State-of-the-art permuters using ARAM have an accuracy of
between 5% and 10% [89]. An accuracy of 5% gives a N/S
ratio of -30dB [86]. The alternative is to perform the
permutation digitally i.e. A to D convert, store in digital
RAM and finally D to A convert the reordered words. While
this approach may achieve superior performance, many of the
engineering advantages associated with a CCD PT
implementation are lost.
4.6 COMPARISON OF REAL-TIME SPECTRUM ANALYSERS
Table 4.2 summarises the main operating characteristics
for real-time spectrum analysers operating below 5MHz.
There are three distinct approaches illustrated here: the 
digital FFT, the analogue CCD DFT processor and the parallel 
PROCESSOR F F P 1 F P F F P C Z P C Z P S C Z T P P Filter Bank Filter Bank Filter Dank 
:WM 
nicrocomputer 1 2 L Bit Slice Specialised Integrated Discrete Discrete Discrete CCD IC Switch, Cap. Digital 
RAMUM Stand-Alone Custom Hardware (1977) IC Discrete 
x, Number of 
1024 512 1024 64 512 1024 512 32 52 32 
anstorin Points 
ansfon Speed for Parallel Parallel 
275cjs 9ma 0,65ms ----- 0,1mm 0.1mm O.Ims Processing Processing 
2 Complex Points 
al-time Process- 
2 kHz 55 kHz 780 kHz 1 11Hz 5 	4ffz 5 MHz 5 MHz 200 kHz 20 kHz 5 kHz 
Bandwidth 
uracy 0.1% 0,1% 0,1% 1% 1% 1% 1% 1% 0.3% 0,1% 
riam.tc Range 12 bits 12 bite 12 bite 40 dB 50 &B 50 dB 60 aD 70 aD 70 dB 13 bits 
of CCD Stages 
---- 811 811 411 811 lOON 
N pt cmplr trans 
it 	tt £4500 13000 * £75000 9200 * £2000 £2000 £2000 * £200 £200 ** £5000 ° 
I Board I Board 1 Board 
rsicai. Size 50x19"x20" 9 chips 4311x19flx2811 IC 4'x9" 4"x9" 4"x9" IC IC 19"x12"x5' 
Flexibility Accuracy Small Size 1006 Duty Hardware Small Size Small Size 
'antages Flexibility Accuracy and Lob, Power High speed Cycle Reduction for Low Power Semi- Programmable 
Low Power Speed Low Coat Special Inputs Programmable 
Limited Limited Power 50% Duty 	50% Duty 	Power Spectra 50% Duty Fixed Audio 
advantages Real Time Speed Consumption Cycle Cycle Only Cycle haracteristic Bandwidth Large Size 
Application Size and Cost Analogue Multipliers Analog Permute] Only 
t Plessey M1CPr0c 	 * - Estimate 	- Estimate for large quantities 
tt hardware costs only 	- 	 - - 

























filter bank. Each of these is complementary in that their
application areas are well defined and tend not to overlap.
The FFT is used in cases demanding high accuracy
coupled with high resolution and, at present, there is no
alternative approach. Major problems arise when the
real-time bandwidth is much greater than 50kHz and the only
solution involves an exponential increase in power, cost and
size.
In applications requiring fewer than 32 frequency
points the "brute force" parallel filter bank will often
give the optimum engineering result. This is especially
true for the analogue filter bank because there is no need
for A to D and D to A conversion.
The CCD transform processor fits into application areas 
requiring low power, low cost and only modest accuracy. For 
a fully complex processor, there is little to choose between
the CZT and the PT. 	However, in cases where the input 
signal can be restricted sufficiently, the PT 	offers 
distinct hardware advantages. 	The SCZT provides the best 
solution when the input signal is stationary and only power 
spectra are required. 
CHAPTER 5 
THE DESIGN AND CONSTRUCTION 
OF A 
CCD CHIRP Z-TRANSFORM PROCESSOR 
This chapter describes the practical considerations
taken into account during the design and construction of a
CCD CZT processor. The main design objectives are
summarised in section 5.1. Section 5.2 discusses CZT
computer simulation results which allow component tolerances
to be specified for the implementation in section 5.3.
Finally, the hardware performance is examined in section
5.4.
5.1 DESIGN OBJECTIVES
The main objective was the design and construction of a 
CZT processor suitable for speech processing. For research 
purposes, maximum flexibility was desirable together with 
minimal hardware overcomplication. In addition, the 
operating speed had to be maximised without the need for 
special high-bandwidth circuit techniques. The restricted 
availability of suitable CCD transversal filters (Jan.1977) 
necessitated consideration of hardware efficiency. 
The preliminary design specifications resulting from
the above were as follows:
1. CONFIGURATION: a structure permitting either a
32-point direct CZT or a 64-point sliding CZT
2. INPUTS: complex (real and imaginary), used in
conjunction with a 90-degree phase difference
network (section 5.3.5) to maximise convolver
efficiency
3. OUTPUTS: power spectra only
4. WEIGHTING: optional (rectangular or Hamming
windows)
5. MAX. PROCESSING BANDWIDTH: >100kHz real-time
bandwidth (>200kHz clock frequency)
6. LINEAR DYNAMIC RANGE: >40dB
7. POWER DISSIPATION: <10W
5.2 COMPUTER SIMULATION
The need for computer simulation arises because the
mathematical analysis of the CZT using real signal
representation is a very laborious task (see Appendix B).
Moreover, it is often difficult to obtain a closed
mathematical solution.
The computer simulation strategy centres around the
technique employed to calculate the chirp filter
convolutions. These can be computed either directly
(i.e. 4N² multiplications) or by means of the convolution
theorem and an FFT routine. The direct method is by far the
simplest because the hardware structure can be simulated
exactly without FFT errors interfering with the results.
However, for large N, the direct method is inefficient.
Since in this simulation the transform length is less than
or equal to 64 the direct convolution method has been
adopted and the CCD transversal filters are modelled in the
time domain. The simulation flow diagram is the same as the
block diagram in Fig.4.7 with the post-multiplication
replaced by the modulus circuit (Fig.4.10).
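The two ways of computing the chirp filter convolution mentioned above can be compared in a few lines. This is an illustrative sketch in Python with NumPy and not the original IMP simulation; the function names `direct_convolve` and `fft_convolve` are hypothetical.

```python
import numpy as np

def direct_convolve(g, h):
    """Linear convolution by explicit multiply-accumulate, as in a transversal filter."""
    g = np.asarray(g, dtype=complex)
    h = np.asarray(h, dtype=complex)
    out = np.zeros(len(g) + len(h) - 1, dtype=complex)
    for n, gn in enumerate(g):
        # each input sample contributes a scaled, shifted copy of the tap weights
        out[n:n + len(h)] += gn * h
    return out

def fft_convolve(g, h):
    """The same linear convolution via the convolution theorem."""
    L = len(g) + len(h) - 1          # zero-pad so circular wrap-around cannot occur
    return np.fft.ifft(np.fft.fft(g, L) * np.fft.fft(h, L))
```

The direct form maps one-to-one onto the tapped delay line, which is what allows individual tap errors and charge transfer effects to be inserted at the point where they occur in hardware.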
As already discussed in section 4.3.4, the most
significant errors in the CZT are due to pre-multiplier
sequence quantisation (when the pre-multiplier is a
Multiplying D to A Converter (MDAC)), analogue post-multiplier
accuracy, transversal filter weight accuracy and
charge transfer inefficiency. One additional error source
has been considered here: phase shifter accuracy when
generating pseudo complex data (section 5.3.5). In sections
5.2.2 through 5.2.7 these errors are quantified in terms of
a peak error to peak signal ratio. This is a more practical
error classification than the r.m.s. equivalent because in
many CZT applications, spurious peak errors may
significantly alter the outcome of automatic decision
algorithms.
To help in the understanding of the hardware operation,
the simulation has been used to obtain a graphical analysis
of the CZT i.e. the computer has been used as a software
oscilloscope. This analysis is presented in section 5.2.1.
The simulation was written in the Edinburgh IMP
language [91] and run on a large time-shared twin ICL 4-75
computer configuration.
5.2.1 Graphical Analysis of the CZT 
The graphical analysis is based on a 64-point direct
CZT. Reference is made to the block diagram in Fig.4.7.
The pre-multiplying chirps are plotted in Fig.5.1 and
appear as 'V' chirps because of aliasing at N/2. If the
sample frequency is f_s, then the waveforms chirp from dc
through f_s/2 to dc. When the input signal is a complex
basis vector (Figs.5.2a and 5.2b), the pre-multiplication
produces the real and imaginary waveforms shown in Figs.5.2c
and 5.2d. These waveforms may be considered in terms of sum
and difference frequency sidebands. (Note that because the
system is at baseband, these sidebands cannot be separated.)
The upper sideband chirps from f_i (where f_i is the tone
input frequency) -> f_s/2 -> dc -> f_i and the lower sideband
chirps from f_i -> dc -> f_s/2 -> f_i (Fig.5.3).
For circular convolution, the convolver filter impulse
responses are double 'V' chirps lasting for 2N-1 samples.
Fig.5.1 CZT Pre-multiplier Waveforms (panel title: cos pre-multiplying chirp; amplitude against n)
Fig.5.2 CZT Pre-multiplication
Fig.5.3 Sideband Representation of Pre-multiplication
The complex convolution is illustrated graphically in
Fig.5.4. After the pre-multiplied signal has been loaded
into the convolver (t=0), the process of shifting and
multiplying begins. At time t = N f_i / f_s^2, the frequencies in
the upper sideband of the pre-multiplied input signal match
exactly with those in the convolver, and a convolution peak
is output. Similarly, at t = N(1 - f_i/f_s)/f_s, the lower
sideband matches. The timing of a convolution peak is thus
proportional to input frequency. When the complex convolver
is split into four real channels, the individual outputs are






























Fig.5.4 Graphical CZT Convolution (labels: sideband matched; position in filter)
Fig.5.5 Individual Convolution Plots ((a) real cos conv output; (b) real sin conv output; (c) imaginary cos conv output; (d) imaginary sin conv output)
is given in Appendix B. Note that because the input signal
is complex, the even and odd properties of COS and SIN
combine to cancel one of the convolution peaks. The real
and imaginary convolution outputs (after summing the
individual filter outputs) are plotted in Figs.5.6a and 5.6b
respectively.
The modulus operation at the CZT's output removes the
dependence on input phase giving the power spectrum in
Fig.5.6c.
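The chain of pre-multiplication, chirp convolution and modulus described in this section can be reproduced with a short time-domain model. The sketch below is an idealised Python/NumPy illustration (no component errors), not the original simulation; the function name `czt_direct` and the choice of input bin `k0 = 5` are assumptions made for the example.

```python
import numpy as np

def czt_direct(x):
    """N-point direct CZT: pre-multiply, convolve with a 2N-1 stage chirp, post-multiply."""
    x = np.asarray(x, dtype=complex)
    N = len(x)
    n = np.arange(N)
    pre = np.exp(-1j * np.pi * n**2 / N)           # pre-multiplying chirp
    m = np.arange(-(N - 1), N)                     # filter stage index, 2N-1 stages
    h = np.exp(1j * np.pi * m**2 / N)              # chirp filter impulse response
    # full linear convolution; outputs k = 0..N-1 appear at indices N-1..2N-2
    y = np.convolve(x * pre, h)[N - 1:2 * N - 1]
    post = np.exp(-1j * np.pi * n**2 / N)          # post-multiplying chirp
    return y * post

N, k0 = 64, 5
x = np.exp(2j * np.pi * k0 * np.arange(N) / N)     # complex basis vector input
power = np.abs(czt_direct(x)) ** 2                 # the modulus removes the phase chirp
```

Because only the modulus is taken, the post-multiplier has no effect on `power`; this is why the hardware can replace it by the modulus circuit when only power spectra are needed.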
5.2.2 Pre-multiplier Quantisation Errors 
In hardware, the pre-multiplying sequences are normally
stored in ROM to an accuracy of b bits including sign and
the input signal is pre-multiplied in an MDAC. If it is
assumed that the MDAC is accurate to within ½ LSB of the
pre-multiplying sequence, the pre-multiplication error is
due to the level of quantisation.
The computer simulation was set to calculate a 64-point
direct CZT and the input chosen as a complex basis vector.
The pre-multiplier quantisation was varied from 11 to 6 bits
and all other error sources set to zero. A typical output
from this simulation is given in Fig.5.7. Here the
quantisation is 7 bits and the peak error to signal ratio
Fig.5.6 CZT Outputs ((a) real channel output; (b) imaginary channel output; (c) linear power spectrum)
Fig.5.7 Typical Simulation Output - Quantisation Accuracy
Fig.5.8 Errors Due to Pre-multiplier Quantisation (horizontal axis: quantisation accuracy in bits)
(PE/S) is -51.7dB. Fig.5.8 shows a graph of quantisation
accuracy against the PE/S ratio and, as expected, the PE/S
is increased by about 6dB for each quantisation step. Note
that for real inputs only, the peak signal is reduced by 6dB
(because of the spectral image) and, consequently, the PE/S
ratio is degraded similarly. It can be seen that a
quantisation accuracy of at least 8 bits (including sign) is
required for PE/S ratios of less than -50dB.
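The trend of roughly 6dB per bit can be explored with a simple model of the quantised pre-multiplier. The sketch below is a Python/NumPy illustration only: the rounding rule in `quantise`, the input bin `k0 = 5` and the function names are assumptions, so the absolute PE/S figures it produces need not match those of the original simulation.

```python
import numpy as np

def quantise(v, bits):
    """Round v (in [-1, 1]) to a signed grid of `bits` bits including sign (assumed rule)."""
    levels = 2 ** (bits - 1) - 1
    return np.round(v * levels) / levels

def pe_to_s_db(bits, N=64, k0=5):
    """Peak error to peak signal ratio (dB) for a basis vector at bin k0."""
    n = np.arange(N)
    x = np.exp(2j * np.pi * k0 * n / N)
    chirp = np.exp(-1j * np.pi * n**2 / N)
    # cos and sin pre-multiplying sequences are quantised independently, as in ROM
    pre = quantise(chirp.real, bits) + 1j * quantise(chirp.imag, bits)
    m = np.arange(-(N - 1), N)
    h = np.exp(1j * np.pi * m**2 / N)                  # ideal chirp filter
    mag = np.abs(np.convolve(x * pre, h)[N - 1:2 * N - 1])
    peak_signal = mag[k0]
    peak_error = np.delete(mag, k0).max()              # largest spurious output
    return 20 * np.log10(peak_error / peak_signal)

for b in range(11, 5, -1):
    print(b, "bits:", round(pe_to_s_db(b), 1), "dB")
```

All other error sources are zero here, so any output away from bin `k0` is attributable to the quantisation alone.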
5.2.3 CCD Tap Weight Tolerance
The CCDs available for experimental use were floating
gate tapped delay lines. This implied the use of external
discrete resistors for tap weighting. A percentage
tolerance is therefore an appropriate error classification
i.e.

    \text{Tap Weight} = R \left(1 + \frac{\delta T}{100}\right) \qquad \ldots (5.1)

where R is the exact tap weight, δ is a random number
(Gaussian distribution) in the range -1 to +1 and T is the
percentage tolerance. The tolerance T includes both the CCD
tap amplifier gain mismatch and the resistor accuracy.
Fig.5.9 shows the simulation results for a 64-point direct
CZT with basis vector inputs. Once again, when the input
data are real only, there is a 6dB loss in peak signal.
Fig.5.9 CCD Tap Weight Accuracy (horizontal axis: % tolerance)
These results indicate that tap accuracy should be at least
1% for peak errors of less than -50dB.
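A tap weight tolerance of the form of equation 5.1 can be applied to the four real chirp filters of the complex convolver as sketched below. This is a hedged Python/NumPy illustration: the Gaussian spread (standard deviation 1/3, clipped to ±1), the fixed random seed, the input bin and the function name are all assumptions made for the example, and the figures it gives are not those of Fig.5.9.

```python
import numpy as np

def tap_error_pe_to_s_db(tol_percent, N=64, k0=5, seed=0):
    """PE/S (dB) of a direct CZT whose filter weights carry random percentage errors."""
    rng = np.random.default_rng(seed)
    n = np.arange(N)
    # ideal pre-multiplied basis vector
    g = np.exp(2j * np.pi * k0 * n / N) * np.exp(-1j * np.pi * n**2 / N)
    m = np.arange(-(N - 1), N)
    hc, hs = np.cos(np.pi * m**2 / N), np.sin(np.pi * m**2 / N)

    def perturbed(w):
        # eq. 5.1: weight * (1 + delta*T/100), delta Gaussian and limited to -1..+1
        delta = np.clip(rng.normal(0.0, 1.0 / 3.0, w.size), -1.0, 1.0)
        return w * (1.0 + delta * tol_percent / 100.0)

    def conv(a, w):
        return np.convolve(a, w)[N - 1:2 * N - 1]

    # four real filters, each with its own independent set of resistor errors
    real = conv(g.real, perturbed(hc)) - conv(g.imag, perturbed(hs))
    imag = conv(g.real, perturbed(hs)) + conv(g.imag, perturbed(hc))
    mag = np.hypot(real, imag)
    return 20 * np.log10(np.delete(mag, k0).max() / mag[k0])
```

Since the same random draws are reused for every tolerance, the error term scales linearly with T, so a fivefold increase in tolerance raises the PE/S by close to 14dB in this model.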
5.2.4 Charge Transfer Efficiency 
In practical CCD registers, an amount (1-ε) of the signal
packet is transferred, and a fraction ε is left behind. The
CZT computer simulation models this mechanism directly in
the time domain. Three results are of interest: (a) the
variation of output PE/S ratio with ε, (b) the PE/S
dependence on input frequency for a fixed ε and (c) the
variation of PE/S with N, the number of transform points.









































































Fig.5.10 shows the progressive degradation in the CZT
output as the charge transfer inefficiency, ε, is increased
from 0.0001 to 0.1. It can be seen that the effect of ε is
to reduce the frequency resolution. The graph in Fig.5.11
gives the relationship between PE/S and ε
(when the curve crosses the horizontal axis, i.e. PE/S=0dB,
the error component becomes larger than the signal
component). For the particular CZT parameters in Fig.5.11,
a transfer inefficiency of better than 0.0001 is necessary
Fig.5.11 The Effect of Charge Transfer Inefficiency











Fig.5.12 Dependence of PE/S on Input Frequency
The amplitude spectrum shown in Fig.5.12 is the result
when the CZT is input simultaneously with seven equal
With increasing input frequency, the pre-multiplied input
signal moves further along the CCD register before the
appropriate convolution term is produced. The higher
frequencies are therefore affected more by ε than the lower
frequencies. In the example shown here (N=64 and E. =OOOl), 
the PE/S ratio is 9dB worse for the higher frequencies.
When the number of transform points is increased, the
effect of ε becomes greater because each output point
depends on more serial transfers. Fig.5.13 shows the
variation of PE/S with N for two different values of ε.
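The mechanism by which charge is left behind at each transfer can be modelled in the time domain with a very small state machine. The following Python/NumPy sketch is a simplified illustration, not the original simulation: it lumps the transfers of a three-phase stage into a single transfer per stage with inefficiency `eps`, and the function name `ccd_tapped_filter` is hypothetical.

```python
import numpy as np

def ccd_tapped_filter(x, weights, eps):
    """Tapped CCD delay line in which a fraction eps of each packet is left behind."""
    weights = np.asarray(weights, dtype=float)
    stages = np.zeros(len(weights))
    out = np.empty(len(x))
    for t, sample in enumerate(x):
        moved = (1.0 - eps) * stages       # charge that advances one stage
        stages = eps * stages              # charge left behind in each stage
        stages[1:] += moved[:-1]           # moved[-1] leaves the end of the register
        stages[0] += sample                # new input packet
        out[t] = weights @ stages          # non-destructive weighted tap sum
    return out
```

With `eps = 0` this reduces to an ideal transversal filter. With `eps > 0` a packet reaching stage i on time has amplitude (1-eps)^i and the remainder arrives late, which is why outputs that depend on the far end of the register (the higher input frequencies) are degraded most.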














Fig.5.13 Relationship between ε and N
5.2.5 Analogue Multiplier Accuracy 
The post-convolver modulus circuitry is normally
implemented using analogue transconductance multipliers.
Simulation results for random multiplier errors of between 0
and 10% are illustrated in Fig.5.14. As expected, these
errors produce results similar to those obtained by the tap
weight accuracy simulation and a PE/S ratio of less than
-50dB demands 1% multiplier accuracy.
5.2.6 Phase Shifter Errors 
Strictly speaking, the 90-degree phase shifter (section
5.3.5) is not part of the CZT processor. Nevertheless, it
is necessary to investigate the accuracy required by this
peripheral unit in order to suppress image frequencies to
any desired level. In this simulation, the ideal CZT was
supplied with real and imaginary inputs of the form

    x_R = A \cos(2\pi k n / N) \qquad \ldots (5.2)
    x_I = A \sin(2\pi k n / N + c) \qquad \ldots (5.3)

where c is the phase shift error and is in the range 0 to
5°.
Fig.5.14 Post Squarer Accuracy (horizontal axis: % accuracy)
The output for a 2° error is shown in Fig.5.15 where
the image frequency is clearly visible. The relationship
between PE/S and phase shift error, c, is plotted in
Fig.5.16 which shows that the phase shifter has to be
accurate to about 1° for an image suppression of -40dB.
Fig.5.15 Image Frequency Suppression
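The figure of about 1° for -40dB can be cross-checked. Expanding equations 5.2 and 5.3 into positive and negative frequency components gives a wanted component of amplitude A cos(c/2) and an image of amplitude A sin(c/2), so the image to signal ratio is tan(c/2). The Python/NumPy sketch below is an illustration only; it uses an FFT in place of the ideal CZT, and the function name and the choice k = 5 are assumptions.

```python
import numpy as np

def image_suppression_db(phase_error_deg, N=64, k=5):
    """Image-to-signal ratio (dB) for the inputs of eqs. 5.2 and 5.3 with A = 1."""
    c = np.deg2rad(phase_error_deg)
    n = np.arange(N)
    x = np.cos(2 * np.pi * k * n / N) + 1j * np.sin(2 * np.pi * k * n / N + c)
    X = np.abs(np.fft.fft(x))            # the ideal CZT output equals the DFT
    return 20 * np.log10(X[N - k] / X[k])  # image appears at bin N-k

for deg in (1.0, 2.0, 5.0):
    closed_form = 20 * np.log10(np.tan(np.deg2rad(deg) / 2))
    print(deg, "deg:", round(image_suppression_db(deg), 2), "dB",
          "(tan(c/2):", round(closed_form, 2), "dB)")
```

For a 1° error the ratio evaluates to roughly -41dB, which is in line with the simulation result quoted above.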
5.2.7 Summary of Simulation Results 
The simulations discussed so far consider the effect of
each error source independently; in the practical situation,
all sources combine to give an increased PE/S ratio.
However, to compare the relative significance of each
source, it is worthwhile summarising these results (Table
5.1).
To predict the accuracy of a practical CZT processor,
the estimated error tolerances for typical components (see
Fig.5.16 90-degree Phase Difference Accuracy (horizontal axis: accuracy in degrees)
Accuracy to achieve 




Pre-multiplier Quantisation 8 8 
Tap Weight Tolerance 1% 2% 
Change Transfer Inefficiency 0.0001 OOOOl 
Post Squarer Accuracy 1% 5% 
Phase Shifter Error b 
0 1 0 
Table 5.1 Error Analysis Summary for a 64-point direct CZT
Table 5.1) were used in the CZT simulation. For a basis
vector input the PE/S ratio is -42.7dB. The amplitude














Fig.5.17 Simulation Output for Square Wave Input (practical errors)
The peak spurious error is at 42dB and the largest harmonic
amplitude error is 1.3dB in the 13th harmonic.
5.3 IMPLEMENTATION 
This section describes in detail the CZT hardware and
is split into four subsections dealing with
pre-multiplication, convolution, post-multiplication and
timing. In addition, two CZT peripherals specifically
designed for speech processing are discussed: (a) a
90-degree phase difference network and (b) a low-pass
filter.
5.3.1 The Pre-multiplier 
Two different pre-multiplication techniques exist. The 
first employs four-quadrant analogue transconductance 
multipliers, the pre-multiplying chirps being generated 
either actively or by impulsing CCD chirp transversal 
filters. The second uses MDACs and the pre-multiplying 
sequences are stored in ROM. Although the difference in 
speed between these two approaches is insignificant, the 
second technique has been adopted here because digital 
processing offers increased stability and flexibility over 
its analogue equivalent. 
Fig.5.18 illustrates the complex pre-multiplication
schematic. The circuit employs four low cost, monolithic
MDACs allowing up to 10 bit accuracy (only 8 bits are used).
These devices are fabricated using a combination of
Complementary MOS (CMOS) and thin film technologies to give
a power dissipation of only 20mW and a current settling time
of 500ns. Because the reference (signal) input has bipolar
capability, four-quadrant multiplication is achieved by
providing offset binary at the digital input. Linearity
measurements on both the digital and analogue inputs showed
the non-linearity to be 0.25% of full scale output (better
than -52dB). Normally, on-chip feedback resistors are used
in conjunction with external amplifiers to define accurately
the gain of each MDAC. However, to minimise the total
number of amplifiers required in the quad configuration, the
MDAC current summers are combined, thereby necessitating
three variable gain controls to equalise the circuit.
Fig.5.18 Hardware Schematic for Complex Pre-multiplication
In the direct CZT algorithm, the convolvers have to be
serially loaded with N pre-multiplied data points followed
by N-1 zeroes. This blanking operation may be conveniently
incorporated by extending the pre-multiplying sequence to
2N-1 samples, the last N-1 being zeroes. A further
simplification in hardware timing results if, instead of N-1
zeroes, N zeroes are loaded. The timing for a 32-point
direct CZT then becomes identical to that for a 64-point
sliding CZT. In the direct case, the extra input zero
simply means that one extra output point has to be
discarded.
The pre-multiplying sequences are stored in four 32x8
bit bipolar ROMs which are used in pairs to form two 64x8
bit offset binary chirp sequences. The ROM addresses are
supplied by a 6 bit synchronous binary counter (Fig.5.19).
Three different sets of ROMs were programmed to provide
alternative CZT configurations. The sequences are defined
by
1. 32-Point Direct CZT (rectangular window)

    C1_n = \cos(\pi n^2 / 32), \quad S1_n = \sin(\pi n^2 / 32), \quad n = 0, 1, \ldots, 31
    C1_n = 0, \quad S1_n = 0, \quad n = 32, 33, \ldots, 63 \qquad \ldots (5.4)

2. 32-Point Direct CZT (Hamming window)

    C2_n = (0.54 - 0.46 \cos(\pi n / 32)) \, C1_n
    S2_n = (0.54 - 0.46 \cos(\pi n / 32)) \, S1_n, \quad n = 0, 1, \ldots, 63 \qquad \ldots (5.5)

3. 64-Point Sliding CZT

    C3_n = \cos(\pi n^2 / 64), \quad n = 0, 1, \ldots, 63
    S3_n = \sin(\pi n^2 / 64), \quad n = 0, 1, \ldots, 63 \qquad \ldots (5.6)

The pre-multiplier operation for each of the above
cases is shown in Fig.5.20. Here the real input is dc and
the imaginary input is grounded.
Fig.5.19 Pre-multiplier Sequence Storage
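The ROM contents for the rectangular-window direct CZT and the sliding CZT can be generated as sketched below. This Python/NumPy fragment is illustrative only: the offset binary scaling (code = round(127 v) + 128 for 8 bits) is an assumed convention, and the function names are hypothetical. The Hamming-weighted set is omitted from the sketch.

```python
import numpy as np

def offset_binary(v, bits=8):
    """Map v in [-1, 1] to an offset binary code (assumed scaling: zero at mid-scale)."""
    half = 2 ** (bits - 1)
    return (np.round(v * (half - 1)) + half).astype(int)

def rom_direct_32():
    """64-word cos/sin sequences for the 32-point direct CZT: 32 chirp values, 32 zeroes."""
    n = np.arange(64)
    active = n < 32
    c = np.where(active, np.cos(np.pi * n**2 / 32), 0.0)
    s = np.where(active, np.sin(np.pi * n**2 / 32), 0.0)
    return offset_binary(c), offset_binary(s)

def rom_sliding_64():
    """64-word cos/sin sequences for the 64-point sliding CZT."""
    n = np.arange(64)
    return (offset_binary(np.cos(np.pi * n**2 / 64)),
            offset_binary(np.sin(np.pi * n**2 / 64)))
```

Padding the direct sequence with 32 zeroes, rather than 31, is what makes the 64-word address cycle of the direct case identical to that of the sliding case.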
5.3.2 The Convolver 
In-house 32-tap CCD delay lines [56] were available for
prototype construction. These devices were fabricated using
an 'n' channel, aluminium gate process and were designed for
three phase operation. The CCD registers have 64 serial
stages, tapped every alternate stage [63] to make room for
the peripheral floating-gate-reset circuitry (Fig.3.5d).
The CZT convolver design (Fig.5.21) employs discrete
resistors and two CCDs in series to implement two 64 point
transversal filters. Since the convolvers in both the real
and imaginary channels have the same inputs, only one set of
CCDs and two sets of resistor weights are required for each
of the channels. The resistor values are calculated from
the appropriate impulse responses and normalised to reduce
loading effects. The CCDs are operated with a diode cut-off
input technique similar to that described in section 3.2.1.
In this case, the input gate pulse is derived from the
timing logic (section 5.3.4) and is arranged to sample the
input signal during φ. The sample is then temporarily
stored under an extra floating gate before being transferred
to the φ1 potential well. In the circuit of Fig.5.21,
external diodes are paralleled with the CCD input diffusions
to protect against the accidental application of negative
voltages which could damage the devices.
For optimum charge transfer and charge handling
capacity, clock amplitudes in the region of 30V are
required. The clocking waveforms are produced by transistor
buffers described in section 5.3.4. The floating gate
structures (Fig.3.5d) are reset to the Vgg potential during
gS (section 
The currents in each transversal filter busbar are
summed into separate virtual earth amplifiers before being
differenced. Fast slew rate amplifiers are necessary here
in order to cope with the CCD clock breakthrough. The
positive and negative busbars cannot be differenced in one
operation, thereby eliminating the CCD breakthrough (and
hence the slew rate problem), because the parallel impedance
of each set of weights is different. Sample and hold
circuits subsequently remove the CCD clock breakthrough and
provide a stable waveform for post-processing.
Finally, two circuits of Fig.5.21 are combined to
provide the full complex convolver in Fig.5.22. Offset
controls are provided on the output of each convolver
channel so that dc pedestals may be removed before the
modulus circuit (section 5.3.3).
The photographs in Fig.5.23 show the outputs from the
four convolvers when the processor is configured for a
64-point sliding CZT. These waveforms may be compared to
the computer simulation in Fig.5.5 and the mathematical
analysis in Appendix B.
5.3.3 Post Circuitry 
Since only power spectra are required, the post
convolver processing is reduced to a modulus function
(Fig.4.10) i.e.

    y = \sqrt{R^2 + I^2} \qquad \ldots (5.7)

where y is the CZT output, R is the real convolver output
and I is the imaginary convolver output. The obvious
circuit solution is to employ analogue transconductance
multipliers configured as squarers. In the schematic shown
in Fig.5.24, two AD533 bipolar multipliers, which have a
full power bandwidth of 750kHz, and one summing amplifier
perform the square and add operation. A square-rooter has
been added as an optional extra. The main disadvantage of
this implementation is the limited linear dynamic range.
Typically, analogue multipliers have a 60dB output dynamic
range which implies 30dB at their inputs when the
multipliers are used as squarers. This figure does not
allow the full potential of the CZT to be exploited.
Fig.5.22 Fully Complex Convolver
An alternative approach to the direct implementation of
the modulus function is to use a linear approximation. It
has been shown [80] that the approximation

    y = |R| + \alpha |I| \quad \text{if } |R| > |I|
    y = \alpha |R| + |I| \quad \text{if } |R| < |I| \qquad \ldots (5.8)
gives an answer to within 	005dB of the 	exact value 	when 
04=0409 Such an 	approximation is generally acceptable. 
The advantage is that this approximation does not require
analogue multipliers and the dynamic range is well in excess
of 40dB.
Fig.5.24 Analogue Multiplier - Modulus Function
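The behaviour of a linear modulus approximation of the form of equation 5.8 can be examined by sweeping the phase of a unit-amplitude input. The Python/NumPy sketch below is illustrative only; the weighting `alpha = 0.4` is an assumed value chosen for the example and the function name is hypothetical.

```python
import numpy as np

def modulus_approx(R, I, alpha=0.4):
    """Larger magnitude plus alpha times the smaller one (the form of eq. 5.8)."""
    a, b = np.abs(R), np.abs(I)
    return np.maximum(a, b) + alpha * np.minimum(a, b)

# sweep the input phase of a unit vector and compare with the exact modulus
theta = np.linspace(0.0, 2.0 * np.pi, 3601)
R, I = np.cos(theta), np.sin(theta)
error_db = 20 * np.log10(modulus_approx(R, I) / np.hypot(R, I))
print("error range: %.2f dB to %+.2f dB" % (error_db.min(), error_db.max()))
```

Taking the maximum and minimum of |R| and |I| is equivalent to the comparator and analogue switch arrangement, since the comparator only decides which of the two rectified inputs receives the weighting.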
Fig.5.25 gives the circuit diagram. The inputs R and I
are full-wave rectified before being combined in the
appropriate ratios. A comparator makes the decision as to
whether |R| is greater than |I| and the result is used to
generate complementary control signals for two analogue
switches. A small amount of hysteresis is applied in the
decision algorithm to prevent random switching by noise. A
third analogue switch is included in the feedback loop of
the summing amplifier to equalise the gain defining
resistors. The speed of this circuit is limited by the
full-wave rectifier to about 200kHz.
5.3.4 Timing
The timing circuitry (Fig.5.26) is designed to accept a
master clock and generate various sub-clocks, as well as the
appropriate CCD driving waveforms.
The main timing information is derived from two
ring-of-three counters connected in series. This allows
each phase of the 3-phase CCD clock to be divided into three
segments. The CCD clock waveforms, φ1, φ2 and φ3 are












Fig.5.26 CZT Timing Logic
open collector buffers. To improve the CCD charge transfer
efficiency, the drivers are designed to give the CCD clocks
a fast turn-on and a slow turn-off. A pull-up resistor is
included in each driver so that the clocks do not go more
negative than 2V, which ensures that there is always a thin
depletion region at the surface of the CCD (the CCD
substrate is at 0V).
The middle segment of one CCD clock phase is selected to
provide a gate pulse for the CCD diode cut-off input
technique. The TTL signal is shaped by a circuit similar to
the CCD clock driver to give a 15V pulse with a slow trailing
edge. In this case, there is no pull-up resistor.
Because the CCDs are tapped every alternate stage,
sample and hold pulses are required only after every second
CCD transfer. A JK flip-flop connected as a toggle is used
to divide the clock by two and, after appropriate gating,
the sample and hold pulse is available (see timing diagram
in Fig.5.27). The sample and hold pulse is chosen in the
middle of a φ3 cycle to allow the signal time to settle.
Again, because of the CCD's alternate tapping, the
clock to the pre-multiplier ROM address counter is at half
the CCD transfer rate. The φ3/2 signal is appropriately
timed for this purpose. The CCD is therefore operated in a









Fig.5.27 CZT Timing Diagram
Ref. [63] 
A frame sync. output waveform is derived from the ROM
address counter (Fig.5.19) and a reset input is provided to
synchronise the CZT frame. Overall, the master clock input
rate is 18 times the effective CZT sample rate.
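The relationship between the master clock and the transform parameters is simple arithmetic, sketched below. The factor of 18 comes from the text; the 64-point transform length is the configuration reported in the hardware performance section, and the function names and example clock rates are illustrative assumptions.

```python
MASTER_CLOCK_RATIO = 18  # master clock cycles per effective CZT sample (from the text)

def effective_sample_rate(master_clock_hz):
    """Effective CZT sample rate for a given master clock."""
    return master_clock_hz / MASTER_CLOCK_RATIO

def frequency_resolution(master_clock_hz, n_points=64):
    """Spacing of the output frequency cells for an n_points transform."""
    return effective_sample_rate(master_clock_hz) / n_points

for f_master in (9.9e3, 57.6e3, 1.98e6):  # example master clock rates in Hz
    fs = effective_sample_rate(f_master)
    print(f"master {f_master:>9.0f} Hz -> sample rate {fs:.0f} Hz, "
          f"resolution {frequency_resolution(f_master):.1f} Hz")
```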
5.3.5 90-Degree Phase Difference Network 
In most CZT applications, only real input data are
available; real data on their own make inefficient use of
the convolvers since the resulting spectrum always contains
an image. However, by generating quadrature data (i.e. real
and imaginary parts) from the real input it is possible to
utilise the full processing bandwidth. Quadrature inputs
may be generated by (a) filtering or (b) modulating the data
onto an IF carrier and demodulating in quadrature [92].
Only the filtering method is considered here.
The generation of an output signal with exactly 90°
phase shift relative to the input signal is an extremely
difficult operation. It is much easier to produce two new
outputs with 90° phase shift relative to each other,
i.e. for an input signal of the form

$$x(t) = A\cos(\omega t + \theta) \tag{5.9}$$

it is relatively straightforward to generate

$$y_1(t) = A\cos(\omega t + \theta + \phi) \tag{5.10}$$

and

$$y_2(t) = A\sin(\omega t + \theta + \phi) \tag{5.11}$$

where φ is an arbitrary phase. This method of generating
complex data is valid in cases where only the amplitude
spectrum is required and where the absolute phase
information is unimportant. The 90° phase difference
technique is therefore suitable for speech processing.
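The effect of a quadrature input pair can be illustrated with a small numerical experiment. This is a sketch under stated assumptions, not a model of the CCD hardware: a direct DFT stands in for the CZT, the 64-point length and the 19-cycle tone are arbitrary choices, and all names are hypothetical.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Magnitudes of a direct (slow) DFT of a complex sequence."""
    n = len(samples)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * m / n)
                    for m, x in enumerate(samples))) / n
            for k in range(n)]

def tone_spectrum(cycles, phase, quadrature, n=64):
    """Spectrum of a tone applied as real-only data or as a quadrature pair."""
    data = []
    for m in range(n):
        arg = 2.0 * math.pi * cycles * m / n + phase
        imag = math.sin(arg) if quadrature else 0.0
        data.append(complex(math.cos(arg), imag))
    return dft_magnitudes(data)

real_only = tone_spectrum(cycles=19, phase=0.3, quadrature=False)
quad_a = tone_spectrum(cycles=19, phase=0.3, quadrature=True)
quad_b = tone_spectrum(cycles=19, phase=1.7, quadrature=True)

print("real only : signal %.3f image %.3f" % (real_only[19], real_only[64 - 19]))
print("quadrature: signal %.3f image %.3f" % (quad_a[19], quad_a[64 - 19]))
print("change of arbitrary phase alters magnitude by %.2e"
      % abs(quad_a[19] - quad_b[19]))
```

With real-only data the energy is split between the signal cell and its image; with the quadrature pair the image vanishes, and the arbitrary common phase has no effect on the amplitude spectrum.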
The synthesis and analysis of a 90° phase difference
network designed for speech processing is detailed in
Appendix C. This particular network was specified to
operate over the band of frequencies from 50Hz to 3200Hz
with an absolute phase difference error of 1°. The complete
circuit, including input drivers and output buffers, was
constructed on a printed circuit board measuring 114mm x 75mm.
Measurement of the practical phase difference function
was accomplished by an analogue phase meter and the results
are plotted in Fig.5.28. The peak phase difference error is
1.7°, which exceeds the design tolerance of 1°. This is due
to a linear phase difference error produced by phase
mismatch in the drive circuitry.
5.3.6 Low-pass Filter 
For correct operation with analogue input signals, the 
sampled-data CZT processor demands an input anti-aliasing 










































Fig.5.28 Theoretical and Experimental Phase Difference Ripple
low-pass filter. To make efficient use of the CZT's
processing bandwidth, this filter must cut off very close to
the Nyquist limit and roll off very steeply (consistent with
the step response) so that aliased components are attenuated
sufficiently.
Such a filter has been designed and constructed for use
in speech processing with the following prerequisite
characteristics:
1. cut-off frequency: 3000Hz
2. attenuation: >20dB at 3200Hz
3. in-band ripple: <1dB
4. phase characteristic: unimportant
A 10-pole Tschebycheff transfer function with in-band ripple
was selected to give the best compromise between the roll-off
and the step response and was implemented by cascading five
buffered two-pole Rauch sections [93]. Fig.5.29 shows the
circuit diagram for a single Rauch section and the component
values for each of the five sections are given in Table 5.2.
The measured frequency response is illustrated in
Fig.5.30a and compared with the theoretical response in
Fig.5.31. It can be seen that the higher frequency poles
are not exactly matched, giving a 2dB ripple at the




Fig.5.29 Two Pole Rauch Low Pass Filter

Section   C1        C2        R1=R2=R3
1         0.47µF    160pF     6k2
2         0.1µF     345pF     10k
3         0.047µF   620pF     13k
4         0.01µF    470pF     51k
5         4700pF    1200pF    96k
All resistors in ohms
Table 5.2 Component Values for 10-pole Tschebycheff Low-pass Filter
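As a plausibility check on Table 5.2, the natural frequency of each section can be estimated. This sketch assumes the standard relation for a multiple-feedback (Rauch) low-pass section with three equal resistors, f0 = 1/(2πR√(C1·C2)). That relation, the equal-resistor reading of the table and the reading of the first C1 entry as 0.47µF are assumptions, not statements taken from the thesis.

```python
import math

# (C1 in farads, C2 in farads, R in ohms) for sections 1 to 5 of Table 5.2
SECTIONS = [
    (0.47e-6, 160e-12, 6.2e3),
    (0.1e-6, 345e-12, 10e3),
    (0.047e-6, 620e-12, 13e3),
    (0.01e-6, 470e-12, 51e3),
    (4700e-12, 1200e-12, 96e3),
]

def natural_frequency(c1, c2, r):
    """f0 of an equal-resistor multiple-feedback section (assumed formula)."""
    return 1.0 / (2.0 * math.pi * r * math.sqrt(c1 * c2))

for number, (c1, c2, r) in enumerate(SECTIONS, start=1):
    print(f"section {number}: f0 = {natural_frequency(c1, c2, r):.0f} Hz")
```

Under these assumptions every section resonates below the 3000Hz cut-off, with the first section closest to the band edge, which is the pattern expected of a Tschebycheff pole set.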











Fig.5.30 Practical Tschebycheff Filter Measurements: (a) frequency response, (b) step response
Fig.5.31 Tschebycheff LPF Theoretical Frequency Response (frequency axis 10Hz to 10kHz)
band-edge. This problem is inherent in filters having a
large number of poles. The filter step response is
reproduced in Fig.5.30b and has settled to 5% within 2.5ms.
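The stop-band requirement can be related to the filter order through the standard Tschebycheff magnitude formula. The calculation below is illustrative only: it describes an ideal filter, and the ripple values tried are hypothetical rather than the design value used for the hardware.

```python
import math

def chebyshev_attenuation_db(order, ripple_db, f_hz, cutoff_hz):
    """Attenuation of an ideal Tschebycheff low-pass filter at or above cut-off."""
    eps_sq = 10.0 ** (ripple_db / 10.0) - 1.0
    ratio = f_hz / cutoff_hz  # must be >= 1 for the cosh form used here
    t_n = math.cosh(order * math.acosh(ratio))
    return 10.0 * math.log10(1.0 + eps_sq * t_n * t_n)

for ripple in (0.1, 0.5, 1.0):  # hypothetical pass-band ripple values in dB
    a = chebyshev_attenuation_db(10, ripple, 3200.0, 3000.0)
    print(f"ripple {ripple} dB -> {a:.1f} dB at 3200 Hz")
```

The attenuation available only 200Hz above the cut-off rises with the permitted ripple, which illustrates why a high order and a non-trivial ripple are both needed to approach the 20dB target.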
5.3.7 Physical Construction 
The CZT circuitry in sections 5.3.1 to 5.3.4 is
constructed on two printed circuit boards measuring
250x115mm (see photograph in Fig.5.32). One of these boards
contains the complex convolver with the filter weighting
networks mounted on pluggable subassemblies. The other
board houses the timing logic and CCD drivers, the
pre-multipliers and both the post-modulus circuits. The
overall processor fits into a volume of 30x250x115mm.
5.4 HARDWARE PERFORMANCE
When the processor is configured for a 64-point
unweighted sliding CZT, the output in response to d.c.
inputs in both the real and imaginary channels is shown in
Fig.5.33. The master clock rate is 57.6kHz, giving an
effective processor clock of 3.2kHz and a resolution of
50Hz. The expanded photograph shows that the PE/S ratio is
-42dB. The magnitude output in Fig.5.34 is the response
when the real input is a 2V p/p 950Hz tone (integral number
Fig.5.32 Photograph of CZT Hardware
Fig.5.33 SCZT Output for a DC Input: (a) response to d.c., (b) expanded (x20)
Fig.5.34 Amplitude Spectrum of 950Hz Real Tone Input (frequency axis: -950Hz, d.c., +950Hz)
of cycles) on a 1V d.c. pedestal and the imaginary input is
zero. Because the input is real only, Fig.5.34 may be
interpreted in terms of positive and negative frequencies
with d.c. in the middle.
The addition of a 90-degree phase difference network
cancels the image frequencies and effectively doubles the
processing bandwidth. The phase difference network outputs
(Fig.5.35a) are input to the real and imaginary CZT inputs
to give the output shown in Figs.5.35b and 5.35c. In these
oscillograms, the frame has been rotated by N/2 points to
make the d.c. response appear at the left-hand side. The
peak error is due to transfer inefficiency and the PE/S
ratio is approximately -40dB. It can be seen that the image
frequency, which should appear at the arrow in Fig.5.35c, has
been well suppressed by the phase difference network. As
explained in section 4.2.1, the peak and nulls of the output
(sin x)/x response move off the sampling grid for a
non-basis vector input. This is demonstrated in Fig.5.36
where the input frequency is 1175Hz and the output falls
exactly between two adjacent resolution cells.
Dynamic range and linearity measurements are plotted in
Fig.5.37. The linearity of the output is limited to 30dB by
the output transconductance multipliers and noise restricts
the overall dynamic range to 48dB. It is thought that the
Fig.5.35 SCZT Operation with Phase Difference Network: (a) phase difference network output, (b) SCZT output (d.c. to 1050Hz), (c) expanded
Fig.5.36 Output for Non-Basis Vector Input
 























Fig.5.37 Linearity and Dynamic Range Measurements 
noise level could be improved by separating the digital and
analogue sections of the circuitry on the printed circuit
board.
To test the processor's performance with regard to
charge transfer efficiency, the input signal was swept from
10Hz to 3kHz in 5 seconds and a time exposure of the output
was developed (Fig.5.38). It can be seen that the input
low-pass filter characteristic (Fig.5.30a) is superimposed
upon a general attenuation trend in the spectral output.
The high frequency components are attenuated by 3.5dB more
than the low frequency components.
The master clock rate can be varied from below 10kHz to
almost 2MHz, providing effective clock rates of between 550Hz
and 110kHz. At the lower clock rate, the resolution is
8.6Hz and the effect of dark current significantly distorts
the output. The upper clock frequency gives a resolution of
1718Hz and is limited by the slew rate of the transversal
filter summing amplifiers (N5531) and also by the sample and
hold amplifiers (HA2425). A major hardware defect is the
variation of transversal filter d.c. output with clock
frequency; changes in d.c. offset cause the modulus
circuitry to malfunction. This point is discussed more
fully later.
Fig.5.38 Response to Chirp Input (time exposure; frequency axis d.c. to 3kHz)
Fig.5.39 Amplitude Spectrum




An example of the processor analysing a square wave is
demonstrated in Fig.5.39a. Here the fundamental is 200Hz
and the odd harmonics are spaced at 400Hz (the processor's
resolution is 50Hz). The photograph in Fig.5.39b displays
many sequential frames of real and imaginary convolver
outputs, neatly illustrating the function of the modulus
circuit. Because the input signal is free running, the
phase of the input signal, θ, relative to the CZT frame is
changing. The fundamental convolver outputs are modulated
by cos θ and sin θ, the third harmonics by cos 3θ and sin 3θ,
etc. The modulus function performs the operation

$$\cos^2(\theta) + \sin^2(\theta) = 1 \tag{5.12}$$

on each of the components so that the amplitude spectrum is
independent of input phase. When there is a d.c. offset
added to either the real or imaginary channel outputs, the
cos θ, sin θ etc. terms do not completely cancel and a
frame-to-frame ripple is present in the output. This is a
serious effect because the d.c. offsets change with both
clock frequency and CCD temperature. A practical solution is
to employ chopper stabilisation in a feedback loop to the CCD
input. However, this implies major hardware modification.
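The sensitivity of the modulus output to a d.c. offset can be illustrated numerically. The sketch below is not a model of the convolver hardware: it takes a single spectral component of unit amplitude, rotates the input phase through a full cycle and measures the resulting ripple. The 5% offset is an arbitrary assumption chosen for illustration.

```python
import math

def modulus_over_phase(amplitude, offset_real, offset_imag, steps=360):
    """Modulus of (A cos t + dR, A sin t + dI) as the input phase t rotates."""
    values = []
    for k in range(steps):
        theta = 2.0 * math.pi * k / steps
        real = amplitude * math.cos(theta) + offset_real
        imag = amplitude * math.sin(theta) + offset_imag
        values.append(math.hypot(real, imag))
    return values

def ripple_db(values):
    """Peak-to-peak frame-to-frame ripple in dB."""
    return 20.0 * math.log10(max(values) / min(values))

print("no offset : %.3f dB" % ripple_db(modulus_over_phase(1.0, 0.0, 0.0)))
print("5%% offset : %.3f dB" % ripple_db(modulus_over_phase(1.0, 0.05, 0.0)))
```

Without an offset the modulus is constant for every phase, whereas even a small offset in one channel makes the output depend on the phase of the free-running input.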
The processor's power dissipation was measured and
found to be almost 12W. This figure could be reduced
significantly by the use of CMOS circuitry and MOS
amplifiers.
A final demonstration of the CZT's operation is in the
calculation of the cepstrum (section 2.2) for a triangle
waveform. Here the processor is a weighted 64-point sliding
CZT and the modulus circuitry is the linear approximation.
The sawtooth waveform in Fig.5.40a is processed to provide
the amplitude spectrum which is subsequently logged and
stored on a tape recorder. (Note the decrease in resolution
caused by the weighting function.) The log spectrum is
replayed through the CZT to provide the cepstrum in
Fig.5.40e. The first peak is the fundamental quefrency at
2.5ms and the smaller peaks are the rahmonics.
In this section, only the 64-point SCZT configuration
has been demonstrated. It is considered that the operation
of the direct CZT has been covered sufficiently in previous
sections.




Fig.5.40 Demonstration of Cepstral Processing (panels include the log spectrum and the cepstrum; note: trace inverted and delayed w.r.t. other signals)
CHAPTER 6
THE ON-LINE COMPUTER SIMULATION
OF A CCD CHANNEL VOCODER
Armed with both the vocoder design philosophies and the 
signal processing capabilities of a new technology, it is 
possible to postulate new vocoder implementations which may 
provide engineering benefits. In the design of any system 
as complex as a vocoder, it is generally a wise precaution 
to simulate fully the effect of system variables before the 
commitment of hardware. This is especially true in low bit 
rate speech processing because there are many sources of 
distortion and ignoring any of these can be dangerous. 
Section 6.1 describes the basic computing facilities 
which were used by the author in this simulation. The 
computer models for a novel CCD implementation of the now 
established channel vocoder, together with the simulation 
detail and conclusions, are summarised in sections 6.2 and 
6.3 for the analyser and synthesiser respectively. 
6.1 COMPUTING FACILITIES 
In the computer simulation of speech processing
systems, several special requirements must be considered.
Any digital speech facility must have access to both A to D
and D to A conversion with associated buffer store or Direct
Memory Access (DMA) and be capable of handling a large
amount of data. For example, a one second segment of speech
signal (the minimum which is useful) sampled at 10kHz and
quantised to 12-bit accuracy requires 15,000 bytes of
storage. In addition, it is necessary to have suitable
output facilities for the convenient display of this large
amount of data, either in the time domain or in the
frequency domain. Although processing in real-time is not
vital, hardware additions such as a floating-point processor
or an FFT processor combine to make the overall system more
efficient and less time-consuming. In the course of the
simulations described in this chapter, two alternative
computing facilities were used and each will be described
briefly.
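The storage figure quoted above follows directly from the sample count and word length. A minimal sketch of the arithmetic, with an illustrative function name; the second call assumes that each 12-bit sample occupies a full 16-bit word.

```python
def storage_bytes(duration_s, sample_rate_hz, bits_per_sample):
    """Bytes needed to hold the samples at the stated bits per sample."""
    total_bits = duration_s * sample_rate_hz * bits_per_sample
    return total_bits / 8

print(storage_bytes(1, 10_000, 12))  # tightly packed 12-bit samples
print(storage_bytes(1, 10_000, 16))  # one 16-bit word per sample (assumption)
```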
The first was based upon a PDP11/70 mainframe with 128k
words (16 bits) in main memory. Real arithmetic was handled
by a hardwired floating point processor. Four disc units,
three cartridge and one fixed, provided the fast store where
all user programmes and an RSX-11M operating system resided.
A floppy disc drive catered for individual user programme
backup and a magnetic tape unit for longer term and system
management backup. The multi-access system communicated
with up to four users at any one time through three VDUs and
one teletypewriter. Hardcopy output could be obtained from
either a lineprinter or a graph plotter controlled by a dual
12-bit D to A converter (x and y channels). Analogue data
were input to the computer via a 12-bit A to D converter
under the control of an external clock (variable from d.c. to
100kHz) using DMA. The software included a Fortran IV
compiler, a machine language assembler and the usual DEC
utility programmes, as well as an extensive user written
library.
The second computing facility, more readily available
to the author, consisted of a large time-shared DEC-10
system. DMA was not permitted and the only access was via a
standard serial terminal input port. This restriction
necessitated the development of a specialised microprocessor
based buffer to control the input and output of speech data.
The design of this "intelligent terminal" is detailed in
the references and a block diagram illustrating the main
component parts is given in Fig.6.1. Software in the Z80
microprocessor controls different modes of operation and
these may be selected by either the DEC-10 computer or the
4014 graphics terminal. These modes of operation are
designed to allow:
1. direct communication between the 4014 terminal and
the DEC-10 computer
2. speech input through an 8-bit A to D converter into
a 12k byte buffer and subsequent serial
transmission to file storage on the DEC-10
3. speech output, by first gathering data from the
DEC-10 into a buffer, and then recirculating these











[Fig.6.1: block diagram of the Z80-based intelligent terminal, showing the Zilog Z80, data and address buses, disc store, 9.6k baud serial interface, parallel interface, 4014 terminal, 8-bit A to D input and 8-bit D to A output]
Modes of Operation (controlled by Z80):
(a) Feedthrough (Full Duplex) [4014 <-> DEC 10]
(b) Analog I/P [A/D -> RAM/FLOPPY]
(c) Transmit (Half Duplex) [RAM -> DEC 10]
(d) Analog O/P [RAM -> D/A]
(e) Receive (Half Duplex) [DEC 10 -> RAM/FLOPPY]
data through an 8-bit D to A converter. 
Although this system is not as flexible as the first 
facility described, the processing power as applied in 
speech vocoder simulation is similar. 
6.2 THE CHANNEL ANALYSER
The channel vocoder architecture (Fig.2.6) is ideally
suited for CCD implementation. The parallel filter bank,
which is the main processing block, may be replaced directly
by an equivalent Fourier transform processor. Also, because
of the increased processing power and flexibility made
available by the use of such a processor, it is possible to
replace the conventional time domain pitch detector by a
technique which promises superior performance, the cepstral
pitch detector (section 2.2).
A computer model for the analyser simulation is shown
in Fig.6.2. The input signal is pre-emphasised and Fourier
transformed in frames to provide sequential short-time
representations of the speech amplitude spectrum. At this
stage, the processing divides into two paths. In the upper
path, each short-time spectrum is smoothed and compressed to
achieve the required data reduction, whereas in the lower








Fig.6.2 Channel Analyser Simulation (blocks shown include speech input, pre-emphasis, weighting function, DFT, spectrum and data compression)
algorithm then extracts the appropriate excitation source
information and transfers these data to the output for
transmission. The main difference between the analogue
filter bank implementation and Fig.6.2 is that the
processing is performed in serial rather than in parallel.
6.2.1 Speech Input 
Speech was input to the computer via a microphone, an
anti-aliasing filter, an optional pre-equalisation filter
and a 12-bit A to D converter, under the control of a
supervisory software programme. This programme accessed an
area set aside in main memory consisting of 12k integer
words (1 integer word = 2 bytes). The extra four bits in
each word were used as control flags for the A to D
converter. To conserve accuracy in later processing, the
integer values were 'floated' to real values before being
stored on disc. The speech sampling rate was set at 8kHz
from an external clock and it was arranged to input 10240
samples. This allowed 1.28 seconds of speech to be
recorded, sufficient for a short sentence.
Three different microphones were used to take samples
of varying quality. These were:
1. military handset with 300-3000Hz band-pass filter
(telephone quality)
2. cheap cassette microphone with 4000Hz low-pass
filter
3. standard condenser microphone with 4000Hz low-pass
filter.
Segments of time domain speech produced by these three
combinations (without pre-equalisation) are shown in
Fig.6.3. (Each segment is part of the same phrase spoken by
the same speaker.) The signal in (a) clearly demonstrates
the high frequency emphasis placed on the speech by the
telephone handset. The lack of bass frequencies makes time
domain pitch period detection extremely difficult. In
Fig.6.3 Comparing Microphone Characteristics: (a) military handset, (b) cassette microphone, (c) condenser microphone
waveforms (b) and (c), the band-pass filter has been
replaced by a low-pass filter which allows the pitch
fundamental to pass unattenuated (125Hz in this example)
and the increased pitch period definition in these
waveforms is obvious. The condenser microphone used in this
experiment had a flat frequency response (within 1dB) from
20Hz to approximately 40kHz.
In addition, three different male speakers and one
female provided samples with low, middle and high pitch.
The input sentences recorded were:
1. "I know when my lawyer is due"
2. "We were away a year ago"
3. "Every salt breeze comes from the sea"
4. "I was stunned by the beauty of the view".
Sentences (1) and (2) are all voiced (except for the stop
gaps) whereas (3) and (4) contain both voiced and unvoiced
speech. The above sentences were used by Rabiner
et al. [22] to investigate several pitch detection
algorithms.
Conventionally, pre-emphasis of 6dB/octave is applied
to the speech to compensate for the general spectral trend
[25]. This can be easily implemented in one of two ways:
(a) by time domain differencing of the speech data according
to the relationship

$$y_k = x_k - \mu x_{k-1} \tag{6.1}$$

where {y_k} is the pre-emphasised speech, {x_k} is the input
speech and μ ≈ 1 for 6dB/octave lift or (b) by filtering the
speech before it is digitised. To minimise the computer
processing time, method (b) was selected and the filter
section shown in Fig.6.4 was added before the A to D
converter. The component values used give a 6dB/octave
boost from 1kHz to 10kHz.
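Method (a), which was not the one adopted in the simulation, amounts to a first difference of the sample sequence as in equation (6.1). A minimal sketch follows, assuming the coefficient is passed as a parameter named mu and that the signal is zero before the first sample; the test samples are made up.

```python
def pre_emphasise(samples, mu=1.0):
    """First-difference pre-emphasis: y[k] = x[k] - mu * x[k-1]."""
    output = []
    previous = 0.0  # assume silence before the first sample
    for x in samples:
        output.append(x - mu * previous)
        previous = x
    return output

speech = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0]  # made-up test samples
print(pre_emphasise(speech))
```

A constant (d.c.) input is removed after the first sample, while rapid sample-to-sample changes pass through, which is the high frequency lift described above.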
Fig.6.4 Pre-emphasis Filter
Finally, each segment stored on the computer was 
normalised to its peak value to ease scaling problems in 
later stages of the processing. 
6.2.2 Spectrum and Cepstrum Computation 
The main questions to be resolved are: 
1. what resolution is required in the spectrum?
2. what resolution is required in the cepstrum?
3. are weighting functions necessary and, if so, what
type should be used?
4. can a sliding transform be utilised to give a
potential saving in hardware complexity?
In the channel analyser configuration shown in Fig.6.2,
both the speech spectrum and cepstrum are used. For an
efficient hardware structure, the DFT processors employed in
each calculation should, if at all possible, have identical
characteristics. However, it is important to examine the
needs of each calculation independently and then, if a
suitable compromise can be reached, merge the two.
It has been found in the channel vocoder (section 2.3)
that filter bandwidths of between 100 and 300Hz provide
sufficient resolution for representation of the short-time
spectral envelope of speech [2]. Some vocoders have
linearly spaced constant bandwidth filters whereas others
have logarithmically spaced filters (Table 2.1). In order
to provide a good approximation to the vocal tract transfer
function, each rectified filter output is averaged (low-pass
filtered) over a period of between 20 and 30ms. Normally,
if the averaging period is greater than 30ms, the spectral
output will not reflect fast changes in spectral content
and, if the period is less than 20ms, too few pitch periods
will be included in the average (see chapter 2).
There are two alternative strategies for the
implementation of a suitable approximation to the channel
vocoder filter bank using a sampled-data DFT processor. If
the input speech is sampled at 8kHz, the useful signal
bandwidth is less than 4kHz, and the application of a
40-point DFT processor transforms this real signal into a
4kHz amplitude spectrum with linear resolution of 200Hz.
However, this spectrum is obtained from a 5ms segment of the
speech waveform and it is therefore necessary to average
four successive spectral frames to achieve a result which
represents a 20ms segment. The alternative solution is to
employ a 160-point processor integrating over 20ms of speech
and then reduce the resolution by grouping and averaging
spectral coefficients. This solution is preferred because
the higher resolution (50Hz) spectral coefficients may then
be grouped to approximate not only a linear filter bank but
also a logarithmically spaced filter bank. The obvious
disadvantage is increased transform length.
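The two strategies can be compared with a few lines of arithmetic, together with a sketch of the grouping step used by the second. The function names are illustrative, and the uniform four-bin grouping is only one possible choice; a logarithmic filter bank would use unequal group sizes.

```python
def dft_parameters(n_points, sample_rate_hz):
    """Return (frequency resolution in Hz, segment duration in ms)."""
    resolution_hz = sample_rate_hz / n_points
    duration_ms = 1000.0 * n_points / sample_rate_hz
    return resolution_hz, duration_ms

def group_bins(spectrum, bins_per_channel):
    """Average adjacent spectral coefficients into wider channels."""
    return [sum(spectrum[i:i + bins_per_channel]) / bins_per_channel
            for i in range(0, len(spectrum) - bins_per_channel + 1,
                           bins_per_channel)]

print(dft_parameters(40, 8000))    # short transform, needs frame averaging
print(dft_parameters(160, 8000))   # long transform, needs bin grouping
print(group_bins([1.0, 3.0, 5.0, 7.0, 2.0, 2.0, 2.0, 2.0], 4))
```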
The DFT characteristics for the cepstral computation
depend on the desired pitch detector resolution and range.
In speech, the maximum range of pitch period likely to be
encountered is from 20ms (50Hz) to 2ms (500Hz) [11]. This
is rather a wide range and most pitch detectors operate on
reduced limits e.g. 14.3ms (70Hz) to 2.5ms (400Hz).
Typically, a 6-bit word is used to represent the
logarithmically coded pitch data i.e. 64 resolution cells.
The resolution at the short period (high frequency) end of
the range is in the region of 0.1ms and, for the longest
periods (low frequencies), is about 1ms [94].
For a cepstral processor to detect pitch periods of up
to 20ms, a 40ms segment of speech must be analysed. Since
the cepstrum has linear period resolution, the maximum
resolution (0.1ms) has to be provided; a logarithmic scale
may then be approximated by grouping together the high
resolution 'bins'. These requirements imply the use of a
400-point DFT in the cepstral computation.
The discussion so far has ignored the effect of
(sin x)/x "leakage" in the DFT (section 4.2.1), which limits
the inherent amplitude resolution to -13dB. In applications
where non-linear operations (e.g. modulus) are performed in
the frequency domain, it is common practice to employ a
weighting function (section 4.2.3) to trade frequency
resolution for amplitude resolution. Since non-linear
operations are involved in both of the above computations, a
weighting function is necessary.
Practical experience in analogue DFT processors
(chapter 5) has shown that -40dB is a realistic Peak Error
to Signal ratio (PE/S); it would therefore be wasteful in
terms of frequency resolution to employ a weighting function
giving a much greater amplitude resolution. In addition, a
resolution of 40dB is sufficient for the human ear. (Note
that amplitude resolution is not the same as dynamic range.)
The best weighting function for this application is
therefore the Hamming window (Equ.4.12) which provides a
theoretical amplitude resolution of 43dB and decreases the
3dB frequency resolution by a factor of 1.3. As explained
in section 4.2.3, the use of a weighting function leads to
loss of data at the window edges and overlapping techniques
are necessary.
Bearing in mind the decreased frequency resolution
imposed by the weighting function, it is necessary to
compromise the DFT characteristics for both the spectrum and
cepstrum computations. Consider a 256-point DFT operating
on a 32ms segment of Hamming weighted speech (sampled at
8kHz). The nominal frequency resolution is 31.25Hz which is
decreased to 40.6Hz by the weighting function. This is
certainly sufficient for spectral envelope representation
(section 2.3). Pitch periods in the range 0 to 16ms may be
detected from the cepstrum with a resolution of 0.125ms.
Although the full pitch range (2 to 20ms) is not covered, the
above DFT characteristics permit a very useful analysis of
the speech waveform.
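The figures quoted for the 256-point compromise can be reproduced as follows. This is a sketch only: the Hamming weights use the common 0.54 - 0.46 cos form, which is assumed rather than copied from the thesis, and the factor of 1.3 is the main-lobe widening quoted in the text.

```python
import math

def hamming(n_points):
    """Hamming weights 0.54 - 0.46 cos(2 pi k / (N - 1)) (assumed form)."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * k / (n_points - 1))
            for k in range(n_points)]

N, FS = 256, 8000.0
BROADENING = 1.3  # 3dB resolution loss quoted in the text

segment_ms = 1000.0 * N / FS           # length of speech analysed
nominal_hz = FS / N                    # unweighted frequency resolution
quefrency_step_ms = 1000.0 / FS        # cepstral (period) resolution
max_period_ms = segment_ms / 2.0       # longest detectable pitch period

print(segment_ms, nominal_hz, nominal_hz * BROADENING)
print(quefrency_step_ms, max_period_ms)
print(min(hamming(N)), max(hamming(N)))
```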
In the United Kingdom, a standard frame rate for
updating the synthesiser control parameters is once every
20ms [27]. To facilitate independent testing of the
analyser and synthesiser, this frame rate is also chosen
here. The DFT input speech therefore consists of 32ms
segments each overlapped by 12ms, a percentage overlap of
37.5%. The frame to frame correlation for different
overlaps is given in Ref. [77].
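The framing described above can be sketched as follows. The 10240-sample record length is the one quoted for the speech input; the helper name is illustrative.

```python
def frame_starts(n_samples, frame_len, hop):
    """Start indices of all complete overlapping frames."""
    return list(range(0, n_samples - frame_len + 1, hop))

FS = 8000                      # sampling rate in Hz
FRAME = 32 * FS // 1000        # 32ms segment -> 256 samples
HOP = 20 * FS // 1000          # 20ms frame rate -> 160 samples
overlap_percent = 100.0 * (FRAME - HOP) / FRAME

print(FRAME, HOP, overlap_percent)
print(frame_starts(10240, FRAME, HOP)[:4])
```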
Fig.6.5 illustrates an example of the spectrum and
cepstrum of a voiced segment (part of a vowel). The
unweighted speech in (a) has a pitch period of 7.2ms and the
corresponding spectrum in (b) has a line spacing of 139Hz.
[Fig.6.5: (a) Input Speech (voiced), time axis 0 to 32ms; (c) Linear Cepstrum]
















The cepstrum in (c) has a large peak at a quefrency of 7.2ms.
To demonstrate the overall performance of the cepstral
processor, Fig.6.6 shows a 3-dimensional cepstrogram
(amplitude-quefrency-time) of the phrase "I was stunned".
Each plane in the time axis is separated by 20ms. The
fundamental peak is clearly visible during the voiced
segments and, at the end, the rapid pitch
inflexion is tracked.
In order to maximise the efficiency of the proposed
hardware, initial simulations were performed using the
sliding CZT (section 4.4). For the sliding transform to
operate correctly, the input data have to appear stationary
from frame to frame. The simulation showed that the sliding
transform distorted and smeared the speech spectra. Fig.6.7
illustrates several frames of log power spectra obtained via
the sliding CZT. (Note that Fig.6.7 was generated by an
unweighted 128-point sliding CZT from speech sampled at
6.4kHz. This was part of an earlier simulation.) Each
spectral frame should be symmetrical about 3.2kHz but, as
can be seen, several frames are slightly skewed and
distorted. Comparison with a direct transform of the same
segment shows that there is a definite smearing effect due
to the operation of the sliding CZT. Rapid changes in the
speech waveform, e.g. from silence to speech, create the



Fig.6.6 Three-dimensional Cepstrogram
Fig.6.7 Sliding Transform of a Segment of Speech (panels: speech waveform, log power spectrum)
a one frame delay. This is because the sliding transform 
convolvers have to be loaded with the signal before the 
transform is calculated. The correct results are obtained 
only during long vowels, implying that, in general, speech 
cannot be considered stationary for a 20ms frame rate. It 
is difficult to place any quantitative figure on the 
performance of the sliding algorithm in this application, 
but it is clear that the errors will cause severe distortion 
in synthetic speech. For this reason and others concerning 
hardware implementation (chapter 7) a direct transform is 
used in the following simulation work. 
(It was originally intended to incorporate CZT hardware 
errors in the channel vocoder simulation. However, the time 
domain simulation of the CZT required approximately 20 
minutes to process one second of speech. To conserve 
processing time and cost, an FFT algorithm (60 times as 
fast) was used in place of the CZT. The results are
identical for the ideal case.)
6.2.3 Data Reduction and Quantisation 
The desired end product from this stage of the analyser 
processing is a digit stream which represents a quantised 
version of the speech spectral envelope. For an output data 
rate of 2400bps, each 20ms frame of spectral data has to be
compressed into about 40 bits of information (2400bps = 48
bits per 20ms frame). The remaining bits in each frame are
used for excitation source data.
The first step is to obtain a smoothed version of the
speech spectrum (i.e. remove the spectral lines). Initial
solutions to this problem made use of a low-pass filter in
the frequency domain. The speech spectrum resulting from
the DFT processor is in effect a time series representing
frequency and the filter is designed to remove the faster
varying line components. The filter used in the simulation
was a 51-tap FIR optimal filter designed using the Remez
algorithm [67]. The number of taps was chosen to be odd so
that the group delay was an integral number of clock periods
and the cut-off frequency was selected to be 0.1 f. A
typical output from this filter is shown in Fig.6.8b (the
appropriate group delay has been equalised). The formant
structure of the vocal tract is now much clearer. An
equivalent filtering operation could have been implemented
using an accumulate and dump type of algorithm.
The spectral envelope in Fig.6.8b is grossly
oversampled and it was chosen to down sample each spectral
envelope to give 16 analogue frequency samples. These
samples are equivalent to the output from a contiguous
filter bank with 16 linearly spaced, equal bandwidth
filters. Since the human ear is logarithmically sensitive








Fig.6.8 Spectral Smoothing (Low-pass Filter), with the 1st, 2nd and 3rd formants marked
to amplitude, the spectrum can be further compressed. The
first 9 frequency points in each frame (low frequencies)
were quantised to 3 bits each at 5dB per step and the
remaining seven points were quantised to 2 bits each at 10dB
per step, allowing for a dynamic range of 40dB.
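
The bit allocation described above can be illustrated with a short sketch. It is not taken from the simulation itself: the function names, the convention that code 0 is the loudest level, and the clamping of out-of-range levels are all assumptions made for illustration. Levels are expressed in dB relative to the frame maximum.

```python
def quantise_frame(levels_db):
    """Quantise 16 spectral levels given in dB relative to the frame maximum.

    Assumed convention (not stated in the text): code 0 is 0dB and each
    higher code is one step further down; values out of range are clamped.
    """
    codes = []
    for index, level in enumerate(levels_db):
        if index < 9:
            bits, step = 3, 5.0      # low frequency points: 3 bits, 5dB per step
        else:
            bits, step = 2, 10.0     # remaining points: 2 bits, 10dB per step
        max_code = (1 << bits) - 1
        code = int(round(-level / step))
        code = min(max(code, 0), max_code)
        codes.append((code, bits))
    return codes


def bits_per_frame(codes):
    """Total number of bits used by one quantised frame."""
    return sum(bits for _, bits in codes)
```

With nine 3-bit points and seven 2-bit points the spectral data occupy 41 bits, which is consistent with the "about 40 bits" budget quoted earlier for a 2400bps channel.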
Subsequent synthesis from these data using techniques 
similar to those described in section 6.3 proved that the
speech was of rather poor quality. Because the analyser and 
synthesiser were both new implementations, it was extremely 
difficult to pinpoint where the distortions originated; the 
bit stream between the analyser and synthesiser could not be 
compared against any standard. The poor quality appeared to 
be due to several factors, the most serious of which was the 
lack of frequency resolution at the low frequency end and 
the limited dynamic range. For these reasons, it was 
decided to adopt a logarithmic frequency compression method 
to allow direct comparison with an established hardware 
vocoder.
The channel vocoder available for experimental use,
called the 'Marvox', was designed by the Joint Speech
Research Unit, Cheltenham. This vocoder has logarithmically
spaced 2nd order Butterworth filters as detailed in Table
2.1. Although the filter 'type' is not critical, the
individual filters should be flat-topped and roll off gently
at the band edges [25]. The incorporation of this filter
bank into the simulation may be accomplished in two ways:
1. time domain windowing before the DFT (one for each
   filter characteristic)
2. frequency domain convolution (one for each filter
   characteristic)
The first technique makes extremely inefficient use of 
processing power and is not a practical solution. The 
second does however have some attraction because each filter 
characteristic may be represented (to a first approximation)
by relatively few weighting points.
The weighting values were obtained by evaluating each
filter characteristic at frequencies corresponding to the
DFT picket fence (multiples of 31.25Hz) and then quantising
these values to the nearest dB. Only the most significant
20dB of each characteristic was considered. The equivalent
filter outputs, Fk, are produced by summing the







where R is the DFT resolution (31.25Hz), Am is the spectral
amplitude at the mth resolution bin and Uk are the
logarithmic weighting coefficients as defined in Table 6.1.
k    i    j    Weighting coefficients (dB) for bins i to j
1    4   15    18 10 3 0 0 1 4 .0 11 14 16 18
2    7   19    20 14 8 2 0 0 1 5 9 12 15 18 20
3   11   22    17 12 6 1 0 0 2 6 10 13 16 19
4   15   26    16 11 5 1 0 0 2 7 11 14 17 20
5   18   29    19 15 10 4 0 0 0 3 7 12 15 18
6   22   33    18 14 9 3 0 0 0 4 8 12 16 19
7   26   40    18 14 10 6 2 0 0 0 2 5 9 12 15 17 19
8   30   45    20 17 14 9 5 1 0 0 0 2 6 9 13 15 18 20
9   35   49    19 16 13 9 4 1 0 0 0 3 6 10 13 16 18
10  40   54    18 15 12 8 3 0 0 0 1 4 7 11 14 17 19
11  45   59    18 15 11 7 2 0 0 0 1 4 8 12 15 17 19
12  49   68    19 16 14 11 8 5 2 0 0 0 0 1 3 6 9 11 14 16 18 19
13  55   74    19 17 15 12 9 6 3 1 0 0 0 1 2 5 8 11 13 15 17 19
14  62   81    18 16 13 10 7 4 1 0 0 0 0 1 4 7 10 12 14 16 18 20
15  68   87    19 17 14 11 8 5 2 0 0 0 0 1 3 6 9 11 14 16 17 19
16  77   96    13 11 9 7 5 3 1 0 0 0 0 0 0 1 2 4 6 8 10 11
17  87  106    12 10 8 6 4 2 1 0 0 0 0 0 1 2 3 5 7 9 11 12
18  96  115    13 11 9 7 5 3 1 0 0 0 0 0 0 1 2 4 6 8 10 11
19 109  128    8 6 5 4 3 2 1 0 0 0 0 0 0 0 0 0 0 1 2 2
Table 6.1 Analyser Filter Coefficients 
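
A minimal sketch of how one row of Table 6.1 might be applied to a DFT amplitude spectrum is given below. It rests on assumptions that the surviving text does not confirm: the tabulated values are treated as attenuations in dB applied to spectral amplitudes (rather than powers), the bin index m is used directly as a list index, and the function name is hypothetical.

```python
def channel_output(amplitudes, first_bin, weights_db):
    """Weighted sum of spectral amplitudes for one analyser channel.

    amplitudes: DFT amplitude spectrum indexed by resolution bin
    first_bin:  column i of Table 6.1
    weights_db: the row of weighting coefficients, read here as attenuations in dB
    """
    total = 0.0
    for offset, w_db in enumerate(weights_db):
        gain = 10.0 ** (-w_db / 20.0)   # dB attenuation converted to a linear gain
        total += gain * amplitudes[first_bin + offset]
    return total


# Row k = 3 of Table 6.1 (bins 11 to 22)
ROW3_FIRST_BIN = 11
ROW3_WEIGHTS_DB = [17, 12, 6, 1, 0, 0, 2, 6, 10, 13, 16, 19]
```

Under this reading, a single spectral line at bin 15 or 16 passes through channel 3 unattenuated, while a line at bin 11 is reduced by 17dB.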
The amplitude transfer characteristic of the
conventional filter bank is compared in Fig.6.9 with that
produced by the above technique. It can be seen that the
approximation matches closely within the first 20dB and then
each filter falls off to a sidelobe level of -36dB. An
individual filter (fcentre = 1150Hz) is shown separately in
Fig.6.9c for clarity.
For the analyser computer simulation to be matched to
the conventional Marvox synthesiser, the frame coding
details (Fig.6.10) for each must be identical. The nineteen
filter outputs are logarithmically quantised and compressed
into 39 bits. The lowest frequency channel is coded as 3
bit direct PCM with 6dB/step (48dB dynamic range) and the
remaining eighteen channels are each represented by two bit
delta modulation. The delta modulation scheme permits more
efficient coding by utilising the correlation between
spectral channels. The 39 bits of spectral data are
combined with two engineering bits for testing, 6 bits of
pitch data (in reverse order) and one Voiced/Unvoiced (V/UV)
bit to make a total of 48 bits per 20ms frame. Immediately
before transmission, the final five bits in each frame are
inverted for synchronisation in the receiver.
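
The frame assembly can be sketched as follows, using the code values given in the notes to Fig.6.10. Several details are assumptions made for illustration rather than a description of the Marvox hardware: the function names, the bit ordering within the frame (channel data first, then engineering bits, pitch and V/UV), and the rule that each delta code is chosen to bring the running reconstruction closest to the target level.

```python
PCM_STEP_DB = 6
DELTA_DB = {0b11: 9, 0b10: 3, 0b01: -3, 0b00: -9}


def encode_spectrum(levels_db):
    """Code 19 channel levels (dB, 0 = maximum) into 39 bits."""
    bits = []
    # Channel 1: 3 bit PCM, 111 = 0dB down to 000 = -42dB in 6dB steps
    pcm = min(max(7 + round(levels_db[0] / PCM_STEP_DB), 0), 7)
    bits += [(pcm >> shift) & 1 for shift in (2, 1, 0)]
    recon = (pcm - 7) * PCM_STEP_DB
    # Channels 2 to 19: 2 bit delta modulation relative to the previous channel
    for level in levels_db[1:]:
        code = min(DELTA_DB, key=lambda c: abs(recon + DELTA_DB[c] - level))
        bits += [(code >> 1) & 1, code & 1]
        recon += DELTA_DB[code]
    return bits


def build_frame(levels_db, pitch_code, voiced, eng_bits=(0, 0)):
    """Assemble one 48 bit frame (assumed layout, see lead-in)."""
    bits = encode_spectrum(levels_db)                          # bits 1 to 39
    bits += list(eng_bits)                                     # bits 40 and 41
    bits += [(pitch_code >> shift) & 1 for shift in range(6)]  # bits 42 to 47, LSB first
    bits.append(1 if voiced else 0)                            # bit 48
    # The final five bits (44 to 48) are inverted for synchronisation
    return bits[:43] + [1 - b for b in bits[43:]]
```

The total of 3 + 18 x 2 + 2 + 6 + 1 = 48 bits matches the 2400bps rate at one frame every 20ms.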
Fig.6.9 (a) Marvox Butterworth Filter Bank (b) Equivalent Simulation Filter Bank (c) Response of Channel 8 Filter (Simulation); horizontal axes: Frequency (kHz)
Frame layout: bits 1 to 3, channel 1 (3 bit PCM); bits 4 to 39, channels 2 to 19 (2 bit delta modulation each); bits 40 and 41, engineering bits; bits 42 to 47, pitch; bit 48, V/UV
NOTES
3 bit PCM code: 111 = 0dB, 000 = -42dB, i.e. 6dB steps covering 48dB dynamic range
2 bit delta mod code: 11 = +9dB, 10 = +3dB, 01 = -3dB, 00 = -9dB (MSB represents ±6dB, LSB represents ±3dB)
Pitch code in reverse order, i.e. LSB first
V/UV = 1 for voiced, V/UV = 0 for unvoiced
Bits 44 to 48 are inverted prior to transmission for sync. purposes
M = Most Significant Bit, L = Least Significant Bit
Fig.6.10 Channel Vocoder Frame Format
6.2.4 Pitch Extraction
The pitch extractor makes use of a peak picker to scan 
the cepstra and detect the peaks which are most likely to 
have been caused by the speech pitch periodicity. After a 
peak has been located, its position within the frame is 
converted into a digital word which is passed to the output 
stages for transmission. If a peak with the correct 
characteristics cannot be found, the peak picker assumes 
that the speech is unvoiced and a binary zero is 
transmitted. 
One of the first cepstral peak picking algorithms was
developed by Noll [21] and, although the algorithm works
well, it is too involved for efficient hardware
implementation. Two alternative algorithms have therefore
been developed for the peak picker. The first (referred to
as algorithm A) is extremely simple but does not eliminate
spurious effects whereas the second (referred to as
algorithm B) is more sophisticated and can consequently
operate with a higher fidelity.
Both algorithms employ an identical peak detection
stage and its operation is illustrated by the flow diagram
in Fig.6.11. The result produced is the magnitude and
position of the largest and second largest peaks within a
Fig.6.11 Flow Diagram of the Peak Detection Stage (recoverable operations: COUNT = COUNT + 1; on a new largest peak, PEAK2 = PEAK1, PK2COUNT = PK1COUNT, PEAK1 = CEPSDATA(COUNT), PK1COUNT = COUNT; on a new second largest peak, PEAK2 = CEPSDATA(COUNT), PK2COUNT = COUNT)
restricted portion of the cepstrum. The lower quefrencies
are disregarded because they are outside the expected range
of pitch periods and in this example the lower limit is
chosen to be 20 x 0.125 = 2.5ms (400Hz). In addition, the
second N/2 points in each frame are rejected, being a
reflection of the first N/2 points.
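
The peak detection stage can be sketched as a single pass over the permitted part of the cepstrum. The variable names follow those that survive from the flow diagram (PEAK1, PK1COUNT, PEAK2, PK2COUNT); the function name, the comparison tests and the initial values are assumptions. As in the flow diagram, the routine tracks the two largest values rather than testing for true local maxima.

```python
def find_two_peaks(cepstrum, lower=20):
    """Return (peak1, pk1count, peak2, pk2count) for bins lower .. N/2 - 1.

    lower = 20 corresponds to 20 x 0.125ms = 2.5ms (400Hz).
    """
    upper = len(cepstrum) // 2          # the second N/2 points mirror the first
    peak1 = peak2 = float("-inf")
    pk1count = pk2count = 0
    for count in range(lower, upper):
        value = cepstrum[count]
        if value > peak1:
            # New largest value: the old largest becomes the second largest
            peak2, pk2count = peak1, pk1count
            peak1, pk1count = value, count
        elif value > peak2:
            peak2, pk2count = value, count
    return peak1, pk1count, peak2, pk2count
```

For a 256-point cepstrum this scans bins 20 to 127, so values at very low quefrencies and in the reflected half never influence the result.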
After the two major peaks have been located, various 
tests are performed to ascertain whether or not the largest 
peak indicates a meaningful pitch period. In the simplified 
algorithm (algorithm A), Fig.6.12, the magnitude difference
between the two peaks is calculated and if this exceeds
12dB, the speech is classed as voiced and the position of
the largest peak is output. If this test is unsuccessful,
the position of the second peak is compared with that of the
first to determine whether or not the pitch peak has fallen
between two resolution cells of the DFT processor. All
peaks which fail both of the above tests are classed as 
unvoiced. 
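
A sketch of the algorithm A decision is given below. It assumes that the two peak magnitudes are already expressed in dB and that "fallen between two resolution cells" means the second peak lies in a cell adjacent to the first; neither detail is spelled out in the text, and the function name is hypothetical.

```python
def classify_frame_a(peak1_db, pk1count, peak2_db, pk2count, margin_db=12.0):
    """Return the pitch position in resolution cells, or 0 for unvoiced."""
    if peak1_db - peak2_db > margin_db:
        return pk1count      # one dominant peak: voiced
    if abs(pk2count - pk1count) == 1:
        return pk1count      # pitch peak split across two adjacent cells: voiced
    return 0                 # both tests failed: unvoiced
```

A return value of 0 corresponds to the binary zero that is transmitted for unvoiced speech.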
As it stands, algorithm A cannot distinguish spurious 
peaks. A decision of this type can be made only if 
information concerning the history of the pitch contour has 
been stored. The human pitch varies slowly and smoothly and 
it is therefore possible to predict the next pitch period 
based on past experience. Information concerning the 
cepstrum which immediately follows the present frame is also 
Fig.6.12 Simple Peak-picker Decision Algorithm
helpful. (Note that 'looking into the future' implies a one
frame delay in the system.) For example, if the cepstra on
either side of the present frame indicate unvoiced speech,
it is most unlikely that the present frame is voiced.
The following modifications to the simple algorithm
help to eliminate spurious pitch estimates. During voiced
speech the modified algorithm rejects pitch periods which do
not appear within 10 resolution cells (1.25ms) of the
previous pitch measurement. If there is a large peak in the
cepstrum outside the above limits, it is assumed that the
speech is voiced and the previous pitch estimate is
repeated. Unfortunately, this modification leads to an
undesirable effect. On a change from voiced to unvoiced
speech, the peak picker occasionally latches on to the
previous voiced estimate because of spurious peaks in the
unvoiced speech. To compensate for this, an additional V/UV
indicator is incorporated, which compares the channel 19
filter output with that from channel 2. If the ratio
exceeds 5, the speech is classed as unvoiced and if the
ratio is less than 0.5, the speech is classed as voiced.
Should the energy ratio fall between these two limits, the
classification depends on measurements from the cepstrum.
The flow diagram for this more complex algorithm (algorithm
B) is given in Fig.6.13.
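
One possible reading of the algorithm B decision logic is sketched below. The order of the tests, the handling of a voiced frame with no pitch history, and the function and argument names are assumptions; the look-ahead to the following frame mentioned earlier is omitted for brevity.

```python
def classify_frame_b(energy_ratio, peak_found, pk1count, prevcount, window=10):
    """Return the pitch position in resolution cells, or 0 for unvoiced.

    energy_ratio: high frequency channel output divided by low frequency channel output
    peak_found:   True if the cepstral peak picker reports a usable peak
    prevcount:    pitch position from the previous frame (0 if it was unvoiced)
    """
    if energy_ratio > 5.0:
        voiced = False            # high frequency energy dominates
    elif energy_ratio < 0.5:
        voiced = True             # low frequency energy dominates
    else:
        voiced = peak_found       # in between: the cepstrum decides
    if not voiced:
        return 0
    in_window = peak_found and abs(pk1count - prevcount) <= window
    if prevcount != 0 and not in_window:
        return prevcount          # inconsistent with pitch history: repeat last estimate
    return pk1count if peak_found else 0
```

The energy test takes priority over the pitch history, which is what prevents the peak picker from latching on to the previous voiced estimate after a change to unvoiced speech.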
Fig.6.13 Sophisticated Pitch Detection Algorithm (recoverable elements: silence test on the time domain frame; Energy Ratio formed from the Channel(19) Output; Energy Ratio > 5 gives unvoiced, PK1CNT = 0; Peak Pick Cepstrum (Fig.6.11); PK1CNT = PREVCNT; PREVCNT = PK1CNT; Write(PK1CNT))
For male input speech, the pitch extractor operated
with very few gross errors. To assess the performance of
the cepstral pitch extractor, the output data were compared
with pitch data extracted manually from the time domain
speech waveform of "I was stunned by the beauty of the view".
Out of a total of 192 frames of speech data, 13 errors
occurred, some of which were more serious than others. By
far the most common error (8) was caused by rapid pitch
inflexions, either at the beginning or at the end of words.
When the pitch changes quickly in relation to the frame
period, the cepstral peak disappears and the speech is
classed as unvoiced. On two occasions, pitch inflexions
during a voiced segment resulted in a similar
classification. Apart from pitch contour smoothing, very
little can be incorporated to prevent this effect. Because
of the imposed dynamic range limit of 50dB in the
simulation, one frame of low-level voiced speech was classed
as unvoiced. It is possible that this may be improved by
agc. One other error resulted from the peak picking
process; the second rahmonic was selected in preference to
the fundamental (pitch doubling) at the end of a voiced
segment. This may have been a legitimate decision since
visual inspection of the time domain waveform showed that a
component at half the fundamental period was increasing in
strength. However, synthetic speech produced from these data
sounded rather unnatural. The remaining errors were due to
the energy comparator selecting unvoiced instead of voiced 
during a voiced segment. 
When the pitch extractor was supplied with female speech,
a large number of errors occurred. Most of these were due
to the high pitch frequencies produced by this speaker. On
several occasions, the pitch contour exceeded the upper
limit scanned by the peak picker (2.5ms) and, on others, the
pitch peak became distorted by the formant information.
These errors highlighted a serious defect in the cepstral
pitch extractor; for pitch frequencies of greater than
400Hz, the excitation source and the vocal tract are not
properly deconvolved, which makes pitch extraction very
difficult. The majority of the remaining errors were caused
by the energy comparator classifying voiced as unvoiced
because, overall, this speaker had a larger high frequency
content in the speech waveform. Readjustment of the V/UV
comparator limits is necessary to alleviate this problem.
The Marvox channel vocoder employs a 6-bit code for
representation of the logarithmically coded (lumped linear)
pitch data. The 6-bit code covers four octaves of frequency
ranging from 37.5Hz up to 600Hz with 16 levels per octave.
Since the cepstrum gives linear resolution, an encoder is
necessary to convert the cepstral pitch data to a
logarithmic scale. The encoder is designed to map the
cepstral resolution bins on to the nearest Marvox
quantisation level and the mapping function used in this
simulation is given in Table 6.2.
Code  Period (ms)     Code  Period (ms)     Code  Period (ms)     Code  Period (ms)
0     1.7 (600Hz)     16    3.3 (300Hz)     32    6.7 (150Hz)     48    13.3 (75Hz)
1     1.8             17    3.5             33    7.1             49    14.2
2     1.9             18    3.7             34    7.5             50    15.1
3     2.0             19    3.9             35    7.9             51    16.0
4     2.1             20    4.1             36    8.3             52    16.9
5     2.2             21    4.3             37    8.7             53    17.8
6     2.3             22    4.5             38    9.1             54    18.7
7     2.4             23    4.7             39    9.5             55    19.6
8     2.5             24    4.9             40    9.9             56    20.5
9     2.6             25    5.1             41    10.3            57    21.4
10    2.7             26    5.3             42    10.7            58    22.3
11    2.8             27    5.5             43    11.1            59    23.2
12    2.9             28    5.7             44    11.5            60    24.1
13    3.0             29    5.9             45    11.9            61    25.0
14    3.1             30    6.1             46    12.3            62    25.9
15    3.2             31    6.3             47    12.7            63    26.8
'Marvox' Pitch Detector Code

'Marvox'  Cepstrum        'Marvox'  Cepstrum
0         0 to 14         27        45
1         15              28        46,47
2         16              29        48
3         17              30        49,50
4         18              31        51 to 53
6         19              32        54 to 56
7         20              33        57 to 59
8         21              34        60 to 62
9         22              35        63 to 65
11        23              36        66 to 68
12        24              37        69 to 71
13        25              38        72 to 75
14        26              39        76 to 78
15        27              40        79 to 81
16        28              41        82 to 84
17        29              42        85 to 88
18        30,31           43        89 to 91
19        32,33           44        92 to 94
20        34              45        95 to 97
21        35,36           46        98 to 101
22        37,38           47        102 to 105
23        39              48        106 to 111
24        40,41           49        112 to 118
25        42              50        119 to 125
26        43,44           51        126 to 129
Period = cepstrum x 0.125ms
Cepstrum to 'Marvox' Encoder
Table 6.2 Line
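
The lumped linear pitch code and a nearest-level encoder can be sketched as follows. The period values are generated from the starting period and step size of each octave as listed in the first part of Table 6.2. The nearest-level rule and the function names are assumptions, and a rule of this kind is not guaranteed to reproduce every entry of the encoder table actually used in the simulation.

```python
def marvox_periods_us():
    """Nominal period in microseconds for each 6-bit code (values of Table 6.2)."""
    periods = []
    # (first period, step) for each of the four octaves, 16 levels per octave
    for start, step in ((1700, 100), (3300, 200), (6700, 400), (13300, 900)):
        periods += [start + step * level for level in range(16)]
    return periods


def encode_bin(cepstrum_bin, bin_us=125):
    """Map a cepstral resolution bin to the nearest code (ties go to the lower code)."""
    target = cepstrum_bin * bin_us
    periods = marvox_periods_us()
    return min(range(64), key=lambda code: abs(periods[code] - target))
```

Working in integer microseconds avoids rounding problems when a cepstral bin falls exactly on a tabulated period, for example bin 128 (16.0ms), which maps to code 51.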
6.2.5 Performance Comparison 
The channel analyser computer simulation was programmed 
to generate a serial data stream at 2400bps which could be 
input to a hardwired Marvox channel synthesiser. This 
facilitated a direct comparison between the parallel filter 
bank analyser and the computer simulation.
Fig.6.14 illustrates three narrowband (50Hz)
spectrograms comparing the pitch variations in (a) the
original speech, (b) the synthetic speech generated by the
computer simulation/Marvox synthesiser combination and (c)
the synthetic speech produced by the Marvox analyser and
synthesiser. (The sentence is divided into three separate
segments because of the restricted computing facilities.)
It can be seen that the cepstral pitch detector out-performs
the phase-locked loop in the Marvox analyser in several
aspects. In particular, the pitch tracking is extremely
good. For example, at the beginning of "beauty", the
cepstrum follows the pitch which is starting to increase
rapidly, whereas the phase-locked loop moves in the opposite
direction. At the end of "beauty", however, the cepstrum
classifies the speech as unvoiced. Apart from this mistake,
the V/UV decisions appear to be accurate, which is in
contrast to the Marvox analysis, where the pitch contour
almost breaks up during "I" and "by". (Note that the pitch
tracking is actually aided by the Marvox synthesiser because
"I was stunned by the beauty of the view"
Fig.6.14 Narrowband Speech Spectrograms: (a) Natural Speech, (b) Computer Simulation Analyser / Marvox Synthesiser, (c) Marvox Analyser and Synthesiser. Scales: vertical 500Hz/div, horizontal 100ms/div
all pitch data are smoothed by a 5Hz, one-pole low-pass
filter. This filter is switched out of circuit for an
unvoiced to voiced transition.)
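
The effect of this smoothing can be illustrated with a discrete-time sketch. The Marvox filter itself is not described further in the text, so the version below is only an assumed equivalent: a one-pole low-pass filter with a 5Hz cut-off running at the 50Hz frame rate, bypassed on the first voiced frame after an unvoiced one. The function name and the use of 0 for unvoiced frames are illustrative choices.

```python
import math


def smooth_pitch(pitch_hz, voiced, cutoff_hz=5.0, frame_rate_hz=50.0):
    """One-pole low-pass smoothing of a per-frame pitch track."""
    # Coefficient of a one-pole filter with the requested cut-off (assumed design)
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / frame_rate_hz)
    out = []
    state = 0.0
    was_voiced = False
    for value, is_voiced in zip(pitch_hz, voiced):
        if not is_voiced:
            out.append(0.0)
        elif not was_voiced:
            state = value                      # filter bypassed at an unvoiced to voiced transition
            out.append(state)
        else:
            state += alpha * (value - state)   # normal one-pole smoothing
            out.append(state)
        was_voiced = is_voiced
    return out
```

Bypassing the filter at the start of a voiced segment lets the pitch start at the measured value instead of gliding up from the unvoiced state.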
Wideband analyses (200Hz) for the same speech segments
are shown in Fig.6.15. These analyses permit the formant
movements and bandwidths to be examined. As expected, both
the computer simulation and the Marvox analyser produce
similar results (apart from the stronger high frequency
emphasis in Fig.6.15b). In comparison to the original
speech, the synthetic formants are less well defined. This
is due to the coarse spectral quantisation.
Informal listening tests indicated that the synthetic 
speech generated from the computer simulation was superior 
to that from the Marvox analysis. The main reason for this
was the more accurate pitch extraction. Overall, the 
synthetic speech from the computer analysis had a sharper, 
more natural sound than the Marvox, although the 
intelligibility was good for both analyses. A sufficient 
quantity of data has not as yet been processed to permit 
full intelligibility testing using phonetically balanced 
word lists. 








Fig.6.15 Wideband Speech Spectrograms (surviving panel title: (c) Marvox Analyser and Synthesiser)
6.2.6 Summary of Analyser Simulation Conclusions 
The following conclusions were drawn from this section:
(1) a sampling frequency of 8kHz permits sufficient
    analyser bandwidth
(2) pre-equalisation should consist of +6dB/octave
    boost from 1kHz
(3) the spectrum resolution should be at least 50Hz to
    allow logarithmic frequency compression
    (i.e. 160-point DFT processor)
(4) to cover the full pitch range of 2ms to 20ms, the
    cepstrum resolution should be at least 0.1ms
    (i.e. 400-point DFT processor) and be averaged over
    a period of 40ms
(5) a good compromise to (3) and (4) above may be
    achieved using a 256-point processor operating on a
    32ms segment of speech
(6) the sliding CZT is not useful for speech spectrum
    or cepstrum analysis
(7) weighting functions are essential for both the
    spectrum and cepstrum calculations (the Hamming
    window is suitable)
(8) to achieve a frame rate of 20ms, overlapping
    techniques are required in the DFT computation
(9) standard channel vocoder data compression
    techniques may be employed in this implementation
(10) for stand alone performance, the analyser should be
    designed to interface with conventional synthesiser
    designs
(11) the cepstral pitch detector enables high fidelity
    channel vocoder operation.
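
The figures quoted in conclusions (3) to (5) follow directly from the sampling frequency and the transform length. The short sketch below (function name hypothetical) makes the arithmetic explicit.

```python
def dft_parameters(sample_rate_hz, points):
    """Return (frequency resolution in Hz, segment length in ms, quefrency step in ms)."""
    resolution_hz = sample_rate_hz / points           # spacing of the DFT picket fence
    segment_ms = 1000.0 * points / sample_rate_hz     # duration of the analysed segment
    quefrency_ms = 1000.0 / sample_rate_hz            # spacing of cepstral resolution cells
    return resolution_hz, segment_ms, quefrency_ms
```

At 8kHz a 160-point transform gives the 50Hz resolution of conclusion (3), while the 256-point compromise of conclusion (5) gives 31.25Hz resolution over a 32ms segment with 0.125ms cepstral cells.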
6.3 CHANNEL SYNTHESISER SIMULATION 
The synthesiser has to convert the received frames of 
quantised spectral data and excitation source information 
into a continuous synthetic speech waveform. Conventional 
synthesisers achieve this by filtering an excitation signal, 
generated from the control parameters, through a bank of 
contiguous band-pass filters whose overall transfer 
characteristic has been made to look like the vocal tract. 
In the implementation simulated here, this filter bank has 
been replaced by an inverse DFT processor. 
The computer simulation block diagram is shown in 
Fig.6.16. 	The quantised spectral data received from the 
Fig.6.16 Synthesiser Simulation Block Diagram
demultiplexer are processed by operations, which are the
inverse of those performed in the analyser, to form an 
impulse response function in each frame. These impulse 
responses are an approximate reconstruction of the vocal 
tract impulse response at certain discrete instants of time, 
and to regenerate the speech waveform requires the 
excitation signal for a particular period to be convolved 
with the correct impulse response. 
6.3.1 Impulse Response Generation 
The synthesiser receives one 48-bit frame of data every
20ms which is composed of 39 bits for spectral data, 6 bits
for pitch, 1 bit for the V/UV control and two engineering
bits. To generate an appropriate impulse response, the 39
spectral bits have to be demodulated (using the first
channel 3-bit PCM as a reference), anti-logarithmically
converted, reformed into a smooth spectrum and finally
inverse transformed.
The inverse DFT (IDFT) does not demand as high a 
resolution as the forward DFT in the analyser because only 
the spectral envelope is to be transformed. However, to 
enable both the analyser and synthesiser to make use of the 
same central processor in a hardware implementation, a 
256-point IDFT is chosen. The IDFT requires real and 
imaginary inputs, {Rk} and {Ik}, of the form
$R_k = A_k \cos(\phi_k)$    (6.3)
$I_k = A_k \sin(\phi_k)$    (6.4)
where the sequence {Ak} is the amplitude spectrum and the
sequence {φk} is the phase spectrum. Since only the
amplitude spectrum is transmitted to the synthesiser, the
phase spectrum is arbitrary. If it is chosen that φk = 0 for
all k, the IDFT inputs reduce to Rk = Ak and Ik = 0, and the
resulting impulse response is termed zero phase. This zero
phase impulse response is real and even and does not possess
an imaginary part.
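
The zero phase property can be checked numerically with a direct (slow) inverse DFT. The sketch below uses a small transform for clarity. The helper names and the way the half spectrum is reflected to form an even N-point input are assumptions for illustration, not the layout used in the simulation.

```python
import cmath


def idft(real_in, imag_in):
    """Direct inverse DFT of the complex sequence real_in + j * imag_in."""
    n = len(real_in)
    out = []
    for t in range(n):
        acc = 0j
        for k in range(n):
            acc += complex(real_in[k], imag_in[k]) * cmath.exp(2j * cmath.pi * k * t / n)
        out.append(acc / n)
    return out


def zero_phase_response(half_spectrum):
    """half_spectrum holds A_0 .. A_(N/2); reflect it to N points and set the phase to zero."""
    amplitudes = half_spectrum + half_spectrum[-2:0:-1]
    return idft(amplitudes, [0.0] * len(amplitudes))
```

For a positive amplitude spectrum the result has a negligible imaginary part, is symmetric about t = 0 (modulo N) and has its largest value at t = 0.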
The 256-point amplitude spectrum (128 points
reflected) must be reconstructed from the nineteen
logarithmically spaced spectral components received by the
synthesiser. In order to retain the characteristics of the
conventional filter bank synthesiser, the reconstruction is
achieved by summing together weighted versions of each
filter amplitude transfer characteristic. The weighting
factors are given by the received channel amplitudes. In
the Marvox synthesiser, alternate filter outputs are summed
in anti-phase to prevent coherent summation giving large
spikes. The same can be achieved in this simulation by
alternating the sign of each weighting coefficient.
Fig.6.17 demonstrates the difference between summing
in-phase and summing in alternate anti-phase.
Unfortunately, the band-pass filters in the Marvox 
synthesiser are not identical to those in the analyser, 
implying that an alternative set of filter coefficients have 
to be stored. Although the centre frequencies are the same, 
the filters are single tuned and the bandwidths are much 
narrower. The narrower bandwidths give a resonant quality 
to the synthetic speech. Also, during unvoiced synthesis, 
an additional wide-band filter is substituted for the 
channel 19 filter to provide extra high frequency energy. 
The inverse transformation of a spectrum with 31.25Hz
nominal frequency resolution results in an impulse response 
which lasts for 32ms. However, an impulse response which 
Fig.6.17 Illustration of Impulse Responses Generated by Coherent Phase Summation and Alternate Anti-phase Summation (time axis marked at 8ms, 16ms, 24ms and 32ms)
lasts for only 20ms is necessary for the speech
reconstruction algorithm described in section 6.3.3. This
modification can be accomplished simply by truncating the
32ms (256 point) impulse response to 20ms (160 point). The
same effect could have been achieved by reducing the
frequency resolution to 50Hz. In addition, each impulse
response is rotated by N/2 points (equivalent to a phase
shift) to make the main peak appear in the middle of a
frame, thereby reducing discontinuities when the impulse
responses are convolved with the excitation source.
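
The truncation and rotation can be combined in one step, as in the sketch below. This is one interpretation of the text rather than the procedure actually coded in the simulation: the zero phase response is assumed to have its peak at index 0 with its earlier half wrapped round to the end of the array, and the function name is hypothetical.

```python
def centre_and_truncate(response, keep=160):
    """Centre the peak of a zero phase response and keep `keep` points around it."""
    n = len(response)
    half = keep // 2
    # The last `half` points are the part that wraps round before t = 0;
    # placing them first puts the peak in the middle of the frame.
    return response[n - half:] + response[:half]
```

Applied to a 256-point response, this returns a 160-point (20ms) frame with the main peak at index 80.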
6.3.2 Excitation Sources
The two main excitation sources used in the production
of human speech are (a) periodic impulses generated by the
vocal cords (voiced) and (b) random noise created by air
turbulence (unvoiced). The voiced excitation source in this
simulation consists of a fixed amplitude pulse train with
the pulse rate controlled by the pitch input data. The
pulse width is equal to the time resolution of the system
i.e. 20ms/160 = 0.125ms. Although the spectral decay in
this signal is not matched to the human source, an
appropriate compensation may be incorporated by a
post-equalisation stage.
The random noise source is simply a random sequence of
1's and 0's and is selected by the V/UV control input.
6.3.3 Reconstruction by Convolution 
Two alternative techniques are possible for the
reinsertion of the excitation source data into the speech.
The first is to multiply the smooth spectrum by the DFT of 
the excitation source before the inverse transformation is 
performed, and the second is to convolve the speech impulse 
responses with the excitation source in the time domain. 
At first glance, the former technique appears to be
simpler but, in fact, there are two major problems.
Firstly, the DFT of the excitation source has to be
generated. The obvious approach is to use an extra DFT
processor but this would not be a desirable solution. Since
the continuous transform of the voiced excitation source is
an impulse train with repetition rate 1/T, where T is the
pitch period, it seems likely that it would be relatively
easy to generate this signal without the use of a DFT
processor. However, the vocoder operates with finite and
not infinite transforms, which means that a (sin x)/x picket
fence would have to be multiplied into the impulse train.
This procedure would require complex hardware. The only
other way to implement this scheme would be to store the
excitation signal transforms in ROM and multiply them into
the spectrum through an MDAC. To store the appropriate
signals for 64 different pitch periods would require
approximately 8k bytes of ROM. The second problem with this
frequency domain multiplication technique is that it would
be very difficult to remove the frame to frame
discontinuities in the resulting synthetic speech.
The alternative reconstruction method is the time 
domain convolution of the excitation source and the vocal 
tract impulse response. The convolution of a signal with a 
train of impulses is equivalent to the addition of delayed 
replicas of that signal with the delay equalling the impulse
repetition period. This operation may be conveniently
implemented using a tapped CCD delay line with on-off 
switches at each tap. 
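
The equivalence between convolution with an impulse train and the addition of delayed replicas can be demonstrated directly. In the sketch below the list of 1's and 0's plays the role of the on-off tap switches; the function names are hypothetical and the code models the arithmetic only, not the CCD hardware.

```python
def convolve(signal, taps):
    """Direct convolution of a signal with a tap weight sequence."""
    out = [0.0] * (len(signal) + len(taps) - 1)
    for i, s in enumerate(signal):
        for j, t in enumerate(taps):
            out[i + j] += s * t
    return out


def sum_delayed_replicas(signal, taps):
    """Tapped delay line view: add one delayed copy of the signal per closed tap."""
    out = [0.0] * (len(signal) + len(taps) - 1)
    for delay, closed in enumerate(taps):
        if closed:
            for i, s in enumerate(signal):
                out[delay + i] += s
    return out
```

With taps closed every third position, for example, both routines give the same output: copies of the impulse response repeated at a three sample pitch period.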
The first attempt using this technique is shown in
Fig.6.18. Here, the convolution reference is calculated
from the pitch data and the resulting sequence of 1's and
0's loaded into the tap switches, starting at the first tap.
One 20ms frame of impulse response is then clocked through
the delay line before a new excitation sequence is loaded.
The synthetic speech in Fig.6.18 demonstrates the main
defect of this fixed convolution approach. Discontinuities
in pitch arise because the delay between the last excitation
impulse of one frame and the first of the next frame is not
an exact pitch period but is the remaining delay after an
integral number of pitch periods have been subtracted from
Fig.6.18 Fixed Convolution of Impulse Responses (Handset processing). Horizontal scale 5ms/div; annotations mark the unvoiced segment and a random noise excitation amplitude that is too large
the fixed frame delay. A scheme is therefore required which
takes into account the pitch period from the previous frame.
The flow diagram for an improved reconstruction
algorithm is given in Fig.6.19. The controlling element in
this scheme is a counter which is clocked synchronously with
the CCD delay line and is reset at the start of each frame.
In other words, the counter effectively follows the leading
edge of each frame as it moves along the delay line
register. At each address, the delay line tap switch is
interrogated and, if the switch is found to be closed, it is
opened. The counter address is then compared with the "next
pulse" address and, if there is agreement, the tap which has
been closed remains closed for a full cycle of the counter,
COMPUTER SIMULATION OF A CCD CHANNEL VOCODER 
Channel Synthesiser Simulation 
Page 201 
START 
Rood mat frc=2 of data 
r COUNTER 	1 
iB 
op 9i1tCh 	 QO 





NEXTPULSE 	 NO <1 
 	
er address 
address equal to  
count 
YES 
Close switch at counter oddress 
PITCH 	0 YES 
	
(unvoiced) 	 table for EIWULSE 
 Look . up 
FIEXTPLLSE 	I4EXTPULSE 4 PITCH I 	I 
Shift CCD dew one stage and i/p ne point 
Add data with switches closed & o/p 
L COUNTER o COIMTER4I 1 
is 
Wo 	COTER: N+1 
[ PULSE 
	 Fig. 6.19 Flow Chart 
YES 	 Of ReconstuctiOL 




COMPUTER SIMULATION OF A CCD CHANNEL V'OCODER 	 Page 202 
Channel Synthesiser Simulation 
thereby allowing one complete frame of impulse data to be 
output from each selected tap. After a switch has been 
closed, the new "next pulse" address is calculated, which, 
in the case of voiced speech, is the old address added to 
the pitch input number. For unvoiced speech, the "next 
pulse" address is accessed from a random sequence look-up 
table. At the end of each frame, the "next pulse" address 
will point to a location outwith the range of the delay line 
and, in order to reset for the next impulse response frame, 
N tap locations are subtracted from this address (N is the 
number of CCD tapso This method ensures smooth 
reconstruction of the speech. 
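The address arithmetic of the improved algorithm can be illustrated with a short Python sketch. This is a simplified model, not the simulation program itself: it tracks only the positions of the excitation pulses for voiced speech, and the function names, the 160-tap frame length and the 64-sample pitch period are assumptions made for the example. The fixed scheme of Fig.6.18 is included for comparison.

```python
def improved_pulse_positions(pitch_per_frame, n_taps):
    """Absolute sample positions of excitation pulses (voiced frames only).

    pitch_per_frame: pitch period in samples for each frame.
    n_taps: number of delay line taps, equal to the frame length in samples.
    """
    positions = []
    next_pulse = 1                      # first pulse at the first tap
    for frame, pitch in enumerate(pitch_per_frame):
        for counter in range(1, n_taps + 1):
            if counter == next_pulse:
                positions.append(frame * n_taps + counter)
                next_pulse += pitch     # old address plus pitch number
        next_pulse -= n_taps            # carry the remainder into the next frame
    return positions


def fixed_pulse_positions(pitch_per_frame, n_taps):
    """First-attempt scheme: every frame restarts at the first tap."""
    positions = []
    for frame, pitch in enumerate(pitch_per_frame):
        positions.extend(frame * n_taps + c for c in range(1, n_taps + 1, pitch))
    return positions


def spacings(positions):
    """Distances between consecutive pulses."""
    return [b - a for a, b in zip(positions, positions[1:])]


# Hypothetical example: three 20ms frames (160 samples at 8kHz), 8ms pitch period.
improved = spacings(improved_pulse_positions([64, 64, 64], 160))
fixed = spacings(fixed_pulse_positions([64, 64, 64], 160))
print(improved)
print(fixed)
```

In this example the improved scheme keeps every pulse spacing at 64 samples, whereas the fixed scheme produces a shortened 32-sample interval at each frame boundary, which is the pitch discontinuity described above.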
Fig.6.20 shows the reconstruction of voiced speech
using the improved algorithm. Each impulse response has
been smoothly connected at the pitch rate so that frame
discontinuities are avoided. The change from voiced to
unvoiced speech is demonstrated in Fig.6.21 and again there
is no interference from the frame periodicity. The 
amplitude of the unvoiced segment of speech has been 
adjusted for optimum audio clarity. 
6.3.4 Synthesiser Performance 
The synthesiser's performance was assessed by comparing
the output from a Marvox synthesiser with that from the
simulation when the common input was a digit stream
generated by the analyser computer simulation. The
spectrograms for the original speech and the Marvox
synthesiser's output are shown in Fig.6.14 and Fig.6.15.

Fig.6.21 Change from Voiced to Unvoiced Speech
[Waveform labels: part of vowel, impulse response]
Fig.6.22 shows the corresponding narrowband and
wideband spectrograms for the computer generated synthetic 
speech. Although these spectrograms are of rather poor 
quality, it can be seen from the narrowband analysis that 
the pitch contour is completely broken up and shows little 
resemblance to the original. Several explanations are 
possible. Firstly, there is no pitch smoothing low-pass 
filter in the computer simulation and, secondly, the pitch 
period is fixed for each 20ms frame, whereas in the Marvox 
synthesiser the pitch period is continuously changing. 
However, both of these effects will produce only small steps 
in the pitch contour and are not responsible for the rather 
large deviations in Fig.6.22. The main cause of distortion
results from the reconstruction by convolution. When each
20ms impulse response is convolved with the periodic pulse
train (which has a period of less than 20ms), consecutive
impulse responses overlap, thereby creating discontinuities
and generating spurious periodicities. The amount of
distortion depends on the pitch period and on the width of
the most significant part of the impulse response. The
wider impulse responses, generated from spectral summation
in alternate anti-phase, have a significant width of
typically 10ms, whereas those from the in-phase summation
last for 2ms. Since the wider impulse responses have been
employed here, the distortion is severe. Reconstruction
using the narrow impulse responses has been attempted, but,
although the pitch is improved, the speech has a very
mechanical sound.

Fig.6.22 Spectrograms from the Analyser / Synthesiser Simulation
[Narrowband and wideband spectrograms of "I was stunned by the beauty of the view"; vert. 500Hz/div, horiz. 100ms/div]
The wideband analysis in Fig.6.22 shows that the
formants are not as well defined as those obtained from the
Marvox. Also, the coarse time quantisation (20ms) is
apparent since continuous smoothing is not possible when the
speech is reconstructed in frames. (The spectral data are
smoothed from frame to frame by a low-pass filter in the
Marvox synthesiser.)
Listening tests have shown that although the speech is 
intelligible, the rough nature of the pitch detracts 
considerably from the synthetic speech quality. 
Unfortunately, the available time did not permit more 
detailed investigation of these problems. 
6.3.5 Summary of Synthesiser Simulation Conclusions 
The following conclusions were drawn from the
synthesiser simulation:
1) the inverse DFT processor should have the same
   characteristics as the forward processor in the
   analyser (i.e. 256-point) but, in addition, should
   be capable of providing the real part of the output
   and not just the modulus (the sliding transform is
   therefore not applicable)
2) the frequency domain multiplication of the
   excitation source is not easy to implement and may
   cause frame discontinuities
3) time domain convolution of the excitation source
   data can remove frame discontinuities but
   introduces pitch discontinuities if the impulse
   responses are of longer duration than the pitch
   period
4) to date, the synthetic speech quality obtained from
   this simulation is not comparable with that from
   the Marvox synthesiser
5) further simulation is necessary to eliminate
   completely discontinuity problems in this
   synthesiser.
CHAPTER 7
THE OPTIMAL DESIGN OF A CCD
CHANNEL VOCODER
This chapter is intended to merge the channel vocoder
simulation results with the experience gained from CCD
hardware. A possible implementation of a novel channel
analyser is described in section 7.1 and the corresponding
synthesiser is presented in the following section. Both are
designed to interface with a conventional channel vocoder so
that each may be operated independently. Section 7.3
compares the advantages and disadvantages of this particular
implementation with the alternative CCD parallel filter bank
vocoder.
7.1 ANALYSER CONFIGURATION
In order to maximise hardware efficiency, it is
necessary to multiplex a single DFT processor. Because
sliding processors cannot be multiplexed (without defeating
their main advantages), the choice for central processor is
therefore between the direct Chirp Z-transform (CZT) [6] and
the Prime Transform (PT) [88]. Since in this particular
application the input data do not have an imaginary part,
the PT architecture (Fig.4.14) is chosen in preference to
the CZT. This choice reduces the overall hardware
configuration by two CCD convolvers and six analogue
multipliers. The PT input and output permutations can be
incorporated into the addressing codes for existing input
and output analogue storage without further hardware cost.
In accordance with the computer simulation conclusions from
chapter 6, a 257-point transform is selected, 257 being the
nearest prime number to 256. (It is assumed here that the
accuracy of Analogue Random Access Memory (ARAM) will be
increased through development from 5% to 2%.)
The complete analyser simulation is outlined in
Fig.7.1. The input speech is amplified, emphasised at
+6dB/octave from 1kHz and low-pass filtered to prevent
aliasing (3.8kHz). This pre-processed speech is then
sampled at 8kHz and passed to the first of three main
processing stages.
Fig.7.1 Analyser Hardware Configuration
[Block diagram: audio amplifier, pre-emphasis, anti-alias low-pass filter, time compression of the pre-processed speech, permuted data to the central processor]
Fig.7.2 [caption text lost]
[Block diagram: two 256-stage ARAMs with read/write enables (timing waveforms, Fig.7.5), address counter and ROM; 8kHz input clock; permuted output data to Hamming weighting (Fig.7.3)]
Fig.7.3 Prime Transform Central Processor
The first stage, which is shown in more detail in
Fig.7.2, is designed to time-compress and permute the input
data. Time compression is necessary since the analyser
computes two DFTs (each with 50% duty cycle) on 32ms
segments of data (256 points at 8kHz) per 20ms frame. Two
256-stage ARAMs are connected in parallel to permit
overlapping of data; the registers are loaded at a clock
rate of 8kHz and are emptied every alternate frame at
51.2kHz (see timing diagram, Fig.7.5). These signals are
therefore time compressed by a factor of 6.4. The order in
which the data are read out is controlled by a ROM
containing the permutation code (eqn.4.3j) for a 257-point
PT. Two extra sample and hold circuits are required to
store the first data points in each frame (x0) for the PT
algorithm (see eqn.4.33b).
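The kind of permutation code held in such a ROM can be sketched in Python. The equations of chapter 4 are not reproduced in this section, so the following is an assumption based on the usual prime transform construction, in which the non-zero indices are reordered as successive powers of a primitive root modulo the prime transform length. The choice of 3 as the primitive root and the function name are illustrative only.

```python
def primitive_root_permutation(p, g):
    """Return [g^0, g^1, ..., g^(p-2)] modulo the prime p."""
    perm = []
    value = 1
    for _ in range(p - 1):
        perm.append(value)
        value = (value * g) % p
    return perm


P = 257   # transform length (prime)
G = 3     # assumed primitive root of 257

perm = primitive_root_permutation(P, G)

# Every non-zero index 1..256 should appear exactly once.
is_permutation = sorted(perm) == list(range(1, P))

# Products of permuted indices become sums of ROM addresses, which is
# what allows the transform to be computed by a transversal filter.
product_becomes_sum = all(
    (perm[a] * perm[b]) % P == perm[(a + b) % (P - 1)]
    for a in range(0, P - 1, 17)
    for b in range(0, P - 1, 13)
)
print(is_permutation, product_becomes_sum)
```

The index 0 is excluded from the permutation, which is consistent with the separate sample and hold storage of the first data point described above.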
The next processing section is the heart of the PT
(Fig.7.3). The time-compressed data (256 points in 5ms) are
multiplied by a Hamming window (permuted) to reduce
frequency domain leakage and then loaded into two 511-stage
(2N-3) CCD split gate transversal filters. Equations 4.37
and 4.38 give the transversal filter weighting coefficients.
The transversal filter outputs are added to the spectral
offset, x0, before the two channels are combined by a
modulus circuit to produce a permuted amplitude spectrum.
An approximation to the modulus function (section 5.3.3) is
sufficient here because of the coarse spectral quantisation
which follows. The amplitude spectrum is passed to the
output stages for coding and is also fed back to the
correlator inputs via a logarithmic amplifier and a
256-stage delay line. Inverse permutation is not necessary
at the output of the modulus stage since the feedback path
also misses out the forward permutation. The nominal PT
clock rate is 51.2kHz, allowing the calculation of two
257-point transforms per 20ms frame (actually 256 points
because X0, the dc coefficient, is not computed). The
processor's input/output sequence is as follows:
TIME        PT INPUT           PT OUTPUT
0-5ms       Speech data        Undefined
5-10ms      Zero               Amp. spec.
10-15ms     Log. amp. spec.    Undefined
15-20ms     Zero               Cepstrum

Fig.7.4 Spectrum and Cepstrum Processing
[Block diagram: 256-stage ARAM (spectrum); filter coefficient ROM (380 x 12) holding permuted addresses and 4-bit coefficients; 4-bit MDAC; analogue accumulate and dump (19 analogue samples); logarithmic delta coder (39 bits of spectrum); cepstrum to pitch encoder]
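The timing figures quoted for the analyser are mutually consistent, as the following Python sketch of the arithmetic shows. The constants are taken from the text; the variable names are chosen for this illustration.

```python
SAMPLE_RATE_HZ = 8_000     # speech sampling rate
PT_CLOCK_HZ = 51_200       # nominal Prime Transform clock rate
POINTS = 256               # points per transform block
FRAME_MS = 20              # analyser frame length

segment_ms = 1000 * POINTS / SAMPLE_RATE_HZ             # duration of 256 speech samples
slot_ms = 1000 * POINTS / PT_CLOCK_HZ                   # time to clock 256 points through the PT
compression = PT_CLOCK_HZ / SAMPLE_RATE_HZ              # time compression factor
slots_per_frame = FRAME_MS / slot_ms                    # PT load/unload slots per frame
samples_per_frame = SAMPLE_RATE_HZ * FRAME_MS // 1000   # new speech samples per frame
overlap_samples = POINTS - samples_per_frame            # samples shared by adjacent segments

print(segment_ms, slot_ms, compression, slots_per_frame,
      samples_per_frame, overlap_samples)
```

Four 5ms slots fill the 20ms frame exactly: two loads (speech data, log amplitude spectrum) and two outputs (amplitude spectrum, cepstrum), so each transform occupies the processor with a 50% duty cycle.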
The 256-point spectra and cepstra are each gated to
their respective output processing stages (Fig.7.4). The
spectra are stored in a 256-stage ARAM to facilitate the
channel compression scheme detailed in section 6.2.3. A
380x12 bit ROM stores the permuted frequency bin addresses
(8 bits) and also the 4-bit weighting coefficients (Table
6.1) which are multiplied into the frequency components by a
4-bit multiplying D to A converter. An analogue accumulate
and dump circuit sums 20 weighted samples to give an
approximation to each of the 19 Butterworth filter
characteristics. The channel 2 and channel 19 outputs are
fed to the pitch detector to assist in the V/UV decision.
Finally, the channel outputs are coded by a logarithmic
delta modulation scheme, normally implemented by a
comparator, an up-down counter and a D to A converter in the
feedback loop.
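The channel compression step can be sketched in Python as a weighted accumulate and dump over ROM entries. The ROM organisation (19 channels of 20 entries, each an 8-bit address and a 4-bit weight) follows the text; the ROM contents below, the scaling of the 4-bit weight and the use of natural rather than permuted bin order are assumptions made purely for illustration, since Table 6.1 is not reproduced here.

```python
CHANNELS = 19
SAMPLES_PER_CHANNEL = 20
ADDRESS_BITS = 8
WEIGHT_BITS = 4


def accumulate_and_dump(spectrum, rom):
    """Sum 20 weighted spectral samples per channel, then dump the total."""
    full_scale = (1 << WEIGHT_BITS) - 1          # assumed MDAC full scale (15)
    outputs = []
    for start in range(0, len(rom), SAMPLES_PER_CHANNEL):
        total = 0.0
        for address, weight in rom[start:start + SAMPLES_PER_CHANNEL]:
            total += spectrum[address] * weight / full_scale
        outputs.append(total)
    return outputs


# Hypothetical ROM: overlapping triangular weights, 12 bins between channels.
rom = []
for channel in range(CHANNELS):
    first_bin = 12 * channel
    for k in range(SAMPLES_PER_CHANNEL):
        weight = min(k, SAMPLES_PER_CHANNEL - 1 - k) + 1   # 1..10..1
        rom.append((first_bin + k, weight))

rom_bytes = len(rom) * (ADDRESS_BITS + WEIGHT_BITS) // 8
flat_spectrum = [1.0] * 256
channels = accumulate_and_dump(flat_spectrum, rom)
print(len(rom), rom_bytes, len(channels))
```

With 380 words of 12 bits, the coefficient store amounts to 570 bytes.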
Fig.7.5 Analyser Timing Diagram
[Timing waveforms over 0-80ms: ARAM 1 and ARAM 2 read/write enables and sample and hold controls; effective ARAM clock (8kHz); input select (spectrum/cepstrum); Prime Transform slots: (a) Speech Input, (b) Spectrum Output, (c) Spectrum Input, (d) Cepstrum Output]
The pitch extraction process consists of detecting and
storing the largest and second largest peaks in the
cepstrum. These peaks are selected according to the flow
diagram in Fig.6.11 and stored in two sample and hold
circuits. Decision logic (Fig.6.13) then estimates the
pitch information which is converted to the appropriate
output coding (Table 6.2) by a decoder. The analyser frame
is completed by emptying the frame storage register in
serial at 2400bps.
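The peak storage operation can be illustrated by a single pass over the cepstrum which holds the two largest samples, in the manner of two sample and hold circuits. The Python sketch below is an assumption about the general form of this step only; it does not reproduce the flow diagram of Fig.6.11 or the decision logic of Fig.6.13, and the search range and test values are hypothetical.

```python
def two_largest(cepstrum, start, stop):
    """One pass over cepstrum[start:stop], holding the two largest samples.

    Returns ((value1, index1), (value2, index2)).
    """
    first = (float("-inf"), None)
    second = (float("-inf"), None)
    for index in range(start, stop):
        value = cepstrum[index]
        if value > first[0]:
            second = first              # previous largest becomes second largest
            first = (value, index)
        elif value > second[0]:
            second = (value, index)
    return first, second


# Hypothetical cepstrum with a main peak at 40 samples and a rahmonic at 80.
SAMPLE_RATE_HZ = 8_000
cepstrum = [0.0] * 128
cepstrum[20] = 0.3
cepstrum[40] = 0.9
cepstrum[80] = 0.5

largest, second_largest = two_largest(cepstrum, 16, 128)
pitch_period_ms = 1000 * largest[1] / SAMPLE_RATE_HZ
print(largest, second_largest, pitch_period_ms)
```

Here the largest peak at 40 samples corresponds to a pitch period of 5ms at the 8kHz sampling rate.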
7.2 SYNTHESISER CONFIGURATION
The input data are demultiplexed at 2.4kHz to provide
spectral, pitch and sync. information (Fig.7.6). The
sync. bit is used to control the master clock from which all
timing functions are generated. After one frame of spectral
data has been received and stored, the 39 data bits are
anti-logarithmically D to A converted to form 19 channel
amplitudes which are subsequently stored in ARAM. A
256-point permuted spectrum is then reformed by weighting
and accumulating a set of contiguous filter bank
characteristics (section 6.3.1). Three samples are
accumulated for each spectral point since, at most, three
filter characteristics overlap.
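The frame bit budget implied by these figures can be checked with a few lines of Python. The bit rate, frame length and number of spectral bits are taken from the text; how the remaining bits are divided between pitch and sync. is not stated in this section and is therefore left unspecified.

```python
BIT_RATE_BPS = 2400    # serial data rate
FRAME_MS = 20          # frame length
SPECTRAL_BITS = 39     # bits of spectral data per frame

bits_per_frame = BIT_RATE_BPS * FRAME_MS // 1000    # total bits in one frame
remaining_bits = bits_per_frame - SPECTRAL_BITS     # left for pitch and sync.

print(bits_per_frame, remaining_bits)
```

The 48-bit total agrees with the 48-bit input latch shown in Fig.7.6.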
Only one 257-point inverse PT is required for each 20ms
frame in the synthesiser and, because the spectral data are
real and even, the inverse PT reduces to a single channel.
The transform clock rate is therefore chosen to be 25.6kHz
so that the spectral data are read in during the first 10ms
of each frame and the time-compressed impulse response data
are output into ARAM during the second 10ms period. At the
[...] transferred to the speech reconstruction algorithm (see
timing diagram in Fig.7.8). The order and rate of this data
transfer is controlled by an address ROM clocked at 8kHz
which combines four functions: (1) inverse permutation, (2)
impulse response rotation, (3) truncation to 160 points and
(4) time expansion to the original speech sampling rate.

Fig.7.6 Synthesiser Impulse Response Generation
[Block diagram: 2400bps input latch (48 bits); pitch frame delay; anti-log D to A converter; accumulate and dump forming the 256-point spectrum; filter coefficient ROM (76.8kHz); 256-stage ARAM (25.6kHz); address counter (timing diagram, Fig.7.8)]
The main component in the speech reconstruction
algorithm (Fig.7.7) is a 160-stage CCD transversal filter
with binary weights. The operation of this stage is well
described by the flow diagram in Fig.6.19 and its
accompanying text. Note that the gain of the transversal
filter is reduced for unvoiced reconstruction to ensure
proper balance in the synthetic speech. A sample and hold
circuit is included at the output of the transversal filter
to remove CCD clock breakthrough. The synthetic speech is
output after low-pass filtering at 3.8kHz and frequency
de-emphasis.
A half-duplex 	vocoder 	may 	be 	constructed 	by 
multiplexing the major processing blocks in the analyser 
with those in the synthesiser. Alternatively, a full duplex 
vocoder may be achieved by increasing the analyser [...]

Fig.7.7 Synthetic Speech Reconstruction
[Block diagram: voiced/unvoiced register; 160-stage tapped CCD with tap switch control; 8-bit adder; address counter (8 bits) clocked at 8kHz; master clock and timing logic; analogue sum, amplifier and synthetic speech output]
Fig.7.8 Synthesizer Timing
[Timing waveforms over 0-80ms: read/write enables for ARAM 1, ARAM 2 (256 points) and ARAM 3 (write 160 points, read); frame output alternating between ARAM 2 and ARAM 3]
7.3 COMPARISON WITH A CCD PARALLEL FILTER BANK VOCODER
In 1978, Hewes et.al. [30] from Texas Instruments
published details of a channel vocoder implemented using CCD
and switched capacitor technology. This vocoder, almost an
exact copy of the Marvox algorithm, is implemented with two
custom designed CCD/NMOS integrated circuits and 5
microcomputers (TMS990). The CCD analyser chip contains 19
parallel CCD band-pass filters, 19 full-wave rectifiers, 19
switched capacitor low-pass filters, a multiplexer and a
logarithmic A to D converter. Three of the microcomputers
are utilised in a pitch detector while the other two handle
data coding. The synthesiser chip houses the full Marvox
synthesiser.
Table 7.1 summarises and compares the component
requirements for this filter bank vocoder with those for the
cepstral vocoder. In terms of analogue storage, each of the
analyser implementations is comparable; however, ARAM
consumes more chip space than does CCD. On the other hand,
19 CCD transversal filters need 19 summing amplifiers
consuming both power and chip space. The other analogue
circuitry in each analyser may be compared in terms of the
total number of amplifiers. The filter bank analyser uses
almost double the amplifiers employed by the cepstral
channel analyser. In addition to the analogue components,
the cepstral implementation requires a considerable amount
of digital circuitry. By far the largest item is the 1170
bytes of ROM, of which almost half is used to represent the
19 Butterworth filter characteristics. It seems likely that
further development would permit a more efficient
approximation to these filter characteristics. For example,
if only one characteristic was stored for each filter with
equal bandwidth, the filter coefficient storage could be
reduced by approximately 80%. It should be noted that the
extra digital hardware in the cepstral analyser should be
compared with the four microcomputers employed by the filter
bank analyser since pitch detection and frame coding are
included in the components list.

Table 7.1

(a) CCD Cepstral Vocoder
    Analyser Chip                        Synthesiser Chip
  Analogue
    1278 stages of CCD + 3 amps.         511 stages of CCD + 1 amp.
    768 stages of ARAM                   160 stages tapped CCD + 1 amp.
    5 Sample and Holds                   531 stages of ARAM
    3 Summers                            1 Sample and Hold
    3 Comparators                        1 Accumulate and Dump
    1 Log. amp.
    2 Rectifiers
    1 Accumulate and Dump
  Digital
    1170 bytes ROM                       610 bytes ROM
    6 bytes RAM                          8 bytes RAM
    1 - 8 bit MDAC                       1 - 4 bit MDAC
    1 - 4 bit MDAC                       1 - 4 bit anti-log DAC
    1 - 3 bit DAC                        1 - 8 to 160 line decoder
    Misc. Logic                          Misc. Logic
  External Componentry for Complete System
    Amplifier                            Amplifier
    Low-pass Filter                      Low-pass Filter
    Pre-emphasis Filter                  De-emphasis Filter

(b) CCD Filter Bank Vocoder
    Analyser Chip                        Synthesiser Chip
  Analogue
    1900 stages of CCD + 19 amps.        1 anti-log DAC (5 bits)
    19 Rectifiers                        1 Channel Demultiplexer
    19 Switched Cap. LPFs (38 amps.)     19 Sample and Holds
    1 - 19 channel analogue multiplexer  19 Switched Cap. LPFs (38 amps.)
    1 - 5 bit log. A to D converter      1 Summer
  Digital
    Misc. Logic                          Misc. Logic
  External Componentry for Complete System
    Amplifier                            Amplifier
    Low-pass Filter                      Low-pass Filter
    Pre-emphasis Filter                  De-emphasis Filter
    3 Microcomputers (pitch detector)    1 Microcomputer (frame format)
    1 Microcomputer (frame format)
Comparing the synthesiser component counts, it can be 
seen that the major item in the filter bank implementation 
is the number of amplifiers whereas, because processing is 
serial in the other synthesiser, the major cost is analogue 
and digital storage. Although the total chip area may
prove similar, the switched capacitor filter bank will
always have an advantage, since the performance is limited
only by the operational amplifiers. 
In summary, it is considered that the CCD DFT based
analyser offers an equivalent performance to that of the CCD
filter bank approach [30] but will provide a more efficient
hardware implementation. The new synthesiser, however, will
operate with decreased fidelity (without further
development) due to pitch discontinuities and is unlikely to
achieve vastly superior engineering benefits. The optimum
analogue channel vocoder implementation points to a
combination of the cepstral analyser and the filter bank
synthesiser. It appears possible that this system will
enable the construction of a two chip vocoder.
CHAPTER 8 
CONCLUSIONS 
The design of CCD Fourier transform processors and
their application in low bit rate speech communication
systems have been investigated in this thesis. In
particular, a novel implementation of the channel vocoder
has been developed.
In the author's experience, many people have a poor 
understanding of the DFT and its derivatives; chapter 4 has 
been written bearing this in mind and attempts to clarify 
some of the main areas of confusion. The mathematical 
concepts are translated into realisable hardware structures 
and attention is drawn to cases where suitable restrictions 
permit hardware reduction. Practical considerations are 
emphasised 	throughout 	and performance limitations are 
compared. It is shown that in real-time applications
demanding high accuracy coupled with high resolution, there 
is at present no alternative to the digital FFT. However, 
in cases where reduced accuracy can be tolerated, analogue 
CCD Fourier transform processors such as the CZT and the PT 
offer up to 512 point transforms on a single IC. Thus the 
module size and power consumption are reduced considerably 
when compared to current digital FFTs. This is achieved at 
the expense of reduced accuracy (e.g. 1%) which gives a 
typical processor error or sidelobe level of approximately 
-40dB. Considering noise only, the output exhibits 50-60dB 
signal to noise ratio. A comparison between the CZT and the 
PT has shown that in terms of accuracy, there is little 
difference between the CZT and the PT, but with restricted 
input conditions (e.g., a typical speech waveform which is 
real), the PT configuration reduces the hardware by a factor 
of two. 
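The relation between the fractional accuracy and the error level quoted above follows from the usual amplitude ratio in decibels, as the short Python sketch below shows. The function name is chosen for this illustration; the 5% and 2% values are the ARAM accuracies assumed in chapter 7.

```python
import math


def error_level_db(fractional_error):
    """Amplitude error expressed in dB relative to full scale."""
    return 20.0 * math.log10(fractional_error)


for error in (0.01, 0.02, 0.05):
    print(f"{error:.0%} error -> {error_level_db(error):.1f} dB")
```

A 1% error therefore corresponds to approximately -40dB, in agreement with the sidelobe level stated in the text.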
Chapter 5 described the design and construction of a 
CCD CZT processor. The computer simulation employed here 
permitted the selection of component tolerances for any
given transform accuracy. The spectrum analyser, which was
constructed in CCD hardware, computed either a 32-point
direct CZT or a 64-point sliding transform, with or without
Hamming weighting. It met all of the initial design
specifications except the output linearity, which was
limited to 30dB by the output transconductance multipliers
in the modulus circuit. The alternative linear
approximation circuit (+1/2dB ripple) increased the dynamic
range to 50-60dB but limited the processor speed to 100kHz.
From this discussion, it is clear that the practical design 
of the processor output stage requires further attention if 
40dB linear dynamic range is to be achieved. The author 
considers that chapters 4 and 5 together form a 
comprehensive guide to the design of CCD Fourier transform 
processors. 
On-line computer simulations, detailed in chapter 6, 
permitted the design of a channel vocoder based on DFT 
techniques. This clearly showed that the sliding transform
was not suitable for the short-time spectrum analysis of
speech. The simulation further showed that our DFT based
cepstral pitch detector technique provided superior results
on all speech except for very high pitched female voices.
This was verified by comparative listening tests with speech
synthesised from a discrete filter bank. The author
therefore believes that CCD DFT processors offer significant
practical advantages when used for pitch detection. The 
simulated channel synthesiser was not as successful as the 
analyser due to discontinuities arising from the overlapping 
of impulse responses. Further simulation work is therefore 
necessary to improve the synthetic speech quality. 
Finally, the simulation results and practical
experience gained from CCD DFT processors have been combined
in Chapter 7 to show how the channel vocoder might best be
implemented using sample-data analogue signal processing
techniques. As discussed in chapter 7, a PT configuration
was selected for the central processing unit. From an
engineering point of view, the cepstral analyser compares
favourably with an integrated CCD filter bank in terms of
chip size, power consumption and hence cost. However, since
the DFT based analyser performs both the required spectrum
analysis and pitch detection without the need for external
microcomputers, it is now believed to be the more attractive
implementation. In contrast, the synthesiser does not
promise any particular performance advantage and is more
[...] therefore considered that the optimum CCD channel vocoder
configuration will be based on a combination of the cepstral 
analyser and the filter bank synthesiser. Using this 
approach, a two chip vocoder implementation is feasible. 
However, before chip design can commence, a discrete 
hardware model of the vocoder should be built to permit the 
optimisation of the analyser filter bank coefficients 
thereby reducing storage requirements. This will also allow 
thorough intelligibility testing to be undertaken with a 
large sample of male and female speakers. 
In summary, this thesis has shown conclusively that, at 
the present time, the analogue CCD has considerable 
importance in speech processing systems. However, this
conclusion must be treated with caution since the rate of 
progress in digital signal processing (e.g. high speed bit 
slice microprocessors) is such that within a few years most 
speech processing may be performed digitally. 
REFERENCES 
1.  CUCCIA,C.L.: "Bandwidth Conservation is Essential",
    Microwave Systems News, Oct.1978, pp.67-72.
2.  GOLD,B. and RADER,C.M.: "The Channel Vocoder", IEEE
    Trans. Audio and Electroacoustics, Dec.1967,
    Vol.AU-15, No.4, pp.148-161.
3.  ATAL,B.S. and HANAUER,S.L.: "Speech Analysis and
    Synthesis by Linear Prediction of the Speech Wave",
    J. Acoust. Soc. Amer., 1971, Vol.50, pp.637-655.
4.  HOWES,M.J. and MORGAN,D.V.: "Charge Coupled Devices
    and Systems", John Wiley and Sons, 1979.
5.  TURIN,G.L.: "An Introduction to Matched Filters",
    IRE Trans. Inform. Theory, June 1960, Vol.IT-6,
    pp.311-329.
6.  EVERSOLE,W.L. et.al.: "Spectral Analysis using the
    CCD Chirp Z-Transform", AGARD Conf. Proc. No.230,
    Oct.1977, Paper No.5.3.
7.  WHITE,M.H. et.al.: "CCD Analog Adaptive Signal
    Processing", Proc. CCD Applications Conf., San
    Diego, 1978, pp.3A1-3A14.
8.  DENYER,P.B. et.al.: "A Programmable CCD Transversal
    Filter: Design and Application", Proc. CCD
    Applications Conf., San Diego, 1978, pp.3B11-3B21.
9.  WILKINSON,R.M.: "Delta Modulation Techniques for
    Analogue to Digital Conversion of Speech Signals",
    Signals Research and Development Establishment
    Report no.69022, Apr.1969.
10. KING,R.A. and GOSLING,W.: "Time Encoded Speech
    (TES)", IEE Int. Specialist Seminar on Case Studies
    in Advanced Signal Processing, Conf. Proc.,
    Peebles, Sept.1979, to be published.
11. FLANAGAN,J.L.: "Speech Analysis, Synthesis and
    Perception", Springer-Verlag, New York, 1972.
12. HOLMES,J.N.: "Speech Synthesis", Mills and Boon
    Monograph EE/7, 1972.
13. GILL,J.S.: "Improvements in or relating to Larynx
    Excitation Period Detectors", U.K. Patent Applic.
    No.10525/65, May 1965.
15. GOLD,B. and RABINER,L.R.: "Parallel Processing
    Techniques for Estimating Pitch Periods of Speech
    in the Time Domain", J. Acoust. Soc. Amer.,
    Aug.1969, Vol.46, pp.442-448.
16. SONDHI,M.M.: "New Methods of Pitch Extraction",
    IEEE Trans. Audio Electroacoust., June 1968,
    Vol.AU-16, pp.262-266.
17. DUBNOWSKI,J.J., SCHAFER,R.W. and RABINER,L.R.:
    "Real-time Digital Hardware Pitch Detector", IEEE
    Trans. Acoust., Speech and Sig. Proc., Feb.1976,
    Vol.ASSP-24, pp.2-8.
18. ROSS,M.J. et.al.: "Average Magnitude Difference
    Function Pitch Extractor", IEEE Trans. Acoust.,
    Speech and Sig. Proc., Oct.1974, Vol.ASSP-22,
    pp.353-362.
19. MAKSYM,J.N.: "Real-time Pitch Extraction by
    Adaptive Prediction of the Speech Waveform", IEEE
    Trans. Audio and Electroacoust., June 1973,
    Vol.AU-21, No.3, pp.149-154.
20. MOORER,J.A.: "The Optimum Comb Method of Pitch
    Period Analysis of Continuous Digitized Speech",
    IEEE Trans. Acoust., Speech and Sig. Proc.,
    Oct.1974, Vol.ASSP-22, No.5, pp.
21. NOLL,A.M.: "Cepstrum Pitch Determination", J.
    Acoust. Soc. Amer., Feb.1967, Vol.41, No.2,
    pp.293-309.
22. RABINER,L.R. et.al.: "A Comparative Performance
    Study of Several Pitch Detection Algorithms", IEEE
    Trans. Acoust., Speech and Sig. Proc., Oct.1976,
    Vol.ASSP-24, No.5, pp.399-418.
23. DUDLEY,H.W.: "The Vocoder", Bell Labs. Rec., 1939,
    Vol.17, pp.122-126.
24. HOLMES,J.N.: "A Variable Frame Rate Coding Scheme
    for Speech Analysis-Synthesis Systems", Electronic
    Letters, Apr.1974, Vol.10, No.7, pp.101-102.
25. HOLMES,J.N., private communication.
26. RADER,C.M.: "Spectra of Vocoder Channel Signal", J.
    Acoust. Soc. Amer., 1963, Vol.35, p.305.
27. KELLY,L.C.: "Speech and Vocoders", The Radio and
    Electronic Engineer, Aug.1970, Vol.40, No.2,
    pp.73-82.
28. BIALLY,T. and ANDERSON,W.M.: "A Digital Channel
    Vocoder", IEEE Trans. Comm. Tech., Aug.1970,
    Vol.COM-18, No.4, pp.435-442.
29. KINGSBURY,N.C. and KELLY,L.C.: "A Digital Filter
    Bank for Real-time Speech Analysis and Synthesis
    using Logarithmically Quantised Signals", Proc.
    Digital Processing of Signals in Communications,
    IERE Conf. Proc. No.37, pp.81-96.
30. HEWES,C.R. et.al.: "A CCD/NMOS Channel Vocoder",
    Proc. CCD Applications Conf., San Diego, 1978,
    pp.3A17-3A24.
31. MAKHOUL,J.: "Linear Prediction: A Tutorial Review",
    Proc. IEEE, Apr.1975, Vol.63, pp.561-580.
32. WIGGINS,R. and BRANTINGHAM,L.: "Three Chip System
    Synthesizes Human Speech", Electronics, Aug.1978,
    pp.109-116.
33. VISWANATHAN,R. and MAKHOUL,J.: "Quantisation
    Properties of Transmission Parameters in Linear
    Predictive Systems", IEEE Trans. Acoust., Speech and
    Sig. Proc., Vol.ASSP-23, No.3, June 1975,
    pp.309-321.
34. OPPENHEIM,A.V.: "Speech Analysis-Synthesis System
    Based on Homomorphic Filtering", J. Acoust. Soc.
    Amer., 1969, Vol.45, No.2, pp.458-465.
35. OPPENHEIM,A.V. and SCHAFER,R.W.: "Homomorphic
    Analysis of Speech", IEEE Trans. Audio and
    Electroacoustics, June 1968, Vol.AU-16, No.2,
    pp.221-226.
36. WEINSTEIN,C.J. and OPPENHEIM,A.V.: "Predictive
    Coding in a Homomorphic Vocoder", IEEE Trans. Audio
    and Electroacoustics, Sept.1971, Vol.AU-19, No.3,
    pp.243-248.
37. IMAI,S.: "Low Bit Rate Cepstral Vocoder Using the
    Log Magnitude Approximation Filter", IEEE
    Conf. Proc. CH1285-6/78/0000-0441$00.75 @1978, 1978.
38. SCHAFER,R.W. and RABINER,L.R.: "System for Automatic
    Analysis of Voiced Speech", J. Acoust. Soc. Amer.,
    1970, Vol.47, pp.634-648.
39. MARKEL,J.D.: "Application of a Digital Inverse
    Filter for Automatic Formant and Fo Analysis", IEEE
    Trans. Audio and Electroacoustics, June 1973,
40. McCANDLESS,S.S.: "An Algorithm for Automatic Formant
Extraction Using Linear Prediction Spectra", IEEE
Trans. Acoust., Speech and Sig. Proc., Apr.1974,
Vol.ASSP-22, No.2, pp.135-141.
41. BELL,G.G. et al.: "Reduction of Speech Spectra By
Analysis-by-Synthesis Techniques", J. Acoust. Soc.
Amer., 1961, Vol.33, pp.1725-1736.
42. BOYLE,W.S. and SMITH,G.E.: "Charge-Coupled
Semiconductor Devices", Bell Syst. Tech. Journ.,
1970, 49, pp.587-593.
43. SEQUIN,C.H. and TOMPSETT,M.F.: "Charge Transfer
Devices", Academic Press, Inc., 1975.
44. TOMPSETT,M.F.: "Charge Transfer Devices", J. Vac.
Sci. Technol., July-Aug 1972, 9, No.4,
pp.1166-1181.
45. SZE,S.M.: "Physics of Semiconductor Devices", John
Wiley and Sons, 1969.
46. BEYNON,J.D.E.: "The Basic Principles of
Charge-Coupled Devices", Microelectronics, 1975, 7,
No.2, pp.7-13.
47. TOMPSETT,M.F., AMELIO,G.F. and SMITH,G.E.: "Charge
Coupled 8-Bit Shift Register", Appl. Phys. Lett.,
1970, 17, pp.111-115.
48. SEQUIN,C.H. and MOHSEN,A.M.: "Linearity of
Electrical Charge Injection into Charge-Coupled
Devices", IEEE J. of Solid State Circuits,
Apr.1975, SC-10, No.2, pp.81-92.
49. TOMPSETT,M.F. and ZIMANY,E.J.,Jr.: "Use of
Charge-Coupled Devices for Delaying Analog
Signals", IEEE J. Solid State Circuits, Apr.1973,
SC-8, pp.151-157.
50. TOMPSETT,M.F.: "Surface Potential Equilibration
Method of Setting Charge in Charge Coupled
Devices", IEEE Trans. Electron Devices, June 1972,
ED-22, No.6, pp.305-309.
51. BAERTSCH,R.D. et al.: "The Design and Operation of
Practical Charge Transfer Transversal Filters",
IEEE Trans. Electron Devices, Feb.1976, ED-23,
No.2, pp.133-142.
52. MACLENNAN,D.J. and MAVOR,J.: "Novel Technique for
the Linearisation of Charge Coupled Devices",





53. ARTHUR,J.W., private communication.
54. BUSS,D.D. et al.: "Transversal Filtering Using
Charge Transfer Devices", IEEE J. of Solid State
Circuits, Apr.1973, SC-8, No.2, pp.138-146.
55. MACLENNAN,D.J. et al.: "Techniques for Realising
Transversal Filters using Charge-Coupled Devices",
Proc. IEE, June 1975, 122, No.6, pp.615-619.
56. DENYER,P.B. and MAVOR,J.: "Design of CCD Delay
Lines with Floating Gate Taps", Solid State and
Electron Devices, July 1977, 1, No.4, pp.121-129.
57. DENYER,P.B. and MAVOR,J.: "Design and Development
of CCD Programmable Transversal Filters",
Electronic Circuits and Systems, Jan.1978, 2, No.1,
pp.1-8.
58. TOMPSETT,M.F.: "The Quantitative Effects of
Interface States on the Performance of
Charge-Coupled Devices", IEEE Trans. Electron
Devices, 1973, ED-20, pp.44-45.
59. CARNES,J.E., KOSONOCKY,W.F. and RAMBERG,E.G.: "Free
Charge Transport in Charge-Coupled Devices", IEEE
Trans. Electron Devices, 1972, ED-19, pp.798-808.
60. VANSTONE,G.F., ROBERTS,J.B.G. and LONG,A.E.: "The
Measurement of the Charge Residual for CCD Transfer
Using Impulse and Frequency Responses", Solid State
Electronics, 1974, 17, pp.889-895.
61. DUTTA ROY,S.C. and DAS,V.G.: "On Exact Compensation
of Transfer Inefficiency in a Charge Transfer Delay
Line", Electronics Letters, Feb.1978, 14, No.4,
pp.115-116.
62. TOZER,R.C. and HOBSON,G.S.: "Reduction of
High-Level Nonlinear Smearing in CCDs", Electronics
Letters, July 1976, 12, No.14, pp.355-356.
63. MAVOR,J., DAVIE,M.C. and DENYER,P.B.: "Techniques
for Increasing the Effective Charge Transfer
Efficiency of Tapped CCD Registers", Electronics
Letters, Jan.1977, 13, No.1, pp.31-33.
64. MOHSEN,A.M., TOMPSETT,M.F. and SEQUIN,C.H.: "Noise
Measurements in Charge-Coupled Devices", IEEE
Trans. Electron Devices, May 1975, ED-22, No.5,
pp.209-218.
65. WESTE,N. and MAVOR,J.: "MOST Amplifiers for
Performing Peripheral Integrated Circuit
Functions", IEE J. Electron. Circuits and Syst.,
1977, 1, pp.165-172.
66. CAVES,J.T. et al.: "Sampled Analog Filtering Using
Switched Capacitors as Resistor Equivalents", IEEE
J. Solid State Circuits, Dec.1977, SC-12, No.6,
pp.592-599.
67. RABINER,L.R. and GOLD,B.: "Theory and Application
of Digital Signal Processing", Prentice-Hall Inc.,
1975.
68. DENYER,P.B., MAVOR,J. and ARTHUR,J.W.: "Miniature
Programmable Transversal Filter Using CCD/MOS
Technology", Proc. IEEE, 1979, to be published.
69. BRIGHAM,E.O.: "The Fast Fourier Transform",
Prentice-Hall Inc., 1974.
70. BRACEWELL,R.: "The Fourier Transform and its
Applications", McGraw-Hill Inc., 1965.
71. COOLEY,J.W. and TUKEY,J.W.: "An Algorithm for
Machine Calculation of Complex Fourier Series",
Math. Computation, Apr.1965, Vol.19, pp.297-301.
72. BRIGHAM,E.O. and MORROW,R.E.: "The Fast Fourier
Transform", IEEE Spectrum, Dec.1967, Vol.4,
pp.63-70.
73. WELCH,P.D.: "A Fixed Point Fast Fourier Transform
Error Analysis", IEEE Trans. Audio and
Electroacoustics, June 1969, Vol.AU-17,
pp.151-157.
74. PLESSEY MICROSYSTEMS: "SPM FFT Spectrum
Analysers", Plessey Data Sheet, Pub. No. PS4703.
75. RISK,R.J.: "Efficient Hard Wired Digital Fast
Fourier Transform Processor", Electronics Letters,
Aug.1977, Vol.13, No.16, pp.458-459.
76. CASPE,R.A.: "Array Processors", Mini Micro Systems,
July 1978, pp.51-83.
77. HARRIS,F.J.: "On the Use of Windows for Harmonic
Analysis with the Discrete Fourier Transform",
Proc. IEEE, Jan.1978, Vol.66, No.1, pp.51-83.
78. RABINER,L.R., SCHAFER,R.W. and RADER,C.M.: "The
Chirp Z-Transform Algorithm", IEEE Trans. Audio and
Electroacoustics, Jun.1969, Vol.AU-17, No.2,
pp.86-92.
79. BAILEY,W.H. et al.: "Radar Video Processing using
the Chirp Z-Transform", CCD '75 Int. Conf. on the
Applic. of CCD, Oct.1975, pp.283-290.
80. WARDROP,B. and BULL,E.: "A Discrete Fourier
Transform Processor using Charge Coupled Devices",
The Marconi Review, 1977, Vol.XL, No.204,
pp.1-41.
81. MAYER,G.J.: "The Chirp Z-Transform - A CCD
Implementation", RCA Review, Dec.1975, Vol.36,
pp.759-773.
82. BLUESTEIN,L.I.: "A Linear Filtering Approach to the
Computation of the Discrete Fourier Transform",
1968 Northeast Elec. Research and Eng. Meeting
Record, Nov.1968, pp.218-219.
83. RABINER,L.R., SCHAFER,R.W. and RADER,C.M.: "The
Chirp Z-Transform Algorithm and its Applications",
BSTJ, Jun.1969, pp.1249-1292.
84. BERGLAND,G.D.: "A Guided Tour of the Fast Fourier
Transform", IEEE Spectrum, Jul.1969, Vol.6-2,
pp.41-52.
85. BUSS,D.D. et al.: "Comparison between the CCD CZT
and the Digital FFT", CCD '75, Int. Conf. on the
Application of CCD, Oct.1975, pp.267-281.
86. CAMPBELL,J.G., TAO,T.F. and POLLACK,M.A.:
"Sensitivity Study of the Chirp Z-Transform and the
Prime Transform as Sampled Analog Discrete Fourier
Transform Algorithms", 10th Asilomar Conf. on
Circuits, Systems and Computers, Nov.1976,
Asilomar, California, Paper No.7.
87. DAVIE,M.C.: "Optimisation of Componentry in a
Surface Acoustic Wave Discrete Fourier Transform
Processor", Hons. Degree Special Projects Report,
Elec. Eng. Dept., Univ. of Edinburgh, Ref.HSP182,
May 1976, pp.55-65.
88. RADER,C.M.: "Discrete Fourier Transforms when the
Number of Data Samples is Prime", Proc. IEEE,
Jun.1968, Vol.56, pp.1107-1108.
89. RETICON: "ARAM-64. Analog Random Access Memory",
Reticon Corp. Data Sheet, 1976, No. CA94086.
90. JACK,M.A., PARK,D.G. and GRANT,P.M.: "CCD Spectrum
Analyser using Prime Transform Algorithm",
Electronics Letters, Jul.1977, Vol.13, No.15,
pp.431-432.
91. BARRITT,M.M. et al.: "Edinburgh IMP Language
Manual", (Edinburgh Regional Computing Centre, 1970).
92. PARK,D.G.: "The Construction and Computer
Simulation of a CCD Fourier Transform Processor
using the Prime Transform Algorithm", Hons. Degree
Special Projects Report, Elec. Eng. Dept., Univ. of
Edin., Ref.HSP215, May 1977, pp.24-27.
93. FOSS,R.C. and GREEN,B.J.: "Design Data for High and
Low-pass Active Filters", Technical Communication,
The Plessey Company Ltd.
94. KELLY,L.C., private communication.
95. GRADSHTEYN,I.S. and RYZHIK,I.M.: "Tables of
Integral Series and Products", Academic Press, New
York and London, 1965, pp.29-30.
96. ORCHARD,H.J.: "The Synthesis of RC Networks to have
Prescribed Transfer Functions",
Proc. IRE, Vol.39, Apr.1951, pp.428-432.
97. SARAGA,W.: "The Design of Wide-band Phase Splitting
Networks", Proc. IRE, Vol.38, Jul.1950, pp.754-770.
98. WEAVER,D.K.,Jr.: "Design of RC Wide-band 90-Degree
Phase-Difference Network",
Proc. IRE, Apr.1954, pp.671-676.
99. DAVIE,M.C.: "Speech Storage Handler", Internal




TECHNIQUES FOR INCREASING THE
EFFECTIVE CHARGE-TRANSFER EFFICIENCY
OF TAPPED C.C.D. REGISTERS
Indexing terms: Charge-coupled-device circuits, Delay lines
A design technique for multilap c.c.d. delay lines is discussed 
in which the effective charge transfer efficiency is increased 
over its intrinsic process-dependent value. The technique 
involves locating tap amplifiers at every alternate bit, and 
operating the device at twice the normal clock rate. The 
advantages of the technique are discussed with reference to a 
32-tap, n-channel c.c.d. delay line. 
Introduction: Recent work has shown how an improvement
in the effective transfer efficiency of c.c.d.s may be obtained
by the introduction of cell redundancy and circuit complexity.¹,²
Techniques reported here employ cell redundancy,
but involve a minimum of peripheral circuit complexity:
these are especially suitable for multitap delay lines, as well as
single-output registers.
Consider a c.c.d. register operated in the conventional
mode, as shown in Fig. 1a. The impulse-response sequence,
allowing for transfer inefficiency, has been well studied and
an adaptation of the result obtained by Vanstone³ is used here.
The rth residual of the impulse response sequence at a
nondestructive tap n can be shown to be
$$\frac{(n+r)!}{r!\,n!}\,\varepsilon^{r}(1-\varepsilon)^{n} \qquad (1)$$
where ε is the effective transfer inefficiency per cell and r = 0
indicates the main charge packet, r = 1 the first residual etc.
The effect of ε is to smear a single charge packet into
following signal samples. For low nε products, this is limited
to a predominant first residual contribution to the immediately
following signal sample. This letter discusses two circuit
techniques which may be used to reduce the effect of transfer
inefficiency, and results are presented for a multitap c.c.d.
delay line designed to employ these principles.
Description: The first scheme, shown in Fig. 1b, employs
alternate input sampling, in which 'fat zeros' are interposed 
between the signal packets. By sampling only the signal 
packets at output taps, the contribution of the preceding 
signal sample is reduced from a first to second residual effect, 
the intervening 'zero' having absorbed the comparatively 
large first residual. 
The second scheme, shown in Fig. 1c, provides self cancellation
of the first residual loss, as well as a reduction in the
contribution of preceding signal packets. Each input signal 
sample is injected in two successive charge packets, and the 
second charge packet of each pair is sampled at output taps. 
The reduction of the second packet by transfer inefficiency is 
compensated by the addition of the (ideally) identical loss of 
the leading packet during each transfer. The leading charge 
packet also reduces the residual effect of preceding signal 
samples. 
As both schemes halve the effective data rate, it is necessary 
to double the clock frequency and the number of stages to 
achieve the same sampling criteria and time—bandwidth 







sampled at alternate stages. For convenience, the schemes
are referred to as alternate zero alternate tap (a.z.a.t.) and
double sample alternate tap (d.s.a.t.), respectively.
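The impulse-response sequence of the a.z.a.t. scheme may
be obtained from the conventional response (expr. 1), considering
alternate terms and allowing for 2n stages:
$$\frac{(2n+2r)!}{(2r)!\,(2n)!}\,\varepsilon_2^{2r}(1-\varepsilon_2)^{2n} \qquad (2)$$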
where ε₂ is the value of ε at the new clock frequency 2f
(ε₁ will be taken as the value of ε at f). That of the d.s.a.t.
scheme is obtained by considering alternate terms of the
conventional response to two impulses delayed by one clock
period with respect to one another, again allowing for 2n
stages:
$$\frac{(2n+2r)!}{(2r)!\,(2n)!}\left(1+\frac{2n\varepsilon_2}{2r+1}+\varepsilon_2\right)\varepsilon_2^{2r}(1-\varepsilon_2)^{2n} \qquad (3)$$
A summary of these results is given in Table 1, where the 
techniques are compared in terms of a quality factor R, 
defined as the magnitude ratio of the first residual to the main 
charge packet. 
The results for the d.s.a.t. scheme are identical to those 
achieved by the scheme proposed by Tozer and Hobson;¹ in
fact, the principle is similar. However, addition of the main 
charge packet and its first residual is here accomplished 
automatically during each transfer, rather than by peripheral 
circuitry. 
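The three impulse-response sequences can be compared numerically. The following is a minimal sketch, not the authors' own calculation: the function names and the example values n = 1000 and ε = 10⁻⁴ (chosen so that nε = 0.1) are assumptions made for illustration. The d.s.a.t. response is built by summing two adjacent terms of the conventional 2n-stage response, and the closed form of expr. (3) is included only as a cross-check.

```python
from math import comb


def conventional(n, r, eps):
    """r-th residual at tap n for conventional operation, expr. (1)."""
    return comb(n + r, r) * eps**r * (1.0 - eps) ** n


def azat(n, r, eps2):
    """Alternate zero alternate tap: every second term of a 2n-stage response."""
    return conventional(2 * n, 2 * r, eps2)


def dsat(n, r, eps2):
    """Double sample alternate tap: two adjacent impulses, second packet sampled."""
    return conventional(2 * n, 2 * r, eps2) + conventional(2 * n, 2 * r + 1, eps2)


def dsat_closed_form(n, r, eps2):
    """Closed form of expr. (3), used here only as a cross-check."""
    return azat(n, r, eps2) * (1.0 + 2 * n * eps2 / (2 * r + 1) + eps2)


def quality_factor(response, n, eps):
    """R = magnitude ratio of the first residual to the main charge packet."""
    return response(n, 1, eps) / response(n, 0, eps)


if __name__ == "__main__":
    n, eps = 1000, 1e-4  # hypothetical values giving n*eps = 0.1
    schemes = (("conventional", conventional), ("a.z.a.t.", azat), ("d.s.a.t.", dsat))
    for name, h in schemes:
        print(f"{name:13s} R = {quality_factor(h, n, eps):.4f}")
```

Keeping ε the same in all three calls corresponds to the case where the transfer inefficiency does not change when the clock frequency is doubled.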
Limitations and comparisons: The benefit obtained from these
techniques is reduced at high operating frequencies where
doubling the clock frequency may degrade ε₂ significantly.
Indeed, an upper limit on the operating frequency may be
determined by the criterion
$$\varepsilon_2^{2} < \varepsilon_1/2n \qquad (4)$$
Clearly, the schemes perform best at clock frequencies where
ε is effectively constant; it is convenient to compare the results
under these conditions.
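Criterion (4) can be turned into a quick design check. This is a minimal sketch with hypothetical helper names and example values; it uses only the inequality as stated and does not model how ε varies with clock frequency in a real device.

```python
from math import sqrt


def max_eps2(eps1, n):
    """Largest inefficiency at the doubled clock rate allowed by criterion (4)."""
    return sqrt(eps1 / (2 * n))


def worthwhile(eps1, eps2, n):
    """True when criterion (4) is met, i.e. eps2**2 < eps1 / (2n)."""
    return eps2**2 < eps1 / (2 * n)


if __name__ == "__main__":
    eps1, n = 1e-3, 32  # hypothetical inefficiency at f and tap number
    print(f"eps2 must stay below {max_eps2(eps1, n):.5f}")
    print("eps2 = eps1 passes:", worthwhile(eps1, eps1, n))
```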
Table 1 shows that, where ε₂ ≈ ε₁, both schemes offer
an improvement over conventional operation. For practical
nε values, the improvement in quality factor R is approximately
1/2nε. Comparison of the two proposed techniques
shows that double sampling provides a marginal performance
improvement on alternate zero operation. Reference to Fig. 1
also shows that implementation of the d.s.a.t. scheme is
slightly simpler. It is thus concluded that double sampling is
preferable to alternate zero operation.
Device considerations: The increased number of transfer cells 
appears initially to be a disadvantage. For multitap-register 
applications, however, the increase in available silicon area 
may be advantageous where posttap signal processing is 
required on chip. The distance between tap outputs is often 
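Table 1  Impulse-response sequences and r1 : r0 performance ratio R

Conventional
    Impulse response:  $\frac{(n+r)!}{r!\,n!}\,\varepsilon_1^{r}(1-\varepsilon_1)^{n}$
    R, general:        $\varepsilon_1(n+1)$
    R, large n:        $n\varepsilon_1$
    R, nε = 0.1:       0.1

Alternate zero alternate tap
    Impulse response:  $\frac{(2n+2r)!}{(2r)!\,(2n)!}\,\varepsilon_2^{2r}(1-\varepsilon_2)^{2n}$
    R, general:        $\varepsilon_2^{2}(n+1)(2n+1)$
    R, large n:        $2(n\varepsilon_2)^{2}$
    R, nε = 0.1:       0.020

Double sample alternate tap
    Impulse response:  $\frac{(2n+2r)!}{(2r)!\,(2n)!}\left(1+\frac{2n\varepsilon_2}{2r+1}+\varepsilon_2\right)\varepsilon_2^{2r}(1-\varepsilon_2)^{2n}$
    R, general:        $\varepsilon_2^{2}(n+1)(2n+1)\,\frac{1+\varepsilon_2(n+1)}{1+\varepsilon_2(2n+1)}$
    R, large n:        $2(n\varepsilon_2)^{2}(1-n\varepsilon_2)$
    R, nε = 0.1:       0.018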
Reprinted from ELECTRONICS LETTERS 6th January 1977 Vol. 13 No. 1 pp. 31-33
determined by transfer efficiency and process considerations
and can be restrictive where identical signal-processing
circuits are required at every tap. The schemes described
here may be used to double the area available for peripheral












Fig. 1 Three modes of c.c.d. operation (legend: sample frequency, sample period)
Conversely, where the distance between taps is determined
by posttap circuitry, the schemes allow gate lengths to be
halved for a given circuit configuration, permitting higher-frequency
operation.
Implementation of both schemes involves little peripheral
complexity, the only additional circuit requirement being the





Fig. 2 Experimental device
a Photomicrograph of c.c.d.
b Schematic of cell structure
practical simplicity makes the schemes very attractive where
an improvement in efficiency or increase in available peripheral
circuit area is desired without loss in device performance.
Where the increase in clock frequency and device
length is impractical, it is possible to implement the schemes
and preserve the lower clock frequency by multiplexing
two parallel registers.
The principle may be extended to include higher-order 
sampling and cell redundancy where greater improvements 
in efficiency or available peripheral circuit area are necessary. 
Experimental results: A 64-bit (32-tap) c.c.d. and its peripheral
f.g.r.⁴ tapping circuitry (Fig. 2) was fabricated with
a 'shadow-etch' (s.e.t.) process.⁵ The device has gate lengths
of 5 µm, with 10 µm tap gates and tap sense amplifiers of
35 µm pitch.
The device was operated with a fill-and-spill input technique.
Fig. 3a shows the sampled output at tap 13 in response to a
pulse input (shorter in duration than the clock period) which
was adjusted to give 90% of full well capacity. The quality
factor R is estimated from the photograph to be 0.04.
Fig. 3b shows the corresponding output at tap 26, employing
the d.s.a.t. technique, and quite clearly a significant improvement
in the quality factor has been achieved; in fact, the
improvement is such that the effective first residual is difficult
to measure. Theoretically, the improved quality factor is
approximately 0.003. It is interesting to note that the main
response in Fig. 3b is the sum of the main response and the
first residual in Fig. 3a, as predicted by eqn. 3.
Fig. 3 Impulse responses
a Normal operation (clock frequency = sample frequency = 10 kHz)
b Double sampling (d.s.a.t.) operation (clock frequency 20 kHz, sample frequency 10 kHz)
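The theoretical figure of approximately 0.003 is consistent with the large-n entry of Table 1 for the d.s.a.t. scheme. The short check below assumes, as a simplification not spelled out in the letter, that ε is unchanged at the doubled clock rate and that the measured conventional ratio gives nε ≈ 0.04:

```latex
R_{\mathrm{d.s.a.t.}} \approx 2(n\varepsilon)^{2}(1-n\varepsilon)
                      = 2\,(0.04)^{2}\,(1-0.04)
                      \approx 0.0031
```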
Conclusions: Two techniques have been discussed for improv-
ing the effective charge transfer efficiency of multitapped 
c.c.d. delay lines. Both can be implemented with little 
increase in peripheral circuitry. The apparent disadvantages 
of the techniques, twice the clock frequency and device 
length, should be outweighed in the majority of applications 
requiring high-efficiency values. The increases in device area 
may help the layout of the tap amplifiers. The d.s.a.t. tech-
nique is marginally better than the a.z.a.t. approach and the 
efficacy of the former technique has been demonstrated for 
a 32-bit c.c.d. delay line. 
support of the UK SRC, DCVD MOD(PE) and the
Wolfson Microelectronics Liaison Unit, University of Edinburgh,
for device design facilities. The devices were designed
by D. J. MacLennan and fabricated at The Plessey Co. Ltd.
J. MAVOR 	 22nd November 1976
M. C. DAVIE
Department of Electrical Engineering
School of Engineering Science
University of Edinburgh
Edinburgh EH9 3JL, Scotland
P. B. DENYER
Wolfson Microelectronics Liaison Unit
School of Engineering Science
Mayfield Road, Edinburgh EH9 3JL, Scotland
References 
1 TOZER, R. C., and HOBSON, G. S.: 'Reduction of high-level nonlinear
smearing in c.c.d.s', Electron. Lett., 1976, 12, pp. 355-356
2 COOPER, D. C., DARLINGTON, E. H., PETFORD, S. M., and ROBERTS, J. B. G.:
'Reducing the effect of charge-transfer inefficiency in a c.c.d. video
integrator', ibid., 1975, 11, pp. 384-385
3 VANSTONE, G. F., ROBERTS, J. B. G., and LONG, A. E.: 'The measurement
of the charge residual for c.c.d. transfer using impulse and frequency
responses', Solid-State Electron., 1974, 17, pp. 889-895
4 MACLENNAN, D. J., MAVOR, J., VANSTONE, G. F., and WINDLE, D. J.:
'Novel tapping technique for charge-coupled devices', Electron. Lett.,
1973, 9, pp. 610-611
5 PERKINS, K. D., and BROWNE, V. A.: 'Sub-micron gap metal gate
technology for CCDs', Microelectron., 1975, 7, (2), pp. 14-22
APPENDIX B 
MATHEMATICAL ANALYSIS OF CHIRP FILTERING 
In the block diagram shown in Fig.B.1 an input sequence 
is multiplied by a discrete chirp and circularly convolved 
in a chirp transversal filter. Using the principle of 
superposition, 	Fig.B.1 may be regarded as part of the 
complex CZT processor (Fig.4.7). 
Fig.B.1 Chirp Filtering (input x_n, premultiplier p_n = cos(πn²/N + φ_p), output y_k)
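Consider an input sequence {x_n} of the form
$$x_n = A\cos\left[2\pi f n/N + \phi_s\right] \qquad n=0,1..N-1 \qquad (B.1)$$
where f is the normalised frequency and φ_s is an arbitrary
phase factor. The multiplication of {x_n} by {p_n} where
$$p_n = \cos\left(\pi n^{2}/N + \phi_p\right) \qquad n=0,1..N-1 \qquad (B.2)$$
results in an intermediate sequence {q_n}
$$q_n = \frac{A}{2}\left\{\cos\left[\frac{2\pi}{N}\left(fn + n^{2}/2\right) + \phi_s + \phi_p\right] + \cos\left[\frac{2\pi}{N}\left(fn - n^{2}/2\right) + \phi_s - \phi_p\right]\right\} \qquad (B.3)$$
The phase term φ_p is either 0 or -π/2 depending on whether
COS or SIN premultiplication is intended.
The filter output sequence {y_k} is given by the
discrete convolution
$$y_k = \sum_{n=0}^{N-1} q_n \cos\left[\pi(k-n)^{2}/N + \phi_c\right] \qquad k=0,1..N-1 \qquad (B.4)$$
where φ_c is either 0 or -π/2. Expansion of the cosine
products gives an output consisting of four terms
$$y_k = a_k + b_k + c_k + d_k \qquad k=0,1..N-1 \qquad (B.5)$$
where
$$a_k = \frac{A}{4}\sum_{n=0}^{N-1}\cos\left\{\frac{2\pi}{N}\left[n^{2} + (f-k)n + k^{2}/2\right] + \phi_s + \phi_p + \phi_c\right\}$$
$$b_k = \frac{A}{4}\sum_{n=0}^{N-1}\cos\left\{\frac{2\pi}{N}\left[(f+k)n - k^{2}/2\right] + \phi_s + \phi_p - \phi_c\right\}$$
$$c_k = \frac{A}{4}\sum_{n=0}^{N-1}\cos\left\{\frac{2\pi}{N}\left[(f-k)n + k^{2}/2\right] + \phi_s - \phi_p + \phi_c\right\}$$
$$d_k = \frac{A}{4}\sum_{n=0}^{N-1}\cos\left\{\frac{2\pi}{N}\left[-n^{2} + (f+k)n - k^{2}/2\right] + \phi_s - \phi_p - \phi_c\right\}$$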
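Terms b_k and c_k have linear cosine arguments and can be
evaluated using a result from Ref.[95] to give
$$b_k = \frac{A}{4}\cos\left[\frac{\pi(N-1)(f+k)}{N} + \phi_s + \phi_p - \phi_c - \frac{\pi k^{2}}{N}\right]\sin\left[\pi(f+k)\right]\operatorname{cosec}\left[\pi(f+k)/N\right] \qquad (B.6)$$
and
$$c_k = \frac{A}{4}\cos\left[\frac{\pi(N-1)(f-k)}{N} + \phi_s - \phi_p + \phi_c + \frac{\pi k^{2}}{N}\right]\sin\left[\pi(f-k)\right]\operatorname{cosec}\left[\pi(f-k)/N\right] \qquad (B.7)$$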
However, both a_k and d_k have quadratic cosine arguments and
are therefore similar to discrete forms of the cosine
Fresnel integral. No closed solution has been found and a_k
and d_k can be evaluated only by numerical analysis.
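The four-term expansion and the closed forms (B.6) and (B.7) can be checked by direct summation. The following is a minimal sketch, not the thesis's own program: the function names and the test values (N = 16, a non-integer f to keep the cosec factor finite) are assumptions made for illustration, and the quadratic-phase terms a_k and d_k are simply summed numerically as the text suggests.

```python
import math


def chirp_terms(N, f, k, A=1.0, phi_s=0.0, phi_p=0.0, phi_c=0.0):
    """Return (y_k, a_k, b_k, c_k, d_k) by direct summation over n = 0..N-1."""
    w = 2.0 * math.pi / N
    y = a = b = c = d = 0.0
    for n in range(N):
        x_n = A * math.cos(w * f * n + phi_s)          # input, eq. (B.1)
        p_n = math.cos(math.pi * n * n / N + phi_p)    # premultiplier, eq. (B.2)
        y += x_n * p_n * math.cos(math.pi * (k - n) ** 2 / N + phi_c)  # eq. (B.4)
        a += math.cos(w * (n * n + (f - k) * n + k * k / 2) + phi_s + phi_p + phi_c)
        b += math.cos(w * ((f + k) * n - k * k / 2) + phi_s + phi_p - phi_c)
        c += math.cos(w * ((f - k) * n + k * k / 2) + phi_s - phi_p + phi_c)
        d += math.cos(w * (-n * n + (f + k) * n - k * k / 2) + phi_s - phi_p - phi_c)
    s = A / 4.0
    return y, s * a, s * b, s * c, s * d


def b_closed(N, f, k, A=1.0, phi_s=0.0, phi_p=0.0, phi_c=0.0):
    """Closed form of b_k, eq. (B.6); singular when (f + k)/N is an integer."""
    u = math.pi * (f + k)
    phase = (N - 1) * u / N + phi_s + phi_p - phi_c - math.pi * k * k / N
    return A / 4.0 * math.cos(phase) * math.sin(u) / math.sin(u / N)


def c_closed(N, f, k, A=1.0, phi_s=0.0, phi_p=0.0, phi_c=0.0):
    """Closed form of c_k, eq. (B.7); singular when (f - k)/N is an integer."""
    u = math.pi * (f - k)
    phase = (N - 1) * u / N + phi_s - phi_p + phi_c + math.pi * k * k / N
    return A / 4.0 * math.cos(phase) * math.sin(u) / math.sin(u / N)


if __name__ == "__main__":
    N, f = 16, 3.3  # hypothetical values; non-integer f avoids the 0/0 case
    for k in range(N):
        y, a, b, c, d = chirp_terms(N, f, k)
        print(k, round(y, 6), round(a + b + c + d, 6), round(b - b_closed(N, f, k), 9))
```

For integer f and k the sin/cosec product in the closed forms becomes a 0/0 limit at the peak, so a practical implementation would need to treat that case separately.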
The four individual terms are plotted in Fig.B.2 for 
N=64, f=16 (i.e. a basis vector input) and Note 
that although each term is defined only for integer values 
of k, outputs are also shown for real values. This allows a 
better understanding of the function type. As expected, b_k
and c_k are (sin x)/x functions centred at k=-f and k=+f,
each modulated by quadratic phase terms. For integer values
of k, only the nulls and peaks of these terms are displayed.
a_k and d_k produce functions which can be likened to
Fresnel ripples modulated by quadratic phase factors. These
terms are undesirable in matched filtering, but for large
time-bandwidth products the filter processing gain tends to
make this distortion insignificant. In the CZT, the complex
convolver adds and subtracts appropriate filter outputs so
that a_k and d_k disappear, and b_k and c_k are reinforced.
When the CZT is supplied with a complex input, the even and
odd properties of COS and SIN combine to cancel either b_k or
c_k, leaving only a single (sin x)/x in the output. The
redundant quadratic phase term modulating the (sin x)/x is
Fig.B.2 Chirp Filter Outputs




APPENDIX C
THE DESIGN OF A 90-DEGREE PHASE DIFFERENCE NETWORK
C.1 INTRODUCTION
The 90° phase difference network described in this
appendix has been designed for use in a speech processing 
system where there is need to generate pseudo complex input 
data for a real-time CZT processor. The provision of 
complex data effectively doubles the available processing 
bandwidth by cancelling the image frequencies present in the 
spectrum produced by real data only. 
The phase difference method of generating complex data 
is valid for speech since the absolute phase of the voice 
signal is unimportant to the human ear. Also, in most 
speech processing systems, only the magnitude spectra are 
required. 
Phase difference errors in the practical configuration
will give rise to suppressed image frequencies in the
spectrum. To suppress these images by at least 40dB
requires a phase difference accuracy of approximately 1°
(section 5.2.6).
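The link between phase accuracy and image suppression can be illustrated with a standard quadrature-error result. This is a minimal sketch under stated assumptions, not the derivation of section 5.2.6: the two channels are taken to have equal amplitude with an error δ in phase only, for which the image-to-wanted amplitude ratio is tan(δ/2). The function name is hypothetical.

```python
import math


def image_rejection_db(phase_error_deg):
    """Image level relative to the wanted component, in dB, for a pure phase error.

    Assumes equal channel gains, so the amplitude ratio is tan(delta / 2).
    """
    half_error = math.radians(phase_error_deg) / 2.0
    return 20.0 * math.log10(math.tan(half_error))


if __name__ == "__main__":
    for delta in (0.5, 1.0, 2.0):
        print(f"phase error {delta:3.1f} deg -> image at {image_rejection_db(delta):6.1f} dB")
```

Any gain mismatch between the two channels would raise the image level further, so the phase tolerance alone is a necessary rather than sufficient condition.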
The main design specifications are therefore:
1. Operating Bandwidth: 50Hz to 3200Hz
2. Phase Difference Accuracy: 1°
3. In-band Gain: Unity
C.2 THEORETICAL BACKGROUND 
The network theory which is relevant to the design and 
construction of constant phase difference networks has been 
well understood for many years [96] and has been widely used 
in single sideband modulation schemes. 
It is possible to show [97] that a 90° phase splitting
circuit can be designed to operate over a large frequency
range by connecting two all-pass networks as shown in
Fig.C.1. To make these networks physically realisable,
their idealised transfer functions can be approximated by
equal-ripple Tschebyscheff functions to give the phase
difference function illustrated in Fig.C.2. The two main
design parameters of this configuration are (a) the maximum
phase difference deviation, ε, and (b) the bandwidth ratio.
(a) and (b) together allow the network complexity,
n, to be determined.
Fig.C.1 All-pass networks A and B: input v_i = A cos(ωt + θ), outputs v_ao = A cos(ωt + φ) and v_bo = A cos(ωt + φ + 90°)
Fig.C.2 Tschebyscheff Phase Difference Function (phase difference rippling between 90 - ε and 90 + ε against frequency f)
For circuit operation at low frequencies it is
desirable to synthesise the transfer functions using
resistance and capacitance elements only, since it is 
difficult to manufacture inductors of adequate quality. 
This practical restriction requires that the poles of the 
individual all-pass transfer functions lie on the negative 
real axis in the complex frequency plane. 
The general response functions of networks A and B 
(Fig.C.1) having the RC restriction are given by 
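$$H_a(s) = K\,\frac{(s-\sigma_{a1})(s-\sigma_{a2})(\ldots)}{(s+\sigma_{a1})(s+\sigma_{a2})(\ldots)} \qquad (C.1)$$
$$H_b(s) = K\,\frac{(s-\sigma_{b1})(s-\sigma_{b2})(\ldots)}{(s+\sigma_{b1})(s+\sigma_{b2})(\ldots)} \qquad (C.2)$$
where the values of σ are real and positive.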
The synthesis problem is therefore to determine the
pole-zero pairs for 90° phase difference between H_a(s) and
H_b(s) (simplified by the use of the elliptic tangent
transformation) and to find RC networks that realise the
response function. Note that the realisation problem does
not have a unique result since there are many different yet
equivalent network configurations and so it is normally
necessary to select the most convenient circuit.
A design procedure for the above synthesis is given by
D.K.Weaver, Jr. [98]. In this design, the all-pass networks A
and B are represented by half-lattice configurations
(Fig.C.3) where the impedance functions Z_x and Z_y are
Fig.C.3 All Pass Half Lattice
These networks are driven by balanced inputs.
C.3 DESIGN
C.3.1 Network Synthesis
A computer programme was written in the Edinburgh IMP
language to perform the calculations required in the design
procedure given by D.K.Weaver, Jr. [98]. This programme is
divided into two sections.
The first section computes the desired network transfer 
functions from the bandwidth ratio and the network 
complexity factor, n. The resulting transfer functions for 
the specifications given in the introduction (n=7) are as 
follows: 
$$H_a(s) = 0.1202\,\frac{(s-0.4030)(s-3.5655)(s-17.9496)(s-158.8233)}{(s+0.4030)(s+3.5655)(s+17.9496)(s+158.8233)} \qquad (C.3)$$
$$H_b(s) = 0.1202\,\frac{(s-1.4861)(s-8.0000)(s-43.0656)}{(s+1.4861)(s+8.0000)(s+43.0656)} \qquad (C.4)$$
The normalised component values for the chosen network 
configuration (Fig.C.3) are calculated from the above 
transfer functions by the second section of the computer 
programme. The resulting numbers are then multiplied by an 
impedance factor to give 	practical 	component 	values 
(Cmin > 100pF, Rmax < 1MΩ). The final network configuration
with the appropriate component values is shown in Fig.C.4. 
C.3.2 Network Analysis 
Fig.C.4 Detailed Network Configuration (OC = Open Circuit, SC = Short Circuit; all resistors in ohms)
The normalised frequency response of these networks,
H_a(ω) and H_b(ω), can be obtained by evaluating their transfer
functions on the imaginary axis in the complex frequency
plane. This is accomplished using the substitution s=jω.
The amplitude frequency response is formed by taking
the modulus of H(ω) and the phase response by taking the
argument. The following equations result from the
synthesised transfer functions:
$$|H_a(j\omega)| = 0.1202 \qquad (C.5)$$
$$\phi_a(j\omega) = \tan^{-1}\left(\frac{2ab}{a^{2}-b^{2}}\right) \qquad (C.6)$$
where $a = \omega^{4} - 3553.8\,\omega^{2} + 4096$
and $b = 180.74\,\omega^{3} - 11567.4\,\omega$
$$|H_b(j\omega)| = -0.1202 \qquad (C.7)$$
$$\phi_b(j\omega) = \tan^{-1}\left(\frac{2cd}{d^{2}-c^{2}}\right) \qquad (C.8)$$
where $c = 52.55\,\omega^{2} - 512$
and $d = \omega^{3} - 420.41\,\omega$
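The polynomial coefficients in a, b, c and d follow from the pole-zero values of (C.3) and (C.4) as elementary symmetric sums. The following is a minimal sketch for checking them; the helper name and constant names are hypothetical, and small differences from the printed coefficients are expected because the roots are quoted to four decimal places.

```python
from itertools import combinations
from math import prod

SIGMA_A = (0.4030, 3.5655, 17.9496, 158.8233)  # pole-zero values of H_a, eq. (C.3)
SIGMA_B = (1.4861, 8.0000, 43.0656)            # pole-zero values of H_b, eq. (C.4)


def elementary_symmetric(values, order):
    """Sum of the products of all combinations of `order` values."""
    return sum(prod(group) for group in combinations(values, order))


if __name__ == "__main__":
    # a = w^4 - e2*w^2 + e4 and b = e1*w^3 - e3*w for network A
    print("A:", [round(elementary_symmetric(SIGMA_A, m), 2) for m in (1, 2, 3, 4)])
    # c = e1*w^2 - e3 and d = w^3 - e2*w for network B
    print("B:", [round(elementary_symmetric(SIGMA_B, m), 2) for m in (1, 2, 3)])
```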
The amplitude functions |H_a(jω)| and |H_b(jω)| are the
expected all-pass responses with constant attenuation of
0.1202. The phase difference function is plotted in
Fig.5.28. (Note that the frequency ordinate has been
scaled.) The theoretical phase ripples
inside the pass-band are well within the 1° tolerance
required.
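The phase difference can also be evaluated directly from the synthesised transfer functions. The following is a minimal sketch rather than the original IMP programme: the function names are hypothetical, and the normalised pass-band is assumed to run from ω = 1 to ω = 64, which matches the 50Hz to 3200Hz bandwidth ratio and the geometric symmetry of the pole-zero values about ω = 8.

```python
import cmath
import math

K = 0.1202
SIGMA_A = (0.4030, 3.5655, 17.9496, 158.8233)  # eq. (C.3)
SIGMA_B = (1.4861, 8.0000, 43.0656)            # eq. (C.4)


def all_pass(s, sigmas, gain=K):
    """Evaluate gain * prod((s - sigma) / (s + sigma)) at complex frequency s."""
    h = complex(gain)
    for sigma in sigmas:
        h *= (s - sigma) / (s + sigma)
    return h


def phase_difference_deg(w):
    """Magnitude of the phase difference between networks A and B at s = jw."""
    s = complex(0.0, w)
    ratio = all_pass(s, SIGMA_A) / all_pass(s, SIGMA_B)
    return abs(math.degrees(cmath.phase(ratio)))


def max_deviation_deg(w_low=1.0, w_high=64.0, points=2001):
    """Largest departure from 90 degrees on a logarithmic frequency grid."""
    worst = 0.0
    for i in range(points):
        w = w_low * (w_high / w_low) ** (i / (points - 1))
        worst = max(worst, abs(phase_difference_deg(w) - 90.0))
    return worst


if __name__ == "__main__":
    print(f"|H_a(j5)| = {abs(all_pass(5j, SIGMA_A)):.4f}")
    print(f"phase difference at band centre = {phase_difference_deg(8.0):.3f} deg")
    print(f"maximum deviation over assumed band = {max_deviation_deg():.3f} deg")
```

Taking the phase of the ratio H_a/H_b avoids unwrapping the two individual phase responses, which each sweep through several multiples of 180° across the band.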
